Replacing a disk in a zpool

Of late, I have been seeing a number of S.M.A.R.T. errors from one of the drives in my zpool. The drives are raw device mapped into the FreeNAS virtual appliance, which makes them look and behave like real disks from inside the VM. This allows FreeNAS to collect and display S.M.A.R.T. errors on the console:

Jul  6 18:04:57 x smartd[462]: Device: /dev/sdb [SAT], 8 Currently unreadable (pending) sectors
Jul  6 18:04:58 x smartd[462]: Device: /dev/sdb [SAT], 8 Offline uncorrectable sectors
[...]
Jul  7 16:34:58 x smartd[462]: Device: /dev/sdb [SAT], 16 Currently unreadable (pending) sectors (changed +8)
Jul  7 16:34:58 x smartd[462]: Device: /dev/sdb [SAT], 16 Offline uncorrectable sectors (changed +8)
[...]
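The same counters can be queried on demand from the FreeNAS shell with smartctl. A minimal sketch; the device name is the one smartd was reporting on above, so adjust it to your setup:

smartctl -a /dev/sdb | grep -iE 'pending|uncorrect'

The Current_Pending_Sector and Offline_Uncorrectable attributes in that output correspond to the counts smartd logs.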

The errors started at 2 sectors and gradually grew to 32 sectors, at which point I decided to get a new drive. Replacing the drive meant that I needed to power down all the VMs on my server, since the new drive would have to be remapped as a raw device into the FreeNAS VM. Before the shutdown, I also needed to put the failing disk in the zpool into an offline state:

zpool offline dataVol gptid/<id-of-failing-disk>

Once this command completes, the disk is no longer used by the dataVol volume and the pool runs in a degraded state; the data only gets migrated onto the new drive later, when it is attached with zpool replace.
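The gptid label to pass to zpool offline can be looked up from the FreeNAS shell. A minimal sketch, assuming the usual FreeNAS-style GPT labels:

glabel status
# maps each gptid/... label to the device it lives on (for example ada1p2)
zpool status dataVol
# shows which gptid labels are members of the pool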

After shutting down the server, I needed to be sure that the drive I was removing was the one that had been taken offline. I used the gpt command to obtain the serial number of the disk. After replacing the drive and powering the server back on, I needed to mark the drive as a raw device that could be exposed to the FreeNAS VM. It seems that vSphere 5.5 does not allow creating raw device mappings for local disks via the UI. The VMware KB article describes a workaround: create the mapping VMDK using vmkfstools and then use the UI to add it as an existing virtual disk. This virtual disk then shows up as a raw device to the VM.
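The serial number can also be cross-checked with smartctl from the FreeNAS shell before the shutdown. A small sketch; the device name is again the one smartd was complaining about:

smartctl -i /dev/sdb | grep -i 'serial'

On the ESXi side, the vmkfstools invocation from the KB article takes the raw device path and the path of the VMDK to create (placeholders shown in angle brackets):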

vmkfstools -z /vmfs/devices/disks/<device-name> <path-to-vm-directory>/<rdm-name>.vmdk

The name of the raw device is usually derived from the serial number of the drive, so it helps to keep the serial number handy before putting the drive into the ESXi server. After creating the raw device mapping on the host, the device needs to be added to the VM. This can be done while the VM is powered off by adding an existing virtual disk to it with the vSphere client.
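To locate the exact device name that vmkfstools needs, the disks the host can see may be listed and matched against that serial number. A rough sketch; the grep pattern is just the serial you noted earlier:

ls /vmfs/devices/disks/ | grep -i <serial-number>
# local SATA drives typically show up as t10.ATA_____<model>_____<serial> entries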

Once the VM is powered on, the new drive needs to be added to the degraded zpool. The data is still accessible, but the zpool cannot tolerate another drive failure, since I had configured it as a RAIDZ1 pool with 3 disks.

The ID of the old disk can be obtained via the zpool status command, where the disk shows up in the offline state. Another helpful command is zdb, which can dump the GPT UUIDs of the disks in the pool.
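A hedged sketch of both checks; the cache file path below is the FreeNAS default and may differ on other systems:

zpool status dataVol
# the offlined member is listed as OFFLINE and the pool as DEGRADED
zdb -U /data/zfs/zpool.cache dataVol
# dumps the cached pool configuration, including the /dev/gptid/... path and GUID of each member

For adding the drive back to the zpool, use the zpool replace command, specifying the disk to be replaced followed by its replacement: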

zpool replace dataVol gptid/<id-of-old-disk> gptid/<id-of-new-disk>

This command prepares the new disk and schedules the resilvering process. In my case, the resilvering took only about 2 hours for roughly 280 GB of data in the zpool.
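Progress can be watched while the resilver runs. A small sketch:

zpool status -v dataVol
# the scan line reports something like "resilver in progress ... X% done, XhYm to go"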
