
Shuffling zpool disks
Tags: [zfs] [freebsd] [capablanca]
Published: 28 Dec 2016 16:08

capablanca’s three zpools were a mixture of WD Green and Red disks.

Yes, capablanca has all Western Digital disks - everyone has their favourite disk brand, and mine for the longest time was WD. These days I’m inclined to pick HGST or Toshiba based on Backblaze’s drive failure data, but the current set of WD disks is still going strong.

However, I have had three WD Green disk failures over the years - two of them within two weeks of each other. It may be unscientific, but I don’t trust them anymore.

I had done a fair amount of reorganizing my files (one step at a time, of course) and had moved / deduplicated everything off the public_storage pool.

The plan was then to shuffle the disks around: I wanted to move the Red disk that public_storage was using into the archive pool and remove the Green disk.

Another objective of this shuffle was to switch to 4k blocks with ZFS: all my pools were using the smaller 512-byte sector size (ashift=9), and I wanted to change that.
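For reference, one way on FreeBSD to make sure newly created vdevs come up with ashift=12 is the vfs.zfs.min_auto_ashift sysctl (in this setup the 4096-byte geli sector size, shown later, achieves the same thing); roughly:

# sysctl vfs.zfs.min_auto_ashift=12
# echo 'vfs.zfs.min_auto_ashift=12' >> /etc/sysctl.conf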

Yet another objective was to use /dev/diskid device names instead of the glabel labels. I wanted the same functionality as Linux’s /dev/disk/by-id directory.
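The /dev/diskid entries come from GEOM’s disk ident labels, which should be enabled by default; if they aren’t showing up, the relevant tunable can be set in /boot/loader.conf:

kern.geom.label.disk_ident.enable="1"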

So, let the great shuffle begin. The process was a little like the riddle where the farmer has to take lettuce, a rabbit, and a wolf across the river…

The first step was disk identification:

# glabel list | grep -B 3 zfs
Geom name: ada0
Providers:
1. Name: label/zfs2

Geom name: ada2
Providers:
1. Name: label/zfs1

Geom name: ada3
Providers:
1. Name: label/zfs4

Geom name: ada4
Providers:
1. Name: label/zfs3

Geom name: ada5
Providers:
1. Name: label/zfs5

# camcontrol identify /dev/ada0
serial number         WD-WMC300563174

# camcontrol identify /dev/ada2
serial number         WD-WMAZA9460416

# camcontrol identify /dev/ada3
serial number         WD-WCC4M2773993

# camcontrol identify /dev/ada4
serial number         WD-WMC300578369

# camcontrol identify /dev/ada5
serial number         WD-WMAZA0686209

The zpools had the following disk configuration:

archive:
    zfs1 -> ada2 -> WD-WMAZA9460416 (WDC WD20EARX)
    zfs2 -> ada0 -> WD-WMC300563174 (WDC WD20EFRX)

storage:
    zfs3 -> ada4 -> WD-WMC300578369 (WDC WD20EFRX)

public_storage:
    zfs4 -> ada3 -> WD-WCC4M2773993 (WDC WD20EFRX)
    zfs5 -> ada5 -> WD-WMAZA0686209 (WDC WD20EARS)

where:

    WD20EFRX is a 2TB WD Red
    WD20EARX and WD20EARS are 2TB WD Greens

The plan

Here’s the plan:

    1. Destroy public_storage, freeing its Red (WCC4M2773993) and Green (WMAZA0686209) disks.
    2. Create a new pool, tank, on the freed Red disk, with 4k sectors and /dev/diskid device names.
    3. Attach the freed Green disk to the archive mirror.
    4. Copy everything from /archive to /tank.
    5. Detach archive’s Red disk (zfs2) and attach it to tank.
    6. Rename tank to archive and destroy the old pool.

First, I destroyed the public_storage pool:

# zpool destroy public_storage

Now the disks were available for use.

For both disks, I zeroed out the partition, ZFS, and glabel metadata. In my case:

# geli detach /dev/label/zfs5
# dd if=/dev/zero of=/dev/label/zfs5 bs=1m count=10
# dd if=/dev/zero of=/dev/ada5 bs=1m seek=1907719

This gives ZFS the whole disk.
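To double-check that nothing references the disk any more, glabel should no longer list a label on ada5, and gpart should report no partitioning on it; for example:

# glabel status | grep ada5
# gpart show ada5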

Now, /dev/diskid has the same functionality as /dev/disk/by-id in Linux: entries show up as /dev/diskid/DISK-WD-WCC4M2773993 in the case of one of the Red disks. But /dev/diskid entries have the odd quirk of disappearing once the disk is referred to by another name. For example, if I used /dev/ada3 instead of /dev/diskid/DISK-WD-WCC4M2773993, the DISK-WD-WCC4M2773993 entry would disappear and I would need to reboot to get it back.

So, I rebooted, then took the Red disk from the public_storage pool and made the tank zpool:

# zpool create tank /dev/diskid/DISK-WD-WCC4M2773993.eli
# zpool status tank

  pool: tank
 state: ONLINE
  scan: none requested
config:

    NAME                               STATE     READ WRITE CKSUM
    tank                               ONLINE       0     0     0
      diskid/DISK-WD-WCC4M2773993.eli  ONLINE       0     0     0

errors: No known data errors

I verified that tank is using the correct sector size:

# zdb -C tank | grep ashift
            ashift: 12

Note that this requires using a sector size of 4096 with geli init.
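The geli init itself isn’t shown above; a minimal sketch (key and passphrase options to taste) looks something like this:

# geli init -s 4096 /dev/diskid/DISK-WD-WCC4M2773993
# geli attach /dev/diskid/DISK-WD-WCC4M2773993

The attached provider then shows up as /dev/diskid/DISK-WD-WCC4M2773993.eli, which is what zpool create is pointed at.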

Next, I added public_storage’s Green disk WMAZA0686209 to archive, and waited for the resilver:

# zpool attach archive label/zfs1.eli /dev/diskid/DISK-WD-WMAZA0686209.eli
# zpool status archive

  pool: archive
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
    continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Dec 13 00:35:38 2016
        3.48M scanned out of 1.30T at 396K/s, (scan is slow, no estimated time)
        3.28M resilvered, 0.00% done
config:

    NAME                                 STATE     READ WRITE CKSUM
    archive                              ONLINE       0     0     0
      mirror-0                           ONLINE       0     0     0
        label/zfs1.eli                   ONLINE       0     0     0
        label/zfs2.eli                   ONLINE       0     0     0
        diskid/DISK-WD-WMAZA0686209.eli  ONLINE       0     0     0  (resilvering)

Once done, I rsynced all the data from /archive to /tank. Those following along: don’t skip this step, it’s important :)
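Something along these lines does the job (flags to taste; -H matters if there are hard links):

# rsync -aH --progress /archive/ /tank/

The trailing slashes matter: this copies the contents of /archive into /tank rather than creating /tank/archive.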

Then, I removed zfs2 (the Red disk) from archive:

# zpool offline archive label/zfs2.eli
# zpool detach archive label/zfs2.eli
# zpool status archive

  pool: archive
 state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
    still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
    pool will no longer be accessible on software that does not support feature
    flags.
  scan: resilvered 1.30T in 10h20m with 0 errors on Tue Dec 13 10:56:27 2016
config:

    NAME                                 STATE     READ WRITE CKSUM
    archive                              ONLINE       0     0     0
      mirror-0                           ONLINE       0     0     0
        label/zfs1.eli                   ONLINE       0     0     0
        diskid/DISK-WD-WMAZA0686209.eli  ONLINE       0     0     0

At this point, using /dev/label/zfs2 had made the corresponding /dev/diskid entry disappear, so I rebooted again.

I attached WMC300563174 (zfs2) to tank:

# zpool attach tank diskid/DISK-WD-WCC4M2773993.eli diskid/DISK-WD-WMC300563174.eli
# zpool status tank

  pool: tank
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
    continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Dec 14 13:00:29 2016
        4.97M scanned out of 1.51T at 1018K/s, 442h5m to go
        4.67M resilvered, 0.00% done
config:

    NAME                                 STATE     READ WRITE CKSUM
    tank                                 ONLINE       0     0     0
      mirror-0                           ONLINE       0     0     0
        diskid/DISK-WD-WCC4M2773993.eli  ONLINE       0     0     0
        diskid/DISK-WD-WMC300563174.eli  ONLINE       0     0     0  (resilvering)

errors: No known data errors

After the resilver, zpool status showed:

  pool: archive
 state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
    still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
    pool will no longer be accessible on software that does not support feature
    flags.
  scan: resilvered 1.30T in 10h20m with 0 errors on Tue Dec 13 10:56:27 2016
config:

    NAME                                 STATE     READ WRITE CKSUM
    archive                              ONLINE       0     0     0
      mirror-0                           ONLINE       0     0     0
        label/zfs1.eli                   ONLINE       0     0     0
        diskid/DISK-WD-WMAZA0686209.eli  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
  scan: resilvered 1.51T in 5h19m with 0 errors on Wed Dec 14 18:19:35 2016
config:

    NAME                                 STATE     READ WRITE CKSUM
    tank                                 ONLINE       0     0     0
      mirror-0                           ONLINE       0     0     0
        diskid/DISK-WD-WCC4M2773993.eli  ONLINE       0     0     0
        diskid/DISK-WD-WMC300563174.eli  ONLINE       0     0     0

errors: No known data errors

Resilvering tank took half the time for essentially the same data! Possible explanations include the 4k alignment of the new pool, the Red disks simply being faster than the Greens, and the freshly rsynced data being laid out more sequentially.

I didn’t investigate further.

At this point /archive and /tank contained the same data. I renamed archive to oldarchive, and tank to archive, simply by exporting both pools and importing them under new names:

# zpool export archive
# zpool export tank

# zpool import archive oldarchive
# zpool import tank archive

# zpool status

  pool: archive
 state: ONLINE
  scan: resilvered 1.51T in 5h19m with 0 errors on Wed Dec 14 18:19:35 2016
config:

    NAME                                 STATE     READ WRITE CKSUM
    archive                              ONLINE       0     0     0
      mirror-0                           ONLINE       0     0     0
        diskid/DISK-WD-WCC4M2773993.eli  ONLINE       0     0     0
        diskid/DISK-WD-WMC300563174.eli  ONLINE       0     0     0

errors: No known data errors

  pool: oldarchive
 state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
    still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
    pool will no longer be accessible on software that does not support feature
    flags.
  scan: resilvered 1.30T in 10h20m with 0 errors on Tue Dec 13 10:56:27 2016
config:

    NAME                                 STATE     READ WRITE CKSUM
    oldarchive                           ONLINE       0     0     0
      mirror-0                           ONLINE       0     0     0
        label/zfs1.eli                   ONLINE       0     0     0
        diskid/DISK-WD-WMAZA0686209.eli  ONLINE       0     0     0

All that is left to do now is destroy oldarchive, and either repurpose those disks for something else or recycle them.
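When that time comes, it should just be a matter of something like:

# zpool destroy oldarchive
# geli detach /dev/label/zfs1
# geli detach /dev/diskid/DISK-WD-WMAZA0686209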