Wednesday, April 20, 2016

migration - In-place migration of a ZFS RAIDZ pool from 3 drives to 4 disks, when the pool has more than 1/3 free space



When creating my RAID-Z pool on ZoL I assumed I could easily just drop in additional disks later on. Meanwhile I have learned that this is not yet possible.




But... I had a similar problem when creating my initial pool: only 4 free SATA ports, but an old RAID5 with three 2TB disks and a new RAIDZ1 with three 4TB disks. The solution was to a) degrade the RAID5 and b) build the initial RAIDZ with a sparse file as a "virtual third drive", which was taken offline immediately after pool creation:




  1. Create sparse file: dd if=/dev/zero of=/zfs1 bs=1 count=1 seek=4100G

  2. Create the raidz pool: zpool create zfspool raidz /dev/disk1 /dev/disk2 /zfs1

  3. Immediately take the sparse file offline: zpool offline zfspool /zfs1

  4. Migrate the data to zfspool. Remove the old RAID5 disks, add the third, new 4TB disk

  5. Replace & resilver the sparse file in the pool with the actual, third drive:
    zpool replace zfspool /zfs1 /dev/disk3
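Condensed, the steps above look like this (a sketch, not a tested script; it uses the same placeholder device names as above and needs root):

```shell
# 1) Sparse 4100GB file: allocates virtually no real space until written to.
dd if=/dev/zero of=/zfs1 bs=1 count=1 seek=4100G

# 2) RAID-Z1 from two real disks plus the sparse file as third member.
zpool create zfspool raidz /dev/disk1 /dev/disk2 /zfs1

# 3) Offline the sparse file right away, before any data lands on it.
#    The pool is now DEGRADED but fully usable.
zpool offline zfspool /zfs1
zpool status zfspool

# 4) ...migrate data, swap the hardware...

# 5) Resilver the real third disk in place of the sparse file.
zpool replace zfspool /zfs1 /dev/disk3
```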




This worked out really great! Now I have learned that although ZFS does not directly support adding a single disk to a RAIDZ vdev, it does support replacing its disks one by one with larger ones.



So here is my plan. Does anybody see a flaw in it?




  • Buy a fourth 4TB disk and take one disk offline from the existing pool

  • Create 2x2TB partitions on each of these two free disks.

  • Build a RAIDz out of these four "disks": 3x2TB = 6TB net storage.

  • For performance reasons: immediately take one of the two second partitions offline


  • Migrate max. 6TB data to the new pool & destroy the old pool

  • Replace the offline "2TB disk" with a real 4TB disk from the old pool. Wait for the resilver.

  • On the drive with two active partitions: take the second 2TB partition offline and replace it with the second 4TB disk from the old pool. Wait for resilvering.

  • One by one: take a remaining 2TB partition offline, grow the partition to the full 4TB and re-add the disk to the pool. Wait for resilvering.

  • Rinse & repeat for the very last 2TB partition
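The plan above, as a hedged sketch (NOT a tested script): /dev/sdX and /dev/sdY stand for the two free 4TB disks, /dev/sdA and /dev/sdB for the 4TB disks freed from the old pool, and "newpool" for the new pool name; all of these are placeholders for the real /dev/disk/by-id paths.

```shell
# Two 2TB partitions on each of the two free disks (GPT labels):
for d in /dev/sdX /dev/sdY; do
  parted -s "$d" mklabel gpt \
    mkpart p1 1MiB 2000GiB \
    mkpart p2 2000GiB 4000GiB
done

# 4-way RAID-Z1 out of the four partitions (3x2TB = 6TB net):
zpool create newpool raidz /dev/sdX1 /dev/sdX2 /dev/sdY1 /dev/sdY2

# Immediately offline one of the second partitions, so that only one
# disk has to serve two active pool members at once:
zpool offline newpool /dev/sdY2

# ...migrate the data, destroy the old pool, free its 4TB disks...

# Replace the offlined partition with a real 4TB disk from the old
# pool, then the remaining second partition with another one:
zpool replace newpool /dev/sdY2 /dev/sdA   # wait for resilver
zpool replace newpool /dev/sdX2 /dev/sdB   # wait for resilver

# Finally, one by one: offline a remaining 2TB partition, drop the now
# unused second partition, grow partition 1 to the whole disk and bring
# it back online. The data starts at the same offset, so ZFS only has
# to resilver the differences.
zpool offline newpool /dev/sdX1
parted -s /dev/sdX rm 2
parted -s /dev/sdX resizepart 1 100%
zpool online newpool /dev/sdX1
# ...rinse & repeat for /dev/sdY1
```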



Will this work? I know that I'm more vulnerable to data loss due to the missing redundancy during the process, but I will have a backup of the most important data. Just not enough space for the whole 6TB payload.



And will ZFS automatically grow the pool to (3+1)x4TB = 12TB after the last step?



Answer



Ugly, but this would work.



Except when it doesn't ;).




  • Be very careful when specifying the partitions and when replacing the disks

  • Try it in a VM beforehand: set up the virtual disks like your hardware and dry-run it once or twice.

  • Run a scrub before you start and take a look at the S.M.A.R.T. info of the disks. You would not want to try this with an already flaky disk.




Important: You better have a tested backup on another medium or machine before trying it!



Yes, ZFS will grow the pool if the last 2TB disk or partition is replaced with a 4TB one (provided you have autoexpand=on for the pool):



zpool get autoexpand $pool

zpool set autoexpand=on $pool
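If autoexpand was still off when the last replace finished, the extra capacity can be claimed per device after the fact (a sketch; pool and device names are placeholders from the example above):

```shell
# Expand a single vdev member to the full size of its grown device,
# even if autoexpand was off during the replace:
zpool online -e zfspool /dev/disk3

# SIZE should now reflect the larger disks:
zpool list zfspool
```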



On a side note: you should not use RAID-Z on disks bigger than 2TB. The chance of hitting a read error during the resilver when replacing a faulted disk is very high. Please consider RAID-Z2.
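For comparison, a RAID-Z2 pool would be created like this (hypothetical pool and device names; with four disks it yields 2x4TB = 8TB net, since two disks' worth goes to parity):

```shell
# RAID-Z2 survives two simultaneous disk failures, which covers the
# common case of a read error showing up during a resilver.
zpool create tank raidz2 /dev/disk1 /dev/disk2 /dev/disk3 /dev/disk4
```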

