Saturday, February 24, 2018

virtual machines - ZFS: very big files + compression + snapshots



I back up several virtual disks (around 4 TB in total), with several weeks of retention.



I use 4 x 4 TB disks in the computer dedicated to the primary backup. The filesystem is ZFS RAIDZ2, so 8 TB is usable.
A secondary backup on 4 x 2 TB disks (4 TB usable) is in a separate building and stores last Sunday's backup.




I manage retention with snapshots: after each backup, a snapshot is created on the primary backup filesystem, and snapshots older than 90 days are deleted. Less than 4 TB of data is modified over 90 days, so everything fits (in fact I keep the last 30 days + 9 previous weeks + 10 previous months, but that is beside the point).
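
The 90-day pruning rule described above can be sketched as follows. This is only an illustration: the snapshot naming scheme (`<dataset>@backup-YYYY-MM-DD`) and the dataset name `tank/vm` are assumptions, not something stated in the question, and the sketch only selects the names to delete rather than calling `zfs destroy` itself.

```python
from datetime import date, timedelta

# Hypothetical naming scheme: "<dataset>@backup-YYYY-MM-DD".
PREFIX = "backup-"


def snapshot_date(name):
    """Parse the date out of a name such as 'tank/vm@backup-2018-02-24'."""
    label = name.split("@", 1)[1]
    return date.fromisoformat(label[len(PREFIX):])


def expired_snapshots(names, today, max_age_days=90):
    """Return the snapshots strictly older than max_age_days, oldest first."""
    cutoff = today - timedelta(days=max_age_days)
    old = [n for n in names if snapshot_date(n) < cutoff]
    return sorted(old, key=snapshot_date)


names = [
    "tank/vm@backup-2018-02-24",
    "tank/vm@backup-2017-11-01",
    "tank/vm@backup-2017-10-01",
]
for name in expired_snapshots(names, today=date(2018, 2, 24)):
    print(name)  # candidates for deletion
```

A tiered scheme (daily, weekly, monthly) would need a different selection rule; the fixed cutoff is used here only because it matches the simple description.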



On the secondary backup I keep only one backup. I plan to implement retention there too.
I first thought of upgrading to 4 x 4 TB disks (for lack of physical space, I can't upgrade to 6 x 2 TB) and taking snapshots as on the primary backup.



Instead of upgrading the hardware, what if I use ZFS compression + snapshots on the secondary backup?
Compression would leave, say, 600 GB free. Snapshots would then give several days of retention.
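
As a rough check of "several days", the figures already given can be combined: at most 4 TB modified over 90 days on the primary, and a guessed 600 GB free on the secondary. The sketch below assumes the changes are spread evenly over time and that the snapshots hold the changed data uncompressed, so the result is a pessimistic estimate rather than a measurement.

```python
def min_retention_days(free_gb, changed_gb, period_days):
    """Days of snapshots that fit in free_gb, assuming changed_gb is
    rewritten evenly over period_days (an assumption, not a measurement)."""
    daily_change_gb = changed_gb / period_days
    return free_gb / daily_change_gb


# 600 GB free (the guess above), under 4 TB (4000 GB) changed in 90 days.
days = min_retention_days(free_gb=600, changed_gb=4000, period_days=90)
print(f"{days:.1f} days")  # 13.5 days
```

Since the real change rate is below 4 TB per 90 days and the retained blocks would themselves be compressed, the actual retention should be somewhat longer.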



The saved virtual disks are updated with rsync, so only small parts are modified. I therefore expect that only those small parts are "transmitted" to the snapshots, but I can't find any source confirming that this works the way I think.



Question: using ZFS on Linux with compression, will very big files with scattered modifications be snapshotted efficiently?



Answer



You should be using ZFS compression (with compression=lz4) by default these days. There's no good reason not to use it, unless you know that your data is not compressible.
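
A minimal sketch of enabling lz4 and checking the result from a script. The dataset name `backup/vm` is hypothetical, and the helpers only build the `zfs` command lines; `run` is left for a host that actually has ZFS installed. Note that compression applies only to blocks written after the property is set, so existing data stays uncompressed until it is rewritten.

```python
import subprocess


def enable_lz4_cmd(dataset):
    """Command that turns on lz4 compression for newly written blocks."""
    return ["zfs", "set", "compression=lz4", dataset]


def compressratio_cmd(dataset):
    """Command that prints only the achieved compression ratio, e.g. '1.45x'."""
    return ["zfs", "get", "-H", "-o", "value", "compressratio", dataset]


def run(cmd):
    """Run a command and return its trimmed output; raises if it fails."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


# On a machine with ZFS, with a hypothetical dataset name:
#   run(enable_lz4_cmd("backup/vm"))
#   print(run(compressratio_cmd("backup/vm")))
print(" ".join(enable_lz4_cmd("backup/vm")))
```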



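Snapshots on compressed ZFS filesystems are still efficient: ZFS is copy-on-write at the block level, so a snapshot only retains the blocks that are overwritten afterwards, whether or not they are compressed. They also work with replication and/or rsync.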
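
To make "efficient" concrete, here is a small model of how much old data a snapshot keeps alive after scattered writes into a big file. It is a sketch under assumptions: a recordsize of 128 KiB (the ZFS default), writes that land in place in the existing file, and no metadata overhead. By default rsync builds a new copy of a changed file and renames it, which rewrites every block, so the model only applies if rsync is run with its `--inplace` option. The write offsets are made up for illustration.

```python
RECORDSIZE = 128 * 1024  # ZFS default recordsize; adjust if the dataset differs


def touched_records(writes, recordsize=RECORDSIZE):
    """Return the indices of records overlapped by (offset, length) writes."""
    records = set()
    for offset, length in writes:
        if length <= 0:
            continue
        first = offset // recordsize
        last = (offset + length - 1) // recordsize
        records.update(range(first, last + 1))
    return records


def snapshot_overhead_bytes(writes, recordsize=RECORDSIZE):
    """Upper bound (before compression) on old data a snapshot keeps alive."""
    return len(touched_records(writes, recordsize)) * recordsize


# Three small scattered writes inside a large disk image (hypothetical offsets).
# The last one straddles a record boundary, so it touches two records.
writes = [(0, 4096), (10 * 2**30, 4096), (RECORDSIZE - 1, 2)]
print(snapshot_overhead_bytes(writes))  # 393216 bytes = 3 records
```

The cost scales with the number of records touched, not with the size of the file, which is why a multi-terabyte image with small scattered changes stays cheap to snapshot.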

