Sunday, January 7, 2018

Is it safe to send ZFS-snapshots incrementally ignoring some intermediate snapshots?






I'm using ZREP to replicate two servers with each other. Each server contains one ZFS-pool holding two datasets acting as replication master and two datasets acting as replication target. The master sets contain the system and the VirtualBox-VMs of the local server, the replication targets contain the same from the other one.



Additionally, I'm backing up all master sets per server to a NAS using rsync. The NAS is pretty slow and a backup takes hours to complete, so the implemented approach is to suspend the VMs, create one snapshot, resume the VMs and let rsync run from the created snapshot. The important thing is that the manually created snapshot didn't follow the ZREP naming convention and was destroyed directly after rsync finished. At first, ZREP continued to operate concurrently, started by cron.





But from time to time ZREP got into a state in which it wasn't able to sync anymore. To resolve that, a coworker told me that he needed to delete snapshots and initialise ZREP again from scratch. In the end, the problem was fixed by no longer letting ZREP run in parallel with rsync and our own snapshot.




Sadly I lack the concrete details of that error and the coworker is not available anymore, but from his description it sounded like there was a problem with finding common ancestors of snapshots between replication master and target to sync incrementally. I think the error messages were something like the following:



cannot receive incremental stream: most recent snapshot of zfs-pool/vbox/tori does not match incremental source
cannot open 'zfs-pool/vbox/tori@zrep_0001b7': dataset does not exist




From my understanding of the docs and other questions, to successfully send snapshots incrementally, the sending master and the receiving target need to share the one snapshot which is used as argument 1 to zfs send, and that snapshot additionally needs to be the most recent one on the receiving target.




The second argument is an arbitrary newer snapshot, used by ZFS to calculate differences to the one snapshot master and target have in common and send those differences to the replication target. Because both share the same snapshot specified as argument 1, the differences make sense to the target and can simply be applied as is.



From my understanding, -i vs. -I decides whether one logical snapshot is sent, containing all calculated incremental data of the master side, OR all intermediate snapshots are sent, each containing its own incremental changes. So e.g. -i always leads to ONE new snapshot on the target, while -I might lead to N additional ones.
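To make the -i vs. -I distinction concrete, here is a minimal Python sketch of which snapshots the target would gain in each case. It is only a toy model working on snapshot names, not a call into ZFS, and the function name received_snapshots as well as the snapshot names zrep_0001, rsync_backup and zrep_0002 are made up for illustration.

```python
def received_snapshots(source_snaps, base, new, include_intermediates):
    """Return the snapshot names a target would gain from an incremental send.

    source_snaps: snapshot names on the sender, oldest first (hypothetical).
    include_intermediates=False models `zfs send -i base new`.
    include_intermediates=True  models `zfs send -I base new`.
    """
    start = source_snaps.index(base)
    end = source_snaps.index(new)
    if start >= end:
        raise ValueError("base must be older than new")
    if include_intermediates:
        # Every snapshot after base, up to and including new.
        return source_snaps[start + 1:end + 1]
    # Only the final snapshot; anything in between is skipped.
    return [new]


# Assumed sender state: one manual snapshot between two ZREP snapshots.
snaps = ["zrep_0001", "rsync_backup", "zrep_0002"]
with_i = received_snapshots(snaps, "zrep_0001", "zrep_0002", False)
with_I = received_snapshots(snaps, "zrep_0001", "zrep_0002", True)
print(with_i)
print(with_I)
```

In this model the -i case hands over only zrep_0002, while the -I case would also carry the manual rsync_backup snapshot across if it still exists on the sender.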



Creating and destroying intermediate snapshots between what gets provided as arg 1 and 2 to zfs send -i shouldn't be any problem, because ZFS always calculates differences only between those two provided arguments and doesn't care about any other intermediate snapshots. In case of ZREP that means, in theory, that as long as I'm not interfering with ZREP-managed snapshots, it shouldn't make any difference whether additional snapshots get created during its operation or not, simply because the special ZREP-snapshots are always available, managed by ZREP and used to calculate the differences for replication. So in theory, additionally creating snapshots for rsync and backup shouldn't be a problem at all.



Are those assumptions correct?






Is it safe in general to send ZFS-snapshots incrementally ignoring some intermediate ones? Or is it necessary to send ALL intermediate snapshots ever created to the replication target to not get out of sync? How do things depend on -i vs. -I?


Answer




Yes, you still get all the data in between, you just can't rewind to an in-between state.



If you have snapshots 1, 2 and 3 and the remote pool only has snapshot 1, you can give it snapshot 3 and skip 2. It just won't be able to roll back to the '2' state. But the data will still be there.



The snapshots describe what was there at the time. So with snapshot '2' missing on the remote pool, it's like you never took one at that point in time. It literally doesn't know about the '2' snapshot and what stuff looked like back then.




If you change your mind, you'll need to delete snapshot '3' on the remote pool and only then can you send '2', then '3' again.
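The sequence above can be sketched with a small Python model. It assumes only the rule quoted in the question's error message, namely that an incremental stream is accepted only if its source is the most recent snapshot on the target. The class TargetDataset and its methods are hypothetical and track snapshot names only; they ignore the live filesystem contents, which on a real pool may also have to be rolled back to snapshot '1' before the receive is accepted.

```python
class TargetDataset:
    """Toy model of a replication target: an ordered list of snapshot names."""

    def __init__(self, snapshots):
        self.snapshots = list(snapshots)  # oldest first

    def receive_incremental(self, base, new):
        # Modelled rule: the incremental source must be the newest snapshot here.
        if not self.snapshots or self.snapshots[-1] != base:
            raise RuntimeError(
                "most recent snapshot does not match incremental source"
            )
        self.snapshots.append(new)

    def destroy(self, name):
        self.snapshots.remove(name)


target = TargetDataset(["1"])
target.receive_incremental("1", "3")      # skipping '2' is accepted

try:
    target.receive_incremental("1", "2")  # too late: '3' is now the newest
    rejected = False
except RuntimeError:
    rejected = True

target.destroy("3")                       # back to the common snapshot '1'
target.receive_incremental("1", "2")
target.receive_incremental("2", "3")
print(rejected, target.snapshots)
```

In the model, sending '2' after '3' is rejected until '3' has been destroyed on the target, which matches the order of steps described in the answer.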




https://www.reddit.com/r/zfs/comments/cfzdb3/is_it_safe_to_send_zfssnapshots_incrementally/euensuy/

