Thursday, August 21, 2014

mac osx - Cannot get ZFS pool created in OSX to import on Linux system




I'm experimenting with different file systems to share between Linux and OSX on a dual-booting laptop that will act as a testing platform. Even though OSX is a BSD variant, I've had a lot of trouble finding a file system that both systems support, and have settled on the implementation by OpenZFS.



The latest OpenZFS release, 0.6.3-1, is installed on both systems. I initially created a pool from within OSX using /dev/disk0s6. Everything works fine in OSX, where the pool mounts and is writable:



$ zpool status
pool: Data
state: ONLINE
scan: none requested
config:


    NAME        STATE     READ WRITE CKSUM
    Data        ONLINE       0     0     0
      disk0s6   ONLINE       0     0     0

errors: No known data errors

$ zpool import
pool: Data
id: 16636970642933363897

state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

    Data        ONLINE
      disk0s6   ONLINE


But when I export the zpool and reboot into Linux, I cannot import the pool, even with -f:




$ zpool import -f Data
cannot import 'Data': one or more devices are already in use

$ zpool import
pool: Data
id: 16636970642933363897
state: UNAVAIL
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
see: http://zfsonlinux.org/msg/ZFS-8000-5E

config:

    Data                                         UNAVAIL  insufficient replicas
      ata-Corsair_Force_GT_135004090FF015790001  UNAVAIL


Rebooting into OSX shows that the pool is not corrupt: it imports and mounts just fine.



What am I doing wrong?


Answer




After a lot of troubleshooting with some great people over at the OpenZFS GitHub, I can confirm that this is a bug.



The real problem is that I created the pool on the last partition of the disk. On Linux, this can be mistaken for corruption if the end of the partition lies close enough to the end of the disk.



ZFS writes four labels to the target device for redundancy: two at the beginning and two at the end. When ZFS evaluates the disk after booting into Linux, it encounters /dev/sda first, which provides a partial match against the two labels at the end of the disk (these actually belong to the last partition). It then erroneously concludes that the device is corrupt, because there are no labels at the start of the disk.
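To make the label geometry concrete, here is a rough Python sketch of where the four labels land and why a partition that ends near the end of the disk can look like the tail of a whole-disk vdev. It assumes the usual 256 KiB label size and that the device size is rounded down to a label boundary; the function names and the overlap check are only illustrative and are not how ZFS itself decides anything.

```python
LABEL_SIZE = 256 * 1024  # assumed size of one ZFS vdev label (256 KiB)
LABEL_COUNT = 4          # two labels at the front, two at the back


def label_offsets(device_size):
    """Return the byte offsets of the four labels on a device of device_size bytes."""
    # Assumption: the usable size is rounded down to a multiple of the label size.
    usable = device_size - (device_size % LABEL_SIZE)
    front = [i * LABEL_SIZE for i in range(2)]
    back = [usable - (LABEL_COUNT - i) * LABEL_SIZE for i in range(2, 4)]
    return front + back


def trailing_labels_overlap(disk_size, part_start, part_size):
    """Hypothetical check: do the partition's trailing labels fall inside the
    region where the trailing labels of the whole disk would be read?"""
    disk_back = label_offsets(disk_size)[2:]
    # Translate the partition-relative label offsets to absolute disk offsets.
    part_back = [part_start + off for off in label_offsets(part_size)[2:]]
    disk_lo = disk_back[0]
    disk_hi = disk_back[1] + LABEL_SIZE
    return any(off < disk_hi and off + LABEL_SIZE > disk_lo for off in part_back)


if __name__ == "__main__":
    MiB = 1024 * 1024
    KiB = 1024
    disk = 1024 * MiB  # made-up 1 GiB disk, purely for illustration
    # Last partition ends 32 KiB before the end of the disk.
    print(trailing_labels_overlap(disk, 512 * MiB, 512 * MiB - 32 * KiB))
    # Last partition ends 10 MiB before the end of the disk.
    print(trailing_labels_overlap(disk, 512 * MiB, 512 * MiB - 10 * MiB))
```

With the made-up sizes above, the partition that runs almost to the end of the disk overlaps the whole-disk label region, while the one that stops 10 MiB short does not.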



The solution was to add a buffer of at least 10MB of free space at the end of the disk.
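When repartitioning, the buffer translates into a cap on the last sector the final partition may use. The following Python sketch shows the arithmetic; it reads "10MB" as 10 MiB, and the sector counts in the example are made up, so substitute the values your partitioning tool reports.

```python
BUFFER_BYTES = 10 * 1024 * 1024  # assumption: "10MB" taken as 10 MiB


def buffer_sectors(sector_size, buffer_bytes=BUFFER_BYTES):
    """Number of whole sectors needed to cover buffer_bytes (rounded up)."""
    return -(-buffer_bytes // sector_size)


def max_last_sector(total_sectors, sector_size, buffer_bytes=BUFFER_BYTES):
    """Highest sector number (inclusive, zero-based) that the last partition
    may occupy while leaving buffer_bytes free at the end of the disk."""
    return total_sectors - buffer_sectors(sector_size, buffer_bytes) - 1


if __name__ == "__main__":
    # Hypothetical disk: 2097152 sectors of 512 bytes (1 GiB).
    print(buffer_sectors(512))
    print(max_last_sector(2097152, 512))
```

Any end sector at or below the value returned by `max_last_sector` leaves at least the requested gap before the end of the disk.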



Full details can be found here:
https://github.com/zfsonlinux/zfs/issues/2742


