I'm currently speccing out a new backup server. It's going to be running Windows Server 2012 R2 with Direct Attached Storage, and I'm considering using Storage Spaces rather than using a RAID card. (As a ZFS fan, I can definitely see the advantages of doing away with hardware RAID if done right.)
I expect my performance-critical workloads to be mostly sequential. Since this is bulk storage of sequential data, where read performance (backup restores) is more critical than write performance (taking backups), I thought parity would be a good fit for dense, low-cost storage of backup data.
I've seen some concerning results in blogs when it comes to write performance of parity spaces, such as in this article by Derrick Wlodarz of BetaNews, as well as in this white paper from Fujitsu.
When researching this further, it seems that blog article didn't test with dedicated journal disks, which apparently can increase write performance on a parity storage space dramatically, according to this TechNet article. Unfortunately, neither of the two benchmarks I referred to earlier tested the impact of journal disks, but Microsoft claims they've seen a performance gain of 150%, which I think would put performance right where I want it to be for my application.
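For reference, the way I read that TechNet article, the SSDs are designated as journal disks when they're added to the pool (or by changing their usage afterwards). Something along these lines is what I have in mind; the pool and disk names are just placeholders, not a tested configuration:

    # Add the SSDs to the pool and mark them as dedicated journal disks
    $ssds = Get-PhysicalDisk | Where-Object MediaType -eq SSD
    Add-PhysicalDisk -StoragePoolFriendlyName "BackupPool" -PhysicalDisks $ssds -Usage Journal

    # Or, for an SSD that is already in the pool, change its usage
    Set-PhysicalDisk -FriendlyName "PhysicalDisk10" -Usage Journal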
This is all good information, but there's one piece of the puzzle I've not been able to find. The SSDs in question are only to be used for journalling (in a mirror), and from what I understand, they're just there to provide short-term stable storage for writes. As such, I don't expect them to have to be very large. At least, this is the conclusion I draw from working with ZFS and ZIL disks: the size is not critical, although in that case, larger disks may last longer under an intensive write load, since writes are spread across a larger disk.
I already understand that since everything written to the array will also be written to the journal, the journal disks need to sustain writes at the desired rate. As Microsoft puts it:
Note that the throughput of the journal disks will now be the overall throughput limit to all parity spaces created on this specific storage pool and you might trade extra capacity for performance. In other words, ensure that dedicated journal disks are very fast and scale the number of journal disks with the number of parity spaces on the pool.
What I have, however, not been able to find: are there any best practices specifically for choosing the appropriate size of the SSDs to be used as journal disks for a parity storage space?
Answer
Mate, the answer is already in the quote. You may want to add several SSDs, not big SSDs.
You were right that the journal is simply a write cache for parity, which also means it's completely useless for mirrored spaces. Even the write-back cache (WBC) is just 1 GB by default for any space; you can override that via PowerShell, but there's also a hard limit of 100 GB.
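For completeness, the override in question is a parameter you pass when creating the virtual disk, not something you set on the pool. Roughly like this; the pool name, space name and sizes are just example values:

    # Create a parity space with a write-back cache larger than the 1GB default
    New-VirtualDisk -StoragePoolFriendlyName "BackupPool" -FriendlyName "BackupSpace" `
        -ResiliencySettingName Parity -ProvisioningType Fixed -Size 20TB `
        -WriteCacheSize 8GB   # subject to the 100GB hard limit mentioned above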
Nowadays you can't even buy an SSD smaller than 120 GB. Just get a few of those and you'll be fine; that's what I did, too.
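If you want to double-check what the pool is actually using the SSDs for, a quick query like this shows the usage per disk (again, the pool name is just a placeholder):

    # Sanity check: which disks in the pool are journal disks, and how big are they
    Get-StoragePool -FriendlyName "BackupPool" |
        Get-PhysicalDisk |
        Select-Object FriendlyName, MediaType, Usage, Size |
        Sort-Object Usage |
        Format-Table -AutoSize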
I can also back this up with numbers; check out my in-depth benchmarking series about this:
TL;DR: parity spaces still suck even with a dedicated journal, just not as much. In fact, they suck even on a pure SSD array. It's a shame, really. Microsoft gives a lot of blah-blah about it, but really, if MD and ZFS can get it right, why can't they?