Hello all. I moved four drives in a raidz1 from one TrueNAS box to another.
As part of decommissioning a different Unraid box, I added four more disks as another raidz1. The two vdevs are striped together in the same pool.
I know that one of these drives is SMR - a WD Red EFAX-series drive. Resilvering after a failure took a good week and a bit.
My query to the experts here is:
I would like the TrueNAS server to run a bit faster. It’d be nice to get 400 MB/s reads out of the array when running a copy - I tend to get about 250 MB/s on average, which is ‘fast enough’, but sometimes when we’re all watching something different there are complaints of occasional frame dropping.
I’d also like to add capacity more easily, but not four drives at a time - that’s prohibitively expensive, let alone time-consuming given the resilvering mentioned above.
Therefore:
Could folks suggest ways to benchmark the individual vdevs so I can understand where my bottleneck might be?
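(So far the only thing I know to try is watching per-vdev stats while a copy runs, plus a crude fio sequential read - something along these lines, where the pool name and test path are just placeholders for mine:)

```shell
# Per-vdev throughput/IOPS, refreshed every 5 seconds, while a copy is running
zpool iostat -v tank 5

# Crude sequential read test against a file on the pool; size it well above
# RAM so the ARC can't serve the whole test from cache
fio --name=seqread --filename=/mnt/tank/fio.test --rw=read \
    --bs=1M --size=64G --runtime=60 --time_based
```

Is that roughly the right approach? I gather `zpool iostat -v -l` adds per-disk latency columns too, which might be where a struggling drive shows up.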
Is there value in replacing the SMR drive with a known CMR disk and leaving the pool as two 4-drive raidz1s?
OR, while replacing the SMR disk, should I reconfigure the pool as four 2-drive mirror vdevs?
The data is mostly write once, read many (it’s the store for family photos, videos and assorted TV series, as well as the wife’s bookkeeping videos).
I am leaning toward replacing the SMR drive with a CMR one now, and reconfiguring as mirrors when there’s a drive failure - the disks are getting on a bit - but others know better than I. All advice welcome.
ZFS is a Copy-on-Write (CoW) filesystem with strong data integrity features, which makes it particularly incompatible with the nature of SMR drives:
High Random Write Workload: ZFS’s CoW nature means that when data is modified, it’s not overwritten in place. Instead, a new copy is written to a different, available location, and the metadata is updated to point to the new data. This generates a high volume of small, random writes and overwrites, which is the Achilles’ heel of SMR drives.
Performance Degradation and “Stalling”:
When the SMR drive’s CMR cache fills up due to sustained random writes (common during ZFS operations like scrubs, resilvers, or heavy I/O), the drive has to perform constant “read-modify-write” cycles to move data from the cache to the shingled regions.
This causes extreme slowdowns, often dropping performance to single-digit MB/s or even kilobytes per second. The drive can appear to “stall” or become unresponsive for extended periods (seconds or even tens of seconds per operation).
Drive Dropping from Pool/Array: Because SMR drives can become unresponsive for such long durations, ZFS may interpret this as a drive failure. The drive gets “kicked out” of the pool, leading to a degraded state.
Resilvering Nightmares: (as you noted already)
Resilvering involves reading data from healthy drives and writing it continuously to the new drive. This is a sustained, often random-ish write workload.
On SMR drives, a resilver can take an extremely long time, and the high I/O can cause the drive to stall repeatedly, leading to it being dropped from the pool. This significantly increases the risk of another drive failing during the resilver, potentially leading to complete data loss for the pool.
Data Integrity Concerns: While ZFS itself provides robust data integrity with checksums, the underlying unreliable and slow write behavior of SMR drives can lead to a higher risk of issues. If a drive is frequently stalling and being dropped, it increases the chances of problems, even if ZFS’s self-healing mechanisms try to compensate.
Lack of Communication (DM-SMR): Since DM-SMR drives don’t tell the operating system they are SMR, ZFS operates under the assumption it’s dealing with a CMR drive, which exacerbates the issues as it doesn’t try to optimize for SMR’s unique characteristics.
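As an aside: since DM-SMR drives won’t self-identify, the practical check is the model string. On WD Reds, the 2-6 TB “EFAX” models are DM-SMR, while “EFRX” (Red Plus) models are CMR. Something like this works, though device paths will vary by system:

```shell
# Print the model string for each SATA disk; on WD Reds, the 2-6 TB
# "EFAX" models are DM-SMR, while "EFRX" (Red Plus) models are CMR
for d in /dev/sd?; do
  echo "== $d =="
  smartctl -i "$d" | grep -E 'Device Model|Model Number'
done
```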
Hope this clears things up a bit. While SMR drives offer higher capacity at a lower price point, their performance limitations with random writes and the way they handle overwrites make them fundamentally ill-suited to the demanding, CoW nature of ZFS. The risks of severe performance degradation, drive drop-outs, and potential data loss during critical operations like resilvering far outweigh any cost savings.
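If you do swap the SMR drive for a CMR one, the replacement itself is straightforward - roughly the following, with the pool and device names as placeholders for yours. With the new disk attached alongside the old one, the vdev keeps its full redundancy throughout the resilver:

```shell
# Confirm the layout and find the SMR disk's identifier
zpool status tank

# Replace in place: old device first, new device second
zpool replace tank ada3 ada8

# Watch resilver progress
zpool status -v tank
```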
Thank you to everyone for your advice. I’ve decided to replace the SMR drive with a CMR one and, for the time being, sit on the striped raidz1s. When there is a failure in an array, I’ll rebuild the pool as four mirrored vdevs and increase the capacity of one of the mirrors at the same time, since I understand you can have differently sized mirror vdevs in a pool.
I might also go a bit mad and add a special metadata vdev as well, to speed up listing the very large directories.
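(The sort of thing I have in mind, if and when I get suitable SSDs - device names are placeholders, and I gather the special vdev must itself be mirrored, because losing it loses the whole pool:)

```shell
# Add a mirrored special (metadata) vdev; it must be redundant, because
# losing the special vdev means losing the entire pool
zpool add tank special mirror nvme0n1 nvme1n1

# Optionally steer small file blocks onto it too, per dataset
zfs set special_small_blocks=64K tank/photos
```

My understanding is that only newly written metadata lands on it - existing metadata stays on the raidz vdevs until it’s rewritten - so someone please correct me if that’s wrong.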
Just a minor note of thanks to Tom for his superb videos on ZFS and pointers on pools and operation. You teach rather than tell, and that’s rare, so thank you.