TrueNAS Pool Degraded

I’ve been watching Tom’s videos on TrueNAS for a while, and my New Year’s resolution was to finally try to set up a server. I got three new 8TB Seagate IronWolf drives for one pool and a renewed 18TB Seagate Exos for a second pool to back up to daily or weekly.

Once everything was set up, I started copying all my files from my Synology to the Exos drive. It had a few errors, and when I clicked retry it was able to finish copying. I moved the server to a second location. Then I tried copying all the files from the Exos to the IronWolf pool and it again had some errors copying, but this time clicking retry didn’t work. I ended up with about 80% of the files.

Over the last couple of days the status for both pools has gone to “Degraded”. The used drive makes sense, but the three new drives surprise me. I’ve tried reconnecting all the cables. When I go to the TrueNAS console and run smartctl -a on each drive (/dev/ada1 through /dev/ada4), all drives show errors under “raw_read_error_rate” (100 to 200 million). I’ve tried copying 50GB of data from my laptop to the three-drive IronWolf pool over and over and never got a copy error. (UPDATE: I found and tried the command zpool status -v and see that all the copied files have errors.)
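For anyone following along, the full smartctl -a output is long, so it can help to pull out just a few attributes per drive. The sketch below assumes the attribute names that smartctl normally prints for ATA drives (Reallocated_Sector_Ct, Current_Pending_Sector, Offline_Uncorrectable, UDMA_CRC_Error_Count). The helper name smart_summary is made up for illustration, and the device names are the ones mentioned above.

```shell
# Print only a handful of SMART attributes for each device given.
# smart_summary is a hypothetical helper name, not part of smartmontools.
smart_summary() {
    for dev in "$@"; do
        echo "== $dev"
        # -A prints the vendor attribute table only.
        smartctl -A "$dev" | grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count'
    done
}

# Usage (run as root on the TrueNAS console):
#   smart_summary /dev/ada1 /dev/ada2 /dev/ada3 /dev/ada4
```

As a rough guide, a growing UDMA_CRC_Error_Count generally points at the link (cable or port), while reallocated or pending sectors point at the disk itself.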

What would you recommend I try or look at next? I’m 99% sure I’ll send the used drive back, unless I’m missing something and it’s just user error. I only have 2 weeks to figure out if the 3 Ironwolf drives are bad so I can send them back.

I’d also take recommendations for a better pool setup since this is my first time.

Thanks for any help you can give!

Run a scrub, see if that clears the errors. But if they keep coming back it may be some issue with the controller you have running the drives or some other hardware issue.
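If you prefer the console to the GUI, the scrub and the follow-up check can be sketched like this. The function name scrub_and_report and the pool name tank are placeholders, not names from this thread.

```shell
# Start a scrub on a pool, then show its status with per-file error details.
# scrub_and_report is a hypothetical helper; "tank" below is a placeholder pool name.
scrub_and_report() {
    pool=$1
    zpool scrub "$pool" || return 1
    # The scrub runs in the background. status shows its progress and,
    # with -v, lists any files that have permanent errors.
    zpool status -v "$pool"
}

# Usage:
#   scrub_and_report tank
```

Running zpool status -v again after the scrub finishes shows whether the error counts stayed put or kept growing.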

Thanks for the reply, Tom. I tried running a scrub over the weekend and it didn’t fix the errors; it only found more corrupt files. I’m using an old gaming PC (Asus Strix Z390-E). Would getting a PCIe SATA card bypass the controller? I got a 6 pack of Benfei SATA cables for $9 (5 stars on Amazon with 34k reviews). Is it possible they’re bad?

BTW, the last time I heard the word “Scrub”, it was in a TLC song!

I don’t know of any particular incompatibility with that board, but you could try cables as well. Loose connections will cause those issues. Or maybe the drives have issues.

Have you run a SMART test on that 18TB drive?

Yes, I ran a long SMART test through the GUI and the 18TB drive had around 100 million under “raw_read_error_rate” when I used smartctl -a /dev/ada4. When I checked the three 8TB drives, they had 100-200 million under “raw_read_error_rate”. This is my first time looking at SMART results, but it’s first on the list so it seems like it would be the most important.
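A note on reading that number: on Seagate drives the raw value of Raw_Read_Error_Rate is commonly reported to be a packed 48-bit field rather than a plain error count, with the error count in the upper 16 bits and the number of read operations in the lower 32 bits. That layout is an assumption here and is not confirmed anywhere in this thread. If it holds, a raw value in the low hundreds of millions decodes to zero errors. A minimal sketch of the decoding, using 150000000 as an illustrative value in the range mentioned above:

```shell
# Split a raw SMART value into two fields.
# ASSUMPTION: upper 16 bits = error count, lower 32 bits = operation count
# (a commonly reported Seagate layout, not verified in this thread).
decode_seagate_raw() {
    raw=$1
    errors=$(( raw >> 32 ))
    operations=$(( raw & 4294967295 ))   # 4294967295 = 2^32 - 1
    echo "errors=$errors operations=$operations"
}

decode_seagate_raw 150000000
# prints: errors=0 operations=150000000
```

Any value below 4294967296 (2^32) leaves the upper field at zero under this reading, so the normalized VALUE, WORST and THRESH columns are usually a better guide than the raw number.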

Since my last post, I unplugged the 18TB drive and removed all pools. I made a new pool on the three 8TB drives with two different datasets. I copied 100GB from my laptop to the first dataset with no problem and the speed was 114MB/s. Then I tried copying it from one dataset to the other and the speed fluctuated from 400MB/s to 50MB/s and I had to retry on several files.

I’ve ordered new SATA cables and a 4 port PCIe card. Just waiting for them to arrive. If that also has issues, then I’ll assume the 3 new 8TB drives are bad and send it all back to Amazon.

I got the new SATA cables and still have problems. So I had the idea to make individual pools with each 8TB drive. I’m able to transfer 100GB with no issues from my laptop to each pool, but when I try to copy from one pool to another I get errors. I thought maybe only one of the drives might be bad and that was causing the RAID to fail, but since copying between any of them causes errors, it must be the controller on the motherboard, right? Or maybe I got three bad drives?
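One way to make a pool-to-pool test like this repeatable is to compare every copied file byte for byte against its source after the copy. This is only a sketch: verify_copy is a made-up helper name, and the paths in the usage line are placeholders for wherever the two pools are mounted.

```shell
# Compare each file under a source directory with its copy under a
# destination directory and print the ones that differ or are missing.
# verify_copy is a hypothetical helper name.
verify_copy() {
    src=$1
    dst=$2
    # List files relative to the source directory, one per line.
    (cd "$src" && find . -type f) | while IFS= read -r rel; do
        # cmp -s is silent and returns non-zero on any difference,
        # read error, or missing file.
        if ! cmp -s "$src/$rel" "$dst/$rel"; then
            echo "MISMATCH: $rel"
        fi
    done
}

# Usage (placeholder paths):
#   verify_copy /mnt/pool1/test /mnt/pool2/test
```

No output means every file matched. Running the same comparison several times shows whether the failures land on the same files or move around.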

The PCIe SATA card will arrive soon. Will that bypass the motherboard controller?

I got the PCIe SATA card and still get errors while transferring data between pools/drives. I’ve ordered an external USB dock to see if my laptop can read/write to each drive without errors.

If the external dock has no issues, is there a motherboard anybody can recommend that they’ve had good luck with?

Thanks for any help or recommendations you can give!

When the USB dock arrives, run the Seagate tools on the drives and see if it can find any issues. You’ll need the report for a warranty claim.

What card did you buy? Knowing the mainboard, processor, and RAM might help too. Also, how big is the power supply, and could it be having a problem under the fairly small load you are giving it? I would like to think that with only 4 drives you could get away with a 250 watt supply, but you might need more.

Thanks for the reply Greg,

The 4 port SATA card is by Jesot. The power supply is from an old gaming PC, it’s an EVGA 1000W.

I got the USB dock and have been able to read/write files at 200MB/s with no issues. I’ll try running the Seagate tool now and see what it shows.

I have not worked with that drive controller. Is it on the approved list for TrueNAS? Mostly I’ve been fortunate enough to use LSI brand controllers and whatever is on board the servers I’ve been using (often Intel RST).

Everything is good now.

The short version is, I got a new motherboard, CPU and RAM and everything works now!

The long version is, I installed TrueNAS Core and when I tried to boot from the SSD, it wouldn’t work. So I updated the BIOS to the latest version and it still didn’t work. I tried disabling all the “secure boot” stuff and when I rebooted, it was all turned back on. So I flashed the oldest BIOS version I could find and then TrueNAS Core booted! The only problem was it didn’t recognize the 2.5Gb Ethernet. So I installed TrueNAS Scale and everything is working now. I’ve done a solid hour of reading and writing with no errors.

Why would Core and Scale have different Ethernet support?

Core is based on FreeBSD and Scale is based on Debian Linux, so the two ship different hardware drivers.