Ever since I moved my TrueNAS to new hardware a couple of weeks ago, it’s been locked up (frozen) 3 times when I come home. No connection to network drives; powering on the display shows the menu, but the normally blinking cursor is gone (indicating the machine is frozen). The last run lasted about a week.
Last night I powered it off to do some testing on the PSU, and when done, I powered it back on and it booted up normally. It was even fine this morning. Came home this evening: frozen.
Right now, it isn’t doing anything except the drive shares (Jellyfin is disabled due to my network card issue; I will re-enable it when the new NIC comes in and I can point it toward something more permanent than a USB dongle), so it’s not stressed at all.
Will it have saved any logs that may indicate any faults that occurred right before it froze?
TrueNAS Scale ver 24.10.2.1
Motherboard: MSI Pro B550M-VC Wifi
CPU: AMD 3950x
RAM: x2 Crucial Pro RAM 64 GB Kit (2x32GB) DDR4 3200MHz (CP2K32G4DFRA32A), 128GB total
Boot drive: Crucial P310 500GB NVMe
Storage drives: x2 SEAGATE IRONWOLF PRO 16TB NAS HDD
PSU: Corsair RM850x
Wonder if the new hardware pissed it off somehow? Granted, I did do a small update to it before transferring it, but previously it never had that issue on the old hardware.
Have you run any memory and burn-in tests? Right now it could be anything. You need to narrow it down and rule out some things if possible.
There’s a chance some hints are in the logs. It’s not a guarantee, since the last log writes may not have had the chance to be flushed to disk before the freeze, but it’s worth a look.
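For looking at what was logged before the last freeze, here is a minimal sketch that builds a `journalctl` command for the previous boot. It assumes the system uses the systemd journal and that the journal is kept across reboots, which I have not verified for this TrueNAS release. The function name and defaults are just illustrative.

```python
def previous_boot_log_cmd(boot_offset=-1, priority="warning", lines=200):
    """Build a journalctl command that shows the tail of an earlier boot's log.

    boot_offset=-1 means the boot before the current one (assumes a
    persistent journal; otherwise older boots are not available).
    """
    return [
        "journalctl",
        f"--boot={boot_offset}",    # which boot to read
        f"--priority={priority}",   # this severity and anything worse
        f"--lines={lines}",         # only the last N entries
        "--no-pager",
    ]


def as_shell(cmd):
    """Render the command list as a string you can paste into a shell."""
    return " ".join(cmd)
```

`as_shell(previous_boot_log_cmd())` gives a line you can paste into the TrueNAS shell. The entries closest to the end are the ones nearest the freeze, though the very last messages may be missing if they never reached the disk.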
Something else I’ve encountered, especially on older network cards, is the TCP/IP hardware checksum offload bugging out and locking the card until it is reset. That wouldn’t freeze the whole system, as appears to be happening in your case, but it’s another avenue worth exploring: temporarily disable the hardware offloading and see if it makes any difference.
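As a rough sketch of the offload test, the snippet below builds an `ethtool -K` command that turns off the common offload features. The interface name `enp5s0` is a placeholder (check yours with `ip link`), and the feature list is my assumption of what is worth toggling, not something specific to your card. Settings made this way normally do not survive a reboot, which is convenient for a temporary test.

```python
import subprocess

# Offload features to switch off for the test (an assumed, common set).
OFFLOAD_FEATURES = ("rx", "tx", "tso", "gso", "gro")


def disable_offload_cmd(interface, features=OFFLOAD_FEATURES):
    """Build an `ethtool -K` command that disables the given offload features."""
    cmd = ["ethtool", "-K", interface]
    for feature in features:
        cmd += [feature, "off"]
    return cmd


def apply(cmd):
    """Run the command; needs root and ethtool installed on the host."""
    subprocess.run(cmd, check=True)
```

For example, `disable_offload_cmd("enp5s0")` produces `ethtool -K enp5s0 rx off tx off tso off gso off gro off`. Run `ethtool -k enp5s0` afterwards to confirm what actually changed, since some drivers mark certain features as fixed.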
Edited to add: maybe you can send the logs out in real time to a different server. It may be a way to capture any information prior to the crash. Just something basic like centralized rsyslog would do; no need to get too fancy.
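If there is no second rsyslog box handy, a throwaway receiver on any other machine can do for a test. This is only a sketch: it listens for plain UDP syslog and appends each message to a file. The port `5514`, the file name, and the address in the comment are all made up for illustration, and it is not a replacement for a real rsyslog server.

```python
import re
import socketserver
from datetime import datetime

LOG_PATH = "truenas-remote.log"   # hypothetical output file
PRI_RE = re.compile(r"^<(\d{1,3})>")


def decode_pri(message):
    """Split the leading <PRI> value of a syslog line into (facility, severity)."""
    match = PRI_RE.match(message)
    if not match:
        return None
    pri = int(match.group(1))
    return pri // 8, pri % 8


class SyslogHandler(socketserver.BaseRequestHandler):
    def handle(self):
        data, _sock = self.request   # UDP servers pass (bytes, socket)
        text = data.decode("utf-8", errors="replace").rstrip()
        stamp = datetime.now().isoformat(timespec="seconds")
        # Open and close per message so every line reaches the file right away.
        with open(LOG_PATH, "a", encoding="utf-8") as fh:
            fh.write(f"{stamp} {self.client_address[0]} {text}\n")


def serve(host="0.0.0.0", port=5514):
    """Block forever, writing every received syslog datagram to LOG_PATH."""
    with socketserver.UDPServer((host, port), SyslogHandler) as server:
        server.serve_forever()

# On the sending side, an rsyslog rule such as
#   *.* @192.0.2.10:5514
# forwards everything over UDP (a single @) to this receiver.
```

Call `serve()` on the receiving machine and leave it running. After the next freeze, the tail of the file shows the last messages that made it off the box, even if they never reached the local disk.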
I’ve been having the same issue since late in major version 24 and haven’t been able to track the cause down. It just freezes once every few days while idle (maybe during a snapshot or backup? I haven’t wanted to go so far as to turn those off to test).
The one thing I’ve found that made it significantly worse was adding an instance. I created an Ubuntu Noble instance, didn’t even do anything with it yet, and started getting crashes every few hours.
Obviously you should do all of the necessary hardware checks on a new build, but TrueNAS does seem to have a bug of some sort in it. Hopefully they get it fixed soon.