TrueNAS NFS Locks Up Under Heavy Load

I’ve had a TrueNAS server running since about last September and it’s been mostly stable. I had a couple of issues where NFS directories mounted on Ubuntu / Debian VMs took over 5 minutes to run a simple ls, but I think I figured that out.

I ended up storing some of my Proxmox VMs on the TrueNAS over NFS, and around 1:30 this morning Proxmox disconnected from TrueNAS for the first time. I noticed that starting at midnight, Veeam (which uses TrueNAS over NFS for backup storage) was pushing some heavy data over the network - around 2 Gbps. I tried to get Proxmox to reconnect but the connection attempts just hung. I tried manually mounting a random NFS share and that hung too. I ended up rebooting TrueNAS after Veeam was done, then rebooted and ran fsck on my VMs that are stored on NFS. All was good and I went to bed.

I woke up and found that it had happened again. Same thing: Veeam pushing heavy traffic, nfsd threads showing as D in top… So I repeated the same steps and everything is working for now. I limited Veeam’s throughput to the backup storage to 150 MB/s and increased the number of NFS threads, to buy me some time to notice the issue if it happens again and gracefully power off my VMs before rebooting TrueNAS.

I haven’t made any changes to Veeam, Proxmox, or TrueNAS in months - everything’s been stable and then this happens twice in the span of a few hours. My girlfriend and I both mount our Steam drives over iSCSI on this server, and when we play Battlefield together we can push between 1 and 2 Gbps and this has never happened. Can anybody tell me what’s going on here? Any insight would be greatly appreciated.

I’ve attached some screenshots of logs and graphs during the incident.

Screenshot from 2025-05-10 03-25-19
Screenshot from 2025-05-10 03-27-58

I have run into this same issue. I have XCP-ng running my SR over NFS to TrueNAS and it will lock up TrueNAS completely. I know NFS is the cause because I can shut down all my VMs and my TrueNAS will stay up and running just fine. But any time I turn on my VMs it will crash in less than a day. I can’t find any logs or anything indicating it had an issue. It’s like the kernel panics and leaves no trace as to why.

EDIT:
I did find this. They might have a fix coming.

Did your lockups seem to correspond with heavy load either from your hypervisors or from somewhere else? Did you see if your NFS threads were stuck in Uninterruptible Sleep? Do you remember how much RAM you had available in your TrueNAS system when that happened? I’m curious if there are any similarities that may help us point to the root cause.
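For anyone who wants to capture those two data points the next time it hangs, here is a rough Python sketch. It assumes a Linux-based TrueNAS (SCALE) where /proc is available and the NFS server threads show up as tasks named nfsd; the function names are my own and nothing here is TrueNAS-specific tooling. It only counts nfsd threads in state D (uninterruptible sleep) and reads MemAvailable from /proc/meminfo.

```python
import os


def parse_stat(stat_text):
    """Return (comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the LAST ')' instead of splitting on whitespace.
    start = stat_text.index("(")
    end = stat_text.rindex(")")
    comm = stat_text[start + 1:end]
    state = stat_text[end + 1:].split()[0]
    return comm, state


def mem_available_kb(meminfo_text):
    """Return the MemAvailable value in kB, or None if the field is missing."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    return None


def nfsd_states(proc_root="/proc"):
    """Map PID -> state letter for every task named nfsd."""
    states = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            # The task exited while we were reading it, or the line was odd.
            continue
        if comm == "nfsd":
            states[int(entry)] = state
    return states


def report(proc_root="/proc"):
    """Print a one-shot summary; run it from cron or a loop to build a history."""
    states = nfsd_states(proc_root)
    stuck = [pid for pid, state in states.items() if state == "D"]
    with open(os.path.join(proc_root, "meminfo")) as f:
        available = mem_available_kb(f.read())
    print(f"nfsd threads: {len(states)}, in D state: {len(stuck)}, "
          f"MemAvailable: {available} kB")
```

Calling report() every minute or so and appending the output to a file on a local disk (not on the NFS share) should leave something to look at after a reboot. A thread briefly in D during normal I/O is expected; what would be interesting is all of them sitting in D at the same time.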

I only had 16 GB of RAM in my TrueNAS but it’s been stable ever since I set it up back in October.

Thank you for the thread link! I’m going to give that a read when I get back from work. I looked through quickly and it looks like there may be some similarities between what they were discussing and what just happened to me.

It was hard to tell. It was random for a long time. Sometimes I can go a week and other times it seems to happen within a day. But I figured out it was NFS because if I didn’t run my VMs, I didn’t have any issues. I didn’t capture any metrics, but I am monitoring all these things now, so we will see.