Server shuts off when I'm not home

I was finally able to build my NAS recently. Not knowing any better, I’ve been using the stock heatsink that came with the CPU. Every few weeks, I come home and the server is no longer running. At first I didn’t know what was happening, but then it started beeping while I was home. I looked up the error code and found that the CPU was overheating. When I opened the case, I found the heatsink was a little loose. I haven’t assembled a computer in about two decades, so I assumed I hadn’t mounted it correctly. I made sure that all four pins were clicked all the way in, but yesterday the server was powered down again when I got home from work. I assumed the heatsink had come loose again, but when I checked it, it was tight and secure.

Here are the specs for the server:

Motherboard: SUPERMICRO MBD-X11SSM-F-O Micro ATX LGA 1151 Intel C236
CPU: Intel Xeon E3-1225 V6
RAM: Crucial 16GB DDR4 SDRAM ECC Unbuffered DDR4 2666 (PC4 21300) Server Memory CT16G4WFD8266

I’m running the latest stable version of FreeNAS with only two jails (Plex and Nextcloud). I am happy to provide other details if needed. I’m not overclocking the server, and I barely put any load on it at all. I’d really like to get to the bottom of this shutdown issue so I can have a reliable server. Does anybody have a suggestion?

I have the same CPU and motherboard in a system and have not had overheating issues.
Is the server in an enclosed space that is heating up?
What chassis are you using?
What other fans do you have connected?
Is the CPU Fan plugged into the “A” fan port? (FANA is the CPU fan, FAN1 through FAN4 are for anything else)
In IPMI, what does it report for fan speed and CPU temperature? (Note that the temperature sensor reported in IPMI isn’t inside the CPU, but rather is on the motherboard in the middle of the socket.)
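
If the web interface is inconvenient, the same readings can usually be pulled from a shell with ipmitool. Below is a minimal Python sketch, assuming ipmitool is installed and that `ipmitool sdr type Temperature` prints pipe-separated rows ending in something like "30 degrees C". The function names are my own, and the exact column layout may differ on your board, so treat the parsing as an assumption to verify against your actual output.

```python
import subprocess


def parse_sdr_temperatures(text):
    """Turn `ipmitool sdr type Temperature` output into {sensor name: degrees C}.

    Assumes rows shaped like:
        CPU Temp | 01h | ok | 3.1 | 30 degrees C
    Rows without a numeric Celsius reading are skipped.
    """
    readings = {}
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 5:
            continue  # not a sensor row
        parts = fields[4].split()
        if len(parts) >= 3 and parts[1] == "degrees" and parts[2] == "C":
            try:
                readings[fields[0]] = float(parts[0])
            except ValueError:
                continue  # reading was not a number
    return readings


def read_temperatures():
    """Run ipmitool locally and return the parsed temperatures."""
    result = subprocess.run(
        ["ipmitool", "sdr", "type", "Temperature"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sdr_temperatures(result.stdout)


if __name__ == "__main__":
    for name, temp in read_temperatures().items():
        print(f"{name}: {temp:.0f} C")
```

Running this from cron every few minutes and appending the output to a file would give a temperature history leading up to the next shutdown. Swapping `Temperature` for `Fan` in the command should list the fan sensors, though the parser above only keeps Celsius readings.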

Did you apply the thermal grease between the CPU and the Heatsink?

Did you check the CPU usage? It’s possible there is a cryptominer running if the FreeNAS install has been compromised.

Is the server in an enclosed space that is heating up?

The server is in our open entertainment console with plenty of circulation.

What chassis are you using?

Fractal Design Define R5 Black

What other fans do you have connected?

Just the fans that came with the chassis. I think there are three; all are plugged in and blowing.

Is the CPU Fan plugged into the “A” fan port?

Yes

The heatsink came with thermal paste pre-applied, so I didn’t think I needed any extra. I’ve checked the CPU usage and other stats. Everything is low as expected.

Around Thanksgiving I made sure the pins were clicked in all the way, and for a couple of weeks afterward I checked the temperatures every day or two. The CPU was always around 30C.

What IPMI and BIOS versions are you running? My stable system has IPMI 01.45 and BIOS 2.1a. I see the latest versions are 01.58 and 2.2a.

In the IPMI, under Server Health > Event Log, you should be able to see exactly why it shut down, as well as a history of any CPU health or other alerts.
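
The event log can also be dumped from the command line with `ipmitool sel elist`, which makes it easier to search. Here is a rough Python sketch that keeps only the entries mentioning temperature, fan, power, or critical thresholds. It assumes pipe-separated rows of the form "id | date | time | sensor | event | direction"; the keyword list and function name are my own choices, not anything Supermicro defines.

```python
import subprocess

# Hypothetical keyword list; adjust to whatever you are hunting for.
KEYWORDS = ("temperature", "critical", "power", "fan")


def interesting_events(sel_text, keywords=KEYWORDS):
    """Return SEL rows whose sensor/event text contains any keyword.

    Assumes rows shaped like:
        1 | 08/04/2019 | 10:00:00 | Temperature CPU Temp | Upper Critical going high | Asserted
    """
    hits = []
    for line in sel_text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 5:
            continue  # not an event row
        # Everything after id, date and time describes the event.
        description = " ".join(fields[3:]).lower()
        if any(keyword in description for keyword in keywords):
            hits.append(" | ".join(fields))
    return hits


if __name__ == "__main__":
    result = subprocess.run(
        ["ipmitool", "sel", "elist"],
        capture_output=True,
        text=True,
        check=True,
    )
    for event in interesting_events(result.stdout):
        print(event)
```

If nothing shows up around the time of a shutdown, that in itself is a hint that the cause may be something other than a thermal trip.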

Firmware Revision : 01.48
BIOS Version: 2.2
Redfish Version : 1.0.1

Server Health -> Event Log doesn’t show anything after 2019-08-04. I don’t know why; there appears to be room for 512 entries, and only 15 are listed. Maintenance -> System Event Log appears to have current logs, but none seem very interesting.

Is the overheating message gone?
If yes, here are some other things to check:

Loose power cable. (to the wall, and internally)
Are you using a UPS?
Check for leaking capacitors on the motherboard or power supply. (Be careful: there are dangerous voltages inside power supplies, and capacitors can hold a charge for a long time, even when unplugged.)
Reseat the memory modules.
Try running with only half of the memory modules, and then with the other half.