pfSense ZFS faulted corrupted drive

I’m posting this in case someone runs into the same issue. Keep in mind I’m just a regular residential user who has been using pfSense for about five years. I use one of those Qotom devices for my router. I installed it on two 64GB drives with the ZFS mirror option. This particular one I have been running for about two years. I planned on using my other Qotom for a Minecraft server but never got around to it.

Recently I ran into an issue. I had updated pfBlocker and it did its updating thing. A while later I noticed I couldn’t even get to the login page. It would time out, but I was still able to get out on the web. I had no way to properly halt the system, even right at the router. I did a hard power cycle and it went into a continuous boot loop at the BIOS screen, showing an error of A2 on the bottom right of the screen. It would literally pop up on the BIOS screen for a second, then reboot, over and over. It turns out I had a bad drive. Was it the SSD or the mSATA? It was the mSATA that was causing the loop. I had never seen anything like it, and why did it happen just then, at startup?

Luckily the other drive took over once I pulled the faulty one. The drive is under warranty, but that’s always a process, so I ordered a new one. It got here and it was time to go to work. I’m not familiar with ZFS at all. I had to go through the browser because when I SSH’d into the router, certain zpool commands were giving permission errors.
With the old drive removed, I did a pool status and came to this page. That link did me no good whatsoever.
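
For anyone reading along who wants to make sense of the pool status output, here is a minimal sketch in Python that picks the unhealthy entries out of pasted `zpool status` text. The pool name `pfSense`, the device names, and the sample output are all made up for illustration and are not from my box. It only parses text, so it can run on any machine, not just the router.

```python
# Hypothetical example of `zpool status` output for a mirror with one member missing.
SAMPLE = """
  pool: pfSense
 state: DEGRADED
config:

        NAME            STATE     READ WRITE CKSUM
        pfSense         DEGRADED     0     0     0
          mirror-0      DEGRADED     0     0     0
            ada0p3      ONLINE       0     0     0
            1234567890  UNAVAIL      0     0     0  was /dev/ada1p3

errors: No known data errors
"""

# States that mean a pool, vdev, or disk needs attention.
BAD_STATES = {"DEGRADED", "FAULTED", "UNAVAIL", "OFFLINE", "REMOVED"}


def unhealthy_devices(status_text):
    """Return (name, state) pairs for every unhealthy row in the config table."""
    results = []
    in_config = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("config:"):
            in_config = True
            continue
        if stripped.startswith("errors:"):
            in_config = False
        if not in_config or not stripped:
            continue
        fields = stripped.split()
        # Skip the header row; the second column is the state.
        if len(fields) >= 2 and fields[0] != "NAME" and fields[1] in BAD_STATES:
            results.append((fields[0], fields[1]))
    return results


if __name__ == "__main__":
    for name, state in unhealthy_devices(SAMPLE):
        print(name, state)
```

The entry marked UNAVAIL (or FAULTED) is the member that gets replaced later; the pool and mirror rows show DEGRADED simply because one of their members is missing.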

I shut it down and installed the new mSATA drive. The BIOS saw the drive but pfSense didn’t. After a bit, I fired up PartedMagic and made sure the drive had a partition table. I booted up the router and it now saw the drive at least.

After that, I searched around to figure out what to do next but wasn’t having much luck. I finally came to this site, which helped me get the new drive back online and resilvered. https://farrokhi.net/posts/2020/05/replacing-a-faulty-disk-in-zfs/
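
As a rough outline of what a disk replacement in a ZFS mirror on FreeBSD generally looks like, here is a small Python sketch that only builds the commands as text and runs nothing. It is the general shape, not necessarily the exact steps from the linked article. The pool name, disk names, old member ID, and the assumption that the ZFS partition is partition 3 are all hypothetical, so check them against your own `zpool status` and `gpart show` output first.

```python
def replacement_commands(pool, healthy_disk, new_disk, old_member, zfs_partition_index=3):
    """Return the shell commands for a mirror disk replacement, in order, as strings.

    Nothing is executed here. All arguments are placeholders supplied by the caller;
    zfs_partition_index=3 is an assumption about the partition layout.
    """
    new_member = f"{new_disk}p{zfs_partition_index}"
    return [
        # Copy the partition layout from the surviving disk to the new one.
        # -F overwrites any partition table already on the new disk.
        f"gpart backup {healthy_disk} | gpart restore -F {new_disk}",
        # Swap the missing mirror member for the new partition; this starts the resilver.
        f"zpool replace {pool} {old_member} {new_member}",
        # Re-run this to watch resilver progress.
        f"zpool status {pool}",
    ]


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    for cmd in replacement_commands("pfSense", "ada0", "ada1", "1234567890"):
        print(cmd)
```

This leaves out writing boot code to the new disk, which differs between BIOS and UEFI installs and is needed if the new drive should be able to boot on its own.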

Now it’s time for a drink!

I can answer the “why it happened at boot” question: pfSense runs most things in RAM, so you might have a completely functioning system, do a reboot, and not get it back. They even specify in the update/upgrade instructions that you should reboot first, just to make sure you don’t have a hardware problem.

As for why the mirror didn’t automatically switch over, I have no idea. That’s what it was supposed to do, which is why you configured it that way.

Thanks. I did know it mostly ran in RAM. I have the SMART status on my dashboard, and I’m in there often enough looking at traffic graphs. I would have assumed it would have issued a fail or caution on the status. Nothing is perfect. I get that for sure.