Reliable pfSense hardware

Hi,

I’ve installed many Netgate routers in remote places across Australia. These locations can take days to drive to. The power in these locations could be more reliable. Each site has a UPS but can suffer weekly or daily power loss.

Some of the SG1100 units I installed (plus one SG5100 and one SG3100) have become corrupted and needed reinstallation. More recently, a couple of the SG1100 units have had the eMMC fail.

I love pfSense, but my experience with the hardware has shaken my faith in the product. I’ve not known so many issues. I’m close to giving up on pfSense. Right now, I need reliable routers more than I need features.

Can anyone suggest bulletproof hardware?

eMMC flash memory typically does not have the same durability and fault tolerance as an SSD.
Understanding Multilayer SSDs: SLC, MLC, TLC, QLC, and PLC

  1. Use appliances that have some form factor of SSD/NVMe interface (mPCIe, M.2, SATA, etc.)
  2. Use name-brand SLC or MLC flash SSDs. Stay away from flash technologies that have lower P/E cycle ratings.
  3. Make sure pfSense is monitoring the UPS directly as a NUT Master and configure it to shut down last. Configure all other clients on the same UPS as NUT Slaves and configure them to shut down first.
  4. Make sure to routinely test the UPS and proactively replace batteries on a schedule.
  5. Set up automatic local and offsite backups of the pfSense config. I believe Netgate has something built-in. Or you can set up your own cron job to back up to a USB drive or NAS.
  6. For sites with critical needs, consider setting up pfSense HA.
  7. Consider some sort of solar backup for remote sites with unreliable power.
  8. Make sure there is adequate cooling and ventilation where the appliances are located.
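
For the cron-job route in point 5, here is a minimal sketch of what such a job could run. It assumes the config lives at /cf/conf/config.xml (the usual pfSense location) and that Python is available on the box; the destination directory, the retention count and the function name `backup_config` are my own placeholders, not anything Netgate ships.

```python
import shutil
import time
from pathlib import Path


def backup_config(src, dest_dir, keep=30, stamp=None):
    """Copy src into dest_dir under a timestamped name, then prune old copies.

    `stamp` defaults to the current local time; two runs within the same
    second would write to the same file name.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    src, dest_dir = Path(src), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"config-{stamp}.xml"
    shutil.copy2(src, dest)

    # Timestamped names sort chronologically, so everything except the
    # last `keep` entries is an old backup that can be removed.
    backups = sorted(dest_dir.glob("config-*.xml"), key=lambda p: p.name)
    for old in backups[:-keep]:
        old.unlink()
    return dest


# Hypothetical usage from a cron job (both paths are assumptions):
# backup_config("/cf/conf/config.xml", "/mnt/usb/pfsense-backups", keep=30)
```

This only covers the local copy. Getting the files offsite (NAS, rsync, etc.) would be a separate step.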

I personally have never had an issue with appliances from https://protectli.com

I have more dead Protectli devices than I do Netgate devices. If you go with the 4200 and up, they all have ZFS (or can be reloaded to use ZFS), which makes them much more tolerant of random power loss.

Cold, pre-configured spares seem the safest option? If a site fails, swap in the cold spare. Set up the devices so you can VPN in, and have a local person connect the borked unit to another NIC (a separate subnet, with each unit on a fixed IP in that subnet) so you can diagnose it remotely?
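
To make the "separate subnet, fixed IP per unit" idea concrete, here is a rough sketch of how the addressing could be planned across sites. The 10.99.0.0/16 range, the /30 size, the site names and the function name `diag_plan` are all made up for illustration; use whatever does not clash with your existing networks.

```python
import ipaddress


def diag_plan(sites, supernet="10.99.0.0/16", prefix=30):
    """Give each site its own small diagnostic subnet with two fixed IPs.

    One address is for the primary unit and one for the spare, so whichever
    unit is in service can reach the other over the diagnostic NIC.
    """
    subnets = ipaddress.ip_network(supernet).subnets(new_prefix=prefix)
    plan = {}
    for site, subnet in zip(sites, subnets):
        hosts = list(subnet.hosts())
        plan[site] = {
            "subnet": str(subnet),
            "primary": str(hosts[0]),
            "spare": str(hosts[1]),
        }
    return plan


# Hypothetical site names:
for site, addrs in diag_plan(["site-a", "site-b"]).items():
    print(site, addrs)
```

With both units pre-configured this way, the local person only has to patch a cable between the two diagnostic ports.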

HA - power glitch could kill two devices I guess - I’ve seen PCs blow even though UPS protected - albeit with really dodgy power grids in Africa!

How remote? Central Australia?
Are the locations these devices are in cooled? As in, kept below 25 deg C?

So when the power goes down, what happens to the temperature in those locations?

We know that central Australia goes way over 45 deg C.

But yes, I agree with [elvisimprsntr].

Get a good UPS and look at HA.

Some Netgate devices also come with the option for an M.2 SSD. Would this maybe be a more tolerant option than the eMMC? I also recently had an eMMC fail on a 6100 (with ZFS).

Edit: found some answers about SSD here: Netgate 6100 base or max? Logging and eMMC longevity?
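
On the longevity question, a back-of-the-envelope estimate can help compare options. The sketch below just divides the total rated flash writes (capacity times P/E cycles) by the daily writes after write amplification. Every number in the example is an illustrative assumption, not a measured or vendor figure; check the datasheet for real P/E ratings and your own logging volume for daily writes.

```python
def estimated_lifetime_years(capacity_gb, pe_cycles, host_writes_gb_per_day,
                             write_amplification=2.0):
    """Rough flash lifetime: rated NAND writes / actual NAND writes per day."""
    if host_writes_gb_per_day <= 0 or write_amplification <= 0:
        raise ValueError("writes per day and write amplification must be positive")
    total_nand_writes_gb = capacity_gb * pe_cycles
    nand_writes_per_day_gb = host_writes_gb_per_day * write_amplification
    return total_nand_writes_gb / nand_writes_per_day_gb / 365


# Illustrative only: same assumed workload (5 GB/day, amplification 4)
# and same assumed 3000 P/E cycles, small eMMC versus a larger SSD.
small_emmc = estimated_lifetime_years(8, 3000, 5, 4)
larger_ssd = estimated_lifetime_years(128, 3000, 5, 4)
print(f"8 GB eMMC:  {small_emmc:.1f} years")
print(f"128 GB SSD: {larger_ssd:.1f} years")
```

Under those assumptions the small eMMC comes out at roughly 3.3 years and the larger SSD at several decades, which is mostly a capacity effect: the same writes are spread over far more flash.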

Hi @Jaybird3043, welcome to the forum. Very sorry to hear about your 6100. Yes, the SSD option is worth it. It costs only a little more $$$ €€€ and it’s replaceable. eMMC is not, and the unit is worthless if it fails. So the SSD option is basically a kind of insurance.
I’m still running a 2100 with a 128GB SSD, very nice.

Hi @CableDude, thank you for your insights! I agree that it is nice insurance and will definitely add it to my next device.

I could probably add to the discussion too. I’ve had the best success using just standard off-the-shelf computer hardware. I’ve operated an HP Thin Client in a hot and humid shed in North Queensland for the last 3 years with no hardware failures, and with no air-con. A system with an SSD, a cooling fan for airflow and adequate heat-sink paste is more than capable of running in hot environments.

I wouldn’t want to run a fanless system in remote Australia though (the likes of Mt Isa/Julia Creek). My thinking is that airflow is important when there’s no AC available.

Given that all media will fail at some point and so one just has to make a plan for when that happens, is there any real world experience on making the boot media fault tolerant via mirroring in ZFS or otherwise?
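
Not an answer from experience, but one part of making a mirror useful is noticing when half of it has died, so the failed disk gets replaced before the second one goes. A minimal sketch, assuming the box runs ZFS and that `zpool list -H -o name,health` is available (it prints one tab-separated `name health` line per pool). The function names are my own, and how you alert on the result is left open.

```python
import subprocess


def parse_pool_health(output):
    """Turn `zpool list -H -o name,health` output into {pool_name: health}."""
    pools = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            pools[fields[0]] = fields[1]
    return pools


def unhealthy_pools(output):
    """Return only the pools whose health is anything other than ONLINE."""
    return {name: health
            for name, health in parse_pool_health(output).items()
            if health != "ONLINE"}


def read_pool_health():
    """Run zpool and return its raw output (requires ZFS on the host)."""
    result = subprocess.run(
        ["zpool", "list", "-H", "-o", "name,health"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical usage from a scheduled job:
# bad = unhealthy_pools(read_pool_health())
# if bad:
#     ...send an alert however you normally do...
```

A mirror that has lost one member reports DEGRADED rather than ONLINE, so it would show up in `unhealthy_pools` while the router keeps running on the surviving disk.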

I had tried using USB sticks for very simple low-write configs (primarily since a non-skilled user at a remote site can easily swap in a cloned backup stick), but even premium sticks failed regularly, so I am looking for a balance between reliability, cost and recovery from failures without skilled intervention.

This would be very nice.