Virtualize or not? Pros/Cons - OPNsense & TrueNAS

For my first homelab attempt, I virtualized my OPNsense and TrueNAS machines on Proxmox.

The goals were to 1) make backup/restore easier and 2) allow moving VMs between hosts, provided the hardware was similar enough to support them.

In theory, this would allow me to easily move my OPNsense VM between two identical physical ChinoSpecial devices. For TrueNAS, I guess there isn’t much benefit to moving hosts, but backup might be better with virtualization.

Theory doesn’t always work out as well as hoped. I had trouble moving OPNsense between boxes because of network mappings, even though I thought I had set up both devices exactly the same.

The question of interest for me now: does the community see benefits in virtualizing specialized appliances like OPNsense and TrueNAS, or is it better to just run them on bare metal?

I’m thinking of trying out XCP-ng rather than Proxmox, if that matters in the decision.

Virtualizing those appliances is possible, but I don’t think it should be done, because it adds unnecessary complexity and makes support harder. You’ll probably get other users on here saying they don’t have any issues and it’s been running for 20 years, blah blah. But I say don’t do it, and save your future self the headaches and heartaches.

I’ve done this both ways and my strong preference is to run routing and storage on bare metal when that’s practical, though there are definitely situations where I think virtualization makes sense. For example, at sites where I’m limited to a single machine, I’d gladly virtualize my router and file sharing. In that case, I prefer letting the host handle the storage directly and filesharing via LXC or jails.

I don’t virtualize my firewall or storage. If an update to your hypervisor breaks the firewall then it’s a pain to get online and troubleshoot things. Storage is probably less risky, but there are things to consider:

I don’t virtualize my network simply because my family has a VLAN on my network and when my pfSense goes down, the wife and kids can’t watch TV, do homework, etc. Then I have to hear about it. My network is mostly set it and forget it at this point and I run it on bare metal.

Storage, on the other hand, is a mixed bag. Some of my storage is virtualized and some is not. My really important data sits on my Synology, and I try to never take that down. I also run a virtualized instance of OMV as a backup destination, and a virtualized instance of TrueNAS that mostly serves as storage for experiments and my K3s cluster.

So far, in my rookie experience, I haven’t been able to achieve the benefit I was after in virtualizing: being able to migrate my OPNsense VM from one machine to an identical one.

I now also recall an instance where I had taken some of my Proxmox nodes down for one reason or another, and the cluster fell below quorum, which caused boot failures for the OPNsense VM.
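The quorum behaviour described above comes down to simple majority arithmetic. Here is a minimal sketch of it, assuming the common default of one vote per node and no external tie-breaker (such as a QDevice); the function names are made up for illustration and are not part of any Proxmox tooling.

```python
def votes_needed(total_nodes: int) -> int:
    """Strict majority of votes, assuming one vote per node."""
    return total_nodes // 2 + 1


def has_quorum(total_nodes: int, nodes_online: int) -> bool:
    """True if the nodes still online hold a majority of all votes."""
    return nodes_online >= votes_needed(total_nodes)


if __name__ == "__main__":
    # Walk a hypothetical 3-node cluster down to a single surviving node.
    for online in (3, 2, 1):
        state = "quorate" if has_quorum(3, online) else "NOT quorate"
        print(f"3 nodes, {online} online: {state}")
```

Under these assumptions a 3-node cluster tolerates one node being down but not two, and a 2-node cluster tolerates none, which is worth keeping in mind if the router VM lives on one of those nodes.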

For the router, it seems like… solved = bare metal.

Thanks all

If you run a cluster and on top of that you virtualize OPNsense and TrueNAS you’re not getting any of the benefits of clustering but all the drawbacks. Both are quite hardware dependent (network cards, disks, HBAs) at the Proxmox host hardware level.

Personally, I have run both virtualized for several years and never had any issues. I have even moved OPNsense between different Proxmox nodes (standalone nodes, not clustered) at times with minimal hassle. The trick is to never pass through the physical network cards and always present OPNsense with virtualized hardware. So long as both source and destination have the same SDNs, it’ll be seamless.
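One way to sanity-check the "same networks on both sides" condition before a move is to compare the bridges a VM config references against what the destination host provides. The sketch below assumes the usual Proxmox VM config format, where each virtual NIC is a `netN:` line containing a `bridge=` option; the sample config, MAC addresses, and bridge names are invented for illustration.

```python
import re

# A virtual NIC line looks like: "net0: virtio=AA:BB:...,bridge=vmbr0,tag=20"
NIC_LINE = re.compile(r"^(net\d+):\s*(.+)$")


def bridges_in_vm_config(config_text: str) -> dict[str, str]:
    """Map each virtual NIC (net0, net1, ...) to the bridge it attaches to."""
    result: dict[str, str] = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Assumption: [section] headers hold snapshot state, so stop here.
            break
        match = NIC_LINE.match(line)
        if not match:
            continue
        nic, options = match.groups()
        for option in options.split(","):
            key, _, value = option.partition("=")
            if key.strip() == "bridge":
                result[nic] = value.strip()
    return result


def missing_bridges(config_text: str, destination_bridges: set[str]) -> dict[str, str]:
    """Return the NICs whose bridge does not exist on the destination host."""
    needed = bridges_in_vm_config(config_text)
    return {nic: br for nic, br in needed.items() if br not in destination_bridges}


# Hypothetical OPNsense VM config with a WAN and a LAN interface.
SAMPLE_CONFIG = """\
cores: 2
memory: 4096
net0: virtio=AA:BB:CC:00:00:01,bridge=vmbr0
net1: virtio=AA:BB:CC:00:00:02,bridge=vmbr1,tag=20
"""

if __name__ == "__main__":
    print(missing_bridges(SAMPLE_CONFIG, {"vmbr0"}))
```

An empty result only says the names line up; it does not prove that the bridges map to the same physical networks or VLANs on the destination.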

The decision to run both virtualized for me came down to 3 things:

  • backups / snapshots / rollbacks with extreme ease; nobody wants a 2 hour downtime due to a botched update or another reason requiring a full reinstall + restoring configs, when you can achieve the same in 3 minutes with 2 clicks. If something goes horribly wrong during a hypervisor update (never happened, knock on wood, but it can) type a few commands and restore it to last known good version from PBS with 20 mins tops of downtime.
  • power saving + better hardware utilization. The nodes that have HBAs have way too much raw hardware to dedicate to TrueNAS alone. I pass through the HBA, keep everything else (including network cards) virtual, and call it a day. Extra bonus - 50+ GBps throughput between any VM on the same node and the NAS.
  • given that I update my Proxmox nodes 1 - 2 times / year, the downtime from that is minimal and in my case can easily be ignored.
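As a rough illustration of the snapshot/rollback workflow from the first point, here is a sketch that only builds the `qm snapshot` and `qm rollback` command lines rather than running them. The VM ID (100) and snapshot name (`pre_update`) are placeholders, and this covers local snapshots only, not a restore from PBS.

```python
import shlex


def snapshot_cmd(vmid: int, name: str) -> list[str]:
    """Command to take a named snapshot of a VM before an update."""
    return ["qm", "snapshot", str(vmid), name]


def rollback_cmd(vmid: int, name: str) -> list[str]:
    """Command to roll the VM back to that snapshot if the update goes wrong."""
    return ["qm", "rollback", str(vmid), name]


if __name__ == "__main__":
    # Dry run: print the commands so they can be reviewed before use.
    print(shlex.join(snapshot_cmd(100, "pre_update")))
    print(shlex.join(rollback_cmd(100, "pre_update")))
```

On a real node, the lists could be handed to `subprocess.run(..., check=True)`; printing them first keeps the sketch harmless on a machine without Proxmox.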

It all depends on your own hardware, comfort level and intended use case. The only things I keep bare metal in my home lab are 2 PBS servers.

This was the best answer by far.

Your three points are spot on. The only things I will add (just for fun and discussion) are:

  1. For me there is no decision to “virtualize” something. Virtualize everything. Or said another way, isolate everything. Do it for security & version control. Most people only think of the big “things”, but miss the smaller services like DHCP, DNS, VPN, etc. Not everything needs VM level isolation for security. Something as simple as bubblewrap can create a strong sandbox. It’s small, simple, and easy.

  2. Hypervisor (host OS) update risk can be mitigated significantly by running a more minimal host. If the host only controls disks, NICs, and starts containers/VMs, then there isn’t a lot to go wrong. This keeps the update process as safe as humanly possible. And if lightning strikes, the host OS can be rebuilt with maybe 4-5 config files (systemd-networkd actually balloons that number, but it is one directory to copy over). Minimalism wins.
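To give a feel for how small the bubblewrap approach from point 1 can be, here is a sketch that assembles a `bwrap` command line for a service. The service path and state directory are hypothetical, and the particular set of flags is just one reasonable starting point, not a hardened profile.

```python
import shlex
from typing import Optional


def bwrap_cmd(program: list[str], writable_dir: Optional[str] = None) -> list[str]:
    """Build a bubblewrap command that runs `program` in a basic sandbox."""
    cmd = [
        "bwrap",
        "--ro-bind", "/", "/",   # host filesystem visible, but read-only
        "--dev", "/dev",         # minimal private /dev
        "--proc", "/proc",       # fresh /proc for the new PID namespace
        "--tmpfs", "/tmp",       # private, empty /tmp
        "--unshare-all",         # new namespaces for everything bwrap can unshare
        "--share-net",           # but keep host networking for a network service
        "--die-with-parent",     # kill the sandbox if the launcher exits
    ]
    if writable_dir is not None:
        # Later bind overrides the read-only root for this one directory.
        cmd += ["--bind", writable_dir, writable_dir]
    return cmd + program


if __name__ == "__main__":
    # Hypothetical service binary and state directory, for illustration only.
    print(shlex.join(bwrap_cmd(["/usr/local/bin/my-service"], "/var/lib/my-service")))
```

Dropping `--share-net` would also cut the service off from the network, which suits batch jobs but not something like DNS or DHCP.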

It is debatable whether Proxmox or XCP-ng are a part of this minimal setup. They have a lot more code complexity than CLI Linux, but they also have nifty recovery features to cover for that complexity. I guess it comes down to whether you worry about the update process. If you’re only updating the hypervisor twice per year, then that says something. Also, waiting this long carries some level of security exposure, however minimal it may be.

I argue updating the host OS (hypervisor) should be like updating a container. From what I understand, this is mostly the case with those two hypervisors: small updates are generally easy, but version upgrades are not. This is why I favor an LTS rolling release for my host OS. Heresy, I know! But my host OS is basically just as throw-away as my containers are, so I don’t really care if an update breaks something. And at this bare-bones level, that is basically unheard of for a rolling release. Doing this saves me from having to do the big upgrades every other year. Just small incremental updates to the host OS once or twice per month on a Friday afternoon. Do that until the box dies or is replaced.

I say all this to spark discussion. But I am not sure the Lawrence forums are the correct space for this kind of discussion.