pfSense VM Speeds in XCP-ng w/ 10G NICs

For context, I have followed all of Tom's guides and tried other settings mentioned in various forums, with no resolution yet.

My company has a dedicated server build hosted with a cloud provider, with the following hardware:

  • Ryzen 9 7950X
  • ASRock server motherboard w/ 2x 10G Broadcom NIC ports
  • 128GB RAM
  • 2 x 1TB NVMe
  • 5 Public IPs

Host:

  • XCP-ng

For VMs, we have:

  • 2 x Windows Server: one as a cloud DC and another as our cloud storage host system (the cloud DC is necessary so the storage server can reference AD)
  • 1 x Linux system that hosts our reverse proxy, Docker containers, and VPN connection

The goal was to put all of the VMs behind a pfSense VM so they all interface behind one public IP, with better security since they would be behind an actual firewall.

But in all my testing, I am getting poor speeds behind the pfSense LAN.

I have both 10G NICs available: one is the host management NIC, from which a VIF was created to serve as pfSense's WAN connection, and the other is used for the LAN connection. The setup is just straight, simple connections: no VLANs, TX checksum offload disabled on both VIFs connected to pfSense, and all VMs rebooted after the VIFs were attached. I have tried several other ethtool config modifications, changing MTUs, etc., but nothing works. I have even tried completely opening the firewall to everything (only as an open baseline for testing purposes, to try anything).
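
For reference, the TX checksum change on the VIFs was done roughly like this from the XCP-ng host CLI (the VM name label and UUID below are placeholders for my setup):

```
# find the VIFs attached to the pfSense VM
xe vif-list vm-name-label="pfSense" params=uuid,device

# disable TX checksum offload on each VIF returned above
xe vif-param-set uuid=<vif-uuid> other-config:ethtool-tx="off"

# the setting only takes effect after the VIF is replugged or the VM is rebooted
```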

When running speed tests directly over the network connections from the Windows CLI using the speedtest binary, I can hit 8-10 Gbps every time. But when running behind pfSense, I am lucky to get 3 Gbps, which isn't sufficient since we have 400+ users from different states with constant sessions to the cloud storage solution we self-host (not Nextcloud or ownCloud, lol).

I imagine this is a FreeBSD driver-related issue, possibly with the NICs, but in case there is some setting I'm missing, I figured I'd reach out to see if others have insight. I've even tried Brave and chat AI, but no luck.

If it is a FreeBSD-related issue, maybe I should just install OpenWrt and use that?
Our needs are very small for our setup, since my Traefik reverse proxy handles most of the work, so all we need is:

  • A firewall to put our VMs behind to secure them
  • The ability to install the Tailscale client and connect to my Headscale control plane so the other DCs can see our cloud DC
  • The ability to port forward HTTP traffic to my Traefik reverse proxy, which handles all our microservices and points to our cloud storage implementation

Ideas for a more streamlined setup are welcome as well; otherwise, any info on this particular issue is much appreciated. I'm an IT guy, but networking isn't my area of expertise.

First, Broadcom NICs are not great and have given me issues in the past with both XCP-ng and pfSense. Second, there is a per-stream limitation in pfSense, but it can handle multiple streams, giving you higher overall throughput. How did you test the bandwidth?

A friend is seeing 12.6 Gbps of iperf3 traffic between 2 VMs on the same VLAN (all purely on the virtual network within XCP-ng, no virtual-to-physical traffic) with no pfSense VM in between. The same VMs with a pfSense VM in between (actively filtering) will only do 4 Gbps. The hardware is an ASRock Rack B650D4U with an AMD Ryzen 7 7700.

If I understand the setup described by @jeanxx correctly, the traffic stays completely on the host. Will the NIC hardware make any difference in this case?

Yep, I remember the Broadcom NIC problems, which is what made me hesitant to move to XCP-ng for our cloud server setup, but when I got everything set up and ran the numbers, I was hitting the 10G speeds out of the box.

I use the speedtest binary to run benchmarks, and I use the same server ID every time, as I know which server shows me the full bandwidth my system can reach.
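
For reference, that looks roughly like this, assuming the official Ookla speedtest CLI (the server ID below is just a placeholder):

```
# list nearby servers and their IDs
speedtest -L

# run against the same server every time so results stay comparable
speedtest -s 12345
```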

With that being said, I ran the speedtest binary on the Windows server several times, getting consistent numbers every time (between 7-8 Gbps), but then ran the same test with the Windows server behind the pfSense LAN, and every time I could not get over 3 Gbps.

I'm not concerned with the local traffic, as the only traffic passed locally between the VMs is AD queries. The actual internet speeds are my concern: on a direct virtual connection there are no issues on the Broadcom ports, but behind pfSense the speeds are cut in half.

Maybe I should just try OpenWrt, which is based on Linux drivers instead of BSD, and see what results I get. If I get better speeds behind OpenWrt, then I'll know it's something with FreeBSD, and if the speeds are the same as pfSense, then I'll know something else is causing the problem.

So I somewhat misunderstood the setup you tested, but have you seen the numbers I posted 5 minutes ago? The pfSense VM pulls the speed down from 12.6 Gbps to 4 Gbps, and in that test the physical NIC should not even have been involved! I don't think you can expect a lot more as long as you run pfSense on an amd64 platform, but I am happy to learn how to get more out of it.

Oh, I misread the first response and just went back to re-read it. Okay, I see what you're saying now. And my setup uses that exact motherboard, I believe.
But my question relates to pfSense and amd64: is there a known limitation of pfSense on the amd64 architecture? All Netgate hardware installs are amd64, so I would imagine it isn't amd64-specific but rather the virtualization layer of the hardware, no? Maybe direct passthrough of the NIC, bypassing the virtualization layer, would improve the network speeds.

But if your friend is seeing 4 Gbps max, then maybe I will try my luck with OpenWrt, since it's Linux-based, and see if I get better speeds. My setup doesn't specifically call for pfSense, just a good firewall that can keep up with the numbers and run a Tailscale connection.

If you look at the numbers Netgate publishes for their products, you don't get any wiser, because they aggregate throughput across all ports in both directions. One could hope that the Netgate 8200 can push 9 Gbps full duplex between its two SFP+ 10G ports, but there are no published numbers that would confirm that. Maybe someone here has tried. The 8200's CPU is a lot less potent than what you or my friend are using, so in theory you should both see better numbers…

I would also assume that OpenWrt performs better, but I have never made a head-to-head comparison with 2 VMs.

This all makes total sense now. Thank you for the updated info, I really appreciate it. It's quite odd that FreeBSD has such a big limitation, since Netflix is a major contributor to its codebase and I would imagine they have 10G bandwidth needs. But nonetheless, I thought maybe I was doing something wrong and it was driving me crazy. So again, thank you.

Run iperf3 with the -P 10 option to set the number of parallel client streams to 10. Netflix does use FreeBSD, but no ONE person needs a feed that fast; rather, it can handle thousands of people who each need a connection fast enough for 4K streaming, which is only around 15 Mbps.
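
For example, with the iperf3 server running on a VM on one side of pfSense and the client on the other (the IP below is a placeholder):

```
# on the VM on one side of pfSense
iperf3 -s

# on a machine on the other side: 10 parallel streams for 30 seconds
iperf3 -c 192.0.2.10 -P 10 -t 30
```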

@xerxes I figured out what a big chunk of the issues were. I had to change the pfSense VM's emulated NIC from Realtek to e1000 (Realtek caused disgusting speeds in the pfSense VM). Then (based on something you said that led me to the speed limitations Netgate publishes) I also had to increase the CPU count (I had it at 4 vCPUs), since apparently FreeBSD needs a minimum of 8 cores to handle a 10G connection.
Source: Networking/10GbE/Router - FreeBSD Wiki
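
For anyone else making the same change, the vCPU bump can be done from the XCP-ng host CLI roughly like this (the UUID is a placeholder, and the VM has to be shut down first); the Realtek-to-e1000 change is separate from this:

```
# pfSense VM must be halted before changing vCPU counts
xe vm-shutdown uuid=<vm-uuid>

# raise the maximum first, then the startup count (startup must be <= max)
xe vm-param-set uuid=<vm-uuid> VCPUs-max=8
xe vm-param-set uuid=<vm-uuid> VCPUs-at-startup=8

xe vm-start uuid=<vm-uuid>
```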

After those changes, as well as disabling TX checksum offload on the pfSense VIFs, I was able to reach 7 Gbps in speedtest and 7.5 Gbps in iperf (thanks to @LTS_Tom, as I wasn't able to saturate the connection with anything below 4 parallel streams, but with iperf3 -P {any number 5 and up} I was able to saturate the network and confirm I was getting full bandwidth). The theoretical max I've hit on a fully open connection with our current provider is 8 Gbps, so this got me close to raw bandwidth speeds.

The only issue after that was that Windows couldn't take full advantage of the connection no matter what settings I tried. But my Linux box was able to hit the 7 Gbps mark without issue, so it may just be a Windows issue (not surprising).

The only remaining issue was that I use a Traefik reverse proxy with Docker services on my Linux box, and no matter what, I couldn't get those services reachable on the network behind pfSense. It's likely because I also run our Headscale server on that same subnet, and it most likely doesn't like that. I tried redirect rules, but my subdomains would auto-redirect to the webConfigurator. Then I tried HAProxy, having it just redirect HTTP requests to my Traefik reverse proxy, but nothing worked. So I've scrapped it all, gone back to stock connections, and will just lock everything down further. No ports are exposed on the Windows servers, as the DC connects via Tailscale.

I just wanted to come back and update you both on the information I found out and thank you for your assistance on this. Thank you!
