Dual-WAN/SD-WAN - What are you guys doing?

Hi all,

I’ve been working on building out my homelab environment for a very long time. I have been looking to solve a very specific problem I have with my current configuration and I was wondering how you all are dealing with it if you are in a similar place as I am. My homelab was started for two reasons: learning and Plex. Plex is still a critical service I maintain and I consider myself a lifelong learner.

Earlier this year I was able to get a Starlink subscription which I use in conjunction with my cable modem. Most of my traffic is statically going out either one or the other by design…but I want to better leverage my resources.

Afterwards, I set up OSPF between my core switch and two dedicated pfSense firewalls, one for each WAN circuit (the firewalls are also dynamically routed between each other, more on that later). This creates a dynamically routed WAN edge, designed so that if an internet connection was misbehaving, my network would simply route out the other connection automatically.

In my current configuration the Starlink path is statically set to have a higher path cost in OSPF, so my “normal” network traffic won’t egress out of Starlink unless the cable modem pfSense box is no longer advertising the default route. In theory, this should have been enough for failover, but I was having some crummy network performance from my cable modem side and a failover never occurred.

What I hadn’t considered is that since the L3 link between my switch and pfSense was fine and stable, OSPF had no idea there were issues and continued to route packets in that direction. In other words, pfSense was still advertising the default route because the WAN didn’t go down; it was just performing poorly, and my dynamic routing implementation didn’t have a mechanism to handle that.
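To make that failure mode concrete, here is a minimal Python sketch of the decision, not how OSPF is actually implemented. The names (`DefaultRoute`, `pick_default`), the costs, and the loss figures are all made up for illustration. The point is that the choice depends only on whether a default route is advertised and on its cost; link quality never enters into it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DefaultRoute:
    next_hop: str      # hypothetical label for the firewall advertising 0.0.0.0/0
    ospf_cost: int     # illustrative cost; Starlink is set higher on purpose
    advertised: bool   # is the firewall still originating the default route?
    loss_pct: float    # WAN packet loss; OSPF never sees this value


def pick_default(routes: List[DefaultRoute]) -> Optional[DefaultRoute]:
    """Pick the lowest-cost default route that is still being advertised."""
    candidates = [r for r in routes if r.advertised]
    if not candidates:
        return None
    # Note that loss_pct plays no part in the comparison.
    return min(candidates, key=lambda r: r.ospf_cost)


cable = DefaultRoute("cable-pfsense", ospf_cost=10, advertised=True, loss_pct=30.0)
starlink = DefaultRoute("starlink-pfsense", ospf_cost=100, advertised=True, loss_pct=0.0)

# Cable is badly degraded but still advertised, so it still wins.
degraded_choice = pick_default([cable, starlink])

# Only when the advertisement is withdrawn does traffic move to Starlink.
cable.advertised = False
failed_choice = pick_default([cable, starlink])
```

In this sketch `degraded_choice` is still the cable firewall despite 30% loss, and only `failed_choice` moves to Starlink.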

My next phase was the addition of two networks between the pfSense firewalls. One network had a gateway on my cable modem pfSense and the other network had a gateway on the Starlink pfSense. That, coupled with OSPF, allowed me to create dynamic routing paths between my normal network and the WAN edge through each other. What this allowed me to do was leverage pfSense’s gateway groups and tiering functionality.

When packet loss or high latency hits a threshold, the cable modem pfSense will automatically change the default gateway from my cable ISP to Starlink. When that happens, my network traffic is automatically rerouted to Starlink, but through the cable modem pfSense box and then to the Starlink pfSense box. This works fairly well, and so far this is where my story ends.
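The tiering behavior described above can be sketched roughly like this in Python. This is an illustration of the idea only, not pfSense code: the `Gateway` class, the function names, and the threshold values are assumptions chosen for the example, and the real thresholds are whatever is configured on each gateway.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed thresholds for illustration only; use whatever your gateways are set to.
LOSS_LIMIT_PCT = 20.0
LATENCY_LIMIT_MS = 500.0


@dataclass
class Gateway:
    name: str          # hypothetical gateway label
    tier: int          # lower tier number = preferred
    loss_pct: float    # measured packet loss to the monitor IP
    latency_ms: float  # measured round-trip time to the monitor IP


def is_healthy(gw: Gateway) -> bool:
    """A gateway is usable while both loss and latency stay under the limits."""
    return gw.loss_pct < LOSS_LIMIT_PCT and gw.latency_ms < LATENCY_LIMIT_MS


def active_gateway(group: List[Gateway]) -> Optional[Gateway]:
    """Return the healthy gateway with the lowest tier, or None if all are down."""
    healthy = [gw for gw in group if is_healthy(gw)]
    if not healthy:
        return None
    return min(healthy, key=lambda gw: gw.tier)


group = [
    Gateway("CABLE_WAN", tier=1, loss_pct=1.0, latency_ms=25.0),
    Gateway("STARLINK_VIA_TRANSIT", tier=2, loss_pct=0.5, latency_ms=45.0),
]

normal = active_gateway(group)     # cable: tier 1 and healthy

group[0].loss_pct = 35.0           # cable degrades past the loss limit
degraded = active_gateway(group)   # Starlink takes over as default gateway
```

Unlike the plain OSPF cost comparison, the measured quality is part of the decision here, which is why a degraded but still "up" cable link now triggers the switch.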

The problem is, when this happens my external IP address changes. This causes clients with active sessions to things like streaming video, gaming, video chats, etc. to freak out, and rightfully so! I’ve been trying to find a flexible and cost-efficient solution to that problem.

One thing I’ve considered doing is leaving my existing topology more or less exactly as it is and creating a WireGuard tunnel from each of my pfSense boxes to something like Linode. I would then force all traffic to egress my network through those tunnels to Linode and use the IP address of my Linode instance for all of my services. Another thing I’ve considered is somehow using ZeroTier in a similar way, though I’m not sure how to implement it in my current topology.

The other option I have put serious thought into is co-locating a server (which I already own) somewhere, and then I can run off-site backups to it and have other services in addition to just being my public-facing IP address.

What are you all doing?

Aside from the learning experience, you could obviously simplify things somewhat and get similar results to what you’ve achieved thus far with a single pfSense box, multi-WAN, and a tweaked gateway monitoring and gateway groups setup.

The caveat, like you mentioned, is that any active TCP connection won’t survive a gateway change. What you’re chasing now is some form of bonded solution, which requires at least two parts to the equation: your side and (typically) an ISP who offers bonded solutions.

Here’s some tidbits you might want to look into:

My recommendation is to create a transit hub in AWS or Azure and configure IPsec tunnels from it to each of your pfSense routers. Then set up BGP, making sure to prepend the AS path on the Starlink firewall and set a higher local-preference on your cable modem pfSense. This will force traffic across your cable connection while it is healthy and maintain the same public IP for internet-bound traffic, since everything egresses from the hub either way.
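As a rough illustration of why those two knobs steer traffic in both directions, here is a heavily simplified Python sketch of BGP path selection. It compares only local-preference and AS-path length, whereas real BGP has many more tie-breakers. The private AS numbers (65000 for the hub, 65001 for the home network), the local-preference values, and the names are assumptions for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Path:
    via: str                   # hypothetical label for the tunnel the route came over
    local_pref: int            # higher wins; set on the receiving side
    as_path: Tuple[int, ...]   # shorter wins when local_pref ties


def best_path(paths: List[Path]) -> Path:
    """Simplified selection: highest local-pref first, then shortest AS path."""
    return min(paths, key=lambda p: (-p.local_pref, len(p.as_path)))


# Outbound, seen from the home side: the default route learned from the hub
# over each tunnel. The higher local-pref on the cable firewall makes it win.
outbound = best_path([
    Path("cable-tunnel", local_pref=200, as_path=(65000,)),
    Path("starlink-tunnel", local_pref=100, as_path=(65000,)),
])

# Return traffic, seen from the hub: the home prefix learned over each tunnel.
# The Starlink firewall prepends its own AS, so its path looks longer.
inbound = best_path([
    Path("cable-tunnel", local_pref=100, as_path=(65001,)),
    Path("starlink-tunnel", local_pref=100, as_path=(65001, 65001, 65001)),
])

# If the cable tunnel drops, only the Starlink path remains, and the public IP
# stays the same because traffic still leaves the internet from the hub.
failover = best_path([
    Path("starlink-tunnel", local_pref=100, as_path=(65001, 65001, 65001)),
])
```

Both `outbound` and `inbound` pick the cable tunnel while it is up, so traffic stays symmetric, and `failover` falls back to Starlink when it is the only path left.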