YAWT (Yet Another WAN Thread)

I realize this question has been asked before, but the threads are old enough that the version of pfSense they discuss is not current. And I’m at my wit’s end trying to get my system to work the way I need it to.

I am using pfSense version 2.6.0.
I have an Intel 4-port card in my Dell computer with three connections:
TCT (igb0) is a WAN interface connected to a cable modem, which is connected to the cable company.
WAN (igb1) is shown as WAN
(CENT_LINK) PPPOE(igb1) is a WAN interface connected to a DSL modem, which is connected to a DSL-enabled phone line.
(LAN) igb2 is shown as LAN
(OPT2) igb3 is not connected to anything
(OPT3) re0 is not connected to anything

Under Status->Gateways->Gateway Groups is a group named Prefer_TCT.
In this group, TCT is listed as Tier 1 and CENT_LINK_PPPOE is listed as Tier 2.

The desired action is failover only. No balancing: if TCT goes down, pfSense should switch to CENT_LINK_PPPOE, stay there until TCT comes back up, and then switch back to TCT.
Currently, if TCT goes down, the switch is made to CENT_LINK_PPPOE as it should be. But it never comes back! The only way I have found to get back to TCT is to reboot the system.
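
To make the intended behavior concrete, here is a minimal Python sketch of tier-based failover selection: among the gateways currently considered online, pick the one with the lowest tier number. This is only an illustration of the rule described above, not pfSense's actual code; the function name `pick_gateway` and the dictionary layout are hypothetical.

```python
from typing import Optional


def pick_gateway(tiers: dict[str, int], online: set[str]) -> Optional[str]:
    """Return the online gateway with the lowest tier number, or None if all are down.

    Hypothetical helper for illustration only; not part of pfSense.
    """
    candidates = [name for name in tiers if name in online]
    if not candidates:
        return None
    # Lower tier number means higher priority (Tier 1 beats Tier 2).
    return min(candidates, key=lambda name: tiers[name])


# The gateway group described in this post.
PREFER_TCT = {"TCT": 1, "CENT_LINK_PPPOE": 2}

print(pick_gateway(PREFER_TCT, {"TCT", "CENT_LINK_PPPOE"}))  # TCT
print(pick_gateway(PREFER_TCT, {"CENT_LINK_PPPOE"}))         # CENT_LINK_PPPOE
```

Under this rule, as soon as TCT is marked online again it should become the choice for new traffic, which is the failback that is not happening here.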

In researching this problem, I have found several threads with answers as helpful as “It works for me” and suggestions that I tried with no success, and even one response that suggested that if the OP couldn’t get it working, he shouldn’t be using pfSense!

A search of Netgate’s site shows documentation about this, but for an older version of the software. Well, if it hasn’t changed, it should work, right? Only it doesn’t!

I’m sorry if my frustration is showing. It’s just that I have been working at this for way too long without success.

I truly hope someone here can help.

Bart

The failover won’t switch back immediately when the main WAN comes back up, because that would break all the states that hosts have created over the backup connection and would be very disruptive. Once the main WAN is back up, new states will use it, so traffic gradually moves back over time.
https://docs.netgate.com/pfsense/en/latest/multiwan/load-balance-and-failover.html
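
A toy Python model may help show why failback looks gradual. It assumes, as a simplification, that each connection records the gateway it was created on and keeps using it while that gateway is up, and that only new connections consult the tier order. The `Router` class and its methods are hypothetical and are not pfSense internals.

```python
class Router:
    """Toy model of state pinning in a failover group (illustration only)."""

    def __init__(self, tiers):
        self.tiers = dict(tiers)      # gateway name -> tier number
        self.online = set(tiers)      # gateways currently considered up
        self.states = {}              # connection id -> gateway chosen at creation

    def preferred(self):
        """Lowest-tier gateway that is online, or None if all are down."""
        up = [gw for gw in self.tiers if gw in self.online]
        return min(up, key=self.tiers.get) if up else None

    def send(self, conn_id):
        """Return the gateway used for this connection's traffic."""
        gw = self.states.get(conn_id)
        if gw is None or gw not in self.online:
            # New connection, or the old path died and the client reconnected:
            # a fresh state is created on the currently preferred gateway.
            gw = self.preferred()
            if gw is None:
                raise RuntimeError("no gateway online")
            self.states[conn_id] = gw
        return gw


r = Router({"TCT": 1, "CENT_LINK_PPPOE": 2})
print(r.send("backup-job"))   # TCT
r.online.discard("TCT")       # TCT fails
print(r.send("backup-job"))   # CENT_LINK_PPPOE
r.online.add("TCT")           # TCT recovers
print(r.send("backup-job"))   # CENT_LINK_PPPOE (existing state stays put)
print(r.send("new-tab"))      # TCT (new state uses the Tier 1 gateway)
```

In this model a long-lived connection can stay on the backup WAN indefinitely, but anything newly opened after recovery should go out the Tier 1 gateway.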

Thanks for your reply.
According to my logs, I had a failover occur at around 9:00 PM, and at around noon the next day it still had not failed back.

I don’t know of anything on my system that would keep a connection that long.

Is there a setting that would allow the failback even if it broke states?

Bart

Look in the logs at the “Gateways Log Entries” and see if the gateway is still in a failed state, or if there are any other errors that you could search for on Google.

I am seeing a massive number of entries that say I am getting packet loss. I am getting emails that say the same.

Example 1, at 10:11:19:
Notifications in this message: 1

10:11:19 MONITOR: TCT has packet loss, omitting from routing group Prefer_TCT
72.21.70.3|24.72.201.211|TCT|16.918ms|4.325ms|23%|down|highloss

Followed 1 minute later by example 2:
Notifications in this message: 1

10:12:30 MONITOR: TCT is available now, adding to routing group Prefer_TCT
72.21.70.3|24.72.201.211|TCT|18.751ms|4.632ms|0.0%|online|none
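
For anyone sifting through many of these notifications, the pipe-delimited status line can be split into fields with a few lines of Python. The field names below are assumptions inferred from the values in the two examples (two IP addresses, the gateway name, two times in ms, a loss percentage, and two status words); they are not taken from pfSense documentation.

```python
from dataclasses import dataclass


@dataclass
class GatewayStatus:
    # Field names are guesses based on the sample lines, not official names.
    monitor_ip: str
    source_ip: str
    name: str
    rtt_ms: float
    rtt_stddev_ms: float
    loss_pct: float
    status: str
    substatus: str


def parse_status_line(line: str) -> GatewayStatus:
    """Parse one 8-field, pipe-delimited gateway status line."""
    fields = line.strip().split("|")
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    monitor_ip, source_ip, name, rtt, stddev, loss, status, substatus = fields
    return GatewayStatus(
        monitor_ip=monitor_ip,
        source_ip=source_ip,
        name=name,
        rtt_ms=float(rtt.removesuffix("ms")),
        rtt_stddev_ms=float(stddev.removesuffix("ms")),
        loss_pct=float(loss.rstrip("%")),
        status=status,
        substatus=substatus,
    )


s = parse_status_line("72.21.70.3|24.72.201.211|TCT|16.918ms|4.325ms|23%|down|highloss")
print(s.name, s.loss_pct, s.status)  # TCT 23.0 down
```

Collecting the `loss_pct` and `status` values over a day would show how often TCT is being dropped from and re-added to the group.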

I have called my provider and they say they can’t see any problem. From my computer, the connection seems to work OK.

You said there may be a connection or state that is preventing it from failing back. Today, I unplugged every device from the switch except this computer. I ran a test on speedtest.net, and it showed that the TCT connection was active. I then unplugged the TCT connection, and pfSense failed over to the Century Link connection as it should. I then plugged the TCT connection back in, rebooted my computer, and tried speedtest.net again. pfSense had not failed back to TCT. So I tried the test again, this time rebooting both my switch and my computer, with the same result.

Thousands of people and companies use this system without this problem, so it just has to be somewhere in my settings.

Looking in my logs as you suggested, I see a message at the time I unplugged the TCT connection and pfSense changed to Century Link. I see entries where some packets were missed but, most importantly, I don’t see a single entry suggesting pfSense was trying to reconnect to TCT.

I’m about to give up, which really goes against my nature!
Bart

Maybe there is some issue with the network cards.

Followup.

This problem has persisted. I got to thinking about your suggestion and watched the problem a little more closely. It seems that if I reboot the router, I’ll get a couple of days without problems, and then they start again, becoming more frequent as time goes on. I forced the system to use my secondary WAN for about a week, and the problem still occurred. That rules out both my ISPs. I also noticed that it seems related to the amount of traffic on my network. When my backups run, generating a lot of traffic, the problem pops up. I did some research on network cards, and it seems there are some Intel cards that don’t work well with pfSense. I bought my 4-port card from Amazon and didn’t pay attention to the model, so I may have one of the bad ones. Something to do with the caching of the hardware ports? Don’t know.

Today, my new Netgate 4100 will arrive. I now know I will have good hardware! We’ll see how it goes.

I only brought this up again in the hopes that someone else just might be helped. So I’ll reply again when, and if, the problem is corrected.

Bart