Sudden throttling when transmitting cross-VLAN or to WAN

I recently determined that my primary Windows 10 desktop PC is experiencing severe throughput degradation when sending traffic to any destination other than a host on the same local VLAN. Traffic inbound to the PC – regardless of the source network – does not seem to be similarly impacted.

I’m a pfSense (and Lawrence forums) newbie looking for recommendations on how to troubleshoot this kind of problem and any thoughts on where and what might be the root cause. I’m trying to keep my first post brief in case the symptoms point to an easily identified cause. If that’s not the case, I can drill deeper with more details about our network configuration.

Our current network configuration has been in place for one year with three VLANs: USERS, IOT and GUEST. After a couple of weeks of learning how to write the appropriate firewall rules, everything was working just as I wanted and for nearly eleven months everything has worked great. I haven’t touched any configuration settings during that period.

About a month back I began to experience some unusual behaviors on the Windows 10 PC. Finally I ran Ookla Speedtest and found that while the download speed remained in the same ballpark as before (220-230 Mbps), the upload speed was now about 0.1-0.2 Mbps, roughly 1% or less of the 20-25 Mbps rate seen in earlier tests. Speedtest results from other hosts, regardless of Ethernet or Wi-Fi connection or which VLAN that host lives on, still report the expected 220-230 Mbps down / 20-25 Mbps up speeds.

On the Windows 10 PC I ran Wireshark traces of four typical operations. I configured Wireshark to record just the two hosts involved in each operation. The Windows 10 PC is always 10.77.77.100.

  1. Plex app startup on my Pixel 3a phone while it was Wi-Fi connected to the USERS VLAN (10.77.10.110). Worked as expected and the Wireshark trace flagged no transmission errors on that traffic.

  2. Plex app startup from the same Pixel 3a phone while it was Wi-Fi connected to the IOT VLAN (10.77.20.112). The app was sluggish to the point of being unusable. The Wireshark trace flagged a large number of frames (packets?) that experienced issues.

  3. Send a print job to my HP OfficeJet Pro that has an Ethernet connection on the IOT VLAN (10.77.20.99). A simple text document will print, but the Wireshark trace flagged many frames/packets that experienced issues. Printing a document that contains images will stall after printing maybe a 1" swath and eventually the printer times out, ejects the page and the Windows spooler shows the job in an error state.

  4. Using VueScan I scanned a color image from that same HP OfficeJet Pro (10.77.20.99). An 8x10" color photo scanned in a reasonable amount of time and the resultant image preview was as expected. The Wireshark trace flagged some transmission errors on the SOAP messages sent to configure the scan operation, however the frames/packets of image data returned from the OfficeJet transferred without any errors noted.

When Wireshark flagged frames/packets, the following annotations were most commonly seen: “TCP Retransmission,” “TCP Spurious Retransmission,” “TCP Dup Ack xxx#y,” “TCP Previous segment not captured” or “TCP Out-Of-Order.”
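
To put numbers on this rather than eyeballing the trace, the annotations could be tallied from a packet list exported as CSV. Below is a minimal sketch, assuming the export has an "Info" column containing the annotation text; the sample rows are made up for illustration and are not taken from my actual traces.

```python
import csv
import io
from collections import Counter

# Annotation labels as listed above, matched case-insensitively.
# "Spurious" is listed first only for readability; neither retransmission
# label is a substring of the other.
ANNOTATIONS = [
    "TCP Spurious Retransmission",
    "TCP Retransmission",
    "TCP Dup ACK",
    "TCP Previous segment not captured",
    "TCP Out-Of-Order",
]


def tally_annotations(csv_text):
    """Count how many rows carry each annotation in their Info column."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        info = (row.get("Info") or "").lower()
        for label in ANNOTATIONS:
            if label.lower() in info:
                counts[label] += 1
                break  # count each row once
    return counts


# Hypothetical sample export (ports, sequence numbers and lengths are invented).
SAMPLE = '''"No.","Source","Destination","Info"
"1","10.77.77.100","10.77.20.99","50000 > 9100 [SYN] Seq=0 Len=0"
"2","10.77.77.100","10.77.20.99","[TCP Retransmission] 50000 > 9100 [PSH, ACK] Seq=1 Len=1460"
"3","10.77.77.100","10.77.20.99","[TCP Retransmission] 50000 > 9100 [PSH, ACK] Seq=1461 Len=1460"
"4","10.77.20.99","10.77.77.100","[TCP Dup ACK 3#1] 9100 > 50000 [ACK] Seq=1 Ack=1 Len=0"
"5","10.77.77.100","10.77.20.99","[TCP Spurious Retransmission] 50000 > 9100 [ACK] Seq=1 Len=1460"
"6","10.77.77.100","10.77.20.99","[TCP Out-Of-Order] 50000 > 9100 [ACK] Seq=2921 Len=1460"
'''

if __name__ == "__main__":
    for label, n in tally_annotations(SAMPLE).most_common():
        print(f"{n:3d}  {label}")
```

Running the same tally on the export from each of the four test cases would make it easier to compare the same-VLAN case against the cross-VLAN ones.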

I wanted to upload the Wireshark trace files that correspond to these four test cases but was blocked from doing that. How can I share that information with this forum? Would hyperlinks to files on Google Drive work?

Below is a simplified view of our network’s physical topology.

Thanks!

If there haven’t been any changes to your network configuration, I’d start at layer 1 (physical): try swapping the network cable, using a different port on the switch, and/or using a different network port on the PC if one is available.

I don’t have a nuanced approach, but what I would do is:

  • Perform all my testing over wired connections to take Wi-Fi out of the picture
  • Run iperf using UDP between two hosts where the throttling shows up with TCP, and see whether it also occurs with UDP
  • Take the managed switch out of play, if the pfSense box has a dedicated interface available for each VLAN that can be linked to a dumb switch
  • Add allow-all ("all clear") firewall rules to the top of each of the interfaces in use in pfSense
  • Check each of the VLAN interface settings to ensure that it’s not blocking traffic from private networks

The fact that traffic seems to move normally in only one direction makes this very weird. I can’t think of any layer 2 protocol that would keep TCP acknowledgement packets from making it back to the sender if the original packet made it through. Could a loop in the network cause a broadcast storm that would cut off an existing connection? Just speculating.