I am experiencing asymmetric speeds with network connections that involve SFP+ modules.
A laptop (we will call it the laptop) is connected to a Ubiquiti USW-Lite-8-PoE (we will call it the switch)
A Proxmox server (we will call it the server) is connected to the switch
The laptop has a run of the mill 1G NIC which is connected to the switch with a 1G link
The server has an Intel X520-DA2 NIC with an Ipolex ASF-10G-T SFP+ module that is connected to the switch with a 1G link
All cables are known to be good. They are CAT6A S/STP and are 2m long.
The switch is on the latest firmware, but was tested with older firmware with the same results
The Test:
iperf3 was used with default settings, with the server running iperf3 in server mode and the laptop in client mode
iperf3 -c <server_ip> from the laptop reliably results in ~940 Mbps with usually 0 or <10 retries
iperf3 -c <server_ip> -R from the laptop reliably results in ~250-350 Mbps with usually ~25K-30K retries
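For anyone who wants to repeat the two runs above and compare them side by side, here is a minimal Python sketch. It assumes iperf3 is installed on the client and uses its real `--json` and `-R` options; the function names `summarize` and `run_iperf3` are made up for illustration, and the `retransmits` field is read defensively because it is only an assumption that every iperf3 build reports it in reverse mode.

```python
import json
import subprocess


def summarize(result: dict) -> dict:
    """Extract throughput (Mbps) and TCP retransmits from parsed iperf3 JSON."""
    end = result["end"]
    return {
        "sent_mbps": round(end["sum_sent"]["bits_per_second"] / 1e6, 1),
        "received_mbps": round(end["sum_received"]["bits_per_second"] / 1e6, 1),
        # Assumption: the sender-side summary carries a retransmit count.
        "retransmits": end["sum_sent"].get("retransmits", 0),
    }


def run_iperf3(server_ip: str, reverse: bool = False) -> dict:
    """Run one default TCP test; reverse=True makes the server the sender."""
    cmd = ["iperf3", "-c", server_ip, "--json"]
    if reverse:
        cmd.append("-R")
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return summarize(json.loads(out))
```

Calling `run_iperf3(ip)` and then `run_iperf3(ip, reverse=True)` with the server's address gives two small dictionaries whose `received_mbps` and `retransmits` values can be compared directly.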
Troubleshooting steps taken: (not listed in order)
Re-verified the cables and tried different cables with the same results.
Verified that the laptop is capable of ~940 Mbps speeds in both directions by running the same tests but to a Raspberry Pi 4 elsewhere on the network. It is capable of those speeds.
Ran the same test as described in “The Test” section above but used the server’s motherboard’s 1G Realtek NIC instead of the X520. This resulted in symmetric ~940 Mbps speeds as desired.
In the same configuration as described in “The Test” section above, I swapped the Ipolex ASF-10G-T SFP+ module with another of the same model. The same asymmetry was the result.
Ran the same test as described in “The Test” section above but used a TrueNAS Core server (we will call this the TrueNAS server). The TrueNAS server also has an Intel X520-DA2 NIC with an Ipolex ASF-10G-T SFP+ module that is connected to the switch with a 1G link. The exact same asymmetry was observed.
Ran the same test as described in “The Test” section above but swapped the switch with a Ubiquiti Switch Flex Mini. The same asymmetry was observed.
Ran the same test as described in “The Test” section above but swapped the switch with an unmanaged Netgear 8-port gigabit switch (I cannot remember the model number). A worse asymmetry was observed: ~940 Mbps going to the server and ~1-2 Mbps coming from the server, with ~50K retries.
In the Unifi Network controller, I manually set the link speed to 1G. The same asymmetry was observed.
In the Unifi Network controller, I turned on Flow Control. The same asymmetry was observed.
There is a speed reduction when the server and the TrueNAS server upload data via the X520-DA2 with an Ipolex ASF-10G-T SFP+ module.
The Ubiquiti managed switches actually help the situation over the unmanaged switch.
I believe that the Ipolex ASF-10G-T SFP+ module may be the problem, as I have seen other posts on other forums with identical or nearly identical issues that were resolved by switching to another SFP+ module. It seems to be a problem when the Ipolex SFP+ module does not have a paired module on the other end. In my scenario it is hooked up to the switch via RJ45 at a 1G link speed. I believe that the 1G links have something to do with it too. I have, in the past, performed these same tests at 10G link speeds with the same X520s and Ipolex modules that are now in the server and the TrueNAS server, through a Mikrotik 305 switch. I was able to achieve ~9.40 Gbps in both directions and even ~9.80 Gbps in both directions with jumbo frames enabled.
What I need:
I could use guidance on how to fix this asymmetry without buying new hardware if at all possible. I am just a homelab enthusiast with a budget rightly controlled by my wife. It is entirely possible that I am doing something incorrectly.
I am not opposed to trying different SFP+ module brands as those are cheap enough that I can probably convince the wife to let me buy one or two. The question is, which do I buy?
From the posts I found in other forums, people seem to have solved this problem by going with a MikroTik S+RJ10.
I am hesitating and would like some guidance. The MikroTik S+RJ10, to my knowledge, has the same Marvell 88X3310 controller chip as the 10Gtek ASF-10G-T, which when I go from the 10Gtek product page to the online store they link to, shows an image of the same Ipolex ASF-10G-T that I already have. It appears to be a rebadge. I am not sure why I would have any different experience with the MikroTik S+RJ10, or even the 10Gtek ASF-10G-T (I know that @LTS_Tom has recommended those).
Any help here would be appreciated.
I agree that it seems like the Ipolex module is the common factor. It probably doesn’t have a large enough buffer to deal with its host (whichever device it is currently installed in) sending bursts of packets at 10Gbps and having to trickle them out at one gigabit per second. These types of modules always “train” at 10Gbps with the SFP+ port, then have a (potentially) separate speed on the RJ45 side, and act as a 2-port ethernet switch.
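To put rough numbers on the buffer theory above, here is a small Python sketch of the arithmetic. The buffer size is purely hypothetical (the module's real buffer is not known from this thread), and `seconds_until_overflow` is an illustrative name, not anything from a vendor tool.

```python
def seconds_until_overflow(buffer_bytes: int, ingress_bps: float, egress_bps: float) -> float:
    """Time for a buffer to fill when data arrives faster than it drains."""
    if ingress_bps <= egress_bps:
        return float("inf")  # the buffer never fills
    fill_rate_bytes_per_s = (ingress_bps - egress_bps) / 8
    return buffer_bytes / fill_rate_bytes_per_s


# Hypothetical 128 KiB buffer, 10 Gbps from the NIC, 1 Gbps out of the RJ45 side.
HYPOTHETICAL_BUFFER = 128 * 1024
t = seconds_until_overflow(HYPOTHETICAL_BUFFER, 10e9, 1e9)
burst_bytes = 10e9 / 8 * t  # how much the NIC can send before frames are dropped
print(f"overflow after {t * 1e6:.1f} microseconds, about {burst_bytes / 1024:.0f} KiB of burst")
```

Under that assumed size the buffer would be full after roughly a tenth of a millisecond of line-rate sending, which would fit the pattern of heavy retries only when the SFP+ side is transmitting. The point is the ratio of the two rates, not the specific figure.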
Normal direction (without -R) is client sending to server. -R is server sending to client. I think you were always seeing issues when the device with SFP is sending?
@brwainer Correct. The issue is only seen with the server sending data.
Can you manually force the X520 card to 1gbps mode? The card says it will do 10/1, and Serve the Home seems to have been able to get that SFP+ module working at 1gbps. You may also need to find a way to update the firmware in that NIC.
Alternatively, I might try a used SFP twisted pair module; in general they are backwards compatible. I picked up a bunch on ebay for a few dollars each, bought a lot of 5 because I needed at least 3. I think they were $6 USD each.
Or ultimately, just use the 1gbps port on the server main board and keep the X520 until you can put in a 10gbps switch; these switches are getting cheap if you are willing to order from Alibaba. The Mikrotik CRS309-1G-8S+IN costs more but is a nice small switch; I’ve been getting over 5gbps when transferring files between my old lab hardware. I’m getting faster on the one I use as a “top of rack” for my production system, but it is still hard to push it all the way up to 9gbps. I’ll know more when I can finally move that LAN over to my main switches (if they ever show up). Right now all my services hit the network over 1gbps and the 10gbps is between hosts and storage. I think the fastest I saw was 8gbps when I was migrating a VM from one host to another. Here is a speed test between a Windows host (XCP-NG) and its NFS storage on a TrueNAS server. Write is nice and fast because it is falling into the RAM buffer; the 8-drive spinning rust array isn’t horribly fast bringing that data back out. Eventually I’ll have to upgrade the drives.
I do not know how to manually set the link speed in Proxmox. Know of any tutorials? I will of course look myself, but recommendations are always nice.
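Since Proxmox is Debian-based, one place to start is the standard `ethtool` utility. The Python sketch below only builds the command lines so they can be reviewed before running them as root. The interface name `enp1s0f0` is a placeholder, the helper names are invented for illustration, and it is an assumption that the X520's driver will accept a forced speed with an SFP+ module installed; it may simply reject the request, and per the earlier reply the module may keep its host side at 10G regardless.

```python
from typing import List


def show_link_cmd(iface: str) -> List[str]:
    """Command that prints the current speed, duplex and supported link modes."""
    return ["ethtool", iface]


def force_speed_cmd(iface: str, speed_mbps: int = 1000, autoneg: bool = False) -> List[str]:
    """Command that asks the driver for a fixed speed (the driver may refuse)."""
    return [
        "ethtool", "-s", iface,
        "speed", str(speed_mbps),
        "duplex", "full",
        "autoneg", "on" if autoneg else "off",
    ]


# Placeholder interface name; look up the real one with `ip link` first.
print(" ".join(show_link_cmd("enp1s0f0")))
print(" ".join(force_speed_cmd("enp1s0f0")))
```

A change made this way does not survive a reboot, so it is a low-risk experiment, though it is best tried from the console rather than over the link being changed.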
I do not know how to update firmware on the X520 or the module, but I am willing to learn. Do you know of any good tutorials that I can consume? Also, where do I get new firmware from?
I can see about getting some standard SFP modules. Any model recommendations?
Do you think it is worth trying to get the 10Gtek ASF-10G-T or Mikrotik S+RJ10? I was also thinking about getting the 10Gtek ASF-10G-T80 because it uses a newer Broadcom BCM84891L controller instead of the Marvell 88X3310 that the others use meaning it would have to be flashed with different firmware. Also, it uses less power.
Another option I was looking at was the 10Gtek ASF-10G2-T as it uses the Marvell AQR113C controller.
I would appreciate any thoughts.
OP, those NICs don’t have fans built onto them, as they are usually inside servers with a lot of airflow. Does your server have a lot of airflow for removing the heat from it? That 10GE NIC has a tiny heatsink on its Intel chip. On my 10GE cards, I have added small Noctua fans to fix that heat issue in my custom low-noise servers (not much airflow).
On the switch ports, do you see errors on it? Dropped frames, tcp windows error, etc, when traffic arrives from the server?
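The switch-side counters live in the UniFi controller, but the server side of the same link can be checked with `ethtool -S <interface>`. Here is a hedged Python sketch that filters that output down to non-zero error-like counters. Counter names differ between drivers, so the keyword list and the sample text are illustrative assumptions, not real output from an X520.

```python
from typing import Dict

# Substrings that usually mark a counter as error-related (an assumption).
ERROR_HINTS = ("err", "drop", "discard", "miss", "crc")


def nonzero_error_counters(ethtool_s_output: str) -> Dict[str, int]:
    """Return error-like counters with a non-zero value from `ethtool -S` text."""
    counters = {}
    for line in ethtool_s_output.splitlines():
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        if not sep or not value.isdigit():
            continue  # skips the "NIC statistics:" header and odd lines
        if int(value) > 0 and any(hint in name.lower() for hint in ERROR_HINTS):
            counters[name] = int(value)
    return counters


# Illustrative sample only; real counter names depend on the driver.
SAMPLE = """NIC statistics:
     rx_packets: 1200
     tx_packets: 3400
     rx_crc_errors: 0
     rx_missed_errors: 17
     tx_dropped: 4
"""
print(nonzero_error_counters(SAMPLE))
```

Taking one snapshot before and one after an `iperf3 -R` run and comparing the two shows which counters, if any, move while the server is sending.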
Also, when using the -R option, your 10GE NIC starts pushing at 10GE. Can your switch “talk properly” to your server to make it slow down?
Did you try to connect your laptop directly to the server and retest without any switch in between? Modern NICs support Auto MDI-X, so it should work. You’ll have to set up an IP on your laptop on the same network segment as your server; no gateway IP is needed.
My NICs have the X540 chipset on the servers (with RJ-45 ports), connected to my FortiSwitch SFP+ ports with RJ-45 transceivers, and I have no issue on my 10GE network lab servers. My pfsense router uses a Chelsio T520-SO-CR2 over fiber to the same FortiSwitch. These cards also worked perfectly in the past with TrueNAS.
I do have adequate cooling. I am running a 2U supermicro chassis with all the stock supermicro fans installed and running. I had the same thought as you, so I set the fans to 100% (made 'em scream) and ran the test again. Same issue. I do not believe it is a cooling issue.
I see Retries when using the -R option. The link speed reported by the switch, which is set to auto negotiate, is indicating 1G, not 10G. My Ubiquiti switches are not 10G capable. The X520 and the SFP+ modules claim they support 1G or 10G. I did set that port on the Ubiquiti switch specifically to a 1G link. There was no change after doing that.
I did not connect directly to the 10G port on that NIC with my laptop, as that port is my main connection to the network for my services and is the “Production” system for my services. It is also the management port for Proxmox. It is kind of a pain to change that. At least it used to be. Maybe Proxmox VE 8.x made it easier. I did run a test going from the laptop to the server using the Realtek NIC instead. This did give the expected ~940 Mbps in both directions. If I get a chance to take down the “production” services, I will try doing a direct connection to the port in question.
Right now, based on others’ feedback here, I believe the issue to be with the SFP+ modules that I am using. Specifically, I think the problem is that I am running them at 1G, not 10G. I found some cheap HP 1G standard SFP modules on ebay. I bought one. It should arrive early next week. I will test with that and report back my findings.
If I can convince the wife to let me spend more money on a 10Gtek ASF-10G-T80, I will get that, run the tests and report back.
If you had an SFP module issue, it would be in both ways, not only one way. If the SFP wouldn’t support 1G, you would have no connectivity at all.
To me, your problem is somehow the NIC of your laptop when receiving data. What CPU are you using and what OS is on the laptop? Have you tried another laptop/PC with the same connection? Do you offload anything to your NIC? If so, disable that and let the OS do that job. The throttling implementation of some offload functionality on some NICs is not always done correctly.
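On Linux, the offload settings mentioned above can be listed with `ethtool -k <interface>` and changed with `ethtool -K`. The Python sketch below parses the listing and builds a command to turn off the common segmentation and receive offloads. The interface name is a placeholder, the helper names are invented for illustration, and the sample listing is abbreviated. Whether disabling these changes anything in this particular case is untested.

```python
from typing import Dict, List, Sequence


def parse_offload_state(ethtool_k_output: str) -> Dict[str, bool]:
    """Map each feature name in `ethtool -k` output to True (on) or False (off)."""
    state = {}
    for line in ethtool_k_output.splitlines():
        name, sep, value = line.strip().partition(":")
        tokens = value.split()  # value may look like "off [fixed]"
        if sep and tokens and tokens[0] in ("on", "off"):
            state[name] = tokens[0] == "on"
    return state


def disable_offloads_cmd(iface: str, features: Sequence[str] = ("tso", "gso", "gro")) -> List[str]:
    """Build an `ethtool -K` command that switches the given features off."""
    cmd = ["ethtool", "-K", iface]
    for feature in features:
        cmd += [feature, "off"]
    return cmd


# Abbreviated, illustrative listing; a real one has many more lines.
SAMPLE_K = """Features for enp1s0f0:
tcp-segmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on
rx-vlan-filter: on [fixed]
"""
print(parse_offload_state(SAMPLE_K))
print(" ".join(disable_offloads_cmd("enp1s0f0")))
```

Like a forced link speed, an offload change made this way is lost at reboot, so the iperf3 test can be rerun and the original settings restored just by restarting.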
I did state my test above that the NIC in my laptop worked fine when not connecting to the SFP module, but to a different port.
Both the server and the TrueNAS server have this problem, and not just with the laptop NIC. It also happens with a Raspberry Pi 4, an Intel X550, another laptop, even from X520 to X520 (the server to/from TrueNAS). In that last case, both directions exhibit ~250-350 Mbps speeds, which would make sense. There only seems to be a problem when using connections with the SFP+ ports. Could it be the two X520 NICs? Sure. If the newly procured SFP modules I ordered do not change the behavior, then I believe the X520 NICs would be the problem.
If that is the case, I have another 10G NIC, an HP something, that I can try the SFP+ modules in to see if the problem persists.
Could it be that your SFP+ transceiver is not compatible with the NIC then?
I suppose that it could be. The next test I do may help determine that. It is not a problem at 10G though. I previously used these NICs and SFP+ modules in a point-to-point test between the server and the TrueNAS server and the machine with my X550. I was able to achieve ~9.40 Gbps in both directions and even ~9.80 Gbps in both directions with jumbo frames enabled. That was a few years ago now.
I do currently have the server and TrueNAS server connected directly using the same Intel X520 in both machines, but that link speed is 10G. I am using Ipolex Fiber Transceivers for that connection though. Not RJ45 modules.
I purchased an HP 659580-001 1G SFP RJ-45 module. Running it through the same tests showed expected/proper behavior. I get ~941 Mbps in both directions with no retries.
I believe that the Ipolex ASF-10G-T SFP+ module has some problems when running at a 1G link speed.
Follow Up Steps:
I am going to do more testing at 10G, but that will take me a bit of time and effort to set that up. I believe that I did that testing a few years ago with these modules and there was no problem at 10G.
I am going to see if I can convince the wife to let me buy a 10Gtek ASF-10G-T, Mikrotik S+RJ10, and/or 10Gtek ASF-10G-T80. It would be interesting to see if these also have the problems that the Ipolex ASF-10G-T SFP+ module seems to have. Anyone have any suggestions for other SFP+ modules to try that might be cheaper?