Link Aggregation Concept

jinu · June 27, 2021, 2:58pm

Hi Folks
I am confused on the concept of Link Aggregation. I always thought that link aggregation was used to increase available bandwidth and failover.

So i connected two 1g ports on a Microtik switch and configured them in bonding (802.3ad) . Similarly on the QNAP NAS i configured the two 1g ports as Link Aggregation with mode as 802.3ad.

I then ran some tests using iperf3. What i have realized is that this is only doing fail over. Whether one cable is connected or two makes no difference , the speed is always the same (around 700 Mbps)

Is this the correct behaviour of Link Aggregation ?

brwainer · June 27, 2021, 3:07pm

If you are only testing with one Client-Server connection, meaning a single pair of IPs/MACs, then this is the correct behavior. LACP computes a hash of either the MACs (layer2) or MACs and IPs (layer2+3) and uses that to decide which cable each packet will go down. This is done to keep packets in order within each connection. The effect is that you will only notice both connections being used if you have two or more distinct client-server connections.

jinu · June 27, 2021, 3:45pm

Thanks @brwainer

I tested the scenario. I setup 2 iperf server instances on the QNAP nas and connected to it from two different servers, But same results, the speed on the QNAP NAS is around 700-800mbps. I then reversed the sceanrio, ran 2 iperf servers on the 2 machines and the connected to both simulatanously from qnap nas. Same results net speed was about 800Mbps.

brwainer · June 27, 2021, 4:01pm

You ran the iperfs simultaneously, and got 700-800 on both, totaling 1400-1600? Or you ran them simultaneously and got 350-400 on both, totaling 700-800?

If the former then its working. If the latter then a few possibilities:

The NAS has a CPU bottleneck - watch its CPU usage while testing
You made a mistake on the Mikrotik switch and the traffic isn’t being fully handled by the switch chip - watch the CPU load on the Mikrotik while running your tests, it should barely increase, and definitely not get close to 100%. Also make sure every port that’s part of the bridge, including the bond, shows an “H” next to it which indicates hardware offloading. Manual:CRS3xx series switches - MikroTik Wiki
The other devices you chose to test with have a hash collision, meaning that the hash result for both is the same. Try changing one of their IPs by +/- 1.

I am assuming that the NAS and the test systems are in the same subnet - is this correct?

FredFerrell · June 28, 2021, 1:22am

Any given flow will only be around 1gb. Imagine you are driving on a highway that is 2 lanes. You are the only car on the road and having two lanes doesn’t mean you will travel twice as fast. Each car is essentially a flow; in your case the iperf speed test. The top speed won’t change, but you will be able to run more flows in aggregation.

jinu · June 28, 2021, 1:57am

Thanks @brwainer
It is the latter case where net speed of the 2 clients combined is about 800Mbps

The NAS CPU usage tops at about 29%
The switch CPU load goes to about 5% max. H is present on all ports
I pulled in a 3rd machine to do the test again. There also same result. Total speed does not go above 800 Mpbs.

I monitored the traffic on the 2 ports and realised all the traffic is flowing only on one port. So one port is showing traffic of about 700Mbps and the other port is showing traffic of under 2Mbps

jinu · June 28, 2021, 1:58am

Thanks @FredFerrell
But i am testing with 2 servers connecting to the QNAP NAS, so would that not be the equivalent of 2 cars on the highway ?

brwainer · June 28, 2021, 2:27am

Try changing the hash type on the QNAP side, step #5 here: Set Port Trunking on your QNAP NAS to increase the bandwidth via 802.3ad protocol. | QNAP

Also please confirm these things:

You did actually set the QNAP side to 802.3ad, which is the true specification for LACP
Your NAS and the test clients are all in the same subnet

FredFerrell · June 28, 2021, 2:59am

Yes, and those two cars whether they are in the same lane or not will only be able to go at 1Gb of speed.

jinu · June 28, 2021, 3:08am

thanks @brwainer

I changed the Hash type to MAC+IP and this is what i noticed

Speeds stays same around 800Mbps
But now it is load balanced between the 2 ports. So i can see the data transferred on each port varying between 100Mps to 800Mbps, the second port varies between 800Mbps to 100Mbps so total data transfer hovers around 800-900Mbps. its like each iperf client has picked one port
CPU load on NAS is around 70-80%
CPU Load on Mikrotik switch is around 5%

To answer your questions

Yes on Qnap side i choose 802.3ad with Layer2+3 Hash Policy same on Mikrotik Switch
They are different Subnets.

My only guess is that on QNAP side the backplane is 1Gbps on which the two network ports are connected, but dont know how to verify that,

brwainer · June 28, 2021, 3:27am

Ah so I see the problem here.

Because the clients and NAS are in different subnets, to reach each other they have to be going through your router. Your router only has single 1Gb connections, correct? Regardless of whether its a single cable with two VLANs on it, or separate cables for each subnet, that 1Gb connection is your bottleneck.

Trying having two or more clients on the same subnet as the NAS if you want to see the LACP in action.

This is exactly what I would expect with your setup, because within the NAS’s subnet, only the router’s MAC is seen, for all clients on other subnets. There was nothing for the hash to distinguish with. By changing it to MAC+IP it is now distinguishing based on the IP addresses.

If you really want to push this with different subnets, you’ll need LACP or faster single connection(s) between the switch and router.

brwainer · June 28, 2021, 3:31am

You’re really not helping here - “two cars” (two clients) on a “2-lane highway” (dual connections with LACP) can together use 2Gb/s, provided you don’t have some other bottleneck in between (router) and chose the correct settings (hash). Yes, individually they can only use one lane, 1Gb/s, but @jinu understood that already. Your analogy really didn’t contribute to their understanding.

jinu · June 28, 2021, 5:21am

Thanks @brwainer
The two machines i am using for testing are connected to the Mikrotik Switch on 10Gbps ports (SFP+). And one of them acts as the router (pfsense box). The QNAP is connected to the same switch on two 1Gbps ports. So not sure if the router/Switch is the problem.

But all the same i tested by moving all devices to the same subnet and there was a marginal improvement. Speed goes up to between 900 and 1000 Mbps (this is with 2 threads per iperf client)
QNAP CPU utilization continues to be around 70%

But still anticipated end state not achieved.

EDIT: I think i found the problem. On Qnap if i check at command prompt using top i see a CPU utilization of around 60-70%. But on the QNAP dashboard CPU utilization shows 100%. I am not sure how this works. I am guessing that the CPU is topping off . Worse if I keep running this like this for some time (10-15 minutes) the NAS crashes.

Ridiculous product design whats the point of providing port trunking if the CPU is not capable of handling it. If it is only for resiliency then it should be stated as such.

FredFerrell · June 28, 2021, 12:56pm

The cars are flows, not clients, and a single flow will only be able to run at 1Gbit. BTW, I have used this analogy with many students and it seems to work for the majority of them.

brwainer · June 28, 2021, 5:27pm

You got to it before I could come back and say that it seemed the CPU on the QNAP was the likely culprit…

QNAP, Synology, and others use a common operating system with common features across their entire lineup. The CPU is generally matched to the expected set of drives the appliance will handle. 2-4 hard drives in an array can exceed 1Gb/s in sequential reads or writes, but its fairly common for a NAS to be accessed by multiple clients, causing the activity on the drives to become effectively random. Therefore with 2-4 3.5" bays, you don’t usually expect to serve more than 1Gb/s regardless of the connectivity.

@FredFerrell I think the analogy is good in general, but the understanding of the concept here wasn’t the issue. Although I will disagree that the cars are flows vs clients - this depends on exactly what hash algorithm is being used on each side of the aggregation. Some devices will do layer2+3+4, which would include source and destination ports and therefore flows, but most including the ones in question for this exact situation are only doing layer2+3, meaning for the purposes of assigning traffic to one cable or the other, only MAC and IP are being looked at.

GregBinSD · June 29, 2021, 6:48pm

@jinu
Can you tell us which model QNAP NAS you are using in this experiment?

Thanks.

pjdouillard · June 29, 2021, 6:55pm

@jinu “normal” LAG that follows RFCs will only allow each point-to-point communication (or a flow as other mentioned) to go as fast as the fastest individual link you have in your bundle of links. Some proprietary LAG (like MC-LAG between physical chassis switch) will split the the flow among links but that is out of your control and will actually let you run a single flow at N x number of links you have bundled.
Else, you either tries to used different routing strategy on the LAG it self at layer 2 (MAC),3 (IP) and 4 (Port), or as in most big entreprise with tons of VLAN, you change your root bridge per vlan to control how STP will allow or block certain traffic on certain links. But that’s quite advance networking that need to be designed on paper before going full tilt on configuring the switch! But that will let you maximize link aggregation.

FredFerrell · June 30, 2021, 12:57am

One interesting technology that MS Server provides with SMB3 is the ability to load balance across multiple links without the need for leveraging LACP. Basically if you have two 1G ports, you can pass 2G between a client and server. You just have to make sure you have enough CPU and a good network card on both systems.

jinu · June 30, 2021, 2:09am

@GregBinSD I am using the QNAP TS-231
@pjdouillard I get that. but since i have initiated >2 flows I was hoping that they should individually be able to reach 1Gps and hence show a combined flow of at 1.5 to 1.7Gps. But as i observed does not look like the NAS CPU is capable of handling that.
@FredFerrell My use case is communication between 2 NAS servers (linux based). So this would not be applicable. Though useful info. Thanks

FredFerrell · June 30, 2021, 2:29am

Linux should support iSCSI so maybe look into setting up MPIO between them?