Routing multiple network interfaces

In the quest for better storage performance, I am trying Ethernet link aggregation.

I have added additional PCIe NIC interfaces to both my XCP-ng and FreeNAS hosts, with the intention of setting up LAG groups on each OS (and on my switch) so I have more bandwidth to store and retrieve backups. The switch I am using supports LACP, so I think I have what I need.

I have created virtual (LACP) interfaces on my FreeNAS and XCP-ng hosts with the new PCIe card Ethernet ports as members, and I have added the associated switch ports to LAG groups with LACP enabled - one group for each host. This is intended to give me 2Gbps networking from the Xen Orchestra VM on XCP-ng to the FreeNAS host for backups. Creating the aggregated ports seems to have worked, and the switch reports the new LAG groups as active.

What I cannot get right is the IP addressing and routing - I need some IP routing pointers:

My default LAN network is (for the sake of argument) 10.0.0.0/24. First assumption: I think the virtual interface for the LAG group needs to reside on a different network - otherwise, how could routing work? So I chose 10.0.1.0/24. See below:

XCP-NG host

  • eth0 (management) 10.0.0.10/24, GW 10.0.0.1
  • eth1 (LAG1)
  • eth2 (LAG1)
  • LAG1 10.0.1.10/24 <-- Different network to eth0 (required?)

FreeNAS host

  • eth0 (management) 10.0.0.20/24, GW 10.0.0.1
  • eth1 (LAG2)
  • eth2 (LAG2)
  • LAG2 10.0.1.20/24 <-- Different network to eth0 (required?)

Layer2 Managed Switch

  • LAG1 ports 1, 2
  • LAG2 ports 3, 4
  • LAG1 LACP managed based on MAC address
  • LAG2 LACP managed based on MAC address

I noted that on both hosts the LAG virtual interface does not offer a way to set a default gateway, which suggests to me the operating system falls back to a default interface whenever the destination is ambiguous. It also implies there is a concept of interface selection (routing), so that traffic sent to a known network uses a specific interface. I am not clear whether that routing is implicit or needs to be made explicit (static).

I need clues as to how IP traffic is routed across multiple interfaces, based on the destination IP. I know that in a simple single-interface network, anything destined for a network outside the local network (based on the netmask) is sent on to the default gateway. In this LAG case it's more complex: I have multiple interfaces. Do I need to express static routes so traffic to the network matching LAG1 actually goes over LAG1? What controls the routes on Debian and FreeBSD - the /etc/network/interfaces file, for example?
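For illustration, this is roughly what I would expect the routing table on the XCP-ng host to look like (ip route output; I am assuming the LAG appears as something like bond0, and I am unsure whether its connected route shows up automatically):

    default via 10.0.0.1 dev eth0
    10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.10
    10.0.1.0/24 dev bond0 proto kernel scope link src 10.0.1.10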

I only need some search terms, and I’ll Google the rest - as usual, it’s all about knowing what the question is (how to phrase it).

Thanks,

Been a while since I have tested LAG on FreeNAS, but it is on my todo list. I just built a new (from old parts) FreeNAS lab server for testing, so I will do some videos on that topic soon.

Thanks Tom. Videos on this topic would be popular, I think. If I figure it all out I will post back here with what turned out to be crucial for me.

Could be handy to highlight the difference between LAG and LACP; for the life of me I can't really suss out what the difference is and why I would use one over the other. Though I notice LACP is not present on all switches, so I try to stick with those that have it.

LACP and LAG are basically the same thing. LACP (Link Aggregation Control Protocol) is the protocol used in the link aggregation, and LAG (Link Aggregation Group) is another way switch manufacturers refer to it in their configs. As long as both sides support 802.3ad you should be fine.

Regarding your setup: yes, it is a good idea to have your LAG and management interfaces on different networks for security reasons, but it isn't required. You could give your LAG interfaces IPs on the same network as your management IPs if you wanted to.

I read your question about traffic routing or passing across multiple ports of a LAG and wanted to give an explanation of what LACP is and how it works. First, LACP is a layer 2 protocol, so no actual routing (a layer 3 operation) takes place. When you configure a LAG/port-channel/EtherChannel (all different names for the same thing) it has an algorithm that determines how traffic is passed across it. This algorithm uses things such as MAC address, IP address, and port number to determine which physical link it will use. There is a general misconception that a LAG will give you double the bandwidth, but that is not the case. LACP is a load sharing solution, NOT a load balancing solution.
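As a concrete illustration (this assumes a Linux bonding interface named bond0; XCP-ng and FreeNAS may name and expose theirs differently), the Linux bonding driver shows its link-selection algorithm as the transmit hash policy:

    # Show which fields the bond hashes on to pick a member link
    cat /sys/class/net/bond0/bonding/xmit_hash_policy
    # Typical values: layer2 (src/dst MAC), layer2+3 (MAC and IP),
    # layer3+4 (IP and TCP/UDP ports)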

Let me give you an example. You have your virtual host (the XCP-ng box at 10.0.1.10) that needs to talk to the FreeNAS box at 10.0.1.20. The frames coming from the virtual host have a source and destination MAC address, IP address, and port. The algorithm looks at this info and determines which physical link to forward the traffic on, but it won't send frames across both links for the same TCP session, since this info remains unchanged for the life of the session. You may have multiple TCP sessions coming across a LAG that the algorithm passes across different links, but any single TCP session will only ever have 1G of bandwidth.

As for configuring your LAG IP to route, I would look at the routing table of the OS. Chances are you’ll need to SSH to the box for these settings if they aren’t in the GUI.
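A sketch of where I would look (assuming SSH access to each box; XCP-ng's dom0 is Linux and FreeNAS is FreeBSD-based):

    # XCP-ng dom0 (Linux): show the routing table
    ip route

    # FreeNAS (FreeBSD): show the routing table
    netstat -rn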

Hope that gives a clear picture of how LACP functions. Let me know if you have any questions.

Ah OK, I follow that. I use Netgear switches (I don't have experience with other brands) and they do make a distinction between LAG and LACP. It looks to me that as long as it adheres to the 802.3ad (LACP) standard it will provide more resilience.

Yes, now that makes sense :)

I too would like to improve file transfer rates across my network (this should improve if I use jumbo frames), but I use LAG/LACP more for redundancy on the Ethernet cables between my switches.

Hi Fred,

Your post was most helpful. I have made my config work now, which is very pleasing. There were two things I had not got right or had not appreciated:

  1. My Debian /etc/network/interfaces config file was not correct. I had it as follows (not working):
source /etc/network/interfaces.d/*

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface (management)
auto eth0
iface eth0 inet dhcp # 10.0.0.10 dhcp address reserved

# The LAG network interface (storage)
auto eth1
iface eth1 inet static
    address 10.0.1.10      # address and netmask alone were not
    netmask 255.255.255.0  # ... enough to get an IP assigned to eth1

My issue was that eth1 was not getting the requested IP, so I had to add two more directives:

# The LAG network interface (storage)
auto eth1
iface eth1 inet static
    address 10.0.1.10
    netmask 255.255.255.0
    network 10.0.1.0         # added - without this, no address was set on eth1
    broadcast 10.0.1.255     # also added
  2. LACP will not give me more bandwidth on a single session. I have confirmed this using iperf3 at both ends: the maximum a single iperf3 session can get is ~1Gbps. Two iperf3 sessions (two servers, two clients) each get 1Gbps, however. Nice.

The realisation in #2 is disappointing, as backup storage write performance is no better, although I can overlap backups and I have gained resilience, which is welcome.

Thanks for the pointers. Your attention is really appreciated. I love these forums!

Jumbo frames probably won't make a difference unless you are running 100Mb links (or slower) or have really high latency, like across a WAN link.

There are some storage protocols that will support multiple TCP sessions to move data, but you have to make sure the network design doesn't send them across a common link. iSCSI and SMB3 are two that you could look into. Basically, the ports each have their own IP address and the OS knows it can load balance across both of them since they reach the same destination. I would put each of these ports on its own network/VLAN on both sides, so port 1 on both the virtual host and the NAS is on one network/VLAN and port 2 is on another network/VLAN. If on the switch you only allow the VLAN that matches the network each port is on, it guarantees that traffic is split across the links. Also, you are then moving your redundancy to the application as opposed to the network via LACP.
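A sketch of what that addressing could look like (the subnets and VLAN IDs here are made up for illustration):

    # Path A - VLAN 20, network 10.0.2.0/24
    #   virtual host port 1: 10.0.2.10    NAS port 1: 10.0.2.20
    # Path B - VLAN 30, network 10.0.3.0/24
    #   virtual host port 2: 10.0.3.10    NAS port 2: 10.0.3.20
    # Switch: put each port pair untagged/access in its matching VLAN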

I'm on gigabit at home, which is quite OK in most situations; however, occasionally I need to move data off 8TB disks, which can take ages over the network but half the time over USB3. My setup isn't optimal as I am constantly tweaking things. It's disappointing to hear there are no benefits from jumbo frames, but I will look into SMB3 - some more food for thought, thanks.

I have progressed with my effort to use LACP, and after waiting for another 20m Cat5e cable to arrive from Amazon I have enough cabling for the two active LACP port groups working between my switch, the FreeNAS host, and the XCP-ng host.

Testing two concurrent iperf3 connections from XCP-ng VM1 (with a virtual interface backed by two bonded physical interfaces) to FreeNAS (with a LAG interface made up of a bonded pair of physical interfaces) confirmed that the idea that link aggregation between two hosts will double bandwidth really is a misconception (thanks for pointing that out to me @FredFerrell). The reason is that the LACP bonding uses MAC and/or IP to decide which interface to use.

So I introduced a second VM on XCP-ng to run another iperf3 client from a new IP, and sure enough VM1 and VM2 both got 1Gbps throughput simultaneously over their bonded (aggregated) interface. This is a nice result. A sketch of the test is below.
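For reference, a minimal sketch of that test (two iperf3 servers are needed because a single iperf3 server only runs one test at a time; the VM addresses and ports here are illustrative):

    # FreeNAS end (10.0.1.20): two iperf3 servers on different ports
    iperf3 -s -p 5201 &
    iperf3 -s -p 5202 &

    # VM1 (e.g. 10.0.1.30):
    iperf3 -c 10.0.1.20 -p 5201 -t 30

    # VM2 (e.g. 10.0.1.31), run at the same time:
    iperf3 -c 10.0.1.20 -p 5202 -t 30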

I am interested in finding out more about the various interface aggregation/bonding modes in XCP-ng/Xen and FreeNAS. If I get some useful info I will edit this post and insert the details below.

XCP-ng networking (bonded interface) > network setting can be one of:

  • active-active
  • active-passive
  • LACP with load balancing based on IP and port of source and destination <-- I have this set
  • LACP with load balancing based on source MAC address

FreeNAS LAG interface types:

  • LACP <-- I have this set
  • Failover
  • Load balance
  • Round robin

My Zyxel switch has some LAG management options too:

  • static
  • LACP <-- I have this set
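To check what is actually in effect on each end, these are the commands I would try (a sketch; the bond and lagg interface names are assumptions):

    # XCP-ng dom0: list bonds and their properties, including the mode
    xe bond-list params=all

    # FreeNAS shell: the negotiated lagg protocol shows in the interface output
    ifconfig lagg0 | grep laggproto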

It’s always nice when expected results match testing or labbing efforts!

One other test I would look into setting up is MPHA iSCSI between your XCP-ng virtual host and your FreeNAS. MPHA stands for multipath high availability. Basically, it allows your virtual host and your storage to talk across both links for storage. It does this at layer 3, so there is no need for LACP in this setup. You won't be able to test it with iperf, but a disk speed test on a VM will create a traffic flow across both links.
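Once multipath iSCSI is up, a quick way to confirm both paths are live (assuming the standard device-mapper multipath tooling is present in the XCP-ng dom0):

    # List multipath devices and the state of each path
    multipath -ll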