I was wondering if anyone else has had issues specifically with VLAN tagging between VMware and UniFi switches when using 10Gb switches or switch ports?
I currently have a dialogue going on with Ubiquiti over this, since we’re deploying a new 10Gb network along with new VMware ESX 7.01 hosts.
I’ve never had any issues when using Cisco and VMware, and at 1Gb I don’t have issues with VMware and UniFi either. However, I seem to have a weird edge case when we use 10Gb between the VMware hosts and the 10Gb (US-16-XG) Ubiquiti switches.
If I don’t do any tagging and just use the native VLAN on the port, then everything works fine: we can ping the default gateway from any VM running on the host. But as soon as I change the port profile on the port to one with tagging on it and then move the VM to the tagged network within VMware, only layer 2 traffic seems to reach the switch, and nothing at layer 3 or above.
We can see the virtual machines’ MAC addresses in the MAC table on the switch, so we know at least layer 2 is sort of working. Oddly, the MAC addresses appear in both the native and the tagged VLAN, even after I’ve cleared the MAC table from the previous setup.
However, we can’t ping out of the VM, and yet if I drop the connection down to 1Gb full duplex using the same hardware, same ports on the switch etc., then it all works fine. So the config itself should be correct.
We’ve tried different 10Gb network cards in the hosts, different SFP+ modules, and even 10Gb RJ45 SFP+ modules with a short length of CAT6 in between. I’ve also tried the 10Gb ports on the USW-PRO-48-POE switch, and the same thing happens.
We’ve also tried different config on the switch ports, such as setting the MTU to 1500, turning off the VLAN ingress filter etc.
What is really odd is I have exactly the same hardware running 2 pfSense servers, and those work perfectly fine with the tagging. So we know it’s not a hardware compatibility issue between the switch and the physical server hardware.
It’s almost like VMware is using a different 802.1Q standard to what the switch is expecting, so the VLAN traffic isn’t being split out properly.
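For anyone who wants to check the tagging on the wire: a standard 802.1Q tag is four bytes inserted after the source MAC, made up of a TPID of 0x8100 followed by 3 bits of priority, 1 DEI bit and a 12-bit VLAN ID. Below is a rough Python sketch for checking whether a raw captured frame carries such a tag. The function name and the sample frame (VLAN 20, an ARP payload of zeros) are made up for illustration and are not taken from our capture.

```python
import struct

TPID_8021Q = 0x8100  # standard 802.1Q Tag Protocol Identifier


def parse_vlan_tag(frame: bytes):
    """Return (vlan_id, priority, inner_ethertype), or None if no 802.1Q tag is found."""
    # 6 bytes dst MAC + 6 bytes src MAC + 4 bytes tag + 2 bytes inner EtherType
    if len(frame) < 18:
        return None
    tpid, tci, inner_ethertype = struct.unpack("!HHH", frame[12:18])
    if tpid != TPID_8021Q:
        return None  # untagged, or tagged with a non-standard TPID
    vlan_id = tci & 0x0FFF   # low 12 bits of the TCI
    priority = tci >> 13     # top 3 bits of the TCI
    return vlan_id, priority, inner_ethertype


# Hypothetical sample: broadcast ARP frame tagged with VLAN 20
sample = bytes.fromhex("ffffffffffff" "005056000001" "8100" "0014" "0806") + bytes(28)
print(parse_vlan_tag(sample))  # (20, 0, 2054), where 2054 == 0x0806 (ARP)
```

If this returns None for frames that should be tagged, the bytes at offset 12 show what the sender actually put there.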
We even tried port mirroring the 10Gb port and using Wireshark to inspect the traffic, and we see almost nothing but ARP, with very little TCP/IP or ICMP traffic, even while we’re running a ping on the server.
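To put numbers on "mostly ARP", the mirrored frames could be tallied per VLAN and protocol. This is only a sketch: it assumes the raw Ethernet frames have already been pulled out of the capture as byte strings (how you export them is up to you), and the `summarize` function and test frames are hypothetical. Some capture NICs and drivers strip VLAN tags before the capture tool sees them, in which case every frame will show up as untagged here.

```python
from collections import Counter
import struct

# Well-known EtherType values
ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x86DD: "IPv6"}
TPID_8021Q = 0x8100


def summarize(frames):
    """Count frames per (vlan_id, protocol); vlan_id is None for untagged frames."""
    counts = Counter()
    for frame in frames:
        if len(frame) < 14:
            continue  # too short to hold an Ethernet header
        (ethertype,) = struct.unpack("!H", frame[12:14])
        vlan = None
        if ethertype == TPID_8021Q and len(frame) >= 18:
            # Tagged frame: read the TCI and the real EtherType behind the tag
            tci, ethertype = struct.unpack("!HH", frame[14:18])
            vlan = tci & 0x0FFF
        counts[(vlan, ETHERTYPES.get(ethertype, hex(ethertype)))] += 1
    return counts


# Hypothetical frames: two tagged ARP frames on VLAN 20, one untagged IPv4 frame
tagged_arp = bytes(12) + bytes.fromhex("810000140806") + bytes(28)
untagged_ipv4 = bytes(12) + bytes.fromhex("0800") + bytes(46)
for key, n in summarize([tagged_arp, tagged_arp, untagged_ipv4]).items():
    print(key, n)
```

A healthy tagged port carrying a ping would be expected to show IPv4 entries under the tagged VLAN as well as ARP.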
Any insight that anyone may have had with this would be appreciated.
I think it may be a really obscure “edge case” type issue with the UniFi firmware, but I’m interested to see if anyone else has run into this with similar setups.