I have a home lab that I wanted to use to spin up a couple of VMs and do some testing. The equipment is a virtualization server (Proxmox) connected to a pfSense appliance (SG-5100). I already had a VLAN set up in pfSense that I don’t use anymore and thought it could be repurposed for a test network.
To give a lay of the land, here is the setup:
pfsense+ : 21.50.2-RELEASE (upgraded last week)
proxmox : 7.0-9
vm test server : Ubuntu 20.04
Network:
LAN : 192.168.10.1/24
VLAN 20 : 192.168.20.1/24
So I go to spin up the VM, set the VLAN Tag of the network device to 20, and boot it up.
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0@if57: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 6e:e1:d0:de:3e:1e brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.168.20.54/24 brd 192.168.20.255 scope global dynamic eth0
valid_lft 7120sec preferred_lft 7120sec
Everything appears to be in order, however……
# ping 192.168.20.1
PING 192.168.20.1 (192.168.20.1) 56(84) bytes of data.
^C
--- 192.168.20.1 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4094ms
The pings to the gateway get no reply, yet the VM could still obtain an IP address from DHCP. Surely this is a firewall issue, right? Wrong.
I spent a good chunk of time messing with firewall rules, to the point that I deleted all the rules and set the network to be wide open. I still could not get a ping through.
At this point I was thinking it might be a Proxmox issue, so I started messing with settings there. After a while I shut down all my servers to reboot Proxmox, and a couple of reboots later, still no ping. Then it dawned on me that I had used this VLAN before and might still have a VM kicking around that was on that network. I had one left over. I turned it on, checked the IP, and did a ping.
# ping 192.168.20.1
PING 192.168.20.1 (192.168.20.1) 56(84) bytes of data.
64 bytes from 192.168.20.1: icmp_seq=1 ttl=64 time=0.133 ms
64 bytes from 192.168.20.1: icmp_seq=2 ttl=64 time=0.122 ms
64 bytes from 192.168.20.1: icmp_seq=3 ttl=64 time=0.164 ms
64 bytes from 192.168.20.1: icmp_seq=4 ttl=64 time=0.171 ms
^C
--- 192.168.20.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3052ms
rtt min/avg/max/mdev = 0.122/0.147/0.171/0.023 ms
I checked the settings on both VMs, looking for any discrepancy. The Proxmox settings were the same. On pfSense, however, there was exactly one difference.
In the DHCP server settings for VLAN 20, I had a static mapping for the old VM with ARP Table Static Entry checked.
I went to check my DHCP leases. The old VM is there and shows as online with a static lease type, but a static mapping will always show as online. The new VM, which is not static mapped, also shows up there: its lease is active but it is listed as offline, even though the VM is running and should be online.
I went to check the ARP table, and the new VM isn’t there. This is where the problem is. I’ve shut that VM down and started it again, and checked right after a fresh boot while it was getting an IP from the DHCP server, but it never goes into the ARP table.
I know that it’s making ARP requests because arpwatch is catching and logging it.
arpwatch 45357 new station 192.168.20.54 6e:e1:d0:de:3e:1e
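To make the ARP table check repeatable instead of eyeballing the page each time, something like the following could be run against the text output of the ARP table. This is only a rough sketch: it assumes output in the style of FreeBSD's "arp -an", and the sample lines, MAC addresses, and interface name below are made up for illustration, not taken from my box.

```python
import re

# Hypothetical sample in the style of FreeBSD `arp -an` output.
# The MAC addresses and the interface name are invented for illustration.
SAMPLE = """\
? (192.168.20.1) at 00:08:a2:00:00:01 on igb0.20 permanent [vlan]
? (192.168.20.10) at 6e:e1:d0:00:00:10 on igb0.20 permanent [vlan]
? (192.168.20.77) at (incomplete) on igb0.20 expired [vlan]
"""

ARP_LINE = re.compile(
    r"\((?P<ip>[\d.]+)\) at (?P<mac>[0-9a-f:]{17}|\(incomplete\)) on (?P<iface>\S+)",
    re.IGNORECASE,
)

def parse_arp(text):
    """Return {ip: (mac, iface)} for every resolved (non-incomplete) entry."""
    table = {}
    for line in text.splitlines():
        match = ARP_LINE.search(line)
        if match and match.group("mac") != "(incomplete)":
            table[match.group("ip")] = (match.group("mac").lower(), match.group("iface"))
    return table

def has_entry(text, ip, mac):
    """True if the ARP output maps `ip` to `mac` (case-insensitive)."""
    entry = parse_arp(text).get(ip)
    return entry is not None and entry[0] == mac.lower()

if __name__ == "__main__":
    # The new VM from the arpwatch log line is absent from the sample table.
    print(has_entry(SAMPLE, "192.168.20.54", "6e:e1:d0:de:3e:1e"))
```

Feeding it the real table after each VM boot would give a plain yes/no on whether the new MAC ever made it in.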
For extra measure, I ran a set of test cases with the same VM. The only difference between runs was that I incremented the MAC address each time, to make sure I wasn’t hitting some sort of cache.
NETWORK | DHCP    | VLAN Tag | Result
LAN     | dynamic | no       | pass
LAN     | static  | no       | pass
VLAN 20 | dynamic | 20       | fail
VLAN 20 | static  | 20       | pass
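For anyone who wants to repeat these tests, bumping the MAC between runs can be scripted. A minimal sketch, assuming colon-separated MACs; the function name next_mac is just something I made up for this example, and the new address still has to be set on the VM's network device by hand or by whatever tooling you use.

```python
def next_mac(mac: str, step: int = 1) -> str:
    """Return `mac` incremented by `step`, wrapping around at 48 bits."""
    value = (int(mac.replace(":", ""), 16) + step) % (1 << 48)
    raw = f"{value:012x}"
    # Re-insert the colons between each pair of hex digits.
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))

if __name__ == "__main__":
    mac = "6e:e1:d0:de:3e:1e"  # MAC of the test VM from the arpwatch log
    for _ in range(4):         # one fresh MAC per test case
        mac = next_mac(mac)
        print(mac)
```

Since only the low bits change, the vendor prefix and the locally administered bit in the first octet stay the same unless the lower bytes overflow.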
It appears that pfSense is not putting my servers in the ARP table, but only when they are on a VLAN and get a dynamic IP from DHCP. Is there something I am missing here? Is there a setting I may have misconfigured that would cause this? Or is it possible I’m looking at a bug? (I did upgrade pfSense to a newer version just a week ago.)
While I can work around this issue for now, it’s not ideal. If I need to spin up half a dozen containers / VMs (or more), I’ll have to go into pfSense and statically assign an address for each one, for every new project… this is going to be painful.
This issue is so bizarre to me. Any help on this issue would be appreciated.