pfSense VLAN with DHCP issue

I have a home lab that I wanted to use to spin up a couple of VMs and do some testing. The equipment is a virtualization server (Proxmox) connected to a pfSense appliance (SG-5100). I already had a VLAN set up in pfSense that I don’t use anymore and thought it could be repurposed as a test network.

To get a lay of the land, here is the setup:
 pfSense+ :  21.05.2-RELEASE  (upgraded last week)
 Proxmox :  7.0-9
 vm test server : Ubuntu 20.04

 Network:
 LAN : 192.168.10.1/24
 VLAN 20 : 192.168.20.1/24

So I go to spin up the VM, set the VLAN Tag of the network device to 20, and boot it up.

# ip a  
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0@if57: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 6e:e1:d0:de:3e:1e brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.20.54/24 brd 192.168.20.255 scope global dynamic eth0
       valid_lft 7120sec preferred_lft 7120sec

Everything appears to be in order, however...

# ping 192.168.20.1
PING 192.168.20.1 (192.168.20.1) 56(84) bytes of data.
^C
--- 192.168.20.1 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4094ms

The packets aren’t reaching the gateway. Yet the VM could still obtain an IP address from DHCP. Surely this is a firewall issue, right? Wrong.

I spent a good chunk of time messing with firewall rules, to the point that I deleted all the rules and set the network to be wide open. I still could not get a ping through.

At this point I’m thinking it might be a Proxmox issue, so I start messing with settings there. After a while I’m shutting down all my servers to reboot Proxmox, and a couple of reboots later, still no ping. Then it dawns on me that I did use this VLAN before and I might still have a VM kicking around that was on that network. I had one left over. I turned it on, checked the IP, and then did a ping.

# ping 192.168.20.1
PING 192.168.20.1 (192.168.20.1) 56(84) bytes of data.
64 bytes from 192.168.20.1: icmp_seq=1 ttl=64 time=0.133 ms
64 bytes from 192.168.20.1: icmp_seq=2 ttl=64 time=0.122 ms
64 bytes from 192.168.20.1: icmp_seq=3 ttl=64 time=0.164 ms
64 bytes from 192.168.20.1: icmp_seq=4 ttl=64 time=0.171 ms
^C
--- 192.168.20.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3052ms
rtt min/avg/max/mdev = 0.122/0.147/0.171/0.023 ms

I checked the settings on both VMs to make sure they were the same, looking for any discrepancy. The Proxmox settings were fine. On pfSense, however, exactly one thing was different.

In the settings for the DHCP server for VLAN 20 I had a static mapping for the old VM with ARP Table Static Entry checked.

I go to check my DHCP leases: the old VM is there and shows as online with a static lease type. However, a static mapping like this will always show as online. The new VM, which is not statically mapped, also shows up here, and its lease is active but it is listed as offline. Yet the VM is powered on and should be online.

I go to check the ARP table, and the new VM isn’t there. This is where the problem is. I’ve shut that VM down and started it again, and even on a fresh boot, right as it’s getting an IP from the DHCP server, it never goes into the ARP table.

I know that it’s making ARP requests because arpwatch is catching and logging it.

arpwatch	45357	  new station 192.168.20.54 6e:e1:d0:de:3e:1e
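
For anyone who wants to script this check rather than eyeball Diagnostics > ARP Table, here is a minimal sketch. It assumes you have saved the text printed by `arp -an` in a pfSense shell and that it follows the usual FreeBSD line format; the interface name `igb1.20` and every MAC except the new VM's context are made up for illustration. The helper names are hypothetical, not part of pfSense.

```python
import re

# Matches lines in the style printed by FreeBSD's `arp -an`, for example:
# "? (192.168.20.1) at 00:08:a2:00:00:01 on igb1.20 permanent [vlan]"
ARP_LINE = re.compile(
    r"\((?P<ip>\d+\.\d+\.\d+\.\d+)\) at "
    r"(?P<mac>[0-9a-fA-F:]{17}|\(incomplete\)) on (?P<iface>\S+)"
)


def parse_arp(output):
    """Return {ip: (mac, interface)} for every complete entry in the output."""
    table = {}
    for line in output.splitlines():
        match = ARP_LINE.search(line)
        # Entries still waiting for a reply show "(incomplete)"; skip those.
        if match and match.group("mac") != "(incomplete)":
            table[match.group("ip")] = (
                match.group("mac").lower(),
                match.group("iface"),
            )
    return table


def missing_from_arp(leased_ips, arp_output):
    """List the leased IPs that have no complete ARP entry."""
    table = parse_arp(arp_output)
    return [ip for ip in leased_ips if ip not in table]


# Hypothetical sample: the interface name and MAC addresses are assumptions.
SAMPLE = """\
? (192.168.20.1) at 00:08:a2:00:00:01 on igb1.20 permanent [vlan]
? (192.168.20.10) at 02:00:00:00:00:10 on igb1.20 permanent [vlan]
"""

# 192.168.20.54 holds a lease but has no ARP entry in the sample above.
print(missing_from_arp(["192.168.20.10", "192.168.20.54"], SAMPLE))
```

Any address this reports as missing holds a lease but has no usable ARP entry, which is the same symptom as the new VM above.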

For good measure, I ran a set of test cases with the same VM. The only difference between runs was that I incremented the MAC address each time, to make sure I wasn’t hitting some sort of cache.

NETWORK  | DHCP    | VLAN Tag | Working
LAN      | dynamic | no       | pass
LAN      | static  | no       | pass
VLAN 20  | dynamic | 20       | fail
VLAN 20  | static  | 20       | pass
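
Incrementing the MAC by hand for each run gets tedious. A small helper along these lines could generate the next address; `next_mac` is a hypothetical name, and this is only a sketch of the arithmetic, not something Proxmox or pfSense provides.

```python
def next_mac(mac, step=1):
    """Return the MAC address `step` above `mac`, wrapping at 48 bits."""
    # Treat the six colon-separated octets as one 48-bit integer.
    value = int(mac.replace(":", ""), 16)
    value = (value + step) % (1 << 48)
    raw = f"{value:012x}"
    # Re-insert the colons between each pair of hex digits.
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


print(next_mac("6e:e1:d0:de:3e:1e"))  # 6e:e1:d0:de:3e:1f
```

Note that adding 1 only changes the low-order bytes, so the locally administered bit in the first octet stays as it was unless the counter wraps all the way around.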

It appears that for some reason pfSense is not putting my servers in the ARP table, but only when they are on a VLAN and get a dynamic IP from DHCP. Is there something I may be missing here? Is there a setting I may have misconfigured that would cause this? Or is it possible I’m looking at a bug? (I did upgrade pfSense to a newer version just a week ago.)

While I can work around this issue for now, it’s not ideal. If I need to spin up half a dozen containers or VMs (or more), I’ll have to go into pfSense and statically assign an address to each and every one for every new project... this is going to be painful.

This issue is so bizarre to me. Any help on this issue would be appreciated.

I don’t use Proxmox (Jay from Learn Linux TV does), but I am not aware of any special settings needed for VLANs and virtual servers. I use XCP-NG and I just add the VLAN tag when creating the extra network, and it works like any other switch or AP does for putting things on that VLAN.

Well the problem is with pfSense. It doesn’t matter what hypervisor I use.

The TL;DR :

Any VLAN I have on pfSense with DHCP enabled will not record any client to the ARP Table.

The client will make a broadcast, pfSense responds and hands out an IP to the client.
Then nothing. pfSense never writes it to the ARP table.

If it’s not a VLAN, it works just fine... which is why this is bizarre.
I only noticed this problem after I upgraded pfSense+ to version 21.05.2-RELEASE (amd64) last week.

I use pfSense with a variety of VLANs which have DHCP enabled, and I see the DHCP clients in the ARP table on pfSense (Diagnostics > ARP Table). They are real, not virtualized, but that shouldn’t make a difference. I am using UniFi and set the VLAN on the WiFi network and/or switch. My pfSense DHCP service is pretty default: I specify a range, DNS server, gateway, and domain name, and ignore BOOTP where that makes sense. Maybe try a new VLAN rather than reusing an old one?

Try posting in the pfSense forums at https://forum.netgate.com/; maybe someone there will have some more insight.