Hi,
I’m having a bit of a problem with policy routing via an OpenVPN uplink. I hope someone can help me. Here’s the setup:
An SG-7100 is connected to the internet via a static subnet on interface lagg0.4090.
Additionally we’re using a peer-to-peer OpenVPN tunnel to a router in a datacenter in order to route a couple of additional public IP subnets to our on-site firewall.
We’ve been doing this for a couple of years with a Linux-based router/firewall, and we’re now in the process of migrating that setup to our new SG-7100.
The incoming traffic is routed via an OpenVPN transfer net (172.29.2.0/24) to our pfSense and is then forwarded via another internal transfer net to a host that uses one of our public addresses from that datacenter subnet: 95.216.47.180.
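Roughly, the inbound path looks like this (the internal transfer net addressing is left out here):

```
home office (217.240.147.187)
        |
     internet
        |
datacenter router == OpenVPN tunnel (172.29.2.0/24) == SG-7100 / pfSense
                                                            |
                                                  internal transfer net
                                                            |
                                                  core-hg (95.216.47.180)
```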
The incoming traffic reaches our host just fine, as we can see with tcpdump on the host:
root@core-hg:~# tcpdump -n -i ens160 |grep 95.216.47.180
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens160, link-type EN10MB (Ethernet), capture size 262144 bytes
12:10:02.882885 IP 217.240.147.187 > 95.216.47.180: ICMP echo request, id 35634, seq 33015, length 44
12:10:02.882904 ARP, Request who-has 192.168.100.250 tell 95.216.47.180, length 28
12:10:02.883350 IP 95.216.47.180 > 217.240.147.187: ICMP echo reply, id 35634, seq 33015, length 44
12:10:02.985654 IP 217.240.147.187 > 95.216.47.180: ICMP echo request, id 35634, seq 33016, length 44
12:10:02.985665 IP 95.216.47.180 > 217.240.147.187: ICMP echo reply, id 35634, seq 33016, length 44
12:10:03.086816 IP 217.240.147.187 > 95.216.47.180: ICMP echo request, id 35634, seq 33017, length 44
12:10:03.086828 IP 95.216.47.180 > 217.240.147.187: ICMP echo reply, id 35634, seq 33017, length 44
12:10:03.187353 IP 217.240.147.187 > 95.216.47.180: ICMP echo request, id 35634, seq 33018, length 44
In this example I’m pinging the host (95.216.47.180) from my home office (217.240.147.187).
As you can see, the host generates reply packets and routes them back via the transfer net to our SG-7100. I can see the reply packets on the SG-7100:
[2.4.4-RELEASE][root@netgate1.wittich-hoehr.de]/root: tcpdump -n -i ix0 |grep 95.216.47.180
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ix0, link-type EN10MB (Ethernet), capture size 262144 bytes
12:12:47.177766 IP 217.240.147.187 > 95.216.47.180: ICMP echo request, id 35651, seq 33015, length 44
12:12:47.177832 IP 95.216.47.180 > 217.240.147.187: ICMP echo reply, id 35651, seq 33015, length 44
12:12:47.276528 IP 217.240.147.187 > 95.216.47.180: ICMP echo request, id 35651, seq 33016, length 44
12:12:47.276602 IP 95.216.47.180 > 217.240.147.187: ICMP echo reply, id 35651, seq 33016, length 44
12:12:47.379494 IP 217.240.147.187 > 95.216.47.180: ICMP echo request, id 35651, seq 33017, length 44
12:12:47.379566 IP 95.216.47.180 > 217.240.147.187: ICMP echo reply, id 35651, seq 33017, length 44
So, up until this point everything works as expected.
But despite a policy-routing rule that should send traffic from 95.216.47.160/27 back over the OpenVPN link, the SG-7100 is actually routing those reply packets via its regular default next hop on lagg0.4090:
[2.4.4-RELEASE][root@netgate1.wittich-hoehr.de]/root: tcpdump -n -i lagg0.4090 | grep 95.216.47.180
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lagg0.4090, link-type EN10MB (Ethernet), capture size 262144 bytes
12:16:00.254283 IP 95.216.47.180 > 217.240.147.187: ICMP echo reply, id 35671, seq 33015, length 44
12:16:00.356478 IP 95.216.47.180 > 217.240.147.187: ICMP echo reply, id 35671, seq 33016, length 44
12:16:00.461685 IP 95.216.47.180 > 217.240.147.187: ICMP echo reply, id 35671, seq 33017, length 44
12:16:00.564703 IP 95.216.47.180 > 217.240.147.187: ICMP echo reply, id 35671, seq 33018, length 44
12:16:00.668986 IP 95.216.47.180 > 217.240.147.187: ICMP echo reply, id 35671, seq 33019, length 44
As a result, the reply packets are discarded by our local ISP and never reach my home office.
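For reference, I’d expect my policy route to translate into a pf route-to rule roughly like the one below. This is only a sketch: the OpenVPN interface name (ovpnc1) and the far tunnel endpoint (172.29.2.2) are assumptions on my part, not taken from the actual configuration — only ix0 and the 95.216.47.160/27 subnet are from the real setup:

```
# Hypothetical pf equivalent of the policy-routing rule.
# "ovpnc1" (OpenVPN client interface) and the gateway 172.29.2.2
# (remote end of the tunnel transfer net) are assumed names/addresses.
pass in quick on ix0 route-to (ovpnc1 172.29.2.2) \
    inet from 95.216.47.160/27 to any keep state
```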
One might think I simply made a mistake setting up the policy route, but I’m pretty sure the rule is correct, because outgoing traffic from 95.216.47.180 that is INITIATED here is routed correctly via the OpenVPN uplink:
root@core-hg:~# traceroute -n -s 95.216.47.180 1.1.1.1
traceroute to 1.1.1.1 (1.1.1.1), 30 hops max, 60 byte packets
1 172.29.2.1 32.577 ms 32.648 ms 32.640 ms
2 95.216.14.65 32.881 ms 32.879 ms 33.085 ms
3 213.239.224.133 32.828 ms 32.832 ms 32.808 ms
4 213.239.252.102 39.615 ms 39.714 ms 213.239.224.17 39.603 ms
5 194.68.128.246 38.930 ms 194.68.123.246 39.949 ms 40.030 ms
6 1.1.1.1 40.883 ms 40.647 ms 39.679 ms
As you can see, the traffic leaves through our OpenVPN transfer net and reaches its destination. The reply packets from 1.1.1.1 are also routed correctly back through that OpenVPN uplink.
So I would say the policy rule works in general, but I must have missed something.
Do you have any idea why traffic that is INITIATED by our host is routed correctly through OpenVPN, while REPLY traffic goes out through the regular default route?
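In case it helps with debugging, this is roughly what I plan to check next on the SG-7100 (a sketch only — the exact grep context and output format will differ on a live box):

```
# Verbose state listing: the entry for 95.216.47.180 shows which rule
# created the state and whether a route-to/reply-to target is attached.
pfctl -vvss | grep -A 3 95.216.47.180

# Verbose ruleset listing: check whether the policy-routing rule
# actually carries the expected route-to gateway.
pfctl -vvsr | grep -B 1 route-to
```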