Policy routing not working on replies (works on host-initiated traffic)

Started by ybizeul, December 10, 2020, 09:57:59 AM

Hi folks,

I'm pulling my hair out on this one.

I'm trying to route traffic from a specific network to a VPN gateway.

It works great for ping initiated from my host (ovpnc5 is the VPN interface I should go through):


tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ovpnc5, link-type NULL (BSD loopback), capture size 262144 bytes
08:54:14.110777 IP 10.101.0.100 > 1.1.1.1: ICMP echo request, id 68, seq 1, length 64
08:54:14.124537 IP 1.1.1.1 > 10.101.0.100: ICMP echo reply, id 68, seq 1, length 64


But for traffic the host replies to, the reply comes in on igb2 and immediately goes out igb1 instead of ovpnc5, which is the interface the gateway rule should select.


# igb2 is the interface for the 10.101.0.100 network
root@opnsense:~ # tcpdump -i igb2 -n icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on igb2, link-type EN10MB (Ethernet), capture size 262144 bytes
08:45:35.525754 IP xxx.xxx.xxx.xxx.xxx > 10.101.0.100: ICMP echo request, id 50180, seq 220, length 64
08:45:35.525947 IP 10.101.0.100 > xxx.xxx.xxx.xxx.xxx: ICMP echo reply, id 50180, seq 220, length 64



# igb1 is the interface for the default route network
root@opnsense:~ # tcpdump -i igb1 -n icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on igb1, link-type EN10MB (Ethernet), capture size 262144 bytes
08:46:30.715195 IP 10.101.0.100 > xxx.xxx.xxx.xxx.xxx: ICMP echo reply, id 50180, seq 275, length 64
08:46:31.719935 IP 10.101.0.100 > xxx.xxx.xxx.xxx.xxx: ICMP echo reply, id 50180, seq 276, length 64
08:46:32.723703 IP 10.101.0.100 > xxx.xxx.xxx.xxx.xxx: ICMP echo reply, id 50180, seq 277, length 64


So the replies take a different route than the requests and never get through.

Does that make sense?


Thank you, the issue certainly looks similar.

But I think I disabled reply-to pretty much across the board.

I'm wondering if my setup can be called "multiple wan", as in some way I'm making another WAN through VPN...



This is the trace from the OPNsense LAN interface, facing the machine that replies to the ping:


1 2020-12-11 21:18:18.753424 xxx.xxx.xxx.xxx.xxx 10.101.0.100 ICMP 98 Echo (ping) request  id=0xa25f, seq=33/8448, ttl=51 (reply in 2)
Ethernet II, Src: IskraTra_d0:9b:e3 (00:0e:c4:d0:9b:e3), Dst: VMware_a4:72:a2 (00:50:56:a4:72:a2)

2 2020-12-11 21:18:18.753621 10.101.0.100 xxx.xxx.xxx.xxx.xxx ICMP 98 Echo (ping) reply    id=0xa25f, seq=33/8448, ttl=64 (request in 1)
Ethernet II, Src: VMware_a4:72:a2 (00:50:56:a4:72:a2), Dst: IskraTra_d0:9b:e3 (00:0e:c4:d0:9b:e3)


This is packet 2 in full, the ping reply:


Frame 2: 98 bytes on wire (784 bits), 98 bytes captured (784 bits)
    Encapsulation type: Ethernet (1)
    Arrival Time: Dec 11, 2020 21:18:18.753621000 CET
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1607717898.753621000 seconds
    [Time delta from previous captured frame: 0.000197000 seconds]
    [Time delta from previous displayed frame: 0.000197000 seconds]
    [Time since reference or first frame: 0.000197000 seconds]
    Frame Number: 2
    Frame Length: 98 bytes (784 bits)
    Capture Length: 98 bytes (784 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: eth:ethertype:ip:icmp:data]
    [Coloring Rule Name: ICMP]
    [Coloring Rule String: icmp || icmpv6]
Ethernet II, Src: VMware_a4:72:a2 (00:50:56:a4:72:a2), Dst: IskraTra_d0:9b:e3 (00:0e:c4:d0:9b:e3)
    Destination: IskraTra_d0:9b:e3 (00:0e:c4:d0:9b:e3)
        Address: IskraTra_d0:9b:e3 (00:0e:c4:d0:9b:e3)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Source: VMware_a4:72:a2 (00:50:56:a4:72:a2)
        Address: VMware_a4:72:a2 (00:50:56:a4:72:a2)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 10.101.0.100, Dst: xxx.xxx.xxx.xxx.xxx
    0100 .... = Version: 4
    .... 0101 = Header Length: 20 bytes (5)
    Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
        0000 00.. = Differentiated Services Codepoint: Default (0)
        .... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
    Total Length: 84
    Identification: 0x9097 (37015)
    Flags: 0x00
        0... .... = Reserved bit: Not set
        .0.. .... = Don't fragment: Not set
        ..0. .... = More fragments: Not set
    Fragment Offset: 0
    Time to Live: 64
    Protocol: ICMP (1)
    Header Checksum: 0xdf41 [validation disabled]
    [Header checksum status: Unverified]
    Source Address: 10.101.0.100
    Destination Address: xxx.xxx.xxx.xxx.xxx
Internet Control Message Protocol
    Type: 0 (Echo (ping) reply)
    Code: 0
    Checksum: 0xd1b5 [correct]
    [Checksum Status: Good]
    Identifier (BE): 41567 (0xa25f)
    Identifier (LE): 24482 (0x5fa2)
    Sequence Number (BE): 33 (0x0021)
    Sequence Number (LE): 8448 (0x2100)
    [Request frame: 1]
    [Response time: 0.197 ms]
    Timestamp from icmp data: Dec 11, 2020 21:18:18.748765000 CET
    [Timestamp from icmp data (relative): 0.004856000 seconds]


And for some reason, this packet ends up on the default route interface, even though I have a rule (reply-to disabled) that sets the gateway to the OpenVPN gateway for anything coming "in" on this interface (igb2):

76:ac:b9:d9:17:ae is my internet router


Frame 1: 98 bytes on wire (784 bits), 98 bytes captured (784 bits)
    Encapsulation type: Ethernet (1)
    Arrival Time: Dec 11, 2020 21:20:44.110994000 CET
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1607718044.110994000 seconds
    [Time delta from previous captured frame: 0.000000000 seconds]
    [Time delta from previous displayed frame: 0.000000000 seconds]
    [Time since reference or first frame: 0.000000000 seconds]
    Frame Number: 1
    Frame Length: 98 bytes (784 bits)
    Capture Length: 98 bytes (784 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: eth:ethertype:ip:icmp:data]
    [Coloring Rule Name: ICMP]
    [Coloring Rule String: icmp || icmpv6]
Ethernet II, Src: IskraTra_d0:9b:e2 (00:0e:c4:d0:9b:e2), Dst: 76:ac:b9:d9:17:ae (76:ac:b9:d9:17:ae)
    Destination: 76:ac:b9:d9:17:ae (76:ac:b9:d9:17:ae)
        Address: 76:ac:b9:d9:17:ae (76:ac:b9:d9:17:ae)
        .... ..1. .... .... .... .... = LG bit: Locally administered address (this is NOT the factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Source: IskraTra_d0:9b:e2 (00:0e:c4:d0:9b:e2)
        Address: IskraTra_d0:9b:e2 (00:0e:c4:d0:9b:e2)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 10.101.0.100, Dst: xxx.xxx.xxx.xxx.xxx
    0100 .... = Version: 4
    .... 0101 = Header Length: 20 bytes (5)
    Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
        0000 00.. = Differentiated Services Codepoint: Default (0)
        .... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
    Total Length: 84
    Identification: 0xd562 (54626)
    Flags: 0x00
        0... .... = Reserved bit: Not set
        .0.. .... = Don't fragment: Not set
        ..0. .... = More fragments: Not set
    Fragment Offset: 0
    Time to Live: 63
    Protocol: ICMP (1)
    Header Checksum: 0x9b76 [validation disabled]
    [Header checksum status: Unverified]
    Source Address: 10.101.0.100
    Destination Address: xxx.xxx.xxx.xxx.xxx
Internet Control Message Protocol
    Type: 0 (Echo (ping) reply)
    Code: 0
    Checksum: 0x9e6b [correct]
    [Checksum Status: Good]
    Identifier (BE): 41567 (0xa25f)
    Identifier (LE): 24482 (0x5fa2)
    Sequence Number (BE): 178 (0x00b2)
    Sequence Number (LE): 45568 (0xb200)
    Timestamp from icmp data: Dec 11, 2020 21:20:44.106254000 CET
    [Timestamp from icmp data (relative): 0.004740000 seconds]



Sorry if I wasn't clear.

None of the pings are locally initiated on the firewall. "Host" in the topic subject designates a Linux machine on the LAN.



In this case I'm redirecting the whole a.b.c.d IP address to this test Linux machine. It's a simple DNAT that brings packets in through the VPN and down to the Linux machine. That part works, and the test Linux machine sees a ping request coming from some public address.

It replies, but the problem is that OPNsense sends the reply down the default route instead of the VPN.

But when the test Linux machine pings some public IP itself, all is fine with PBR.
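
One possible explanation, offered as a hedged sketch and not as a confirmed diagnosis: pf is stateful, so the echo reply arriving on igb2 probably matches the state that was created when the DNATed request was passed, and the igb2 rule carrying the gateway is never evaluated for it. In raw pf terms, the gateway rule corresponds to route-to, while sending replies back out the interface the request arrived on is the job of reply-to on the inbound rule of the VPN interface. The fragment below illustrates the two cases. The gateway address 10.8.0.1 is a placeholder, and the rules are hand-written for illustration, not copied from what OPNsense generates.

```pf
# Hypothetical pf.conf fragment; 10.8.0.1 stands in for the real VPN gateway address.

# Policy routing for connections initiated by the LAN host (the case that works):
pass in quick on igb2 route-to (ovpnc5 10.8.0.1) inet from 10.101.0.100 to any keep state

# Connections arriving through the VPN: reply-to records ovpnc5 in the state,
# so replies leave the same way instead of following the default route.
pass in quick on ovpnc5 reply-to (ovpnc5 10.8.0.1) inet proto icmp from any to 10.101.0.100 keep state
```

If this is what is happening, having reply-to disabled across the board would remove the second behaviour, which would be consistent with the captures above.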

Finally found someone with the same problem!
Apparently reply traffic always goes through the default route.

I might just configure a dedicated OPNsense for this use case, which is fine.

https://forum.netgate.com/topic/149740/policy-based-routing-of-return-traffic/3


I had a strange situation like that on an OPNsense yesterday. For some reason packets were not leaving the way they should. I ended up rebooting, because the configuration looked exactly as it should. The reboot seems to have set the interfaces and routes correctly, and after that it just worked.
Maybe it's worth a try.
,,The S in IoT stands for Security!" :)

It might be worth setting up a lab for this. I'll see if I can make some time but I have everything working on a separate OPNsense.