OPNsense Forum

Archive => 20.1 Legacy Series => Topic started by: danielm on April 03, 2020, 03:57:27 am

Title: Packets bigger than MTU get dropped
Post by: danielm on April 03, 2020, 03:57:27 am
The problem: on my box, I noticed that if I ping something reachable through the physical WAN port (a host on the internet, or the modem) and the packet is bigger than the MTU, it seems to get dropped (for ping: no echo reply). Smaller packets work reliably. As I understand it, the packet should have been fragmented and the fragments sent over the WAN. Also, if I lower the MTU on the WAN interface, correspondingly smaller packets start getting dropped reliably, so I think the OPNsense firewall is the culprit.

The system looks like this:

VDSL2 internet connection
^
Modem (Draytek Vigor 165)-----------------------------|
^                                                                             ^
Firewall PPPoE on VLAN 7 (WAN interface)       Firewall VLAN 0 ("Modem" interface only for accessing modem gui)
^                                                                             ^
OPNSENSE (v. 20.1.3)-----------------------------------|
^
LAN interface
^
LAN network

The MTU on the WAN interface and on the Modem interface are set to default for now. I found nothing useful in the logs. Interestingly, IPv6 pings also seem to get lost if the packets are too big. So far, everyday stuff like browsing, VoIP, conferencing or servers doesn't seem to be affected by this problem, but I suspect that is mostly luck, and I fear it will bite me later with things like VPN, so I'd be glad if someone could point me in the right direction.
Title: Re: Packets bigger than MTU get dropped
Post by: mimugmail on April 03, 2020, 05:50:50 am
Do you set the DF bit with your ping test?
Title: Re: Packets bigger than MTU get dropped
Post by: danielm on April 03, 2020, 06:47:23 am
No, I am using ping on Linux like "ping -s 1600 ..." and the WAN interface has an MTU of 1500 (1492 effective because of PPPoE).
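
The size arithmetic behind this test can be sketched roughly as follows. The header sizes are the standard ones (8-byte ICMP echo header, 20-byte IPv4 header without options, 40-byte IPv6 header, 8 bytes of PPPoE/PPP overhead); the function names are made up for illustration and do not come from the thread.

```python
# Standard header sizes; the function names below are illustrative only.
ICMP_HEADER = 8      # ICMP / ICMPv6 echo header
IPV4_HEADER = 20     # IPv4 header without options
IPV6_HEADER = 40     # fixed IPv6 header
PPPOE_OVERHEAD = 8   # 6-byte PPPoE header + 2-byte PPP protocol ID


def ip_packet_len(ping_size, ipv6=False):
    """Total IP packet length that `ping -s ping_size` produces."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return ping_size + ICMP_HEADER + ip_header


def pppoe_mtu(ethernet_mtu=1500):
    """Effective MTU of a PPPoE link on top of an Ethernet interface."""
    return ethernet_mtu - PPPOE_OVERHEAD


def max_ping_size(mtu, ipv6=False):
    """Largest `ping -s` value that still fits in one packet of this MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - ICMP_HEADER


wan_mtu = pppoe_mtu()
for size in (max_ping_size(wan_mtu), 1600):
    total = ip_packet_len(size)
    verdict = "fits in" if total <= wan_mtu else "exceeds"
    print(f"ping -s {size}: {total}-byte IPv4 packet {verdict} WAN MTU {wan_mtu}")
```

Under these assumptions, "ping -s 1600" yields a 1628-byte IPv4 packet, which exceeds both the 1500-byte LAN MTU and the 1492-byte PPPoE MTU.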
Title: Re: Packets bigger than MTU get dropped
Post by: danielm on April 03, 2020, 06:51:34 am
If I set the MTU low enough (e.g. 900), everyday stuff starts being affected: some websites, Google for example, refuse to load. Ping reflects that too, with packets above size 900 getting lost. I thought about potential driver issues. The machine is an HP DL320e Gen8 v2 and I am using its built-in NICs. Do you think it might be a good idea to buy a different NIC and try with that?
Title: Re: Packets bigger than MTU get dropped
Post by: mimugmail on April 03, 2020, 06:53:27 am
http://yurisk.info/2009/09/01/ping-setting-dont-fragment-bit-in-linuxfreebsdsolarisciscojuniper/

Your ping forbids the firewall from fragmenting ... you have to remove the DF bit on the client, or add a scrubbing rule that clears the DF bit from the packet (which costs CPU cycles).
Title: Re: Packets bigger than MTU get dropped
Post by: danielm on April 03, 2020, 07:51:32 am
If the ping were forbidding the firewall from fragmenting, then this shouldn't work, right? (The MTU of the modem interface is 1500.)

# 192.168.1.1 is the modem behind the modem interface - i know it's weird, but it's not the router
daniel@daniel-desktop-ubuntu:~$ ping -s 2000 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 2000(2028) bytes of data.
2008 bytes from 192.168.1.1: icmp_seq=1 ttl=61 time=1.22 ms
2008 bytes from 192.168.1.1: icmp_seq=2 ttl=61 time=1.65 ms
2008 bytes from 192.168.1.1: icmp_seq=3 ttl=61 time=1.59 ms
^C
--- 192.168.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 1.220/1.487/1.650/0.195 ms


I guess this could have to do with the fact that a packet of 2000 would be fragmented already when being sent over the 1500 MTU ethernet?
I tried a VM host I set up myself without FW and there it seems that you're right:

daniel@daniel-desktop-ubuntu:~$ ping -4 -s 1470 -M dont 167.86.122.130
PING some host (some host) 1470(1498) bytes of data.
1478 bytes from some host: icmp_seq=1 ttl=56 time=17.5 ms
1478 bytes from some host: icmp_seq=2 ttl=56 time=18.2 ms
1478 bytes from some host: icmp_seq=3 ttl=56 time=17.9 ms
1478 bytes from some host: icmp_seq=4 ttl=56 time=17.8 ms
...
^C
--- some host ping statistics ---
13 packets transmitted, 13 received, 0% packet loss, time 12019ms
rtt min/avg/max/mdev = 17.554/17.882/18.256/0.265 ms
daniel@daniel-desktop-ubuntu:~$ ping -4 -s 1470 some host
PING some host (some host) 1470(1498) bytes of data.
^C
--- some host ping statistics ---
9 packets transmitted, 0 received, 100% packet loss, time 8170ms


The LAN and the PPPoE internet connection certainly have different MTUs because of PPP, so on the LAN a ping packet of size 1498 doesn't need to be fragmented, but on the WAN side it does (1492 effective MTU).
What's interesting though is this:

daniel@daniel-desktop-ubuntu:~$ ping -4 -s 1500 some host
PING some host (some host) 1500(1528) bytes of data.
1508 bytes from some host: icmp_seq=1 ttl=56 time=17.8 ms
1508 bytes from some host: icmp_seq=2 ttl=56 time=17.4 ms
1508 bytes from some host: icmp_seq=3 ttl=56 time=18.0 ms
1508 bytes from some host: icmp_seq=4 ttl=56 time=17.4 ms
^C
--- some host ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3005ms
rtt min/avg/max/mdev = 17.416/17.690/18.040/0.317 ms


What would be your explanation for this? The packet must be getting split into smaller packets than 1492 right?
Also, why doesn't OPNsense send an ICMP error in the second ping saying that the packet is too big? I have seen this error occasionally, but not this time, and I don't understand why. Here is an IPv6 ping where it works at first, but on the second and third tries no error is returned:

daniel@daniel-desktop-ubuntu:~$ ping -s 1450 google.de
PING google.de(muc12s06-in-x03.1e100.net (2a00:1450:4016:805::2003)) 1450 data bytes
From OPNsense (some ipv6 address) icmp_seq=1 Packet too big: mtu=1492
^C
--- google.de ping statistics ---
4 packets transmitted, 0 received, +1 errors, 100% packet loss, time 3046ms

daniel@daniel-desktop-ubuntu:~$ ping -s 1460 google.de
PING google.de(muc12s06-in-x03.1e100.net (2a00:1450:4016:805::2003)) 1460 data bytes
^C
--- google.de ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3053ms

daniel@daniel-desktop-ubuntu:~$ ping -s 1450 google.de
PING google.de(muc12s06-in-x03.1e100.net (2a00:1450:4016:805::2003)) 1450 data bytes
^C
--- google.de ping statistics ---
11 packets transmitted, 0 received, 100% packet loss, time 10245ms
Title: Re: Packets bigger than MTU get dropped
Post by: mimugmail on April 03, 2020, 08:51:01 am
You need to check the traffic on each interface via tcpdump so you can see whether the packets already arrive fragmented.
Also check sysctl to see whether some ICMP error messages are disabled.
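
To compare captures from several interfaces, the IPv4 header line that "tcpdump -v" prints can be reduced to the fields that matter for fragmentation. The sketch below is an assumption-laden helper, not an existing tool: the regular expression and the name parse_ip_header are invented here and only target the "id ..., offset ..., flags [...], proto ..., length ..." layout. tcpdump prints "[DF]" for don't-fragment, "[+]" for more-fragments and "[none]" when no flag is set.

```python
import re

# Matches the IPv4 header summary printed by `tcpdump -v`.
# Hypothetical helper; only the fields relevant to fragmentation are kept.
HEADER_RE = re.compile(
    r"id (?P<id>\d+), offset (?P<offset>\d+), flags \[(?P<flags>[^\]]*)\], "
    r"proto (?P<proto>\w+) \(\d+\), length (?P<length>\d+)\)"
)


def parse_ip_header(line):
    """Return fragmentation-related fields of one tcpdump -v line, or None."""
    match = HEADER_RE.search(line)
    if match is None:
        return None
    flags = match.group("flags")
    offset = int(match.group("offset"))
    more_fragments = "+" in flags
    return {
        "id": int(match.group("id")),
        "offset": offset,
        "df": "DF" in flags,
        "mf": more_fragments,
        "length": int(match.group("length")),
        # A packet is a fragment if more follow or it does not start at 0.
        "is_fragment": more_fragments or offset > 0,
    }


sample = ("11:05:16.860273 IP (tos 0x0, ttl 64, id 53689, offset 0, "
          "flags [DF], proto ICMP (1), length 1488)")
print(parse_ip_header(sample))
```

Running the same filter on the sender, the LAN interface and the WAN interface and comparing the "id", "offset" and "length" values shows where a packet was fragmented or lost.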
Title: Re: Packets bigger than MTU get dropped
Post by: Ricardo on April 03, 2020, 10:58:58 am
sub
Title: Re: Packets bigger than MTU get dropped
Post by: danielm on April 04, 2020, 11:52:53 am
Okay, so I tried to debug this further and I think I MIGHT have figured it out....
I used tcpdump to check whether the ping would
(1) exit through the sending computer correctly
(2) enter into the firewall through LAN correctly
(3) exit through the WAN interface correctly
This yields strange results:

(a) If the packet is smaller than both the MTU of LAN and WAN, the packet will go through correctly, reach the host and come back
(b) If the packet is a lot bigger than both the MTU of LAN and WAN, the packet will also go through correctly
(c) If the packet is slightly smaller than the LAN MTU but slightly bigger than the WAN MTU (e.g. packet size 1498, which is below the LAN MTU of 1500 and above the WAN MTU of 1492 effective because of PPP), it will exit the sender and enter the FW through LAN correctly (unfragmented), but nothing will exit through WAN
(d) If I try the same as in (c), but with the ping option "-M dont", it starts working again. Strangely enough, it seems to work only "most of the time", not every time, but unfortunately I lost the record of the last time it didn't work

For case (a)

# ping goes through normally
daniel@daniel-desktop-ubuntu:~$ ping -s 1460 <host>
PING <host> (<host>) 1460(1488) bytes of data.
1468 bytes from <host>: icmp_seq=1 ttl=56 time=24.1 ms
1468 bytes from <host>: icmp_seq=2 ttl=56 time=24.1 ms
1468 bytes from <host>: icmp_seq=3 ttl=56 time=23.6 ms
1468 bytes from <host>: icmp_seq=4 ttl=56 time=23.9 ms
^C
--- <host> ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 23.684/23.993/24.184/0.232 ms

# (1) packet exits fine
root@daniel-desktop-ubuntu:~# tcpdump -v dst <host>
tcpdump: listening on enp8s0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:05:16.860273 IP (tos 0x0, ttl 64, id 53689, offset 0, flags [DF], proto ICMP (1), length 1488)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 6910, seq 197, length 1468
11:05:17.862002 IP (tos 0x0, ttl 64, id 53732, offset 0, flags [DF], proto ICMP (1), length 1488)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 6910, seq 198, length 1468
...
^C
10 packets captured
10 packets received by filter
0 packets dropped by kernel

# (2) packet enters LAN on FW
root@OPNsense:~ # tcpdump -v -i bge1 dst <host>
tcpdump: listening on bge1, link-type EN10MB (Ethernet), capture size 262144 bytes
11:05:17.864991 IP (tos 0x0, ttl 64, id 53732, offset 0, flags [DF], proto ICMP (1), length 1488)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 6910, seq 198, length 1468
11:05:18.866650 IP (tos 0x0, ttl 64, id 53733, offset 0, flags [DF], proto ICMP (1), length 1488)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 6910, seq 199, length 1468
...
^C
158 packets captured
6965 packets received by filter
0 packets dropped by kernel

# (3) ping exits through WAN
root@OPNsense:~ # tcpdump -v -i pppoe0 dst <host>
tcpdump: listening on pppoe0, link-type NULL (BSD loopback), capture size 262144 bytes
11:05:18.866695 IP (tos 0x0, ttl 63, id 53733, offset 0, flags [DF], proto ICMP (1), length 1488)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 16722, seq 199, length 1468
11:05:19.868438 IP (tos 0x0, ttl 63, id 53754, offset 0, flags [DF], proto ICMP (1), length 1488)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 16722, seq 200, length 1468
...
^C
204 packets captured
4253 packets received by filter
0 packets dropped by kernel


For case (b)

# ping goes through normally
daniel@daniel-desktop-ubuntu:~$ ping -c 2 -s 2000 <host>
PING <host> (<host>) 2000(2028) bytes of data.
2008 bytes from <host>: icmp_seq=1 ttl=56 time=24.0 ms
2008 bytes from <host>: icmp_seq=2 ttl=56 time=23.6 ms

--- <host> ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 23.693/23.858/24.023/0.165 ms

# (1) ping exits fine (fragmented)
root@daniel-desktop-ubuntu:~# tcpdump -v dst <host>
tcpdump: listening on enp8s0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:17:06.615928 IP (tos 0x0, ttl 64, id 4424, offset 0, flags [+], proto ICMP (1), length 1500)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 7213, seq 1, length 1480
11:17:06.615937 IP (tos 0x0, ttl 64, id 4424, offset 1480, flags [none], proto ICMP (1), length 548)
    daniel-desktop-ubuntu > <host>: icmp
11:17:07.617860 IP (tos 0x0, ttl 64, id 4604, offset 0, flags [+], proto ICMP (1), length 1500)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 7213, seq 2, length 1480
11:17:07.617909 IP (tos 0x0, ttl 64, id 4604, offset 1480, flags [none], proto ICMP (1), length 548)
    daniel-desktop-ubuntu > <host>: icmp
^C
4 packets captured
4 packets received by filter
0 packets dropped by kernel

# (2) ping enters fine on LAN (fragmented)
root@OPNsense:~ # tcpdump -v -i bge1 dst <host>
tcpdump: listening on bge1, link-type EN10MB (Ethernet), capture size 262144 bytes
11:17:06.618924 IP (tos 0x0, ttl 64, id 4424, offset 0, flags [+], proto ICMP (1), length 1500)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 7213, seq 1, length 1480
11:17:06.618946 IP (tos 0x0, ttl 64, id 4424, offset 1480, flags [none], proto ICMP (1), length 548)
    daniel-desktop-ubuntu > <host>: icmp
11:17:07.620993 IP (tos 0x0, ttl 64, id 4604, offset 0, flags [+], proto ICMP (1), length 1500)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 7213, seq 2, length 1480
11:17:07.621002 IP (tos 0x0, ttl 64, id 4604, offset 1480, flags [none], proto ICMP (1), length 548)
    daniel-desktop-ubuntu > <host>: icmp
^C
4 packets captured
11481 packets received by filter
0 packets dropped by kernel

# (3) ping exits fine through WAN (new fragmentation)
root@OPNsense:~ # tcpdump -v -i pppoe0 dst <host>
tcpdump: listening on pppoe0, link-type NULL (BSD loopback), capture size 262144 bytes
11:17:06.618988 IP (tos 0x0, ttl 63, id 4424, offset 0, flags [+], proto ICMP (1), length 1492)
    <WAN-IP> > <host>: ICMP echo request, id 20100, seq 1, length 1472
11:17:06.620978 IP (tos 0x0, ttl 63, id 4424, offset 1472, flags [none], proto ICMP (1), length 556)
    <WAN-IP> > <host>: icmp
11:17:07.621018 IP (tos 0x0, ttl 63, id 4604, offset 0, flags [+], proto ICMP (1), length 1492)
    <WAN-IP> > <host>: ICMP echo request, id 20100, seq 2, length 1472
11:17:07.621035 IP (tos 0x0, ttl 63, id 4604, offset 1472, flags [none], proto ICMP (1), length 556)
    <WAN-IP> > <host>: icmp
^C
4 packets captured
15729 packets received by filter
0 packets dropped by kernel
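
The fragment sizes in the captures above can be reproduced with a small sketch of IPv4 fragmentation. It assumes a 20-byte IPv4 header without options and relies on the rule that every fragment except the last carries a multiple of 8 payload bytes; the function name fragment is illustrative.

```python
IPV4_HEADER = 20  # assumes no IP options


def fragment(total_len, mtu):
    """Split an IPv4 packet of total_len bytes for a link with this MTU.

    Returns a list of (offset, fragment_total_length, more_fragments).
    """
    if total_len <= mtu:
        return [(0, total_len, False)]
    payload = total_len - IPV4_HEADER
    # Payload per fragment must be a multiple of 8, except in the last one.
    step = (mtu - IPV4_HEADER) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload:
        chunk = min(step, payload - offset)
        more = offset + chunk < payload
        fragments.append((offset, chunk + IPV4_HEADER, more))
        offset += chunk
    return fragments


# `ping -s 2000` gives a 2028-byte IPv4 packet (2000 + 8 ICMP + 20 IP).
print(fragment(2028, 1500))  # as sent on the 1500-byte LAN
print(fragment(2028, 1492))  # as sent on the 1492-byte PPPoE link
```

For the 2028-byte packet this gives 1500 + 548 bytes on the LAN and 1492 + 556 bytes on the WAN, matching the lengths and offsets in the captures. One reading of the WAN capture is that the firewall reassembles the incoming fragments and fragments the whole packet again for the smaller MTU, but that is an inference from the numbers, not something confirmed in this thread.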


For case (c) - the interesting part:

# ping fails - no receive
daniel@daniel-desktop-ubuntu:~$ ping -c 2 -s 1470 <host>
PING <host> (<host>) 1470(1498) bytes of data.

--- <host> ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1020ms

# (1) ping exits fine (unfragmented, DF set)
root@daniel-desktop-ubuntu:~# tcpdump -v dst <host>
tcpdump: listening on enp8s0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:26:44.570184 IP (tos 0x0, ttl 64, id 41993, offset 0, flags [DF], proto ICMP (1), length 1498)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 7385, seq 1, length 1478
11:26:45.582304 IP (tos 0x0, ttl 64, id 42233, offset 0, flags [DF], proto ICMP (1), length 1498)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 7385, seq 2, length 1478
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel

# (2) ping enters LAN on FW fine
root@OPNsense:~ # tcpdump -v -i bge1 dst <host>
tcpdump: listening on bge1, link-type EN10MB (Ethernet), capture size 262144 bytes
11:26:44.573413 IP (tos 0x0, ttl 64, id 41993, offset 0, flags [DF], proto ICMP (1), length 1498)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 7385, seq 1, length 1478
11:26:45.585459 IP (tos 0x0, ttl 64, id 42233, offset 0, flags [DF], proto ICMP (1), length 1498)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 7385, seq 2, length 1478
^C
2 packets captured
4678 packets received by filter
0 packets dropped by kernel

# (3) ping doesn't exit on WAN
root@OPNsense:~ # tcpdump -v -i pppoe0 dst <host>
tcpdump: listening on pppoe0, link-type NULL (BSD loopback), capture size 262144 bytes
^C
0 packets captured
4175 packets received by filter
0 packets dropped by kernel


For case (d) - it works again

# ping works again
daniel@daniel-desktop-ubuntu:~$ ping -M dont -c 2 -s 1471 <host>
PING <host> (<host>) 1471(1499) bytes of data.
1479 bytes from <host>: icmp_seq=1 ttl=56 time=25.2 ms
1479 bytes from <host>: icmp_seq=2 ttl=56 time=24.4 ms

--- <host> ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 24.430/24.837/25.245/0.436 ms

# (1) ping exits on sender fine
root@daniel-desktop-ubuntu:~# tcpdump -v dst <host>
tcpdump: listening on enp8s0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:35:04.292361 IP (tos 0x0, ttl 64, id 5852, offset 0, flags [none], proto ICMP (1), length 1499)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 7570, seq 1, length 1479
11:35:05.293686 IP (tos 0x0, ttl 64, id 5897, offset 0, flags [none], proto ICMP (1), length 1499)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 7570, seq 2, length 1479
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel

# (2) ping enters on LAN interface fine
root@OPNsense:~ # tcpdump -v -i bge1 dst <host>
tcpdump: listening on bge1, link-type EN10MB (Ethernet), capture size 262144 bytes
11:35:04.295395 IP (tos 0x0, ttl 64, id 5852, offset 0, flags [none], proto ICMP (1), length 1499)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 7570, seq 1, length 1479
11:35:05.296824 IP (tos 0x0, ttl 64, id 5897, offset 0, flags [none], proto ICMP (1), length 1499)
    daniel-desktop-ubuntu > <host>: ICMP echo request, id 7570, seq 2, length 1479
^C
2 packets captured
140206 packets received by filter
0 packets dropped by kernel

# (3) ping exits WAN with fragmentation
root@OPNsense:~ # tcpdump -v -i pppoe0 dst <host>
tcpdump: listening on pppoe0, link-type NULL (BSD loopback), capture size 262144 bytes
11:35:04.295441 IP (tos 0x0, ttl 63, id 5852, offset 0, flags [+], proto ICMP (1), length 1492)
    <WAN-IP> > <host>: ICMP echo request, id 12729, seq 1, length 1472
11:35:04.295449 IP (tos 0x0, ttl 63, id 5852, offset 1472, flags [none], proto ICMP (1), length 27)
    <WAN-IP> > <host>: icmp
11:35:05.296852 IP (tos 0x0, ttl 63, id 5897, offset 0, flags [+], proto ICMP (1), length 1492)
    <WAN-IP> > <host>: ICMP echo request, id 12729, seq 2, length 1472
11:35:05.296860 IP (tos 0x0, ttl 63, id 5897, offset 1472, flags [none], proto ICMP (1), length 27)
    <WAN-IP> > <host>: icmp
^C
4 packets captured
131037 packets received by filter
0 packets dropped by kernel


This is (in my opinion) strange, and I would be very interested in an explanation or a reproduction by someone else.
My own explanation, which seems to match the tcpdump output: Linux ping sets the DF flag only if the packet does not need to be fragmented right away to leave the local LAN interface. So the issue arises only with packets that are small enough to leave the sender unfragmented, but big enough that the FW would need to fragment them on WAN... (?)
If that is right, is this likely to be an issue with other applications as well, or would those not set the DF flag and therefore not hinder fragmentation, so that no real-world issue arises?
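
The four cases can be summarised with a minimal sketch of the decision an IPv4 router makes when forwarding onto a link with a smaller MTU. This models standard IPv4 behaviour, not OPNsense's actual code path, and the names are illustrative. Per RFC 1191, a router that drops a DF packet for being too large is expected to answer with ICMP Destination Unreachable, code 4 (fragmentation needed and DF set); the captures in this thread did not show that reply.

```python
def forward_decision(total_len, df, egress_mtu):
    """What a standard IPv4 router does with a packet on the egress link."""
    if total_len <= egress_mtu:
        return "forward"
    if df:
        # Too big and not allowed to fragment: drop and report back.
        return "drop, send ICMP type 3 code 4 (fragmentation needed)"
    return "fragment"


WAN_MTU = 1492  # effective PPPoE MTU from the thread

cases = {
    "(a) 1488 bytes, DF set": (1488, True),
    "(b) 1500-byte first fragment, DF clear": (1500, False),
    "(c) 1498 bytes, DF set": (1498, True),
    "(d) 1499 bytes, DF clear (-M dont)": (1499, False),
}
for label, (length, df) in cases.items():
    print(f"{label}: {forward_decision(length, df, WAN_MTU)}")
```

Only case (c) combines "too big for the WAN" with "DF set", which is the one combination where the packet cannot leave through pppoe0.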
Title: Re: Packets bigger than MTU get dropped
Post by: mimugmail on April 04, 2020, 01:53:07 pm
Sounds like a min frag size value in sysctl?
Title: Re: Packets bigger than MTU get dropped
Post by: danielm on April 04, 2020, 03:54:24 pm
OK, I just tried similar things on Windows, and there this edge case vanishes, leaving no unintuitive behavior. So maybe it really is just a Linux quirk and not a problem on the firewall per se.
If it happens again in unexpected places, I will post in this topic again. The only question that remains, I think, is why OPNsense does not respond with an ICMP error when the packet is dropped because of the DF flag. I thought a router dropping a packet like this should return an ICMP error, which ping would also show.
It is odd, since I have already seen OPNsense do exactly that with IPv6 pings.