Solved: [24.7.4] Some UDP traffic causes packet loss

Started by hifol792, September 23, 2024, 01:55:03 PM

September 25, 2024, 03:53:30 PM #15 Last Edit: September 25, 2024, 03:57:47 PM by iMx
I don't see this mentioned here ... so ...

... have you checked the firewall logs for drops? 

If you tcpdump the LAN side interface, do you see your DNS requests ingress to the interface?  For example:

tcpdump -i lan-interface host 8.8.8.8 and port 53


If you do, then if you tcpdump the WAN side interface, do you see your DNS requests egress the WAN interface?

tcpdump -i wan-interface host 8.8.8.8 and port 53

Another question: Did you see this problem on 24.1?  Or have you only ever run 24.7?

September 25, 2024, 04:10:50 PM #17 Last Edit: September 25, 2024, 04:13:23 PM by hifol792
> ... have you checked the firewall logs for drops? 

I checked and I don't see anything blocked during the test

tcpdump output captured while mtr is running and the network problems occur:

pppoe1:

21:04:07.785801 IP 100.68.87.90.64228 > 8.8.8.8.53: 6509+ [1au] A? google.com. (51)
21:04:12.790998 IP 100.68.87.90.16792 > 8.8.8.8.53: 6509+ [1au] A? google.com. (51)
21:04:17.791183 IP 100.68.87.90.50947 > 8.8.8.8.53: 6509+ [1au] A? google.com. (51)


lan:

21:04:07.785782 IP 192.168.1.148.57720 > 8.8.8.8.53: 6509+ [1au] A? google.com. (51)
21:04:12.790960 IP 192.168.1.148.33664 > 8.8.8.8.53: 6509+ [1au] A? google.com. (51)
21:04:17.791164 IP 192.168.1.148.36768 > 8.8.8.8.53: 6509+ [1au] A? google.com. (51)


> Did you see this problem on 24.1?  Or have you only ever run 24.7
I will try to check this point as soon as I can

September 25, 2024, 04:14:36 PM #18 Last Edit: September 25, 2024, 04:29:59 PM by iMx
Well, it seems to make it through the firewall...

Can you dig @1.1.1.1 when you see the problems with 8.8.8.8?

Are you sure your ISP doesn't rate limit UDP traffic?  If you remove opnsense completely from the equation, if you can, and use an ISP router, do you see the same then?

From your timestamps, I'm guessing you're in South East Asia (GMT+7) somewhere?  I know for a fact some of the ISPs in that region do filter/limit traffic.

EDIT: I also missed this:

21:04:17.791183 IP 100.68.87.90.50947

... you're behind CGNAT? 
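
For reference, 100.64.0.0/10 is the shared address space that RFC 6598 sets aside for carrier-grade NAT, and the PPPoE source address in the capture falls inside it. A minimal sketch of that check in Python (the helper name is_cgnat is only for illustration):

```python
import ipaddress

# RFC 6598 shared address space, reserved for carrier-grade NAT.
CGNAT_RANGE = ipaddress.ip_network("100.64.0.0/10")


def is_cgnat(address: str) -> bool:
    """Return True if the address lies inside 100.64.0.0/10."""
    return ipaddress.ip_address(address) in CGNAT_RANGE


if __name__ == "__main__":
    # WAN-side source address seen in the PPPoE capture.
    print("100.68.87.90 ->", is_cgnat("100.68.87.90"))
    # LAN client address, an ordinary RFC 1918 private address.
    print("192.168.1.148 ->", is_cgnat("192.168.1.148"))
```

An address in this range on the WAN interface means the ISP performs a second layer of NAT after opnsense.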

I think the next step is to prove that you don't see the same problem when you remove opnsense, i.e. by using the ISP-supplied router if you have one.

> Well, it seems to make it through the firewall...
Do you mean the provider's firewall?

> Can you dig @1.1.1.1 when you see the problems with 8.8.8.8?

tcpdump from pppoe - dig

21:20:59.664312 IP 192.168.1.144.56024 > 1.1.1.1.53: 29888+ A? ping.archlinux.org. (36)
21:20:59.664339 IP 192.168.1.144.56024 > 1.1.1.1.53: 40904+ AAAA? ping.archlinux.org. (36)
21:20:59.687528 IP 192.168.1.144.42368 > 1.1.1.1.53: 17505+ A? ping.archlinux.org. (36)
21:20:59.687546 IP 192.168.1.144.42368 > 1.1.1.1.53: 50282+ AAAA? ping.archlinux.org. (36)
21:20:59.732367 IP 192.168.1.144.47537 > 1.1.1.1.53: 53375+ A? ping.archlinux.org. (36)
21:20:59.732386 IP 192.168.1.144.47537 > 1.1.1.1.53: 3655+ AAAA? ping.archlinux.org. (36)
21:20:59.756007 IP 192.168.1.144.57205 > 1.1.1.1.53: 52608+ A? ping.archlinux.org. (36)
21:20:59.756030 IP 192.168.1.144.57205 > 1.1.1.1.53: 3481+ AAAA? ping.archlinux.org. (36)
21:20:59.800721 IP 192.168.1.144.44555 > 1.1.1.1.53: 1659+ A? ping.archlinux.org. (36)
21:20:59.800741 IP 192.168.1.144.44555 > 1.1.1.1.53: 47200+ AAAA? ping.archlinux.org. (36)
21:20:59.877564 IP 1.1.1.1.53 > 192.168.1.144.44555: 1659 2/0/0 CNAME redirect.archlinux.org., A 95.216.195.133 (75)
21:20:59.877699 IP 1.1.1.1.53 > 192.168.1.144.44555: 47200 2/0/0 CNAME redirect.archlinux.org., AAAA 2a01:4f9:c010:2636::1 (87)
21:21:01.144887 IP 192.168.1.144.33543 > 1.1.1.1.53: 41086+ A? dns.hd-8000.fun. (33)
21:21:02.003906 IP 192.168.1.144.36696 > 1.1.1.1.53: 7486+ A? kabinet.kemerovo.mts.ru. (41)
21:21:04.505538 IP 192.168.1.144.47328 > 1.1.1.1.53: 7486+ A? kabinet.kemerovo.mts.ru. (41)
21:21:04.669693 IP 192.168.1.144.56024 > 1.1.1.1.53: 29888+ A? ping.archlinux.org. (36)
21:21:04.669714 IP 192.168.1.144.56024 > 1.1.1.1.53: 40904+ AAAA? ping.archlinux.org. (36)
21:21:04.692787 IP 192.168.1.144.42368 > 1.1.1.1.53: 17505+ A? ping.archlinux.org. (36)
21:21:04.692806 IP 192.168.1.144.42368 > 1.1.1.1.53: 50282+ AAAA? ping.archlinux.org. (36)
21:21:04.737481 IP 192.168.1.144.47537 > 1.1.1.1.53: 53375+ A? ping.archlinux.org. (36)
21:21:04.737495 IP 192.168.1.144.47537 > 1.1.1.1.53: 3655+ AAAA? ping.archlinux.org. (36)
21:21:04.761603 IP 192.168.1.144.57205 > 1.1.1.1.53: 52608+ A? ping.archlinux.org. (36)
21:21:04.761634 IP 192.168.1.144.57205 > 1.1.1.1.53: 3481+ AAAA? ping.archlinux.org. (36)
21:21:06.005407 IP 192.168.1.144.42276 > 1.1.1.1.53: 7486+ A? example.com. (41)
21:21:06.005430 IP 192.168.1.144.45547 > 1.1.1.1.53: 8464+ AAAA? example.com. (41)
21:21:07.005821 IP 192.168.1.144.45112 > 1.1.1.1.53: 9465+ A? k.root-servers.net. (36)
21:21:07.005824 IP 192.168.1.144.49444 > 1.1.1.1.53: 10932+ AAAA? k.root-servers.net. (36)
21:21:09.508329 IP 192.168.1.144.58546 > 1.1.1.1.53: 9465+ A? k.root-servers.net. (36)
21:21:09.508333 IP 192.168.1.144.59279 > 1.1.1.1.53: 10932+ AAAA? k.root-servers.net. (36)
21:21:11.008094 IP 192.168.1.144.39466 > 1.1.1.1.53: 10932+ AAAA? k.root-servers.net. (36)


Some requests pass, but with great difficulty
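
To put a number on "with great difficulty", the capture can be summarized per query: how many times each one was sent and whether a reply with the same DNS ID ever came back. This is only a rough sketch in Python. The regular expressions are written against the tcpdump lines quoted in this thread and are an assumption about the format, and replies are matched by DNS ID alone, which can mislabel queries when an ID is reused.

```python
import re
from collections import defaultdict

# Query line, e.g.
# "21:20:59.664312 IP 192.168.1.144.56024 > 1.1.1.1.53: 29888+ A? ping.archlinux.org. (36)"
QUERY = re.compile(r"> \S+\.53: (\d+)\+? .*?(A|AAAA)\? (\S+)")
# Reply line, e.g.
# "21:20:59.877564 IP 1.1.1.1.53 > 192.168.1.144.44555: 1659 2/0/0 CNAME ..."
REPLY = re.compile(r"IP \S+\.53 > \S+: (\d+) \d+/\d+/\d+")


def summarize(lines):
    """Map (dns_id, record_type, name) to (times_sent, reply_seen)."""
    attempts = defaultdict(int)
    answered = set()
    for line in lines:
        reply = REPLY.search(line)
        if reply:
            answered.add(int(reply.group(1)))
            continue
        query = QUERY.search(line)
        if query:
            key = (int(query.group(1)), query.group(2), query.group(3))
            attempts[key] += 1
    return {key: (count, key[0] in answered) for key, count in attempts.items()}


# A few lines copied from the capture above.
SAMPLE = [
    "21:20:59.664312 IP 192.168.1.144.56024 > 1.1.1.1.53: 29888+ A? ping.archlinux.org. (36)",
    "21:20:59.800721 IP 192.168.1.144.44555 > 1.1.1.1.53: 1659+ A? ping.archlinux.org. (36)",
    "21:20:59.877564 IP 1.1.1.1.53 > 192.168.1.144.44555: 1659 2/0/0 CNAME redirect.archlinux.org., A 95.216.195.133 (75)",
    "21:21:04.669693 IP 192.168.1.144.56024 > 1.1.1.1.53: 29888+ A? ping.archlinux.org. (36)",
]

if __name__ == "__main__":
    for key, (count, replied) in summarize(SAMPLE).items():
        print(key, "sent", count, "time(s), reply seen:", replied)
```

Queries that are sent several times without a reply are the ones being lost somewhere after the capture point.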

tcpdump from pppoe - ping

21:21:52.385845 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 3634, seq 37, length 64
21:21:52.507677 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 3634, seq 49, length 64
21:21:52.913988 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 8, length 64
21:21:53.927327 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 9, length 64
21:21:54.943987 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 10, length 64
21:21:55.953987 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 11, length 64
21:21:56.970653 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 12, length 64
21:21:57.983988 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 13, length 64
21:21:58.993996 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 14, length 64
21:22:00.007326 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 15, length 64
21:22:01.020664 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 16, length 64
21:22:02.034012 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 17, length 64
21:22:03.047325 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 18, length 64
21:22:04.060663 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 19, length 64
21:22:05.073992 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 20, length 64
21:22:06.087324 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 21, length 64
21:22:07.100668 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 22, length 64
21:22:08.117329 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 23, length 64
21:22:09.127331 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 24, length 64
21:22:10.140662 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 25, length 64
21:22:11.153994 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 26, length 64
21:22:12.020323 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 2918, seq 1, length 64
21:22:12.142855 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 2918, seq 13, length 64
21:22:12.167343 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 27, length 64
21:22:12.265809 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 2918, seq 25, length 64
21:22:12.389405 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 2918, seq 37, length 64
21:22:12.512430 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 2918, seq 49, length 64
21:22:13.180672 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 28, length 64
21:22:14.194012 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 29, length 64
21:22:15.207334 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 30, length 64
21:22:16.220685 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 31, length 64
21:22:17.234032 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 32, length 64
21:22:18.247336 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 33, length 64
21:22:19.260695 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 34, length 64
21:22:20.274001 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 35, length 64
21:22:21.287348 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 36, length 64
21:22:22.300664 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 37, length 64
21:22:23.314009 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 38, length 64
21:22:24.330729 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 39, length 64
21:22:25.340681 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 40, length 64
21:22:26.354207 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 41, length 64
21:22:27.367340 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 42, length 64
21:22:28.380691 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 43, length 64
21:22:29.394032 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 44, length 64
21:22:30.407343 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 45, length 64
21:22:31.424010 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 39924, seq 46, length 64
21:22:32.020359 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 27475, seq 1, length 64
21:22:32.142843 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 27475, seq 13, length 64
21:22:32.264745 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 27475, seq 25, length 64
21:22:32.387374 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 27475, seq 37, length 64
21:22:32.509528 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 27475, seq 49, length 64
21:22:47.842253 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 29957, seq 1, length 64
21:22:48.891512 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 29957, seq 2, length 64
21:22:49.931457 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 29957, seq 3, length 64
21:22:52.018789 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 62932, seq 1, length 64
21:22:52.141018 IP 100.68.87.90 > one.one.one.one: ICMP echo request, id 62932, seq 13, length 64


> Are you sure your ISP doesn't rate limit UDP traffic?  If you remove opnsense completely from the equation, if you can, and use an ISP router, do you see the same then?

Yes, before writing about this problem, I verified several times that it does not happen without opnsense.

> From your timestamps, I'm guessing you're in South East Asia (GMT+7) somewhere?  I know for a fact some of the ISPs in that region do filter/limit traffic.
Yes. Russia. But as I wrote above, there is no such thing without opnsense.



> Do you mean the provider's firewall?

Your firewall, opnsense :)

From your tcpdump, you see the DNS request in the PPPoE interface dump, so it makes it through opnsense and gets 'dropped onto the wire' of the WAN interface.
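
The same conclusion can be read off the two captures mechanically: each LAN line has a PPPoE line with the same DNS ID a few microseconds later, only with the source rewritten by NAT. A minimal sketch in Python, assuming both captures were taken on the opnsense box itself (so the clocks match) and that the lines are already in matching order; the function names are illustrative only.

```python
import re
from datetime import datetime, timedelta

# Timestamp, source "address.port" and DNS ID of a tcpdump query line.
LINE = re.compile(r"^(\d\d:\d\d:\d\d\.\d{6}) IP (\S+) > \S+: (\d+)\+")


def parse(line):
    """Return (timestamp, source, dns_id) for one tcpdump query line."""
    match = LINE.match(line)
    if match is None:
        raise ValueError("unrecognized line: " + line)
    stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
    return stamp, match.group(2), int(match.group(3))


def forwarding_delays(lan_lines, wan_lines):
    """Pair LAN and WAN lines in order; return (lan_src, wan_src, delay in microseconds)."""
    pairs = []
    for lan, wan in zip(lan_lines, wan_lines):
        lan_time, lan_src, lan_id = parse(lan)
        wan_time, wan_src, wan_id = parse(wan)
        if lan_id != wan_id:
            raise ValueError("DNS IDs differ; lines are not the same query")
        delay_us = (wan_time - lan_time) // timedelta(microseconds=1)
        pairs.append((lan_src, wan_src, delay_us))
    return pairs


# Lines copied from the lan and pppoe1 captures earlier in the thread.
LAN = [
    "21:04:07.785782 IP 192.168.1.148.57720 > 8.8.8.8.53: 6509+ [1au] A? google.com. (51)",
    "21:04:12.790960 IP 192.168.1.148.33664 > 8.8.8.8.53: 6509+ [1au] A? google.com. (51)",
]
WAN = [
    "21:04:07.785801 IP 100.68.87.90.64228 > 8.8.8.8.53: 6509+ [1au] A? google.com. (51)",
    "21:04:12.790998 IP 100.68.87.90.16792 > 8.8.8.8.53: 6509+ [1au] A? google.com. (51)",
]

if __name__ == "__main__":
    for lan_src, wan_src, delay_us in forwarding_delays(LAN, WAN):
        print(lan_src, "->", wan_src, "after", delay_us, "microseconds")
```

Note that the translated source port on the PPPoE side differs from the client's port on the LAN side in each pair.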

> ... you're behind CGNAT? 
yes

> I think the next step is to prove that you don't see the same problem when you remove opnsense, i.e. by using the ISP-supplied router if you have one.

I'll check it out.

Hello everyone.

I tested the internet connection through another router (OpenWrt) and everything works fine there.

Before that, I also tested on a completely clean opnsense 24.7 install, and it did not solve the problem.

I also attach information about my NIC:

igc3@pci0:4:0:0: class=0x020000 rev=0x04 hdr=0x00 vendor=0x8086 device=0x125c subvendor=0x8086 subdevice=0x0000
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Controller I226-V'
    class      = network
    subclass   = ethernet

September 27, 2024, 05:21:23 PM #23 Last Edit: September 27, 2024, 05:28:16 PM by iMx
This is very odd, I will admit there are things that just don't make sense to me :)

- You DO see (in tcpdump) the DNS request enter the LAN interface and leave via the PPPoE interface, which means it is passing through opnsense, i.e. it is not being blocked or dropped.

- You DON'T see the issue with OpenWrt, which suggests the connection itself is good.

- You DO still see the problem with a completely (completely-completely?) clean install.

Have you ruled out things like specific ports (on the device, or switch ports, etc), cables, etc?

If it's easy-ish to do, I think opnsense 24.1 would be interesting to test (i.e. FreeBSD 13 rather than 14.1).

September 27, 2024, 05:31:06 PM #24 Last Edit: September 27, 2024, 06:24:45 PM by CruxtheNinth
Question: do you have timeouts against your LAN gateway as well?

I just upgraded to 24.7.5 and now I have sporadic timeouts (around 0.9%) when pinging my LAN default gateway (192.168.2.1 on igc1).

Other hosts on the same switch, pinged from the same client, do not show this behaviour.
I'm running opnsense on bare metal (N100 mini PC, Intel i226 NICs).

EDIT: the loss is gone after reverting to 24.7.4_1.

Hello everyone. The problem is solved

What was the problem:
mtr sent UDP packets with fixed source ports, but opnsense's outbound NAT uses static port = NO by default, so the outgoing source port was rewritten to a new one each time.

Apparently my provider doesn't like that and banned me for it.

Solution:
Switch outbound NAT to manual mode and set static port = YES on each rule.
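
To illustrate the difference between the two settings, here is a toy model in Python. It is not how pf actually allocates ports: the port range 1024-65535 and the function names are assumptions made up for the illustration. It only shows that with static port = YES the source port leaves the WAN side unchanged, while with static port = NO each new flow can leave with a different source port.

```python
import random


def translate_source_port(src_port: int, static_port: bool, rng: random.Random) -> int:
    """Return the WAN-side source port for one new flow in this toy model."""
    if static_port:
        # static port = YES: keep the client's source port as it is.
        return src_port
    # static port = NO: pick some other port (range is an illustrative assumption).
    return rng.randint(1024, 65535)


def wan_ports(src_port: int, flows: int, static_port: bool, seed: int = 0):
    """Translate the same client source port for several new flows."""
    rng = random.Random(seed)
    return [translate_source_port(src_port, static_port, rng) for _ in range(flows)]


if __name__ == "__main__":
    print("static port = YES:", wan_ports(33000, 5, static_port=True))
    print("static port = NO: ", wan_ports(33000, 5, static_port=False))
```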

Thank you all so much for the help  :)

I have only one question: why is static port = NO the default?

Because it mostly doesn't cause problems.

The automatic rules define two NAT rules by default:

1. A general rule with static port NO.
2. A specific rule for ISAKMP (port 500) with static port YES.

The first usually doesn't do any harm.
The second is there because, for ISAKMP (IPsec phase 1), not pinning the port breaks the establishment of the tunnel.

When you have applications that are sensitive to this, you can often create a rule just for them that pins the port with static port YES. Another reason people use static port YES is that they get a lower NAT grade in the connection tests for online games on Nintendo, PlayStation, etc.

That said, as mentioned, static port NO usually doesn't cause problems.

If your ISP banned you, well... who knows what the reason behind it is, because they cannot actually see the original source port when you use dynamic port NAT on OPNsense.

Regards,
S.