I have been an OPNsense user for multiple years and so far it has been mostly stable.
Since the upgrade to 24.7, I started to experience very long loading times and sometimes even timeouts for normal web traffic on websites and smartphone apps.
When checking the logs, I started to see FW rule hits for the Default deny / state violation rule for 443 traffic. This seems to be traffic from (prematurely) closed states (TCP A or PA flag). The FW optimization is set to conservative.
I am on a (German) Telekom DS connection and running OPNsense 24.7.3_1. I know about the current kernel problems for IPv6, but to me this seems like different behavior: I don't have any ping/traceroute problems, I tried the full revert experimental kernel from github and the rule hits stem from IPv4 connections.
The only thing I could imagine would be that the devices in my networks contact the servers via their v6 addresses, ICMP fails, and they retry the packet over v4 without an existing state. But I don't know if that is realistic behavior.
Any advice on how to investigate?
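For example, I guess I could watch the drops live on the console with something like this (just a sketch, assuming the default pflog0 logging interface):
# show logged (blocked) packets on port 443 together with the rule info from the pflog header
tcpdump -nei pflog0 port 443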
Thanks and regards!
Hello,
I can also observe this behaviour
Quote: hits Default deny / state violation rule
with actually allowed traffic (iperf3).
Until just now I thought that the errors were not related, but we also share the TCP symptoms from the topic "[Solved] Log flooded with "pf: ICMP error message too short (ip6)"" (https://forum.opnsense.org/index.php?topic=42632.0).
Quote: But I don't know if that is realistic behavior.
Currently, I have to say, I'm sceptical about `pf`, but I don't think it's directly related to the ICMPv6 issue.
Quote: Any advice on how to investigate?
Next, I'm going to give this a try:
opnsense-update -zkr 24.7.3-no_sa
Many thx to @meyergru
I will also keep a close eye on this topic and report here if I find out anything.
Br
Reza
P.S.: @TheDJ: I'm a native German speaker...
Just a shot in the dark: is the "IGMP Proxy" service active on some or all interfaces?
I will have to test that next, even before I apply the 24.7.3-no_sa patch.
But if I remember correctly, only the interfaces on which the IGMP proxy is running are affected.
Of course, this could just be a coincidence, but I think the other interfaces have a "clean" packet rate that is sufficiently close to the native bit rate of the interface, while the interfaces involved in the IGMP proxy have a severely restricted bit rate.
I don't run the IGMP Proxy service (actually not even using IGMP at all in my network).
So, I would assume that this is not related.
As the other thread is currently more active, I posted my assumption based on the current testing there just a few moments ago.
But I am very open to switching to this one, if we feel like the "ICMP error message too short (ip6)" logs can be targeted with the full-revert kernel (and are therefore manageable), while the state losses don't seem to be affected by the -no_sa kernel.
P.S. I am also a native German speaker, but I think we should keep it international-friendly in this thread :)
Quote from: TheDJ on September 06, 2024, 11:02:15 PM
I don't run the IGMP Proxy service (actually not even using IGMP at all in my network).
So, I would assume that this is not related.
Unfortunately, I could not confirm my suspicion about the IGMP proxy. It would have been too nice (easy). ;)
Quote from: TheDJ on September 06, 2024, 11:02:15 PM
As the other thread is currently more active, I posted my assumption based on the current testing there just a few moments ago.
But I am very open to switching to this one, if we feel like the "ICMP error message too short (ip6)" logs can be targeted with the full-revert kernel (and are therefore manageable), while the state losses don't seem to be affected by the -no_sa kernel.
It seems to me that TCP connections can be drastically slower than UDP connections on the same route.
Reproducibly, some routes, e.g. over my slow pppoe interface, are only about 25% slower. I suspect that with faster routes (1 Gbit/s) it is 80% to almost 100% packet loss.
On the connections that don't show 100% loss anyway, however, there are noticeable dropouts lasting a few seconds every few minutes, during which 100% packet loss occurs.
That's why everything to do with TCP hardware offloading came into question for me. In the meantime, however, I have tried every combination of turning hardware offloading off without finding any significant differences.
In the firewall log you can find individual messages, such as the one in the topic title, which indicate that traffic that is actually permitted is being dropped by the last rule.
Together with the observed "debug" message:
'pf: loose state match...', however, it seems clear what is happening:
TCP packets are discarded because pf no longer recognises the TCP states. Each discarded packet slows down the TCP connection - each additional discarded packet slows it down even more.
I think that the underlying problem also explains why TCP connections have different speeds depending on the direction. And I don't just mean a small difference:
❯ iperf3 -c 198.18.50.136 -t 3 --bidir
Connecting to host 198.18.50.136, port 5201
[ 5] local 198.18.178.160 port 42184 connected to 198.18.50.136 port 5201
[ 7] local 198.18.178.160 port 42188 connected to 198.18.50.136 port 5201
[ ID][Role] Interval Transfer Bitrate Retr Cwnd
[ 5][TX-C] 0.00-1.00 sec 201 KBytes 1.64 Mbits/sec 2 1.41 KBytes
[ 7][RX-C] 0.00-1.00 sec 111 MBytes 931 Mbits/sec
[ 5][TX-C] 1.00-2.00 sec 0.00 Bytes 0.00 bits/sec 1 1.41 KBytes
[ 7][RX-C] 1.00-2.00 sec 111 MBytes 932 Mbits/sec
[ 5][TX-C] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec 0 1.41 KBytes
[ 7][RX-C] 2.00-3.00 sec 111 MBytes 932 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
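A quick sanity check along those lines (just a sketch) would be to look on the OPNsense console whether the state for the iperf connection above still exists, and whether pf's state-mismatch counter climbs while the transfer stalls:
# is there still a state entry for the iperf endpoint?
pfctl -ss | grep 198.18.50.136
# this counter should increase when pf drops packets that no longer match a state
pfctl -si | grep state-mismatch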
Br
Reza
I do not see that:
#iperf3 -c iperf.online.net 198.18.50.136 -t 3 --bidir
Connecting to host iperf.online.net, port 5201
[ 5] local 192.168.10.3 port 48222 connected to 51.158.1.21 port 5201
[ 7] local 192.168.10.3 port 48226 connected to 51.158.1.21 port 5201
[ ID][Role] Interval Transfer Bitrate Retr Cwnd
[ 5][TX-C] 0.00-1.00 sec 60.2 MBytes 505 Mbits/sec 16 5.20 MBytes
[ 7][RX-C] 0.00-1.00 sec 96.6 MBytes 810 Mbits/sec
[ 5][TX-C] 1.00-2.00 sec 54.5 MBytes 457 Mbits/sec 64 3.24 MBytes
[ 7][RX-C] 1.00-2.00 sec 132 MBytes 1.11 Gbits/sec
[ 5][TX-C] 2.00-3.00 sec 53.2 MBytes 447 Mbits/sec 114 3.11 MBytes
[ 7][RX-C] 2.00-3.00 sec 132 MBytes 1.11 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval Transfer Bitrate Retr
[ 5][TX-C] 0.00-3.00 sec 168 MBytes 470 Mbits/sec 194 sender
[ 5][TX-C] 0.00-3.03 sec 146 MBytes 405 Mbits/sec receiver
[ 7][RX-C] 0.00-3.00 sec 417 MBytes 1.16 Gbits/sec 3203 sender
[ 7][RX-C] 0.00-3.03 sec 361 MBytes 1000 Mbits/sec
These numbers correspond to my expected performance.
You could try with "-u" to see if UDP is faster. If it is, I would guess that your MTU is misconfigured. Can you try this (https://www.baeldung.com/linux/maximum-transmission-unit-mtu-ip) to find your real maximum MTU?
Probably, setting "-M 1400" would show if this is the problem.
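For example, on a Linux client (1472 bytes of ICMP payload plus 28 bytes of headers make a 1500-byte packet; lower -s until the ping goes through):
# fails with "message too long" if the path MTU is below 1500
ping -c 3 -M do -s 1472 iperf.online.net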
Quote from: meyergru on September 08, 2024, 06:30:38 PM
You could try with "-u" to see if UDP is faster. If it is, I would guess that your MTU is misconfigured. Can you try this (https://www.baeldung.com/linux/maximum-transmission-unit-mtu-ip) to find your real maximum MTU?
Probably, setting "-M 1400" would show if this is the problem.
Many thanks for the answer.
The MTU I have set is (unfortunately) not the problem, as I am only testing between two local VLANs (MTU==1500) that are routed/filtered via opnsense. I could try jumbo frames, maybe it can get even worse ;-)
The reference to pppoe only referred to the fact that this is my slowest connection (100/40); however, the problem is not as obvious there as with my gigabit/VLAN/LAGG connections.
I would like to point out that this topic is about allowed traffic inexplicably hitting the "last rule", a problem with pf and TCP states is suspected, and that this is interfering with TCP connections.
Br
Reza
Another shot in the dark from my side (I would not be able to explain it, but maybe someone else can): could it be a VLAN routing problem? The (LAN) interfaces that have this problem on my device are all VLANs. Not LAGGs, but VLANs just like your setup.
At the same time, I have a road warrior WireGuard VPN setup on the same box, leaving via the same WAN, which (at least from very shallow use) did not encounter any problem in this regard.
...and wireguard is UDP only.
Quote from: rkube on September 08, 2024, 07:49:31 PM
The MTU I have set is (unfortunately) not the problem, as I am only testing between two local VLANs (MTU==1500) that are routed/filtered via opnsense. I could try jumbo frames, maybe it can get even worse ;-)
Yet you show results from an iperf3 test run against an internet IP?
Quote from: rkube on September 08, 2024, 07:49:31 PM
The reference to pppoe only referred to the fact that this is my slowest connection (100/40); however, the problem is not as obvious there as with my gigabit/VLAN/LAGG connections.
I would like to point out that this topic is about allowed traffic inexplicably hitting the "last rule", a problem with pf and TCP states is suspected, and that this is interfering with TCP connections.
So there are also VLANs and LAGGs in the mix? Maybe netmap and suricata as well? ???
Quote from: meyergru on September 08, 2024, 10:47:18 PM
So there are also VLANs and LAGGs in the mix? Maybe netmap and suricata as well? ???
For me, only VLANs. No LAGGs, netmap, or suricata (I had Suricata in IDS mode before, but turned it off without any difference). Also, these VLANs have been stable before (for months).
I also had a traffic shaper running beforehand, but it does not make a difference whether the shaper is enabled or not (although the iperf results show far more Retr packets with the shaper enabled).
With the shaper:
# iperf3 -c iperf.online.net 198.18.50.136 --bidir
Connecting to host iperf.online.net, port 5201
[ 5] local 10.200.10.2 port 44254 connected to 51.158.1.21 port 5201
[ 7] local 10.200.10.2 port 44266 connected to 51.158.1.21 port 5201
[ ID][Role] Interval Transfer Bitrate Retr Cwnd
[ 5][TX-C] 0.00-1.00 sec 6.12 MBytes 51.4 Mbits/sec 15 222 KBytes
[ 7][RX-C] 0.00-1.00 sec 23.2 MBytes 195 Mbits/sec
[ 5][TX-C] 1.00-2.00 sec 4.88 MBytes 40.9 Mbits/sec 59 222 KBytes
[ 7][RX-C] 1.00-2.00 sec 26.7 MBytes 224 Mbits/sec
[ 5][TX-C] 2.00-3.00 sec 4.75 MBytes 39.8 Mbits/sec 92 228 KBytes
[ 7][RX-C] 2.00-3.00 sec 26.7 MBytes 224 Mbits/sec
[ 5][TX-C] 3.00-4.00 sec 3.57 MBytes 29.9 Mbits/sec 77 222 KBytes
[ 7][RX-C] 3.00-4.00 sec 27.0 MBytes 227 Mbits/sec
[ 5][TX-C] 4.00-5.00 sec 4.76 MBytes 39.9 Mbits/sec 136 166 KBytes
[ 7][RX-C] 4.00-5.00 sec 27.1 MBytes 227 Mbits/sec
[ 5][TX-C] 5.00-6.00 sec 3.52 MBytes 29.5 Mbits/sec 145 225 KBytes
[ 7][RX-C] 5.00-6.00 sec 26.9 MBytes 225 Mbits/sec
[ 5][TX-C] 6.00-7.00 sec 4.76 MBytes 39.9 Mbits/sec 90 219 KBytes
[ 7][RX-C] 6.00-7.00 sec 27.0 MBytes 227 Mbits/sec
[ 5][TX-C] 7.00-8.00 sec 4.70 MBytes 39.4 Mbits/sec 84 148 KBytes
[ 7][RX-C] 7.00-8.00 sec 26.3 MBytes 221 Mbits/sec
[ 5][TX-C] 8.00-9.00 sec 3.52 MBytes 29.6 Mbits/sec 85 222 KBytes
[ 7][RX-C] 8.00-9.00 sec 27.7 MBytes 232 Mbits/sec
[ 5][TX-C] 9.00-10.00 sec 4.80 MBytes 40.3 Mbits/sec 123 152 KBytes
[ 7][RX-C] 9.00-10.00 sec 26.9 MBytes 226 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval Transfer Bitrate Retr
[ 5][TX-C] 0.00-10.00 sec 45.4 MBytes 38.1 Mbits/sec 906 sender
[ 5][TX-C] 0.00-10.02 sec 42.3 MBytes 35.4 Mbits/sec receiver
[ 7][RX-C] 0.00-10.00 sec 277 MBytes 233 Mbits/sec 2261 sender
[ 7][RX-C] 0.00-10.02 sec 266 MBytes 222 Mbits/sec receiver
iperf Done.
Without the shaper:
# iperf3 -c iperf.online.net 198.18.50.136 --bidir
Connecting to host iperf.online.net, port 5201
[ 5] local 10.200.10.2 port 52252 connected to 51.158.1.21 port 5201
[ 7] local 10.200.10.2 port 52266 connected to 51.158.1.21 port 5201
[ ID][Role] Interval Transfer Bitrate Retr Cwnd
[ 5][TX-C] 0.00-1.00 sec 7.46 MBytes 62.6 Mbits/sec 91 382 KBytes
[ 7][RX-C] 0.00-1.00 sec 23.7 MBytes 199 Mbits/sec
[ 5][TX-C] 1.00-2.00 sec 4.66 MBytes 39.1 Mbits/sec 33 294 KBytes
[ 7][RX-C] 1.00-2.00 sec 29.1 MBytes 244 Mbits/sec
[ 5][TX-C] 2.00-3.00 sec 4.73 MBytes 39.7 Mbits/sec 12 259 KBytes
[ 7][RX-C] 2.00-3.00 sec 30.2 MBytes 253 Mbits/sec
[ 5][TX-C] 3.00-4.00 sec 4.70 MBytes 39.4 Mbits/sec 0 276 KBytes
[ 7][RX-C] 3.00-4.00 sec 31.9 MBytes 267 Mbits/sec
[ 5][TX-C] 4.00-5.00 sec 4.70 MBytes 39.4 Mbits/sec 0 253 KBytes
[ 7][RX-C] 4.00-5.00 sec 30.7 MBytes 257 Mbits/sec
[ 5][TX-C] 5.00-6.00 sec 4.63 MBytes 38.8 Mbits/sec 0 264 KBytes
[ 7][RX-C] 5.00-6.00 sec 29.4 MBytes 247 Mbits/sec
[ 5][TX-C] 6.00-7.00 sec 4.70 MBytes 39.5 Mbits/sec 0 273 KBytes
[ 7][RX-C] 6.00-7.00 sec 33.6 MBytes 282 Mbits/sec
[ 5][TX-C] 7.00-8.00 sec 4.67 MBytes 39.2 Mbits/sec 0 270 KBytes
[ 7][RX-C] 7.00-8.00 sec 31.9 MBytes 267 Mbits/sec
[ 5][TX-C] 8.00-9.00 sec 4.66 MBytes 39.1 Mbits/sec 0 262 KBytes
[ 7][RX-C] 8.00-9.00 sec 31.5 MBytes 265 Mbits/sec
[ 5][TX-C] 9.00-10.00 sec 4.70 MBytes 39.4 Mbits/sec 0 5.62 KBytes
[ 7][RX-C] 9.00-10.00 sec 31.4 MBytes 264 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval Transfer Bitrate Retr
[ 5][TX-C] 0.00-10.00 sec 49.6 MBytes 41.6 Mbits/sec 136 sender
[ 5][TX-C] 0.00-10.02 sec 46.7 MBytes 39.1 Mbits/sec receiver
[ 7][RX-C] 0.00-10.00 sec 320 MBytes 268 Mbits/sec 1054 sender
[ 7][RX-C] 0.00-10.02 sec 303 MBytes 254 Mbits/sec receiver
iperf Done.
Both iperf runs are in line with what I expect for my line. But the TCP state losses and FW hits still happen.
What still seems strange to me: all of the TCP state losses and FW hits happen on v4 addresses, although the devices have SLAAC GUAs available. Of course, the public servers, for which those connection dropouts happen, might only have v4 addresses, so I'm not sure if that is any specific symptom.
The TCP dropouts also happen for some apps (e.g. German ZDF Mediathek) more often than for others.
For my internal networks, I have never experienced a state loss - only to the Internet.
What else could be done to diagnose this? I am close to downgrading to 24.1. The timeouts are really annoying during regular use.
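One more thing I could probably try (just a sketch): raising pf's debug level on the console, so that any state handling problems show up in the kernel log:
# more verbose pf diagnostics; set it back afterwards, e.g. with "pfctl -x urgent"
pfctl -x loud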
With the normal 24.7.3 kernel, I can confirm the "pf: ICMP error message too short (ip6)" messages - which go away with the no-sa kernel.
I can also confirm the "pf: loose state match" notices with both kernels.
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP out wire: 79.222.100.212:488 99.88.77.66:64307 stack: 79.222.100.212:488 10.0.1.7:56542 [lo=2251669520 high=2251734519 win=502 modulator=0 wscale=7] [lo=232729624 high=232785857 win=510 modulator=0 wscale=7] 10:10 R seq=232729624 (232721729) ack=2251669520 len=0 ackskew=0 pkts=12:16 dir=in,rev
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP in wire: 10.0.1.7:56542 79.222.100.212:488 stack: - [lo=2251669520 high=2251734519 win=502 modulator=0 wscale=7] [lo=232729624 high=232785857 win=510 modulator=0 wscale=7] 10:10 R seq=232729624 (232721729) ack=2251669520 len=0 ackskew=0 pkts=12:15 dir=out,rev
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP out wire: 79.222.100.212:488 99.88.77.66:64307 stack: 79.222.100.212:488 10.0.1.7:56542 [lo=2251669520 high=2251734519 win=502 modulator=0 wscale=7] [lo=232729624 high=232785857 win=510 modulator=0 wscale=7] 10:10 R seq=232729624 (232721729) ack=2251669520 len=0 ackskew=0 pkts=12:15 dir=in,rev
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP out wire: 79.222.100.212:488 99.88.77.66:64307 stack: 79.222.100.212:488 10.0.1.7:56542 [lo=2251669520 high=2251734519 win=502 modulator=0 wscale=7] [lo=232723169 high=232785857 win=510 modulator=0 wscale=7] 9:4 R seq=2251669520 (2251669495) ack=232723169 len=0 ackskew=0 pkts=11:9 dir=out,fwd
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP in wire: 10.0.1.7:56542 79.222.100.212:488 stack: - [lo=2251669520 high=2251734519 win=502 modulator=0 wscale=7] [lo=232723169 high=232785857 win=510 modulator=0 wscale=7] 9:4 R seq=2251669520 (2251669495) ack=232723169 len=0 ackskew=0 pkts=11:9 dir=in,fwd
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP out wire: 10.0.2.36:9443 10.0.1.7:48268 stack: - [lo=1094662497 high=1094727369 win=502 modulator=0 wscale=7] [lo=1009336067 high=1009389476 win=510 modulator=0 wscale=7] 10:10 R seq=1094662497 (1094662473) ack=1009336067 len=0 ackskew=0 pkts=14:14 dir=out,fwd
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP in wire: 10.0.1.7:48268 10.0.2.36:9443 stack: - [lo=1094662497 high=1094727369 win=502 modulator=0 wscale=7] [lo=1009336067 high=1009389476 win=510 modulator=0 wscale=7] 10:10 R seq=1094662497 (1094662473) ack=1009336067 len=0 ackskew=0 pkts=14:14 dir=in,fwd
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP out wire: 10.0.2.36:9443 10.0.1.7:48268 stack: - [lo=1094662497 high=1094727369 win=502 modulator=0 wscale=7] [lo=1009336067 high=1009389476 win=510 modulator=0 wscale=7] 10:10 R seq=1094662497 (1094662473) ack=1009336067 len=0 ackskew=0 pkts=13:14 dir=out,fwd
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP in wire: 10.0.1.7:48268 10.0.2.36:9443 stack: - [lo=1094662497 high=1094727369 win=502 modulator=0 wscale=7] [lo=1009336067 high=1009389476 win=510 modulator=0 wscale=7] 10:10 R seq=1094662497 (1094662473) ack=1009336067 len=0 ackskew=0 pkts=13:14 dir=in,fwd
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP out wire: 10.0.2.36:9443 10.0.1.7:48268 stack: - [lo=1094662497 high=1094727369 win=502 modulator=0 wscale=7] [lo=1009336067 high=1009389476 win=510 modulator=0 wscale=7] 10:10 R seq=1094662497 (1094662473) ack=1009336067 len=0 ackskew=0 pkts=12:14 dir=out,fwd
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP in wire: 10.0.1.7:48268 10.0.2.36:9443 stack: - [lo=1094662497 high=1094727369 win=502 modulator=0 wscale=7] [lo=1009336067 high=1009389476 win=510 modulator=0 wscale=7] 10:10 R seq=1094662497 (1094662473) ack=1009336067 len=0 ackskew=0 pkts=12:14 dir=in,fwd
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP out wire: 10.0.2.36:9443 10.0.1.7:48268 stack: - [lo=1094662497 high=1094727369 win=502 modulator=0 wscale=7] [lo=1009336067 high=1009389476 win=510 modulator=0 wscale=7] 10:10 R seq=1094662497 (1094662473) ack=1009336067 len=0 ackskew=0 pkts=11:14 dir=out,fwd
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP in wire: 10.0.1.7:48268 10.0.2.36:9443 stack: - [lo=1094662497 high=1094727369 win=502 modulator=0 wscale=7] [lo=1009336067 high=1009389476 win=510 modulator=0 wscale=7] 10:10 R seq=1094662497 (1094662473) ack=1009336067 len=0 ackskew=0 pkts=11:14 dir=in,fwd
2024-09-09T10:30:02 Notice kernel TCP out wire: 10.0.2.36:9443 10.0.1.7:48268 stack: - [lo=1094662497 high=1094727369 win=502 modulator=0 wscale=7] [lo=1009336067 high=1009389476 win=510 modulator=0 wscale=7] 10:10 R seq=1094662497 (1094662473) ack=1009336067 len=0 ackskew=0 pkts=10:14 dir=out,fwd
2024-09-09T10:30:02 Notice kernel pf: loose state match: pf: ICMP error message too short (ip6)
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP in wire: 10.0.1.7:48268 10.0.2.36:9443 stack: - [lo=1094662497 high=1094727369 win=502 modulator=0 wscale=7] [lo=1009336067 high=1009389476 win=510 modulator=0 wscale=7] 10:10 R seq=1094662497 (1094662473) ack=1009336067 len=0 ackskew=0 pkts=10:14 dir=in,fwd
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP out wire: 10.0.2.36:9443 10.0.1.7:48268 stack: - [lo=1094662497 high=1094727369 win=502 modulator=0 wscale=7] [lo=1009332316 high=1009389476 win=510 modulator=0 wscale=7] 10:10 R seq=1094662497 (1094662473) ack=1009332316 len=0 ackskew=0 pkts=9:10 dir=out,fwd
2024-09-09T10:30:02 Notice kernel pf: loose state match: TCP in wire: 10.0.1.7:48268 10.0.2.36:9443 stack: - [lo=1094662497 high=1094727369 win=502 modulator=0 wscale=7] [lo=1009331028 high=1009389476 win=510 modulator=0 wscale=7] 10:10 R seq=1094662497 (1094662473) ack=1009331028 len=0 ackskew=0 pkts=9:9 dir=in,fwd
2024-09-09T10:30:01 Notice kernel pf: ICMP error message too short (ip6)
2024-09-09T10:30:01 Notice kernel pf: ICMP error message too short (ip6)
2024-09-09T10:30:01 Notice kernel pf: ICMP error message too short (ip6)
2024-09-09T10:30:00 Notice kernel pf: loose state match: TCP out wire: 157.240.252.35:443 99.88.77.66:18542 stack: 157.240.252.35:443 10.0.1.7:57294 [lo=3539577390 high=3539646254 win=502 modulator=0 wscale=7] [lo=4072763929 high=4072811232 win=269 modulator=0 wscale=8] 10:10 R seq=3539577390 (3539577366) ack=4072763929 len=0 ackskew=0 pkts=24:28 dir=out,fwd
However, I do not see any massive performance degradation because of this.
Do you notice any FW hits on the default deny for this traffic?
For me, these loose state matches correspond quite well with the state losses, as far as I can tell. Right now, I wouldn't know any other reason why incoming 443 traffic would be blocked (especially with the A and PA flags).
I think I'm running into an issue which has the same root cause as reported here.
After updating to 24.7.x I cannot connect my web server from the public internet anymore.
Before the update everything was running fine and I did not touch the configuration.
OPNsense has a port forwarding and allow rules etc. which were working fine to forward public internet traffic towards my internal web server.
But after the update, each attempt to connect to the web server is rejected by the floating "default deny / state violation" rule. Even incoming traffic with tcpflags S is caught by the "default deny" rule.
Are there any changes in OPNsense 24.7 that could explain why this happens, or any recommendations to overcome the problem?
I'm currently running 24.7.3_1. Any hints are welcome.
Quote from: struppie on September 09, 2024, 03:45:15 PM
I think I'm running into an issue which has the same root cause as reported here.
....
Found the issue - I'm checking the source IP with GeoIPWhitelisting. It seems that this no longer works as expected; I need to analyse it in detail.
But using "any" as the source (instead of GeoIPWhitelisting) in the forwarding rule fixes everything (meaning the custom forwarding + rule matches again and the traffic no longer runs into the default deny).
Quote from: meyergru on September 09, 2024, 10:40:53 AM
With the normal 24.7.3 kernel, I can confirm the "pf: ICMP error message too short (ip6)" messages - which go away with the no-sa kernel.
I can also confirm the "pf: loose state match" notices with both kernels.
I just went back to OPNsense 24.1 (imported config from 24.7) and, with debug logging turned on... taddahhh... I also see the same 'pf: loose state match' notices.
Quote from: rkube on September 09, 2024, 04:16:09 PM
I just went back to OPNsense 24.1 (imported config from 24.7) and, with debug logging turned on... taddahhh... I also see the same 'pf: loose state match' notices.
Thanks, good to know. Maybe it's a different (but somehow related) issue that did not surface in the same way until now.
Do you also see the performance degradation/FW hits?
Hi!
Quote from: TheDJ on September 10, 2024, 07:02:06 PM
Thanks, good to know.
I was a little bit disappointed at that moment.
Quote from: TheDJ on September 10, 2024, 07:02:06 PM
Maybe it's a different (but somehow related) issue that did not surface in the same way until now.
I'm going to take a step back and look at the whole thing from some distance, without being blinded by the possible "FreeBSD 14 / IPv6 issue".
Quote from: TheDJ on September 10, 2024, 07:02:06 PM
Do you also see the performance degradation/FW hits?
Unfortunately, yes.
But I did something just days before upgrading to OPNsense 24.7: I had previously done the bonding (LACP) of the interfaces and the VLAN tagging on the host (Proxmox) and passed the resulting bond0.[vlan id] interfaces as separate virtio network cards to the OPNsense VM. I recently changed that because I was unhappy with creating an interface for each new (or changed) VLAN on the host and having to guess in which order they would be assigned to the OPNsense interfaces (this was the behavior under VirtualBox).
So, I'm going back to 24.7 (no_sa kernel), but will again assemble the bond and VLAN interfaces on the host, assuming that Linux (probably?) has the better driver and working hardware offloading *fingers crossed* ;-)
Br
Reza
P.S.: Sorry, I'm a little bit sick and spending less time in front of my homelab at the moment...
Hi,
sorry, I missed your post for days ...
Quote from: meyergru on September 08, 2024, 10:47:18 PM
Quote from: rkube on September 08, 2024, 07:49:31 PM
The MTU I have set is (unfortunately) not the problem, as I am only testing between two local VLANs (MTU==1500) that are routed/filtered via opnsense. I could try jumbo frames, maybe it can get even worse ;-)
Yet you show results from an iperf3 test run against an internet IP?
Please don't beat me, but 198.18.0.0/15 is not publicly routable IP space. (pssst: "bogus IPs" ;-] )
Quote from: meyergru on September 08, 2024, 10:47:18 PM
So there are also VLANs and LAGGs in the mix? Maybe netmap and suricata as well? ???
I think of LAGGs and VLANs as very basic FW/router interface types. Besides pppoe, I have nothing more in the mix. So... no netmap or suricata here.
Br
Reza
@meyergru and @rkube: what are your ISPs (I assume both of you are in Germany)?
As mentioned, I am on a Telekom Dual Stack with 250/40. Maybe it is a routing/peering issue that coincidentally appeared at the same time. Then, the TCP packets might be just a little too late (running out of the TTL) and the state is closed? This would also explain why it is not perfectly consistent and now even hits 24.1?
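If it really is peering, a longer mtr run from a client towards one of the affected destinations should show where the latency/loss starts (hypothetical target, any affected server would do):
# 100 IPv4 probes, summarized as a report
mtr -4 -c 100 --report zdf.de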
Just for the record: 24.7.4 did not change/improve the behavior.
I didn't expect it to, because I did not see anything in the changelog that would indicate better behavior for v4, but I just wanted to note it here.
Is there anything else that could be done? I am very open to suggestions.
Quote from: TheDJ on September 12, 2024, 08:11:25 PM
Is there anything else that could be done? I am very open to suggestions.
The next of my shots into the dark (or rather, the light): We are reloading OPNsense, or just pf, a lot at the moment. But e.g. our laptops and other network devices, which have a lot of established TCP connections, stay "online" during this time.
So when OPNsense (pf) loses its knowledge of all states (because of a reboot or config reload, ...), the laptop still has the knowledge of its already established connections.
When the laptop sends a TCP packet to another station with an already established TCP state, it won't send a new SYN packet - it will just send ACKs (or maybe PUSH/ACKs) with sequence numbers.
OPNsense, seeing this traffic, does not know about this already established state and will log a debug message.
The more we test at the moment, the more of these debug messages we'll get.
And the packets will be blocked at "last rule", because of the state violation. "Works as designed" ;-)
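If that is what's happening, the state table size should visibly drop across a reload while the clients keep their connections open. A rough sketch (assuming the usual OPNsense configctl action for reloading the filter):
# note the number of states, reload the ruleset, then compare
pfctl -si | grep 'current entries'
configctl filter reload
pfctl -si | grep 'current entries'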
Br
Reza
That is true for very fresh traffic right after a reboot/reconnect, but it should stabilize after a few minutes. For me, the behavior is ongoing even after a few days.
After weeks of hunting down this behavior and literally exchanging every hardware component, I found the problem: it (presumably) was a firmware upgrade on a WiFi access point acting as a wireless backhaul.
I deployed the new v7.XX branch for Zyxel NWA220AX-6E roughly at the same time. I performed multiple firmware upgrades on that device and even got it swapped via an RMA afterward, so I didn't think this was related.
Today I DOWNgraded it to a 6.XX firmware that I still had - 'poof' - all issues seem to be gone. I will continue to monitor the situation, but I believe that firmware for that device is borked. This leads to packet loss and in turn a closing of TCP states.
Quote from: TheDJ on October 21, 2024, 05:34:31 PM
Today I DOWNgraded it to a 6.XX firmware that I still had - 'poof' - all issues seem to be gone. I will continue to
Fingers crossed ;-)