Messages - TheDJ

#1
Yes, this is how I remembered it: only ISOs/images for major versions. This is why I was so surprised to see the 26.1.2 minor version in the mirror repos. Is there an announcement explaining why this version specifically is available as a full image?
#2
I want to set up OPNsense with a config backup and I went to download the newest installer/image.
I now noticed that the 26.1 directories on all mirrors I checked (leaseweb, https://pkg.opnsense.org/releases/mirror/, dns-root etc.) also contain a 26.1.2 image (including all keys etc.). On top of that, the root directories of these mirrors also contain a 26.1.2 directory (instead of only major versions).

This seems weird to me for two reasons:
  • none of the other major version directories contain minor versions (2x.1.z / 2x.7.z) - only the respective major versions. I also can't remember it ever being possible to download minor versions from the mirrors in the past
  • 26.1.2 is not even the current minor version for the 26.1 branch - so if the most current minor version is now supposed to show up in the mirrors, 26.1.4 should be there instead.

Is there a specific reason that this is different for the 26.1.2 version?
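For anyone who wants to check this themselves, here is a quick sketch for scanning a mirror's directory index for x.y.z-style entries. The helper name is mine, and the parsing assumes a simple HTML index with `href="26.1/"`-style links, which may differ per mirror.

```shell
# flag_minor_versions: reads an HTML directory index on stdin and prints
# any linked directories that look like a minor version (x.y.z) rather
# than a major version (x.y). Helper name and parsing are assumptions;
# real mirror index HTML may be formatted differently.
flag_minor_versions() {
  grep -oE 'href="[0-9]+\.[0-9]+\.[0-9]+/"' | tr -d '"' | sed 's/^href=//; s,/$,,'
}

# Example against one of the mirrors mentioned above (requires network):
#   curl -s https://pkg.opnsense.org/releases/mirror/ | flag_minor_versions
```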
#3
German - Deutsch / Re: DS-Lite OPNsense
October 27, 2024, 09:38:24 PM
If I am seeing this correctly, this problem is now being tracked in https://github.com/opnsense/core/issues/7713.
#4
After weeks of hunting down this behavior and literally exchanging every hardware component, I found the problem: it was (presumably) a firmware upgrade on a Wi-Fi access point that served as a wireless backhaul.

I deployed the new v7.XX branch for the Zyxel NWA220AX-6E at roughly the same time. I performed multiple firmware upgrades on that device and even had it swapped via an RMA afterward, so I didn't think this was related.
Today I DOWNgraded it to a 6.XX firmware that I still had - 'poof' - all issues seem to be gone. I will continue to monitor the situation, but I believe the firmware for that device is borked: it leads to packet loss and, in turn, to TCP states being closed.
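A loss problem like this on the backhaul can be confirmed from the firewall side with a plain ping. A minimal sketch - the AP address below is a placeholder and the helper name is mine:

```shell
# extract_loss: pulls the packet-loss percentage out of ping's summary
# line so it can be compared or logged in a script.
extract_loss() {
  grep -oE '[0-9.]+% packet loss' | cut -d% -f1
}

# Usage (10.0.0.2 is a placeholder for the AP's address):
#   ping -c 100 -i 0.2 10.0.0.2 | extract_loss
```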
#5
This is true for very fresh traffic after a reboot/reconnect, but it should stabilize after a few minutes. For me, the behavior is ongoing even after a few days.
#6
Just for the record: 24.7.4 did not change/improve the behavior.

I didn't expect it to, because I did not see anything in the changelog that would indicate better behavior for v4, but I just wanted to note it here.

Is there anything else that could be done? I am very open to suggestions.
#7
Try widening the widget a bit more. I think I noticed the same behavior when I resized it and it got too small.
#8
@meyergru and @rkube: what are your ISPs (I assume both of you are in Germany)?

As mentioned, I am on a Telekom dual stack with 250/40. Maybe it is a routing/peering issue that coincidentally appeared at the same time. Then the TCP packets might arrive just a little too late (after the state's TTL has run out) and the state is closed? This would also explain why it is not perfectly consistent and now even hits 24.1.
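If it were a routing/peering issue, loss or latency should show up mid-path. A hedged sketch using `mtr --report` - the target host is just an example, the helper name is mine, and the awk assumes mtr's default report layout (two header lines, then one line per hop with Loss% in column 3):

```shell
# worst_hop_loss: scans an 'mtr --report' table and prints the highest
# per-hop loss percentage seen. Assumes mtr's default report columns.
worst_hop_loss() {
  awk 'NR > 2 { gsub(/%/, "", $3); if ($3 + 0 > max) max = $3 + 0 } END { print max + 0 }'
}

# Usage (requires mtr and network; target is an example):
#   mtr --report --report-cycles 50 iperf.online.net | worst_hop_loss
```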
#9
Quote from: rkube on September 09, 2024, 04:16:09 PM
I just went back to OPNsense 24.1 (imported the config from 24.7) and, with debug logging turned on,... taddahhh... I also see the same 'pf: loose state match' notices.

Thanks, good to know. Maybe it's a different (but somehow related) issue that did not surface in the same way until now.

Do you also see the performance degradation/FW hits?
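To see whether the 'pf: loose state match' notices line up with the dropouts, it helps to pull them out with their timestamps and a total. A sketch with a helper name of my own - with pf debug logging on, the notices land in dmesg / the system log:

```shell
# loose_state_matches: prints every 'pf: loose state match' notice from
# the input on stderr (so the timestamps can be compared with the FW
# hits) and the total count on stdout.
loose_state_matches() {
  grep 'pf: loose state match' | tee /dev/stderr | wc -l
}

# Usage:
#   dmesg | loose_state_matches
```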
#10
Do you notice any FW hits on the default deny for this traffic?

For me, these TCP state losses correspond quite well with the FW hits, as far as I can tell. Right now, I can't think of any other reason why incoming 443 traffic would be blocked (especially with the A and PA flags).
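One way to quantify those hits is to count default-deny lines for blocked TCP/443 traffic in the filter log. A sketch only: the field positions in the filterlog CSV vary by version, so this greps for the comma-delimited tokens instead of parsing columns, and the log path in the usage note is from memory.

```shell
# count_denied_443: counts filter-log lines that are blocked TCP traffic
# on port 443 carrying the PA flags. Token-based matching is an
# approximation of the filterlog CSV format.
count_denied_443() {
  grep ',block,' | grep ',tcp,' | grep ',443,' | grep -c ',PA,'
}

# Usage on the firewall (path may differ by version):
#   count_denied_443 < /var/log/filter/latest.log
```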
#11
Quote from: meyergru on September 08, 2024, 10:47:18 PM
So there are also VLANs and LAGGs in the mix? Maybe netmap and suricata as well? ???

For me, only VLANs. No LAGGs, netmap, or suricata (I had it in IDS mode before, but turned it off without any difference). Also, these VLANs had been stable for months before.
I also had a traffic shaper running beforehand, but it makes no difference whether it is on or off (although the iperf results show far more Retr packets with the shaper running).

With the shaper:
# iperf3 -c iperf.online.net --bidir
Connecting to host iperf.online.net, port 5201
[  5] local 10.200.10.2 port 44254 connected to 51.158.1.21 port 5201
[  7] local 10.200.10.2 port 44266 connected to 51.158.1.21 port 5201
[ ID][Role] Interval           Transfer     Bitrate         Retr  Cwnd
[  5][TX-C]   0.00-1.00   sec  6.12 MBytes  51.4 Mbits/sec   15    222 KBytes       
[  7][RX-C]   0.00-1.00   sec  23.2 MBytes   195 Mbits/sec                 
[  5][TX-C]   1.00-2.00   sec  4.88 MBytes  40.9 Mbits/sec   59    222 KBytes       
[  7][RX-C]   1.00-2.00   sec  26.7 MBytes   224 Mbits/sec                 
[  5][TX-C]   2.00-3.00   sec  4.75 MBytes  39.8 Mbits/sec   92    228 KBytes       
[  7][RX-C]   2.00-3.00   sec  26.7 MBytes   224 Mbits/sec                 
[  5][TX-C]   3.00-4.00   sec  3.57 MBytes  29.9 Mbits/sec   77    222 KBytes       
[  7][RX-C]   3.00-4.00   sec  27.0 MBytes   227 Mbits/sec                 
[  5][TX-C]   4.00-5.00   sec  4.76 MBytes  39.9 Mbits/sec  136    166 KBytes       
[  7][RX-C]   4.00-5.00   sec  27.1 MBytes   227 Mbits/sec                 
[  5][TX-C]   5.00-6.00   sec  3.52 MBytes  29.5 Mbits/sec  145    225 KBytes       
[  7][RX-C]   5.00-6.00   sec  26.9 MBytes   225 Mbits/sec                 
[  5][TX-C]   6.00-7.00   sec  4.76 MBytes  39.9 Mbits/sec   90    219 KBytes       
[  7][RX-C]   6.00-7.00   sec  27.0 MBytes   227 Mbits/sec                 
[  5][TX-C]   7.00-8.00   sec  4.70 MBytes  39.4 Mbits/sec   84    148 KBytes       
[  7][RX-C]   7.00-8.00   sec  26.3 MBytes   221 Mbits/sec                 
[  5][TX-C]   8.00-9.00   sec  3.52 MBytes  29.6 Mbits/sec   85    222 KBytes       
[  7][RX-C]   8.00-9.00   sec  27.7 MBytes   232 Mbits/sec                 
[  5][TX-C]   9.00-10.00  sec  4.80 MBytes  40.3 Mbits/sec  123    152 KBytes       
[  7][RX-C]   9.00-10.00  sec  26.9 MBytes   226 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval           Transfer     Bitrate         Retr
[  5][TX-C]   0.00-10.00  sec  45.4 MBytes  38.1 Mbits/sec  906             sender
[  5][TX-C]   0.00-10.02  sec  42.3 MBytes  35.4 Mbits/sec                  receiver
[  7][RX-C]   0.00-10.00  sec   277 MBytes   233 Mbits/sec  2261             sender
[  7][RX-C]   0.00-10.02  sec   266 MBytes   222 Mbits/sec                  receiver

iperf Done.


Without the shaper:

# iperf3 -c iperf.online.net --bidir
Connecting to host iperf.online.net, port 5201
[  5] local 10.200.10.2 port 52252 connected to 51.158.1.21 port 5201
[  7] local 10.200.10.2 port 52266 connected to 51.158.1.21 port 5201
[ ID][Role] Interval           Transfer     Bitrate         Retr  Cwnd
[  5][TX-C]   0.00-1.00   sec  7.46 MBytes  62.6 Mbits/sec   91    382 KBytes       
[  7][RX-C]   0.00-1.00   sec  23.7 MBytes   199 Mbits/sec                 
[  5][TX-C]   1.00-2.00   sec  4.66 MBytes  39.1 Mbits/sec   33    294 KBytes       
[  7][RX-C]   1.00-2.00   sec  29.1 MBytes   244 Mbits/sec                 
[  5][TX-C]   2.00-3.00   sec  4.73 MBytes  39.7 Mbits/sec   12    259 KBytes       
[  7][RX-C]   2.00-3.00   sec  30.2 MBytes   253 Mbits/sec                 
[  5][TX-C]   3.00-4.00   sec  4.70 MBytes  39.4 Mbits/sec    0    276 KBytes       
[  7][RX-C]   3.00-4.00   sec  31.9 MBytes   267 Mbits/sec                 
[  5][TX-C]   4.00-5.00   sec  4.70 MBytes  39.4 Mbits/sec    0    253 KBytes       
[  7][RX-C]   4.00-5.00   sec  30.7 MBytes   257 Mbits/sec                 
[  5][TX-C]   5.00-6.00   sec  4.63 MBytes  38.8 Mbits/sec    0    264 KBytes       
[  7][RX-C]   5.00-6.00   sec  29.4 MBytes   247 Mbits/sec                 
[  5][TX-C]   6.00-7.00   sec  4.70 MBytes  39.5 Mbits/sec    0    273 KBytes       
[  7][RX-C]   6.00-7.00   sec  33.6 MBytes   282 Mbits/sec                 
[  5][TX-C]   7.00-8.00   sec  4.67 MBytes  39.2 Mbits/sec    0    270 KBytes       
[  7][RX-C]   7.00-8.00   sec  31.9 MBytes   267 Mbits/sec                 
[  5][TX-C]   8.00-9.00   sec  4.66 MBytes  39.1 Mbits/sec    0    262 KBytes       
[  7][RX-C]   8.00-9.00   sec  31.5 MBytes   265 Mbits/sec                 
[  5][TX-C]   9.00-10.00  sec  4.70 MBytes  39.4 Mbits/sec    0   5.62 KBytes       
[  7][RX-C]   9.00-10.00  sec  31.4 MBytes   264 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval           Transfer     Bitrate         Retr
[  5][TX-C]   0.00-10.00  sec  49.6 MBytes  41.6 Mbits/sec  136             sender
[  5][TX-C]   0.00-10.02  sec  46.7 MBytes  39.1 Mbits/sec                  receiver
[  7][RX-C]   0.00-10.00  sec   320 MBytes   268 Mbits/sec  1054             sender
[  7][RX-C]   0.00-10.02  sec   303 MBytes   254 Mbits/sec                  receiver

iperf Done.


Both iperf runs are in line with what I expect for my line. But the TCP state losses and FW hits still happen.

What still seems strange to me: all of the TCP state losses and FW hits happen on v4 addresses, although the devices have SLAAC GUAs available. Of course, the public servers for which those connection dropouts happen might only have v4 addresses, so I'm not sure whether that is a meaningful symptom.
The TCP dropouts also happen for some apps (e.g. German ZDF Mediathek) more often than for others.
For my internal networks, I have never experienced a state loss - only to the Internet.

What else could be done to diagnose this? I am close to downgrading to 24.1. The timeouts are really annoying during regular use.
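One more diagnostic avenue that comes to mind: watch pf's own state-table counters while the problem reproduces. `pfctl -si` is standard pf tooling; the helper below (name is mine) just pulls out the searches/inserts/removals lines, and a jump in removals relative to inserts between two snapshots would point at states being torn down early.

```shell
# state_counters: extracts the state-table counter lines from
# 'pfctl -si' output (counter name and running total) for easy
# before/after comparison.
state_counters() {
  awk '/searches|inserts|removals/ { print $1, $2 }'
}

# Usage on the firewall (root shell):
#   pfctl -si | state_counters
```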
#12
Another shot in the dark from my side (I would not be able to explain it, but maybe someone else can): could it be a VLAN routing problem? The (LAN) interfaces that have this problem on my device are all VLANs. No LAGGs, but VLANs, just like your setup.
At the same time, I have a road-warrior WireGuard VPN setup on the same box, leaving via the same WAN, which (at least in light use) has not encountered any problem in this regard.
#13
Yeah, I switched the mirror and then the download worked.

However, even after several more hours on the no_sa kernel, the TCP state losses (and FW rule hits) are still there, completely unchanged.
The "ICMP error message too short (ip6)" messages (which were the initial starting point for this thread) are gone (as the others described), but the TCP behavior did not change.
#14
Quote from: doktornotor on September 06, 2024, 11:01:24 PM
Fetching the kernel works fine for me.

It's very weird, but fetching the kernel no longer works for me: after the status "Fetching kernel-24.7.3-no_sa-amd64.txz", the loading dots just keep running for a long time, with no error message or anything.
The same happens for the other 24.7.3 test kernel in the snapshot directory (https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/sets/).

Trying to fetch a non-existent kernel times out immediately with an error message:
Fetching kernel-24.7.3-test-amd64.txz: ..[fetch: https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/sets/kernel-24.7.3-test-amd64.txz.sig: Not Found] failed, no signature found
I already reverted to the zfs snapshot that I set up before my kernel testing earlier this week.
Maybe my OPNsense installation is more broken than I thought.

This means I can't currently verify anything regarding the TCP timeouts with the -no_sa kernel.

EDIT: Scratch that - changing the mirror works. I will now also test the -no_sa kernel more thoroughly than before.
#15
I don't run the IGMP Proxy service (in fact, I'm not using IGMP at all in my network).

So I would assume that this is not related.

As the other thread is currently more active, I posted my assumption based on the current testing there just a few moments ago.
But I am very open to switching to this one if we feel that the "ICMP error message too short (ip6)" logs can be targeted with the full-revert kernel (and are therefore manageable), while the state losses don't seem to be affected by the -no_sa kernel.

P.S. I am also a native German speaker, but I think we should keep this thread international-friendly :)