Messages - rmayr

#1
No changes. In response to packets like this one, which I see from the Android device with
tcpdump -n -i igb2_vlan64 ether host fa:17:c7:f8:dd:85 and ip6 (that is the Android device's randomized MAC address for this SSID):

18:00:19.946419 IP6 2a03:fa00:650:30:9a7c:9494:3859:2d9b.45934 > 2606:4700:10::6814:2f59.443: Flags [S], seq 3003767200, win 65535, options [mss 1432,sackOK,TS val 3205487265 ecr 0,nop,wscale 8], length 0
So this is a plain TCP connection attempt (SYN flag) to port 443 via the default route. I can ping that target IP from another Linux laptop on the same LAN/SSID.

I get the firewall log entry below (not for exactly this packet, because once the Android device / OPNsense combination gets into this state, I no longer get many log entries for this source IP):

LAN In 2026-01-22T17:46:21 TCP [2a03:fa00:650:30:9a7c:9494:3859:2d9b]:49736 [2a00:1450:4001:805::200a]:443 block Default deny / state violation rule
With the (now even simplified) default rules on the OPNsense ruleset, I really, really don't understand why it would be blocked. Can there be any weird packet flags that cause the "state violation"? Or maybe this has to do with traffic shaping (simple QoS rules)? I am quite at a loss to understand this behavior.

As soon as the Android device starts using a new randomized client IPv6 address, traffic gets through again for a short while before the same happens with the new address.
#2
The issue still happens, although it seems to be harder to reproduce / takes longer to trigger now. I am not sure if having two (non-quick) rules that allow the same traffic from the respective VLAN to WAN makes it less likely that packets don't match? It also seems like the more traffic an IPv6 address has already created, the more quickly it triggers the condition. But as before, I am completely stumped on what could cause these symptoms and am not closer to solving it (though I have reduced complexity somewhat by, e.g., shutting down the backup firewall for the time being). Any other hints on how to debug further would be greatly appreciated.
#3
The only difference is the rule allowing clients in the Guest net to WAN; no other rules have been modified. I will keep watching whether it works without erratic failures with this broader rule and then try to start narrowing down again (and if that doesn't change anything, start up the backup firewall again).
#4
Thanks for the pointers towards debugging options!

I have completely shut down the backup firewall for the time being, just to be certain that CARP is not part of the problem (I didn't expect it based on previous experience, but it's good to be clear). As of a day ago, no host could have received any secondary RAs even if the backup firewall had restarted radvd without my noticing.

I have checked the Aliases definitions under Firewall -> Diagnostics, and they are all correct. Also, just to be sure, I have added another debugging "pass" rule to the Guest incoming interface from any to any (non-quick, IPv4+6, all protocols, with logging).

At the moment, after manually re-connecting my current Android test client, it seems to work and this debugging firewall rule engages. I will wait and see if it stops again at some point and debug further. So far, if it continues to work, I am still puzzled why this debugging rule might hit but the other one won't match all those packets.
#5
I don't think that the clients are dropping their default routes or losing neighbor discovery tables. As you can see in my initial post, the OPNsense host sees the packets on the incoming interface. They just never get forwarded to the WAN side, and I see firewall block log entries. My hypothesis on why it works on a reconnect of the Android device - with new source IPv6 address - is that this creates a new state in the pf firewall that allows the packets to be forwarded. Then, shortly after, _something_ happens to that state and the connections start to drop.

What I don't understand is _why_ the firewall rule described in my original post doesn't match on some of those incoming packets. Are they deemed invalid because of some packet flags? Is the pf state dropped for some reason? (I have already switched the firewall state behavior to conservative, with no change on this issue.)

I didn't previously mention that during the experiments, I manually stop the radvd on the backup firewall, so only one default route is pushed to the clients.
#6
Thanks for the quick reply! The Guest interface from which I generated these tcpdump and live firewall logs doesn't actually have any Virtual IP Aliases on it at the moment (some other VLANs do, and the Android devices on those behave similarly). This Guest interface only has a static IPv4 address (which works without an issue on the Android devices) and the IPv6 address tracking the WAN assigned prefix.

These are (hopefully relevant parts of) the interface details for Guest:

Media 1000baseT <full-duplex>
Media (Raw) Ethernet autoselect (1000baseT <full-duplex>)
Status up
nd6 flags performnud, auto_linklocal
Routes 192.168.65.0/24
       2a03:fa00:650:31::/64
       fe80::%igb2_vlan65/64
Identifier opt5
Description Guest
Enabled true
Link Type static
addr4 192.168.65.254/24
addr6 2a03:fa00:650:31:20d:b9ff:fe58:5e2a/64
IPv4 Addresses 192.168.65.254/24
               192.168.65.1/24 vhid 6
IPv6 Addresses 2a03:fa00:650:31:20d:b9ff:fe58:5e2a/64
               fe80::20d:b9ff:fe58:5e2a/64
VLAN Tag 65
Gateways
Driver vlan1
...
Line Rate 1.00 Gbit/s
Packets Received 154404
Input Errors 0
Packets Transmitted 204150
Output Errors 0
Collisions 0

For the Guest interface, "Allow manual adjustment of DHCPv6 and Router Advertisements" is set. DHCPv4 is served by Dnsmasq, and no DHCPv6 at the moment (as Android devices won't use it anyways). Router advertisements (radvd, not dnsmasq so far) are set to Stateless with "Advertise Default Gateway" ticked and Source Address to "Automatic".
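
As a cross-check of the interface details above, here is a minimal Python sketch (not OPNsense code) of how a modified EUI-64 interface identifier is derived from a MAC address. The MAC 00:0d:b9:58:5e:2a is the firewall's address on this VLAN as it appears in the tcpdump capture in the original post; the function names are hypothetical and only serve the illustration. It also shows that the Android source address is not EUI-64 based, which fits a randomized SLAAC address.

```python
import ipaddress


def eui64_interface_id(mac: str) -> int:
    """Return the modified EUI-64 interface identifier for a 48-bit MAC."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit of the first octet and insert ff:fe in the middle.
    modified = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    return int.from_bytes(modified, "big")


def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with the EUI-64 identifier derived from the MAC."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch only handles /64 prefixes")
    return ipaddress.IPv6Address(int(net.network_address) | eui64_interface_id(mac))


def looks_like_eui64(address: str) -> bool:
    """True if the interface identifier carries the ff:fe marker bytes."""
    iid = int(ipaddress.IPv6Address(address)) & ((1 << 64) - 1)
    return iid.to_bytes(8, "big")[3:5] == b"\xff\xfe"


print(slaac_address("2a03:fa00:650:31::/64", "00:0d:b9:58:5e:2a"))
print(slaac_address("fe80::/64", "00:0d:b9:58:5e:2a"))
print(looks_like_eui64("2a03:fa00:650:31:b307:9f59:86f:33a9"))
```

The two printed addresses should correspond to the global and link-local IPv6 addresses listed for the Guest interface.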
#7
Hi everybody,

I have been bumping my head against this issue for too many months and would really like to solve this now, but don't know where to continue debugging.

Issue: Android clients on WiFi often fail to connect through OPNsense to the WAN over IPv6, while all Linux hosts work fine, both with manually assigned static IPv6 addresses and with addresses generated by SLAAC with privacy extensions.

Setup: I run a cluster of two OPNsense firewalls. IPv6 address space is assigned by the FTTH provider with DHCPv6-PD over PPPoE on the respective master firewall. This works without issues, and I assign a derived address to the WAN interface (PPPoE) through a Virtual IP Alias to make sure the firewall itself is globally reachable through IPv6. All local interfaces (on VLANs assigned to a trunk port) are tracking the WAN interface for IPv6 with unique prefixes. Additionally, some of the local interfaces have ULA Virtual IP Aliases (statically as well as through CARP tied to the IPv4 shared address) for local IPv6 connectivity when the WAN side is down.

All_local is an interface group around all local interfaces (those that are neither WAN nor VPN).

Firewall rules: Fairly simple setup for each of the local interfaces: e.g. for the Guest interface, the first interface-specific rule is
  • pass
  • non-quick
  • interface: Guest
  • Direction: in
  • TCP/IP Version: IPv4/IPv6
  • Protocol: any
  • Source: Guest net
  • Destination: (inverted) All_local net
  • Logging: occasionally enabled while I am debugging. I have also experimented with the advanced "State Type" set to "sloppy", but that does not seem to change anything.

When I watch the traffic on this interface locally on the firewall shell with tcpdump, I see the traffic from the (randomized) MAC address of an Android device trying to run https://test-ipv6.com:

20:33:55.367693 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40900 > 2600:3c0e::f03c:94ff:fed0:118c.443: Flags [S], seq 3874906471, win 65535, options [mss 1432,sackOK,TS val 502769255 ecr 0,nop,wscale 8], length 0
20:33:55.414492 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40510 > 2a01:7e01::f03c:94ff:fed0:4087.443: Flags [S], seq 1619834821, win 65535, options [mss 1432,sackOK,TS val 2932320577 ecr 0,nop,wscale 8], length 0
20:33:55.437239 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40512 > 2a01:7e01::f03c:94ff:fed0:4087.443: Flags [S], seq 4131834176, win 65535, options [mss 1432,sackOK,TS val 2932320598 ecr 0,nop,wscale 8], length 0
20:33:55.463117 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.47108 > 2a01:7e01:e001:8a6::1280.443: Flags [S], seq 1539005391, win 65535, options [mss 1432,sackOK,TS val 2440076159 ecr 0,nop,wscale 8], length 0
20:33:55.473900 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40516 > 2a01:7e01::f03c:94ff:fed0:4087.443: Flags [S], seq 3851900761, win 65535, options [mss 1432,sackOK,TS val 2932320635 ecr 0,nop,wscale 8], length 0
20:33:55.479104 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40532 > 2a01:7e01::f03c:94ff:fed0:4087.443: Flags [S], seq 3848844477, win 65535, options [mss 1432,sackOK,TS val 2932320641 ecr 0,nop,wscale 8], length 0
20:33:55.620392 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40916 > 2600:3c0e::f03c:94ff:fed0:118c.443: Flags [S], seq 1870185323, win 65535, options [mss 1432,sackOK,TS val 502769508 ecr 0,nop,wscale 8], length 0
20:33:55.667322 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40536 > 2a01:7e01::f03c:94ff:fed0:4087.443: Flags [S], seq 2725483835, win 65535, options [mss 1432,sackOK,TS val 2932320829 ecr 0,nop,wscale 8], length 0
20:33:55.690199 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40550 > 2a01:7e01::f03c:94ff:fed0:4087.443: Flags [S], seq 3188154451, win 65535, options [mss 1432,sackOK,TS val 2932320851 ecr 0,nop,wscale 8], length 0
20:33:55.716457 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.47118 > 2a01:7e01:e001:8a6::1280.443: Flags [S], seq 1085541019, win 65535, options [mss 1432,sackOK,TS val 2440076413 ecr 0,nop,wscale 8], length 0
20:33:55.725721 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40552 > 2a01:7e01::f03c:94ff:fed0:4087.443: Flags [S], seq 2983529768, win 65535, options [mss 1432,sackOK,TS val 2932320887 ecr 0,nop,wscale 8], length 0
20:33:56.392231 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40900 > 2600:3c0e::f03c:94ff:fed0:118c.443: Flags [S], seq 3874906471, win 65535, options [mss 1432,sackOK,TS val 502770279 ecr 0,nop,wscale 8], length 0
20:33:56.419442 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40510 > 2a01:7e01::f03c:94ff:fed0:4087.443: Flags [S], seq 1619834821, win 65535, options [mss 1432,sackOK,TS val 2932321581 ecr 0,nop,wscale 8], length 0
20:33:56.487596 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.47108 > 2a01:7e01:e001:8a6::1280.443: Flags [S], seq 1539005391, win 65535, options [mss 1432,sackOK,TS val 2440077184 ecr 0,nop,wscale 8], length 0
20:33:56.644072 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40916 > 2600:3c0e::f03c:94ff:fed0:118c.443: Flags [S], seq 1870185323, win 65535, options [mss 1432,sackOK,TS val 502770531 ecr 0,nop,wscale 8], length 0
20:33:56.675675 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40536 > 2a01:7e01::f03c:94ff:fed0:4087.443: Flags [S], seq 2725483835, win 65535, options [mss 1432,sackOK,TS val 2932321837 ecr 0,nop,wscale 8], length 0
20:33:56.739577 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.47118 > 2a01:7e01:e001:8a6::1280.443: Flags [S], seq 1085541019, win 65535, options [mss 1432,sackOK,TS val 2440077436 ecr 0,nop,wscale 8], length 0

but no replies. In the firewall live log, when filtering for this (randomized) SLAAC address, I see some pass, but also some occasional block messages:

Guest In 2026-01-02T20:33:15 TCP [2a03:fa00:650:31:b307:9f59:86f:33a9]:37112 [2a01:7e01:e001:8a6::1280]:443 pass Allow Guest to WAN
Guest In 2026-01-02T20:33:15 TCP [2a03:fa00:650:31:b307:9f59:86f:33a9]:39150 [2a01:7e01::f03c:94ff:fed0:4087]:443 pass Allow Guest to WAN
Guest In 2026-01-02T20:33:12 TCP [2a03:fa00:650:31:b307:9f59:86f:33a9]:59330 [2a00:1450:4001:80e::200a]:443 block Default deny / state violation rule
Guest In 2026-01-02T20:33:03 TCP [2a03:fa00:650:31:b307:9f59:86f:33a9]:60540 [2a00:1450:4001:831::2003]:443 pass Allow Guest to WAN
Guest In 2026-01-02T20:33:03 TCP [2a03:fa00:650:31:b307:9f59:86f:33a9]:60530 [2a00:1450:4001:831::2003]:443 pass Allow Guest to WAN
Guest In 2026-01-02T20:33:00 TCP [2a03:fa00:650:31:b307:9f59:86f:33a9]:60040 [2a01:7e01::f03c:94ff:fed0:4087]:443 pass Allow Guest to WAN
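
To get an overview of how often the rule passes versus blocks for a given source address, live-log lines in the format shown above can be tallied with a short script. This is only a sketch: the regular expression is an assumption derived from the pasted lines (it is not an official OPNsense log format definition), and the names `LOG_LINE` and `tally_actions` are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the pasted live-log lines; an assumption, not a documented format.
LOG_LINE = re.compile(
    r"^(?P<iface>\S+)\s+(?P<direction>In|Out)\s+(?P<ts>\S+)\s+(?P<proto>\S+)\s+"
    r"\[(?P<src>[0-9a-f:]+)\]:(?P<sport>\d+)\s+"
    r"\[(?P<dst>[0-9a-f:]+)\]:(?P<dport>\d+)\s+"
    r"(?P<action>pass|block)\s+(?P<label>.*)$"
)

SAMPLE = """\
Guest In 2026-01-02T20:33:15 TCP [2a03:fa00:650:31:b307:9f59:86f:33a9]:39150 [2a01:7e01::f03c:94ff:fed0:4087]:443 pass Allow Guest to WAN
Guest In 2026-01-02T20:33:12 TCP [2a03:fa00:650:31:b307:9f59:86f:33a9]:59330 [2a00:1450:4001:80e::200a]:443 block Default deny / state violation rule
Guest In 2026-01-02T20:33:03 TCP [2a03:fa00:650:31:b307:9f59:86f:33a9]:60540 [2a00:1450:4001:831::2003]:443 pass Allow Guest to WAN
"""


def tally_actions(lines, source):
    """Count (action, rule label) pairs for log lines from one source address."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match["src"] == source:
            counts[(match["action"], match["label"])] += 1
    return counts


counts = tally_actions(SAMPLE.splitlines(), "2a03:fa00:650:31:b307:9f59:86f:33a9")
for (action, label), n in counts.most_common():
    print(f"{n:3d}  {action:5s}  {label}")
```

Run over a longer export, this would show whether the blocks cluster in time or by destination, which the few lines pasted here cannot establish.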

The weirdest thing is that, when disconnecting the Android device from WiFi and reconnecting (forcing a refresh of the SLAAC addresses), it occasionally works for a brief period (in the range of minutes) before it fails again. Other Linux hosts work consistently. I have tried many different things during debugging (disabling source or destination matches, adding an explicit only-IPv6 rule, etc.) and am completely confused why that rule might not match some of the packets. Any hints would be greatly appreciated.
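
The tcpdump capture above contains several SYNs that repeat with the same source port and sequence number roughly one second later, i.e. retransmissions of unanswered connection attempts. A minimal sketch for spotting these automatically follows; the regular expression is an assumption based on the pasted `tcpdump` output (IPv6 only), and the names `SYN_LINE` and `find_retransmitted_syns` are hypothetical.

```python
import re
from collections import defaultdict

# Matches IPv6 TCP SYN lines as pasted above, with or without the link-layer header.
SYN_LINE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+) .*?"
    r"(?P<src>[0-9a-f:]+)\.(?P<sport>\d+) > "
    r"(?P<dst>[0-9a-f:]+)\.(?P<dport>\d+): "
    r"Flags \[S\], seq (?P<seq>\d+)"
)

SAMPLE = """\
20:33:55.367693 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40900 > 2600:3c0e::f03c:94ff:fed0:118c.443: Flags [S], seq 3874906471, win 65535, options [mss 1432,sackOK,TS val 502769255 ecr 0,nop,wscale 8], length 0
20:33:55.414492 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40510 > 2a01:7e01::f03c:94ff:fed0:4087.443: Flags [S], seq 1619834821, win 65535, options [mss 1432,sackOK,TS val 2932320577 ecr 0,nop,wscale 8], length 0
20:33:56.392231 02:a2:4a:0c:29:fd > 00:0d:b9:58:5e:2a, ethertype IPv6 (0x86dd), length 94: 2a03:fa00:650:31:b307:9f59:86f:33a9.40900 > 2600:3c0e::f03c:94ff:fed0:118c.443: Flags [S], seq 3874906471, win 65535, options [mss 1432,sackOK,TS val 502770279 ecr 0,nop,wscale 8], length 0
"""


def to_seconds(ts):
    """Convert an HH:MM:SS.ffffff timestamp to seconds since midnight."""
    hours, minutes, seconds = ts.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def find_retransmitted_syns(lines):
    """Return {flow key: [timestamps]} for SYNs seen more than once."""
    seen = defaultdict(list)
    for line in lines:
        match = SYN_LINE.match(line.strip())
        if match:
            key = (match["src"], int(match["sport"]),
                   match["dst"], int(match["dport"]), int(match["seq"]))
            seen[key].append(to_seconds(match["ts"]))
    return {key: times for key, times in seen.items() if len(times) > 1}


for key, times in find_retransmitted_syns(SAMPLE.splitlines()).items():
    gaps = [round(b - a, 3) for a, b in zip(times, times[1:])]
    print(f"port {key[1]} -> [{key[2]}]:{key[3]} retransmitted, gaps (s): {gaps}")
```

A SYN that is retransmitted without any SYN-ACK in the capture is consistent with the packets being dropped on the firewall rather than lost on the WiFi side, since tcpdump on the Guest interface already sees them arrive.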
#8
22.1 Legacy Series / Re: DNS aliases not resolving
March 11, 2022, 12:18:39 PM
I have exactly the same issue.
#9
Hi everybody,

I am currently facing a strange issue that I can't understand, and I hope that somebody can enlighten me.

Setting:
* Upstream connection (igb0) is behind an ISP router and hence already behind a NAT. MTU on that interface is 1500.
* Wireguard tunnel to another OPNsense instance, MTU of these interfaces manually set to 1420.
* Pinging from a Linux host on one side to a Linux server on another side using "ping -4 -s 1392 -M want <serverIP>" works as expected and replies arrive with latency of 10-11ms.
* Doing the same with "ping -4 -s 1394 -M want <serverIP>" yields on the Linux client:

PING <serverIP> (<serverIP>) 1394(1422) bytes of data.
From 192.168.64.254 icmp_seq=1 Frag needed and DF set (mtu = 1420)


(192.168.64.254 is the static IP on the LAN interface of OPNsense, while the default route of the Linux client is 192.168.64.1, which is a virtual CARP IP. This seems to work as expected, with the exception of replies not being received.)
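
The arithmetic behind the two ping sizes can be sketched as follows. This assumes a 20-byte IPv4 header without options and the 8-byte ICMP echo header; the function names are hypothetical. It shows why a payload of 1392 bytes exactly fills the 1420-byte tunnel MTU while 1394 bytes needs two fragments.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def ipv4_packet_size(ping_payload):
    """Total IPv4 packet size for a ping with the given -s payload size."""
    return ping_payload + ICMP_HEADER + IPV4_HEADER


def fragment_payloads(ping_payload, mtu):
    """Sizes of the IP payload carried by each fragment at the given MTU."""
    total = ping_payload + ICMP_HEADER
    if total + IPV4_HEADER <= mtu:
        return [total]  # fits in a single packet, no fragmentation
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    sizes = []
    remaining = total
    while remaining > per_fragment:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)
    return sizes


for payload in (1392, 1394):
    print(payload, ipv4_packet_size(payload), fragment_payloads(payload, 1420))
```

For the 1394-byte case this gives a first fragment carrying 1400 bytes and a second carrying the remaining 2 bytes, which lines up with the "length 1400" packet followed by a bare "ip-proto-1" fragment in the capture on wg1.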

Sniffing on the firewall itself on the Wireguard interface:

root@firewall2:~ # tcpdump -n -i wg1 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wg1, link-type NULL (BSD loopback), capture size 262144 bytes
11:44:11.804655 IP <clientIP> > <serverIP>: ICMP echo request, id 77, seq 513, length 1400
11:44:11.804689 IP <clientIP> >  <serverIP>: ip-proto-1
11:44:11.815346 IP  <serverIP> > <clientIP>: ICMP echo reply, id 77, seq 513, length 1400
11:44:11.815501 IP  <serverIP> > <clientIP>: ip-proto-1


So the (fragmented) replies are coming back in through the Wireguard tunnel.

However, these are dropped by the default rule:

VPN_INS      2022-03-11T12:01:49   <serverIP>   <clientIP>   icmp   Default deny rule

(VPN_INS is the assigned interface name for wg1.)

What I don't understand: Why would unfragmented ICMP replies correctly match the state table entry caused by the ICMP echo request, but wouldn't match when fragmented? Is interface scrubbing (which is not turned off) messing up the replies here?
#10
22.1 Legacy Series / Re: IPv6 working properly???
February 02, 2022, 08:13:42 PM
I can confirm that (on an APU4d4) IPv6 PD/track based advertisements are only working without the option "Allow manual adjustment of DHCPv6 and Router Advertisements" on 22.1 at the moment. When selected, RAs just stop shortly after restarting radvd and dhcpv6 doesn't seem to respond reliably. I have not yet tried to debug the differences in radvd.conf as generated.
#11
Another confirmation that clearing cache in Firefox solved this issue.
#12
This seems to be fixed in 21.1.7. Thanks!
#13
With my standard configuration, unbound can no longer start after updating to 21.1.6. I tracked down the issue to a wrong format being exported to /var/unbound/host_entries.conf when multiple domains are listed in the "domain search list": then all the domains are added to the static lease entry in /var/unbound/host_entries.conf instead of only the "domain name" for the respective host. This prevents unbound from starting.

The current workaround is to remove additional domains from the "domain search list" and set it the same as "domain name", but this breaks some other use cases in my setup.
#15
The concerning bit is the heavy side effect of having IPsec enabled for completely unrelated traffic. It points to a general performance bottleneck in the kernel.