Messages - seacycle

#1
Quote from: franco on September 01, 2024, 08:34:09 AM
> igc1 and vlan01 are not currently assigned. Are you saying I'd drop the v6 configuration from WAN (pppoe1), assign vlan01 or igc1 to WAN6, and configure the 6rd from that?

Correct. I believe it would be vlan01. This should confirm it for you from the console:

# ifconfig wan_stf

Trial on 24.7.3_1:


  • Removed the 6rd configuration from WAN (assigned to the pppoe interface), save, apply.  The 6rd prefix is removed from the routing table, but the wan_stf interface remains partially configured (it still has the tv4br address configured, but no inet6 address) and is still the default v6 route.
  • Reboot, which flushes out the wan_stf interface and the associated default v6 route.
  • Assign the vlan interface under the pppoe interface to WAN6, enable. IPv4 configured as none, IPv6 configured as 6rd, save, apply.  No stf interface gets configured.  Also notice the old 6RD gateway associated with WAN is still listed, marked as defunct. Delete.
  • Reboot, still no stf interface.
  • Repeat, but use the hardware igc interface under the vlan interface (under the pppoe interface), same result.
  • Remove the WAN6 assignment, re-configure 6RD on the WAN assignment (as it was originally), and the stf interface returns with a correct 6rd configuration.

Searching logs for WAN6 I see this a bunch of times:

/interfaces.php: The command '/sbin/ifconfig 'opt6_stf' inet6 description 'WAN6 (opt6)' up' returned exit code '1', the output was 'ifconfig: interface opt6_stf does not exist'

So it seems a code path that creates the stf interface is being missed in this scenario, while later code paths that expect the interface to exist are still being hit.
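For anyone wanting to check their own logs for the same symptom, here is a minimal sketch that pulls the missing interface names out of error lines like the one above. The function name is made up for illustration, and the log path in the usage note is an assumption about where the system log lives on your install.

```shell
#!/bin/sh
# Print each interface name that ifconfig reported as nonexistent.
# Reads the files given as arguments, or stdin when none are given.
missing_interfaces() {
    sed -n 's/.*ifconfig: interface \([^ ]*\) does not exist.*/\1/p' "$@" | sort -u
}
```

Usage would be something like `missing_interfaces /var/log/system/latest.log` (path assumed), which in my case would be expected to list opt6_stf.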

With 6rd configured on the WAN assignment (pppoe), the stf interface looks like:


wan_stf: flags=1004041<UP,RUNNING,LINK2,LOWER_UP> metric 0 mtu 1280
description: WAN (wan)
options=0
inet6 2602:<redacted>:: prefixlen 24
groups: stf
v4net 0.0.0.0/32 -> tv4br 205.171.2.64
nd6 options=103<PERFORMNUD,ACCEPT_RTADV,NO_DAD>
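As a reading aid for the prefix length above: under RFC 5969 the delegated prefix is the 6rd prefix followed by the embedded IPv4 bits. The sketch below assumes all 32 bits of the WAN IPv4 address are embedded and that the 6rd prefix is nibble aligned, which may not match every ISP. The function name and the example addresses (documentation ranges) are hypothetical.

```shell
#!/bin/sh
# Sketch: derive a 6rd delegated prefix, assuming the full 32-bit IPv4
# address is appended to a nibble-aligned 6rd prefix.
#   $1: 6rd prefix as hex digits without colons (e.g. 20010db8 for 2001:db8::/32)
#   $2: WAN IPv4 address in dotted-quad form
sixrd_delegated_prefix() {
    prefix_hex=$1
    ipv4=$2
    # Split the dotted quad into positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $ipv4
    IFS=$old_ifs
    v4_hex=$(printf '%02x%02x%02x%02x' "$1" "$2" "$3" "$4")
    combined="${prefix_hex}${v4_hex}"
    # Prefix length in bits: 4 bits per hex digit.
    len=$(( ${#combined} * 4 ))
    # Pad with zeros to a whole number of 16-bit groups.
    while [ $(( ${#combined} % 4 )) -ne 0 ]; do
        combined="${combined}0"
    done
    groups=$(printf '%s' "$combined" | sed 's/..../&:/g')
    printf '%s:/%d\n' "$groups" "$len"
}
```

By the same arithmetic, a /24 6rd prefix like the one shown above would give a /56 delegation (24 + 32 bits), if the full IPv4 address is embedded.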
#2
Quote from: franco on August 29, 2024, 07:32:38 AM
Interesting note about radvd. Are you using LAN interfaces to track the WAN? Just by memory I think it's true that for 6RD/6to4 there is no event to automatically reconfigure clients. Happy to fix that, but ideally with a ticket on GitHub.

Yes, tracking the WAN interface.  I can put together a ticket.

Quote from: franco on August 29, 2024, 07:32:38 AM
I'd still like clarification on the second half:

Quote from: franco on August 26, 2024, 08:24:22 AM
The important thing going forward is that these types of setups will no longer work on 25.1 in the way they are currently performed. In 25.1 you will have to delete 6RD from your WAN and create a separate "WAN6" interface from your "port" where PPPoE is running on (something like igb1 for example). There you chose IPv4 None and IPv6 6RD and it should work as before. That being said, the same should already work on all known OPNsense versions but it was favoured by be convenient two-in-one WAN configuration which, again, has been a source of great confusion for at least a decade.

Can you:

[...]
2. Confirm that the configuration suggestions works as well on your end?

I agree that the two-in-one WAN configuration feels like an abstraction-too-far from the actual topology of the plumbing underneath, leading to confusion. I had thought that the stf tunnel setup would make more sense as an independent "Other Types" interface configuration: make an stf device, assign it to WAN6 and configure it as 6rd there. Ultimately, it doesn't seem coupled to the pppoe interface by anything other than the routing table, as far as I understand.

I'm not 100% sure I follow how I would set up what you describe on 24.7, but would like to try it out.  If I have:

igc1 - hardware interface connected to the provider's ONT (unassigned)
vlan01 - pppoe packets need to be tagged with vlan 201, so this is on igc1 (unassigned)
pppoe1 - on top of vlan01 (assigned as WAN)

igc1 and vlan01 are not currently assigned. Are you saying I'd drop the v6 configuration from WAN (pppoe1), assign vlan01 or igc1 to WAN6, and configure the 6rd from that?
#3
Quote from: franco on August 26, 2024, 08:24:22 AM
@seacycle

Thanks for reaching out. TLDR: you're looking for https://github.com/opnsense/core/commit/947e61b1a5

# opnsense-patch 947e61b1a5

...

Can you:

1. Confirm that the patch works? I'd add that to 24.7.3 of course.
2. Confirm that the configuration suggestions works as well on your end?

Please don't go, we need you for this. :)

The patch does appear to work with the convoluted 6rd-over-pppoe-over-a-tagged-vlan that CenturyLink requires, thanks! (There are some issues with radvd not picking up the prefix reliably without a manual restart, but that pre-dates 24.7.)
#4
Quote from: franco on August 26, 2024, 08:24:22 AM
@seacycle

Thanks for reaching out. TLDR: you're looking for https://github.com/opnsense/core/commit/947e61b1a5

# opnsense-patch 947e61b1a5

The long version: Oh boy. This is one of the effects of confusion that "IPv4 connectivity" has created, because 6RD and 6TO4 do not work over PPPoE at all. So this is a side-by-side configuration, which is pretty mind-boggling considering your ISP goes through the effort to bring you online via PPPoE tunnel and gives you IPv6 outside the PPPoE tunnel... ok, why not? ;)

I've always felt the disconnect between the "simple" 6rd selection in the WAN interface configuration, and the actual underlying plumbing it sets up to be somewhat confusing. I do think the 6rd/6to4 configuration would make more sense relegated to the "Other Types" section along with GRE and friends, more or less completely independently configured, with its own assignment and firewall rules, as you describe, and would absolutely endorse that approach going forward.

6rd over pppoe is the jankiest of all configurations an ISP could possibly offer for v6. But at about 2.7 million broadband subscribers, CenturyLink isn't quite in the completely ignorable category here in the US. (And their symmetric fiber ipv4 performance, where I am, blows Comcast away, for less than half the price.)

Quote from: franco on August 26, 2024, 08:24:22 AM
I'm going to assume 6RD still works on 24.7 despite the visibility glitch?

I'm only occasionally at the physical location of this router, which makes testing different WAN configurations difficult, but I should be able to verify by Thursday. I've got 24.7.2 loaded up in a separate boot environment so I can flip back and forth easily.  What I recall from my brief testing was that (a) the wan_stf interface kept its configuration through the upgrade process and obtained a valid v6 prefix usable for outbound traffic from the router, but (b) clients on the LAN side couldn't make v6 connections through the router to the outside. I didn't get as far as identifying where the failure was.
#5
I updated from the latest 24.1 to 24.7.2 (no patches from this thread applied). I'm cursed by an ISP supporting only 6rd.  The 6rd tunnel configuration UI is completely missing from the PPPoE interface configuration in 24.7.2?  The IPv6 option shows up as "None" after the update. The wan_stf tunnel device does remain configured through the upgrade process, and gets an address from the ISP, but isn't passing outbound v6 traffic from the LAN side, though v6 from the router itself gets out fine.

I didn't have a lot of time to diagnose, so I rolled back to 24.1 (thank you zfs boot environments!) until I have more time at the physical location of the router.

Will 6rd tunnel support over a PPPoE interface be returning in some form?
#6
The router advertisements from your Apple TV are not bogus, but they may also not be at all useful if you have no thread devices you want to communicate with. (The Apple TV acts as a border router for a network of thread devices. Thread is all IPv6. If you have no thread devices, you don't need a border router, but in typical Apple fashion, there are no knobs to turn it off on the Apple TV device itself. Thread capable HomePods do the same thing.)

I'd suggest disabling whatever you did to block the router advertisements from the Apple TV and see if that resolves the problem. Unplug the Apple TV from the network for the experiment if you don't want it involved while the defenses are disabled. If that fixes it, then the block you applied may be too broad in scope.
#7
Does a rebooted client never get an address, or does it get one if you wait long enough for radvd to send out an unsolicited router advertisement? Between 200 and 600 seconds may be the default configuration, but you can adjust it way down for testing. Restarting radvd will send one out immediately.

If it never gets an address, maybe radvd is falling over somehow.  If it does get one within 600 seconds, I would be suspicious that something is blocking the router solicitation messages from the client getting to the router.  A tcpdump to capture router advertisements (icmp6 134) and router solicitations (icmp6 133) done both on the client and on the router should clarify what is going on.

Just testing on a macos client, capturing both (adjust for whatever network interface is relevant on your client):

sudo tcpdump -vvv -ttt -i en0 icmp6 and \( 'ip6[40] = 133' or 'ip6[40] = 134' \)

Unplugging and re-plugging the ethernet cable, I see the solicitation and an immediate response:


tcpdump: listening on en0, link-type EN10MB (Ethernet), snapshot length 524288 bytes
00:00:00.000000 IP6 (flowlabel 0xd0000, hlim 255, next-header ICMPv6 (58) payload length: 8) flux.local > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 8
00:00:00.008806 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 144) fe80::2e2:69ff:fe53:c44e > flux.local: [icmp6 sum ok] ICMP6, router advertisement, length 144


If you see the solicitation, but no advertisement in response, then run a similar tcpdump on the router side and see if the solicitation makes it there. If it doesn't, then something in between ate it, and that is your problem.  If the solicitation does show up on the router, but it doesn't send a response out, something is likely up with radvd.
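To make that comparison less eyeball-driven, here is a rough sketch that counts solicitations and advertisements in saved tcpdump text output like the capture above. The function name is made up, and it only matches the message names as tcpdump prints them in that capture.

```shell
#!/bin/sh
# Count router solicitations and advertisements in tcpdump text output.
# Reads the files given as arguments, or stdin when none are given.
nd_summary() {
    awk '
        /router solicitation/  { rs++ }
        /router advertisement/ { ra++ }
        END {
            printf "solicitations=%d advertisements=%d\n", rs, ra
            if (rs > 0 && ra == 0)
                print "solicitations seen but no advertisements: check the router side"
        }
    ' "$@"
}
```

Save the capture to a file on the client and on the router (for example by redirecting the tcpdump command above), then run `nd_summary` on each file and compare the counts.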

Since you say that clients get an address on restarting radvd, I believe that shows that the router advertisements can get from the router to the client, so that is unlikely to be the problem.

In my case, I once had a buggy firmware revision on a wifi access point that was dropping router solicitations, causing clients to not get an address until the unsolicited advertisement rolled around up to 10 minutes later.  If it isn't a bug like that, an intermediate switch with inappropriate ACLs on it, or some firewall rule on either the client or the router, could be blocking the necessary icmp6 messages.
#8
I'd start with:

Interfaces > Settings > IPv6 DHCP > Log level = debug

And then inspect System > Log files > General, searching for "dhcp6c"

dhcp6c is the component getting the WAN v6 interface address, the prefix delegation, and then assigning addresses to the LAN interfaces set to "track" the WAN prefix delegation.
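If you prefer the console to the GUI log viewer, something like the following sketch should show the same entries. The default log path is an assumption about where the general system log lives on your version, so pass the real path as the first argument if it differs. The function name is made up.

```shell
#!/bin/sh
# Show the most recent dhcp6c lines from a log file.
#   $1: log file (default path is an assumption, adjust as needed)
#   $2: number of trailing lines to show (default 50)
dhcp6c_log() {
    logfile=${1:-/var/log/system/latest.log}
    grep 'dhcp6c' "$logfile" | tail -n "${2:-50}"
}
```

For example, `dhcp6c_log /path/to/general.log 100` would print the last 100 dhcp6c entries from that file.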

That your LAN interface has only a link local address suggests something going wrong with dhcp6c. I suspect that the failure of dhcpd6, which operates on the LAN side of things, is a symptom, not a cause.

I'm curious about the "Use IPv4 connectivity" in the DHCPv6 client configuration. I guess that would be highly ISP specific, and isn't appropriate for the ISPs I'm familiar with. Is your ISP's v6 service through a v4 tunnel?

If your v6 delegated prefix is truly static, have you tried manually configuring the LAN v6 interface with a /64 from that prefix?
#9
Oh, this is interesting: https://kb.isc.org/docs/aa-00621

Quote: Normally binding to a reserved port on FreeBSD requires the process to be be running as root. For most uses this is not a problem as named binds to port 53 before changing user id; however, if you are running in a environment where interface addresses are changing this can be a issue. FreeBSD has a kernel module, mac-portacl, that will allow a non-privileged user to bind to specified ports.

Update: confirmed, this works. It is still not optimal: named stops listening while dhcp6c is fiddling with the addresses, the initial attempt to listen on them again fails (maybe because the addresses are waiting for DAD to complete and not yet bindable?), and they only get added a minute later.

My workaround is a statically configured ULA address alias on lo0 that I put in the router advertisements for DNS.

Update 2: confirmed that duplicate address detection is at the root of named failing to initially bind to the "new" addresses when dhcp6c fiddles with the interface addresses.  With DAD disabled, named is successful binding to the addresses on the first try.

So in summary, there are two problems here:

1. named as configured in the plugin to listen on all addresses will track when addresses change, BUT when they do change, named is unable to bind to the updated addresses because it dropped root privileges after the initial scan-and-bind pass at service start. This can be addressed by the method described in https://kb.isc.org/docs/aa-00621, or by restarting named.
2. When detecting new IPv6 addresses, named will initially fail to listen due to a race condition between named attempting to listen and duplicate address detection. Unfortunately, this is common because dhcp6c as configured will periodically remove and re-add a prefix-delegation derived address even when the address is unchanged. This can be addressed by disabling duplicate address detection, or by restarting named.
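The two non-restart workarounds above can be sketched as a handful of FreeBSD commands. The function below only prints them for review and runs nothing. The uid of 53 for the named user is an assumption (check with `id bind`), the function name is made up, and the exact set of sysctls you need may differ, so treat this as a starting point rather than a recipe.

```shell
#!/bin/sh
# Print (do not run) candidate commands for the two workarounds.
#   $1: uid that named runs as (assumed 53 if not given)
named_rebind_workarounds() {
    named_uid=${1:-53}
    cat <<EOF
# Workaround 1: let the unprivileged named user bind port 53 via mac_portacl
kldload mac_portacl
sysctl net.inet.ip.portrange.reservedhigh=0
sysctl security.mac.portacl.rules=uid:${named_uid}:tcp:53,uid:${named_uid}:udp:53
# Workaround 2: disable duplicate address detection system-wide
sysctl net.inet6.ip6.dad_count=0
EOF
}
```

Note that disabling DAD this way affects every interface, and none of these settings survive a reboot unless they are also added as tunables.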

It is not clear to me what the best solution is to these in the broader OPNsense context.

An alternate no-code-change workaround is to configure the bind plugin to only listen on statically assigned addresses.

cc: @franco, should I write this up as a bind plugin bug?
#10
So when this happens, dhcp6c is manipulating the global unicast addresses on the interfaces tracking the prefix delegation, and while this is happening, named is trying to adapt to the addresses as they come and go.  After dhcp6c is done, named is left in a state where, every minute, it tries to listen on these addresses and gets an error. A minute later it tries again, fails again, and so on until I restart named.

While in this state, I did a syscall ktrace on named and see this pattern:


18016 isc-net-0000 CALL  socket(PF_INET6,0x2<SOCK_DGRAM>,IPPROTO_IP)
18016 isc-net-0000 RET   socket 58/0x3a
18016 isc-net-0000 CALL  setsockopt(0x3a,IPPROTO_IPV6,IPV6_DONTFRAG,0x8370b11d4,0x4)
18016 isc-net-0000 RET   setsockopt 0
18016 isc-net-0000 CALL  setsockopt(0x3a,IPPROTO_IPV6,IPV6_V6ONLY,0x8370b11dc,0x4)
18016 isc-net-0000 RET   setsockopt 0
18016 isc-net-0000 CALL  setsockopt(0x3a,SOL_SOCKET,SO_REUSEPORT,0x8370b11dc,0x4)
18016 isc-net-0000 RET   setsockopt 0
18016 isc-net-0000 CALL  setsockopt(0x3a,SOL_SOCKET,SO_REUSEPORT_LB,0x8370b11dc,0x4)
18016 isc-net-0000 RET   setsockopt 0
18016 isc-net-0000 CALL  write(0xd,0x827490f45,0x1)
18016 isc-net-0000 RET   write 1
18016 isc-net-0001 RET   kevent 1
18016 isc-net-0000 CALL  socket(PF_INET6,0x2<SOCK_DGRAM>,IPPROTO_IP)
18016 isc-net-0001 CALL  read(0xc,0x839d78990,0x400)
18016 isc-net-0001 RET   read 1
18016 isc-net-0001 CALL  setsockopt(0x3a,IPPROTO_IPV6,IPV6_USE_MIN_MTU,0x839d7885c,0x4)
18016 isc-net-0001 RET   setsockopt 0
18016 isc-net-0001 CALL  ioctl(0x3a,FIONBIO,0x839d7880c)
18016 isc-net-0001 RET   ioctl 0
18016 isc-net-0001 CALL  setsockopt(0x3a,SOL_SOCKET,SO_REUSEPORT,0x839d7884c,0x4)
18016 isc-net-0001 RET   setsockopt 0
18016 isc-net-0001 CALL  getpeername(0x3a,0x839d78798,0x839d7875c)
18016 isc-net-0001 RET   getpeername -1 errno 57 Socket is not connected
18016 isc-net-0001 CALL  setsockopt(0x3a,IPPROTO_IPV6,IPV6_V6ONLY,0x839d787fc,0x4)
18016 isc-net-0001 RET   setsockopt 0
18016 isc-net-0001 CALL  bind(0x3a,0x849924d30,0x1c)
18016 isc-net-0001 RET   bind -1 errno 13 Permission denied
18016 isc-net-0001 CALL  _umtx_op(0x84a1706b0,0x8<UMTX_OP_CV_WAIT>,0,0x84a170690,0)


And not long after, the socket 0x3a gets closed.  Why would permission be denied?
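For picking the failures out of a long trace, here is a small sketch that filters kdump text in the format shown above down to the syscalls that returned -1. The function name is made up, and it assumes the same column layout as the excerpt (pid, thread name, RET, syscall, return value).

```shell
#!/bin/sh
# Print "syscall errno N message" for every syscall return of -1.
# Reads the files given as arguments, or stdin when none are given.
failed_syscalls() {
    awk '$3 == "RET" && $5 == "-1" {
        line = $4
        for (i = 6; i <= NF; i++) line = line " " $i
        print line
    }' "$@"
}
```

Typical use would be `ktrace -p <pid of named>`, wait for a retry cycle, stop tracing with `ktrace -C`, then `kdump | failed_syscalls`.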
#11
Actually, it looks like BIND does get a kick when the WAN side dhcp6c refreshes, but it errors out on listening to the global unicast addresses, for example:

23-Mar-2024 14:51:28.795 network: info: listening on IPv6 interface vlan0.1.4, <redacted-v6-address>#53
23-Mar-2024 14:51:28.795 network: error: creating IPv6 interface vlan0.1.4 failed; interface ignored

And so on through all the interfaces with dhcp6c derived addresses.

Maybe a race condition? Manually restarting BIND moments later, everything works.
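As a stopgap, the manual restart could be automated along these lines. This is only a sketch: the function names are made up, the location of the named log is left as an argument because it depends on how logging is configured, and the restart command is whatever you already use to restart BIND.

```shell
#!/bin/sh
# List interfaces that named failed to listen on, based on log lines like
# "creating IPv6 interface vlan0.1.4 failed; interface ignored".
failed_listen_interfaces() {
    sed -n 's/.*creating IPv[46] interface \([^ ]*\) failed.*/\1/p' "$@" | sort -u
}

# Run the given command if the log shows any failed listen attempts.
#   $1: named log file; remaining arguments: the restart command to run
kick_named_if_needed() {
    logfile=$1
    shift
    if [ -n "$(failed_listen_interfaces "$logfile")" ]; then
        "$@"
    fi
}
```

As written this would fire on old log entries too, so a real version would need to look only at lines newer than the last restart.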
#12
OPNsense 24.1.4-amd64, using BIND for DNS (unbound is disabled).  BIND is configured to listen on :: for IPv6, yielding this in the BIND configuration:

    listen-on-v6 port 53 { any; };

WAN side v6 address is assigned via DHCP6 with prefix delegation. The LAN interfaces are assigned from the prefix delegation (Track Interface).

The LAN side router advertisements default to advertising the interface's global unicast address, derived from the DHCP prefix delegation, as the name server address.

The problem: Each time the WAN side DHCP6 client refreshes the WAN address and prefix delegation, it also refreshes the LAN addresses tied to the prefix delegation. This is fine, the prefix is the same, the addresses are the same. But every time this happens, BIND stops listening on the prefix delegated LAN side addresses. It takes a manual restart of BIND to start listening again.

Is there a way to automatically kick BIND to re-evaluate its listening addresses when this happens?