OPNsense Forum

Archive => 21.1 Legacy Series => Topic started by: danielm on May 04, 2021, 12:26:26 pm

Title: [21.1.5] IPv6 unstable with DHCPv6 and RA "Managed"
Post by: danielm on May 04, 2021, 12:26:26 pm
I've been getting complaints from workers in both our offices that network connectivity on their Windows 10 machines is sometimes unstable.
Upon further inspection, I found that the problem was IPv6, which Windows uses as default if available.
The interface for the office machines is configured with DHCPv6 and the RAs set to "Managed" mode.
The behaviour is odd: the DHCPv6 server often hands out IPv6 addresses, but they don't persist for long. They keep disappearing and coming back, which results in unstable connectivity.
In that network, IPv4 is perfectly stable, so I don't suspect a physical problem with the network.
Is this a known problem with this version?
I am looking around for a solution, but I think everything is configured correctly, and I don't know what to do besides disabling DHCPv6 and trying SLAAC.

EDIT: I forgot to mention that there doesn't seem to be a general problem with IPv6 on the network. At both locations we have statically configured IPv6 machines with stable connectivity. There is also another subnet here, not running in a VLAN, that uses DHCPv6 + RA "Managed" and is stable.
So maybe it only happens on the VLAN networks.
I also tried both the "dynamic" and "static" RA settings, but it doesn't seem to make much of a difference regarding stability. Both locations use statically assigned /56 prefixes obtained from the ISP via DHCP, so both settings should work if I understand correctly.
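For illustration, here is a minimal sketch of how a /56 delegation divides into per-VLAN /64 networks. The prefix 2001:db8:0:100::/56 is a documentation placeholder, not the poster's actual prefix, and the variable names are made up for this example.

```python
import ipaddress

# Documentation prefix (RFC 3849) standing in for the ISP-assigned /56.
# This is an assumption for illustration only.
delegated = ipaddress.ip_network("2001:db8:0:100::/56")

# Every /64 that fits inside the /56; each one can serve a single
# interface or VLAN.
vlan_subnets = list(delegated.subnets(new_prefix=64))

print(len(vlan_subnets))   # number of /64 networks available
print(vlan_subnets[0])     # first /64 in the delegation
print(vlan_subnets[-1])    # last /64 in the delegation
```

A /56 leaves 8 bits for the subnet ID, so there is room for 256 separate /64 networks.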
Title: Re: [21.1.5] IPv6 unstable with DHCPv6 and RA "Managed"
Post by: Greelan on May 04, 2021, 01:02:37 pm
Probably need more details as to how you have allocated IPv6 subnets to each VLAN, eg a /64? Maybe post relevant configs
Title: Re: [21.1.5] IPv6 unstable with DHCPv6 and RA "Managed"
Post by: priller on May 04, 2021, 02:49:18 pm

Possibly related to this?:

DHCPv6 server intermittently unresponsive, not responding to solicits
https://github.com/opnsense/core/issues/4691

Title: Re: [21.1.5] IPv6 unstable with DHCPv6 and RA "Managed"
Post by: danielm on May 04, 2021, 03:37:52 pm
Quote from: Greelan on May 04, 2021, 01:02:37 pm
Probably need more details as to how you have allocated IPv6 subnets to each VLAN, eg a /64? Maybe post relevant configs

Correct, I have assigned a /64 subnet and use a subrange of it as the DHCPv6 range. It is set as <prefix>:1:: - <prefix>:1::ffff, so the range has 16^4 addresses in it. The IPv6 address is set manually on the interface, too, because I had trouble with the automatic options. I don't want to post screenshots, to keep the prefix private.
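As a quick check of the range arithmetic, the following sketch counts the addresses in such a pool. The /64 used here (2001:db8:0:101::/64) is a documentation placeholder standing in for the private prefix.

```python
import ipaddress

# Placeholder /64 for the VLAN (assumption; the real prefix is private).
prefix = ipaddress.ip_network("2001:db8:0:101::/64")

# DHCPv6 pool of the form <prefix>:1:: - <prefix>:1::ffff
range_start = ipaddress.ip_address("2001:db8:0:101:1::")
range_end = ipaddress.ip_address("2001:db8:0:101:1::ffff")

# Both bounds are inclusive, hence the +1.
pool_size = int(range_end) - int(range_start) + 1

print(pool_size)
print(range_start in prefix and range_end in prefix)
```

The pool covers the last 16 bits of the address, which gives 16^4 = 65536 leases.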
Title: Re: [21.1.5] IPv6 unstable with DHCPv6 and RA "Managed"
Post by: danielm on May 04, 2021, 03:39:41 pm
Quote from: priller on May 04, 2021, 02:49:18 pm

Possibly related to this?:

DHCPv6 server intermittently unresponsive, not responding to solicits
https://github.com/opnsense/core/issues/4691

That sounds unnervingly similar to what I'm seeing. I will try to check with a packet capture, as suggested, whether that is the case.
Though I must say, even without having checked it, the intermittent nature of the error leads me to believe it is a software bug and not a misconfiguration.

EDIT: I just made a quick packet capture and it seems to be true. I can see DHCPv6 requests that remain unanswered by the server. They just get ignored. I will post my findings on GitHub, because AFAIK this problem *should* have been 21.7.x only.
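To make the "unanswered request" check concrete, here is a rough sketch of how client messages could be paired with server responses once the DHCPv6 UDP payloads have been extracted from a capture. It relies only on the fixed DHCPv6 header (one byte of message type followed by a three-byte transaction ID). The function names and the sample payloads are hypothetical, and relay messages as well as transaction-ID collisions between clients are ignored.

```python
# Message types from the DHCPv6 specification (RFC 8415).
CLIENT_TYPES = {1: "SOLICIT", 3: "REQUEST", 5: "RENEW", 6: "REBIND"}
SERVER_TYPES = {2: "ADVERTISE", 7: "REPLY"}


def parse_header(payload: bytes) -> tuple[int, int]:
    """Return (message type, transaction ID) from a DHCPv6 UDP payload."""
    if len(payload) < 4:
        raise ValueError("DHCPv6 header needs at least 4 bytes")
    return payload[0], int.from_bytes(payload[1:4], "big")


def unanswered(payloads: list[bytes]) -> dict[int, str]:
    """Map transaction ID -> client message type for exchanges with no response."""
    pending: dict[int, str] = {}
    for payload in payloads:
        msg_type, xid = parse_header(payload)
        if msg_type in CLIENT_TYPES:
            # Retransmissions reuse the transaction ID, so this just overwrites.
            pending[xid] = CLIENT_TYPES[msg_type]
        elif msg_type in SERVER_TYPES:
            # The server echoes the client's transaction ID in its response.
            pending.pop(xid, None)
    return pending


# Hypothetical capture: one answered SOLICIT, then two ignored messages.
sample = [
    b"\x01\x00\x00\x01",  # SOLICIT, xid 0x000001
    b"\x02\x00\x00\x01",  # ADVERTISE answering it
    b"\x03\x00\x00\x02",  # REQUEST, xid 0x000002, never answered
    b"\x01\xaa\xbb\xcc",  # SOLICIT, xid 0xaabbcc, never answered
]
result = unanswered(sample)
print(result)
```

Anything left in the result at the end of the capture is a client message the server never responded to.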
Title: Re: [21.1.5] IPv6 unstable with DHCPv6 and RA "Managed"
Post by: Maurice on May 04, 2021, 07:12:00 pm
The "intermittently unresponsive" issue was introduced for both Router Advertisements (which broke SLAAC) and DHCPv6 when OPNsense switched to FreeBSD 12.1 in 20.7. For Router Advertisements, it was eventually solved by patching radvd. So SLAAC works again in 21.1. But for DHCPv6, no fix is available yet (that includes the 21.7 development version). If you depend on DHCPv6, you'll have to use a separate DHCPv6 server for the time being. Since you have Windows clients and might have a Windows Server: The Microsoft DHCP server works just fine.

Cheers

Maurice
Title: Re: [21.1.5] IPv6 unstable with DHCPv6 and RA "Managed"
Post by: danielm on May 04, 2021, 10:27:01 pm
Quote from: Maurice on May 04, 2021, 07:12:00 pm
The "intermittently unresponsive" issue was introduced for both Router Advertisements (which broke SLAAC) and DHCPv6 when OPNsense switched to FreeBSD 12.1 in 20.7. For Router Advertisements, it was eventually solved by patching radvd. So SLAAC works again in 21.1. But for DHCPv6, no fix is available yet (that includes the 21.7 development version). If you depend on DHCPv6, you'll have to use a separate DHCPv6 server for the time being. Since you have Windows clients and might have a Windows Server: The Microsoft DHCP server works just fine.

Cheers

Maurice

Nice to know, thanks. The Windows machines are actually not in a local domain at all; they are cloud-managed with Azure, so Microsoft DHCP is not an option (and I probably wouldn't even do it with a local domain, tbh). But I will simply switch the machines to SLAAC, since that should work. We never "depended" on DHCPv6 anyway. I just used it because it seemed more convenient to be able to somewhat control the addresses, and we were already using it successfully on other subnets.
Title: Re: [21.1.5] IPv6 unstable with DHCPv6 and RA "Managed"
Post by: surly on May 04, 2021, 10:52:21 pm
I'm curious - if you set RA to "Assisted" instead of "Managed" does it survive any better?  Or perhaps it fails in the same way but typical user client traffic will flow because Windows falls back to SLAAC-obtained addresses?

Title: Re: [21.1.5] IPv6 unstable with DHCPv6 and RA "Managed"
Post by: danielm on May 04, 2021, 11:14:33 pm
Quote from: surly on May 04, 2021, 10:52:21 pm
I'm curious - if you set RA to "Assisted" instead of "Managed" does it survive any better?  Or perhaps it fails in the same way but typical user client traffic will flow because Windows falls back to SLAAC-obtained addresses?

The problem would be that in this case, connections would still fail because of the "bad" addresses coming and going, and TCP sessions going stale because of that. So I don't think I will experiment with that too much if SLAAC on its own avoids the problem for now.
Title: Re: [21.1.5] IPv6 unstable with DHCPv6 and RA "Managed"
Post by: danielm on May 05, 2021, 12:23:18 am
Quote from: Maurice on May 04, 2021, 07:12:00 pm
The "intermittently unresponsive" issue was introduced for both Router Advertisements (which broke SLAAC) and DHCPv6 when OPNsense switched to FreeBSD 12.1 in 20.7. For Router Advertisements, it was eventually solved by patching radvd. So SLAAC works again in 21.1. But for DHCPv6, no fix is available yet (that includes the 21.7 development version). If you depend on DHCPv6, you'll have to use a separate DHCPv6 server for the time being. Since you have Windows clients and might have a Windows Server: The Microsoft DHCP server works just fine.

Cheers

Maurice

Unfortunately, on the first system I tested, SLAAC also didn't work. The machines calculated their addresses correctly, but got the wrong router IP from the RA (from the wrong subnet), leading to no connectivity.
So now I have decided to turn off v6 on that interface entirely for now and wait it out until fixes appear. Hopefully, connectivity won't be affected too much by disabling v6 but I see no other sensible option for me right now.
Title: Re: [21.1.5] IPv6 unstable with DHCPv6 and RA "Managed"
Post by: Maurice on May 05, 2021, 12:44:22 am
Quote from: danielm on May 05, 2021, 12:23:18 am
The machines calculated their addresses correctly, but got the wrong router IP from the RA (from the wrong subnet), leading to no connectivity.

That's unexpected. By 'router IP' you mean the link-local source address of the RAs? I don't see how this could be affected by the RA flags. In other words, if this worked in 'Managed' mode but doesn't in 'Assisted', 'Stateless' or 'Unmanaged', something else must be going on. Anything else you changed during troubleshooting?

SLAAC is working rock solid here (multiple interfaces with a mix of 'Assisted', 'Stateless' and 'Unmanaged' RAs).
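For readers unfamiliar with the RA modes named in this thread, the sketch below models them as combinations of the RA "Managed" (M) and "Other configuration" (O) flags plus the prefix "Autonomous" (A) flag. The mapping reflects my understanding of the OPNsense documentation and should be treated as an assumption to verify there. The type and function names are made up for this example.

```python
from typing import NamedTuple


class RaMode(NamedTuple):
    managed: bool     # M flag: addresses are available via DHCPv6
    other: bool       # O flag: other configuration (e.g. DNS) via DHCPv6
    autonomous: bool  # A flag on the prefix: clients may use SLAAC


# Assumed mapping of mode names to flags; verify against the OPNsense docs.
RA_MODES = {
    "Unmanaged": RaMode(managed=False, other=False, autonomous=True),
    "Stateless": RaMode(managed=False, other=True, autonomous=True),
    "Assisted": RaMode(managed=True, other=True, autonomous=True),
    "Managed": RaMode(managed=True, other=True, autonomous=False),
}


def address_sources(mode: str) -> list[str]:
    """List the mechanisms a client may use to obtain an address in this mode."""
    flags = RA_MODES[mode]
    sources = []
    if flags.autonomous:
        sources.append("SLAAC")
    if flags.managed:
        sources.append("DHCPv6")
    return sources


for name in RA_MODES:
    print(name, address_sources(name))
```

Under this model, only "Managed" leaves clients entirely dependent on the DHCPv6 server for their addresses, while none of the flags touch the link-local source address of the RA itself.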

Cheers

Maurice
Title: Re: [21.1.5] IPv6 unstable with DHCPv6 and RA "Managed"
Post by: danielm on May 05, 2021, 03:20:49 pm
Quote from: Maurice on May 05, 2021, 12:44:22 am
Quote from: danielm on May 05, 2021, 12:23:18 am
The machines calculated their addresses correctly, but got the wrong router IP from the RA (from the wrong subnet), leading to no connectivity.

That's unexpected. By 'router IP' you mean the link-local source address of the RAs? I don't see how this could be affected by the RA flags. In other words, if this worked in 'Managed' mode but doesn't in 'Assisted', 'Stateless' or 'Unmanaged', something else must be going on. Anything else you changed during troubleshooting?

SLAAC is working rock solid here (multiple interfaces with a mix of 'Assisted', 'Stateless' and 'Unmanaged' RAs).

Cheers

Maurice

So what actually happened was that they calculated an IP within their respective subnet, and they probably got the correct DNS server. But when they tried to resolve the OPNsense hostname via DNS, it returned an IP from the wrong subnet (which is not really what I wrote, sorry). The machines also had no outbound connectivity to the internet or to other v6-enabled local subnets (which they should be able to access according to the firewall rules). So something went wrong with the routing, which is weird. Also weird: I could even see outbound packets on WAN when, for example, running "ping -6 google.de" from affected machines, but no answer was returned. As I said, I didn't troubleshoot it deeply, because I didn't want to interfere too much with the workers.
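One way to confirm the "wrong subnet" observation is to compare a client's SLAAC address with the address returned by DNS. The sketch below is a minimal example of that check. The helper name and all addresses are hypothetical documentation values.

```python
import ipaddress


def same_subnet(client_addr: str, resolved_addr: str, prefix_len: int = 64) -> bool:
    """Return True if resolved_addr lies in the client's on-link prefix."""
    client_net = ipaddress.ip_interface(f"{client_addr}/{prefix_len}").network
    return ipaddress.ip_address(resolved_addr) in client_net


# Client in the 2001:db8:0:101::/64 VLAN (placeholder addresses).
print(same_subnet("2001:db8:0:101::abcd", "2001:db8:0:101::1"))
print(same_subnet("2001:db8:0:101::abcd", "2001:db8:0:102::1"))
```

An address from another subnet is not necessarily a fault in itself, since the firewall is reachable on any of its interface addresses as long as routing and rules allow it, so this check only narrows down where to look.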