24.7.2 IPv6 woes

Started by CruxtheNinth, August 26, 2024, 08:28:06 AM

Good Morning,

since the 24.7.2 update, something is odd with IPv6.
Besides DHCPv6 on WAN taking incredibly long to get an address (with my ISP, Deutsche Glasfaser, it normally takes 30 to 60 minutes, but with 24.7.2 it takes multiple hours), something else is off.

Directly after the GUA is assigned, IPv6 seems to work normally, i.e. ping6 google.de works and various IPv6 test websites show all is well.

After a few hours, seemingly at random, IPv6 stops working. The GUA is still assigned and all clients have an address, but traffic is no longer processed.

Looking at the firewall log, it seems all traffic originating from the GUAs inside my network is dropped outbound on LAN with a state violation.

There is a test kernel for the ICMPv6 instabilities introduced in 24.7.1:

https://github.com/opnsense/src/issues/218#issuecomment-2308039278

# opnsense-update -zkr 24.7.2-nd


Cheers,
Franco

Quote from: franco on August 26, 2024, 08:31:22 AM
> There is a test kernel for the ICMPv6 instabilities introduced in 24.7.1 [...]

Thanks Franco. As a matter of fact, I already had this kernel running, but it did not resolve the issue, so I reverted to the default 24.7.2 kernel before creating this post to avoid mixing up separate issues.

I can upgrade to 24.7.2-nd again and take some captures if needed.

The other thing people mentioned was:

# opnsense-revert -z dhcp6c

(and clean reboot)

However, both changes made to dhcp6c target edge cases and should improve recovery behaviour, which for DG they may not? I'll need logs from the bad version to make sense of it.


Cheers,
Franco

Quote from: franco on August 26, 2024, 08:39:37 AM
> The other thing people mentioned was: opnsense-revert -z dhcp6c (and clean reboot) [...]


Thank you, I will test later this afternoon. If you need any debug output, pcaps or logs with the "bad" version, please let me know exactly what you need.

Set Interfaces: Settings: IPv6 DHCP "Log level" to "Info". The problem seems to appear on renew more often than on the initial request, although the reports have been unclear to me so far. Changing this option requires a reboot too, so switch to the bad version before rebooting (opnsense-revert -r 24.7.2 dhcp6c). Send me the system log from when renewing stopped working correctly so I can inspect what is going on. I'm assuming the ISP has a stricter policy on the ordering of DHCP and DHCPv6 requests, which could be annoying because there is no way to tell from the system.

My issue is fully resolved for now by running:

opnsense-revert -r 24.7.1 dhcp6c
opnsense-update -zkr 24.7.2-nd

Observations:

At first I tried opnsense-revert -z dhcp6c, as you wrote. It installed dhcp6c-20240820 and had no effect on the problem whatsoever.

To get some logs with the 24.7.2 version I then ran opnsense-revert -r 24.7.2 dhcp6c, which also installed dhcp6c-20240820, so I figured the earlier revert attempt had failed for some unknown reason.

That led me to try opnsense-revert -r 24.7.1 dhcp6c, which fixed all problems: all clients are fine, IPv6 no longer suddenly goes missing, and everything has been stable for over 10 hours now with dhcp6c-20240710.
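To avoid the confusion above about which dhcp6c build is actually installed, the version can be checked before and after each revert. A minimal sketch that only encodes the two data points from this thread; the function name is made up for illustration, and the package name passed to pkg query is assumed to be dhcp6c as the version strings above suggest:

```shell
#!/bin/sh
# Classify a dhcp6c package version using only what was reported in this thread.
# The trailing * tolerates a possible port revision suffix.
dhcp6c_status() {
    case "$1" in
        20240710*) echo "stable" ;;   # installed by: opnsense-revert -r 24.7.1 dhcp6c
        20240820*) echo "suspect" ;;  # installed by: opnsense-revert -r 24.7.2 dhcp6c
        *)         echo "unknown" ;;  # not mentioned in this thread
    esac
}

# On the firewall itself (not executed here):
#   dhcp6c_status "$(pkg query %v dhcp6c)"
```

"stable" and "suspect" only reflect the reports for Deutsche Glasfaser in this thread, not a general verdict on either build.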


> Which led me to try opnsense-revert -r 24.7.1 dhcp6c - which fixed all problems, all clients fine, no suddenly missing IPv6, all stable for over 10 hours now with dhcp6c-20240710

That's correct, yes. I fumbled that part.

I still need the bad logs. I suspect this is:

https://github.com/opnsense/dhcp6c/commit/14d87d18a71

And for some reason DG does not like that we let the RELEASE state go... but it's a clear recovery scenario...

Just want to say that I have the exact same problem as @CruxtheNinth with DGF.

I'm first testing the reverted package and the new kernel.

If this works, I'm looking forward to testing dhcp debug mode. But it's not easy for me because four other people here are not happy about router reboots :)


It's still odd that DG is the only ISP where the change seems to fail as far as reports go. But it's not as if DG didn't already have a special reputation. Maybe this is even an opportunity to contact them and ask them to fix their infrastructure? ;)


Cheers,
Franco

Quote from: franco on August 27, 2024, 12:38:47 PM
> It's still odd that DG is the only ISP where the change seems to fail as far as reports go. [...]

I attached yesterday's system.log; maybe you can spot anything odd.
I noticed some XID mismatch events that did not occur on the days before or today.
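To compare days, the mismatch lines can simply be counted per log file. A minimal sketch; the function name is made up, the match is case-insensitive because the exact wording of the log message is not quoted in this thread, and the log path in the usage comment is an assumption about the default OPNsense log layout:

```shell
#!/bin/sh
# Count lines mentioning an XID mismatch in the log text read from stdin.
count_xid_mismatch() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # that number is 0, so "|| true" keeps the function usable under "set -e".
    grep -ci 'XID mismatch' || true
}

# On the firewall (path is an assumption, adjust to the day in question):
#   count_xid_mismatch < /var/log/system/latest.log
```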

If your specific problem is DG, I can tell you this from my own experience with 3 OPNsense installations on that ISP:

1. Do not - I repeat: do not - use traffic shaping with that provider. I never got to the bottom of it, but I know for sure that if you use traffic shaping and generate a bit more traffic, IPv6 suddenly drops for a few minutes and then comes back. I can repeat that test over and over and always get the same result: ONLY IPv6 is down - IPv4 continues to work as usual.

2. For the same reason, I have one single "out" rule on WAN so that RFC1918 source or destination IPs are never exposed to the WAN. This can happen if misconfigured clients try to access default IPs that are not present in my own network(s), because OPNsense is their default router, which in turn has the ISP as its default route.

What seems to be the case is that DG drops connections when they see something they consider misuse of one kind or another (like spoofed source IPs).
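For reference, the RFC1918 ranges meant in point 2 are 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16. A minimal sketch of that classification, useful for example when checking addresses seen in a WAN capture; the function name is made up, and it only does prefix matching on dotted-quad text without validating the address:

```shell
#!/bin/sh
# Return 0 if the dotted-quad IPv4 address in $1 falls in an RFC1918 range.
is_rfc1918() {
    case "$1" in
        10.*)                                  return 0 ;;  # 10.0.0.0/8
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;  # 172.16.0.0/12
        192.168.*)                             return 0 ;;  # 192.168.0.0/16
        *)                                     return 1 ;;
    esac
}

# Example:
#   is_rfc1918 192.168.1.1 && echo "private, should not leave via WAN"
```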


Other than that, my 3 DG installations work fine with OPNsense 24.7.2.
Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 770 up, Bufferbloat A

Uwe, wasn't there something about the dreaded IPv4 connectivity switch?

It seems to be stuck in a RENEW loop, partially due to itself but I'm unsure why. We don't hit NoBind state so it could only be the patch that I mentioned earlier.

> pltime=3600 vltime=3600

This seems excessively unnecessary.
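For context on why one-hour lifetimes mean frequent renewals: RFC 8415 recommends T1 and T2 values of 0.5 and 0.8 times the shortest preferred lifetime. The T1/T2 values DG actually sends are not shown in this thread, so the following is only a rough sketch of what pltime=3600 would imply under those recommended ratios (the function name is made up):

```shell
#!/bin/sh
# Print the RFC 8415 recommended renew (T1) and rebind (T2) timers
# for a given preferred lifetime in seconds.
renew_times() {
    pltime=$1
    t1=$(( pltime / 2 ))        # 0.5 * preferred lifetime
    t2=$(( pltime * 4 / 5 ))    # 0.8 * preferred lifetime
    echo "T1=${t1}s T2=${t2}s"
}

# renew_times 3600   prints: T1=1800s T2=2880s
```

Under that assumption the client would go through a RENEW roughly every 30 minutes, so any problem in the renew path shows up quickly.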

I'll keep digging.


Cheers,
Franco

From what I understand, DG does not answer DHCPv6 IA-NA / IA-PD Solicits directly but will at some point (30 to 60 minutes later) send a DHCPv6 Advertise containing IA-NA/IA-PD that can then be requested. Not sure if that helps.
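A capture of the DHCPv6 exchange on WAN would show whether that delayed Advertise is really what happens. A minimal sketch, assuming tcpdump is available on the firewall; the function names, the interface name igb0 and the output path are placeholders and not taken from this thread:

```shell
#!/bin/sh
# Capture filter for DHCPv6: clients listen on UDP port 546,
# servers and relay agents on UDP port 547.
dhcp6_filter() {
    echo 'udp port 546 or udp port 547'
}

# Write all DHCPv6 packets seen on an interface to a pcap file.
# $1 = WAN interface (placeholder), $2 = output file (placeholder).
capture_dhcp6() {
    tcpdump -n -i "$1" -w "$2" "$(dhcp6_filter)"
}

# Example, run as root on the firewall and stop with Ctrl-C:
#   capture_dhcp6 igb0 /tmp/dhcp6-wan.pcap
```

Starting the capture before the WAN interface comes up and leaving it running for an hour or more should cover the 30 to 60 minute window described above.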