clients losing ipv6 internet now and then

Started by dMopp, August 17, 2024, 08:40:47 AM

Well... I tried opnsense-update -kr 24.7.2 and rebooted, but it did not help.
The occasional IPv6 outages are still there.

Now I ran opnsense-update -kr 24.7 to try the previous kernel, i.e. I did not downgrade any other packages.
I will report what happens with this setup. It will be clear in the next few hours...

BR, -sjm

August 21, 2024, 02:14:15 PM #31 Last Edit: August 21, 2024, 02:29:45 PM by Wirehead
The behavior is strange though, and I'm not sure if it's related to the provider itself. See the graphs below (they show packet loss).
My internet provider for WAN01 (IPv6, provider: "Telenet") is the same provider as user "cloudz". (This provider was in the local news for having YouTube-related issues for "some time". I wonder if it's related to the loss we see, or if this is related to upstream FreeBSD.)



As you can see, this link has had issues, probably (and I say that cautiously) since the upgrade to 24.7.1.
However, at the same time, I also have a WAN02 (also IPv6, provider: "Proximus"), which doesn't exhibit the behavior.


(These small drops are purely from testing.)

@wirehead well, my guess would be that your other IPv6 provider just has different ND settings, i.e. their router is more patient while waiting for your answers to its neighbor solicitations.

Before downgrading to the 24.7 kernel I could clearly see my opnsense box answering with long delays every time it received a neighbor solicitation, and sometimes the operator's box clearly hit some timeout and I could see short IPv6 outages until ND worked again.

Well, now I am examining my IPv6 ND traffic with tcpdump and I cannot see any more delays in answering neighbor solicitation requests! My opnsense is now responding to every ND packet immediately.
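
For anyone who wants to watch the same traffic: a minimal sketch of how a tcpdump command line limited to neighbor solicitations and advertisements could be built. The interface name "igb0" and the helper name are just assumptions for illustration, and the filter assumes the ICMPv6 header directly follows the 40-byte IPv6 header (no extension headers).

```python
import shlex


def nd_tcpdump_command(interface):
    """Build a tcpdump argv that shows only ND solicitations and advertisements.

    Hypothetical helper for illustration. ICMPv6 type 135 is a neighbor
    solicitation, type 136 a neighbor advertisement. ip6[40] is the ICMPv6
    type byte only when the packet carries no IPv6 extension headers.
    """
    bpf_filter = "icmp6 and (ip6[40] == 135 or ip6[40] == 136)"
    # -n: do not resolve names, -l: line-buffered output, -i: capture interface
    return ["tcpdump", "-n", "-l", "-i", interface, bpf_filter]


# "igb0" is a placeholder; use the name of your own WAN interface.
command = nd_tcpdump_command("igb0")
print(shlex.join(command))
```

The printed line can be pasted into a root shell on the firewall; run it on the WAN interface, since that is where the operator's router sends its solicitations.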

So far, it clearly looks like 24.7.1 kernel did break IPv6 neighbor discovery somehow.
Downgrading to 24.7 helped me and the IPv6 shenanigans disappeared. YMMV.

BR, -sjm

Well uh, OPNsense 24.7.2 just came out and I upgraded and rebooted my firewall.
Now the IPv6 ND shenanigans are back, as I expected. My opnsense does not bother answering neighbor solicitations until after random-looking long delays of 10 seconds or so. Sometimes the delay is too much for the operator's device and I see an IPv6 outage until my opnsense responds to ND again.

Proof:


15:50:43.362261 IP6 fe80::1afd:74ff:fec1:2acd > fe80::2e2:xxxx:yyyy:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:44.412213 IP6 fe80::1afd:74ff:fec1:2acd > fe80::2e2:xxxx:yyyy:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:45.442798 IP6 fe80::1afd:74ff:fec1:2acd > fe80::2e2:xxxx:yyyy:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:46.659362 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:47.682844 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:48.722961 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:50.021582 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:51.052573 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:52.082731 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:53.349779 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:54.402914 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:54.402945 IP6 fe80::2e2:xxxx:yyyy:3a25 > fe80::1afd:74ff:fec1:2acd: ICMP6, neighbor advertisement, tgt is fe80::2e2:xxxx:yyyy:3a25, length 32
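
The delay in a capture like this can also be measured with a script instead of by eye. Below is a rough sketch, assuming tcpdump's default one-line text output as shown above; the function name and the regular expression are my own and only cover the neighbor solicitation and advertisement lines seen in this capture.

```python
import re
from datetime import datetime

# Matches lines such as:
# 15:50:43.362261 IP6 <src> > <dst>: ICMP6, neighbor solicitation, who has <target>, length 32
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d{6}) IP6 \S+ > \S+ ICMP6, "
    r"neighbor (?P<kind>solicitation|advertisement), "
    r"(?:who has|tgt is) (?P<target>[^,\s]+)"
)


def nd_delays(lines):
    """Return (target, delay_seconds, solicitation_count) per answered target.

    The delay runs from the first unanswered solicitation for a target to the
    first advertisement seen for that same target.
    """
    pending = {}  # target -> (time of first solicitation, solicitations seen)
    results = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not an ND line in the expected format
        ts = datetime.strptime(match["ts"], "%H:%M:%S.%f")
        target = match["target"]
        if match["kind"] == "solicitation":
            first, count = pending.get(target, (ts, 0))
            pending[target] = (first, count + 1)
        elif target in pending:
            first, count = pending.pop(target)
            delay = (ts - first).total_seconds()
            if delay < 0:
                delay += 86400  # capture crossed midnight
            results.append((target, delay, count))
    return results


# First and last solicitation plus the advertisement from the capture above.
SAMPLE = [
    "15:50:43.362261 IP6 fe80::1afd:74ff:fec1:2acd > fe80::2e2:xxxx:yyyy:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32",
    "15:50:54.402914 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32",
    "15:50:54.402945 IP6 fe80::2e2:xxxx:yyyy:3a25 > fe80::1afd:74ff:fec1:2acd: ICMP6, neighbor advertisement, tgt is fe80::2e2:xxxx:yyyy:3a25, length 32",
]

for target, delay, count in nd_delays(SAMPLE):
    print(f"{target}: answered after {delay:.3f} s and {count} solicitations")
```

For the capture above, the first solicitation is at 15:50:43.362261 and the advertisement at 15:50:54.402945, so the operator's router waited about 11 seconds for an answer.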


Now I will again downgrade to 24.7 kernel and probably this particular problem will go away (again).

BR, -sjm

I've already made a note in the upstream report. Thanks for providing this feedback and an additional packet capture!

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280701#c31


Cheers,
Franco

Well no problem, I am happy to help whenever I can, and it is also for my own benefit!
Is there anything else I could try digging up?

Anyway I will confirm after a few hours that the IPv6 ND shenanigans have really disappeared for good.
Now I can immediately see the correct behaviour with tcpdump because I know what to look for.

BR, -sjm

August 21, 2024, 04:24:30 PM #36 Last Edit: August 21, 2024, 05:22:54 PM by Wirehead
Quote from: sjm on August 21, 2024, 02:38:57 PM
@wirehead well, my guess would be that your other IPv6 provider just has different ND settings, i.e. their router is more patient while waiting for your answers to its neighbor solicitations.

Before downgrading to the 24.7 kernel I could clearly see my opnsense box answering with long delays every time it received a neighbor solicitation, and sometimes the operator's box clearly hit some timeout and I could see short IPv6 outages until ND worked again.

Well, now I am examining my IPv6 ND traffic with tcpdump and I cannot see any more delays in answering neighbor solicitation requests! My opnsense is now responding to every ND packet immediately.

So far, it clearly looks like 24.7.1 kernel did break IPv6 neighbor discovery somehow.
Downgrading to 24.7 helped me and the IPv6 shenanigans disappeared. YMMV.

BR, -sjm

Downgraded (opnsense-update -kr 24.7) -> I'll evaluate and see what happens.

edit: no more loss:


You lose the subsequent fixes, but that's it.

I'm also back on 24.7 now to test a theory...


Cheers,
Franco

I tried to roll back the kernel and Unbound would not start anymore. I'm back on 24.7.2 with IPv6 turned off. Hope you can find the reason, @franco.

It's all a bit random so likely unrelated. Sorry.

August 21, 2024, 07:09:52 PM #41 Last Edit: August 21, 2024, 07:19:57 PM by sjm
Well... after running 24.7 kernel on an otherwise-24.7.2 OPNsense system for 4 hours, I can confirm that the IPv6 ND shenanigans are gone.

I have not observed any other weirdness or side-effects either with this unholy combo.
I am not running Unbound, I am using Pi-hole + Unbound on a separate box.

BR, -sjm

Also rolled back the kernel. No side effects so far, and Unbound is running too.

Just FYI: the 24.7.3 update that rolled out today fixes this.

Have been running it for several hours now, and IPv6 ND works just fine as expected.