See title. It is fixable by restarting dhcpdv6; internet access is back immediately after restarting the ISC DHCPv6 daemon. This issue is new with the current release.
For now I am using a cron job to restart the DHCPv6 service every hour ... :/
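In case it helps anyone with the same problem, my hourly workaround boils down to a root crontab entry like the one below. The configd action name dhcpd6 restart is an assumption here (check what your install actually exposes), and the cleaner way is to add the job via System > Settings > Cron in the GUI:

# restart the ISC DHCPv6 service at the top of every hour
# ("dhcpd6 restart" is an assumed action name; adjust to your install)
0 * * * * /usr/local/sbin/configctl dhcpd6 restart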
I've been struggling with IPv6 since the 24.7.1 update: high latency spikes and drops, though I never had to restart the DHCPv6 daemon. Just rolled back to 24.7_9 in the hope that it fixes my issues.
@OPNsense team: something is terribly wrong with the IPv6 implementation in 24.7.1, I just can't put my finger on it. Besides that, RRD stops graphing on interface issues.
I bet it's not OPNsense, it's a BSD issue ... :|
Quote from: cloudz on August 17, 2024, 09:04:07 AM
Besides that, RRD stops graphing on interface issues.
See https://github.com/opnsense/core/issues/7753#issuecomment-2282723192
As for the original topic of the thread, DHCPv6 is absolutely not required for IPv6 to work, and there is no information here to debug any issues.
Quote from: doktornotor on August 17, 2024, 09:12:57 AM
Quote from: cloudz on August 17, 2024, 09:04:07 AM
Besides that, RRD stops graphing on interface issues.
See https://github.com/opnsense/core/issues/7753#issuecomment-2282723192
As for the original topic of the thread, DHCPv6 is absolutely not required for IPv6 to work, and there is no information here to debug any issues.
What kind of information do you need, other than that a restart of DHCPv6 fixes the issue?
Quote from: doktornotor on August 17, 2024, 09:12:57 AM
Quote from: cloudz on August 17, 2024, 09:04:07 AM
Besides that, RRD stops graphing on interface issues.
See https://github.com/opnsense/core/issues/7753#issuecomment-2282723192
As for the original topic of the thread, DHCPv6 is absolutely not required for IPv6 to work, and there is no information here to debug any issues.
I think restarting DHCPv6 triggers something else to be restarted or repopulated. It might be a quick fix for an underlying problem. For me it's intermittent drops of up to 10 seconds - it looks like at those moments the devices on my LAN don't have either a route or an IPv6 address. It shows because I use Uptime Kuma extensively to monitor servers inside and outside of my network -- and IPv6 has kept dropping since I updated last week.
You have posted zero information about your IPv6 configuration, about the clients, the logs, or anything else. I would strongly suggest going back to basics:
1/ Disable the DHCPv6 nonsense altogether for all IPv6-enabled interfaces.
2/ Under router advertisement, set it to Unmanaged for all IPv6-enabled interfaces.
Now, go test again.
https://docs.opnsense.org/manual/radvd.html
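For reference, Unmanaged mode boils down to router advertisements with the Managed and Other flags cleared; roughly the kind of radvd.conf stanza sketched below. OPNsense generates that file itself, and the interface name and prefix here are only placeholders, so this is just to illustrate what the flags mean:

interface igc1 {
    AdvSendAdvert on;
    AdvManagedFlag off;      # do not point clients at DHCPv6 for addresses (SLAAC only)
    AdvOtherConfigFlag off;  # do not point clients at DHCPv6 for DNS/other options
    prefix 2001:db8:1::/64 {
        AdvOnLink on;
        AdvAutonomous on;    # clients autoconfigure addresses from this prefix
    };
};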
Why? It worked perfectly fine in 24.7 with the exact same config, and had done so for a long time. It doesn't in 24.7.1.
I use DHCPv6 with static leases for 2 subnets with managed RA; the others are set to Unmanaged and rely on the ISP. On my client devices DHCPv6 works well. I don't have Android or IoT devices on those subnets. I need it for audit and logging reasons.
I rolled back and no more issues. Stable connections all around.
Ok, good luck, I have better ways to waste my time.
The client OS doesn't matter: Windows, Linux, macOS, iOS. All lose IPv6 WAN access. (OPNsense itself still has working IPv6 WAN.)
OPNsense is configured like this:
PPPoE --> DHCPv6 over IPv4
Multiple VLANs, configured as Track Interface. RA is configured as Assisted (except for the SERVER VLAN, where I am using static IPv6 + Managed RA).
I will NOT disable DHCPv6 because it's used for static mappings AND it was working before the latest update. Somehow a bug was introduced with the latest update, and I am reporting it here to seek assistance. Disabling something that is required in MY setup does not solve the issue any more than my regular restart does; it just papers over an underlying problem.
I am able to provide any kind of information I am asked for. I will also NOT roll back, even if that would fix my issue, because then I couldn't test any patches the devs might provide to fix the actual issue. (My regular restart is fine for now.)
Quote from: dMopp on August 17, 2024, 11:52:52 AM
All lose IPv6 WAN access. (OPNsense itself still has working IPv6 WAN.)
Wonderful, finally some relevant info. Now, perhaps look at the logs and see what's going on at the time when the clients "lose IPv6 internet". Defining more precisely what that means would help as well, such as whether name resolution stops working, or whether ping via IP address also fails, or whatever.
As someone else noted above, restarting dhcpv6 merely triggers some action related to IPv6 that fixes things for you.
Quote
looks like at those moments the devices on my LAN don't have either a route or an IPv6 address
hence my suggestion to get the damned DHCP out of the way to narrow down the issue. But then again, that issue might have nothing to do with yours. This thread becoming a place for random complaints about IPv6 not working in completely different setups will not be very productive, I'm afraid.
I started this thread just for my issue. Name resolution works as well, at least on OPNsense. Clients can't even reach the firewall over IPv6. I might find some time later on to trigger the issue again; for the weekend I will keep my workaround in place.
Quote from: dMopp on August 17, 2024, 12:20:11 PM
Clients can't even reach the firewall over IPv6.
Reach how? Using the hostname? The GUA LAN IP? The WAN IP? Some ULA? The link-local IP? Test all that are relevant. Post the ifconfig / ip a s / whatever output and route info from the client when it does not work. The routing and firewall logs. Give people something to work with, the normal network-debugging sort of stuff!
Repeating "it's broken and worked before, it sucks" is not useful, just a pure waste of time.
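To make that concrete, this is the kind of client-side snapshot that would actually help, taken on a Linux client at the moment the failure happens (interface name and the first two target addresses are placeholders):

ip -6 addr show dev eth0             # is there still a global address, or only link-local?
ip -6 route show                     # is the default route via the router's link-local still present?
ping -6 -c 3 fe80::1%eth0            # link-local reachability of the gateway (placeholder address)
ping -6 -c 3 2001:db8::1             # the firewall's LAN GUA (placeholder address)
ping -6 -c 3 2606:4700:4700::1111    # a public address by IP, to rule out name resolution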
Well... I can confirm some weird IPv6 connectivity issues too.
Please note: the ping was run on the OPNsense box itself, but the results look exactly the same if I try it from the internal network.
My IPv6 WAN connectivity just breaks somehow occasionally.
Here I was running an IPv6 ping to my cloud VM every 3 seconds, with ping -n6 -i 3 -c 200 xxx.yyy.zzz
and when checking the results, every now and then I can see this kind of weirdness:
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=96 hlim=55 time=6.646 ms
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=97 hlim=55 time=6.628 ms
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=105 hlim=55 time=2126.610 ms
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=106 hlim=55 time=6.617 ms
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=107 hlim=55 time=6.510 ms
...
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=113 hlim=55 time=6.544 ms
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=114 hlim=55 time=6.614 ms
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=119 hlim=55 time=2084.833 ms
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=120 hlim=55 time=6.610 ms
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=121 hlim=55 time=6.538 ms
...
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=154 hlim=55 time=6.656 ms
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=155 hlim=55 time=15.322 ms
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=156 hlim=55 time=6.651 ms
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=162 hlim=55 time=19.360 ms
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=163 hlim=55 time=7.046 ms
16 bytes from 2a01:4f9:xxx:zzz::1, icmp_seq=164 hlim=55 time=6.548 ms
so... a bit randomly, a total IPv6 blackout is observed for roughly 20 seconds.
Meanwhile, just for comparison, I have an OpenWrt box connected to the same WAN VLAN and it does not exhibit this kind of IPv6 packet loss. So IMO problems on the operator side are more or less ruled out and my OPNsense has something weird going on.
Furthermore, this problem began exactly after I upgraded to OPNsense 24.7.1.
If anyone has any good hints on what to look for with tcpdump, I am all ears (and eyes).
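For my own notes, the plan is to capture only the ICMPv6 neighbor discovery traffic on the WAN-facing interface and watch whether solicitations go unanswered. A sketch, assuming vlan0.18 is the WAN VLAN and no IPv6 extension headers are in play (ip6[40] is the ICMPv6 type; 135 is neighbor solicitation, 136 is neighbor advertisement):

tcpdump -n -i vlan0.18 'icmp6 and (ip6[40] == 135 or ip6[40] == 136)'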
BR, -sjm
The intermittent IPv6 connectivity breaks look like they happen at more or less random intervals. At least I cannot find any clear pattern. Usually there will be one or two breaks in a 10-15 minute period, but it can also run almost 30 minutes flawlessly.
I have ping monitoring running on my RasPi 4 with Telegraf, monitoring 3 separate external IPv6 addresses.
Meanwhile, the IPv4 addresses in the same monitoring system are working just fine.
After the OPNsense 24.7.1 upgrade, it looks like this.
https://imgur.com/a/ipv6-ping-loss-1WBVHcN
BR, -sjm
It seems someone found the reason:
https://github.com/opnsense/core/issues/7795
Well uh, I was able to observe similar-looking behaviour.
root@opns:~ # tcpdump -n -nn -i vlan0.18 'icmp6 and not ip6[40] == 128 and not ip6[40] == 129'
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on vlan0.18, link-type EN10MB (Ethernet), snapshot length 262144 bytes
20:53:32.931734 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has 2a0b:5c81:10:0:xxxx:yyyy:zzzz:3a25, length 32
20:53:34.002399 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has 2a0b:5c81:10:0:xxxx:yyyy:zzzz:3a25, length 32
20:53:35.042986 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has 2a0b:5c81:10:0:xxxx:yyyy:zzzz:3a25, length 32
20:53:38.957928 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has 2a0b:5c81:10:0:xxxx:yyyy:zzzz:3a25, length 32
20:53:40.002543 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has 2a0b:5c81:10:0:xxxx:yyyy:zzzz:3a25, length 32
20:53:41.052590 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has 2a0b:5c81:10:0:xxxx:yyyy:zzzz:3a25, length 32
20:53:42.259048 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has 2a0b:5c81:10:0:xxxx:yyyy:zzzz:3a25, length 32
20:53:43.282588 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has 2a0b:5c81:10:0:xxxx:yyyy:zzzz:3a25, length 32
20:53:44.322688 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has 2a0b:5c81:10:0:xxxx:yyyy:zzzz:3a25, length 32
20:53:46.333740 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has 2a0b:5c81:10:0:xxxx:yyyy:zzzz:3a25, length 32
20:53:47.362818 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has 2a0b:5c81:10:0:xxxx:yyyy:zzzz:3a25, length 32
20:53:47.362850 IP6 fe80::xxxx:yyyy:zzzz:3a25 > fe80::1afd:74ff:fec1:2acd: ICMP6, neighbor advertisement, tgt is 2a0b:5c81:10:0:xxxx:yyyy:zzzz:3a25, length 32
BR, -sjm
Is this due to the pf FreeBSD SA ICMP issue too?
You can review the 24.7.2 kernel:
# opnsense-update -kr 24.7.2
(requires a reboot)
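To double-check which kernel is actually active after the reboot, the stock FreeBSD tools are enough; the release/build string should change once the new kernel is booted. uname -rv shows the running kernel's release and build string, and freebsd-version -kr shows both the kernel installed on disk (-k) and the one currently running (-r):

# uname -rv
# freebsd-version -kr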
Cheers,
Franco
Just installed & rebooted. Looks good so far.
Ok, 24.7.2 is due tomorrow. Let's wait and see in this case :)
I will test my issue as well. Maybe related, maybe not. Will report then :)
@Franco: I cried victory too fast. Still the same erratic behaviour. I've turned off IPv6 for now.
(https://cloudz.be/assets/screenshots/pingloss.png)
Maybe another upstream oddity. I can put it on a list of things to dig through -.-
Thanks, much appreciated.
I am not using the OPNsense quality graph because I have seen similar behavior on my connection for a long time now (dpinger). I test my connection with SmokePing. Maybe you can use that for testing IPv6 as well?
Quote from: cloudz on August 21, 2024, 07:28:41 AM
@Franco: I cried victory too fast. Still the same erratic behaviour. I've turned off IPv6 for now.
(https://cloudz.be/assets/screenshots/pingloss.png)
I see you're using Telenet. I see similar behavior on Telenet, but not on Proximus. Telenet also seemingly has issues with YouTube etc. (see the news). Could be the provider. I'll test "opnsense-update -kr 24.7.2" as well.
But it's not happening on 24.7 and earlier. I do see very high pings towards Google -- so that looks like a routing issue. These pings are to the modem/IPv6 bridge itself.
It could still be repercussions of https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280701, introduced in 24.7.1, but I have absolutely no idea what the plan there is after the SA broke it and a few small patches were added on top that seem to fix the most apparent issues (included in 24.7.2), but perhaps not all of them.
Cheers,
Franco
Just to be sure, have you tried reverting to the 24.7 kernel instead?
Cheers,
Franco
Yes -- I went back to the 24.7 kernel and that made the drops disappear, but I sometimes saw high spikes in CPU / interrupts which caused issues on IPv4 instead. But I did it with opnsense-revert plus the update to the old kernel, so that might have caused some inconsistencies.
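For the record, if the goal is only to try the older kernel, the kernel-only route avoids touching the userland packages at all: fetch and boot the 24.7 kernel set, and later return to the current one the same way.

# opnsense-update -kr 24.7 && reboot
(test for a while, then go back)
# opnsense-update -kr 24.7.2 && reboot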
Well... I tried opnsense-update -kr 24.7.2 and a reboot; it did not help.
The occasional IPv6 outages are still there.
Now I ran opnsense-update -kr 24.7 to try the previous kernel, i.e. I did not downgrade any other packages.
I will report what happens with this setup. It will be clear in the next few hours...
BR, -sjm
The behavior is strange though, and I'm not sure whether it's related to the provider itself. See the graphs below (they indicate packet loss).
My internet provider for WAN01 (IPv6 - provider: "Telenet") is the same provider as user "cloudz". (This provider was in the local news for having YouTube-related issues for "some time" - I wonder if that is related to the loss we see, or if this is related to upstream FreeBSD.)
(https://forum.opnsense.org/index.php?action=dlattach;topic=42270.0;attach=37183;image)
As you can see, this link has issues, probably (and I say that cautiously) since the upgrade to 24.7.1.
However, at the same time, I also have a WAN02 (also IPv6 - provider: "Proximus"), which doesn't exhibit the behavior.
(https://forum.opnsense.org/index.php?action=dlattach;topic=42270.0;attach=37185;image)
(these small drops here are purely from testing..)
@wirehead well, my guess would be that your other IPv6 provider just has different ND settings, i.e. their router is more patient waiting for your neighbor solicitation answers.
Before downgrading to the 24.7 kernel I could clearly see my OPNsense box taking a long time to answer every neighbor solicitation, and clearly sometimes the operator's box hit some timeout and I could see short IPv6 outages until ND worked again.
Well, now I am examining my IPv6 ND traffic with tcpdump and I cannot see any more delays in answering neighbor solicitation requests! My OPNsense is now responding to every ND packet immediately.
So far, it clearly looks like the 24.7.1 kernel broke IPv6 neighbor discovery somehow.
Downgrading to 24.7 helped me and the IPv6 shenanigans disappeared. YMMV.
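A complementary check from the OPNsense side while it is happening: the kernel's IPv6 neighbor cache can be dumped and flushed with the stock ndp tool (vlan0.18 is just my WAN VLAN, adjust as needed). ndp -an lists all cached neighbors, and ndp -c flushes the cache (as root) to force re-resolution:

ndp -an | grep vlan0.18
ndp -c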
BR, -sjm
Well uh, OPNsense 24.7.2 just came out and I upgraded and rebooted my firewall.
Now the IPv6 ND shenanigans are back, as I expected. My OPNsense does not bother answering neighbor solicitations until after random-looking long delays of 10 seconds or so. Sometimes the delay is too much for the operator's device and I would see an IPv6 outage until my OPNsense responded to ND again.
Proof:
15:50:43.362261 IP6 fe80::1afd:74ff:fec1:2acd > fe80::2e2:xxxx:yyyy:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:44.412213 IP6 fe80::1afd:74ff:fec1:2acd > fe80::2e2:xxxx:yyyy:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:45.442798 IP6 fe80::1afd:74ff:fec1:2acd > fe80::2e2:xxxx:yyyy:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:46.659362 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:47.682844 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:48.722961 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:50.021582 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:51.052573 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:52.082731 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:53.349779 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:54.402914 IP6 fe80::1afd:74ff:fec1:2acd > ff02::1:ff65:3a25: ICMP6, neighbor solicitation, who has fe80::2e2:xxxx:yyyy:3a25, length 32
15:50:54.402945 IP6 fe80::2e2:xxxx:yyyy:3a25 > fe80::1afd:74ff:fec1:2acd: ICMP6, neighbor advertisement, tgt is fe80::2e2:xxxx:yyyy:3a25, length 32
Now I will again downgrade to the 24.7 kernel and this particular problem will probably go away (again).
BR, -sjm
I've already made a note in the upstream report. Thanks for providing this feedback and an additional packet capture!
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280701#c31
Cheers,
Franco
Well no problem, I am happy to help whenever I can, and it is also for my own benefit!
Is there anything else I could try digging up?
Anyway I will confirm after a few hours that the IPv6 ND shenanigans have really disappeared for good.
Now I can immediately see the correct behaviour with tcpdump because I know what to look for.
BR, -sjm
Quote from: sjm on August 21, 2024, 02:38:57 PM
@wirehead well, my guess would be that your other IPv6 provider just has different ND settings, i.e. their router is more patient waiting for your neighbor solicitation answers.
Before downgrading to the 24.7 kernel I could clearly see my OPNsense box taking a long time to answer every neighbor solicitation, and clearly sometimes the operator's box hit some timeout and I could see short IPv6 outages until ND worked again.
Well, now I am examining my IPv6 ND traffic with tcpdump and I cannot see any more delays in answering neighbor solicitation requests! My OPNsense is now responding to every ND packet immediately.
So far, it clearly looks like the 24.7.1 kernel broke IPv6 neighbor discovery somehow.
Downgrading to 24.7 helped me and the IPv6 shenanigans disappeared. YMMV.
BR, -sjm
Downgraded (opnsense-update -kr 24.7) -> I'll evaluate and see what happens.
edit: no more loss:
(https://forum.opnsense.org/index.php?action=dlattach;topic=42270.0;attach=37197;image)
Does a kernel downgrade have any side effects?
You lose the subsequent fixes, but that's it.
I'm also back on 24.7 now to test a theory...
Cheers,
Franco
I tried to roll back the kernel and Unbound wouldn't start anymore. Back on 24.7.2 with IPv6 turned off. Hope you can find the reason, @franco.
It's all a bit random so likely unrelated. Sorry.
Well... after running the 24.7 kernel on an otherwise 24.7.2 OPNsense system for 4 hours, I can confirm that the IPv6 ND shenanigans are gone.
I have not observed any other weirdness or side effects with this unholy combo either.
I am not running Unbound on the firewall; I am using Pi-hole + Unbound on a separate box.
BR, -sjm
Also rolled back the kernel. No side effects currently. Unbound is running fine as well.
Just FYI: the 24.7.3 update that rolled out today fixes this.
Have been running it for several hours now, and IPv6 ND works just fine as expected.
Thanks for the update, @sjm -- really appreciate your persistence and detailed tcpdump analysis throughout this. It's great to hear that 24.7.3 resolves the IPv6 ND issue for you after so much back-and-forth with kernel rollbacks. Your findings were very helpful in narrowing this down, especially highlighting how delayed neighbor solicitation responses were at the core.
Also good to see others like @Wirehead and @dMopp confirmed similar behavior and rollback workarounds without major side effects.
Let's hope 24.7.3 holds stable for everyone across setups. Thanks again to all for digging deep into this one.