hi guys,
im not sure how to describe this issue best so bear with me.
my ISP is Deutsche Telekom and im using their PPPoE VDSL.
IPv6 was very stable until i upgraded to 24.7
now it looks like ipv6 is sometimes working and sometimes not.
the ip address on the wan-interface is always there, which seems to be correct, and the connected computers also get ipv6 addresses correctly as far as i can tell from my testing.
if you try to use any ipv6 connection to the internet though its hit and miss.
if it works it will continue to work for considerable time and then suddenly simply stop for several seconds to minutes before coming back to life.
ipv4 at the same time has absolutely no issues at all.
by symptoms it looks like the firewall intermittently stops forwarding IPv6 and then continues on its own.
i would appreciate any tips on how to troubleshoot this.
Hi,
Can you try this patch?
https://github.com/opnsense/core/commit/287c13beb
# opnsense-patch 287c13beb
Cheers,
Franco
does it require a reboot, or should it come into effect immediately?
For some meaningful testing, I'd suggest at least unplugging/replugging the WAN cable.
> For some meaningful testing, I'd suggest at least unplugging/replugging the WAN cable.
Yes, good answer.
Cheers,
Franco
ok, will simply reboot the firewall, once i have time for it.
without it i dont see any visible effect of the patch.
will report back once i have rebooted
It's not uncommon to see no effect, since this only pertains to IPv6 renew events.
Cheers,
Franco
i have rebooted the firewall about an hour ago and ipv6 seems to run stable now. at least i have not seen any outages while monitoring outgoing icmp traffic.
i will continue to monitor, but so far its looking good.
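For anyone who wants to quantify outages like this rather than watch a ping window, here is a minimal Python sketch. It assumes you already have a time-ordered list of (timestamp, success) probe results from whatever tool you use; the function name `find_outages` and the 5-second threshold are made up for illustration and are not part of OPNsense.

```python
from datetime import datetime, timedelta


def find_outages(samples, min_duration=timedelta(seconds=5)):
    """Return (start, end) pairs for every run of failed probes.

    samples: iterable of (timestamp, ok) tuples, sorted by timestamp.
    Runs shorter than min_duration are ignored as single lost pings.
    """
    outages = []
    start = None      # timestamp of the first failed probe in the current run
    last_ts = None    # timestamp of the most recent probe seen
    for ts, ok in samples:
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            # The run ends at the first successful probe after it.
            if ts - start >= min_duration:
                outages.append((start, ts))
            start = None
        last_ts = ts
    # A run that is still open at the end of the data is closed at the last probe.
    if start is not None and last_ts - start >= min_duration:
        outages.append((start, last_ts))
    return outages
```

Feeding it one sample per second gives a list of outage windows whose start times can then be compared against scheduled reboots or other events in the firewall log.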
Thanks, if I get another independent confirmation I'll add that to the stable updates queue.
3 hours later my ipv6 is still going strong without issues.
just now i had a weird phenomenon.
i started my computer up after the night, it got an ipv6 ip from the correct prefix but couldnt reach the internet via ipv6.
i left a permanent ipv6 ping running and after a long time it started working just moments ago.
now im checking if it stays working.
there is a chance that this patch is only a partial fix. will reply later with more info.
while im writing this the ipv6 ping just failed again.
im not sure why it was working for the whole 12 hours yesterday after the patch and today its failing again.
the only thing that changed was a scheduled reboot of the firewall at 5 in the morning.
I doubt that there is a problem at all because that is expected behaviour: With SLAAC in your LAN, the clients will pick up an IPv6 address only after they receive a router advertisement (RA), which only gets transmitted periodically.
That is why I set the Minimum and Maximum Interval values to 200 and 600, respectively for Router Advertisements.
im doing additional testing right now from multiple machines and vms on my network and im seeing some weird stuff.
the firewall is virtualized on a proxmox host.
it seems all physical machines on the network are having ipv6 issues intermittently but at different times, while the vm which has been running for a long time doesnt seem to have any issues (for as long as i have been testing today).
what i also noticed is that restarting the router advertisement service in the gui brings back connectivity on all physical machines immediately.
and i still dont get why it worked flawlessly for the rest of the day yesterday after applying the patch and rebooting the firewall.
very confusing.
@meyergru the clients pick up addresses just fine. its the forwarding that gets interrupted and autoresumes at different intervals.
i just checked the intervals are set to 200 and 600 by default it seems (i never touched them).
also see above the weird behaviour i have noticed.
i found one physical machine which doesnt seem to have issues. its a proxmox backup server based on debian 12.
i have only been testing for an hour or two so far today, but this machine at least didnt experience issues.
so that limits the machines that have issues to all my 3 physical machines running windows 11.
i do have one more windows 10 laptop to test. will get that from the shelf and run additional tests.
How do you hand out IPv6 on your LAN? SLAAC or DHCPv6 (or both?)
As Deutsche Telekom has dynamic IPv6, DHCPv6 can be problematic. If, for example, you offer DHCPv6 and a client uses it, it is up to the client to request a lease renewal, so the client may keep using the old prefix for a long time after the prefix changes. With SLAAC only (aka "stateless"), this can be controlled by the minimum and maximum interval.
If you offer both (aka "assisted"), the behaviour can differ between OSes. AFAIK, Proxmox itself does not use IPv6 at all by default, but VMs and LXCs can, if they want to. Thus I wonder: how did you verify that your Proxmox does not have any problems with IPv6?
Also, the virtualisation of your OPNsense can come into play, as Proxmox contains a firewall as well.
im using slaac on the lan. the router advertisement service is set to unmanaged.
my proxmox host (PVE) doesnt use ipv6, the proxmox backup server (PBS) does and that one is a separate physical machine from the PVE host.
the proxmox firewall is globally off on all levels (datacenter, host and vm).
this setup worked for 2 years without any issues and only started acting up with the upgrade to 24.7.
so you would recommend setting the router advertisement to stateless instead of unmanaged?
what about the intervals?
does it make sense to lower the values there?
The difference between "stateless" and "unmanaged" is simply that "unmanaged" only sets the IPv6 address, but nothing else (https://docs.opnsense.org/manual/radvd.html) - thus, the default gateway is not distributed via RA, neither are the DNS domains nor the DNS servers.
so additional testing with the win10 laptop revealed nothing new. it has the same issue as the other physical devices with the exception of the PBS, which is completely unbothered.
i set the router advertisement to stateless which made no visible difference.
does it make sense to lower the intervals?
i have set the min interval to 3 now and the max interval to 4.
on first glance it seems to be better now.
the laptop occasionally loses a ping (which might be wifi related).
the wired windows clients seem to be stable now as well, even though i have only implemented this like 5 min ago, so its too early to tell if it really improves things.
do such low values have any negative effect that i should be aware of?
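To put numbers on the two interval settings mentioned in this thread, here is a rough Python sketch. It assumes the time between unsolicited RAs is picked uniformly between the minimum and maximum interval, and it uses the limits from RFC 4861 section 6.2.1 (maximum between 4 and 1800 seconds, minimum at least 3 seconds and at most 0.75 times the maximum). The function name `ra_interval_stats` is invented for this example.

```python
def ra_interval_stats(min_interval, max_interval):
    """Estimate RA timing for a given min/max interval pair (in seconds).

    Assumption: the gap between unsolicited RAs is uniformly random
    between min_interval and max_interval.
    """
    # Limits as stated in RFC 4861, section 6.2.1.
    if not 4 <= max_interval <= 1800:
        raise ValueError("max interval must be between 4 and 1800 seconds")
    if not 3 <= min_interval <= 0.75 * max_interval:
        raise ValueError("min interval must be >= 3 and <= 0.75 * max interval")
    mean = (min_interval + max_interval) / 2
    return {
        "mean_seconds": mean,                     # average gap between RAs
        "worst_case_wait_seconds": max_interval,  # longest gap a client can see
        "ras_per_hour": 3600 / mean,              # multicast RAs sent per hour
    }


if __name__ == "__main__":
    for pair in ((200, 600), (3, 4)):
        print(pair, ra_interval_stats(*pair))
```

Under these assumptions 200/600 works out to an RA every 400 seconds on average, while 3/4 sits at the lowest values the RFC permits and sends roughly a thousand RAs per hour. That extra multicast traffic is the main cost of the low setting.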
I doubt that RA intervals of 3 seconds should be necessary. It sounds more like you have another source of RAs sent to your network which interferes with your OPNsense RAs. It would explain why sending RAs at a smaller interval helps...
Are you sure that there is no other instance running? You already wrote that your OPNsense is a VM.
Quote from: meyergru on July 28, 2024, 01:02:02 PM
The difference between "stateless" and "unmanaged" is simply that "unmanaged" only sets the IPv6 address, but nothing else (https://docs.opnsense.org/manual/radvd.html) - thus, the default gateway is not distributed via RA, neither are the DNS domains nor the DNS servers.
The default gateway definitely is in my installations - I do have "Advertise Default Gateway" checked, though.
Quote from: meyergru on July 28, 2024, 02:20:41 PM
I doubt that RA intervals of 3 seconds should be necessary. It sounds more like you have another source of RAs sent to your network which interferes with your OPNsense RAs. It would explain why sending RAs at a smaller interval helps...
Are you sure that there is no other instance running? You already wrote that your OPNsense is a VM.
there is only one device which i have added to the network lately and thats a netgear orbi wifi mesh system in AP mode (other option would be router mode, which i dont want).
the ipv6 functionality is turned off in ap-mode and cant be turned on (greyed out).
i have no virtual machines or devices on my network that would act as a router otherwise.
but i will run a wireshark trace to see if there are any weird router advertisements coming in.
I use unmanaged RA on all my VLANs for different devices (Linux, Android, MS Windows) and advertise the default gateway too. Works for me before and after the upgrade without problems.
there does not seem to be any router advertisement besides the ones coming from my opnsense.
i filtered wireshark with icmpv6.type == 134 and see only opnsense advertisements.
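The same check can be scripted once the capture is exported. Below is a small Python sketch that assumes the packets have already been reduced to (source address, ICMPv6 type) pairs, for example from a CSV export; the function names and the link-local addresses in the usage example are hypothetical.

```python
from collections import Counter

ICMPV6_ROUTER_ADVERTISEMENT = 134  # same value as the wireshark filter above


def ra_senders(packets):
    """Count router advertisements per source address.

    packets: iterable of (source_address, icmpv6_type) tuples.
    """
    return Counter(
        src for src, icmp_type in packets
        if icmp_type == ICMPV6_ROUTER_ADVERTISEMENT
    )


def unexpected_ra_senders(packets, expected):
    """Return the set of RA sources that are not in the expected set."""
    return set(ra_senders(packets)) - set(expected)


if __name__ == "__main__":
    # Hypothetical capture: fe80::1 stands in for the firewall's LAN address.
    capture = [("fe80::1", 134), ("fe80::2", 135), ("fe80::1", 134)]
    print(ra_senders(capture))
    print(unexpected_ra_senders(capture, {"fe80::1"}))
```

An empty result from `unexpected_ra_senders` matches what the wireshark filter showed here: no second router on the segment.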
I just noticed, that my IPv6 DHCP service crashes on a regular basis. I get a dynamic IPv6 via 6to4 tunnel from my Versatel provider. Up until the update from 24.1 to 24.7 it worked with "prevent release". How can I apply the patch?
different provider but the patch gets applied in the shell (ssh for example) with the command provided by franco in his post further up.
edit: the command is "opnsense-patch 287c13beb"
after 20 minutes of running wireshark i havent seen any RA from anything other than opnsense.
so no idea whats wrong.
edit: i have set the intervals back to 200 and 600 to check if i can reproduce/fix this issue at will by changing the values. will report with findings later
setting the values back to defaults did not bring back the issue.
rebooting the firewall also did not bring back the issue.
ipv6 keeps working for now.
wireshark still doesnt show any RA from anything other than opnsense.
i am at a loss here.
if anyone has any additional ideas what might be the cause of this i would be quite happy to hear them.
i will continue to monitor and we will see if the issue returns tomorrow, just like it returned today after working all afternoon/evening yesterday.
this morning the same issue is present again.
clients get ipv6 addresses, but no forwarding.
wireshark sees the RA only from opnsense again
i had to manually restart router advertisement to get the forwarding to start.
are there any specific logs i can look at to see if/how this part is misbehaving?
the general log in the gui doesnt give me anything useful.
You can look at the radvd config file (/var/etc/radvd.conf) and at its creation date and at the start times of radvd.
If a radvd restart fixes the problem and you see RAs at all, the problem does not seem to be that it stops sending RAs, but rather their content. If the ISP changes the prefix (which I doubt they would within a running connection), radvd should get restarted automatically and thus the clients get the new prefix. All of this should be in the general log file.
Also, if you continually dump RAs on a client, you should see if/what differs in the RAs before/after a restart.
BTW: Telekom used to do a "Zwangstrennung" (forced periodic disconnect), which is now obsolete. Could the problem be that it is still configured for your (old) account, or that you have a mechanism somewhere that forces it at a given time? (When I was still with Telekom, I certainly had that.)
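Checking when the radvd config was last rewritten can be done with a few lines of Python if that is easier than reading directory listings. This is only a sketch: the path is the one given above, the helper names are made up, and it looks at the modification time, which is what changes when the file is regenerated.

```python
from datetime import datetime
from pathlib import Path


def file_mtime(path="/var/etc/radvd.conf"):
    """Return the last modification time of the file as a local datetime."""
    return datetime.fromtimestamp(Path(path).stat().st_mtime)


def changed_since(path, since):
    """True if the file was modified after the given local datetime."""
    return file_mtime(path) > since
```

Comparing that timestamp with the time the outage started shows whether the config was regenerated around the moment forwarding stopped.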
i had a cronjob configured that did a periodic interface reset at 03:00 at night, but i have disabled that some time ago.
i compared a RA in wireshark before the restart and after the restart of the RA service and they are identical.
it seems to be advertising correctly even before restart, but for some reason traffic just doesnt pass.
i have attached 2 screenshots of the packets in wireshark. im not able to make out any differences
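Comparing two RAs by eye is error-prone, so here is a Python sketch that decodes the fields and diffs them. It assumes you have the raw ICMPv6 payload of each RA as bytes (starting at the ICMPv6 type field), follows the message layout from RFC 4861, and only decodes the fixed header plus Prefix Information options. The names `parse_ra` and `diff_ra` are invented for this example.

```python
import ipaddress
import struct


def parse_ra(icmp6):
    """Decode the fixed RA header and any Prefix Information options."""
    msg_type, _code, _cksum, hop_limit, flags, lifetime, reachable, retrans = (
        struct.unpack_from("!BBHBBHII", icmp6, 0)
    )
    if msg_type != 134:
        raise ValueError("not a router advertisement")
    fields = {
        "hop_limit": hop_limit,
        "managed_flag": bool(flags & 0x80),
        "other_flag": bool(flags & 0x40),
        "router_lifetime": lifetime,  # 0 means "do not use me as default router"
        "reachable_time": reachable,
        "retrans_timer": retrans,
        "prefixes": [],
    }
    offset = 16  # options start right after the 16-byte fixed header
    while offset + 2 <= len(icmp6):
        opt_type, opt_len = icmp6[offset], icmp6[offset + 1]
        if opt_len == 0:
            break  # malformed option, stop instead of looping forever
        if opt_type == 3 and opt_len == 4:  # Prefix Information, 32 bytes
            plen, _pflags, valid, preferred = struct.unpack_from(
                "!BBII", icmp6, offset + 2
            )
            prefix = ipaddress.IPv6Address(icmp6[offset + 16:offset + 32])
            fields["prefixes"].append((f"{prefix}/{plen}", valid, preferred))
        offset += opt_len * 8  # option length is given in units of 8 bytes
    return fields


def diff_ra(before, after):
    """Return {field: (before, after)} for every decoded field that differs."""
    return {k: (before[k], after[k]) for k in before if before[k] != after[k]}
```

If `diff_ra` comes back empty for a pair captured before and after the radvd restart, that supports the observation above that the advertisements themselves are identical.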
I do not get why a radvd restart can help if the RA content does not change and there are RAs sent before the restart.
Perhaps there are scripts that also restart other components depending on the radvd restart?
Quote from: tokade on July 29, 2024, 12:57:11 PM
Perhaps there are scripts that also restart other components depending on the radvd restart?
that would be interesting to know.
would make sense because as far as my limited understanding goes radvd does basically nothing but advertise things.
maybe someone with deeper knowledge of opnsense could comment on this.
is there anything that gets restarted whenever you restart radvd that might explain why ipv6 forwarding resumes immediately when radvd is restarted?
for now i am going to test if the intervals make a difference.
last night i had them at default 200/600, tonight i will leave them at 3/4 and see how the firewall behaves tomorrow morning.
Quote from: franco on July 27, 2024, 10:04:30 AM
Hi,
Can you try this patch?
https://github.com/opnsense/core/commit/287c13beb
# opnsense-patch 287c13beb
Cheers,
Franco
This worked for me.
latest update for me.
the firewall didnt have any issues forwarding ipv6 in the morning after setting the intervals to 3/4 in the evening before.
when set to 200/600 it had issues in the morning.
i am still not sure why or how this is and why no one else is seeing this.
I'm using Deutsche Telekom, too. On WAN side, you have to choose DHCPv6 with IPv6 Prefix Delegation 56. On LAN-Interfaces you should use Track Interface with a Prefix-ID.
Quote from: PhoenixRider on July 30, 2024, 07:26:23 PM
I'm using Deutsche Telekom, too. On WAN side, you have to choose DHCPv6 with IPv6 Prefix Delegation 56. On LAN-Interfaces you should use Track Interface with a Prefix-ID.
Or - in case of a business contract with a fixed /56 - just configure all internal interfaces statically.
Quote from: PhoenixRider on July 30, 2024, 07:26:23 PM
I'm using Deutsche Telekom, too. On WAN side, you have to choose DHCPv6 with IPv6 Prefix Delegation 56. On LAN-Interfaces you should use Track Interface with a Prefix-ID.
this is the case.
see attached screenshots.
and another screenshot
Don't enable "use IPv4 connectivity".
If it is a business line, why don't you configure LAN statically? Enable Router Advertisements, done.
Quote from: Patrick M. Hausen on July 30, 2024, 08:00:19 PM
Don't enable "use IPv4 connectivity".
If it is a business line, why don't you configure LAN statically? Enable Router Advertisements, done.
its not business. its a completely normal home vdsl with changing prefixes.
nothing to configure statically there.
and any guide i have ever found anywhere says you have to use "use ipv4 connectivity" to get a prefix.
if you dont, you dont get one.
but i can try that, no problem. give me a few minutes
Sorry, you are right. I got confused.
yup ipv6 broke, when i didnt use ipv4 connectivity. reenabled that.
Quote from: beisser on July 30, 2024, 07:56:53 PM
Quote from: PhoenixRider on July 30, 2024, 07:26:23 PM
I'm using Deutsche Telekom, too. On WAN side, you have to choose DHCPv6 with IPv6 Prefix Delegation 56. On LAN-Interfaces you should use Track Interface with a Prefix-ID.
this is the case.
see attached screenshots.
Only "Request Prefix only" has to be enabled! ;)
Quote from: PhoenixRider on July 30, 2024, 08:39:56 PM
Only "Request Prefix only" has to be enabled! ;)
Not here. I receive a single external address and a /56 from DTAG.
Quote from: Patrick M. Hausen on July 30, 2024, 08:43:37 PM
Quote from: PhoenixRider on July 30, 2024, 08:39:56 PM
Only "Request Prefix only" has to be enabled! ;)
Not here. I receive a single external address and a /56 from DTAG.
It's been going like this here for years with this option. ;) DTAG Home-Users only need a Prefix!
Quote from: PhoenixRider on July 30, 2024, 08:45:26 PM
It's been going like this here for years with this option. ;) DTAG Home-Users only need a Prefix!
I have Caddy listening on that external address as a reverse and SSL proxy for all my applications.
Of course Internet for systems on LAN will work with just a prefix and OPNsense itself can still rely on IPv4 for e.g. NTP. But we digress from the original problem.
for me ipv6 breaks if i select only prefix.
suddenly link local addresses is all i have.
no more prefix on lan or wan.
basically a downgrade to ipv4 only :)
i just tried.
Quote from: beisser on July 30, 2024, 09:09:48 PM
for me ipv6 breaks if i select only prefix.
suddenly link local addresses is all i have.
no more prefix on lan or wan.
basically a downgrade to ipv4 only :)
i just tried.
There must be a reason for that. With this config, my opnsense runs like a charm.