OPNsense Forum

Archive => 20.7 Legacy Series => Topic started by: direx on September 08, 2020, 07:53:05 am

Title: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: direx on September 08, 2020, 07:53:05 am
Hi,

I have a problem which was introduced after updating to 20.7:

After roughly two days of uptime of my OPNsense box, IPv6 in my networks stops working. This has nothing to do with a changing prefix (mine changes every 180 days), but I figured out that radvd does not announce the IPv6 prefix any more. This means all clients will lose IPv6 connectivity eventually.

Clicking the restart button for "radvd" in the web UI fixes this and clients re-gain their internet connectivity after this. The strange part is that radvd is always running (output before restart):

Code: [Select]
# ps aux|grep rad
root    42763   0.0  0.1 1061048  3196  -  Ss   Sun21       0:30.35 /usr/local/sbin/radvd -p /var/run/radvd.pid -C /var/etc/radvd.conf -m syslog

Between radvd restarts, the radvd.conf and the output of "netstat -6an" do not change.

This really looks like a bug to me (radvd freezing) but I don't know how I can debug this. Any hints here on how to get to the root cause of the radvd issue? It looks like the "strace" command is not available so I am a little helpless here.


Regards,
direx
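
One possible way to check what is happening without strace is to capture ICMPv6 traffic on the LAN interface (for example with tcpdump) and see whether Router Advertisements (ICMPv6 type 134) carrying a Prefix Information option still show up. The Python sketch below only decodes an already captured RA message, following the RFC 4861 layout; the function name `parse_ra_prefixes` is made up for illustration, and the capture step itself is assumed to happen elsewhere.

```python
import ipaddress
import struct

ND_ROUTER_ADVERT = 134  # ICMPv6 type of a Router Advertisement (RFC 4861)
OPT_PREFIX_INFO = 3     # option type of the Prefix Information option


def parse_ra_prefixes(icmp6: bytes):
    """Return (prefix, valid_lifetime, preferred_lifetime) tuples from one RA.

    `icmp6` must start at the ICMPv6 header, not at the IPv6 header.
    """
    if len(icmp6) < 16 or icmp6[0] != ND_ROUTER_ADVERT:
        raise ValueError("not a Router Advertisement")
    prefixes = []
    offset = 16  # fixed RA header: type, code, checksum, hop limit, flags, lifetimes, timers
    while offset + 2 <= len(icmp6):
        opt_type = icmp6[offset]
        opt_len = icmp6[offset + 1] * 8  # option length is given in units of 8 bytes
        if opt_len == 0 or offset + opt_len > len(icmp6):
            raise ValueError("malformed option")
        if opt_type == OPT_PREFIX_INFO and opt_len == 32:
            prefix_len = icmp6[offset + 2]
            valid, preferred = struct.unpack_from("!II", icmp6, offset + 4)
            address = ipaddress.IPv6Address(icmp6[offset + 16:offset + 32])
            network = ipaddress.IPv6Network((address, prefix_len), strict=False)
            prefixes.append((str(network), valid, preferred))
        offset += opt_len
    return prefixes
```

If captures taken after the failure contain no RA at all, or RAs without a prefix option, that would point at the daemon no longer sending rather than at the clients.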
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: gpb on September 08, 2020, 07:37:39 pm
Yesterday, after a cold boot, I didn't notice I had no IPv6 until 90 minutes later, and it required me to save/apply an unchanged WAN interface followed by a save/apply of an unchanged LAN interface. Then routing started. It looked like I had IPv6 addresses on hosts, but no connectivity (IPv6 monitored by Nagios ping). There are a few more IPv6 threads that may be related (one solved by moving to the 21.1 development version). In my testing, unrelated to the above problem (maybe), it looks like radvd is not responding to host solicitations directly. It does advertise, and I increased the frequency of that using manual settings.

https://forum.opnsense.org/index.php?topic=18868.0
https://forum.opnsense.org/index.php?topic=18549.0
https://forum.opnsense.org/index.php?topic=18591.0

Just an FYI. Oh, and I have not seen the problem you describe where radvd stops altogether. You might want to try manual router advertisement settings. Something definitely seems wrong compared to the 20.1.x series.

Cheers.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: lattera on September 08, 2020, 10:51:04 pm
I've experienced this issue, too.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: robgnu on September 09, 2020, 10:39:35 pm
I can confirm, too. Two OPNsense systems are affected by this issue.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: ilewis on September 18, 2020, 02:36:08 am
I registered on this forum just to say that I've been hit with this problem too.

A restart of the radvd service fixes the problem immediately, but radvd then stops working after 24-48 hours (ipv6 solicitation stops working) until you manually restart the service.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: franco on September 18, 2020, 08:35:06 am
Somehow the kernel hits a limit for multicast join/leave in FreeBSD 12. We haven't had the chance to debug this further and there seem to be no relevant patches in FreeBSD.


Cheers,
Franco
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: eddy on September 18, 2020, 11:56:33 pm
I'm new to OPNsense, but noticed a couple days ago IPv6 addresses were not being given to devices on the LAN side. A restart of radvd got it working again.

Would a potential work-around for this issue be to set up a cron job to pkill and then restart radvd once a day?
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: robgnu on September 19, 2020, 08:33:59 am
Better to run the cron job once per hour...
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: Patrick M. Hausen on October 21, 2020, 04:21:20 pm
Hey folks,

I just found that our office in Frankfurt is suffering the same problem. They have an uplink ISP that does not provide v6 at all, so we deployed an OPNsense (20.7) to route through a WireGuard tunnel to our main office in Karlsruhe.

Works great, if it wasn't for the router advertisements.

Since the OPNsense box in Frankfurt is the only router for v6 in the LAN, we are in control of everything, and Macs (our preferred developer platform) will probably honour almost anything - would it be a viable workaround to switch to DHCPv6 from SLAAC?

From https://github.com/opnsense/core/issues/4338 I gather the issue is not quite fixed in the update planned for tomorrow?

Thanks!
Patrick

EDIT: I just found that DHCPv6 does not work without RA ... ok.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: Patrick M. Hausen on October 22, 2020, 11:12:13 am
One question:

Why are we using radvd at all? I assume this is this product?
http://www.litech.org/radvd/

The FreeBSD base system contains rtadvd which is running in production on our site without a single problem ...

Kind regards,
Patrick
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: marjohn56 on October 22, 2020, 04:38:34 pm
https://www.freebsd.org/cgi/man.cgi?query=radvd&apropos=0&sektion=0&manpath=FreeBSD+12.1-RELEASE+and+Ports&arch=default&format=html
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: Patrick M. Hausen on October 22, 2020, 04:40:57 pm
Yes - but that's a port of an external piece of software:
https://www.freshports.org/net/radvd/

rtadvd is in base and basically what we run everywhere if it's plain FreeBSD and not OPNsense:
https://www.freebsd.org/cgi/man.cgi?query=rtadvd&apropos=0&sektion=0&manpath=FreeBSD+12.1-RELEASE&arch=default&format=html

Why the extra package?
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: CloudHoppingFlowerChild on October 23, 2020, 04:19:43 am
Can the element in question be rolled back to what was used in 20.1? Add a watchdog to restart it or make restarting radvd an option in the cron task menu?
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: ivwang on October 23, 2020, 07:20:20 am
Quote from: CloudHoppingFlowerChild on October 23, 2020, 04:19:43 am
Can the element in question be rolled back to what was used in 20.1? Add a watchdog to restart it or make restarting radvd an option in the cron task menu?

The HardenedBSD base has also been upgraded to 12 as of the 20.7 release. This issue might involve more than just radvd, as discussed on the OPNsense GitHub, so simply reverting to the old radvd package might not guarantee a solution. I am not sure why radvd is used over FreeBSD's native rtadvd, though.

For what it's worth, I am restarting radvd via cron every 30 minutes as a stopgap for now.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: franco on October 23, 2020, 07:32:10 am
I think it's in the kernel since radvd is the same. Rollback is not easily possible in this regard.

As for why radvd and not rtadvd... radvd came before or was more reliable (think over 10 years ago) and nobody did the work to evaluate rtadvd migration since then. Maybe now that 12.1 is out and provides challenges to radvd it's time to do this evaluation, but even then the kernel issue might be affecting rtadvd too.

At this point it is still too early to tell.


Cheers,
Franco
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: marjohn56 on October 23, 2020, 09:12:17 am
@pmhausen - are you running it in place of radvd? If so have you modified dhcpd.inc to create the correct config file for it, or are you just calling it manually?


- Edit -


Hmm, looks the same... I'll try it.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: Patrick M. Hausen on October 23, 2020, 09:35:48 am
Quote from: marjohn56 on October 23, 2020, 09:12:17 am
@pmhausen - are you running it in place of radvd? If so have you modified dhcpd.inc to create the correct config file for it, or are you just calling it manually?
We are not running rtadvd on OPNsense. I just happen to work in an environment with about a hundred FreeBSD machines in total and on some of them we run rtadvd - the ones that are routers, of course.

And I was just puzzled OPNsense included a 3rd party package instead of using what is in base. Specifically because rtadvd has been in FreeBSD since 2000 when KAME IPv6 was integrated. Of course it is in base that long, because *some* router advertisement daemon is mandatory for a router ;)


I am currently looking at the source to find what the two daemons might be doing differently, and I am completely flabbergasted when I browse the radvd source. In the original product, still in their git repo, the function

setup_allrouters_membership()

is just a stub with a single "return 0;" statement. The code to actually join the all routers multicast group was added by the port maintainer and just recently improved/fixed by @franco.
https://svnweb.freebsd.org/ports/head/net/radvd/files/patch-device-bsd44.c?view=log


Second, from the control flow the function should be called only once at startup of the daemon when the interface is initialised. So I wonder why the group is joined repeatedly (?) until some kernel bug kicks in. Is that the case or did I completely misunderstand the problem?


The actual code to join the mcast group looks more or less identical for both, rtadvd's is here:
https://svnweb.freebsd.org/base/releng/12.1/usr.sbin/rtadvd/if.c?revision=352546&view=markup

It's essentially
Code: [Select]
setsockopt(sock, IPPROTO_IPV6, IPV6_JOIN_GROUP, &mreq, sizeof(mreq))
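
To make the call above a bit more concrete, here is a minimal sketch of the same join expressed in Python rather than C. It is an illustration only, not code from radvd or rtadvd: the helper names `build_ipv6_mreq` and `join_all_routers` are hypothetical, and the layout of `struct ipv6_mreq` (16-byte group address followed by an unsigned int interface index) is the usual BSD/Linux one.

```python
import socket
import struct

ALL_ROUTERS = "ff02::2"  # link-local all-routers multicast group


def build_ipv6_mreq(group: str, ifindex: int) -> bytes:
    """Pack a struct ipv6_mreq: group address, then the interface index."""
    return socket.inet_pton(socket.AF_INET6, group) + struct.pack("@I", ifindex)


def join_all_routers(sock: socket.socket, ifname: str) -> None:
    """Join ff02::2 on the named interface using the given IPv6 socket."""
    mreq = build_ipv6_mreq(ALL_ROUTERS, socket.if_nametoindex(ifname))
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP, mreq)
```

A router advertisement daemon would do this once per interface on its raw ICMPv6 socket, which needs root privileges; the sketch deliberately leaves socket creation to the caller.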

Kind regards,
Patrick
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: marjohn56 on October 23, 2020, 10:43:32 am
Well, I have it running on my test unit. No changes at present, just a manual stop of radvd and a manual start of rtadvd using the same config file etc., and it's working. So I'll leave it running and see what happens. I think I'll do the same on my live router, as that will get a lot more action. As a side note, radvd did fall over on my live machine this morning, so that proves the issue is still there, but it's very intermittent.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: skywalker007 on October 23, 2020, 11:23:19 am
Maybe a newbie question, but how would I set up a cronjob to restart radvd hourly from the UI?
I navigated to System/Settings/Cron but it seems to have only a list of predefined commands and doesn't allow custom commands. Or did I miss anything here?

thanks, Till
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: Patrick M. Hausen on October 23, 2020, 11:37:58 am
Quote from: skywalker007 on October 23, 2020, 11:23:19 am
Maybe a newbie question, but how would I set up a cronjob to restart radvd hourly from the UI?
I navigated to System/Settings/Cron but it seems to have only a list of predefined commands and doesn't allow custom commands.
It came as a surprise to me too, that you cannot execute arbitrary commands via Cron, but here you go:

Create /usr/local/opnsense/service/conf/actions.d/actions_radvd.conf with e.g. this content:
Code: [Select]
[restart]
command:/usr/local/sbin/pluginctl -s radvd restart
type:script
description:Restart radvd

Enter this command:
Code: [Select]
service configd restart

Voila - new option in the Cron UI.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: skywalker007 on October 23, 2020, 12:07:19 pm
thank you. Worked like a charm!
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: mimugmail on October 23, 2020, 12:09:31 pm
It's important that it has a description
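
A quick sanity check along these lines could catch a missing description before configd is restarted. This is only a sketch: the set of required keys is an assumption taken from the example action file in this thread, not from configd's own parser, and `missing_action_keys` is a hypothetical helper name.

```python
import configparser

# Assumption: these are the keys used in the example action file in this thread.
REQUIRED_KEYS = ("command", "type", "description")


def missing_action_keys(text: str) -> dict:
    """Map each [section] of an action file to the required keys it lacks."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    problems = {}
    for section in parser.sections():
        missing = [
            key for key in REQUIRED_KEYS
            if not parser.get(section, key, fallback="").strip()
        ]
        if missing:
            problems[section] = missing
    return problems


ACTION = """[restart]
command:/usr/local/sbin/pluginctl -s radvd restart
type:script
description:Restart radvd
"""

# An empty dict means every section has all three keys.
print(missing_action_keys(ACTION))
```

Running the same check on a file without the `description` line would report that key as missing for the `restart` section.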
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: agh1701 on October 23, 2020, 06:59:30 pm
is this still a problem in 20.7.4?
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: Patrick M. Hausen on October 23, 2020, 07:13:29 pm
I suppose so.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: samsonmcnulty on November 02, 2020, 02:36:03 pm
Without looking at the logs I can't say for sure, but I'm definitely still having what I assume is this problem. I'm adding the cron job and we'll see how it goes.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: madj42 on November 13, 2020, 03:51:50 am
Just chiming in.  I'm having this issue as well. Found this:

https://forum.netgate.com/topic/142363/ipv6-broken-radvd-can-t-join-ipv6-allrouters-on-interface/137
https://github.com/pfsense/FreeBSD-ports/pull/773

@franco, does this help narrow it down?
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: hb9cwp on December 23, 2020, 04:55:55 am
Having the same issue with OPNsense 20.7.7_1-amd64 on APU4 hardware. I have set up the workaround with a daily restart of radvd by cron for now (using the Cron UI as pointed out by pmhausen in Reply #19 above, https://forum.opnsense.org/index.php?topic=19032.msg90983#msg90983).
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: samsonmcnulty on December 23, 2020, 06:21:31 pm
For those just joining the party, see https://github.com/opnsense/core/issues/4338#issuecomment-732397405

We have a working fix and pull request. Running
Code: [Select]
opnsense-patch 9a4a908
will replace radvd with rtadvd and seems to rectify the issue for everyone.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: hb9cwp on December 25, 2020, 09:05:35 am
Quote from: samsonmcnulty on December 23, 2020, 06:21:31 pm
For those just joining the party, see https://github.com/opnsense/core/issues/4338#issuecomment-732397405

Thank you for the pointer.

The opnsense-patch 9a4a908 applied cleanly to OPNsense 20.7.7_1-amd64, and rtadvd has been running for 15+ hours after reloading the WebUI and restarting the Router Advertisement Daemon manually. I have not rebooted so far, to avoid loss of connectivity and BGP route flaps upstream.

Now keeping an eye on it as rtadvd approaches the 20-hour mark, around which radvd got stuck, started to fill the router log with its messages, and required a restart.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: franco on December 25, 2020, 09:33:35 pm
I've dug through kernel changes for FreeBSD 12 to find something that would indicate radvd stopped working the way it used to when we were still on 11. Although I'm not sure this isn't the new reality, I can't say that moving from radvd to rtadvd is the obvious solution if we understand that radvd works pretty much how we want it to and all we did was move from FreeBSD 11 to 12.

I'm also trying a new approach for the BSD-based fix that FreeBSD carries exclusively (not part of upstream for diversity reasons most likely) more closely resembling the way that rtadvd handles its multicast group join internally.

Hopefully that will make radvd usable again in OPNsense in 2021 for the affected users.


Cheers,
Franco
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: marjohn56 on December 27, 2020, 08:46:09 am
I was a little concerned, as I said to you, that when I created the rtadvd patch there appeared to be features in radvd that have no equivalent in rtadvd. However, after a couple of months of people running it, there seems to be nothing that doesn't work the way it should. I've just added a second commit to allow remote log functions to work.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: IsaacFL on December 28, 2020, 04:16:58 am
I was looking through the old changes and from what I could tell from the comments, the reason pfsense moved from rtadvd to radvd, was they were having problems with CARP and VIPs in ipv6 at the time with rtadvd.  Maybe rtadvd has solved the issues from back then.

It is possible that from FreeBSD 11 to 12, something else in the network stack introduced a dependency on rtadvd.

Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: samsonmcnulty on December 28, 2020, 05:18:22 am
Quote from: IsaacFL on December 28, 2020, 04:16:58 am
I was looking through the old changes and from what I could tell from the comments, the reason pfsense moved from rtadvd to radvd, was they were having problems with CARP and VIPs in ipv6 at the time with rtadvd.  Maybe rtadvd has solved the issues from back then.

It is possible that from FreeBSD 11 to 12, something else in the network stack introduced a dependency on rtadvd.
I was just having issues in a lab setup with rtadvd and CARP VIPs. I'm not sure it isn't KVM and virtualized network related, but it definitely seemed to be a contributing factor.


Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: marjohn56 on December 28, 2020, 09:50:30 am
With CARP etc., it's likely that rtadvd needs to be signalled to stop listening on the interfaces and only become active when needed. That shouldn't be too difficult to implement if rtadvd became the default daemon. CARP isn't something I play with, so I've not looked into it.
Title: Re: radvd stops announcing IPv6 prefix after a while (radvd freeze?)
Post by: blusens on February 14, 2022, 12:03:07 am
I've been using the cron script to restart radvd and dhcpv6 since 2020. I've been on 22.1 for a few weeks and haven't had the issue since. Yay!