radvd stops announcing IPv6 prefix after a while (radvd freeze?)

Started by direx, September 08, 2020, 07:53:05 AM

@pmhausen - are you running it in place of radvd? If so have you modified dhcpd.inc to create the correct config file for it, or are you just calling it manually?


- Edit -


Hmm, looks the same... I'll try it.
OPNsense 24.7 - Qotom Q355G4 - ISP - Squirrel 1Gbps.

Team Rebellion Member

Quote from: marjohn56 on October 23, 2020, 09:12:17 AM
@pmhausen - are you running it in place of radvd? If so have you modified dhcpd.inc to create the correct config file for it, or are you just calling it manually?
We are not running rtadvd on OPNsense. I just happen to work in an environment with about a hundred FreeBSD machines in total and on some of them we run rtadvd - the ones that are routers, of course.

And I was just puzzled that OPNsense includes a third-party package instead of using what is in base, especially since rtadvd has been in FreeBSD since 2000, when KAME IPv6 was integrated. Of course it has been in base that long, because *some* router advertisement daemon is mandatory for a router ;)


I am currently reading the source to find out what the two daemons might be doing differently, and I am completely flabbergasted by what I see in the radvd source. In the original upstream code, still in their git repo, the function

setup_allrouters_membership()

is just a stub with a single "return 0;" statement. The code to actually join the all routers multicast group was added by the port maintainer and just recently improved/fixed by @franco.
https://svnweb.freebsd.org/ports/head/net/radvd/files/patch-device-bsd44.c?view=log


Second, from the control flow the function should be called only once at startup of the daemon when the interface is initialised. So I wonder why the group is joined repeatedly (?) until some kernel bug kicks in. Is that the case or did I completely misunderstand the problem?


The actual code to join the mcast group looks more or less identical for both, rtadvd's is here:
https://svnweb.freebsd.org/base/releng/12.1/usr.sbin/rtadvd/if.c?revision=352546&view=markup

It's essentially
setsockopt(sock, IPPROTO_IPV6, IPV6_JOIN_GROUP, &mreq, sizeof(mreq))
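
For readers following along, here is a minimal sketch of what such a join looks like when it is done once at interface setup. This is not the code from rtadvd or from the radvd port patch; the function names build_allrouters_mreq() and join_allrouters() are made up for illustration. It also assumes that a repeated join on the same socket fails with EADDRINUSE, which the sketch treats as success so that calling it twice is harmless.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: fill in the membership request for the
 * link-local all-routers group (ff02::2) on the given interface index.
 * Returns 0 on success, -1 if the address could not be parsed.
 */
int build_allrouters_mreq(unsigned int ifindex, struct ipv6_mreq *mreq)
{
    memset(mreq, 0, sizeof(*mreq));
    if (inet_pton(AF_INET6, "ff02::2", &mreq->ipv6mr_multiaddr) != 1)
        return -1;
    mreq->ipv6mr_interface = ifindex;
    return 0;
}

/*
 * Hypothetical helper: join the all-routers group on ifname using an
 * already opened IPv6 socket. Returns 0 on success, -1 on failure.
 */
int join_allrouters(int sock, const char *ifname)
{
    struct ipv6_mreq mreq;
    unsigned int ifindex = if_nametoindex(ifname);

    if (ifindex == 0)
        return -1; /* no such interface */
    if (build_allrouters_mreq(ifindex, &mreq) != 0)
        return -1;

    if (setsockopt(sock, IPPROTO_IPV6, IPV6_JOIN_GROUP,
                   &mreq, sizeof(mreq)) != 0) {
        /* Assumption: EADDRINUSE means this socket is already a member. */
        if (errno == EADDRINUSE)
            return 0;
        return -1;
    }
    return 0;
}
```

A daemon would call join_allrouters() once per advertising interface, right after opening its ICMPv6 socket, and treat a -1 return as a reason to log errno.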


Kind regards,
Patrick
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Well, I have it running on my test unit. No changes at present, just a manual stop of radvd and a manual start of rtadvd using the same config file, and it's working. I'll leave it running and see what happens. I think I'll do the same on my live router, as that will get a lot more action. As a side note, radvd did fall over on my live machine this morning, which shows the issue is still there, but it's very intermittent.

Maybe a newbie question, but how would I set up a cronjob to restart radvd hourly from the UI?
I navigated to System/Settings/Cron but it seems to have only a list of predefined commands and doesn't allow custom commands. Or did I miss anything here?

thanks, Till
System1: Qotom Q310G4
System2: APU2C4

Quote from: skywalker007 on October 23, 2020, 11:23:19 AM
Maybe a newbie question, but how would I set up a cronjob to restart radvd hourly from the UI?
I navigated to System/Settings/Cron but it seems to have only a list of predefined commands and doesn't allow custom commands.
It came as a surprise to me too that you cannot execute arbitrary commands via Cron, but here you go:

Create /usr/local/opnsense/service/conf/actions.d/actions_radvd.conf with e.g. this content:

[restart]
command:/usr/local/sbin/pluginctl -s radvd restart
type:script
description:Restart radvd


Then restart configd so it picks up the new action:

service configd restart


Voila - new option in the Cron UI.

Thank you. Worked like a charm!



I suppose so.

Without looking at the logs I can't say for sure, but I'm definitely still having what I assume is this problem. I'm adding the cron job and we'll see how it goes.


I'm having the same issue with OPNsense 20.7.7_1-amd64 on APU4 hardware. For now I have set up the workaround of a daily radvd restart by cron (using the Cron UI as pointed out by pmhausen in Reply #19 above, https://forum.opnsense.org/index.php?topic=19032.msg90983#msg90983 ).

For those just joining the party, see https://github.com/opnsense/core/issues/4338#issuecomment-732397405

We have a working fix and a pull request. Running opnsense-patch 9a4a908 replaces radvd with rtadvd and seems to rectify the issue for everyone.

Quote from: samsonmcnulty on December 23, 2020, 06:21:31 PM
For those just joining the party, see https://github.com/opnsense/core/issues/4338#issuecomment-732397405

Thank you for the pointer.

The opnsense-patch 9a4a908 applied cleanly to OPNsense 20.7.7_1-amd64, and rtadvd has been running for 15+ hours since I reloaded the WebUI and restarted the Router Advertisement Daemon manually. I have not rebooted so far, to avoid loss of connectivity and BGP route flaps upstream.

I'm now keeping an eye on it as rtadvd approaches the 20-hour mark, which is roughly where radvd got stuck, started to fill the router log with its messages, and required a restart.

I've dug through the kernel changes for FreeBSD 12 looking for anything that would explain why radvd stopped working the way it did when we were still on 11. I can't rule out that this is simply the new reality, but I don't think moving from radvd to rtadvd is the obvious solution, given that radvd does pretty much what we want and all we did was move from FreeBSD 11 to 12.

I'm also trying a new approach for the BSD-specific fix that FreeBSD carries exclusively (most likely not part of upstream for diversity reasons), one that more closely resembles the way rtadvd handles its multicast group join internally.

Hopefully that will make radvd usable again in OPNsense in 2021 for the affected users.


Cheers,
Franco