radvd crash loop

Started by alouch, August 18, 2020, 06:58:45 PM

Hello, and please excuse me if this is not in the correct section.

I'm facing an issue with my setup. I hoped it would disappear with the upgrade to 20.7, but it's still there and it's driving me nuts.
I'm not really sure when it started, but I think it was sometime after 20.1.
I have a working v4/v6 setup, but sometimes I lose v6 connectivity because radvd enters a crash loop, and I have to restart the service for my LAN devices to get an IPv6 address again.

I have absolutely no clue what the issue might be.
I'm comfortable enough with the OS to do some debugging, but I would gladly take advice on what to look for and where.

Here is /var/log/routing.log. radvd loops through stop/start for several minutes, then settles (but not in a working state, as there are no RAs on the LAN) until I finally restart it from the GUI and my LAN servers receive their IPv6 addresses:

Aug 18 10:52:49 alhena radvd[6106]: version 2.18 started
Aug 18 10:52:54 alhena radvd[55578]: exiting, 1 sigterm(s) received
Aug 18 10:52:54 alhena radvd[55578]: sending stop adverts
Aug 18 10:52:54 alhena radvd[55578]: removing /var/run/radvd.pid
Aug 18 10:52:54 alhena radvd[55578]: returning from radvd main
Aug 18 10:52:54 alhena radvd[57388]: version 2.18 started
Aug 18 10:53:07 alhena radvd[76325]: exiting, 1 sigterm(s) received
Aug 18 10:53:07 alhena radvd[76325]: sending stop adverts
Aug 18 10:53:07 alhena radvd[76325]: removing /var/run/radvd.pid
Aug 18 10:53:07 alhena radvd[76325]: returning from radvd main
Aug 18 10:53:07 alhena radvd[23393]: version 2.18 started
Aug 18 10:53:09 alhena radvd[34864]: exiting, 1 sigterm(s) received
Aug 18 10:53:09 alhena radvd[34864]: sending stop adverts
Aug 18 10:53:09 alhena radvd[34864]: removing /var/run/radvd.pid
Aug 18 10:53:09 alhena radvd[34864]: returning from radvd main
Aug 18 10:53:09 alhena radvd[2912]: version 2.18 started
Aug 18 10:53:11 alhena radvd[52245]: exiting, 1 sigterm(s) received
Aug 18 10:53:11 alhena radvd[52245]: sending stop adverts
Aug 18 10:53:12 alhena radvd[52245]: removing /var/run/radvd.pid
Aug 18 10:53:12 alhena radvd[52245]: returning from radvd main
Aug 18 10:53:12 alhena radvd[49668]: version 2.18 started
Aug 18 10:56:48 alhena radvd[68189]: exiting, 1 sigterm(s) received
Aug 18 10:56:48 alhena radvd[68189]: sending stop adverts
Aug 18 10:56:48 alhena radvd[68189]: removing /var/run/radvd.pid
Aug 18 10:56:48 alhena radvd[68189]: returning from radvd main
Aug 18 10:56:48 alhena radvd[92788]: version 2.18 started
Aug 18 10:56:50 alhena radvd[23062]: exiting, 1 sigterm(s) received
Aug 18 10:56:50 alhena radvd[23062]: sending stop adverts
Aug 18 10:56:50 alhena radvd[23062]: removing /var/run/radvd.pid
Aug 18 10:56:50 alhena radvd[23062]: returning from radvd main
Aug 18 10:56:51 alhena radvd[73169]: version 2.18 started
Aug 18 10:56:55 alhena radvd[26003]: exiting, 1 sigterm(s) received
Aug 18 10:56:55 alhena radvd[26003]: sending stop adverts
Aug 18 10:56:55 alhena radvd[26003]: sendmsg: Network is down
Aug 18 10:56:55 alhena radvd[26003]: sendmsg: Network is down
Aug 18 10:56:55 alhena radvd[26003]: sendmsg: Network is down
Aug 18 10:56:55 alhena radvd[26003]: removing /var/run/radvd.pid
Aug 18 10:56:55 alhena radvd[26003]: returning from radvd main
Aug 18 10:56:56 alhena radvd[42053]: version 2.18 started
Aug 18 10:56:56 alhena radvd[58343]: sendmsg: Network is down
Aug 18 10:56:56 alhena radvd[58343]: sendmsg: Network is down
Aug 18 10:56:56 alhena radvd[58343]: sendmsg: Network is down
Aug 18 10:57:06 alhena radvd[58343]: exiting, 1 sigterm(s) received
Aug 18 10:57:06 alhena radvd[58343]: sending stop adverts
Aug 18 10:57:06 alhena radvd[58343]: removing /var/run/radvd.pid
Aug 18 10:57:06 alhena radvd[58343]: returning from radvd main
Aug 18 10:57:06 alhena radvd[99393]: version 2.18 started
Aug 18 10:57:08 alhena radvd[52398]: exiting, 1 sigterm(s) received
Aug 18 10:57:08 alhena radvd[52398]: sending stop adverts
Aug 18 10:57:08 alhena radvd[52398]: removing /var/run/radvd.pid
Aug 18 10:57:08 alhena radvd[52398]: returning from radvd main
Aug 18 10:57:08 alhena radvd[79273]: version 2.18 started
Aug 18 10:57:11 alhena radvd[31680]: exiting, 1 sigterm(s) received
Aug 18 10:57:11 alhena radvd[31680]: sending stop adverts
Aug 18 10:57:11 alhena radvd[31680]: removing /var/run/radvd.pid
Aug 18 10:57:11 alhena radvd[31680]: returning from radvd main
Aug 18 10:57:11 alhena radvd[87462]: version 2.18 started
Aug 18 10:57:20 alhena radvd[9061]: exiting, 1 sigterm(s) received
Aug 18 10:57:20 alhena radvd[9061]: sending stop adverts
Aug 18 10:57:20 alhena radvd[9061]: removing /var/run/radvd.pid
Aug 18 10:57:20 alhena radvd[9061]: returning from radvd main
Aug 18 10:57:20 alhena radvd[51268]: version 2.18 started
Aug 18 10:57:22 alhena radvd[55961]: exiting, 1 sigterm(s) received
Aug 18 10:57:22 alhena radvd[55961]: sending stop adverts
Aug 18 10:57:22 alhena radvd[55961]: removing /var/run/radvd.pid
Aug 18 10:57:22 alhena radvd[55961]: returning from radvd main
Aug 18 10:57:22 alhena radvd[35898]: version 2.18 started
Aug 18 10:57:25 alhena radvd[77070]: exiting, 1 sigterm(s) received
Aug 18 10:57:25 alhena radvd[77070]: sending stop adverts
Aug 18 10:57:25 alhena radvd[77070]: removing /var/run/radvd.pid
Aug 18 10:57:25 alhena radvd[77070]: returning from radvd main
Aug 18 10:57:25 alhena radvd[64564]: version 2.18 started
Aug 18 10:57:33 alhena radvd[86410]: exiting, 1 sigterm(s) received
Aug 18 10:57:33 alhena radvd[86410]: sending stop adverts
Aug 18 10:57:33 alhena radvd[86410]: removing /var/run/radvd.pid
Aug 18 10:57:33 alhena radvd[86410]: returning from radvd main
Aug 18 10:57:33 alhena radvd[25581]: version 2.18 started
Aug 18 10:57:36 alhena radvd[39425]: exiting, 1 sigterm(s) received
Aug 18 10:57:36 alhena radvd[39425]: sending stop adverts
Aug 18 10:57:36 alhena radvd[39425]: removing /var/run/radvd.pid
Aug 18 10:57:36 alhena radvd[39425]: returning from radvd main
Aug 18 10:57:36 alhena radvd[43023]: version 2.18 started
Aug 18 10:57:38 alhena radvd[81747]: exiting, 1 sigterm(s) received
Aug 18 10:57:38 alhena radvd[81747]: sending stop adverts
Aug 18 10:57:38 alhena radvd[81747]: removing /var/run/radvd.pid
Aug 18 10:57:38 alhena radvd[81747]: returning from radvd main
Aug 18 10:57:38 alhena radvd[40649]: version 2.18 started
[At this moment I noticed that it wasn't working and restarted from the GUI]
Aug 18 13:32:31 alhena radvd[87591]: exiting, 1 sigterm(s) received
Aug 18 13:32:31 alhena radvd[87591]: sending stop adverts
Aug 18 13:32:31 alhena radvd[87591]: removing /var/run/radvd.pid
Aug 18 13:32:31 alhena radvd[87591]: returning from radvd main
Aug 18 13:32:31 alhena radvd[13600]: version 2.18 started
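
To see how dense the restart loop is, the log can be summarized per minute. The following is only a minimal sketch: it assumes the syslog line format shown above, the function name summarize is made up for illustration, and the sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Matches lines such as:
#   Aug 18 10:56:55 alhena radvd[26003]: exiting, 1 sigterm(s) received
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"radvd\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Count radvd starts and SIGTERM exits per minute, plus 'Network is down' errors."""
    starts = Counter()
    stops = Counter()
    net_down = 0
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a radvd line (e.g. a manual note in the paste)
        minute = m.group("ts")[:-3]  # drop the ":SS" part
        msg = m.group("msg")
        if msg.startswith("version") and msg.endswith("started"):
            starts[minute] += 1
        elif "sigterm(s) received" in msg:
            stops[minute] += 1
        elif "Network is down" in msg:
            net_down += 1
    return starts, stops, net_down


# Sample lines taken from the excerpt above.
SAMPLE = """\
Aug 18 10:56:51 alhena radvd[73169]: version 2.18 started
Aug 18 10:56:55 alhena radvd[26003]: exiting, 1 sigterm(s) received
Aug 18 10:56:55 alhena radvd[26003]: sendmsg: Network is down
Aug 18 10:56:56 alhena radvd[42053]: version 2.18 started
Aug 18 10:57:06 alhena radvd[58343]: exiting, 1 sigterm(s) received
""".splitlines()

if __name__ == "__main__":
    starts, stops, net_down = summarize(SAMPLE)
    for minute in sorted(set(starts) | set(stops)):
        print(minute, "starts:", starts[minute], "sigterm exits:", stops[minute])
    print("'Network is down' errors:", net_down)
```

To run it against the real file, pass open("/var/log/routing.log") instead of SAMPLE. Each exit here follows a SIGTERM, so something outside radvd appears to be restarting it; a burst of restarts in one minute is a hint to check the system log for what happened on the interface at that time.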



Thanks

Not sure what the solution to this is, but you could enable manual RA settings for that interface under Interfaces. Once that is enabled on the interface, you get some options under Services for DHCPv6 and Router Advertisements. I've been looking at radvd for another issue... not making headway... but it's not crashing.

HP T730/AMD  RX-427BB/8GB/500GB SSD
HP NC365T 4-PORT

Hi there,

Without the system log for the same time period, this is impossible to analyse.


Cheers,
Franco

August 19, 2020, 09:47:50 AM #3 Last Edit: August 19, 2020, 09:58:20 AM by alouch
Thanks for the reply gpb and franco.

I should have detailed the setup a bit more.

My WAN interface (igb0_vlan832) obtains IPv4/IPv6 by DHCP.
LAN interface (igb1, a physical NIC with several trunked VLANs): static IPv4 on all of them, and "track interface" (WAN) for IPv6 on 3 of them.
Router Advertisements is then configured as "unmanaged" (SLAAC) for these 3 VLANs.

I've looked at the system log during the event, and I noticed a link state change on my LAN port which I can't explain. Could it be a hardware issue?
The box running OPNsense is a well-known Qotom device with 4 Intel I211 NICs. It uses the igb driver, and following https://www.freebsd.org/cgi/man.cgi?igb(4) I enabled checksum offload, TSO, and hardware VLAN acceleration.

The system.log grepped for the same time frame is too verbose to paste here, so please find it on pastebin:
https://pastebin.com/84jXVBTz

The first thing I'd try is disabling the hardware offloading, as recommended on the page linked below. If you still have the problem, you can work forward from there.

https://docs.opnsense.org/manual/interfaces_settings.html
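
To confirm which offloads are actually active after changing the settings, the options line printed by ifconfig for the interface can be checked. This is only a rough sketch: the helper name enabled_offloads is hypothetical, and the sample output (including its hex value) is made up for illustration rather than taken from this box.

```python
import re

# Capability names as printed by FreeBSD ifconfig that relate to
# checksum offload, TSO, LRO and hardware VLAN handling.
OFFLOAD_FLAGS = {
    "RXCSUM", "TXCSUM", "RXCSUM_IPV6", "TXCSUM_IPV6",
    "TSO4", "TSO6", "LRO",
    "VLAN_HWTAGGING", "VLAN_HWCSUM", "VLAN_HWTSO", "VLAN_HWFILTER",
}


def enabled_offloads(ifconfig_output):
    """Return the offload-related flags found on the 'options=' line, sorted."""
    m = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_output)
    if not m:
        return []
    flags = m.group(1).split(",")
    return sorted(f for f in flags if f in OFFLOAD_FLAGS)


# Illustrative output only; the hex value and flag set are invented.
SAMPLE = (
    "igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
    "\toptions=4e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,LRO>\n"
)

if __name__ == "__main__":
    print("offloads still enabled:", enabled_offloads(SAMPLE))
```

On the firewall itself, the input would be the text printed by ifconfig igb1. An empty result would suggest the offloads are off for that interface.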

Thanks gpb,

I'll try that and report back. I put my faith in the FreeBSD driver support, but it may indeed be broken.

Hello,

Just to report that so far, so good. No more crashes of the interface or radvd since hardware acceleration was disabled on the interface.
When I have more time, I may dig into this to check which option is causing it. It's a shame, though, that the FreeBSD man page for this driver advertises hardware/software compatibility for these options.

Anyway, thanks to both of you.