OPNsense Forum

Archive => 20.7 Legacy Series => Topic started by: Greelan on December 28, 2020, 01:43:26 AM

Title: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: Greelan on December 28, 2020, 01:43:26 AM
I've noticed a quirky issue that I'm hopeful someone has seen before or can suggest troubleshooting steps for. I've searched online but can't find any similar situations.

Almost exactly every 48 hours (around 3am every second day), I see a LAN detached event in OPNsense's logs, for example:

2020-12-25T03:00:14 opnsense[939] /usr/local/etc/rc.linkup: DEVD Ethernet detached event for lan

This causes dhcp6c to restart, so it goes through its process of sending a release, soliciting for an IPv6 address/prefix on the WAN, and getting an advertise of WAN GUA and prefix. Sometimes it gets to the point of requesting the address/prefix. It is interrupted though by another detached event, this time on one of the VLAN interfaces (say OPT1) that is on the same interface as the LAN.

The process then repeats, cycling through every VLAN (OPT2, OPT3 etc).

Then there is a series of attached events, again for the LAN interface and every VLAN in sequence. For each attached event, dhcp6c restarts and goes through its process (or part of its process).

This whole process of detached and attached events then repeats itself, sometimes once, sometimes two or more times.

This all lasts maybe 30 to 45 seconds in the logs. Most times it stops after a while and everything seems to return to normal.
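
To see how many times each interface goes down and up during one of these episodes, the events can be tallied from the log. This is only a rough sketch: `count_link_events` is a made-up helper name, and it assumes the attached lines look exactly like the detached line quoted above, with "attached" in place of "detached".

```shell
#!/bin/sh
# Tally rc.linkup events read from stdin, one output line per (kind, interface).
# Assumption: attach lines mirror the detach line format, e.g.
#   ... /usr/local/etc/rc.linkup: DEVD Ethernet attached event for lan
count_link_events() {
    awk '/rc\.linkup: DEVD Ethernet (attached|detached) event for / {
             # $(NF-3) is "attached" or "detached", $NF is the interface name
             n[$(NF-3) " " $NF]++
         }
         END { for (k in n) print n[k], k }' | sort -k2,3
}
```

Feed it the relevant part of the system log on stdin (for example, lines copied out of the GUI log view into a file) and it prints one count per event kind and interface.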

But on occasion it causes dhcp6c to fail. A few minutes after the attached/detached events cycle stops, dhcp6c reports an "XID mismatch", and then dhclient goes into a cycle of "Creating resolv.conf" every 15 minutes.

The end result is that the WAN GUA and prefix disappear, and there is no external IPv6 connectivity. IPv4 is unaffected.

Any ideas?

Versions:
OPNsense 20.7.7_1-amd64
FreeBSD 12.1-RELEASE-p11-HBSD
OpenSSL 1.1.1i 8 Dec 2020
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: Greelan on December 31, 2020, 06:49:27 AM
Hoping someone has some thoughts on this. It happened again this morning on cue and I lost external IPv6 connectivity. The IPv6 loss now seems to happen every 4 or 6 days.

There must be something that is running to a schedule that causes the LAN detached event to happen on such a specific timetable. But I don't know whether it is OPNsense or something else
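
One way to pin down the timetable is to pull the first LAN detached event of each day out of the log and compare the dates and times. A minimal sketch, assuming log lines in the format quoted in the first post; `lan_detach_starts` is a hypothetical helper name.

```shell
#!/bin/sh
# Print "DATE TIME" of the first LAN detached event on each day (log on stdin).
# Assumes the timestamp is the first field, in the form 2020-12-25T03:00:14.
lan_detach_starts() {
    awk '/DEVD Ethernet detached event for lan$/ {
             split($1, t, "T")          # t[1] = date, t[2] = time
             if (!(t[1] in seen)) {     # only report the first event per day
                 seen[t[1]] = 1
                 print t[1], t[2]
             }
         }'
}
```

If the dates come out evenly spaced and the times cluster around the same minute, that points to a scheduled job rather than a flaky link.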
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: Greelan on January 06, 2021, 01:58:45 AM
I've now resorted to running a custom cronjob a bit after 3am each day to check for external IPv6 connectivity and if there is none to restart dhcp6c. Bit hacky, but means I don't need to check and restart manually every few days.

I really would like to solve the underlying issue though. Still no-one out there with any thoughts?
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: donatom3 on January 10, 2021, 06:12:34 PM
Mind sharing the cron job? I too have noticed the IPv6 disconnection issue, and I have to manually restart radvd to get external IPv6 working again. I never dug into the logs to see what was happening when it started.
I'll keep an eye out for it.
Title: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: Greelan on January 10, 2021, 09:05:31 PM
No problem. My script is a simplified version of something marjohn56 posted in the forum for an unrelated but similar issue. I run the cronjob once a minute for 5 minutes, starting at 3:03am each day.

Contents of /usr/local/sbin/ping6_check.sh:

#!/bin/sh
# Script to test IPv6 connectivity and restart dhcp6c if necessary

# Try a few pings to Cloudflare's IPv6 servers.
# Quit immediately if we get a single frame back.
# If neither server responds at all then restart dhcp6c.

counting=$(ping6 -o -c 10 2606:4700:4700::1111 | grep 'received' | awk -F',' '{ print $2 }' | awk '{ print $1 }')

# An empty result (ping6 printed no statistics line) is treated as zero replies.
if [ "${counting:-0}" -eq 0 ]; then

  counting=$(ping6 -o -c 10 2606:4700:4700::1001 | grep 'received' | awk -F',' '{ print $2 }' | awk '{ print $1 }')

  if [ "${counting:-0}" -eq 0 ]; then

    # Restart dhcp6c

    service dhcp6c restart

  fi
fi
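
For anyone adapting this, the same check can be done without parsing ping6's output, since FreeBSD's ping6 exits with status 0 when at least one reply comes back. The following is a sketch of that variant, not the script I actually run; the `ipv6_reachable` function and the `PING6` override variable are names made up for illustration.

```shell
#!/bin/sh
# Variant of the check that relies on ping6's exit status instead of its output.
# PING6 can be overridden so the logic can be exercised without a network.
PING6="${PING6:-ping6}"

# Return 0 as soon as any of the given hosts answers, 1 if none do.
ipv6_reachable() {
    for host in "$@"; do
        # -o: exit successfully after the first reply; -c 10: stop after 10 probes
        if "$PING6" -o -c 10 "$host" >/dev/null 2>&1; then
            return 0
        fi
    done
    return 1
}

# Intended use (commented out so that sourcing this file has no side effects):
# ipv6_reachable 2606:4700:4700::1111 2606:4700:4700::1001 || service dhcp6c restart
```

The behaviour is meant to match the script above: the second server is only tried if the first gives no reply, and dhcp6c is only restarted if both fail.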


Contents of /usr/local/opnsense/service/conf/actions.d/actions_ping6_check.conf:

[load]
command:/usr/local/sbin/ping6_check.sh
parameters:
type:script
message:starting IPv6 connectivity check
description:Run IPv6 check


After setting these up, as root run the following to get the job to appear in the cronjob list in the GUI:

service configd restart

Yesterday I had a hunch that Sensei might be behind this behaviour. For example, it might be doing some sort of refresh, update or health check every 2 days. I had Sensei configured on the LAN interface. As a check I have disabled it for the time being and will see whether the behaviour continues.

After my hunch, a search turned up your post about Sensei and IPv6 (https://forum.opnsense.org/index.php?topic=9521.msg55708#msg55708), which shows logs somewhat similar to what I have been seeing. That makes me think my hunch may be right.
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: Greelan on January 11, 2021, 10:37:34 PM
Can confirm it is Sensei causing this issue. I was "due" for a LAN detached/attached sequence this morning and it didn't occur with Sensei off

I will raise this with the Sensei folks
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: mb on January 22, 2021, 03:57:25 AM
Hi @Greelan, thanks for the heads-up!

Yes, we confirm that this is due to the Sensei engine restarting at 3am. This was a dirty workaround to avoid some nasty netmap bugs. Basically, a restart of the process also refreshed netmap's internal data structures. When the process exits and starts again, netmap closes and reopens the interface, which causes the interface down/up events.

Since netmap is a lot more stable now, we believe we don't need this anymore. We have removed it in the upcoming 1.7 release.

1.7 is scheduled for tomorrow/this weekend. Stay tuned.
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: Greelan on January 22, 2021, 04:02:25 AM
@mb, aah, I see. Thanks for the explanation. "Dirty workaround" indeed! Particularly as the interfaces went down and up multiple times, which I think is what led to dhcp6c borking.

I look forward to the new release, and assuming all is well being able to re-enable the Sensei engine.
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: Greelan on January 26, 2021, 12:40:47 PM
Have now installed Sensei 1.7 on OPNsense 20.7.8 and re-enabled it. Will monitor and report back if anything negative. Thanks
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: mb on January 26, 2021, 04:34:00 PM
@Greelan, thanks, looking forward to it.
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: Greelan on January 27, 2021, 12:19:50 AM
Dumb question. Now that Security, App Controls and Web Controls have been merged under Policies, do I need to separately configure the default policy to select the interface and the VLANs on that interface? Previously I'd understood that it was sufficient simply to select the parent protected interface (LAN in my case) under Configuration > General. Currently under Policy Configuration in the default policy the LAN interface is not selected and no VLANs are specified. Thanks
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: mb on January 27, 2021, 01:01:40 AM
@Greelan, not really. The default policy already matches everything that passes through the protected interfaces you've set up in Configuration.
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: Greelan on January 27, 2021, 01:04:31 AM
Got it, thanks. Presumably then the options in the policy could be used to exclude VLANs if desired, or is that only possible with the premium edition?
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: mb on January 27, 2021, 01:07:05 AM
Yes, correct. That requires the creation of additional policies which are part of paid subscriptions.
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: Greelan on January 27, 2021, 01:07:17 AM
Also, are you saying that VLANs aren't protected unless specifically selected under Configuration? I'd understood previously that only the parent interface needed to be selected in order to protect VLANs, and indeed that it was not desirable to select VLANs specifically as that could cause issues
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: mb on January 27, 2021, 01:09:07 AM
You are right. If you select the parent interface, you are also protecting the vlans on it.
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: Greelan on January 27, 2021, 01:11:13 AM
Thanks, @mb, appreciate the input
Title: Re: Regular LAN detached event, sometimes results in failure of resolv.conf and IPv6
Post by: mb on January 27, 2021, 01:13:22 AM
Always a pleasure