OPNsense Forum

English Forums => 23.7 Legacy Series => Topic started by: enpassant on November 22, 2023, 06:07:34 pm

Title: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: enpassant on November 22, 2023, 06:07:34 pm
I have 23.7.8 installed and had the same problem with 23.7.8_1. Periodically, the system will crash, and Unbound will cycle on and off, but reporting of Unbound will become inactive.

I have had these problems over the last few weeks. Sometimes there's a crash, but other times not. I am unsure which logs are most helpful, but I am attaching a screenshot of System > Log Files > General and of the Unbound reporting screen. Any help will be appreciated. I tried everything I knew before writing this post.
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: enpassant on November 22, 2023, 06:08:19 pm
Here's another screenshot.
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: CJ on November 22, 2023, 09:13:08 pm
Something is going on with your system.  What's going on with all of the ARP failures?
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: enpassant on November 22, 2023, 09:50:25 pm
I have no idea. I am trying to avoid totally rebuilding the system. If there is a guide on best ways to rebuild after a failure, let me know.
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: knebb on November 23, 2023, 10:53:28 am
Hi,

I guess you configured something with a static ARP entry. The message you see above comes from arp trying to insert such an entry.

Check if you have something like this configured (and possibly changed IPs since then).

See this thread (https://forum.opnsense.org/index.php?topic=33127.0), it might be helpful in finding such entries.
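
One rough way to look for such entries is to filter the output of `arp -an` for the `permanent` flag. The following is only a sketch: it assumes the FreeBSD-style output format shown in the comment, the sample addresses are made up, and the firewall's own interface addresses are normally listed as permanent too, so not every hit is a configured static entry.

```python
import re

# Matches FreeBSD-style `arp -an` lines, for example:
# ? (192.168.1.10) at 00:11:22:33:44:55 on igb1 permanent [ethernet]
ARP_LINE = re.compile(
    r"\((?P<ip>[0-9.]+)\) at (?P<mac>[0-9a-f:]{17}) on (?P<iface>\S+)(?P<rest>.*)"
)


def permanent_entries(arp_output):
    """Return (ip, mac, iface) tuples for entries flagged 'permanent'."""
    found = []
    for line in arp_output.splitlines():
        m = ARP_LINE.search(line)
        # Dynamic entries carry "expires in N seconds" instead of "permanent".
        if m and "permanent" in m.group("rest").split():
            found.append((m.group("ip"), m.group("mac"), m.group("iface")))
    return found


# Hypothetical sample output; the addresses are invented for illustration.
SAMPLE = """\
? (192.168.1.1) at 00:11:22:33:44:01 on igb1 permanent [ethernet]
? (192.168.1.50) at 00:11:22:33:44:50 on igb1 expires in 1187 seconds [ethernet]
? (192.168.1.60) at 00:11:22:33:44:60 on igb1 permanent [ethernet]
"""

if __name__ == "__main__":
    for ip, mac, iface in permanent_entries(SAMPLE):
        print(ip, mac, iface)
```

To use it on a real system, paste the output of `arp -an` into a string (or read it from a file) and pass it to `permanent_entries`.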

/KNEBB
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: franco on November 23, 2023, 11:05:00 am
Can you post the CPU info from the dashboard system widget?


Cheers,
Franco
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: enpassant on November 23, 2023, 04:58:09 pm
I sure can, @Franco. See below.

Yesterday, I did a factory reset and applied a configuration backup. I checked the logs, and the ARP errors are still happening.

@Knebb and @CJ, I set up all my static IPs with ARP. I thought that was best practice for security purposes. Let me know if I am wrong about that. At this point, I am just waiting for the next crash to get more information to share.
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: CJ on November 26, 2023, 06:13:46 pm
I just use static leases.  You can configure them to automatically get added to Unbound DNS.
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: enpassant on November 26, 2023, 07:35:29 pm
Got it, CJ. I can remove that ARP designation in my static leases. Is there any benefit to ARP other than if I need it for WOL? At some point, when I first made my static leases, I checked that box and have not looked back since. I also don't know how long I have had these ARP errors.

I moved my DNS to AdGuard Home in a Docker container in hopes of addressing the Unbound crashing. I have it using Unbound upstream, removed blocklisting from Unbound, and am letting AdGuard do the filtering. I hope this solves my problem until I figure out the configuration that will keep my system stable over the long run. I will keep folks posted.
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: knebb on December 03, 2023, 10:33:58 am
Any updates?

I guess it was related to the static arp.

Static ARP is only needed in very rare circumstances, and usually only for very few hosts. So the recommendation is NOT to use static ARP unless you really know what you are doing.

Use static DHCP mappings and it should be fine!

/KNEBB


Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: enpassant on December 03, 2023, 04:56:52 pm
Update: I think there were two separate issues. I did remove the ARP entries. Even when unticking ARP on the main page of the DHCP interfaces, I still had to go and remove it from each static entry. I kept getting errors and I am still watching.

Also, I stopped using Unbound for my filtering. I have seen other posts about folks losing functionality after 23.7.5 or so, which correlated with my crashes. I am now using AdGuard in a Docker container. It’s working better, but I wish I had all the network management happening in OPNsense.

The system is much more stable now, and I have had no crashes or freezes since the changes. I will report back if there are other issues.
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: Patrick M. Hausen on December 03, 2023, 05:09:45 pm
You can run AdGuard Home on OPNsense. There's a plugin in the community repository.
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: CJ on December 05, 2023, 07:56:19 pm
I'm using Unbound with DoT and DNSBL with the latest OPNsense version with no problems.  I don't think the people having Unbound issues are the majority of folks.
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: enpassant on December 07, 2023, 04:39:40 pm
Got it. I am continuing to troubleshoot my setup. I had another crash overnight. Unbound was unresponsive and I had to reboot to get back up this morning.
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: enpassant on December 25, 2023, 05:59:15 pm
Hey Y'all,

I am coming back to this post after rebuilding my system. I reinstalled everything from the beginning, rebuilt the interfaces and DHCP settings, rebuilt all static addresses without ARP, etc. I am still getting periodic crashes every couple of days where Unbound becomes unresponsive; however, the other interfaces that do not use Unbound work when the Unbound interfaces crash.

I am no longer doing filtering with Unbound. I am using AGH and forwarding to Unbound for upstream DNS. I am copying the log entries around a crash that happened this morning. Let me know if you see anything that could help further diagnose this issue. Thanks to anyone who can help. This has become frustrating, especially after spending so much time rebuilding everything from scratch. I don't know what I am doing wrong.

One more thing: I am using DoT settings in Unbound for Quad9 on port 853. I am not sure if that makes a difference.
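
To rule the DoT upstream in or out, one option is to check from another machine whether a verified TLS handshake on port 853 succeeds. This is a minimal sketch, not a full DNS query: the function name `dot_reachable` is made up, and the address `9.9.9.9` with TLS name `dns.quad9.net` is assumed to match the Quad9 settings in use.

```python
import socket
import ssl


def dot_reachable(address, tls_name, port=853, timeout=5.0):
    """Return True if a certificate-verified TLS handshake succeeds."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((address, port), timeout=timeout) as raw:
            # wrap_socket performs the handshake and checks the certificate
            # against tls_name before returning.
            with context.wrap_socket(raw, server_hostname=tls_name):
                return True
    except OSError:
        # Covers refused connections, timeouts, and ssl.SSLError alike.
        return False


if __name__ == "__main__":
    # Assumed Quad9 endpoint; adjust to whatever is configured in Unbound.
    print(dot_reachable("9.9.9.9", "dns.quad9.net"))
```

A failed handshake here only shows that the upstream cannot be reached from that machine at that moment. It says nothing about why Unbound itself stops responding.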

I am also at the point of trying to use BIND as an alternative to Unbound, but the learning curve seems a bit intense for me right now. I will take any help offered. Thank you!

---

2023-12-25T04:41:56-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: ROUTING: entering configure using 'opt1'   
2023-12-25T04:41:55-08:00   Notice   kernel   <6>igb2: link state changed to UP   
2023-12-25T04:41:55-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for opt1(igb2)   
2023-12-25T04:41:52-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for opt1(igb2)   
2023-12-25T04:41:51-08:00   Notice   kernel   <6>igb2: link state changed to DOWN   
2023-12-25T04:39:08-08:00   Error   configctl   error in configd communication Traceback (most recent call last): File "/usr/local/sbin/configctl", line 66, in exec_config_cmd line = sock.recv(65536).decode() socket.timeout: timed out   
2023-12-25T04:37:09-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dns (execute task : unbound_configure_do())   
2023-12-25T04:37:08-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dns (execute task : dnsmasq_configure_do())   
2023-12-25T04:37:08-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dns ()   
2023-12-25T04:37:08-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dhcp (execute task : dhcpd_dhcp_configure())   
2023-12-25T04:37:08-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dhcp ()   
2023-12-25T04:37:08-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure ipsec (execute task : ipsec_configure_do(,opt1))   
2023-12-25T04:37:08-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure ipsec (,opt1)   
2023-12-25T04:37:08-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: ROUTING: entering configure using 'opt1'   
2023-12-25T04:37:08-08:00   Notice   kernel   <6>igb2: link state changed to UP   
2023-12-25T04:37:08-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for opt1(igb2)   
2023-12-25T04:37:04-08:00   Notice   kernel   <6>igb2: link state changed to DOWN   
2023-12-25T04:37:04-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for opt1(igb2)   
2023-12-25T04:35:31-08:00   Error   configctl   error in configd communication Traceback (most recent call last): File "/usr/local/sbin/configctl", line 66, in exec_config_cmd line = sock.recv(65536).decode() socket.timeout: timed out   
2023-12-25T04:34:00-08:00   Notice   send_telemetry.py   telemetry data collected 14 records in 0.10 seconds @2023-12-25 12:33:28.457554   
2023-12-25T04:33:32-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dns (execute task : unbound_configure_do())   
2023-12-25T04:33:31-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dns (execute task : dnsmasq_configure_do())   
2023-12-25T04:33:31-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dns ()   
2023-12-25T04:33:31-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dhcp (execute task : dhcpd_dhcp_configure())   
2023-12-25T04:33:31-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dhcp ()   
2023-12-25T04:33:31-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure ipsec (execute task : ipsec_configure_do(,opt1))   
2023-12-25T04:33:31-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure ipsec (,opt1)   
2023-12-25T04:33:31-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: ROUTING: entering configure using 'opt1'   
2023-12-25T04:33:31-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for opt1(igb2)   
2023-12-25T04:33:31-08:00   Notice   kernel   <6>igb2: link state changed to UP   
2023-12-25T04:33:24-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for opt1(igb2)   
2023-12-25T04:33:23-08:00   Notice   kernel   <6>igb2: link state changed to DOWN   
2023-12-25T04:33:23-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dns (execute task : unbound_configure_do())   
2023-12-25T04:33:22-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dns (execute task : dnsmasq_configure_do())   
2023-12-25T04:33:22-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dns ()   
2023-12-25T04:33:22-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dhcp (execute task : dhcpd_dhcp_configure())   
2023-12-25T04:33:22-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure dhcp ()   
2023-12-25T04:33:22-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure ipsec (execute task : ipsec_configure_do(,opt1))   
2023-12-25T04:33:22-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: plugins_configure ipsec (,opt1)   
2023-12-25T04:33:22-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: ROUTING: entering configure using 'opt1'   
2023-12-25T04:33:22-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for opt1(igb2)   
2023-12-25T04:33:21-08:00   Notice   kernel   <6>igb2: link state changed to UP   
2023-12-25T04:33:18-08:00   Notice   kernel   <6>igb2: link state changed to DOWN   
2023-12-25T04:33:18-08:00   Notice   opnsense   /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for opt1(igb2)   
2023-12-25T04:31:00-08:00   Notice   root   reload filter for configured schedules
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: cookiemonster on December 25, 2023, 10:32:57 pm
Code:
Notice   kernel   <6>igb2: link state changed to DOWN
This is the most relevant entry in this log snippet. It tells us that the interface, for some reason, became "detached" from the system. The entries for services going down and coming back up after the interface does are what we would expect.
The next step in the diagnosis is to find out why the interface is going down.
It can be a physical problem like a loose connection, but it doesn't have to be.
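
To see how often an interface flaps, the link-state lines can be pulled out of an exported log and counted. This is a rough sketch that assumes the whitespace-separated layout of the log pasted above; the function names are made up, and the sample lines are taken from that log.

```python
import re
from collections import Counter
from datetime import datetime

# Matches kernel link-state lines in the layout of the pasted log.
LINK_EVENT = re.compile(
    r"^(?P<ts>\S+)\s+Notice\s+kernel\s+<6>(?P<iface>\w+): "
    r"link state changed to (?P<state>UP|DOWN)"
)


def link_events(log_text):
    """Yield (timestamp, interface, state) for each link-state line."""
    for line in log_text.splitlines():
        m = LINK_EVENT.match(line.strip())
        if m:
            ts = datetime.fromisoformat(m.group("ts"))
            yield ts, m.group("iface"), m.group("state")


def down_counts(log_text):
    """Count DOWN transitions per interface."""
    return Counter(
        iface for _, iface, state in link_events(log_text) if state == "DOWN"
    )


# Four lines copied from the log above (newest first, as exported).
SAMPLE = """\
2023-12-25T04:41:55-08:00   Notice   kernel   <6>igb2: link state changed to UP
2023-12-25T04:41:51-08:00   Notice   kernel   <6>igb2: link state changed to DOWN
2023-12-25T04:37:08-08:00   Notice   kernel   <6>igb2: link state changed to UP
2023-12-25T04:37:04-08:00   Notice   kernel   <6>igb2: link state changed to DOWN
"""

if __name__ == "__main__":
    for iface, count in down_counts(SAMPLE).items():
        print(iface, count)
```

Many DOWN events on one interface within a few minutes point at that port, cable, or link partner rather than at Unbound.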
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: enpassant on December 26, 2023, 03:58:14 pm
Thank you for the reply. I will check with the vendor and do hardware tests.
Title: Re: 23.7.8 crashes periodically and unbound reporting becomes inactive. Need assist.
Post by: enpassant on January 10, 2024, 05:42:28 pm
I am returning to this to let folks know I found a solution. The combination of the following changes has my network back to normal, with no crashes or errors in the system logs lately.

1. I am now using Suricata on the LAN and WAN interfaces.
2. I am now using AdGuard Home to do filtering with Unbound as the upstream resolver.
3. I was getting some interface flapping on igb2, my management NIC. I deleted that interface and moved my management network to a VLAN.
4. I was getting a WireGuard interface loop, so I set up a restart instance in Monit using these instructions: https://forum.opnsense.org/index.php?topic=35919.0

Since making these changes, Unbound has been working smoothly - no issues, no crashes, and reporting is working. I also enjoy the AdGuard interface for filtering. Problems solved.