Unbound service routinely stopping/crashing following 20.7.7 update

Started by deejacker, December 18, 2020, 09:22:56 AM


[/quote]

Also, "break a device" is used opportunistically here. The device isn't bricked. The admin can still do something (if actually necessary, see first point).

Cheers,
Franco
[/quote]

My device was "not functional". I had to reinstall, as it was not possible to log in or do anything else: no screen would paint, nor anything else. I reinstalled and restored from a previous backup, chose not to use Unbound, and things are back to normal.

I used my word choices on purpose as I want to make sure the feelings are conveyed as the update will no doubt cause problems for many other people until it is removed or a new update is provided. It's just bad form.






You should still be able to SSH into the router.
You can then select the shell option
and issue: opnsense-revert -r 20.7.6 unbound
Then reboot the router manually or with sudo reboot.
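
The rollback steps above can be sketched as a small script that prints the commands first, so they can be checked before anything is changed. This is only a sketch: DRY_RUN, run and revert_unbound are made-up names, not OPNsense tooling, and it assumes a root shell, so sudo is left out.

```shell
#!/bin/sh
# Sketch: print (or run) the Unbound rollback described in this thread.
# DRY_RUN, run and revert_unbound are hypothetical helper names.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

revert_unbound() {
    release="$1"   # release to take the package from, e.g. 20.7.6
    run opnsense-revert -r "$release" unbound
    run reboot
}
```

Calling `revert_unbound 20.7.6` lists both commands; setting DRY_RUN=0 beforehand makes it actually execute them, including the reboot.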

Quote from: Archanfel80 on December 20, 2020, 09:45:33 PM
So it pretty much affects almost everyone, not just a few people.

I respectfully disagree with this generalisation due to the aforementioned points.

Quote from: Archanfel80 on December 20, 2020, 09:45:33 PM
Disabling Unbound and using Dnsmasq solves the issue.

Yes, that actually works, too.

Quote from: Animosity022 on December 20, 2020, 10:18:32 PM
My device was "not functional". I had to reinstall, as it was not possible to log in or do anything else: no screen would paint, nor anything else.

I thought we were talking about Unbound here. If this was Unbound, it would have had to cause a kernel panic and disintegrate the root file system due to a forced reboot. I have no reports that suggest that this is the case. That would also be a scenario where a hotfix would be necessary, if one existed, for either Unbound or the kernel to stop crashing the OS itself.


Cheers,
Franco

Quote from: Animosity022 on December 20, 2020, 10:18:32 PM
My device was "not functional". I had to reinstall, as it was not possible to log in or do anything else: no screen would paint, nor anything else.

I thought we were talking about Unbound here. If this was Unbound, it would have had to cause a kernel panic and disintegrate the root file system due to a forced reboot. I have no reports that suggest that this is the case. That would also be a scenario where a hotfix would be necessary, if one existed, for either Unbound or the kernel to stop crashing the OS itself.


Cheers,
Franco
[/quote]

No reports? So I'm making up my situation for what point? Others are making it up posting as well?

It didn't cause a kernel panic. It caused my system to be not functional as I said. Login screen didn't work. Bandwidth went to almost nothing as pages wouldn't display. I'd assume it was due to the repeated crashes and backlog on the system.

All in all, good luck. You tend to be combative with users who are reporting bugs, and I'm moving on to a different, more stable solution. When many users, as seen in just this thread, report something, you belittle our situation and refuse to listen to feedback.

Guys/Gals, please be courteous when reporting, be specific about the problem, and make sure it's related to the open thread/topic.

Without more information (how it crashes, logs, etc.), I don't think anyone can help much, other than by guessing or by guessing from experience.

For myself, I haven't had any issue at all with Unbound since upgrading from 20.7 -> 20.7.6 -> 20.7.7_1. I only run the system for at most 23 hours per day, after which it is powered off. Since I saw the reports of Unbound terminating abnormally, I have kept tabs on the Unbound processes and logs, and it has been fine for me: no issue starting from a cold boot, no process terminating abnormally, no Unbound SIGSEGV/segfault.
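
Keeping tabs on the logs, as described above, can be partly automated. This is a minimal sketch. It assumes crash messages land in a plain-text log and that a crash line mentions unbound together with SIGSEGV, segfault or "signal 11". The function name is made up, and the log path in the usage note is an assumption that may differ per install.

```shell
#!/bin/sh
# count_unbound_crashes FILE
# Count log lines that mention unbound together with a crash signature.
# The signature list is an assumption about how the crash is logged.
count_unbound_crashes() {
    # grep -c exits non-zero when the count is 0, so mask that status
    grep -i 'unbound' "$1" | grep -Eic 'sigsegv|segfault|signal 11' || true
}
```

For example, `count_unbound_crashes /var/log/system.log` prints the number of matching lines. A non-zero count, together with the matching lines themselves, is the kind of detail that makes a report actionable.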

I do have custom settings for Unbound, e.g.:
Bind to certain interface only
DNS over TLS <few IPaddress@853>
DNSSEC enabled
Message Cache size 10MB
Access List <few internal IPaddress>


Of course I would like a patch similar to the one applied in FreeBSD ports (https://github.com/NLnetLabs/unbound/issues/376 OR https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=251821) to be applied here as well, if it fixes certain scenarios. If an example of how to log an issue is needed, please refer to the links. They provide at least some log information, a description, and possibly the scenario.


As for the login screen that didn't work, I'm guessing it is a separate issue that was posted in this forum, and/or something related to the HTTP redirect that is fixed in 20.7.7_1, OR related to https://forum.opnsense.org/index.php?topic=20514.0

In the best case every update/upgrade works; sometimes we have to work with a temporary solution and, of course, get a permanent solution at a later time. OPNsense provides us with an option to roll back, so I don't see any issue with that. I am in no position to say what scenario warrants pulling a package or release; I'm confident the OPNsense team will be able to make the right decision.

If something is critical to your production and you are not able to deal with the situation single-handedly, I guess it's best to subscribe to the business support.

P.S.: This is my first post; yes, I registered just for the purpose of posting in this specific thread. I don't usually want to post and just wanted to read topics that interest me. As for what made me post this, I guess I cannot escape from my conscience of xxxxxx - it does not matter :)

I also have perfectly working systems upgraded from 20.7.6 to 20.7.7_1. Unbound working without issues. So, I can confirm that this issue is not on "every system".

I have learned this the hard way: always have a 1:1 test system if you are running a critical production system. And 1:1 includes: identical hardware, identical software and identical config. Otherwise, be prepared for issues whatever OS you are running.

Systems have so many applications interacting with each other that it is difficult to see in advance the possible problems. It might be a small config difference which causes the systems to behave differently after upgrades.

First I upgrade the test system and leave it running for a while. If it is running OK, I continue to the production systems, gradually. This makes it possible to identify the issues and figure out workarounds if the software patches are not yet ready.

I know that in many places upgrades are done on the fly to production systems (hurry, money, whatever the reason is). To be honest, I would not like to use such systems, even for free. There are issues with every OS and software. But the worst problem is when the admin is not sure what he or she is doing.

Here's the latest Unbound revision 1 from FreeBSD ports to try:

# pkg add -f https://pkg.opnsense.org/FreeBSD:12:amd64/20.7/misc/unbound-1.13.0_1.txz
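
After installing the package above, it may be worth confirming which Unbound version is actually installed. This is a rough sketch, assuming the version string matches the package file name (1.13.0_1) and using `pkg query %v`. check_unbound_version is a made-up helper name, and its optional second argument exists only so the comparison can be tried without pkg.

```shell
#!/bin/sh
# check_unbound_version EXPECTED [ACTUAL]
# Compare the installed Unbound version with the expected one.
# ACTUAL defaults to what pkg reports for the unbound package.
check_unbound_version() {
    expected="$1"
    actual="${2:-$(pkg query %v unbound 2>/dev/null)}"
    if [ "$actual" = "$expected" ]; then
        echo "ok: unbound $actual"
    else
        echo "mismatch: want $expected, have ${actual:-none}"
        return 1
    fi
}
```

On the router, `check_unbound_version 1.13.0_1` should report ok once the package is in place. The service still has to be restarted before the new binary is the one running.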

Quote from: guest15389 on December 21, 2020, 03:06:47 PM
No reports? So I'm making up my situation for what point? Others are making it up posting as well?

I think you are overreacting. If you reply in a thread about Unbound that your upgrade went bad, that has nothing to do with Unbound. In fact, any update, however small, can go bad if the file system or disk disintegrates or the file system is full. Since I haven't seen a health audit I can't possibly say how bad it was.

Quote from: guest15389 on December 21, 2020, 03:06:47 PM
you belittle our situation and refuse to listen to feedback

Maybe that is true. But maybe listening goes both ways?

Speaking from day-to-day experience, I just want to say that I have broken my production systems a number of times with preproduction testing. It's just the way it is, and I am grateful for every bug that doesn't happen in production releases, as some of them have forced a full reinstall of these systems.


Cheers,
Franco

I updated 3 systems to 20.7.7_1 about 24 h ago, no problems with unbound here...
kind regards
chemlud
____
"The price of reliability is the pursuit of the utmost simplicity."
C.A.R. Hoare

Felix Eichhorn's premium cat food with the extra portion of energy

A router is not a switch - A router is not a switch - A router is not a switch - A rou....

Came here with this problem... unbound had been crashing roughly 6 times a day since the update.

I tried the Unbound regression, but it may not have solved the problem, as Unbound fell over again about an hour later.

Will keep monitoring this thread and I'll be watching to see if the same problem keeps repeating.

I too had Unbound stop after running for a period of time, quickly after a reboot, then after restarting the service it ran for longer before stopping. All other functions seem to work normally, but anything that relied on DNS failed (obviously). For me reverting to the previous version has stopped the service from frequently stopping.
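
To pin down when the service stops, a timestamped up/down record helps. This is a minimal sketch, assuming the daemon's process name is unbound and that pgrep is available. service_state and log_state are made-up names, and the sketch only records state; it does not restart anything.

```shell
#!/bin/sh
# service_state NAME: print "up" if a process with exactly that name exists.
service_state() {
    if pgrep -x "$1" >/dev/null 2>&1; then
        echo up
    else
        echo down
    fi
}

# log_state NAME: print one "UTC-timestamp name state" line.
log_state() {
    echo "$(date -u '+%Y-%m-%dT%H:%M:%SZ') $1 $(service_state "$1")"
}
```

Running `log_state unbound` from cron once a minute and appending the output to a file gives a simple timeline of the stops, which can be attached to a report.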

Is this related to:
https://github.com/NLnetLabs/unbound/issues/376
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=251821

(Found these threads before finding this thread here.)

Quote from: franco on December 21, 2020, 08:50:50 PM
Here's the latest Unbound revision 1 from FreeBSD ports to try:

# pkg add -f https://pkg.opnsense.org/FreeBSD:12:amd64/20.7/misc/unbound-1.13.0_1.txz

Cheers,
Franco

Updated unbound on my system to the provided revision. Will report back with feedback.

I only came here to say that I also have not experienced any issues with unbound crashing since upgrading to 20.7.7 on release day. I'm sorry to those that have had issues and based on the flurry of activity surrounding this, there clearly is an issue that is affecting *some* users. But not all users are having this problem.

I complained earlier about the "6 unbound failures a day". 

Just wanted to say that regressing to the previous version of unbound appears to have solved the problem.

Quote from: miruoy on December 22, 2020, 10:40:36 AM
Quote from: franco on December 21, 2020, 08:50:50 PM
Here's the latest Unbound revision 1 from FreeBSD ports to try:

# pkg add -f https://pkg.opnsense.org/FreeBSD:12:amd64/20.7/misc/unbound-1.13.0_1.txz

Cheers,
Franco

Updated unbound on my system to the provided revision. Will report back with feedback.

Unbound has been running stable for 24 hours now on the new revision. Issue appears resolved on my end.

Just to confirm I am observing this problem on 20.7.7 as well. I have reverted unbound as per the instructions above.