Unbound DNS seems to stop resolving DNS

Started by AndroGen, October 13, 2024, 10:31:51 AM

October 13, 2024, 10:31:51 AM Last Edit: October 13, 2024, 04:30:18 PM by AndroGen
Hi All,

Please accept my apologies, as I am relatively new to OPNsense (in the process of migrating from pfSense).
After the upgrade to 24.7.6 Unbound DNS stops resolving.

If I enter DNS IP addresses in System -> Settings -> General and deactivate Unbound DNS, everything seems to work.

If I remove the DNS IPs from the General settings and activate Unbound DNS, the DNS resolver does not work.
This was not the problem before the update.

What has changed prior to this:
I wanted to activate ClamAV, but the system stated that it needed to be updated first.
I updated the system, rebooted, installed ClamAV and C-ICAP, and rebooted again.
After a short period of time, iftop (running on the console) stopped resolving names and started showing only IPs; then internet access dropped.

Stopping ClamAV, C-ICAP did not help.
I've removed all rules, NAT DNS redirection, Blacklists on Unbound DNS - nothing helps.

Where and how do I troubleshoot the problem? I need help here, as I am a bit lost now.

Further details (added):

When I try to resolve a DNS name, this is what happens:


root@<OPNSENSE>:~ # nslookup www.cnn.com
;; Got SERVFAIL reply from 127.0.0.1
Server:         127.0.0.1
Address:        127.0.0.1#53

** server can't find www.cnn.com: SERVFAIL


During this run, the log (Level 3) captures the following:

root@<OPNSENSE>:/var/log/resolver # tail -n 20 -f ./resolver_20241013.log
<31>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66103"] [86469:2] debug: iterator[module 1] operate: extstate:module_wait_reply event:module_event_noreply
<30>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66104"] [86469:2] info: iterator operate: query api.crowdsec.net.<DOMA.IN>. A IN
<30>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66105"] [86469:2] info: processQueryTargets: api.crowdsec.net.<DOMA.IN>. A IN
<31>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66106"] [86469:2] debug: configured stub or forward servers failed -- returning SERVFAIL
<31>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66107"] [86469:2] debug: return error response SERVFAIL
<31>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66108"] [86469:2] debug: cache memory msg=133580 rrset=132184 infra=11490 val=0
<30>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66109"] [86469:2] info: 192.168.18.60 debug.opendns.com. TXT IN
<31>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66110"] [86469:2] debug: worker request: max UDP reply size modified (1280 to max-udp-size)
<31>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66111"] [86469:2] debug: iterator[module 1] operate: extstate:module_state_initial event:module_event_pass
<30>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66112"] [86469:2] info: resolving debug.opendns.com. TXT IN
<30>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66113"] [86469:2] info: processQueryTargets: debug.opendns.com. TXT IN
<30>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66114"] [86469:2] info: sending query: debug.opendns.com. TXT IN
<31>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66115"] [86469:2] debug: sending to target: <.> 149.112.112.112#853
<31>1 2024-10-13T16:00:43+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66116"] [86469:2] debug: cache memory msg=133580 rrset=132184 infra=11490 val=0
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66117"] [86469:2] info: 127.0.0.1 www.cnn.com. A IN
<31>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66118"] [86469:2] debug: iterator[module 1] operate: extstate:module_state_initial event:module_event_pass
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66119"] [86469:2] info: resolving www.cnn.com. A IN
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66120"] [86469:2] info: processQueryTargets: www.cnn.com. A IN
<31>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66121"] [86469:2] debug: configured stub or forward servers failed -- returning SERVFAIL
<31>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66122"] [86469:2] debug: return error response SERVFAIL
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66123"] [86469:2] info: dnsbl_module: attempting to open pipe
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66124"] [86469:2] info: dnsbl_module: successfully opened pipe
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66125"] [86469:2] info: 127.0.0.1 www.cnn.com. A IN SERVFAIL 0.000000 0 29
<31>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66126"] [86469:2] debug: cache memory msg=133769 rrset=132184 infra=11490 val=0
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66127"] [86469:2] info: 127.0.0.1 1.opnsense.pool.ntp.org.<DOMA.IN>. A IN
<31>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66128"] [86469:2] debug: iterator[module 1] operate: extstate:module_state_initial event:module_event_pass
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66129"] [86469:2] info: resolving 1.opnsense.pool.ntp.org.<DOMA.IN>. A IN
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66130"] [86469:2] info: processQueryTargets: 1.opnsense.pool.ntp.org.<DOMA.IN>. A IN
<31>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66131"] [86469:2] debug: configured stub or forward servers failed -- returning SERVFAIL
<31>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66132"] [86469:2] debug: return error response SERVFAIL
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66133"] [86469:2] info: 127.0.0.1 1.opnsense.pool.ntp.org.<DOMA.IN>. A IN SERVFAIL 0.000000 0 51
<31>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66134"] [86469:2] debug: cache memory msg=133980 rrset=132184 infra=11490 val=0
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66135"] [86469:3] info: 127.0.0.1 1.opnsense.pool.ntp.org.<DOMA.IN>. A IN
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66136"] [86469:3] info: 127.0.0.1 1.opnsense.pool.ntp.org.<DOMA.IN>. A IN SERVFAIL 0.000000 1 51
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66137"] [86469:2] info: 127.0.0.1 1.opnsense.pool.ntp.org.<DOMA.IN>. AAAA IN
<31>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66138"] [86469:2] debug: iterator[module 1] operate: extstate:module_state_initial event:module_event_pass
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66139"] [86469:2] info: resolving 1.opnsense.pool.ntp.org.<DOMA.IN>. AAAA IN
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66140"] [86469:2] info: processQueryTargets: 1.opnsense.pool.ntp.org.<DOMA.IN>. AAAA IN
<31>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66141"] [86469:2] debug: configured stub or forward servers failed -- returning SERVFAIL
<31>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66142"] [86469:2] debug: return error response SERVFAIL
<30>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66143"] [86469:2] info: 127.0.0.1 1.opnsense.pool.ntp.org.<DOMA.IN>. AAAA IN SERVFAIL 0.000000 0 51
<31>1 2024-10-13T16:00:46+02:00 <OPNSENSE>.<DOMA.IN> unbound 86469 - [meta sequenceId="66144"] [86469:2] debug: cache memory msg=134191 rrset=132184 infra=11490 val=0
^C
root@<OPNSENSE>:/var/log/resolver #


The names of the server (<OPNSENSE>) and the domain (<DOMA.IN>) have been replaced.

This might be a clue to the problem:
debug: configured stub or forward servers failed -- returning SERVFAIL

Any idea what it could be? Where should I look?
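
For anyone who wants to summarise a log like the one above instead of reading it line by line, here is a minimal Python sketch. It counts the "configured stub or forward servers failed" messages, the names that received a SERVFAIL reply, and the upstream targets Unbound tried. The sample lines are trimmed copies of the entries quoted above (syslog prefix removed); the function name `summarise` and the regular expressions are assumptions based on those lines, not anything shipped with OPNsense or Unbound.

```python
import re
from collections import Counter

# Patterns derived from the log lines quoted in this post (assumed format).
TARGET_RE = re.compile(r"sending to target: <[^>]*> (\S+)#(\d+)")
REPLY_RE = re.compile(r"info: (\S+) (\S+) (\S+) IN SERVFAIL ")
FORWARD_FAIL = "configured stub or forward servers failed"

# Trimmed sample lines copied from the excerpt above.
SAMPLE = [
    "[86469:2] debug: sending to target: <.> 149.112.112.112#853",
    "[86469:2] debug: configured stub or forward servers failed -- returning SERVFAIL",
    "[86469:2] info: 127.0.0.1 www.cnn.com. A IN SERVFAIL 0.000000 0 29",
    "[86469:2] debug: configured stub or forward servers failed -- returning SERVFAIL",
    "[86469:2] info: 127.0.0.1 1.opnsense.pool.ntp.org.<DOMA.IN>. A IN SERVFAIL 0.000000 0 51",
]


def summarise(lines):
    """Return (forwarder failure count, SERVFAIL names, upstream targets)."""
    forward_failures = 0
    servfail_names = Counter()
    targets = Counter()
    for line in lines:
        if FORWARD_FAIL in line:
            forward_failures += 1
        match = TARGET_RE.search(line)
        if match:
            # Key is (address, port) of the upstream Unbound tried to reach.
            targets[(match.group(1), int(match.group(2)))] += 1
        match = REPLY_RE.search(line)
        if match:
            # Group 2 is the queried name; group 1 is the client address.
            servfail_names[match.group(2)] += 1
    return forward_failures, servfail_names, targets


fails, names, targets = summarise(SAMPLE)
print("forwarder failures:", fails)
for name, count in names.most_common():
    print("SERVFAIL reply:", name, count)
for (address, port), count in targets.items():
    print("upstream tried:", address, port, count)
```

To use it on the real file, pass the lines of the resolver log instead of `SAMPLE`. In the excerpt above, the failures for www.cnn.com appear without a preceding "sending to target" line, which may mean Unbound had already given up on the forwarder; that is a guess from the log, not a confirmed diagnosis.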


Please try restoring only the Unbound config from a backup. I ran into this problem when I updated from 24.7.5.
After the restore, my OPNsense and Unbound were back to normal.

An update on the progress, which may help anyone facing the same issue.

The story, with a bit of guessing (as I lack knowledge / evidence in some areas):

I took most of the blocklists used in my pfSense installation and applied them to the new OPNsense setup.
The number of these lists is relatively large (many overlap, but each still adds some filters).

I added them to Services -> Unbound DNS -> Blacklist.
Loading them took rather a long time and (guessing here) this download and/or the construction of the "consolidated" blacklist failed.
Whether this happened before or after I added my (also long) whitelist entries, I do not know.

The fact is:
This led to an inconsistency (which I could not pin down exactly) that resulted in Unbound DNS not working properly, or rather not working at all.
OPNsense was redirecting DNS requests to the "default" DNS servers maintained in System -> Settings -> General
(with no DNS servers there, name resolution failed completely).

What I did to get it "resolved":

  • cleaned up the Unbound DNS settings (removing every blacklist except the single standard one selected there)
  • removed or deactivated most of rules
  • made a backup
  • reinstalled OPNsense from scratch
  • updated newly installed OPNsense
  • restored the configuration
  • remediated plugins
Since then the system seems to be working (I am monitoring name resolution using iftop on the console).

The difference from before the reinstallation: memory consumption has dropped significantly.
Performance also seems to be back to normal (before that it was very sluggish).

Questions (to get a bit more insights):

  • Where and how does OPNsense store downloaded blocklists?
  • Is there any way to add a whitelist from a file on disk rather than through the UI?
  • What is the correct format of this whitelist file / its entries? (Searching on this topic has not turned up much.)
  • How can blocklist updates be spread out over time?
    (In pfSense each blocklist could have its own download schedule, which spread this process out over time.)
Any guidance on how to troubleshoot such issues (or at least links to earlier discussions on this topic) would be very helpful and appreciated.

Also, the documentation could be enhanced a bit: the information on blacklisting is rather short, which leaves a lot of questions, especially for newcomers.

Quote from: raywan on October 13, 2024, 07:58:48 PM
Please try restoring only the Unbound config from a backup.
Thanks for the suggestion.
Question: How to restore it?

Actually, I used pfSense for more than 10 years and switched to OPNsense a few months ago. I also have my own blocklist from pfSense, but I didn't import it into OPNsense Unbound because I am sure it is not 100% compatible. So I copied all the blocklists and firewall rules into a notepad and then created all the rules one by one. It took me a few hours to migrate all the pfSense settings to OPNsense and Unbound.
Now my firewall table entries and Unbound blocklist entries number 4618097 and 2876467 respectively, and everything runs blazingly fast on an N305 CPU with 32GB DDR5.
To go back to your question: before you change any settings, please go to System => Configuration => Backups. You can back up the empty or default settings to your hard drive before you start changing anything in OPNsense. If you run into problems after changing settings, just restore the suspected area and reboot OPNsense. Then everything will be back to normal again.

Quote from: raywan on October 14, 2024, 06:52:36 PM
So I copied all the blocklists and firewall rules into a notepad and then created all the rules one by one. It took me a few hours to migrate all the pfSense settings to OPNsense and Unbound.
Now my firewall table entries and Unbound blocklist entries number 4618097 and 2876467 respectively, and everything runs blazingly fast on an N305 CPU with 32GB DDR5.

I would be very curious to learn your approach. Like you, I have collected blacklists that block a lot of trackers and ads, and I would be glad to move them to my OPNsense installation.

Quote
To go back to your question: before you change any settings, please go to System => Configuration => Backups. You can back up the empty or default settings to your hard drive before you start changing anything in OPNsense.

Yes, this is what I usually do.
In this specific case it was a rather early step in the process, hence I did not make "enough" backups, nor frequently enough.

October 15, 2024, 04:20:45 PM #6 Last Edit: October 15, 2024, 05:43:02 PM by raywan


You are welcome. If you want my Unbound blocklist, I can share it with you. I just installed an ultimate blocklist as the core blocklist to remove 99% of ads for daily web surfing. For bad-IP filtering, I have added ten bad-IP blocklists to the firewall aliases, which contribute about 420000 firewall table entries. Those entries block more than 90% of scanners, hackers, etc. daily. I haven't installed any IDS/IPS except CrowdSec in OPNsense because it would slow things down in the background. I think this is good enough to block most hackers/scanners for daily use.


Unbound DNS stopped resolving names again   ::)

What happened (guessing): this did not happen instantly, but at some point in time.
I added a few lists to the blocklist and ... it stopped working.

The same problem:


root@<OPNSENSE>:~ # nslookup www.cnn.com
;; Got SERVFAIL reply from 127.0.0.1
Server:         127.0.0.1
Address:        127.0.0.1#53

** server can't find www.cnn.com: SERVFAIL


Assumption: something internally gets corrupted.
The only error in the Unbound DNS log:

2024-10-26T17:39:57 Error unbound [74508:3] error: SSL_handshake syscall: Operation timed out
2024-10-26T17:39:56 Error unbound [74508:1] error: SSL_handshake syscall: Operation timed out


Where should I look to understand the root cause of the problem?
And how do I "repair" / "redeploy" DNS so that any possibly corrupted parts get fixed?
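
The SSL_handshake timeouts, together with the earlier "sending to target: <.> 149.112.112.112#853" log line, suggest that Unbound is forwarding over DNS over TLS and that the handshake with the upstream is not completing. Below is a minimal Python sketch for testing that handshake independently of Unbound. The function name `probe_dot` is hypothetical, the address is taken from the earlier log and may not match the forwarders currently configured, and certificate verification is switched off because only reachability is being tested.

```python
import socket
import ssl


def probe_dot(host, port=853, timeout=5.0):
    """Try a TCP connect plus TLS handshake; return (ok, detail)."""
    ctx = ssl.create_default_context()
    # Reachability test only: certificate checks are disabled on purpose.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # The handshake happens inside wrap_socket().
            with ctx.wrap_socket(raw) as tls:
                return True, tls.version()
    except OSError as exc:
        # Covers timeouts, refused connections and TLS errors alike.
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    # Address copied from the earlier log excerpt (assumed still in use).
    print(probe_dot("149.112.112.112"))
```

If this also times out when run on the firewall, the problem is more likely between OPNsense and the upstream (routing, a firewall rule, or the upstream itself) than in the blocklists; if it succeeds, Unbound's own configuration is the more likely place to look. Both readings are guesses, not a confirmed diagnosis.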

Quote from: raywan on October 15, 2024, 04:20:45 PM
If you want my Unbound blocklist, I can share it with you. I just installed an ultimate blocklist as the core blocklist to remove 99% of ads for daily web surfing. For bad-IP filtering, I have added ten bad-IP blocklists to the firewall aliases, which contribute about 420000 firewall table entries. Those entries block more than 90% of scanners, hackers, etc. daily.

Hi raywan,
Could you share how you achieve this? How did you create your own list, and how do you make it available on your OPNsense system?

I've seen problems show up in Unbound when using large lists. Why there, and why not on other distributions? I can't answer that.
My suggestion for OPN: don't use them and Unbound will be 100% solid. Instead, use them in AdGuardHome on OPNsense. The installation is straightforward: add a repo, update the package list with the new repo, then install and configure with a couple of clicks. Then add the lists in AdGH. And you get additional functionality too.

Quote from: cookiemonster on October 28, 2024, 10:59:29 PM
My suggestion for OPN: don't use them and Unbound will be 100% solid. Instead, use them in AdGuardHome on OPNsense. The installation is straightforward: add a repo, update the package list with the new repo, then install and configure with a couple of clicks. Then add the lists in AdGH. And you get additional functionality too.

Do I understand this right, that AdGuardHome takes over as resolver and Unbound is no longer needed?

Can be but I am not suggesting that setup.
AdGH does only ad blocking with lists. Other lists can also be added.
Then it uses Unbound as upstream resolver.
client -> AdGH -> Unbound -> Root servers (or others if you prefer)

Quote from: cookiemonster on October 29, 2024, 11:17:38 AM
Can be but I am not suggesting that setup.
AdGH does only ad blocking with lists. Other lists can also be added.
Then it uses Unbound as upstream resolver.
client -> AdGH -> Unbound -> Root servers (or others if you prefer)

OK! So I let AdGH use port 53, and AdGH then asks Unbound on another port. Which means I don't have to change the clients, since they already use OPNsense (aka Unbound) as resolver.

Exactamundo!
Just remember, when you set up AdGH on OPN (do it from the mimugmail repo so it integrates with OPN better), to tick "Primary DNS" to enable it on port 53.
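
To check each hop of the client -> AdGH -> Unbound chain separately, a small Python sketch can send the same query to both ports and print the response code. Assumptions: it runs on the firewall itself (hence 127.0.0.1), AdGH listens on port 53, and Unbound has been moved to port 5353. The Unbound port is a placeholder, since the thread only says "another port", and the helper names are made up for illustration.

```python
import socket
import struct

RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, qid=0x1234):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # Root label terminator, then QTYPE=A (1) and QCLASS=IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def rcode_of(reply):
    """Return the response code held in the low 4 bits of the flags field."""
    if len(reply) < 12:
        raise ValueError("reply shorter than a DNS header")
    flags = struct.unpack("!H", reply[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, str(code))


def ask(server, port, name, timeout=3.0):
    """Send one UDP query and return the response code as text."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, port))
        reply, _ = sock.recvfrom(4096)
    return rcode_of(reply)


if __name__ == "__main__":
    # Port 5353 for Unbound is a placeholder; use whatever port was configured.
    for label, port in [("AdGH", 53), ("Unbound", 5353)]:
        try:
            print(label, port, ask("127.0.0.1", port, "www.cnn.com"))
        except (OSError, ValueError) as exc:
            print(label, port, "no usable reply:", exc)
```

If Unbound answers NOERROR on its own port but AdGH on port 53 does not, the upstream setting in AdGH is the first thing to check; if Unbound itself returns SERVFAIL, the problem sits behind AdGH.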