OPNsense Forum

English Forums => 22.1 Legacy Series => Topic started by: nZsdD22TPRf8P on April 17, 2022, 03:44:34 pm

Title: Unbound is slow?
Post by: nZsdD22TPRf8P on April 17, 2022, 03:44:34 pm
Hey everyone,

Writing this to attempt to get assistance in debugging an issue I've been having in my network ever since migrating to Opnsense.

My network setup is pretty simple: I have an OPNsense firewall/router, a switch, and 3 OpenWrt APs. I migrated over to OPNsense from an OpenWrt router/firewall with the exact same setup.

I've always had my DNS set up with DNS over TLS going to the Cloudflare servers. Unbound is set up with this configuration, which just adds a `Forward .` directive to the config and causes all queries to be forwarded instead of resolved recursively. Additionally, I'm having Unbound register DHCP leases and have a simple DNS blocklist.

Ever since migrating to this setup I've noticed that applications, especially on phones, get extremely slow. Sometimes applications like reddit or imgur will straight up refuse to load pages. This is more apparent on phones than on laptops, and I haven't noticed it happening via wired connections.

This points to a Wi-Fi issue, right? However, I've noticed that immediately after restarting Unbound everything works perfectly again for anywhere from a few tens of minutes to a couple of hours, and then the problem comes back.

The Unbound metrics don't point to anything out of the ordinary and there's also nothing weird in the logs. The behavior seems cache related, but I can't explain all the symptoms.

How would you go about debugging this issue?
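
One way to narrow this down is to time lookups against the router at regular intervals from an affected client, so the moment of degradation shows up with a timestamp. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, it sends plain UDP queries on port 53, and it does not validate the reply beyond receiving one.

```python
import socket
import struct
import time
from typing import Optional


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is length-prefixed, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1)
    return header + qname + struct.pack("!HH", 1, 1)


def time_query(server: str, name: str, timeout: float = 5.0) -> Optional[float]:
    """Return the round-trip time in seconds, or None on timeout or socket error."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(build_query(name), (server, 53))
            sock.recvfrom(4096)  # any reply counts; the content is not checked
        except OSError:  # socket.timeout is a subclass of OSError
            return None
        return time.monotonic() - start


def probe(server: str, name: str, count: int, interval: float) -> None:
    """Print one timestamped measurement per interval."""
    for _ in range(count):
        rtt = time_query(server, name)
        stamp = time.strftime("%H:%M:%S")
        print(stamp, "timeout" if rtt is None else f"{rtt * 1000:.1f} ms")
        time.sleep(interval)
```

Calling `probe("192.168.1.1", "example.com", count=120, interval=30)` would log one measurement every 30 seconds for an hour; the address is a placeholder for the firewall's LAN IP. Repeated lookups of the same name will mostly be cache hits, so if even those become slow or time out, the resolver itself (rather than the upstream) would be the more likely suspect.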

Thanks
Title: Re: Unbound is slow?
Post by: Koldnitz on April 17, 2022, 04:54:58 pm
What kind of hardware are you running Opnsense on processor / ramwise?

You have confirmed that everything is unchecked / no DNS servers are set in Networking part of System: Settings: General?

Please provide what the statistics tab is showing in Services:Unbound.

To troubleshoot this I recommend turning off the blocklist.  From reading these forums I have noticed that that functionality can add problems.

I do not use a blocklist; I have Unbound set up to do DNS over TLS (using Cloudflare) and resolve recursively.

These are my statistics (i7-7500u with 16gigs of ram):

Code: [Select]
Thread 0
Recursion time (average): 0.079423
Recursion time (median): 0.0789569
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 1329
Cache misses: 1329
Cache hits: 1741
Zero TTL: undefined
Prefetch: 124
Queries: 3070
Thread 1
Recursion time (average): 0.083162
Recursion time (median): 0.0833995
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 1376
Cache misses: 1376
Cache hits: 1762
Zero TTL: undefined
Prefetch: 131
Queries: 3138
Thread 2
Recursion time (average): 0.084281
Recursion time (median): 0.08
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 1246
Cache misses: 1246
Cache hits: 1697
Zero TTL: undefined
Prefetch: 134
Queries: 2943
Thread 3
Recursion time (average): 0.082296
Recursion time (median): 0.0823693
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 1301
Cache misses: 1301
Cache hits: 1770
Zero TTL: undefined
Prefetch: 109
Queries: 3071
Times
Now: 1650206772.324366
Uptime: 27955.731622
Elapsed: 27955.731622
Total
Recursion time (average): 0.082267
Recursion time (median): 0.0811814
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 5252
Cache misses: 5252
Cache hits: 6970
Zero TTL: undefined
Prefetch: 498
Queries: 12222

Cheers,
Title: Re: Unbound is slow?
Post by: nZsdD22TPRf8P on April 17, 2022, 05:27:03 pm
Hey, thank you for your reply.

Quote
What kind of hardware are you running Opnsense on processor / ramwise?

Code: [Select]
OPNsense 22.1.5-amd64
FreeBSD 13.0-STABLE
OpenSSL 1.1.1n 15 Mar 2022

Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz (4 cores, 8 threads)

16Gb RAM


Quote
You have confirmed that everything is unchecked / no DNS servers are set in Networking part of System: Settings: General?

All clear, see attachment.

Quote
Please provide what the statistics tab is showing in Services:Unbound.

Code: [Select]

Thread 0
Recursion time (average): 0.066831
Recursion time (median): 0.0528201
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 798
Cache misses: 798
Cache hits: 580
Zero TTL: undefined
Prefetch: 0
Queries: 1378

Thread 1
Recursion time (average): 0.068602
Recursion time (median): 0.0534201
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 820
Cache misses: 820
Cache hits: 632
Zero TTL: undefined
Prefetch: 0
Queries: 1452

Thread 2
Recursion time (average): 0.066081
Recursion time (median): 0.051281
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 876
Cache misses: 876
Cache hits: 3669
Zero TTL: undefined
Prefetch: 0
Queries: 4545

Thread 3
Recursion time (average): 0.068600
Recursion time (median): 0.0515462
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 812
Cache misses: 812
Cache hits: 1674
Zero TTL: undefined
Prefetch: 0
Queries: 2486

Thread 4
Recursion time (average): 0.069425
Recursion time (median): 0.0513333
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 841
Cache misses: 841
Cache hits: 1621
Zero TTL: undefined
Prefetch: 0
Queries: 2462

Thread 5
Recursion time (average): 0.061700
Recursion time (median): 0.0518752
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 834
Cache misses: 834
Cache hits: 1575
Zero TTL: undefined
Prefetch: 0
Queries: 2409

Thread 6
Recursion time (average): 0.073829
Recursion time (median): 0.0522964
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 886
Cache misses: 886
Cache hits: 576
Zero TTL: undefined
Prefetch: 0
Queries: 1462

Thread 7
Recursion time (average): 0.066778
Recursion time (median): 0.0506368
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 831
Cache misses: 831
Cache hits: 589
Zero TTL: undefined
Prefetch: 0
Queries: 1420
Times
Now: 1650208872.638634
Uptime: 20893.715675
Elapsed: 20893.715675
Total
Recursion time (average): 0.067770
Recursion time (median): 0.0519011
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 6698
Cache misses: 6698
Cache hits: 10916
Zero TTL: undefined
Prefetch: 0
Queries: 17614

The weird part is that I don't see anything wrong with these stats, but I've tried multiple times and when unbound gets restarted the problem goes away.
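
For comparing runs, the "Total" section can be reduced to a couple of numbers. This is a minimal sketch that assumes the statistics are pasted as text in exactly the layout shown above; the function names are hypothetical. In the totals above, cache hits plus cache misses equals queries, and "Recursive replies" equals "Cache misses", which would be consistent with that counter tracking every query that missed the cache (that reading is an assumption, not something the GUI states).

```python
import re

SAMPLE = """Total
Recursion time (average): 0.067770
Recursion time (median): 0.0519011
TCP usage: 0
IP ratelimited queries: 0
Recursive replies: 6698
Cache misses: 6698
Cache hits: 10916
Zero TTL: undefined
Prefetch: 0
Queries: 17614
"""


def parse_totals(stats_text: str) -> dict:
    """Collect the numeric 'key: value' lines that follow the 'Total' heading."""
    totals = {}
    in_total = False
    for line in stats_text.splitlines():
        line = line.strip()
        if line == "Total":
            in_total = True
            continue
        if not in_total:
            continue
        # Non-numeric values such as 'undefined' do not match and are skipped
        match = re.fullmatch(r"(.+?):\s*([0-9.]+)", line)
        if match:
            totals[match.group(1)] = float(match.group(2))
    return totals


def cache_hit_ratio(totals: dict) -> float:
    """Fraction of all queries that were answered from the cache."""
    return totals["Cache hits"] / totals["Queries"]


totals = parse_totals(SAMPLE)
print(f"cache hit ratio: {cache_hit_ratio(totals):.1%}")
```

With the totals above this comes out at roughly 62%, which does not look unusual on its own.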

Quote
To troubleshoot this I recommend turning off the blocklist.  From reading these forums I have noticed that that functionality can add problems.

I've attempted this in the past but went back to the blocklist when I noticed that the behavior was the same.

What I find odd is that Unbound is reporting "recursive replies" when, if you enable DNS over TLS in the settings, this is the config that gets generated:

Code: [Select]
# Forward zones over TLS
server:
  tls-cert-bundle: /etc/ssl/cert.pem

forward-zone:
  name: "."
  forward-tls-upstream: yes
  forward-addr: 1.1.1.1@853#cloudflare-dns.com
  forward-addr: 1.0.0.1@853#cloudflare-dns.com
  forward-addr: 2606:4700:4700::64@853
  forward-addr: 2606:4700:4700::6400@853

Which, if I'm not mistaken, should cause Unbound to just work in forward mode for all domains.
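
To double-check what the generated file actually forwards to, the `forward-addr` lines can be listed with a short script. This is a minimal sketch that assumes the snippet above is available as text; the function name is hypothetical. It splits each entry into address, port, and the optional `#name` suffix used for TLS certificate checking. In the config above, the two IPv6 entries carry no `#name` suffix, and being IPv6 they may be worth a reachability check if the WAN has no working IPv6.

```python
import ipaddress

CONFIG = """# Forward zones over TLS
server:
  tls-cert-bundle: /etc/ssl/cert.pem

forward-zone:
  name: "."
  forward-tls-upstream: yes
  forward-addr: 1.1.1.1@853#cloudflare-dns.com
  forward-addr: 1.0.0.1@853#cloudflare-dns.com
  forward-addr: 2606:4700:4700::64@853
  forward-addr: 2606:4700:4700::6400@853
"""


def list_forwarders(config_text: str) -> list:
    """Return one dict per forward-addr line found in the config text."""
    forwarders = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line.startswith("forward-addr:"):
            continue
        # Split on the first colon only, since IPv6 addresses contain colons
        value = line.split(":", 1)[1].strip()
        # Format is address[@port][#auth-name]; '@' comes before '#'
        value, _, auth_name = value.partition("#")
        address, _, port = value.partition("@")
        forwarders.append({
            "address": address,
            "port": int(port) if port else None,  # None: no explicit port given
            "auth_name": auth_name or None,       # None: no TLS name to verify
            "version": ipaddress.ip_address(address).version,
        })
    return forwarders


for fwd in list_forwarders(CONFIG):
    print(fwd)
```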
Title: Re: Unbound is slow?
Post by: Koldnitz on April 17, 2022, 05:58:43 pm
Do you have a rule making sure that all DNS queries are forwarded to the router? You said your phones behave weirdly. I think I have read that both Apple and Android devices will use their own preferred DNS servers at times.

I have made rules to force everything to the router; the only devices that circumvent this use DNS over HTTPS, and according to Sensei everything is working for the most part.

From the looks of it, you are definitely using Unbound's recursive functionality.  You could try putting Cloudflare's servers in the System:Settings:General area.  If I remember correctly, that is how you set Unbound into forward mode, but I'm not sure if it works with DNS over TLS.

Your numbers are better than mine and I generally do not notice anything unless the statistics approach 0.6 to 1.0, so I am confused as well.

Do you notice any slowdown on something wired into the switch? It could be something to do with your Wi-Fi. Still, the fact that resetting Unbound fixes it is suspect, but I think resetting Unbound resets the whole network (watch the logs; it definitely resets a lot of stuff on my network). There is a chance it is something else and resetting Unbound fixes it in a different way than you think (this would be way beyond my understanding though).

I am by no means an expert, but I have fiddled with Unbound a lot, and lately I never have problems unless I turn on receive side scaling.

Here are some links:

This explains some of the options the GUI gives:
https://nlnetlabs.nl/documentation/unbound/unbound.conf/
Optimization advice from the Unbound project people:
https://nlnetlabs.nl/documentation/unbound/howto-optimise/
This guy explains a lot of settings:
https://calomel.org/freebsd_network_tuning.html
Check tools:
https://www.dnsperf.com/

Cheers,
Title: Re: Unbound is slow?
Post by: activescott on September 23, 2022, 12:32:20 am
I ran across this thread after a very similar experience to the OP's. I migrated from pfSense using the same hardware and the same configuration (manually reconfigured) and found DNS to be terribly slow despite the Unbound stats not showing an obvious culprit.

Under System > Log Files > Backend (/ui/diagnostics/log/core/configd) I did notice a couple dozen instances of this peculiar message:

Code: [Select]
2022-09-22T12:06:41-07:00 Error configd.py Timeout (120) executing : unbound stats
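
To see how often this happens and which backend actions are affected, the timeout lines can be tallied from a saved copy of the log. This is a minimal sketch that assumes each line has the same layout as the message above (the format of the log file on disk may differ); the names are hypothetical.

```python
import re
from collections import Counter

# Matches lines shaped like the configd timeout message shown above
TIMEOUT_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+Error\s+configd\.py\s+"
    r"Timeout \((?P<seconds>\d+)\) executing : (?P<action>.+)$"
)


def count_timeouts(lines) -> Counter:
    """Count timeout messages per backend action, e.g. 'unbound stats'."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            counts[match.group("action")] += 1
    return counts


sample = [
    "2022-09-22T12:06:41-07:00 Error configd.py Timeout (120) executing : unbound stats",
]
print(count_timeouts(sample))
```

A rising count for `unbound stats` over the day would suggest that Unbound stops answering its control interface around the same time DNS gets slow.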

I couldn't put my finger on what it was. However, once I changed the IPv6 configuration on the WAN interface from the default DHCP to "None" (my ISP doesn't offer IPv6) and disabled it on the LAN interface as well, Unbound restarted and things have been smooth since. I don't see that message any longer.

I'll update the thread if I learn more.


For posterity, here are some details of my setup:

Code: [Select]
Versions:

OPNsense 22.7.4-amd64
FreeBSD 13.1-RELEASE-p2
OpenSSL 1.1.1q 5 Jul 2022

CPU type:
Intel(R) Celeron(R) CPU J3160 @ 1.60GHz (4 cores, 4 threads)