Unbound not caching results?

Started by CJ, March 05, 2023, 03:53:57 PM

The new reporting has been illuminating a lot of things I never realized.

I believe I have Unbound configured mostly with defaults.  Not every setting lists its default, so it's hard to tell.  I have the cache set to 10k hosts and the TTL set to 900 seconds.

However, I'm seeing DNS requests in the reporting that are only a minute or two apart, and the second request gets resolved via recursion rather than from the cache.

Looking at my statistics, I'm only at 10k queries, but half of them were cache misses.  I would have expected the hit rate to be higher.

Any ideas as to what I can check?

Quote: and TTL set to 900 seconds.
You mean "TTL for Host Cache entries"? That's not the RR cache.
If you need to force a long TTL, then "Minimum TTL for RRsets and messages" should be used, imho.

Quote from: Fright on March 05, 2023, 06:42:27 PM
Quote: and TTL set to 900 seconds.
You mean "TTL for Host Cache entries"? That's not the RR cache.
If you need to force a long TTL, then "Minimum TTL for RRsets and messages" should be used, imho.

That's what I ended up discovering and using.  I'm getting a lot more cache hits since I added that.

I also increased the cache disk size, but I'm not sure if that made any difference.  The statistics don't list any cache details other than hits and misses.

You can dump the Unbound cache to a file to confirm that the cache works and to see the current TTLs of the records.
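
For reference, a minimal Python sketch of reading such a dump. It assumes the dump came from `unbound-control dump_cache` and that the records sit between `START_RRSET_CACHE` and `END_RRSET_CACHE` markers, with comment lines starting with `;`. The helper name `rrset_ttls` and the sample text are made up for illustration, so check them against a real dump before relying on this.

```python
def rrset_ttls(dump_text):
    """Return (name, ttl, rtype) tuples from the RRset section of a cache dump."""
    records = []
    in_rrsets = False
    for line in dump_text.splitlines():
        line = line.strip()
        if line == "START_RRSET_CACHE":
            in_rrsets = True
            continue
        if line == "END_RRSET_CACHE":
            break
        # Skip everything outside the section, blank lines, and ';' comments.
        if not in_rrsets or not line or line.startswith(";"):
            continue
        fields = line.split()
        # Assumed record shape: name ttl class type rdata...
        if len(fields) >= 5 and fields[1].isdigit():
            records.append((fields[0], int(fields[1]), fields[3]))
    return records


# Hypothetical sample shaped like dump output; not taken from a real system.
SAMPLE = """START_RRSET_CACHE
;rrset 42 1 0 8 3
pool.example.org.\t42\tIN\tA\t192.0.2.10
;rrset 850 1 0 8 3
www.example.com.\t850\tIN\tA\t192.0.2.20
END_RRSET_CACHE
START_MSG_CACHE
END_MSG_CACHE
EOF
"""

# List records with the shortest remaining TTL first.
for name, ttl, rtype in sorted(rrset_ttls(SAMPLE), key=lambda r: r[1]):
    print(f"{ttl:>6}s  {rtype:<5} {name}")
```

Sorting by remaining TTL makes it easy to spot the records that are about to expire.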

Do not set "minimum ttl" too high. Some servers rely on clients requesting a fresh response.

Better to enable "Serve Expired Responses": the latency stays very low, but the cache is more accurate.

Quote from: Fright on March 06, 2023, 01:54:40 PM
You can dump the Unbound cache to a file to confirm that the cache works and to see the current TTLs of the records.

I can see the records coming from the cache and the TTLs for them.  A lot of them, such as NTP servers, have a very short TTL that expires by the time the next request comes along.  That cycle just keeps repeating, with several of my devices making queries right after the TTL expires.
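
The cycle described above can be sketched with a small simulation. This is a simplified model of a TTL-based cache, not Unbound's implementation: one name, queried at a fixed interval, with an optional TTL floor standing in for "Minimum TTL for RRsets and messages". The function name `hit_ratio` and the numbers (a 60 second TTL queried every 64 seconds) are assumptions chosen for illustration.

```python
def hit_ratio(record_ttl, query_interval, num_queries, min_ttl=0):
    """Simulate one name queried at a fixed interval against a TTL-based cache."""
    # A TTL floor raises short TTLs; longer TTLs are left alone.
    effective_ttl = max(record_ttl, min_ttl)
    expires_at = None  # nothing cached yet
    hits = 0
    for i in range(num_queries):
        now = i * query_interval
        if expires_at is not None and now < expires_at:
            hits += 1
        else:
            # Miss: resolve upstream and cache for the effective TTL.
            expires_at = now + effective_ttl
    return hits / num_queries


# Queries arrive 4 seconds after each expiry, so every one is a miss.
print(hit_ratio(record_ttl=60, query_interval=64, num_queries=100))
# With a 900 second floor, most queries land inside the cached window.
print(hit_ratio(record_ttl=60, query_interval=64, num_queries=100, min_ttl=900))
```

In this toy model the first call gives a hit ratio of 0.0 and the second 0.93, which matches the jump in cache hits after adding a minimum TTL.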

Quote from: cgone on March 06, 2023, 02:49:10 PM
Do not set "minimum ttl" too high. Some servers rely on clients requesting a fresh response.

Better to enable "Serve Expired Responses": the latency stays very low, but the cache is more accurate.

It's only set to 900 right now.  I considered increasing it to 3600 but I'm not sure I want to do that for exactly the reasons you mentioned.

I need to look into serve expired more.  I'm not familiar enough with how it works to decide if I want to use it yet.

Quote from: cgone on March 06, 2023, 02:49:10 PM
Do not set "minimum ttl" too high. Some servers rely on clients requesting a fresh response.

Better to enable "Serve Expired Responses": the latency stays very low, but the cache is more accurate.

This "Serve Expired Responses" option sounds interesting to me. The DNS client gets an answer right away (an expired one), and in the meantime Unbound tries to fetch a fresh answer from the internet. The expired response is served with a TTL of 30 seconds (I checked the Unbound docs site), so if the client tries to connect to the wrong server, that 30-second TTL may already have expired on the client side. The client would then re-query Unbound, and by that time Unbound may already have the fresh, valid response. Is that the real-life use of this feature?
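
The flow described above can be modelled roughly as follows. This is a toy sketch, not how Unbound is implemented: the class `StaleServingCache`, its methods, and the explicit `run_refreshes` step (standing in for the background refresh) are all hypothetical. The 30 second stale reply TTL is the value mentioned in the post.

```python
class StaleServingCache:
    """Toy model of serve-expired: answer from stale data, refresh afterwards."""

    def __init__(self, resolve, stale_reply_ttl=30):
        self.resolve = resolve              # callable: name -> (address, ttl)
        self.stale_reply_ttl = stale_reply_ttl
        self.entries = {}                   # name -> (address, expires_at)
        self.pending_refresh = []           # names waiting for a refresh

    def lookup(self, name, now):
        entry = self.entries.get(name)
        if entry is None:
            # Nothing cached at all: the client has to wait for resolution.
            address, ttl = self.resolve(name)
            self.entries[name] = (address, now + ttl)
            return address, ttl, "miss"
        address, expires_at = entry
        if now < expires_at:
            return address, expires_at - now, "hit"
        # Expired: hand out the old answer with a short TTL and queue a refresh.
        if name not in self.pending_refresh:
            self.pending_refresh.append(name)
        return address, self.stale_reply_ttl, "stale"

    def run_refreshes(self, now):
        # Stand-in for the background fetch of fresh answers.
        for name in self.pending_refresh:
            address, ttl = self.resolve(name)
            self.entries[name] = (address, now + ttl)
        self.pending_refresh.clear()


upstream = {"app.example.com": ("192.0.2.1", 60)}
cache = StaleServingCache(lambda name: upstream[name])

print(cache.lookup("app.example.com", now=0))     # first query: miss
upstream["app.example.com"] = ("192.0.2.2", 60)   # the server moves
print(cache.lookup("app.example.com", now=61))    # old address, 30 s TTL
cache.run_refreshes(now=62)
print(cache.lookup("app.example.com", now=91))    # client re-queries: new address
```

The client holds the outdated address for at most the stale reply TTL, and its next query is answered from the refreshed entry.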

March 06, 2023, 04:45:24 PM #7 Last Edit: March 06, 2023, 04:47:36 PM by CJRoss
On a related note, do any of you know of documentation for the Unbound statistics?  I seem to be getting a 50/50 hit/miss ratio, and I'm trying to make sure I understand what that means before I change anything more.

I ask because I'm not sure where things like local-data and blocklists fall in the unbound statistics.

Additionally, I thought I'd have more prefetches.  I'm only seeing about 1%.
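
As a rough aid for reading those numbers, here is a minimal Python sketch that turns `key=value` statistics text, such as the output of `unbound-control stats_noreset`, into ratios. The counter names `total.num.queries`, `total.num.cachehits`, `total.num.cachemiss` and `total.num.prefetch` are the ones I understand Unbound to report; the helper names and the sample values are made up to mirror the figures in this thread. It does not answer where local-data or blocklist answers are counted.

```python
def parse_stats(text):
    """Parse 'key=value' lines into a dict of floats, ignoring anything else."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue
        try:
            stats[key] = float(value)
        except ValueError:
            pass  # skip non-numeric values
    return stats


def summarize(stats):
    """Return hit, miss and prefetch shares of total queries, or None if idle."""
    queries = stats.get("total.num.queries", 0)
    if queries == 0:
        return None
    return {
        "hit_ratio": stats.get("total.num.cachehits", 0) / queries,
        "miss_ratio": stats.get("total.num.cachemiss", 0) / queries,
        "prefetch_share": stats.get("total.num.prefetch", 0) / queries,
    }


# Hypothetical sample mirroring the thread: 10k queries, 50/50, about 1% prefetch.
SAMPLE_STATS = """total.num.queries=10000
total.num.cachehits=5000
total.num.cachemiss=5000
total.num.prefetch=100
"""

for key, value in summarize(parse_stats(SAMPLE_STATS)).items():
    print(f"{key}: {value:.1%}")
```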

Websites these days set ridiculous TTLs like 1 minute. It's possible to force a minimum TTL that overrides what the site dictates. More reading: https://blog.apnic.net/2019/11/12/stop-using-ridiculously-low-dns-ttls/

That APNIC article wasn't really useful. Put differently, it doesn't tell the ONE UNIVERSAL TRUTH. For example, the comment section pointed out many different scenarios where a very low TTL is still a must. The author looked at the whole topic only from his own limited perspective and did not consider other factors.