I'm not sure what happened here so I'm just going to list out the facts as I know them and hopefully someone can point me in a good direction to investigate.
A domain name on my network stopped resolving. None of my machines would resolve it. My network is configured to use an upstream DoT provider and block all other DNS queries.
Going to the DNS Lookup page on OPNsense showed that it was unable to resolve the domain. If I put my upstream DoT IP in the Server field, OPNsense was able to resolve the domain.
I'm using the Steven Black blocklist in the Unbound DNS settings, but checking the list showed that the domain wasn't on there.
At this point I had to do some errands, and when I came back later, OPNsense was able to resolve the domain without putting my upstream DoT provider in the Server field of the DNS Lookup page.
Any ideas what could have happened? I don't think the DNS Lookup page uses DoT but I'd be surprised if the provider served different responses over DoT than the standard protocol. Could Unbound have gotten a bad resolution attempt and cached it?
Thanks.
Quote from: CJRoss on November 30, 2022, 03:16:21 PM
I'm not sure what happened here so I'm just going to list out the facts as I know them and hopefully someone can point me in a good direction to investigate.
A domain name on my network stopped resolving. None of my machines would resolve it. My network is configured to use an upstream DoT provider and block all other DNS queries.
Going to the DNS Lookup page on OPNSense showed that it was unable to resolve the domain. If I put my upstream DoT IP in the Server field OPNSense was able to resolve the domain.
I'm using the Steven Black blocklist in the Unbound DNS settings, but checking the list showed that the domain wasn't on there.
At this point I had to do some errands and when I came back later, OPNSense was able to resolve the domain without putting my upstream DoT provider in the Server field of the DNS Lookup page.
Any ideas what could have happened? I don't think the DNS Lookup page uses DoT but I'd be surprised if the provider served different responses over DoT than the standard protocol. Could Unbound have gotten a bad resolution attempt and cached it?
Thanks.
You are using a blocklist in Unbound. Any small error in this blocklist will bring Unbound down.
My personal advice: Do not use Unbound with blocklists. Use either bind or Adguard Home with the blocklists.
KH
@KHE
Quote: small error in this blocklist will bring Unbound down
this is very outdated information imho
@CJRoss
Quote: Could Unbound have gotten a bad resolution attempt and cached it?
yes, negative cache is 5min by default. but this is for nxdomain and nodata answers only.
Quote: Going to the DNS Lookup page on OPNsense showed that it was unable to resolve the domain
this page does not provide debug info for the missing answer (was it nxdomain, nodata or something else)
better to use dig/drill in shell in this case imho
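For anyone without dig or drill handy, the distinction above (nxdomain vs. nodata vs. no answer at all) can be checked with a minimal sketch like the one below. It uses only the Python standard library and sends one plain UDP query to a server you choose, so it can point straight at Unbound and bypass any local stub cache. The function names, `192.168.1.1` and `example.com` are placeholders for illustration, not anything from this setup. It also treats NOERROR with an empty answer section as NODATA, which is a simplification.

```python
import socket
import struct

# Common DNS response codes (RFC 1035); anything else is reported numerically.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, qtype=1, txid=0x1234):
    """Build a minimal DNS query (qtype 1 = A record) with recursion desired."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # Question section: encoded name, terminating zero, QTYPE, QCLASS IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def classify(response):
    """Name the outcome using only the 12-byte DNS header."""
    _txid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    rcode = flags & 0x000F
    if rcode == 0 and ancount == 0:
        # Simplification: NOERROR with no answer records is reported as NODATA.
        return "NODATA"
    return RCODES.get(rcode, "RCODE %d" % rcode)


def lookup(name, server, timeout=3.0):
    """Send one UDP query to `server` on port 53 and report what came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            data, _addr = sock.recvfrom(4096)
        except socket.timeout:
            return "TIMEOUT"
    return classify(data)


# Example call (placeholder values, substitute your own):
#   print(lookup("example.com", "192.168.1.1"))
```

Running `lookup` a few times against the firewall and then against the upstream IP should show whether the failure is an actual negative answer, a SERVFAIL, or simply no reply.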
Quote from: KHE on November 30, 2022, 11:51:55 PM
You are using a blocklist in Unbound. Any small error in this blocklist will bring Unbound down.
My personal advice: Do not use Unbound with blocklists. Use either bind or Adguard Home with the blocklists.
KH
Considering that neither the blocklist changed nor was Unbound restarted between when the problem happened and it resolved itself, I have to say that you are incorrect.
Quote from: Fright on December 01, 2022, 09:31:48 AM
@CJRoss
Quote: Could Unbound have gotten a bad resolution attempt and cached it?
yes, negative cache is 5min by default. but this is for nxdomain and nodata answers only.
Hmm. I believe the issue persisted over five minutes while I was testing but I can't confirm as I was in a hurry.
Quote from: Fright on December 01, 2022, 09:31:48 AM
Quote: Going to the DNS Lookup page on OPNsense showed that it was unable to resolve the domain
this page does not provide debug info for the missing answer (was it nxdomain, nodata or something else)
better to use dig/drill in shell in this case imho
Agreed, but that would have required me to stop blocking DNS traffic to everything but my OPNsense, enable SSH, or go to the machine for local console. :)
Or are you asking me to try against Unbound and see what it returns? I tried that but forgot that Ubuntu has its own local cache server that it resolves against, and then I didn't have time to do anything else.
In a bit of interesting timing, it just happened again with a different domain.
The DNS Lookup page returned a valid result for the domain. I then tried nslookup on a Windows machine and it gave me a DNS timeout when attempting to reach OPNsense. I reran the command and it resolved.
Quote: DNS timeout when attempting to reach OPNsense. I reran the command and it resolved
unbound reloads?
Quote from: Fright on December 01, 2022, 03:14:48 PM
Quote: DNS timeout when attempting to reach OPNsense. I reran the command and it resolved
unbound reloads?
Not sure what you're asking. I didn't do any steps other than use the DNS Lookup page and run nslookup twice. Nothing else on my network appeared to have a problem, although they're on different interfaces.
I mean maybe at this point, for some reason, Unbound is being reloaded (updating the blocklists or something)
(should be visible as distinct entries in the Unbound and backend logs)
Quote from: Fright on December 01, 2022, 04:53:30 PM
I mean maybe at this point for some reason the unbound is being reloaded (updating the blocklists or some)
(should be visible by different entries in the unbound and backend logs)
No reloads since the blocklist update last night. The only things in the logs since then are the "generate keytag query" entries, which according to a pfSense forum post just indicate that I'm using DNSSEC.
https://forum.netgate.com/topic/161563/unbound-question-generate-keytag-query-_ta-4f66