Hi all,
I used to have a lancache server set up on my home network, but I removed it when I got symmetric gigabit internet since it actually slowed downloads down. However, I am now having issues where Steam downloads keep failing and need resetting, cache clearing, etc. I can see in the Pihole logs that Steam is looking up lancache.steamcontent.com. If I run nslookup lancache.steamcontent.com from my desktop, it comes back with the correct public IPs. However, if I run docker exec -it unbound dig lancache.steamcontent.com to do a dig from inside my unbound container, it returns:
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 19292
;; flags: qr rd ra ; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;; steamcontent.com. IN A
;; ANSWER SECTION:
steamcontent.com. 0 IN A 192.168.1.92
;; AUTHORITY SECTION:
;; ADDITIONAL SECTION:
;; Query time: 5 msec
;; SERVER: 127.0.0.11
;; WHEN: Thu Aug 15 13:29:37 2024
;; MSG SIZE rcvd: 50
I have no idea why it's doing this. If I force lancache.steamcontent.com to resolve to 0.0.0.0 in my desktop's /etc/hosts file, Steam downloads start working fine. So this dig issue must be related.
I can't see why it thinks that server is still at 192.168.1.92 though: as far as I can find by grepping inside and outside the containers, there is no configuration for unbound or Pihole that mentions steamcontent.com or 192.168.1.92. So where is it getting this IP from, and how do I make it realise that it needs to go and find the proper public IP? Restarting the container with a fresh volume doesn't help. The only config file I am mounting in is this one:
server:
interface: 0.0.0.0@5053
do-ip6: no
do-daemonize: no
access-control: 127.0.0.1/32 allow
#access-control: 192.168.0.0/16 allow
access-control: 172.16.0.0/12 allow
#access-control: 10.0.0.0/8 allow
include: /etc/unbound/a-records.conf
##################
# LOGGING & STATS
##################
logfile: /var/unbound/unbound.log
verbosity: 1
statistics-interval: 600
statistics-cumulative: yes
###########
# SECURITY
###########
# Trust glue only if it is within the server's authority
harden-glue: yes
# Require DNSSEC data for trust-anchored zones, if such data is absent, the zone becomes BOGUS
harden-dnssec-stripped: yes
# Don't use capitalization randomization as it is known to cause DNSSEC issues sometimes
# see https://discourse.pi-hole.net/t/unbound-stubby-or-dnscrypt-proxy/9378 for further details
use-caps-for-id: no
# Ensure privacy of local IP ranges
private-address: 192.168.0.0/16
private-address: 169.254.0.0/16
private-address: 172.16.0.0/12
private-address: 10.0.0.0/8
private-address: fd00::/8
private-address: fe80::/10
##############
# PERFORMANCE
##############
# Reduce EDNS reassembly buffer size.
# Suggested by the unbound man page to reduce fragmentation reassembly problems
#edns-buffer-size: 1472
edns-buffer-size: 1232
# Perform prefetching of close to expired message cache entries
# This only applies to domains that have been frequently queried
prefetch: yes
num-threads: 2
msg-cache-slabs: 2
rrset-cache-slabs: 2
infra-cache-slabs: 2
key-cache-slabs: 2
outgoing-range: 450
num-queries-per-thread: 225
rrset-cache-size: 200m
msg-cache-size: 100m
so-sndbuf: 2m
so-rcvbuf: 2m
so-reuseport: yes
Your config file states:
include: /etc/unbound/a-records.conf
That's the first place I would look. Clearly your DNS server is getting this record from somewhere.
Also, since you know you're looking for the string "steamcontent.com" you could just do something like:
grep -rni "steamcontent" .
...to recursively show all files containing that string wherever you have your config files.
Yeah I've done a grep for that, there's nothing in there that isn't either from a log file or a commented out line.
My /etc/unbound/a-records.conf file is just empty:
# A Record
#local-data: "somecomputer.local. A 192.168.1.1"
# PTR Record
#local-data-ptr: "192.168.1.1 somecomputer.local."
As I said, it's getting the record from somewhere, since there is no public A record for steamcontent.com. So either you have configured it yourself in unbound or Pihole, or you're getting your DNS records from an external DNS service that lets you set an override there.
steamcontent.com doesn't resolve publicly but lancache.steamcontent.com does, that's what I'm talking about. I realise the bit I posted above was for steamcontent.com but I get the same result either way:
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 6067
;; flags: qr rd ra ; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;; lancache.steamcontent.com. IN A
;; ANSWER SECTION:
lancache.steamcontent.com. 0 IN A 192.168.1.92
;; AUTHORITY SECTION:
;; ADDITIONAL SECTION:
;; Query time: 2 msec
;; SERVER: 127.0.0.11
;; WHEN: Thu Aug 15 15:28:27 2024
;; MSG SIZE rcvd: 59
The log doesn't say much either:
[1723735832] unbound[1:0] notice: init module 0: subnetcache
[1723735832] unbound[1:0] warning: subnetcache: prefetch is set but not working for data originating from the subnet module cache.
[1723735832] unbound[1:0] notice: init module 1: validator
[1723735832] unbound[1:0] notice: init module 2: iterator
[1723735832] unbound[1:0] info: start of service (unbound 1.20.0).
A grep -RniI "steamcontent" --exclude="*.log" . comes back empty.
It's a fresh container with just that one config file mounted, so where can it be getting the record from and how can I find that out?
Regardless, the response it's getting is not the "correct" public one but one that you have overridden somewhere. You can probably verify that by running a query that names the DNS server you want to ask, in this example Google's:
# dig @8.8.8.8 lancache.steamcontent.com
(...)
;; QUESTION SECTION:
;lancache.steamcontent.com. IN A
;; ANSWER SECTION:
lancache.steamcontent.com. 2775 IN CNAME origin-tier2.steampipe.steamcontent.com.
origin-tier2.steampipe.steamcontent.com. 95 IN CNAME steampipe-origin-tier2.steamcontent.com.
steampipe-origin-tier2.steamcontent.com. 112 IN CNAME cache-origin.steampipe.steamcontent.akadns.net.
cache-origin.steampipe.steamcontent.akadns.net. 60 IN CNAME dist-sto1.discovery.steamserver.net.
dist-sto1.discovery.steamserver.net. 24 IN A 162.254.198.12
dist-sto1.discovery.steamserver.net. 24 IN A 162.254.198.13
;; Query time: 48 msec
;; SERVER: 8.8.8.8#53(8.8.8.8) (UDP)
;; WHEN: Thu Aug 15 17:29:45 CEST 2024
;; MSG SIZE rcvd: 266
You're getting your response from 127.0.0.11, which I assume is the docker-compose internal DNS. Unless configured otherwise, that probably just passes the query on to whatever DNS servers your system is configured to use.
Whether that is unbound, or pi-hole forwarding requests to unbound, or something else, I have no idea, since you haven't said how you have set this up (or what role OPNsense plays in it).
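If it helps to see who is actually answering, here is a rough sketch of a few shell helpers that read dig output on stdin and pull out the resolver that replied and any A records, plus a check for private (RFC 1918) addresses. The function names are made up for illustration. The commented usage assumes the container is called unbound, that unbound listens on port 5053 inside it (as in the posted config), and that the container's dig accepts -p and @server; adjust as needed.

```shell
#!/bin/sh
# Illustrative helpers for reading dig-style output on stdin.

# Print the resolver that produced the answer (the ";; SERVER:" line),
# dropping a "#port(host)" suffix if one is present.
answered_by() {
  awk '/^;; SERVER:/ { sub(/#.*/, "", $3); print $3 }'
}

# Print the address of every A record in the output, skipping comment lines.
a_records() {
  awk '$1 !~ /^;/ && $3 == "IN" && $4 == "A" { print $5 }'
}

# Succeed if the argument is an RFC 1918 private IPv4 address.
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*|172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Possible usage (assumptions as described above):
#   docker exec unbound dig lancache.steamcontent.com | answered_by
#   docker exec unbound dig -p 5053 @127.0.0.1 lancache.steamcontent.com | a_records
```

If the first command prints 127.0.0.11, the answer came through Docker's internal resolver rather than from unbound itself, so the second query (straight at unbound's own port) is the one that tells you what unbound really returns.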
I mean, yes, I know I must've overridden it somewhere even though I think I've blatted it everywhere. I am not sure what 127.0.0.11 is because my containers have specific IPs configured:
HOST_IP=192.168.1.89
IP_SUBNET=172.18.0.0/24
PIHOLE_IP=172.18.0.100
UNBOUND_IP=172.18.0.101
ORBITAL_SYNC_IP=172.18.0.102
I shall investigate to see what that IP points to. The host machine has DNS configured to both this instance of unbound and my backup one (which has identical config files, in theory):
addresses: [172.18.0.100, 192.168.1.93]
You're probably right that it's some internal docker IP but it's worth investigating. I should note that my backup instance of unbound does not have this issue:
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 40956
;; flags: qr rd ra ; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;; lancache.steamcontent.com. IN A
;; ANSWER SECTION:
lancache.steamcontent.com. 10800 IN CNAME origin-tier2.steampipe.steamcontent.com.
origin-tier2.steampipe.steamcontent.com. 300 IN CNAME steampipe-origin-tier2.steamcontent.com.
steampipe-origin-tier2.steamcontent.com. 300 IN CNAME cache-origin.steampipe.steamcontent.akadns.net.
cache-origin.steampipe.steamcontent.akadns.net. 60 IN CNAME dist-lhr1.discovery.steamserver.net.
dist-lhr1.discovery.steamserver.net. 60 IN A 162.254.196.12
dist-lhr1.discovery.steamserver.net. 60 IN A 162.254.196.11
;; AUTHORITY SECTION:
;; ADDITIONAL SECTION:
;; Query time: 164 msec
;; SERVER: 127.0.0.11
;; WHEN: Thu Aug 15 15:44:02 2024
;; MSG SIZE rcvd: 255
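Since the host lists two DNS servers, and Docker's internal resolver normally hands external lookups on to whatever the host is configured with, one way to narrow this down might be to ask each configured server the same question separately. A minimal sketch, assuming a resolv.conf-style file is where the server list lives (on hosts using systemd-resolved the real list may be in a different file); list_nameservers is a made-up helper name:

```shell
#!/bin/sh
# Print the nameserver addresses from a resolv.conf-style file
# (defaults to /etc/resolv.conf when no path is given).
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Possible usage: query every configured server in turn and compare answers.
#   for ns in $(list_nameservers); do
#     echo "== $ns"
#     dig "@$ns" lancache.steamcontent.com
#   done
```

Whichever server returns 192.168.1.92 is the one still carrying the override.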
Finally sorted it, thanks for the tips.
I was being led on a wild goose chase: I thought the issue was with my primary instance of unbound, because running dig in it was coming back with the incorrect local IP. It turns out that, because of the host's DNS settings, that container was unable to use its own paired (primary) instance of Pihole to resolve DNS requests. So running dig in the primary unbound container was actually sending the request to my secondary Pihole. And that's where the problem was: I hadn't removed the offending Lancache dnsmasq.d files from my secondary instance of Pihole.
The fact that dig was listing the wrong IP on the primary instance of unbound turned out to be a red herring!
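For anyone landing here with the same symptom, a rough sketch of how to hunt for leftover overrides like these on each Pihole instance. The helper name is made up, and /etc/dnsmasq.d in the usage comment is an assumption about where your install keeps its dnsmasq snippets:

```shell
#!/bin/sh
# List files under a directory that contain a given fixed string.
# Returns non-zero (like grep) when nothing matches.
find_dns_overrides() {
  dir=$1
  pattern=$2
  grep -rlF -- "$pattern" "$dir" 2>/dev/null
}

# Possible usage, run on (or inside the container of) EACH Pihole instance:
#   find_dns_overrides /etc/dnsmasq.d steamcontent
#   find_dns_overrides /etc/dnsmasq.d 192.168.1.92
```

Searching for the stale IP as well as the hostname should catch overrides for other cached domains that point at the old lancache box. The important part is to run it on every instance, not only the one you think is answering.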