Unbound + dnsmasq

Started by morik_opnsense, November 28, 2023, 08:49:41 PM

November 28, 2023, 08:49:41 PM Last Edit: January 19, 2024, 07:49:42 PM by morik_opnsense
Hello experts,
I have the following setup:
Internet<--DoT-->Unbound(also maintains DHCP mappings)<--(normal_DNS)-->Pihole<--(normal_DNS)-->clients
clients are configured w/ Pihole addresses. Pihole is configured with Unbound as upstream DNS. Unbound is configured with DNS over TLS for WAN resolution.

Recently, I purchased a Roborock S8 vacuum cleaner. I created a firewall rule to allow VLAN_x traffic to a specific FQDN (mqtt-us.roborock.com) over port 8883. It worked great for a day. The app stopped working the next day. A quick dig revealed the issue: the destination IP addresses had changed. So, I manually updated the addresses in the firewall rule. Great. Day#2 same issue. Same solution. Day#3 same issue. So on and so forth. A more elegant solution was required.


$ dig mqtt-us.roborock.com

; <<>> DiG 9.10.6 <<>> mqtt-us.roborock.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46538
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;mqtt-us.roborock.com. IN A

;; ANSWER SECTION:
mqtt-us.roborock.com. 583 IN CNAME mqtt-slb-1st-1913472363.us-east-1.elb.amazonaws.com.
mqtt-slb-1st-1913472363.us-east-1.elb.amazonaws.com. 60 IN A 44.209.56.31
mqtt-slb-1st-1913472363.us-east-1.elb.amazonaws.com. 60 IN A 52.7.27.196
mqtt-slb-1st-1913472363.us-east-1.elb.amazonaws.com. 60 IN A 54.235.188.250

;; Query time: 355 msec
;; SERVER: a.b.c.d#53
;; WHEN: Tue Nov 28 11:11:30 PST 2023
;; MSG SIZE  rcvd: 159


Their servers are hosted in AWS behind elastic load balancers. Therefore, the A/AAAA addresses keep changing (not just rotating).
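
To tell plain round-robin rotation apart from a genuine change of addresses, two lookups taken some time apart can be compared as sets rather than as ordered lists. A minimal sketch in POSIX shell; the function names `normalize_ips` and `same_ip_set` are made up for illustration and are not part of dig or OPNsense:

```shell
#!/bin/sh
# Hypothetical helpers for comparing two DNS answers.

# Turn a whitespace-separated list of addresses into one address per line,
# sorted and without duplicates.
normalize_ips() {
    printf '%s\n' "$1" | tr -s ' \t' '\n\n' | sed '/^$/d' | sort -u
}

# Exit status 0 when both lists hold the same addresses in any order,
# non-zero otherwise.
same_ip_set() {
    [ "$(normalize_ips "$1")" = "$(normalize_ips "$2")" ]
}
```

With `old` and `new` holding the address lines of two `dig +short A mqtt-us.roborock.com` runs, `same_ip_set "$old" "$new" || echo "address set changed"` would stay quiet on pure reordering. Note that `dig +short` also prints the CNAME target, which has to be filtered out first.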

The answer to this problem seems to lie in the https://forum.opnsense.org/index.php?topic=27650.0 thread: use dnsmasq to resolve a wildcard or specific domain and store the result in an alias that is used in an Opnsense firewall rule. The details are in https://github.com/opnsense/core/issues/4145. Great! I'd like to try it. But my issue is: how do I enable dnsmasq alongside unbound just for those domains?

It took a long time to make the end-to-end DNS flows in my home setup functional. How does one go about enabling dnsmasq to work together w/ unbound with minimal changes? Because unbound is running on port 53, I assume that, at minimum, I can't run dnsmasq on the same port. If that's the case, how do I configure dnsmasq so that it only responds to wildcard queries, forwarded from unbound, for mqtt*.roborock.com? The Google gods aren't showing mercy.

Help please.

while i can't say i endorse hacky workarounds i understand you gotta do what you gotta do to make shit work.

i gather that you're using aliases+pf rules to lock down traffic to the furthest extent possible on what reads like an iot vlan. if the devices only have access to manually-whitelisted endpoints and your vacuum wants to reach a mystery machine hosted on AWS for which you can't make a threat assessment, why not just allow access to any & all AWS ranges? the list is published & updated [1] but you'll either need to parse the json (e.g. with jq) or find a reliable source of these same data in a format consumable by opnsense...i'm sure such a thing exists. you might also use that step to remove any ranges not associated with EC2 (your hostname resolves to an elastic load balancer host and all IPs it distributes, in your case, are EC2).

stated otherwise, if it were to start connecting to another EC2 server(s) tomorrow (which it already has and will continue to do), you'd have no better insight into its purpose than you do already so you'd allow it. is it a broad swath of ip space? absolutely. are you going to mitigate newly introduced threats from that one device by restricting its traffic to a subset of ranges on the same service? i'd say it's a near zero chance but i'm sure some swingin' d**k cybersec expert with more certs to their name than they have common sense would disagree.

also imho you should call the alias for vacuum endpoints 'this_shit_sucks' for obvious reasons.

add'l point... pihole-FTL shares enough code base with dnsmasq to describe it as being functionally similar, and as such it will parse and leverage config files in /etc/dnsmasq.d/. from a workflow perspective these options are considered before its forwarding mechanism (to resolvers defined via pihole UI) kicks in. ootb, nothing that can meet your desire for pf rules unless you leveraged it to generate a locally-hosted reference for alias lookup via url on opnsense (e.g. alias = http://192.168.69.69/this_shit_sucks).
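
One way to read the "locally-hosted reference" idea: a script on the pihole box resolves the hostname on a schedule and writes the addresses to a text file that its web server already serves, and an opnsense url table alias then fetches that file. A rough sketch only; the helper names `only_ipv4` and `publish_list` and the path `/var/www/html/roborock.txt` are assumptions for illustration, not anything pihole ships:

```shell
#!/bin/sh
# Hypothetical helpers for publishing resolved addresses as a plain-text list.

# Keep only lines that look like IPv4 addresses (this drops the CNAME target
# that `dig +short` also prints), sorted and without duplicates.
only_ipv4() {
    grep -E '^([0-9]{1,3}\.){3}[0-9]{1,3}$' | sort -u
}

# Write stdin to the file named in $1 via a temp file and rename, so a client
# never fetches a half-written list. An empty result (e.g. a failed lookup)
# leaves the previous file untouched and returns 1.
publish_list() {
    tmp="$1.tmp.$$"
    cat > "$tmp"
    if [ -s "$tmp" ]; then
        mv "$tmp" "$1"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example for a cron job (needs network access, so it is left commented out):
# dig +short A mqtt-us.roborock.com | only_ipv4 | publish_list /var/www/html/roborock.txt
```

The alias is only as fresh as the last cron run plus the alias refresh interval, so with 60-second TTLs on the A records some lag is unavoidable.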

add'l point 2...if you insist upon addressing it with opnsense you should still be able to make it happen by:

  • change dnsmasq listen port to something besides unbound's port 53, e.g. 53053
  • add override for roborock.com at opnsense > domain overrides, specifying 127.0.0.1@53053 as authoritative nameserver
  • figure out what cli voodoo is required to add/modify dnsmasq configs on opnsense (making sure they persist)
  • cross fingers, pray to the networking gods regularly, and research alternatives in the interim
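
For the first two bullets, the moving parts boil down to a handful of config lines. The sketch below just prints them so they can be inspected; the function names are invented here, and where the output has to live on opnsense so that it persists is still the open question from the third bullet:

```shell
#!/bin/sh
# Hypothetical generators; they print config text and change nothing.

# Print an Unbound forward-zone stanza that sends one domain to a resolver on
# a non-standard port. do-not-query-localhost must be "no" for Unbound to
# forward to a loopback address.
# Usage: unbound_forward_zone DOMAIN ADDRESS PORT
unbound_forward_zone() {
    cat <<EOF
server:
    do-not-query-localhost: no
forward-zone:
    name: "$1"
    forward-addr: $2@$3
EOF
}

# Print the matching dnsmasq lines: serve DNS on the given address and port only.
# Usage: dnsmasq_listener ADDRESS PORT
dnsmasq_listener() {
    cat <<EOF
port=$2
listen-address=$1
bind-interfaces
EOF
}
```

`unbound_forward_zone roborock.com 127.0.0.1 53053` and `dnsmasq_listener 127.0.0.1 53053` print the two fragments for the port used in the bullets. Once dnsmasq is up, `dig @127.0.0.1 -p 53053 mqtt-us.roborock.com` is a quick way to check that it answers on its own before unbound is pointed at it.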
add'l point 3...subjectively, trying to do *more* with dns on opnsense vs. moving it elsewhere has led to more problems than solutions for me (e.g. host blocking via unbound DNSBL on opnsense will give you heartburn, not exploring the adguard community plugin is an exercise in sanity preservation, i could go on). i'm sure circumstances such as migrating to unbound as default, changing ddns clients, etc don't help matters much but i'm not in the business of making excuses for time wasted.

[1] https://ip-ranges.amazonaws.com/ip-ranges.json

@firewall 100% agree. The only real solution to a problem like this is to isolate that damn device. If you are in the lucky situation to have WiFi infrastructure that can do multiple SSIDs mapped to VLANs, then give that vacuum its own VLAN and permit access to the whole Internet but not anything local. Case closed in my opinion.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Another option would be to block all access except to the AWS ASN but that still leaves a big chunk open.

folks, please allow me to apologize for the long delay in response. I had a family emergency to attend. Please allow me some time to look through the responses and revert with questions. Thank you for your continued support.

Quote from: Patrick M. Hausen on November 30, 2023, 08:32:48 PM
@firewall 100% agree. The only real solution to a problem like this is to isolate that damn device. If you are in the lucky situation to have WiFi infrastructure that can do multiple SSIDs mapped to VLANs, then give that vacuum its own VLAN and permit access to the whole Internet but not anything local. Case closed in my opinion.
Thank you for the suggestion. I did consider introducing a VLAN dedicated to just one machine. For pre-802.11ax devices (STAs), that'd imply a different BSSID for this VLAN, i.e. it'll be a separate WLAN network. That means extra overhead from an OA&M (management) perspective and, more importantly, wasted channel bandwidth. Starting with 802.11ax, the MBSSID feature can be used, but most IoT devices don't support it.

January 19, 2024, 08:10:22 PM #6 Last Edit: January 21, 2024, 04:22:22 PM by morik_opnsense
Quote from: firewall on November 30, 2023, 08:08:10 PM
i gather that you're using aliases+pf rules to lock down traffic to the furthest extent possible on what reads like an iot vlan. if the devices only have access to manually-whitelisted endpoints and your vacuum wants to reach a mystery machine hosted on AWS for which you can't make a threat assessment, why not just allow access to any & all AWS ranges? the list is published & updated [1] but you'll either need to parse the json (e.g. with jq) or find a reliable source of these same data in a format consumable by opnsense...i'm sure such a thing exists. you might also use that step to remove any ranges not associated with EC2 (your hostname resolves to an elastic load balancer host and all IPs it distributes, in your case, are EC2).
Thank you for the very well-thought-out response, and the pointers. To balance convenience vis-à-vis security (and, more importantly, the wife factor), for now I ended up changing the firewall rule to allow *:8883. I'm unfamiliar w/ jq but will look into parsing the json via jq for integration w/ opnsense.

Quote from: firewall on November 30, 2023, 08:08:10 PM
add'l point... pihole-FTL shares enough code base with dnsmasq to describe it as being functionally similar, and as such it will parse and leverage config files in /etc/dnsmasq.d/. from a workflow perspective these options are considered before its forwarding mechanism (to resolvers defined via pihole UI) kicks in. ootb, nothing that can meet your desire for pf rules unless you leveraged it to generate a locally-hosted reference for alias lookup via url on opnsense (e.g. alias = http://192.168.69.69/this_shit_sucks).
I must apologize but i'm unable to understand the proposed solution here. Make some change to pihole which then influences firewall rule execution at opnsense? Please do elaborate at your convenience.

Quote from: firewall on November 30, 2023, 08:08:10 PM
add'l point 2...if you insist upon addressing it with opnsense you should still be able to make it happen by:

  • change dnsmasq listen port to something besides unbound's port 53, e.g. 53053
  • add override for roborock.com at opnsense > domain overrides, specifying 127.0.0.1@53053 as authoritative nameserver
  • figure out what cli voodoo is required to add/modify dnsmasq configs on opnsense (making sure they persist)
  • cross fingers, pray to the networking gods regularly, and research alternatives in the interim
Ha ha. I'm considering replacing the aging pihole w/ adguard on Opnsense itself. So, introducing dnsmasq and the complications associated w/ changing network infrastructure (e.g. I have a shit time getting Active Directory to work in this setup as such) is not my preferred route at this point. But, beggars can't be choosers.

Quote from: firewall on November 30, 2023, 08:08:10 PM
add'l point 3...subjectively, trying to do *more* with dns on opnsense vs. moving it elsewhere has led to more problems than solutions for me (e.g. host blocking via unbound DNSBL on opnsense will give you heartburn, not exploring the adguard community plugin is an exercise in sanity preservation, i could go on). i'm sure circumstances such as migrating to unbound as default, changing ddns clients, etc don't help matters much but i'm not in the business of making excuses for time wasted.
You don't recommend adguard plug-in on Opnsense? If not, what about clients-->adguard (e.g. as a vm) --> opnsense unbound --> interwebs? Opnsense blocklists are brilliant but troubleshooting why a particular website is getting blocked is worse (to me) than pulling my own tooth. I see folks recommending adguard on Opnsense to ease off this troubleshooting?

Create an Alias with these two hosts

- mqtt-us.roborock.com.

- mqtt-slb-1st-1913472363.us-east-1.elb.amazonaws.com


Apply the Alias as Destination on the FW rule.


That should be all. Nothing else to do with dnsmasq or pi-hole

@newsense, much like what i described as 'hacky workarounds' in responding to OP's specific request for a dnsmasq-based solution, this too is only as viable as the alias updates it relies upon.

@morik_opnsense, i'd suggest either allowing to all AWS ranges (see below) or further isolating the device. if your concern is based on possible phone-home to china do both.

establish initial list:
pkg install jq
curl https://ip-ranges.amazonaws.com/ip-ranges.json  | jq -r '.prefixes | keys[] as $k | "\(.[$k] | .ip_prefix)"' > /usr/local/www/aws_ranges.txt


establish update mechanism:
- firewall > aliases > add
- type 'url table (ips)', refresh frequency 5 days should be fine, content = http://127.0.0.1/aws_ranges.txt

then your pf rule just needs to block all vacuum traffic with exception of aws_ranges alias.
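
The jq filter above keeps every IPv4 prefix AWS publishes. The earlier idea of dropping everything that isn't EC2 could be sketched like this; `aws_prefixes` is a hypothetical helper name, and narrowing the list to `us-east-1` is an assumption taken from the load balancer hostname in the dig output (the vendor may well use other regions too):

```shell
#!/bin/sh
# Hypothetical helper: read ip-ranges.json on stdin and print the IPv4 prefixes
# that belong to one service in one region, one CIDR per line.
# Usage: aws_prefixes SERVICE REGION
aws_prefixes() {
    jq -r --arg svc "$1" --arg region "$2" \
        '.prefixes[] | select(.service == $svc and .region == $region) | .ip_prefix'
}

# Example (needs network access, so it is left commented out):
# curl -s https://ip-ranges.amazonaws.com/ip-ranges.json \
#     | aws_prefixes EC2 us-east-1 > /usr/local/www/aws_ranges.txt
```

IPv6 ranges sit in a separate `ipv6_prefixes` array (key `ipv6_prefix`) and are ignored here, same as in the original command.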

more fun with add'l points: after poking around at the hosts listed at https://www.virustotal.com/gui/domain/roborock.com/relations i'd hire a maid instead. :)

Quote from: morik_opnsense on January 19, 2024, 08:10:22 PM
You don't recommend adguard plug-in on Opnsense?

i'd never use nor recommend it, and tbh i feel those that do are comfortable with disregarding opsec in favor of convenience.
https://www.reddit.com/r/Adguard/comments/t05r9v/russian_app_safe/
https://www.reddit.com/r/Adguard/comments/t05r9v/comment/iqtcxqg/


Quote from: firewall on March 06, 2024, 04:15:19 AM
@morik_opnsense, i'd suggest either allowing to all AWS ranges (see below) or further isolating the device. if your concern is based on possible phone-home to china do both.

establish initial list:
pkg install jq
curl https://ip-ranges.amazonaws.com/ip-ranges.json  | jq -r '.prefixes | keys[] as $k | "\(.[$k] | .ip_prefix)"' > /usr/local/www/aws_ranges.txt


establish update mechanism:
- firewall > aliases > add
- type 'url table (ips)', refresh frequency 5 days should be fine, content = http://127.0.0.1/aws_ranges.txt

then your pf rule just needs to block all vacuum traffic with exception of aws_ranges alias.

What benefit does this provide over just creating an ASN alias for AWS?  https://ipinfo.io/AS16509

Quote from: CJ on March 06, 2024, 04:13:02 PM
What benefit does this provide over just creating an ASN alias for AWS?  https://ipinfo.io/AS16509

A1: OP is solutioning for pass/allow rule(s), which, as i'm sure you'd agree, should be as conservative as possible.

A2: not creating a pass/allow rule for every range under jeffrey's jurisdiction.

A3: sourcing a known-good list of ranges directly from its controlling parties vs. $unknown_place_opnsense_gets_and_maintains_its_asn_ranges

A4: accounting for the manner in which amz publishes those ranges. see 'Note' in block at top of https://docs.aws.amazon.com/vpc/latest/userguide/aws-ip-ranges.html

do you even firewall bro?  :P