Roku DNS storm is impacting OPNsense

Started by OPNenthu, Today at 11:44:12 AM

I'm seeing this exact issue: https://github.com/FreshTomato-Project/freshtomato-arm/issues/268

My parents have a Roku box that just started flooding DNS with queries for its telemetry endpoints, which are blocked by DNSBL policy. I'm seeing millions of requests in the reporting period (I think OPNsense keeps the last 24 hours). The only issue is that it's causing log buildup, which is overwhelming the system. Memory use went from ~20% (baseline) to over 60%, mostly due to Unbound's logger (attached). The Unbound reporting page takes half a minute to load, and I'm also seeing slowdowns in the live view and when the Firewall widget loads.


What I did remotely was to force the WiFi client to reconnect via the UniFi console.  Unfortunately it immediately started spamming DNS again once it reconnected.  For the moment I've blocked the device from internet access.

The recommendation in the GH link is to redirect the telemetry endpoint to some blackhole IP instead of 0.0.0.0.  I think that could end up being a maintenance issue if the hostnames change, so I'm wondering if I can instead rate limit the DNS requests just from this device?  A quick forum search seems to indicate there's no way to do that, but I'm not sure.  Appreciate tips on how to best proceed (short of throwing the Roku in the trash).
N5105 | 8/250GB | 4xi226-V | Community

Why not redirect the Roku to 127.0.0.1, letting it spam itself?
- Jim

Today at 12:34:08 PM #2 Last Edit: Today at 12:40:42 PM by OPNenthu
It was tried (4th comment in the ticket) and apparently only worked initially.

Worth a shot, though.

(EDIT): I think the problem is that I would have to set up an alias with the specific telemetry endpoints to use as the destination in the DNAT rule.  Roku apparently has many such endpoints.  I can't keep such a list manually updated and reliable.

For example, in my logs it's spamming "brewster.logs.roku.com" but in the logs in the linked ticket it's spamming "bayside.logs.roku.com".
N5105 | 8/250GB | 4xi226-V | Community

Today at 12:50:08 PM #3 Last Edit: Today at 12:54:26 PM by Monviech (Cedrik)
You might be able to use a firewall overload table combined with a block rule.

If your DNS rule matches and too many requests are sent then the client will be added to the defined overload table.

With a block rule placed before the DNS allow rule, that client is then blocked for some time.

But that would block all DNS traffic of that client. So kinda moot if it should still be allowed "something" and only telemetry should be blackholed.
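
To make the trade-off concrete, here is a minimal sketch of the overload idea as a toy Python model. It is not pf and not OPNsense code; the class name, thresholds, and IP addresses are made up for illustration.

```python
from collections import defaultdict, deque


class OverloadTable:
    """Toy model of a per-source rate limit with an overload table (not pf itself)."""

    def __init__(self, max_queries, window_s, block_s):
        self.max_queries = max_queries      # queries allowed per window
        self.window_s = window_s            # window length in seconds
        self.block_s = block_s              # how long an offender stays blocked
        self.recent = defaultdict(deque)    # source IP -> timestamps of recent queries
        self.blocked_until = {}             # source IP -> time the block expires

    def allow(self, src_ip, now):
        # The block rule is evaluated first: drop everything from an overloaded source.
        if self.blocked_until.get(src_ip, 0) > now:
            return False
        times = self.recent[src_ip]
        # Forget queries that have fallen out of the window.
        while times and times[0] <= now - self.window_s:
            times.popleft()
        times.append(now)
        if len(times) > self.max_queries:
            # Too many queries: add the source to the overload table.
            self.blocked_until[src_ip] = now + self.block_s
            times.clear()
            return False
        return True


table = OverloadTable(max_queries=5, window_s=1.0, block_s=60.0)
roku = "192.168.1.50"          # hypothetical address of the noisy client
flood = [table.allow(roku, now=0.1 * i) for i in range(6)]
legit_lookup = table.allow(roku, now=10.0)            # same device, wanted query
other_client = table.allow("192.168.1.51", now=10.0)  # unrelated device
```

The last two checks show the drawback: once the noisy client is in the table, its wanted lookups are dropped too, while other clients are unaffected.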

If the telemetry endpoints are all under some certain wildcard domains you could also use a dnsmasq ipset alias to banish them to the shadow realm.
https://docs.opnsense.org/manual/dnsmasq.html#firewall-alias-ipset

Hardware:
DEC740

In AdGuard Home you could do something like this:

'rewrites':
  - 'domain': '*.logs.roku.com'
    'answer': 127.0.0.1
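
As a rough illustration of why a wildcard helps here, the Python sketch below checks hostnames against a `*.logs.roku.com` style pattern. The matching rule (subdomains only, not the bare domain) is an assumption for the sketch, not a statement of AdGuard Home's exact semantics, and `tv.example.net` is a made-up non-telemetry name.

```python
def matches_wildcard(pattern, hostname):
    """Return True if hostname falls under a '*.example.com' style pattern."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if not pattern.startswith("*."):
        # No wildcard: require an exact match.
        return hostname == pattern
    suffix = pattern[1:]  # "*.logs.roku.com" -> ".logs.roku.com"
    # Require at least one label in front of the suffix.
    return hostname.endswith(suffix) and len(hostname) > len(suffix)


pattern = "*.logs.roku.com"
for name in ("brewster.logs.roku.com", "bayside.logs.roku.com", "tv.example.net"):
    print(name, matches_wildcard(pattern, name))
```

Both hostnames mentioned earlier in the thread fall under the one pattern, so the list does not need updating each time a new telemetry host appears under the same parent domain.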

And in Unbound:

Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Today at 01:39:55 PM #5 Last Edit: Today at 01:42:47 PM by OPNenthu
Thank you, I added the host override in DNS but it looks like the Roku doesn't want to shut up.  I tried both 127.0.0.1 and 192.168.254.254.  The log spam continues.

Quote from: Monviech (Cedrik) on Today at 12:50:08 PM
But that would block all DNS traffic of that client. So kinda moot if it should still be allowed "something" and only telemetry should be blackholed.

If the telemetry endpoints are all under some certain wildcard domains you could also use a dnsmasq ipset alias to banish them to the shadow realm.
https://docs.opnsense.org/manual/dnsmasq.html#firewall-alias-ipset

Yeah, I need for it to stay connected so they can watch TV and only the DNS spam should be controlled.  A firewall rule for all *.logs.roku.com would let me disable logging so that could possibly do the trick, as long as they don't change it.  Thanks, will look into this.

Long term, it would be nice for OPNsense to have a rate limiting function :)
N5105 | 8/250GB | 4xi226-V | Community

Today at 02:21:18 PM #6 Last Edit: Today at 02:24:04 PM by Monviech (Cedrik)
Well, OPNsense does have a rate limiting function: it's the PF overload tables and/or request limiters. But it can only limit by source, destination, and other layer 3 identifiers.

Since your destination is semi-dynamic, it could only rate limit by source, and that would then block the whole source or drop requests that you might want.

A rate limiter would do the same: it couldn't identify which traffic you want and which you don't, because there is no metadata to use here (other than source and destination...)

If you want to rate limit on a higher OSI layer (e.g. on the application layer, by inspecting the DNS packet contents), the firewall would have to do more processing itself. That is much the same as simply letting the requests hit the DNS daemon anyway, since the daemon can handle that load if the hardware itself can.

And then we get into CDN territory, since you need beefier hardware to filter out requests that should be rate limited before they hit your own smaller hardware. The idea there is to push expensive filtering/classification away from the smaller origin system. On a home firewall, that extra layer usually does not exist, so the resolver still has to receive and inspect the request before it can decide what to do with it.
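
To show what "checking the DNS packet contents" involves, here is a hedged Python sketch of an application-layer limiter. It is not an OPNsense, pf, or Unbound feature; `build_query`, `parse_qname`, and `SuffixRateLimiter` are hypothetical helpers, and the rate, burst, and IP address are arbitrary. Every packet has to be parsed before a decision can be made, which is the extra work described above.

```python
import struct


def build_query(qname, qid=0x1234):
    """Build a minimal DNS query packet (type A, class IN) for the demo."""
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(bytes([len(p)]) + p.encode("ascii") for p in qname.split("."))
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def parse_qname(packet):
    """Extract the first question name from a DNS query packet."""
    labels, pos = [], 12  # the DNS header is 12 bytes long
    while True:
        length = packet[pos]
        if length == 0:
            break
        if length & 0xC0:
            raise ValueError("compressed names are not expected in a question")
        labels.append(packet[pos + 1:pos + 1 + length].decode("ascii").lower())
        pos += 1 + length
    return ".".join(labels)


class SuffixRateLimiter:
    """Token bucket per (client, suffix); names outside the suffixes are never limited."""

    def __init__(self, suffixes, rate_per_s, burst):
        self.suffixes = [s.lower().rstrip(".") for s in suffixes]
        self.rate = rate_per_s
        self.burst = burst
        self.buckets = {}  # (client IP, suffix) -> (tokens left, last update time)

    def allow(self, client_ip, packet, now):
        name = parse_qname(packet)  # the per-packet inspection cost
        suffix = next(
            (s for s in self.suffixes if name == s or name.endswith("." + s)), None
        )
        if suffix is None:
            return True
        key = (client_ip, suffix)
        tokens, last = self.buckets.get(key, (self.burst, now))
        # Refill tokens for the time elapsed, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[key] = (tokens - 1, now)
            return True
        self.buckets[key] = (tokens, now)
        return False


limiter = SuffixRateLimiter(["logs.roku.com"], rate_per_s=1.0, burst=2)
roku = "192.168.1.50"  # hypothetical client address
spam = build_query("brewster.logs.roku.com")
normal = build_query("example.com")
spam_results = [limiter.allow(roku, spam, now=0.0) for _ in range(4)]
normal_results = [limiter.allow(roku, normal, now=0.0) for _ in range(4)]
spam_later = limiter.allow(roku, spam, now=1.0)
```

In this sketch only the telemetry names are throttled and everything else from the same client passes, but the limiter still receives and parses every request, so the load has only moved, not disappeared.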
Hardware:
DEC740

Quote from: OPNenthu on Today at 11:44:12 AM
Appreciate tips on how to best proceed (short of throwing the Roku in the trash).
IMHO that's the best thing to do after reading this: https://discourse.pi-hole.net/t/what-domain-to-whitelist-to-unblock-roku-tv-schedule/86284
And now your topic too... :'(

Quote from: OPNenthu on Today at 01:39:55 PM
Long term, it would be nice for OPNsense to have a rate limiting function :)
Pi-hole has the same limit that dnsmasq has:
Quote
-0, --dns-forward-max=<queries>

Set the maximum number of concurrent DNS queries. The default value is 150, which should be fine for most setups. The only known situation where this needs to be increased is when using web-server log file resolvers, which can generate large numbers of concurrent queries.
So in theory the one in OPNsense should behave the same :)
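
For what it's worth, a cap like that can be pictured with the small Python sketch below. It is a toy model, not dnsmasq code: the class and method names are hypothetical, and it assumes a single global counter because the quoted text describes a limit on concurrent queries overall, with no per-client component.

```python
class ForwardLimiter:
    """Toy model of a cap on concurrent forwarded queries (not dnsmasq code)."""

    def __init__(self, max_in_flight=150):
        self.max_in_flight = max_in_flight
        self.in_flight = 0

    def try_forward(self):
        # Refuse new work while the cap is reached.
        if self.in_flight >= self.max_in_flight:
            return False
        self.in_flight += 1
        return True

    def reply_received(self):
        # An upstream answer frees one slot.
        if self.in_flight > 0:
            self.in_flight -= 1


limiter = ForwardLimiter(max_in_flight=150)
accepted = sum(limiter.try_forward() for _ in range(200))
limiter.reply_received()
accepted_after_reply = limiter.try_forward()
```

Under that assumption the cap bounds total concurrency rather than throttling one device, so a single noisy client could still use up the slots that other clients need.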
Weird guy who likes everything Linux and *BSD on PC/Laptop/Tablet/Mobile and funny little ARM based boards :)