Rules association

Started by cookiemonster, September 07, 2023, 05:45:53 PM

Hello.
I really need assistance to spot my error in setting firewall rules.
It is a recent problem, my setup has been working fine for months. The changes that have occurred recently are beyond my understanding of OPN.

Problem:
My LAN clients have been losing DNS resolution regularly.
When it happens, all digs to their configured server 192.168.5.1:53 time out, even though they had been working for hours beforehand.

Setup:
OPNsense 23.1.11_1-amd64
Unbound listens on all interfaces on port 5353.
AdguardHome listens on all interfaces on port 53 of OPN.
Only IPV4 in this installation.
System > Settings > General | DNS servers is empty.
Allow DNS server list to be overridden by DHCP/PPP on WAN is not ticked.
Do not use the local DNS service as a nameserver for this system is ticked.
DHCPv4 has empty fields in the DNS servers field for each interface. Unbound sends DHCPv4 responses pointing the clients to 192.168.5.1 as DNS server, clearly to TCP/UDP port 53.


Outbound rules (Hybrid mode):

Interface Source Source Port Destination Destination Port NAT Address NAT Port Static Port Description
LAN LAN net tcp/udp/ * 192.168.5.1/32 tcp/udp/ 53 Interface address * NO Prevents hardcoded DNS clients from giving unexpected source error after DNS redirected to Adguard. 

Has a match local tag "forward"


OPT1 OPT1 net tcp/udp/ * 192.168.6.1/32 tcp/udp/ 53 Interface address * NO Prevents hardcoded DNS clients from giving unexpected source error after DNS redirected to Adguard. 

Has a match local tag "forward"



NAT:PORT FORWARD:
Source Destination NAT
Interface Proto Address Ports Address Ports IP Ports Description
<-> LAN TCP/UDP * * ! LAN net 53 (DNS) 192.168.5.1 53 (DNS) LAN-Redirect DNS requests to internal DNS resolver Adguard

This rule has a Set local tag "forward-AdG-LAN". Match local tag is empty. NAT reflection is "Use system default". Filter rule association is "Rule LAN-Redirect DNS requests to internal DNS resolver Adguard" which is the rule I modified today to have that label so I could identify it in this association.
Before today the local tag as "forward" and the associated rule was "Redirect DNS requests to internal DNS resolver Adguard"

|>      OPT1 TCP/UDP * * ! OPT1 net 53 (DNS) 192.168.6.1 53 (DNS) OPT1-Redirect DNS requests to internal DNS resolver Adguard

This rule has a Set local tag "forward-AdG-OPT1". Match local tag is empty. NAT reflection is "Use system default".

This second rule had to be recreated after removing and re-adding the interface OPT1. I guess that this is the reason it isn't shown as a linked rule i.e. Filter rule association is "none".

Firewall: Rules: LAN
Protocol Source Port Destination Port Gateway Schedule Description
IPv4 TCP/UDP * * 192.168.5.1 53 (DNS) * * LAN-Redirect DNS requests to internal DNS resolver Adguard


This rule I can't edit. I guess it was automatically created by the NAT reflection originally.

Diagnostic
- Suspecting the OPN upgrade from 22.7 to 23.1, I removed and re-added the interface on the assignments page.
- Elimination process. I could see a lot of stalled Zenarmor socket-type connections. I changed Zenarmor from native to emulated mode, then stopped it completely. Reboots have happened since. It is presently not running.
- Elimination process. No errors seen in Unbound logs, dmesg, system logs, no logging available in AdGuard for the service but logs from clients stop.
- When it happens, I can ssh to OPN. dig commands to local host port 53 time out. dig to 5353 succeed.
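The same check can be scripted so it runs against both ports in one go. This is only a minimal sketch of a dig-like UDP probe using the Python standard library; the function names `build_query` and `probe`, the test name `example.com` and the 2-second timeout are my own assumptions, not anything from the setup above.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1)
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server: str, port: int, name: str = "example.com",
          timeout: float = 2.0) -> str:
    """Send one UDP query and report whether anything answered in time."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            data, _ = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return f"error: {exc}"
    # A genuine reply echoes the query ID in its first two bytes.
    return "answered" if data[:2] == query[:2] else "unexpected reply"
```

Run on the firewall itself, `probe("127.0.0.1", 53)` would exercise AdGuardHome and `probe("127.0.0.1", 5353)` would exercise Unbound, which is the comparison described in the last diagnostic point.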

I suspect I have set up rule reflections incorrectly. As for why the problem appeared suddenly, I suspect it was latent: the OPT1 interface only started being used very recently.

How should I interpret the match functionality? If two NAT Port Forward rules set the same local tag, will that confuse the NAT Outbound rules that match on that tag, even when they are on different interfaces? In which case, should set and match (normally) only be paired on the same interface?
Even so, shouldn't this scenario be harmless? Even if the match sends the packets to an interface other than the one they came from, in this case the service is listening on all interfaces.
For now I have prepended or appended an interface name to each rule I can, to try to pair each "set" with the correct "match", but when I re-enabled the OPT1 port forward rule the problem recurred. I will try it again and reset states, just in case.
From docs: "NAT rules are always processed before filter rules! So for example, if you define a NAT : port forwarding rules without a associated rule, i.e. Filter rule association set to Pass, this has the consequence, that no other rules will apply!"
So is this my problem: that I need to associate the port forward with a rule? I can't tell which rule is the one to use for OPT1, because the rules I can identify in the list are Rule LAN-Redirect... , Rule Amazonia... and a bunch of others that just show "Rule".

Please advise.

OK I think I have restored the rule association and tried different permutations of the port forward to a rule and although I am not yet clear about these links, I have narrowed down the trigger and can ask differently.

AdGuardHome running on OPN on all interfaces, on port 53. The DHCPv4 service gives to clients the DNS server to use and that is AdGuardHome on the interface ip. AdGuardHome has the single ip of the interface LAN as its upstream, on custom port 5353 which is Unbound.



Unbound settings, listening on all interfaces on port 5353. Custom options sending DNS queries to localhost 127.0.0.1 port 8053. This is a stub resolver getdns/stubby that then sends the queries out to external encrypted resolvers aka DoT.



All up to here works perfectly fine and has done for a long while.
I have these NAT Outbound rules but only the first one is enabled.


I have the following NAT port forward rules to catch clients that try to bypass AdGuard with their own (unencrypted) DNS queries, redirecting those queries to AdGuard on the interface address for each network, and here is my problem. As soon as I enable the one for the OPT1 interface, clients on the LAN interface time out waiting for responses to their DNS queries. I fail to see my error: why does this happen?



A quick look at the firewall live log shows these queries being sent to the "wrong" interface address 192.168.6.1:53, but they originate on the LAN interface network 192.168.5.0/24 so the first port forward has sent them to AdG on.
So I'm sure I have created some misdirection but I can't see it.







Can anyone please spot it?

This is hard on a forum.
So the question is "why, if I enable a NAT: Port Forward rule for the interface OPT1, does traffic go to interface LAN?", with (post 2) as the complete setup information.


No need for ! LAN net  as source or ! OPT1 net for that matter.


Your rule should intercept all DNS queries from every LAN/VLAN and redirect them to 192.168.5.1:53

To avoid modifying the same thing in 100 places remove the rules from the LAN/VLANs and create a Firewall: NAT: Port Forward rule as follows:

Interface - select all interfaces in scope
Proto - TCP/UDP
source - any

destination - any
destination port - DNS

Redirect to 192.168.5.1 Port 53

================================================================

As a bonus, clone the new rule, change Proto to UDP and the destination port to NTP, and redirect to 127.0.0.1 port NTP, so that everything in the network syncs time from the FW

Thanks for the suggestion newsense, I will likely try it.
Any thoughts on why it currently behaves the way it does and breaks things? Surely it is possible to port forward traffic coming into each interface to the service running on it. I'm trying to understand what I am doing wrong.
# sockstat -l
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
unbound  unbound    85075 5  udp4   *:5353                *:*
unbound  unbound    85075 6  tcp4   *:5353                *:*
unbound  unbound    85075 7  udp4   *:5353                *:*
unbound  unbound    85075 8  tcp4   *:5353                *:*
unbound  unbound    85075 9  tcp4   127.0.0.1:953         *:*
root     AdGuardHom 16088 13 tcp4   192.168.5.1:8080      *:*
root     AdGuardHom 16088 14 udp46  *:53                  *:*
root     AdGuardHom 16088 15 tcp46  *:53                  *:*
root     stubby     65272 3  udp4   127.0.0.1:8053        *:*
root     stubby     65272 4  tcp4   127.0.0.1:8053        *:*
root     stubby     65272 5  udp6   ::1:8053              *:*
root     stubby     65272 6  tcp6   ::1:8053              *:*

No NAT rule is needed, that's only compounding to the general confusion. :)


For clients that you don't want to be intercepted by the rule you can do the following:

Create Alias <alias name> - add your machines there.

On the DNS port forward rule make the Source <alias name> and check Source / Invert option


Essentially you'll redirect anything but said alias to AGH
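The inverted-source logic can be sketched in a few lines. This is only an illustration of the matching idea, not how the firewall implements it; the alias name `DNS_BYPASS_ALIAS`, the addresses in it and the function `is_redirected` are hypothetical.

```python
import ipaddress

# Hypothetical alias: machines that are allowed to use their own DNS server.
DNS_BYPASS_ALIAS = {
    ipaddress.ip_address("192.168.5.10"),
    ipaddress.ip_address("192.168.5.11"),
}


def is_redirected(source_ip: str, dest_port: int) -> bool:
    """Source is '! alias': redirect DNS from everyone NOT in the alias."""
    src = ipaddress.ip_address(source_ip)
    return dest_port == 53 and src not in DNS_BYPASS_ALIAS
```

With these example values, a query from 192.168.5.50 to port 53 is redirected to AGH, while the same query from 192.168.5.10 is left alone.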



I appreciate the attempt to help, but I'm more interested in understanding the behaviour that should work but appears not to, -why it does what it does-, than in applying a workaround or a different approach.
The NAT outbound rule works for one interface but the same rule for the other seems to cause a loop.
Why? If you could give an insight on that, I'll be most grateful.

I said it a few times already, NAT has nothing to do with what you're trying to accomplish here.

The only things needed are port forward rule(s) and an occasional alias for more complex scenarios.



September 28, 2023, 04:24:02 PM #9 Last Edit: September 29, 2023, 12:19:30 AM by cookiemonster
Right, and that's what is puzzling.
I am asking exactly that: ignoring NAT, why is all good with just the port forward rule for LAN?
As soon as I enable the identical one for OPT1, DNS packets get from LAN clients to LAN. Picture 3, port forward.
edit: it should read:
As soon as I enable the identical one for OPT1, DNS packets get from LAN clients 192.168.5.0/24 to OPT1 address 192.168.6.1. Picture 3, port forward.

Because of the negation in the destination field, on both rules. Destination should be ANY Destination Port 53.
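The effect of the negated destination can be checked with plain address arithmetic. The sketch below assumes both networks are /24 (only the LAN prefix is stated in the thread; the OPT1 prefix is my assumption) and the helper name `matches_negated_destination` is hypothetical.

```python
import ipaddress

LAN_NET = ipaddress.ip_network("192.168.5.0/24")
OPT1_NET = ipaddress.ip_network("192.168.6.0/24")  # assumed prefix length


def matches_negated_destination(dest_ip: str, excluded_net,
                                dest_port: int = 53) -> bool:
    """True if a packet matches 'destination ! <net>, port 53'."""
    return dest_port == 53 and ipaddress.ip_address(dest_ip) not in excluded_net


# A LAN client querying its configured server 192.168.5.1:53
print(matches_negated_destination("192.168.5.1", LAN_NET))   # "! LAN net"
print(matches_negated_destination("192.168.5.1", OPT1_NET))  # "! OPT1 net"
```

The first check is False and the second is True: an ordinary query to 192.168.5.1 does not match the "! LAN net" rule, but it does satisfy the "! OPT1 net" destination wherever that rule is evaluated.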



You can't control intra-LAN/VLAN traffic from the FW to begin with; that is what the switch is for

Not trying to be contrary here. The LAN clients are connected to the switch and the switch to the LAN port of the firewall, yes. But on the OPT1 interface of the firewall there's nothing except a single appliance client. So the traffic between them must go through the firewall.
You are making me think of the problem in different ways so thank you. If you have any more thoughts, please share.
This I got wrong "As soon as I enable the identical one for OPT1, DNS packets get from LAN clients to LAN. Picture 3, port forward."
It should read: "As soon as I enable the identical one for OPT1, DNS packets get from LAN clients 192.168.5.0/24 to OPT1 address 192.168.6.1. Picture 3, port forward.".

Well, I found a fix, but I won't claim to understand how it works and makes the problem disappear.
The NAT Port Forward rule for LAN is left as is with NAT Reflection "use system default" ie. Yes.
The NAT Port Forward rule for OPT1 is changed to "Disable".
In other words I needed to disable NAT reflection only for the second interface. Otherwise it creates the bouncing of clients from LAN to OPT1. No idea why but will have to come back to try to understand it later. So far I fail to grasp the behaviour of packets when both are enabled.
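One possible way to picture it, offered strictly as an untested model and not as a description of OPNsense internals: assume that a port forward with NAT reflection enabled is also evaluated on the other interfaces, and that the first matching rule wins. The class `PortForward`, the functions below, the /24 prefix for OPT1 and the rule order are all assumptions made for illustration.

```python
import ipaddress
from dataclasses import dataclass

INTERFACES = ("LAN", "OPT1")


@dataclass(frozen=True)
class PortForward:
    interface: str      # interface the rule was created on
    excluded_net: str   # destination is '! <this network>'
    target: str         # redirect target IP
    reflection: bool    # NAT reflection setting for this rule


def effective_interfaces(rule: PortForward) -> set:
    # ASSUMPTION: with reflection on, the rule is also applied on the
    # other interfaces; with reflection off, only on its own interface.
    return set(INTERFACES) if rule.reflection else {rule.interface}


def redirect_target(rules, in_interface: str, dest_ip: str, dest_port: int):
    """Return the redirect target of the first matching rule, else None."""
    dest = ipaddress.ip_address(dest_ip)
    for rule in rules:
        if (in_interface in effective_interfaces(rule)
                and dest_port == 53
                and dest not in ipaddress.ip_network(rule.excluded_net)):
            return rule.target
    return None


lan = PortForward("LAN", "192.168.5.0/24", "192.168.5.1", reflection=True)
opt1_on = PortForward("OPT1", "192.168.6.0/24", "192.168.6.1", reflection=True)
opt1_off = PortForward("OPT1", "192.168.6.0/24", "192.168.6.1", reflection=False)

# LAN client querying its configured server 192.168.5.1:53
print(redirect_target([lan, opt1_on], "LAN", "192.168.5.1", 53))
print(redirect_target([lan, opt1_off], "LAN", "192.168.5.1", 53))
```

Under these assumptions the first call returns 192.168.6.1, which is the bounce from LAN to OPT1 seen in the live log, and the second returns None, meaning the query is left to reach 192.168.5.1 directly. That would be consistent with the fix above, but it remains a guess until checked against the generated rules.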