I think I have discovered a bug with NAT and firewall (solved)

Started by chall88, January 13, 2025, 10:08:37 PM

So you come here calling a misconfiguration a bug, based on not knowing how the product works; you start the thread without any signs of courtesy (not even a hello), thank no one, and still find a way to take a side dig at someone who took time out to try to help you (one of the most knowledgeable networking gurus out there).
Nice.

January 14, 2025, 01:30:57 PM #16 Last Edit: January 14, 2025, 01:59:05 PM by chall88
Quote from: cookiemonster on January 14, 2025, 10:55:31 AM
So you come here calling a misconfiguration a bug, based on not knowing how the product works; you start the thread without any signs of courtesy (not even a hello), thank no one, and still find a way to take a side dig at someone who took time out to try to help you (one of the most knowledgeable networking gurus out there).
Nice.

One must undo several pre-configurations to NAT HTTP and HTTPS without using the firewall bypass mode "pass".

You tinkered until you got it working but we still don't really know what went wrong initially.

You probably didn't have to stop listening on WAN. Port forward will have taken precedence (I have ssh on all interfaces).
If you want to port forward 80 and 443 on WAN, it's indeed wise to move the web GUI off of these (At least for SSH, it is unnecessary).
I have no clue how anti-lockout (likely on LAN) could get in the way of WAN port forwards. (here again, I had this in place for my ssh tests).

January 15, 2025, 03:28:09 PM #18 Last Edit: January 15, 2025, 03:30:38 PM by chall88
Quote from: EricPerl on January 14, 2025, 10:32:03 PM
You tinkered until you got it working but we still don't really know what went wrong initially.

You probably didn't have to stop listening on WAN. Port forward will have taken precedence (I have ssh on all interfaces).
If you want to port forward 80 and 443 on WAN, it's indeed wise to move the web GUI off of these (At least for SSH, it is unnecessary).
I have no clue how anti-lockout (likely on LAN) could get in the way of WAN port forwards. (here again, I had this in place for my ssh tests).


That success also didn't carry over when I replicated those steps at work. I factory reset the device and went through the fairly minimal set of steps involved in creating a port forward, and it behaves the same each time: it works with pass and doesn't without.

I even had my boss, who has been using OPNsense since it forked from pfSense, attempt it, and he couldn't get it to work either. We did manage to load a config from a different firewall, edit the preexisting entries and linked rules, and see it work, but creating new ones doesn't work, so we know the hardware is good. This appears to be some issue with how these rules are generated and linked.

While searching for similar experiences, I found this in a bug report thread about OPNsense not generating reflective SNAT/DNAT rules properly, which mirrors my experience exactly.

I don't know what to say. I've spent about 10 hours banging my head against the wall trying to make a port forward that doesn't involve 'pass'. It's really not that complicated or difficult a task, and I feel this is a genuine bug.



https://imgur.com/a/6zghwJK

Then please provide the exact steps to reproduce the bug, ideally in an issue on GitHub. Thanks!
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

  • Factory Reset
  • Software Version is Business 24.10.1
  • Assign WAN to one interface (DHCP)
  • Assign LAN to one interface (static)
  • Physically connect a device with nginx listening on port 80 to LAN and verify it can ping the gateway
  • Move the GUI to port 33400, disable the HTTP --> HTTPS redirect
  • NAT rule WAN 33400 to lan 33400, use pass so it works and you can work from the wider network
  • NAT rule TCP in on WAN Address port 9999, out on the ip of the physically connected nginx instance on 80. Use associated rule
  • Test it by going to http://wanip:9999
  • Does not work
  • Delete rule
  • Recreate rule using pass
  • Works
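
For anyone repeating these steps, the "test it" part can be scripted from a client on the WAN side. This is only a minimal sketch: `tcp_port_open` is a helper name made up for this post, and it checks nothing more than whether the TCP handshake completes, which is the thing that differs between the "associated rule" and "pass" runs above.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP handshake to host:port completes within timeout.

    A refused connection, a timeout (e.g. the SYN/ACK never arrives) or an
    unreachable host all count as "not open".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Usage would be something like `tcp_port_open("<wan ip>", 9999)` for the nginx forward and `tcp_port_open("<wan ip>", 33400)` for the GUI, where the WAN IP is whatever address the firewall got over DHCP.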


I dug deeper into this yesterday in the firewall log.

For the associated rule I see:
ingress on WAN
ingress translated on WAN
egress on LAN

For pass I see:
ingress on WAN
egress on LAN

For state tracking using pfctl -s state:
I see SYN:ESTABLISHED on associated
I see ESTABLISHED:ESTABLISHED on pass
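
A rough way to pick such entries out of a longer state table is sketched below. It assumes the state pair is the last whitespace-separated field of each `pfctl -s state` line; the sample lines, the 192.168.1.50 address and the ports are invented for illustration and are not copied from the actual box.

```python
import re

# Matches a state pair such as "SYN_SENT:ESTABLISHED" (assumed to be the last field).
STATE_PAIR = re.compile(r"^[A-Z][A-Z0-9_]*:[A-Z][A-Z0-9_]*$")

# Invented sample in the general shape of `pfctl -s state` output.
SAMPLE_STATES = """\
all tcp 192.168.1.50:80 (99.99.99.10:9999) <- 99.99.99.20:51000       SYN_SENT:ESTABLISHED
all tcp 99.99.99.10:33400 <- 99.99.99.20:51001       ESTABLISHED:ESTABLISHED
all icmp 99.99.99.10:8 <- 99.99.99.20:8       0:0
"""


def half_open_tcp_states(pfctl_output: str) -> list:
    """Return the TCP state lines where both sides are not ESTABLISHED."""
    flagged = []
    for line in pfctl_output.splitlines():
        fields = line.split()
        # Skip non-TCP lines and lines that do not end in a state pair.
        if "tcp" not in fields or not STATE_PAIR.match(fields[-1]):
            continue
        if fields[-1] != "ESTABLISHED:ESTABLISHED":
            flagged.append(line.strip())
    return flagged
```

On the real firewall the input would be the text printed by `pfctl -s state`; a connection that never gets past the handshake shows up in the returned list.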

In Wireshark, I see no ACKs for my SYNs during the TCP handshake, so I think something is off with the return flow back out of translation.


I've got to run; I can be more thorough with the details later on.

Quote from: chall88 on January 15, 2025, 04:19:59 PM
  • NAT rule WAN 33400 to lan 33400, use pass so it works and you can work from the wider network
Why would you do that instead of creating a firewall rule from:*/* to:WAN address/33400, allow?

Quote from: chall88 on January 15, 2025, 04:19:59 PM
  • NAT rule TCP in on WAN Address port 9999, out on the ip of the physically connected nginx instance on 80. Use associated rule
I expect this to work with an associated rule and I will take the time to recreate that setup. Tomorrow, probably.

January 15, 2025, 06:02:34 PM #22 Last Edit: January 15, 2025, 06:08:01 PM by chall88
Quote from: Patrick M. Hausen on January 15, 2025, 04:34:03 PM
Quote from: chall88 on January 15, 2025, 04:19:59 PM
  • NAT rule WAN 33400 to lan 33400, use pass so it works and you can work from the wider network
Why would you do that instead of creating a firewall rule from:*/* to:WAN address/33400, allow?

Quote from: chall88 on January 15, 2025, 04:19:59 PM
  • NAT rule TCP in on WAN Address port 9999, out on the ip of the physically connected nginx instance on 80. Use associated rule
I expect this to work with an associated rule and I will take the time to recreate that setup. Tomorrow, probably.


I got into that habit because I was originally using the web GUI as the test for this, and then moved to an actual external device to rule out any sort of weirdness with local interfaces.

Also, I forgot to put this in the directions: you need to turn off the blocks for private networks on WAN.

I ran a slightly different test (minimal changes to use 80 on WAN):
  • Software Version is Business 24.7.10_2, virtualized within my internal network
  • Factory Reset
  • Assigned WAN to vtnet0, Assigned LAN to vtnet1 at the console (missed the prompt during boot...)
  • Connected to Web GUI, ran through the wizard accepting defaults, reconnected to default IP
  • Installed nginx on Ubuntu desktop connected to LAN (same machine used to access OPN)
  • Disabled http --> https redirect in OPN, freeing 80 (I didn't bother moving HTTPS to alternate port)
  • NAT rule TCP in on WAN Address port 80, redirected to the ip of the nginx instance on 80. Used associated rule
  • Accessed http://wanip from my internal network (after I adjusted the FW rule on my main OPN to allow that leg)
  • Welcome to nginx! page displayed

After disabling the redirect, I checked that the port was no longer used using 'sockstat | grep :80'.
I also checked that the anti-lockout rules no longer included 80 (not that it matters much because it's on LAN).
After creating the PF rule, I also verified the presence of the FW rule.

For the sake of it, I deleted the PF rule, recreated one with 9999 on WAN, also with associated rule.
I obviously had to add a rule on my main OPN too (main PC to spare OPN WAN leg), and success as well.
I re-enabled web GUI redirect and accessed from my phone. Still success.
While I was at it, I edited the rule from 9999 back to 80, and it still works (although the internal server listens on 80, it is ignored since the redirect rule takes over). Note that the WAN rule stays unchanged throughout this section, because it features the redirect target port 80 which remained unchanged.



Quote from: EricPerl on January 15, 2025, 11:12:10 PM
I ran a slightly different test (minimal changes to use 80 on WAN): [...]

I tried this at home, also virtualized, on Community 24.7.11_2 and it worked properly. I wonder if something is off in Business 24.10.1.

January 16, 2025, 04:35:32 PM #25 Last Edit: January 16, 2025, 04:38:04 PM by chall88
I'm going to try to re-image this hardware from scratch. Something is very wrong with this in either the software or hardware.

I switched it to the 24.7 Community release that I had gotten to work in my virtual lab last night, and now the packet filter doesn't function properly with TCP traffic at all.

This reminds me why I had 33400 NATed to internal as a 'pass' NAT rule, which Patrick was asking about. I had figured out I needed to do it that way for it to work in the beginning and kept doing it. I bet TCP filtering on WAN has been busted the whole time and the 'pass' mode circumvents the firewall.

I shut off the packet filter via the console cable.
Connected over SSH.
Added an allow in on WAN for port 22.
Added an allow in on WAN for port 33400.
Added an allow for ICMP on WAN.
Re-enabled the packet filter.
ICMP works.
All the traffic on WAN on those TCP ports doesn't, even after a reboot.

When I take these actions on my virtual OPNsense in KVM/QEMU, it works properly. On the real hardware it doesn't.

The hardware is a DEC 2700 OPNsense-brand firewall.

If a re-image from scratch doesn't fix it, I'll see about getting the business support people involved and possibly RMA'ing this hardware.

Here's a picture of my WAN allow rules, and a picture of my SSH session being very much not allowed after turning the packet filter back on:

https://imgur.com/a/L082gow

Is this a "real WAN", i.e. ISP connection or a lab situation?

In a lab with WAN connected via Ethernet and some default gateway, if your PC from which you try SSH is connected to the same network, you need to disable the gateway enforcement on WAN. OPNsense will send the reply packets to the GW instead of your PC otherwise.

Firewall > Settings > Advanced > Disable force gateway

I also check

Firewall > Settings > Advanced > Disable anti-lockout

because I don't like "magic intransparent features". And the way it's implemented it can also get in the way of UI or SSH from other interfaces in my experience but that might have been my mistake at the time.

HTH,
Patrick

January 16, 2025, 04:52:13 PM #27 Last Edit: January 16, 2025, 05:02:02 PM by chall88
>In a lab with WAN connected via Ethernet and some default gateway, if your PC from which you try SSH is connected to the same network, you need to disable the gateway enforcement on WAN. OPNsense will send the reply packets to the GW instead of your PC otherwise.

If this mattered, pings wouldn't work?

I just tried this and could ssh to it.

I don't understand what this is doing, is this dnat or d rdr?

It's all on 99.99.99.0/24 and the gateway is 99.99.99.99.

This also explains why it worked in my virtual lab: the WAN was getting its address via DHCP in KVM/QEMU's NATed subnetwork. The networks were not flat between my host and the WAN interface.
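
To spell out the flat-network point, here is a toy Python model of the behaviour as Patrick described it. It is a sketch of the idea only, not of how pf actually evaluates its rules: `reply_next_hop` is a made-up helper, and the client addresses 99.99.99.20 and 198.51.100.7 are hypothetical.

```python
from ipaddress import ip_address, ip_network


def reply_next_hop(client: str, wan_network: str, gateway: str,
                   force_gateway: bool) -> str:
    """Return where a reply to `client` is sent first (simplified model).

    force_gateway=True models the default described in the thread: replies
    leaving WAN are handed to the WAN gateway regardless of the client.
    force_gateway=False models "Disable force gateway": the routing table
    decides, so an on-link client is reached directly.
    """
    if force_gateway:
        return gateway
    if ip_address(client) in ip_network(wan_network):
        return client  # same subnet: deliver directly, no gateway involved
    return gateway     # different subnet: the gateway is the right next hop


WAN_NET = "99.99.99.0/24"   # from the thread
GATEWAY = "99.99.99.99"     # from the thread

# On-link test PC (hypothetical address): the two modes disagree.
print(reply_next_hop("99.99.99.20", WAN_NET, GATEWAY, force_gateway=True))
print(reply_next_hop("99.99.99.20", WAN_NET, GATEWAY, force_gateway=False))

# Off-link client (hypothetical address): both modes pick the gateway.
print(reply_next_hop("198.51.100.7", WAN_NET, GATEWAY, force_gateway=True))
print(reply_next_hop("198.51.100.7", WAN_NET, GATEWAY, force_gateway=False))
```

In this model the setting only changes the outcome when the client sits in the same subnet as WAN, which would fit the symptom showing up on the bench and not in the NATed virtual lab.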


>because I don't like "magic intransparent features"

This very thing may have just cost me about 20 hours of tinkering.

January 16, 2025, 05:13:48 PM #28 Last Edit: January 16, 2025, 05:20:29 PM by Patrick M. Hausen
Quote from: chall88 on January 16, 2025, 04:52:13 PM
If this mattered, pings wouldn't work?

How did you permit ICMP echo on WAN? Floating rule? The order is different then, but I'd have to check the details. I think the routing override (see below) applies to them as well. But possibly your WAN gateway would just forward the ICMP echo reply internally without making much of a fuss about it. Compare that to a TCP packet with SYN/ACK set for which it had never seen the initial SYN. So if your gateway is also a firewall, that would explain it.
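
A toy illustration of that guess, in Python. This is not a model of any real firewall: `ToyStatefulGateway` is invented for this post, it assumes the gateway forwards ICMP without tracking it, and it tracks TCP by nothing more than whether it saw the opening SYN. The client address 99.99.99.20 and firewall address 99.99.99.10 are hypothetical.

```python
class ToyStatefulGateway:
    """Forwards ICMP freely; forwards a TCP SYN/ACK only if it saw the SYN."""

    def __init__(self):
        self.seen_syn = set()  # (initiator, responder) pairs

    def forward(self, proto: str, src: str, dst: str, flags: str = "") -> bool:
        """Return True if the packet is forwarded, False if it is dropped."""
        if proto != "tcp":
            return True  # assumption: ICMP is passed along without state checks
        if flags == "S":
            self.seen_syn.add((src, dst))
            return True
        if flags == "SA":
            # A SYN/ACK is only valid as the answer to a SYN seen earlier.
            return (dst, src) in self.seen_syn
        return (src, dst) in self.seen_syn or (dst, src) in self.seen_syn


CLIENT, FIREWALL_WAN = "99.99.99.20", "99.99.99.10"  # hypothetical addresses

gw = ToyStatefulGateway()
# The client's SYN and echo request reach the firewall directly (same subnet),
# so the gateway never sees them. Only the replies are pushed through it.
print(gw.forward("icmp", FIREWALL_WAN, CLIENT))        # echo reply
print(gw.forward("tcp", FIREWALL_WAN, CLIENT, "SA"))   # unsolicited SYN/ACK
```

Under those assumptions the echo reply gets through and the SYN/ACK does not, which matches "ping works, SSH doesn't".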

Quote from: chall88 on January 16, 2025, 04:52:13 PM
I just tried this and could ssh to it.
I don't understand what this is doing, is this dnat or d rdr?

From the docs:
Quote
Outgoing packets from this firewall on an interface which has a gateway will normally use the specified gateway for that interface. When this option is set the route will be selected by the system routing table instead.

This will create a high priority automatic policy routing entry like this:
pass out route-to (pppoe0 62.156.*.*) inet from (pppoe0) to ! (pppoe0:network) flags S/SA keep state allow-opts label "badf9fd7b03523686df3cda925091a44"

overriding the routing table - specifically in your lab bypassing "local" connections and ARP.

HTH,
Patrick

What a journey. This seems to work as it should now, with the setting enabled, after changing the way I was coming at it.

I just pulled my cord into the 99.99.99.0 space and set up rules in our actual firewall to allow flow between my main workstation IP and VLAN 999's network. With that extra layer, the 99.99.99.99 gateway got it to me.

It ended up being an automagical redirector setting in the firewall/advanced section. Making the scenario simpler when troubleshooting is what bit me.

The gateway was probably very confused about why it was getting responses to things it never asked for.

Thank you for your help.