Lots of blocked traffic despite allow all rules

Started by JimIFN, June 12, 2025, 12:50:37 AM

Hello all:

I'm having a heck of a time getting my OPNsense install working correctly.  I am a small ISP with two BGP feeds, one on WAN and the other Opt1.  I have a couple other OPT networks for customer facing IP addresses (I have my own ARIN-assigned IP blocks).  I only have one network that is NATted (LAN -- my internal network); all others have public IPs routed via BGP. 

Right now, to try and troubleshoot this issue, I have Allow IPv4+IPv6 all rules on all my interfaces.  Yet, I still have significant chunks of the internet that don't work (currently facebook.com and wordpress.com are easy examples).  When I look at my log live view, I see the default deny (automatic) rule is catching and blocking them.  I  have been trying to figure out why for quite some time.

I'm not really sure of a clean way to post my config here, so here are some screenshots.

I (and my customers) would greatly appreciate some help figuring this out.

Thanks!

So, we found a recommendation to add a floating allow-all rule, and did that, but it's only somewhat helping.

I also disabled one of our upstream ISPs so we're not multi-wan anymore, but that too did not make a difference.

We aren't seeing much of a pattern to go on...Just some sites don't work and others do.  Even though we've committed the cardinal sin and just "allowed all from everywhere"...I'm still very, very confused as to why we can't get to wordpress.com and other such sites.

Quote from: JimIFN on June 12, 2025, 01:58:36 AM
So, we found a recommendation to add a floating allow-all rule, and did that, but it's only somewhat helping.

I also disabled one of our upstream ISPs so we're not multi-wan anymore, but that too did not make a difference.

We aren't seeing much of a pattern to go on...Just some sites don't work and others do.  Even though we've committed the cardinal sin and just "allowed all from everywhere"... I'm still very, very confused as to why we can't get to wordpress.com and other such sites.

When you have BGP, multiple routes, and public IPs, it is important to ensure that the routing and return paths are consistent. If you just allow all without a proper routing policy, OPNsense may not know which 'return path' for the session is correct, especially if there is asymmetric routing. I recommend checking the routing table of each interface carefully (Diagnostics → Routes) and using tcpdump to verify which flows are failing.

I see a VLAN in there. Do you have the main interface assigned/in use? If so, that'll cause wackiness. (I forget the precise nature - it's deterministic, but I don't see a useful application for it. Best avoid it.)

So, with BGP routing, the return path of the packet is always up to the network.  As the ISP, I have no control over which route the packet will come back on, and in 50-60% of the cases, it will come back through a different provider than it was sent on.  This is normal, and every other ISP I've spoken to sees the same thing.  This is how BGP works: You announce yourself.  You choose which interface to send a packet out.  Each hop along the way chooses which of its routes to use to send it toward its destination.  The destination generates a return packet.  Each hop along the way chooses which route to use to deliver it to you.  This path is rarely the same as the one it took to the destination.

I will likely have 3 paths eventually, and possibly an IX thrown in there for good measure.

However, for some of this testing, I did disable one of my providers so all traffic is going out a single WAN and coming back through the same (the internet only sees me connected through the one provider now).  Still, I see the same disconnect issue.

I also disabled the tag-which-path-to-return-on option in OPNsense (I forget where that is).  I believe I also set the state table (it actually defaulted, if I recall correctly) to not care about the interface for state matching; the IP/Port is good enough.

As to "default interface" -- if you're referring to the "LAN" interface as defined by OPNsense, yes I am using that -- as my office / management network.  All customer facing networks are on opt networks.  Also, one of my BGP peers / feeds is connected on WAN (the one that I am currently using for testing), and my second is on opt1.  OPNsense doesn't really provide a way for not using the default LAN/WAN...

Years ago, I ran pfSense (I think I set that up before OPNsense was out or at least, before it was considered stable).  I did run with WAN and LAN, and things worked well, but I only had one WAN at the time (I did BGP-announce it).


Oh, I also realize I forgot to mention this is an HA Pair with CARP, pfSync, and xmlsync running.  Switching thus far has worked very well.

So, with the firm requirement that Internet traffic may return on a different interface than it left, how do I configure OPNsense correctly?

I discounted the path issue, as both paths seem to pass through OPNsense, and "Firewall: Settings: Advanced", down under "Miscellaneous", "Bind states to interface" is unchecked by default. Do you have it checked?

How about that ix0? Unassigned/unconfigured?

BGP lets you control where traffic is routed by using its path attributes, as long as an upstream BGP peer does not rewrite them, and public transit providers usually do not. You can control ingress as well as egress this way.

From the firewall's point of view, it does not matter whether a packet goes via ISP router A or router B; the main thing is that the reply is received on the same firewall or interface. If you have "Bind states to interface" checked, the reply needs to come in (ingress) on the SAME firewall and the SAME interface the original packet left on (egress). If it is unchecked, the state is tracked globally, so the reply only needs to reach the same firewall.
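
To make the difference concrete, here is a minimal sketch of the two lookup behaviours. It is only a toy model, not how pf is implemented: the names Flow, StateTable and matches_reply are made up for illustration, the addresses are documentation ranges, and a real state entry tracks far more than this.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """One direction of a connection (hypothetical helper type)."""
    src: str
    sport: int
    dst: str
    dport: int


class StateTable:
    """Toy state table: either interface-bound or floating (global)."""

    def __init__(self, bind_to_interface: bool):
        self.bind_to_interface = bind_to_interface
        self._states = set()

    def _key(self, flow: Flow, interface: str):
        # A floating state ignores the interface entirely.
        return (flow, interface if self.bind_to_interface else None)

    def add(self, flow: Flow, interface: str) -> None:
        self._states.add(self._key(flow, interface))

    def matches_reply(self, reply: Flow, interface: str) -> bool:
        # A reply is the original flow with source and destination swapped.
        original = Flow(reply.dst, reply.dport, reply.src, reply.sport)
        return self._key(original, interface) in self._states


outbound = Flow("203.0.113.10", 50000, "198.51.100.7", 443)
reply = Flow("198.51.100.7", 443, "203.0.113.10", 50000)

floating = StateTable(bind_to_interface=False)
floating.add(outbound, "wan")
print(floating.matches_reply(reply, "opt1"))  # True: interface is ignored

bound = StateTable(bind_to_interface=True)
bound.add(outbound, "wan")
print(bound.matches_reply(reply, "opt1"))  # False: reply came back on another interface
```

In this model, a reply that returns through the second provider only matches when states are not bound to the interface, which is the situation described for asymmetric BGP return paths.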


When you check the rule details in your picture, you can see the TCP flag is set to R, meaning "RESET".

Quote
The flags are: (F)IN, (S)YN, (R)ST, (P)USH, (A)CK, (U)RG, (E)CE, and C(W)R.
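
For reference, the letters in that quote map onto the bits of the TCP flags byte. Here is a small sketch that turns a raw flags value into those letters. The function name flag_letters is made up for illustration, and the order in which a given log view prints the letters may differ from the bit order used here.

```python
# Bit values of the TCP header flags, paired with the letters from the quote.
TCP_FLAG_LETTERS = [
    (0x01, "F"),  # FIN
    (0x02, "S"),  # SYN
    (0x04, "R"),  # RST
    (0x08, "P"),  # PUSH
    (0x10, "A"),  # ACK
    (0x20, "U"),  # URG
    (0x40, "E"),  # ECE
    (0x80, "W"),  # CWR
]


def flag_letters(flags: int) -> str:
    """Return the letter for every flag bit set in the TCP flags byte."""
    return "".join(letter for bit, letter in TCP_FLAG_LETTERS if flags & bit)


print(flag_letters(0x04))  # R   (a bare reset)
print(flag_letters(0x12))  # SA  (SYN+ACK, second step of the handshake)
print(flag_letters(0x14))  # RA  (reset with ACK)
```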

It is possible that the source reset the connection and the firewall already dropped the session because of it. The source then keeps trying to send RST packets, and the firewall drops them because there is no session left in the state table for that specific source IP, destination IP and port combination. This is supported by the fact that you also disabled one of the two WAN connections and still see the behavior.
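
A simplified sketch of why such a packet ends up on the default deny rule even with allow-all rules in place. It assumes the allow rule only creates state on an initial SYN (pf's documented default for pass rules is "flags S/SA"), and it removes the state immediately on a reset, whereas a real firewall keeps a closing state around for a timeout. The function name evaluate and the addresses are illustrative only.

```python
SYN, RST, ACK = 0x02, 0x04, 0x10


def evaluate(states: set, conn: tuple, flags: int) -> str:
    """Decide what happens to one TCP packet in this toy model."""
    if conn in states:
        if flags & RST:
            # Simplification: the state disappears as soon as a reset is seen.
            states.discard(conn)
        return "pass (state match)"
    # Assumption: the allow-all rule matches only SYN set and ACK clear (S/SA).
    if (flags & (SYN | ACK)) == SYN:
        states.add(conn)
        return "pass (allow rule, state created)"
    # No state and not an initial SYN: nothing matches, so default deny applies.
    return "block (default deny)"


states = set()
conn = ("203.0.113.10", 50000, "198.51.100.7", 443)

print(evaluate(states, conn, SYN))  # pass (allow rule, state created)
print(evaluate(states, conn, ACK))  # pass (state match)
print(evaluate(states, conn, RST))  # pass (state match)
print(evaluate(states, conn, RST))  # block (default deny)
```

The last line is the case seen in the live log: the second reset has no state to match and is not a new connection attempt, so the allow-all rule never gets a chance to pass it.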

The question is, why does the source send the RESET flag?

Do you have sticky connections enabled?
How is your NAT configured?
Do you have shared forwarding enabled?
Do you have Disable force gateway disabled?

https://docs.opnsense.org/manual/how-tos/multiwan.html
https://docs.opnsense.org/manual/firewall_settings.html#shared-forwarding
https://github.com/opnsense/core/issues/5869
https://github.com/opnsense/core/issues/5869#issuecomment-1386221531

Regards,
S.
Networking is love. You may hate it, but in the end, you always come back to it.

OPNSense HW
APU2D2 - deceased
N5105 - i226-V | Patriot 2x8G 3200 DDR4 | L 790 512G - VM HA(SOON)
N100   - i226-V | Crucial 16G  4800 DDR5 | S 980 500G - PROD

Thank you for your replies.

First question, ix0: it is unused.  All interfaces are vlan-tagged on ix0, so I have ix0_vlan140, ix0_vlan52, etc.  ix0 (no vlan) has no traffic or direct use.

All traffic will arrive to the same firewall (FW), so yes, in this case that is fine. 

Early this AM, paid support for OPNsense pointed out the synproxy option was set, and clearing that fixed the dropped packets issue....

Thank you for your replies!

Great!

Actually that would explain the deny with RESET flag.

If you used synproxy to protect against SYN floods, I think it's better to set up syncache/syncookie protection:
Quote
Firewall > Settings > Advanced > Anti DDOS

Regards,
S.