Messages - Gianks

#1
22.1 Legacy Series / Re: Default Deny Rule - Once Again
February 17, 2022, 02:11:45 PM
Thanks for your time and story.

I agree that it is difficult to provide proper external help without a full overview of the configuration.
I must admit that we pushed the overall setup to some degree of complexity using VPNs, multi-WAN failover and load balancing, HA, etc.
That's the main reason why I objected to sharing the rule definitions: it's hard to imagine the problem is that easy to reproduce, since otherwise I would bet it would affect many more admins.

For the sake of the discussion, I do believe that, given what we said, a better set of debugging tools should be considered, to remove the need to see or interact with the actual device and configuration.
Giving firewall admins some insight into what is going on behind the curtains might allow them to help root out bugs in practice without disclosing information.

Similarly, integrating an export tool for debug logs (with some mangling of IPs and other sensitive info), so that users can give developers something standardized and get to an answer more easily, might lead to a better bug discovery process and overall support experience for the community.
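To illustrate the kind of mangling meant here, a minimal Python sketch follows. It replaces every distinct IPv4 address with a stable placeholder, so the relationships between hosts stay readable while the real addresses are hidden. The function name `mangle_ips` and the sample log lines are invented for illustration; they are not an existing OPNsense tool or its real log format.

```python
import ipaddress
import re

# Loose IPv4 matcher; each candidate is validated with ipaddress below.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def mangle_ips(lines):
    """Replace each distinct IPv4 address with a stable placeholder (ip-1, ip-2, ...).

    Returns the mangled lines and the address-to-placeholder mapping,
    which the user keeps private.
    """
    aliases = {}

    def substitute(match):
        text = match.group(0)
        try:
            ipaddress.IPv4Address(text)
        except ValueError:
            return text  # not a valid address (e.g. 999.1.1.1), leave untouched
        if text not in aliases:
            aliases[text] = f"ip-{len(aliases) + 1}"
        return aliases[text]

    return [IPV4_RE.sub(substitute, line) for line in lines], aliases


# Hypothetical log lines, only to show the effect.
sample = [
    "block in on wan0 from 192.0.2.10 to 198.51.100.7 port 5060",
    "pass out from 198.51.100.7 to 192.0.2.10",
]
mangled, aliases = mangle_ips(sample)
```

Because the same address always maps to the same placeholder, a developer can still follow a connection across lines. IPv6 addresses, hostnames and MAC addresses would need similar treatment and are not covered by this sketch.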

Meanwhile, after replacing the various rules and a bunch of reboots (including a full power off of both units at the same time), the problem seems to be gone.

Thanks a lot!
#2
22.1 Legacy Series / Re: Default Deny Rule - Once Again
February 16, 2022, 07:07:03 PM
Quote from: Fright on February 16, 2022, 04:50:45 PM
sharing actual rule parameters (with some sanitizing if needed) and dropped packet info (catch dropped packet in live view and hit "i" button in packet string) would be helpful imho.
Not really, if I may. What you are suggesting would possibly help to reproduce the issue from scratch (maybe). If I am correct about this being a bug, the relevant part of the code might have almost nothing to do with the rule definitions themselves (for example, the ordering might be relevant, including other rules which appear innocent). To be clear, the rules are correct because they worked until yesterday. Emulating our configuration (and only partially, moreover) might never reproduce the problem in a test.

Quote from: Fright on February 16, 2022, 04:50:45 PM
and it should be blocked at first packet. probably the out-of-state
https://forum.opnsense.org/index.php?topic=20219.0
Well, I guess you are correct about the possibility of such a case (the 2nd or a following packet gets there first), but this does not match our case for two reasons:

  • The rule misbehavior is the cause, not the network layout. This is proven by our workaround, which makes everything work again flawlessly.
  • The real packet 0 (SYN), once it finally arrived, should also be logged as 'pass' for the very same connection (the one just blocked), but this just does not happen. I would add on this particular point: UDP is stateless, so any packet can be packet 0.

I might be wrong on the second point; happy to hear your thoughts.

My opinion is that some rules are not loaded in pf, and I would like to be able to know if this is true or not.
Clearly the WebUI believes those rules are there; I doubt pf does (for whatever reason).
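One rough way to test that suspicion is to compare the ruleset the GUI generated with what pf reports as loaded. The Python sketch below only does a naive text comparison. `pfctl -sr` prints the rules currently loaded in pf; where the generated ruleset is stored is an assumption (on OPNsense it is commonly /tmp/rules.debug, but verify this on your system). The function names are hypothetical.

```python
def normalize(rule):
    """Collapse whitespace so cosmetic differences do not count as mismatches."""
    return " ".join(rule.split())


def missing_rules(expected_text, loaded_text):
    """Return the expected rules, in order, that do not appear among the loaded ones.

    Blank lines and '#' comments in the expected ruleset are skipped.
    """
    loaded = {normalize(line) for line in loaded_text.splitlines() if line.strip()}
    missing = []
    for line in expected_text.splitlines():
        rule = normalize(line)
        if not rule or rule.startswith("#"):
            continue
        if rule not in loaded:
            missing.append(rule)
    return missing
```

Here `loaded_text` would be the output of `pfctl -sr` run as root, and `expected_text` the generated ruleset file. Keep in mind that pf rewrites rules when loading them (macros and lists are expanded, defaults are spelled out), so a rule reported as missing is only a hint to look closer, not proof that pf dropped it.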

By the way I will try your hint to see if I can produce something useful for the research.
#3
22.1 Legacy Series / Re: Default Deny Rule - Once Again
February 16, 2022, 04:15:04 PM
Hi, thanks for the reply.
Honestly, I did not really know which details to provide for starters on such a problem; it's not something I did or tried to do.
We just updated the firewall from 21.7 (I hope I am correct here) to 22.1.

I would like to do some debugging but I don't know how, and that's why I started the thread.
Please tell me what you need to investigate the issue; I would like to provide all the information needed to fix it!

By the way two rules that stopped working are:
pass TCP/UDP Host A:any to Host B:X
pass UDP any:any to !NetC:Y

Creating an identical rule from scratch did not help, although the same rule without the port did work, for example. The duplicated version works perfectly without any change to the port or anything else...  ???
Thanks
#4
22.1 Legacy Series / Re: Default Deny Rule - Once Again
February 16, 2022, 03:55:58 PM
Duplicating one of the few rules which were still working (on one of the affected interfaces), and editing the parameters to match one of the non-working ones, magically fixed the issue and the traffic started flowing again, at least for that one rule.
Note that creating the same rule manually did not have the same effect (it kept being ignored).

Duplicating the duplicate worked again. Not sure this will let us fix all the weird behaviors we see in the logs (we can't even control traffic returning to OPNsense with rules...) but at least it seems to be allowing us to restore the network services.

Is there still someone suggesting this is not a bug?  :o
#5
22.1 Legacy Series / Re: Default Deny Rule - Once Again
February 16, 2022, 03:40:27 PM
I attempted to back up and restore the entire configuration; it did not work, let alone a simple reboot.
Disabling and/or cloning rules did not have any effect.

Is there a way to debug how the packet filter decided to fall back to default deny?
Is it possible to obtain a dump of the rules actually applied?
#6
22.1 Legacy Series / Re: Default Deny Rule - Once Again
February 16, 2022, 03:37:29 PM
For the record, I wish to make clear that this is not an out-of-order (OOO) packet problem.
There is 100% consistency in the behavior per rule: the ones not working are ALWAYS ignored, the others keep working normally.

Also note that some connections are blocked by the firewall at packet 0; it is hard to believe there was a packet -1.
#7
22.1 Legacy Series / Re: Default Deny Rule - Once Again
February 16, 2022, 02:05:35 PM
I can add that there are now a ton of blocked packets returning to OPNsense on its VIP for DNS, NTP, etc. This is not even traffic generated by us but by the firewall itself!  :-X
#8
22.1 Legacy Series / Default Deny Rule - Once Again
February 16, 2022, 12:11:06 PM
Hi,
Yesterday we upgraded both the primary and the backup OPNsense to the latest release...
as a result we are suddenly experiencing the same behavior which led us to leave pfSense for OPNsense one year ago.

Rules are now only partially applied on some interfaces. Defining the same rules again does not work; it is simply as if they were not there. Some traffic is allowed if rules are defined with a broader scope, but this simply makes no Sense at all.

The configuration has not been changed in weeks, just the upgrade.

Our network is partially unusable; the software is not working right. Traffic is blocked by the Default Deny Rule.
#9
Hi,
do you manage the registration expire time on your pbx and what value is set?
Wish I could! Our PBX connects to the telecom provider, which is easy to piss off (fail2ban style). For the sake of completeness, I must say that Asterisk would not allow itself to re-register before the server-assigned timeout without manual intervention. Qualifying the client on a regular basis (the normal workaround for such cases) is not an option allowed by our telecom. We are screwed  :o

in theory. but to control the translation, states are used for udp, icmp etc.
Agreed. As far as I understand, a new connection where just a single packet exchange has occurred has a shorter timeout (at least on Linux), which gets raised when a subsequent exchange is completed (for the sake of this discussion I simplified to 'exchanges').
But... this still does not explain why a state with 13 minutes of timeout left (per pftop) gets cleared before its time!
Once the timer has been reset from 2 minutes to 15, why is 2 (empirically determined, 15-13) still relevant?
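As a rough sketch of the behaviour one would expect from these timers, here is a toy Python model. It is not pf's actual code: the class and field names are hypothetical, the 2 and 15 minute values are the ones observed above, and the idle gap of about 6 minutes is an assumption about the traffic pattern.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UdpState:
    first_timeout: int      # seconds granted while only one side has sent packets
    multiple_timeout: int   # seconds granted once both sides have sent packets
    expires_at: Optional[float] = None
    bidirectional: bool = False

    def packet(self, now: float, reply: bool = False) -> bool:
        """Record a packet seen at time `now` (seconds).

        Returns False if the state had already expired, True otherwise.
        """
        if self.expires_at is not None and now >= self.expires_at:
            return False
        if reply:
            self.bidirectional = True
        timeout = self.multiple_timeout if self.bidirectional else self.first_timeout
        self.expires_at = now + timeout
        return True

    def remaining(self, now: float) -> float:
        """Seconds left before expiry at time `now` (never negative)."""
        if self.expires_at is None:
            return 0.0
        return max(0.0, self.expires_at - now)


state = UdpState(first_timeout=2 * 60, multiple_timeout=15 * 60)
state.packet(0)                                    # first outbound packet: 2-minute timer
state.packet(1, reply=True)                        # reply seen: timer raised to 15 minutes
left_after_two_min = state.remaining(1 + 2 * 60)   # time left 2 minutes after the reply
survives_idle = state.packet(1 + 6 * 60)           # next packet after ~6 idle minutes
```

In this model the state still has 13 minutes (780 seconds) left two minutes after the reply, and the packet arriving after the idle gap is accepted and refreshes the timer. A state being cleared at the 2-minute mark is exactly what such a model does not predict.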

I guess my point is to understand what is going on rather than to find a workaround, because imho this is not the desired/expected behaviour of pf (and therefore of OPNsense and others based on it); otherwise we should re-define what timeout means!  ;D
#10
Hi, thanks for the answer, but I don't think the application in use has any relevance (btw, it's a SIP PBX trunk).
If we agree that UDP is basically stateless (without additional packet inspection), the application protocol cannot affect firewall states.
The only thing I did not state clearly enough, I guess, is that these connections are idle for approximately 6 minutes before each exchange; no drop occurs if there is continuous traffic, and this is why I am pointing to the expiration timer.

So... what is causing a UDP state to be dropped before its expiration, given an almost empty state table (less than 2% full)?
OPNsense does not provide a per-rule state expiration, but pfSense does, and after testing it yesterday I can tell the issue occurs with both. In both products you can sometimes see the timer being far from expiry and still the state gets dropped for no apparent reason.

As I said, it's my belief this is a BUG, but I don't know enough about pf to do additional testing.
#11
Hi,
we are having trouble with UDP connections getting closed prematurely.
After checking with pftop, we think there is possibly a serious bug with the timeouts, at least with UDP: connections with more than 13 minutes of expiry left are suddenly closed for no reason.

OPNsense is already operating with conservative settings but the problem remains: connections really are dropped and applications have to reconnect to SIP services all the time.

And regardless of the setting, there is no reason why the timer starts at 1 minute, increases to 15 after some packets (another inexplicable behaviour: it seems conservative is still trying to 'optimize' somehow, but nobody asked), and is then disregarded after less than 2 minutes, with the state completely forgotten and the following packets blocked!

Shall I file a bug on GitHub?
Thanks a lot