Unable to Open Firewall Live View

Started by anicoletti, April 14, 2022, 05:29:37 PM

We are noticing that whenever we attempt to open the Firewall Live View, we are getting massive spikes in connectivity delays. Closing the tab and going back to the dashboard instantly resolves the issues. I'm not seeing any CPU spikes during this time. Anyone else experiencing these issues?

April 19, 2022, 04:22:57 AM #1 Last Edit: April 19, 2022, 04:27:45 AM by anicoletti
We were still experiencing this issue, so we installed OPNsense 22.1 fresh on a new server, upgraded to 22.1.6, and imported our configuration. Live View still causes connectivity loss. This time it also triggered a latency warning on the gateway, and I noticed that pfctl pegs multiple threads at 100% when the latency starts.

Maybe too much logged content in the firewall log?

# ls -lah /var/log/filter


Cheers,
Franco

The current day's log is up to 198M. We originally went through and disabled most of the logging options to see if that would help clear up the issue. I'll check again, make sure we have everything disabled, and retest. Thanks for the feedback.
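For anyone who wants the same size check in a form that is easy to sort or script, here is a minimal Python sketch. It assumes the directory holds one plain file per day, as the `ls` output suggests; the `log_sizes` helper is a hypothetical name for illustration and is not part of OPNsense.

```python
from pathlib import Path


def log_sizes(directory):
    """Return (file name, size in MiB) pairs for regular files, largest first."""
    entries = [
        (p.name, p.stat().st_size / (1024 * 1024))
        for p in Path(directory).iterdir()
        if p.is_file()  # skip subdirectories and other non-regular entries
    ]
    return sorted(entries, key=lambda entry: entry[1], reverse=True)
```

Calling `log_sizes("/var/log/filter")` and printing the first few pairs should show quickly whether one day's file dominates.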

April 19, 2022, 07:47:12 PM #4 Last Edit: April 19, 2022, 07:55:15 PM by anicoletti
So we disabled every bit of logging from rules and Settings \ Logging, leaving only the rdr items to display, and within 10 seconds we started dropping packets again. I cleared and reinitialized the logs, same issue. To provide additional context, here's the general information about our server.

Dell PowerEdge R320, Xeon E5-2470@2.30GHz (8 cores, 16 threads), 32GB RAM
92 VLANs, 89 Interfaces, across 6 Broadcom NICs
Dual 256GB drives in Mirror, with a Hot Spare available

> 92 VLANs, 89 Interfaces

Er, that might be an issue here, but I honestly don't know. Could be the backend not being able to cope with the mapping for each log entry.


Cheers,
Franco

So is OPNsense not considered data center ready if it cannot handle that number of VLANs or Interfaces? It was working fine until we moved up to 22.1. I'm considering rebuilding again with 21.7.8, as well as a full rebuild on 22.1.6 recreating all the VLANs, Interfaces, and firewall rules from scratch (ugh...) since we've learned a bit in our past two years using the platform.

We've done multiple such performance improvements in OPNsense for customers using even more interfaces than this over the past couple of years. We can look into it but it takes time and effort. For the time being this is the first time I've heard of this issue for live view specifically, but it could be a side effect of the circular logging removal in 22.1. Stranger things have happened. ;)


Cheers,
Franco

Thanks for the clarification. I used to do software development myself, so I completely understand how a seemingly unrelated change ends up causing some weird side effects in other areas. I'll move forward on testing back on 21.7.8 and a full reboot of the ruleset and see if we can find anything out. We have about 70 firewalls running OPNsense without this issue, but none at the scale of our data center.

I'd argue that changing the log file size from a fixed standard of 512 KB to possibly gigabytes of data per file would directly influence the ability of a live firewall log reader to process data, but what does it matter to you? ;)
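To illustrate why file size matters to a reader, here is a rough Python sketch of fetching the last few lines of a large log by seeking backwards from the end, so the work depends on how many lines are wanted rather than on the total file size. This is only an assumption-laden illustration of the general technique, not how the OPNsense backend actually reads the log; `tail_lines` and `block_size` are made-up names.

```python
import os


def tail_lines(path, n, block_size=64 * 1024):
    """Return the last n lines of a file, reading backwards in blocks."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # Read earlier blocks until the buffer holds more than n newlines
        # (so the oldest line we return is complete) or we reach the start.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    lines = data.splitlines()
    # Decode only after joining blocks, so multi-byte characters stay intact.
    return [line.decode("utf-8", errors="replace") for line in lines[-n:]]
```

A reader that instead scans from the top of the file on every refresh does work proportional to the whole file, which was bounded with a 512 KB circular log and is not with a daily file of a few hundred megabytes.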

In any case please open a ticket in GitHub and we will take a look. https://github.com/opnsense/core/issues/new?assignees=&labels=&template=bug_report.md&title=

At this point, however, what the issue could be is an assumption on my part, so it would be good to avoid jumping to conclusions.


Cheers,
Franco

I'm beginning to believe it's not a log read issue, but rather a parsing issue. I'm noticing that opening any of the interfaces under Firewall to look at the rules takes between 30 and 60 seconds to load. Also, when I went to reassign an interface to a new VLAN, the GUI just locked up, sat there spinning, and never updated the interface.

I'm probably going to just rebuild the whole config from scratch :'( and hopefully that will resolve the issue, unless someone has any suggestions on what to look for in the config that might be causing issues with the interfaces and rules.

Just saw that 22.1.9 potentially fixes some memory leak issues. Any chance it could fix these? I can't install just yet because we need to go through regular maintenance windows, but man I would love to be able to look at live view again.

Not the same issue.

But there are performance improvements for live log on the development version.


Cheers,
Franco

Good deal. I'll look into loading the development branch to one of the units we have set aside for HA and see if that helps. Thanks for the update!