Slower over time

Started by dcol, June 18, 2023, 12:44:24 AM

June 18, 2023, 12:44:24 AM Last Edit: June 18, 2023, 12:48:31 AM by dcol
Using 23.1.9. Very basic generic setup. One LAN, one DHCP WAN.
Internet speeds come to a crawl. If I reboot, speeds come back, but within a few hours it is back to crawling. I can barely remote into the WebGUI when it is slow.
Resources look fine. Memory is at 14% when slow and 7% when rebooted.

Anything I should be looking at? Nothing meaningful in any logs.

This is a new box that I installed OPNsense on and just restored the config. The old box had a minor disk issue. This slowdown issue is since I put this new box in.

June 19, 2023, 05:19:32 PM #1 Last Edit: June 19, 2023, 05:53:41 PM by dcol
Anyone..........

Not sure what to do at this point. Tried adding flow control disabled to tunables.

Only thing I see is the higher the memory usage, the slower it gets. But I am only at 14% and near zero on CPU.
Takes a minute to load the WebGUI. If I reboot, speed goes back to normal. Logs are empty.
I also tried reloading all services in the console. Still slow. Memory is now at 16% and it is even slower.
What causes the memory to increase like that? I am not using Suricata or VLANs. Just a plain default config.

Another important clue: speed comes back when I reset the state tables in firewall diagnostics.
So I disabled IPv6 altogether to see if that affects the speed. I will know in a few hours.

HELP!!!!!!!!!!!!!!!!!!!!!!!!!!!. Please.

What hardware are you using? Are you running OPNsense virtualized?
In theory there is no difference between theory and practice. In practice there is.

Not virtualized. Basic default installation with no plugins.
MiniPC Intel J4125 8GB, 128GB NVMe, 4-i225 NIC ports (igc)
So far speed ok after 1 hour. Usually takes a few hours

Is your OPNsense box connected directly to the internet? Or is there an ISP modem in between?

There is an ISP supplied modem. It is in pass-through mode. The previous OPNsense firewall, which had a degrading drive, worked fine. I imported the config file to the new box.

It may be a long shot, but have you tried starting with default settings / empty configuration, and configuring the system yourself, instead of loading an existing config file? Your config file is made on another system with other hardware, so you may be silently importing settings that you don't want.

I'd agree with dinguz. Since your system is different from the original, the saved config may contain a lot of settings that may or may not be relevant to the new hardware, and one of them could be causing this issue.

I will look at the config file again. I did change the igb's to igc's. That is the only hardware difference between the old and new box other than the old Intel CPU was J1900 and the new one J4125.
So far, speed hasn't slowed down since I disabled IPv6 and reset the state tables. But can't be sure until tomorrow morning.
What I found made the difference was resetting the state tables. Which may also be IPv6 related.

Starting over with a default state and then slowly adding your changes back is definitely the way to track things down.

As previously mentioned, your config could have a problem in it, but it could also be that you have changes that aren't required and/or cause unintended consequences. A lot of the time problems are self-inflicted because people read some guide on the internet and modified things that didn't need to be changed.

Back to slow again this morning. I reset the state tables and back to full speed.
As I said before, this is a default installation with only flow control disabled added to the tunables.
The old box was also a default config with no added tunables.
So far, don't see anything suspicious in the config file.

Is your state table becoming full? If so, increase the size and/or find the reason it is filling up. Perhaps you have set firewall optimization to 'conservative'?

Quote from: dcol on June 20, 2023, 05:12:19 PM
Back to slow again this morning. I reset the state tables and back to full speed.
As I said before, this is a default installation with only flow control disabled added to the tunables.
The old box was also a default config with no added tunables.
So far, don't see anything suspicious in the config file.

So I know I'm new here (though I've used pf for a while and am now on OPNsense), but I'm just curious. Several people have suggested wiping clean, starting from defaults, and building up. You seem to be against that, while saying your old config was a default config. If the old machine was a default config, why would you be importing anything at all? Why would you be looking for something "suspicious" in the old config instead of simply wiping fresh and building up? I mean, if the old one was "default" with nothing changed, then you gain nothing by importing. So why not just rule it out? You can always do a clean wipe and try from there, and if you need something from the old "default" config you can import it later, but just to troubleshoot...

June 21, 2023, 12:13:46 AM #13 Last Edit: June 21, 2023, 12:22:59 AM by dcol
Reason I don't just start over is because this box is 300+ miles away with no IT people there. So I have to prepare the box and send it. The people there can swap cables, but that is the extent of their knowledge.

As far as the state table size, I actually reduced it to 250000 to see if it has an effect. I don't know how to tell if it is full. Best I can tell there are about 1100 entries in there now. I also changed the Firewall Optimization from conservative to normal. Speed is still OK since the last reset about 6 hours ago.
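
For anyone wondering how to tell whether the state table is full: one rough way is to compare the "current entries" count from `pfctl -si` with the "states" hard limit from `pfctl -sm`. Below is a minimal Python sketch of that comparison. The sample text is made up to match the numbers in this post (about 1100 entries, limit 250000), and the function names are hypothetical. It assumes the usual layout of pfctl output, so treat it as an illustration rather than a tested tool for any particular OPNsense release.

```python
import re

# Illustrative sample text only, shaped like the output of `pfctl -si`
# and `pfctl -sm`. The numbers are the ones mentioned in this post.
SAMPLE_SI = "\n".join([
    "Status: Enabled for 0 days 06:02:11           Debug: Urgent",
    "",
    "State Table                          Total             Rate",
    "  current entries                     1100",
    "  searches                         9876543          454.6/s",
    "  inserts                            52000            2.4/s",
    "  removals                           50900            2.3/s",
])

SAMPLE_SM = "\n".join([
    "states        hard limit   250000",
    "src-nodes     hard limit   250000",
    "frags         hard limit     5000",
])


def parse_current_states(pfctl_si_output: str) -> int:
    """Return 'current entries' from the State Table section of `pfctl -si`."""
    in_state_table = False
    for line in pfctl_si_output.splitlines():
        if line.startswith("State Table"):
            in_state_table = True
            continue
        if in_state_table:
            match = re.match(r"\s+current entries\s+(\d+)", line)
            if match:
                return int(match.group(1))
    raise ValueError("no 'current entries' line found in the State Table section")


def parse_state_limit(pfctl_sm_output: str) -> int:
    """Return the 'states' hard limit from `pfctl -sm`."""
    match = re.search(r"^states\s+hard limit\s+(\d+)", pfctl_sm_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'states hard limit' line found")
    return int(match.group(1))


def state_table_usage(current: int, limit: int) -> float:
    """Return state table usage as a percentage of the hard limit."""
    return 100.0 * current / limit


if __name__ == "__main__":
    current = parse_current_states(SAMPLE_SI)
    limit = parse_state_limit(SAMPLE_SM)
    usage = state_table_usage(current, limit)
    print(f"{current} of {limit} states in use ({usage:.2f}%)")
```

With those numbers the table is at well under 1% of its limit, so on this reading it is nowhere near full. To use it on a live box, the two sample strings would be replaced with the real command output, for example captured from the shell and pasted in.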

So my questions are, would filling the state table actually cause a slowdown, and what causes the state table to fill up? To my knowledge the site does not have high volume internet usage.

Also, why should the state table be an issue at all? The previous box, which had bad SSD sectors, worked fine with the same state table size. The issue with the old box was that it gave errors when I tried to do updates or anything with plugins. I even got an error when trying to get to the shell from the console. But the old box ran fine; I just couldn't make any changes to it without an error popping up. Which is why I replaced it.

By the way I have three other sites with similar OPNsense hardware and configurations. They all work fine on the latest release.

1) So unable to update plugins or other things = caused by faulty drive = faulty data = potentially corrupt exported config?
2) Same kind of machine, yet different processor, different NIC, maybe = different MB rev. and/or BIOS, or?
3) Importing "default" config. Default config = nothing changed = what are you importing?
     (Importing between non-identical hardware and software versions is always dodgy.) Even the various *nixes that provide a way to do a release upgrade advise against it because of ghosts in the machine, and recommend a clean install and setup.

Maybe just change the HD on the old box then you can be 100% that everything is exactly the same?

300 miles is a challenge for sure. My solution ended up being just traveling there, many a time (regionally around the Northeast), but YMMV. I had to make yearly trips to some remote locations anyway, mostly small, some medium shops.

But take it with a grain of salt, for sure. These are just random thoughts offered in a friendly manner in an attempt to be helpful. I won't bother you again :)