GUI and traffic incredibly slow after upgrade

Started by securid, February 03, 2024, 10:12:21 AM

I waited for the fixes for HAProxy with SNI before updating. It seems like there's a solution now, so I decided to back up and upgrade. The upgrade went fine, but OPNsense is so incredibly slow it's crazy.

I have a ping open and when I click to go to the Dashboard this happens:

64 bytes from 10.0.0.1: icmp_seq=548 ttl=64 time=331.020 ms
64 bytes from 10.0.0.1: icmp_seq=549 ttl=64 time=7.583 ms
64 bytes from 10.0.0.1: icmp_seq=550 ttl=64 time=54.139 ms
64 bytes from 10.0.0.1: icmp_seq=551 ttl=64 time=545.836 ms
64 bytes from 10.0.0.1: icmp_seq=552 ttl=64 time=459.255 ms
64 bytes from 10.0.0.1: icmp_seq=553 ttl=64 time=270.995 ms
64 bytes from 10.0.0.1: icmp_seq=554 ttl=64 time=61.969 ms
64 bytes from 10.0.0.1: icmp_seq=555 ttl=64 time=17.350 ms
64 bytes from 10.0.0.1: icmp_seq=556 ttl=64 time=35.384 ms
64 bytes from 10.0.0.1: icmp_seq=557 ttl=64 time=226.723 ms
64 bytes from 10.0.0.1: icmp_seq=558 ttl=64 time=70.730 ms
64 bytes from 10.0.0.1: icmp_seq=559 ttl=64 time=458.012 ms
64 bytes from 10.0.0.1: icmp_seq=560 ttl=64 time=25.976 ms
Request timeout for icmp_seq 561
64 bytes from 10.0.0.1: icmp_seq=562 ttl=64 time=614.986 ms
Request timeout for icmp_seq 563
Request timeout for icmp_seq 564
64 bytes from 10.0.0.1: icmp_seq=563 ttl=64 time=2459.566 ms
Request timeout for icmp_seq 566
Request timeout for icmp_seq 567
Request timeout for icmp_seq 568
Request timeout for icmp_seq 569
Request timeout for icmp_seq 570
Request timeout for icmp_seq 571
Request timeout for icmp_seq 572
Request timeout for icmp_seq 573
64 bytes from 10.0.0.1: icmp_seq=564 ttl=64 time=10620.278 ms
64 bytes from 10.0.0.1: icmp_seq=565 ttl=64 time=9665.505 ms
64 bytes from 10.0.0.1: icmp_seq=566 ttl=64 time=8736.085 ms
64 bytes from 10.0.0.1: icmp_seq=567 ttl=64 time=7975.989 ms
64 bytes from 10.0.0.1: icmp_seq=568 ttl=64 time=7223.718 ms
64 bytes from 10.0.0.1: icmp_seq=569 ttl=64 time=6320.197 ms
64 bytes from 10.0.0.1: icmp_seq=570 ttl=64 time=5447.343 ms
64 bytes from 10.0.0.1: icmp_seq=571 ttl=64 time=4457.011 ms
64 bytes from 10.0.0.1: icmp_seq=572 ttl=64 time=3555.876 ms
64 bytes from 10.0.0.1: icmp_seq=573 ttl=64 time=2556.972 ms
64 bytes from 10.0.0.1: icmp_seq=574 ttl=64 time=1566.402 ms
64 bytes from 10.0.0.1: icmp_seq=575 ttl=64 time=578.149 ms
64 bytes from 10.0.0.1: icmp_seq=576 ttl=64 time=8.687 ms
64 bytes from 10.0.0.1: icmp_seq=577 ttl=64 time=43.058 ms
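
To put numbers on a ping log like the one above, here is a minimal Python sketch (not an OPNsense tool). It assumes the BSD/macOS-style ping output shown here; the function name summarize_ping and the 100 ms "slow" threshold are just illustrative choices.

```python
import re

# Reply lines look like:
#   64 bytes from 10.0.0.1: icmp_seq=548 ttl=64 time=331.020 ms
REPLY = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")
# Loss lines look like:
#   Request timeout for icmp_seq 561
TIMEOUT = re.compile(r"Request timeout for icmp_seq (\d+)")


def summarize_ping(text, slow_ms=100.0):
    """Count replies, timeouts and slow replies in pasted ping output."""
    rtts = []
    timeouts = 0
    for line in text.splitlines():
        match = REPLY.search(line)
        if match:
            rtts.append(float(match.group(2)))
        elif TIMEOUT.search(line):
            timeouts += 1
    return {
        "replies": len(rtts),
        "timeouts": timeouts,
        "slow": sum(1 for r in rtts if r > slow_ms),  # replies above the threshold
        "max_ms": max(rtts) if rtts else None,
        "avg_ms": sum(rtts) / len(rtts) if rtts else None,
    }


if __name__ == "__main__":
    # Three lines copied from the log above.
    sample = (
        "64 bytes from 10.0.0.1: icmp_seq=560 ttl=64 time=25.976 ms\n"
        "Request timeout for icmp_seq 561\n"
        "64 bytes from 10.0.0.1: icmp_seq=562 ttl=64 time=614.986 ms\n"
    )
    print(summarize_ping(sample))
```

Keep in mind that a "Request timeout" line only means no reply had arrived when the next probe was sent. In the log above, replies for seq 564 onward do show up later, with round-trip times stepping down by roughly one second each, which suggests they were held up and released together rather than dropped.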


I can log in over SSH, barely. It takes more than 10 seconds and typing has about a second of delay on each character, but there seems to be no CPU load and no excessive use of RAM or processes. It looks like congestion on the interfaces, but with the slowness it's hard to troubleshoot.

I'll continue troubleshooting but if this sounds familiar to someone please share your insights.

Thank you.

If you have the mimugmail repo, install htop (it has no other dependencies, IIRC) and post a screenshot.

Else run top, press a, post screenshot.

Attached. Sometimes I see this one:
37650 root          2  52    0    70M    49M usem     0   0:01  42.22% python3.9

But that's it. Less than 100kbps traffic in total on the graph.

Rebooted, no change.

Disabled haproxy and wireguard services too. Average less than 50kbps now. I don't understand what is going on  ::)

That looks like a console output. Over SSH you'd see much more in the Command column.

Don't disable services either; it's kinda hard to see an issue when a potential culprit is not running.

That is SSH. It's a screenshot from my terminal, not from the console.

OPNsense just completely died; it wouldn't even shut down after pressing the power button (which usually does a clean shutdown and power off).

Took the power off, hooked up a monitor and powered it back on. It's actually a little better. More responsive and ping replies are better.


64 bytes from 10.0.0.1: icmp_seq=1771 ttl=64 time=5.912 ms
64 bytes from 10.0.0.1: icmp_seq=1772 ttl=64 time=6.739 ms
64 bytes from 10.0.0.1: icmp_seq=1773 ttl=64 time=4.854 ms
64 bytes from 10.0.0.1: icmp_seq=1774 ttl=64 time=7.470 ms
64 bytes from 10.0.0.1: icmp_seq=1775 ttl=64 time=7.027 ms
64 bytes from 10.0.0.1: icmp_seq=1776 ttl=64 time=5.839 ms
64 bytes from 10.0.0.1: icmp_seq=1777 ttl=64 time=18.693 ms
64 bytes from 10.0.0.1: icmp_seq=1778 ttl=64 time=5.696 ms
64 bytes from 10.0.0.1: icmp_seq=1779 ttl=64 time=10.358 ms
64 bytes from 10.0.0.1: icmp_seq=1780 ttl=64 time=68.548 ms
64 bytes from 10.0.0.1: icmp_seq=1781 ttl=64 time=81.970 ms
64 bytes from 10.0.0.1: icmp_seq=1782 ttl=64 time=4.834 ms
64 bytes from 10.0.0.1: icmp_seq=1783 ttl=64 time=13.632 ms
64 bytes from 10.0.0.1: icmp_seq=1784 ttl=64 time=29.814 ms
64 bytes from 10.0.0.1: icmp_seq=1785 ttl=64 time=7.193 ms
64 bytes from 10.0.0.1: icmp_seq=1786 ttl=64 time=4.040 ms
64 bytes from 10.0.0.1: icmp_seq=1787 ttl=64 time=4.211 ms
64 bytes from 10.0.0.1: icmp_seq=1788 ttl=64 time=26.885 ms
64 bytes from 10.0.0.1: icmp_seq=1789 ttl=64 time=8.788 ms
64 bytes from 10.0.0.1: icmp_seq=1790 ttl=64 time=7.044 ms
64 bytes from 10.0.0.1: icmp_seq=1791 ttl=64 time=10.948 ms
64 bytes from 10.0.0.1: icmp_seq=1792 ttl=64 time=4.936 ms
64 bytes from 10.0.0.1: icmp_seq=1793 ttl=64 time=27.574 ms
64 bytes from 10.0.0.1: icmp_seq=1794 ttl=64 time=25.530 ms
64 bytes from 10.0.0.1: icmp_seq=1795 ttl=64 time=8.949 ms


Still not good, but better. Still some timeouts too, less than before though.

Then you didn't press a as instructed in top before taking the screenshot. That was the info I was asking for.

Quote from: newsense on February 03, 2024, 10:55:02 AM
Then you didn't press a as instructed in top before taking the screenshot. That was the info I was asking for.

Correct, I missed that. Here it is:

last pid: 21770;  load averages:  0.17,  0.38,  0.33                                                                                                   up 0+00:13:10  10:57:59
64 processes:  2 running, 62 sleeping
CPU:  0.3% user,  0.0% nice,  0.1% system,  0.7% interrupt, 98.9% idle
Mem: 849M Active, 491M Inact, 1437M Wired, 40K Buf, 4937M Free
ARC: 556M Total, 186M MFU, 316M MRU, 13M Anon, 3777K Header, 37M Other
     443M Compressed, 1137M Uncompressed, 2.57:1 Ratio
Swap: 10G Total, 10G Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
19343 root          4  20    0    51M    15M kqread   1   0:18   0.11% /usr/local/sbin/syslog-ng -f /usr/local/etc/syslog-ng.conf -p /var/run/syslog-ng.pid
49188 root          1  20    0    14M  3864K CPU1     1   0:00   0.08% top
46041 root          1  20    0    12M  2276K select   1   0:00   0.05% /usr/sbin/powerd -b hadp -a hadp -n hadp
15719 unbound       4  20    0  1096M   878M kqread   2   0:25   0.05% /usr/local/sbin/unbound -c /var/unbound/unbound.conf
16059 root          1  20    0    13M  2776K bpf      1   0:00   0.03% /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
17140 root          1  20    0    25M    15M select   0   0:00   0.02% /usr/local/bin/python3 /usr/local/opnsense/scripts/dhcp/unbound_watcher.py --domain internal.privatebit
18271 root          8  21    0   187M   118M kqread   0   0:05   0.01% /usr/local/bin/python3 /usr/local/opnsense/scripts/unbound/logger.py (python3.9)
3791 root          1  20    0    23M    12M select   3   0:00   0.01% /usr/local/bin/python3 /usr/local/sbin/configctl -e -t 0.5 system event config_changed (python3.9)
5028 root          1  20    0    23M    12M select   3   0:00   0.01% /usr/local/bin/python3 /usr/local/opnsense/scripts/syslog/lockout_handler (python3.9)
6996 ingemar       1  20    0    19M  9032K select   3   0:00   0.01% sshd: ingemar@pts/0 (sshd)
71197 root          2  20    0    23M  8208K select   0   0:00   0.01% /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf
77439 root          1  20    0    54M    36M select   0   0:46   0.01% /usr/local/bin/python3 /usr/local/opnsense/scripts/netflow/flowd_aggregate.py (python3.9)
21694 root          1  23    0    13M  2744K wait     3   0:00   0.01% /bin/sh /var/db/rrd/updaterrd.sh
34546 root          1  20    0    23M  8232K select   2   0:00   0.01% /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf
73692 _flowd        1  20    0    12M  2468K select   1   0:00   0.00% flowd: net (flowd)
11971 ingemar       1  20    0    20M  9300K select   0   0:00   0.00% sudo -i
10690 root          1  20    0    22M  9828K kqread   1   0:00   0.00% /usr/local/sbin/lighttpd -f /var/etc/lighty-webConfigurator.conf
21714 nobody        1  20    0    12M  2172K sbwait   1   0:00   0.00% /usr/local/bin/samplicate -s 127.0.0.1 -p 2055 127.0.0.1/2056
67403 root          1  20    0    14M  4084K kqread   1   0:00   0.00% /usr/local/sbin/lighttpd -f /var/etc/lighttpd-acme-challenge.conf
  240 root          1  21    0   109M    60M accept   2   0:12   0.00% /usr/local/bin/python3 /usr/local/opnsense/service/configd.py console (python3.9)
12729 root          1  23    0    71M    42M accept   0   0:01   0.00% /usr/local/bin/php-cgi
14235 root          1  22    0    71M    42M accept   3   0:01   0.00% /usr/local/bin/php-cgi
22564 root          1  22    0    71M    40M accept   1   0:01   0.00% /usr/local/bin/php-cgi
14790 root          1  23    0    71M    40M accept   1   0:01   0.00% /usr/local/bin/php-cgi
19391 root          1  22    0    69M    38M accept   1   0:01   0.00% /usr/local/bin/php-cgi
14260 root          1  20    0    64M    37M accept   2   0:01   0.00% /usr/local/bin/php-cgi
85288 root          1  20    0    23M  6872K select   3   0:00   0.00% /usr/local/sbin/mpd5 -b -d /var/etc -f mpd_opt4.conf -p /var/run/pppoe_opt4.pid -s ppp pppoeclient
  238 root          1  52    0    24M    13M wait     0   0:00   0.00% /usr/local/bin/python3 /usr/local/opnsense/service/configd.py (python3.9)
14448 root          1  22    0    13M  2492K select   1   0:00   0.00% rtsold: rtsold.sendmsg (rtsold)
11328 root          1  20    0    50M    25M wait     0   0:00   0.00% /usr/local/bin/php-cgi
11029 root          1  52    0    56M    25M wait     0   0:00   0.00% /usr/local/bin/php-cgi
24983 dhcpd         1  20    0    25M  9972K select   1   0:00   0.00% /usr/local/sbin/dhcpd -user dhcpd -group dhcpd -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcp
5670 root          1  24    0    19M  8732K select   1   0:00   0.00% sshd: ingemar [priv] (sshd)
  579 root          1  20    0    11M  1608K select   3   0:00   0.00% /sbin/devd
27949 dhcpd         1  20    0    22M  8728K select   1   0:00   0.00% /usr/local/sbin/dhcpd -6 -user dhcpd -group dhcpd -chroot /var/dhcpd -cf /etc/dhcpdv6.conf -pf /var/run
11433 root          1  20    0    13M  2564K kqread   1   0:00   0.00% /usr/sbin/rtsold -p /var/run/rtsold.pid -A /var/etc/rtsold_script.sh -R /usr/local/opnsense/scripts/int
28288 root          1  20    0    13M  4108K pause    0   0:00   0.00% /bin/csh
7580 ingemar       1  22    0    14M  4528K wait     3   0:00   0.00% -bash (bash)
30022 root          1  20    0    13M  2804K wait     3   0:00   0.00% /bin/sh /usr/local/opnsense/scripts/dhcp/prefixes.sh
11672 root          1  48    0    13M  2576K nanslp   2   0:00   0.00% /usr/sbin/cron -s
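
The CPU line in that header already shows the box is about 99% idle. To track those numbers over several pasted top outputs, a minimal Python sketch could look like this. It assumes the FreeBSD top header format shown above; parse_cpu_line is a made-up helper name, not part of any OPNsense tooling.

```python
import re


def parse_cpu_line(line):
    """Turn a top 'CPU:' header line into a {state: percent} dict."""
    if not line.startswith("CPU:"):
        raise ValueError("not a top CPU header line")
    # Each field looks like '0.7% interrupt'.
    return {name: float(pct) for pct, name in re.findall(r"([\d.]+)% (\w+)", line)}


if __name__ == "__main__":
    # Header line copied from the top output above.
    header = "CPU:  0.3% user,  0.0% nice,  0.1% system,  0.7% interrupt, 98.9% idle"
    for state, pct in parse_cpu_line(header).items():
        print(f"{state:>10}: {pct:5.1f}%")
```

A high "interrupt" or "system" share with low "user" would point more toward drivers or the network stack than toward a runaway process, but that is not what this particular header shows.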


Did you reset RRD and Netflow under Reporting - Settings recently?

Better yet, disable it altogether if not using it.

I haven't, but I do look at it occasionally (like this morning).

However, it's gone ....  :o

Mysteriously disappeared  :-[

I really don't mind an occasional problem and doing some troubleshooting, but I hate it when this happens and I have no clue what caused it, let alone what solved it :-X

In any case, thanks @newsense for taking the time. Appreciate the help!

When I had to run RRD/Netflow I'd reset it monthly and before a major upgrade.

Any issues there and the CPU will be pegged, which matters a lot for low-powered systems or old HW.

Thanks. I should probably do that as well. I've never reset it and I don't need long-term historical data.

What is the proper way to do that? I can't really find it in the GUI.

Reporting - Settings

- Reset RRD Data

- Reset Netflow Data

Wouldn't hurt to reset the DNS one as well if in use, in case there were issues with duckdb during the 24.1 upgrade.
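
If you want to see how much reporting data has piled up before resetting, here is a small Python sketch that lists the largest files under a directory. It only reads file sizes and does not reset anything; the reset itself is done from the GUI as described above. The default path /var/db/rrd is an assumption based on the updaterrd.sh path in the top output earlier in this thread, and largest_files is just an illustrative name.

```python
import os


def largest_files(directory, limit=5):
    """Return (size_bytes, path) tuples for the largest files under directory."""
    sizes = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    return sorted(sizes, reverse=True)[:limit]


if __name__ == "__main__":
    # Assumed location of the RRD data; adjust if your system differs.
    for size, path in largest_files("/var/db/rrd"):
        print(f"{size / 1024:10.1f} KiB  {path}")
```

Run it as root over SSH; if the directory does not exist it simply prints nothing.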