OPNsense Forum

Archive => 21.7 Legacy Series => Topic started by: MidGe on September 26, 2021, 01:46:04 am

Title: 21.7.3. - high CPU - Mem usage:OK - very slow web access - HALF SOLVED
Post by: MidGe on September 26, 2021, 01:46:04 am
First of all, sorry if this is a simple issue, I am a newbie with OPNsense and FreeBSD.

I tried some solutions I could find on this forum:
1. reboot
2. deactivate Reporting -> Netflow -> Capture local
3. repair Netflow data

All these to no avail.

Web access to the OPNsense dashboard and other pages is extremely slow.


Running top shows me an inordinate usage of CPU by php!

Code:
  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
12779 root          1  85    0   108M    80M CPU2     2   0:29  79.26% php-cgi
80884 root          1  94    0   147M   119M RUN      3   0:41  63.55% php-cgi
48574 root          1  82    0   106M    86M RUN      1   0:05  44.93% php
20355 root          1  82    0   106M    86M CPU1     1   0:05  39.04% php
70423 root          1  85    0   113M    85M RUN      0   1:12  31.66% php-cgi
 7189 root          1  31    0   103M    76M CPU3     3   1:19  14.41% php-cgi
11907 root          7  20    0   110M    80M nanslp   2  42:27  14.19% suricata
13796 netdata      21  52   19   139M    91M pause    0   4:20   1.51% netdata
61383 root          3  52    0   280M   228M accept   1   3:13   1.29% python3.8
26015 netdata       2  39   19    52M    32M select   3   1:49   0.39% python3.8
10788 root          1  20    0  1038M  4284K CPU0     0   0:00   0.13% top
57233 root          1  20    0    18M  8144K kqread   3   0:06   0.09% lighttpd
27323 netdata       1  39   19    18M  7024K nanslp   1   0:16   0.05% apps.plugin
54279 root          1  20    0    23M    14M select   2   0:17   0.05% python3.8
44278 root          1  20    0    18M  6372K select   2   0:09   0.03% ntpd
13176 root          3  20    0    30M    10M kqread   2   0:03   0.03% syslog-ng
51196 root          1  20    0    21M    11M select   2   0:06   0.02% python3.8
63155 root          1  20    0    21M    11M select   2   0:06   0.02% python3.8
32033 root          1  20    0    17M  7908K select   0   0:00   0.02% sshd
73352 root          1  20    0    11M  2704K select   0   0:01   0.01% syslogd
...

Tried a few more things since first posting this:

1. deleted all widgets from the portal - resulted in slightly lower php usage
2. Reset RRD
3. reset Netflow data

All to no avail again!

The problem is very likely to be a php issue.  Unfortunately my knowledge of php is close to nil.  :(

Can this be fixed and how, or do I need a re-install?

Thanks for any help.

If any other info is required, let me know.
Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access to OPNsense
Post by: MidGe on September 26, 2021, 07:53:05 am
I have kept on trying various fixes.  Mostly stabs in the dark and always rebooting after a change to make sure.

Now, disabling suricata and rebooting gives me a different result. Now the issue is with php still but also with python3.8.

Code:
  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
93888 root          1  90    0   110M    82M RUN      0   0:14  75.59% php
35258 root          1  88    0   319M   308M CPU3     3   0:12  64.56% python3.8
39001 root          1  87    0   116M    82M CPU1     1   0:12  59.04% php
78509 root          1  87    0   319M   308M RUN      2   0:15  55.70% python3.8
81797 root          1  87    0   407M   388M CPU0     0   0:37  50.51% python3.8
82987 root          1  88    0   124M    95M RUN      3   0:51  46.82% php-cgi
51231 root          1  88    0   319M   308M RUN      2   0:17  43.79% python3.8
...

Unfortunately, not being immortal, and having less time to live than most on this forum, I may have to abandon OPNsense. I really cannot afford to spend days trying to fix a network issue; that is not a productive use of my time. This is my perimeter router and it needs to be stable.


Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access to OPNsense
Post by: MidGe on September 26, 2021, 10:28:37 am
Again, here is a full top report:
Code:
last pid:  6197;  load averages:  6.48,  6.59,  6.60                                                                      up 0+02:13:58  15:51:39
59 processes:  8 running, 50 sleeping, 1 zombie
CPU: 98.9% user,  0.0% nice,  0.8% system,  0.3% interrupt,  0.0% idle
Mem: 769M Active, 2835M Inact, 671M Wired, 468M Buf, 3547M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
83182 root          1  83    0   106M    78M RUN      2   0:05  75.80% php
90397 root          1  83    0   106M    78M RUN      1   0:05  66.13% php
 2200 root          1  86    0   110M    83M CPU1     1   0:12  65.95% php
92416 root          1  88    0   110M    82M CPU3     3   0:13  57.32% php
47124 root          1  84    0   108M    80M RUN      0   0:06  49.83% php
81956 root          1  85    0   112M    84M CPU0     0   0:14  48.24% php
 3118 root          1  87    0   110M    81M RUN      2   0:11  32.96% php
13941 netdata      21  52   19   117M    69M pause    3   1:04   0.91% netdata
38210 netdata       2  39   19    52M    32M select   2   0:34   0.38% python3.8
94140 root          1  20    0  1044M  4140K CPU2     2   0:00   0.12% top
16116 netdata       1  39   19    18M  7336K nanslp   3   0:05   0.07% apps.plugin
31008 root          1  20    0    23M    14M select   2   0:09   0.05% python3.8
67388 root          1  20    0    21M    11M select   2   0:03   0.04% python3.8
86549 root          1  20    0    18M  6496K select   2   0:03   0.03% ntpd
87080 root          1  20    0    21M    11M select   1   0:03   0.03% python3.8
71351 root          1  20    0    17M  7304K select   1   0:00   0.02% sshd
85286 root          1  20    0    25M    16M select   1   2:10   0.02% python3.8
59047 root          1  20    0    18M  8136K kqread   2   0:07   0.01% lighttpd
38853 root          1  20    0    11M  2712K select   3   0:01   0.01% syslogd
68352 root          1  20    0    96M    67M select   2   0:54   0.01% php-cgi
28372 root          1  20    0    91M    61M select   0   0:01   0.01% php-cgi
65199 root          2  26    0    19M  7296K nanslp   2   0:00   0.01% monit
18862 root          1  22    0    96M    68M select   2   0:59   0.00% php-cgi
64243 root          1  20    0   104M    75M select   0   1:08   0.00% php-cgi
79915 root          1  20    0    12M  2548K bpf      3   0:00   0.00% filterlog
50443 root          1  20    0   103M    75M select   0   1:03   0.00% php-cgi
17156 root          1  20    0    91M    62M select   0   0:06   0.00% php-cgi
 7659 root          7  52    0   279M   220M accept   3   3:26   0.00% python3.8
88361 clamav        2  20    0  1233M  1169M select   3   1:16   0.00% clamd
77511 root          1  52    0    31M    19M wait     2   0:05   0.00% python3.8
 7843 root          1  52    0  1043M  3564K wait     0   0:04   0.00% sh

This shows a php issue. This php is at the core of the OPNsense software, no?  It is not simply the php for the web server.
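
As a rough cross-check of the report above, the header can be reduced to two numbers: load per core and non-idle CPU. This is only a minimal Python sketch, assuming the FreeBSD top header layout shown above and 4 cores (the J3160 listed further down is a quad core). The function name `summarize_top_header` and the `HEADER` constant are made up for illustration and are not part of OPNsense.

```python
import re

def summarize_top_header(header, cores):
    """Reduce a FreeBSD `top` header to load per core and non-idle CPU percent."""
    # First number after "load averages:" is the 1-minute load average.
    load = re.search(r"load averages:\s*([\d.]+)", header)
    # Only look at the "CPU:" line so per-process WCPU values are never picked up.
    cpu_line = next(
        (line for line in header.splitlines() if line.startswith("CPU:")), ""
    )
    # Map each state name on the CPU line (user, nice, system, ...) to its percentage.
    states = {
        name: float(pct)
        for pct, name in re.findall(r"([\d.]+)%\s+(\w+)", cpu_line)
    }
    return {
        "load_per_core": float(load.group(1)) / cores if load else None,
        "busy_percent": 100.0 - states["idle"] if "idle" in states else None,
    }

# Header copied from the report above (the long run of spaces is shortened).
HEADER = """\
last pid:  6197;  load averages:  6.48,  6.59,  6.60    up 0+02:13:58  15:51:39
59 processes:  8 running, 50 sleeping, 1 zombie
CPU: 98.9% user,  0.0% nice,  0.8% system,  0.3% interrupt,  0.0% idle
"""

print(summarize_top_header(HEADER, cores=4))
```

A 1-minute load of about 1.6 per core with 0% idle means more runnable work than the four cores can serve, which fits a GUI that takes minutes to change page.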

What can I do to progress this? I am a bit concerned about simply re-installing as it takes time to do so and my whole network will be completely down for that period.

Is it possible to downgrade from the CLI or from the web portal?

Any other suggestions?

If it is of any use, here are the details of my hardware:

Platform
Manufacturer   Protectli
Product Name   FW4B
Version   Ver 1.3
Serial Number   Default string
Family   Default string
BIOS
Vendor   American Megatrends Inc.
Version   5.11
Release Date   10/22/2019

It is a Protectli FW4B – 4 Port Intel® J3160
Intel Celeron® J3160 Quad Core at 1.6 GHz (Burst to 2.24 GHz)
with 4 Intel® Gigabit Ethernet NIC ports
and AES-NI support

Thanks for any help or advice

Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access to OPNsense
Post by: Fright on September 26, 2021, 09:06:31 pm
hi
any clue in system log?
can you share "top report" from System: Diagnostics: Activity? (top -an)
Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access to OPNsense
Post by: MidGe on September 27, 2021, 01:15:10 am
Thanks for trying to help.

Here is the "top -an" result:

Code:
  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
27631 root          1  88    0   112M    91M CPU2     2   0:15  53.56% /usr/local/bin/php /usr/local/opnsense/scripts/interfaces/traffic_stats.php
33574 root          1  87    0   110M    89M CPU1     1   0:09  52.59% /usr/local/bin/php /usr/local/opnsense/scripts/interfaces/traffic_stats.php
 7197 root          1  88    0   127M   111M RUN      1   0:17  51.95% /usr/local/bin/php /usr/local/opnsense/scripts/routes/gateway_status.php
80339 root          1  87    0   108M    87M RUN      0   0:08  50.00% /usr/local/bin/php /usr/local/opnsense/scripts/routes/gateway_status.php
35262 root          1  83    0   108M    86M CPU3     3   0:05  37.70% /usr/local/bin/php /usr/local/opnsense/scripts/interfaces/traffic_stats.php
55901 root          1  82    0   106M    85M RUN      2   0:05  35.50% /usr/local/bin/php /usr/local/opnsense/scripts/routes/gateway_status.php
34198 root          1  23    0   103M    75M select   2   0:31   4.39% /usr/local/bin/php-cgi
72367 root          1  21    0    98M    73M select   0   1:23   1.76% /usr/local/bin/php-cgi
35815 root          1  20    0    98M    72M select   2   0:07   1.76% /usr/local/bin/php-cgi
16564 root          1  20    0    93M    66M select   0   0:59   0.68% /usr/local/bin/php-cgi
13941 netdata      21  52   19   140M    92M pause    3   7:58   0.59% /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid
85286 root          1  20    0    25M    16M select   3  86:31   0.20% /usr/local/bin/python3 /usr/local/opnsense/scripts/netflow/flowd_aggregate.py (python3.8)
62531 root          1  20    0   103M    75M select   0   0:39H   0.20% /usr/local/bin/php-cgi
38210 netdata       2  39   19    52M    32M select   3   3:43   0.10% /usr/local/bin/python3.8 /usr/local/libexec/netdata/plugins.d/python.d.plugin 1
 7659 root          7  52    0   288M   227M accept   2   5:57   0.00% /usr/local/bin/python3 /usr/local/opnsense/service/configd.py console (python3.8)
88361 clamav        2  20    0  1238M  1173M select   0   2:34   0.00% /usr/local/sbin/clamd
95860 root          1  20    0    91M    65M select   2   0:46   0.00% /usr/local/bin/php-cgi
16116 netdata       1  39   19    18M  7356K nanslp   1   0:34   0.00% /usr/local/libexec/netdata/plugins.d/apps.plugin 1

None of the logs under system show anything that may indicate the source of a problem. I get the expected warnings about excessive CPU usage but that is it.
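
To see which scripts account for the load, the `top -an` listing above can be summed per command line. The following is a minimal Python sketch, assuming the 12-column layout shown above (PID through COMMAND). The helper name `wcpu_by_command` and the `SAMPLE` constant are hypothetical, and the sample lines are copied from the listing.

```python
from collections import defaultdict

def wcpu_by_command(top_output):
    """Sum the WCPU column per full command line from `top -a` style output."""
    totals = defaultdict(float)
    for line in top_output.splitlines():
        # 11 fixed columns (PID ... WCPU), then the command with its arguments.
        fields = line.split(None, 11)
        if len(fields) < 12 or not fields[0].isdigit():
            continue  # skip the column header and anything that is not a process row
        wcpu = fields[10]
        if not wcpu.endswith("%"):
            continue
        totals[fields[11].strip()] += float(wcpu.rstrip("%"))
    # Highest total first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

SAMPLE = """\
  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
27631 root          1  88    0   112M    91M CPU2     2   0:15  53.56% /usr/local/bin/php /usr/local/opnsense/scripts/interfaces/traffic_stats.php
33574 root          1  87    0   110M    89M CPU1     1   0:09  52.59% /usr/local/bin/php /usr/local/opnsense/scripts/interfaces/traffic_stats.php
 7197 root          1  88    0   127M   111M RUN      1   0:17  51.95% /usr/local/bin/php /usr/local/opnsense/scripts/routes/gateway_status.php
"""

for command, total in wcpu_by_command(SAMPLE):
    print(f"{total:6.2f}%  {command}")
```

Grouping by the full command line rather than the process name is what separates traffic_stats.php from gateway_status.php; plain top shows both only as php.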

Changing pages in the web GUI takes a minute or more.
Loading the logs in the GUI takes more than a minute.
When I ssh into the system I get the welcome message immediately, but after that I have to wait a minute or more to get to the menu.

I had suricata on, but disabled it, trying to fix the issue.  It is still currently disabled.

It has now been three days and my router is hot, although the sensors still show a temperature of 50-60 C.

Thanks again for taking interest in this.



Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access to OPNsense
Post by: allebone on September 27, 2021, 02:00:58 am
Any strange vlan setup?
Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access to OPNsense
Post by: MidGe on September 27, 2021, 02:38:21 am
Hello Allebone,

I had a second LAN interface set up for a DMZ but have had nothing at all plugged in or connected to it for a few weeks. To be sure, I disabled the interface completely just now, without any effect. Still 100% CPU usage.

Thank you for your attention to this.
Title: Re: 21.7.3. - high CPU - Issue HALF SOLVED
Post by: MidGe on September 27, 2021, 05:00:55 am
OK! I think I know a bit more about what is causing the issue.

Since I have been trying to solve this I always had a window in my web browser open to the OPNsense GUI.

As there were php-cgi processes with high CPU usage, I decided to get rid of those by closing that tab and watching what happened with top over ssh. THE ISSUE DISAPPEARED! No processes running with more than 0.09% CPU usage, and the highest one was top!

So, now the problem has to do with the web server of OPNsense somehow. The only thing I can think of is that it may be related to my monitor settings. It is a 2K monitor that I am using to access the GUI of OPNsense. This was not an issue until the last upgrade!

Does that help?

Meanwhile, I guess I can simply keep the tab monitoring the router closed until there is a fix. Hopefully it will not be too long a wait, because as soon as I open the GUI again, the CPU usage goes through the roof and stays there for as long as the window is open. I would really like to keep running OPNsense rather than having to go to alternatives. :)

If the devs need any more information from me, simply ask.  I will monitor this topic regularly over the next couple of days.

Here is the output of top with the GUI tab closed:
Code:
63154 root          1  20    0  1044M  4044K CPU3     3   0:01   0.09% top
94184 root          1  20    0    23M    14M select   3   0:06   0.03% python3.8
20949 root          1  20    0    21M    11M select   3   0:01   0.01% python3.8
59998 root          1  20    0    12M  2548K bpf      0   0:00   0.01% filterlog
97220 root          1  20    0    17M  7028K select   1   0:00   0.01% sshd
89394 root          3  20    0    28M    10M CPU1     1   0:00   0.01% syslog-ng
13444 root          1  20    0    21M    11M select   2   0:01   0.01% python3.8
22839 root          1  20    0    26M    16M select   0   0:14   0.01% python3.8
40855 root          1  20    0    18M  6492K select   3   0:00   0.01% ntpd
 4118 root          1  20    0    18M  8012K kqread   2   0:01   0.00% lighttpd
63083 root          2  26    0    19M  7396K nanslp   1   0:00   0.00% monit
58792 root          1  52    0    99M    79M accept   1   2:43   0.00% php-cgi
52477 root          1  52    0   249M   203M accept   0   2:24   0.00% python3.8
28141 root          1  20    0    74M    55M accept   3   2:13   0.00% php-cgi
99605 root          1  20    0    71M    53M accept   2   1:55   0.00% php-cgi
12970 root          1  52    0   107M    86M accept   3   1:34   0.00% php-cgi
 5931 clamav        2  20    0  1233M  1170M select   0   1:17   0.00% clamd
85673 root          1  20    0    93M    64M accept   0   1:15   0.00% php-cgi
37864 root          1  20    0    91M    63M accept   2   0:45   0.00% php-cgi
43307 root          1  52    0    31M    19M wait     3   0:05   0.00% python3.8
53063 root          1  52    0  1037M  3328K wait     2   0:01   0.00% sh
92696 unbound       4  20    0    70M    38M kqread   2   0:00   0.00% unbound
61424 root          1  20    0    21M  6812K select   0   0:00   0.00% mpd5
57269 root          1   4    0    11M  2708K CPU1     1   0:00   0.00% syslogd
81032 root          1  52    0    43M    20M wait     3   0:00   0.00% php-cgi
49170 root          1  52    0    43M    20M wait     0   0:00   0.00% php-cgi
47350 root          1  33    0  1036M  3256K nanslp   2   0:00   0.00% cron
82552 root          1  20    0  1044M  4644K pause    0   0:00   0.00% csh
15995 dhcpd         1  20    0    23M  9628K select   0   0:00   0.00% dhcpd
59269 root          1  20    0    10M  1428K select   2   0:00   0.00% devd
97240 nobody        1  20    0    10M  2080K sbwait   3   0:00   0.00% samplicate

Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access - HALF SOLVED
Post by: Fright on September 27, 2021, 10:55:14 am
hm. are there any anomalies in Traffic Graph widget (if using) or on Reporting: Traffic page? may be something in System: Log Files: Backend?
Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access - HALF SOLVED
Post by: MidGe on September 27, 2021, 02:35:21 pm
No, no anomalies at all, except a very slow refresh rate.

I have now put all widgets on the Dashboard. Then I deleted them one by one, saving the setting after each and waiting the 4 or 5 minutes for each to complete, trying to identify the culprit. Guess what: even with no widgets left on the dashboard, I still get at least one php-cgi process that goes to 100% CPU.

Strangely, when I go to the Licence page, I get none. When I exit the GUI entirely, everything is normal, no high cpu usage.  Unfortunately, having to wait 4 or 5 minutes to change page on the web GUI does not suit my use case.

If there is no solution by tomorrow morning my time, it will have been 4 days with this issue, I will have to pull the plug on OPNsense. I really cannot afford the time and I have not got the time to learn the whole framework to try to identify and fix the problem. I may, very regretfully,  have to go to a paid and  closed source solution.  :(





Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access - HALF SOLVED
Post by: allebone on September 27, 2021, 08:04:22 pm
I also had this before when I set up bridging incorrectly. Did you have any bonding or bridging at all?
Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access - HALF SOLVED
Post by: MidGe on September 27, 2021, 10:25:30 pm
None currently.

Only a router acting as a Wi-Fi access point on a different subnet, which has no issues.
Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access - HALF SOLVED
Post by: allebone on September 27, 2021, 10:36:09 pm
Can you check IPv6 is not causing an issue by disabling it entirely:
https://www.thomas-krenn.com/en/wiki/OPNsense_disable_IPv6
Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access - HALF SOLVED
Post by: MidGe on September 28, 2021, 12:49:29 am
IPv6 has been disabled since very early after the install and still is totally disabled. I find internal IPv4 much easier to manage as my labs have very frequent changes of hardware and configuration.
Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access - HALF SOLVED
Post by: allebone on September 28, 2021, 01:03:39 am
Then I am stumped  :(
Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access - HALF SOLVED
Post by: MidGe on September 28, 2021, 01:53:18 am
So am I.  :(  :)

So I now have to decide whether to re-install or go to an alternative like pfSense, IPFire or OpenWRT. They all have their pluses and minuses.

To re-install without having a clue about the issue seems a bit pointless.

Shrug!

Anyway thanks for your attention and time given to this. I really appreciated it.
Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access - HALF SOLVED
Post by: franco on September 28, 2021, 06:35:31 am
27631 root          1  88    0   112M    91M CPU2     2   0:15  53.56% /usr/local/bin/php /usr/local/opnsense/scripts/interfaces/traffic_stats.php
33574 root          1  87    0   110M    89M CPU1     1   0:09  52.59% /usr/local/bin/php /usr/local/opnsense/scripts/interfaces/traffic_stats.php
 7197 root          1  88    0   127M   111M RUN      1   0:17  51.95% /usr/local/bin/php /usr/local/opnsense/scripts/routes/gateway_status.php
80339 root          1  87    0   108M    87M RUN      0   0:08  50.00% /usr/local/bin/php /usr/local/opnsense/scripts/routes/gateway_status.php
35262 root          1  83    0   108M    86M CPU3     3   0:05  37.70% /usr/local/bin/php /usr/local/opnsense/scripts/interfaces/traffic_stats.php
55901 root          1  82    0   106M    85M RUN      2   0:05  35.50% /usr/local/bin/php /usr/local/opnsense/scripts/routes/gateway_status.php

It sure looks odd. Nothing in the code would suddenly cause this, so a reinstall might or might not solve it. Worst case, your hardware is slowing things down, which would carry over to any new installation.

Make sure your health audit comes up empty. If that is the case, try booting a live image with an early config import and see if that behaves better. It might also be the disk or SD card, depending on what is installed there.


Cheers,
Franco
Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access - HALF SOLVED
Post by: devhunter55 on September 28, 2021, 06:10:13 pm
My system is close to crashing. I did the upgrade to 21.7.3_1 this morning.

CPU is at 100% most of the time, and memory usage keeps increasing.
I think it's a python 3.8 issue together with syslog-ng?!

last pid: 13263;  load averages:  3.97,  3.91,  3.46                                                                                                                 up 9+13:55:29  18:08:53
53 processes:  3 running, 48 sleeping, 2 zombie
CPU: 54.5% user,  0.0% nice, 40.1% system,  1.1% interrupt,  4.3% idle
Mem: 851M Active, 503M Inact, 1384M Laundry, 1047M Wired, 392M Buf, 123M Free
Swap: 5120M Total, 3212M Used, 1908M Free, 62% Inuse, 128K In, 6016K Out

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
74757 root          6  28    0    33M  4388K kqread   0  91:32  93.87% syslog-ng
90861 root          1 101    0    11M  1564K RUN      2  87:22  87.99% syslogd
15412 root          1 101    0  4261M  2115M CPU3     3 842:39  85.44% python3.8


Name    opnadm9.opn9.9opn
Versions    OPNsense 21.7.3_1-amd64
    FreeBSD 12.1-RELEASE-p20-HBSD
    OpenSSL 1.1.1l 24 Aug 2021
CPU type    AMD GX-412TC SOC (4 cores)
Load average    3.69, 3.72, 3.33
Uptime    9 days 13:53:29
Current date/time    Tue Sep 28 18:06:53 CEST 2021
Last config change    Tue Sep 28 11:16:26 CEST 2021
CPU usage    100 %
State table size    0 % ( 890/403000 )
MBUF usage    0 % ( 1806/250690 )
Memory usage    88 % ( 3563/4035 MB )
SWAP usage    59 % ( 3041/5120 MB )
Disk usage    75% / [ufs] (9.3G/13G)
Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access - HALF SOLVED
Post by: Fright on September 28, 2021, 06:55:13 pm
@devhunter55
imho your issue is related to https://forum.opnsense.org/index.php?topic=24868.0 not this one
try to reboot opnsense or just kill '/usr/local/bin/python3 /usr/local/opnsense/service/configd_ctl.py -e -t 0.5 system event config_changed' instances and clear system log
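
Finding the instances mentioned above could look roughly like the Python sketch below. It assumes the process list was captured with something like `ps -axww -o pid,command`; the helper names and the sample PIDs are made up for illustration. The needle is the argument string quoted above. Nothing is killed unless `terminate` is called explicitly.

```python
import os
import signal

# Argument string quoted in the post above.
NEEDLE = "configd_ctl.py -e -t 0.5 system event config_changed"

def matching_pids(ps_output, needle=NEEDLE):
    """Return PIDs whose command line contains needle.

    Expects two columns per line: PID, then the full command
    (assumed here to come from `ps -axww -o pid,command`).
    """
    pids = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0].isdigit() and needle in parts[1]:
            pids.append(int(parts[0]))
    return pids

def terminate(pids):
    """Send SIGTERM to each PID. Needs sufficient privileges; use with care."""
    for pid in pids:
        os.kill(pid, signal.SIGTERM)

# Hypothetical ps output, for illustration only.
SAMPLE_PS = """\
  PID COMMAND
  101 /usr/local/bin/python3 /usr/local/opnsense/service/configd_ctl.py -e -t 0.5 system event config_changed
  202 /usr/local/bin/python3 /usr/local/opnsense/service/configd_ctl.py -e -t 0.5 system event config_changed
  303 /usr/local/bin/python3 /usr/local/opnsense/service/configd.py console (python3.8)
"""

print(matching_pids(SAMPLE_PS))
```

Matching on the full argument string rather than on `python3` alone keeps configd.py itself and unrelated Python processes out of the list.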
Title: Re: 21.7.3. - high CPU - Mem usage:OK - very slow web access - HALF SOLVED
Post by: devhunter55 on September 28, 2021, 10:00:41 pm
yes, thanks .. i already mentioned that in the other board  :D