Maltrail consuming a lot of CPU after 19.7.3

Started by tusc, August 29, 2019, 03:03:50 AM

I upgraded today and noticed maltrail's sensor has 4 processes running and the load is above 5. Is this normal?

root@OPNsense:/usr/local/etc/rc.d # ps axu | grep maltrail
root    21811 100.3 19.8  717304 704352  -  S    19:56    4:22.30 python2.7 /usr/local/share/maltrail/sensor.py
root    96015  34.4 19.8  713732 703972  -  R    19:58    1:10.19 python2.7 /usr/local/share/maltrail/sensor.py
root     2744  32.5 19.8  713852 704220  -  S    19:58    1:11.61 python2.7 /usr/local/share/maltrail/sensor.py
root    24134  32.1 19.8  713596 703792  -  S    19:58    1:09.22 python2.7 /usr/local/share/maltrail/sensor.py
root    55286   0.0  0.8   40668  26844  -  S    19:26    0:02.11 python2.7 /usr/local/share/maltrail/server.py



root@OPNsense:/usr/local/etc/rc.d # top -bHS
last pid: 78026;  load averages:  4.36,  5.04,  5.42  up 0+01:31:43    20:02:23
210 processes: 12 running, 155 sleeping, 43 waiting

Mem: 780M Active, 210M Inact, 538M Wired, 303M Buf, 1817M Free
Swap:


  PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
21811 root        86    0   700M   688M CPU0    0   2:01  47.46% python2.7{python2.7}
21811 root        86    0   700M   688M RUN     0   1:59  47.36% python2.7{python2.7}
96015 root        44    0   697M   687M select  2   1:26  33.40% python2.7{python2.7}
24134 root        44    0   697M   687M CPU2    2   1:25  32.67% python2.7{python2.7}
2744 root        45    0   697M   688M RUN     3   1:27  32.37% python2.7{python2.7}
   11 root       155 ki31     0K    64K RUN     3  24:49  30.08% idle{idle: cpu3}
   11 root       155 ki31     0K    64K RUN     2  25:12  29.49% idle{idle: cpu2}
   11 root       155 ki31     0K    64K RUN     0  22:51  25.29% idle{idle: cpu0}
   12 root       -92    -     0K   720K CPU3    3  21:53  25.29% intr{irq265: igb1:que 3}
   11 root       155 ki31     0K    64K RUN     1  19:21  20.56% idle{idle: cpu1}
   12 root       -92    -     0K   720K WAIT    2  16:50  18.80% intr{irq259: igb0:que 2}
   12 root       -92    -     0K   720K WAIT    0   3:28   2.39% intr{irq262: igb1:que 0}
   12 root       -92    -     0K   720K RUN     3   1:36   1.95% intr{irq260: igb0:que 3}
   12 root       -92    -     0K   720K WAIT    2   1:37   1.56% intr{irq264: igb1:que 2}
56999 root        22    0 35044K 22400K select  1   0:00   0.39% php-cgi

Well, it looks like it has subsided. Not sure what caused it. Does it normally have that many processes active?


root@OPNsense:/usr/local/etc/rc.d # top -bHS
last pid: 80886;  load averages:  0.61,  0.67,  0.85  up 0+02:36:04    21:06:44
207 processes: 6 running, 156 sleeping, 45 waiting

Mem: 486M Active, 501M Inact, 539M Wired, 303M Buf, 1820M Free
Swap:


  PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root       155 ki31     0K    64K CPU2    2  70:52  92.09% idle{idle: cpu2}
   11 root       155 ki31     0K    64K CPU0    0  66:48  86.96% idle{idle: cpu0}
   11 root       155 ki31     0K    64K RUN     3  70:55  84.47% idle{idle: cpu3}
   11 root       155 ki31     0K    64K CPU1    1  62:52  84.28% idle{idle: cpu1}
24134 root        22    0   697M   688M select  0  11:00   7.28% python2.7{python2.7}
96015 root        21    0   697M   688M CPU1    1  11:00   5.76% python2.7{python2.7}
2744 root        21    0   698M   688M select  2  10:59   3.56% python2.7{python2.7}
21811 root        21    0   700M   688M bpf     2   9:46   1.86% python2.7{python2.7}
21811 root        21    0   700M   688M bpf     2   9:44   1.86% python2.7{python2.7}
81897 root        52    0 37096K 25328K accept  0   0:02   0.98% php-cgi



What does your /var/log/maltrail/error.log show? Mine is flooded with "Received unexpected datalink (186)". I cannot find the root cause for it, but it is definitely what's behind the high CPU load on my device.
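To check whether your log shows the same flood (log path as above), counting the occurrences is enough:

grep -c "Received unexpected datalink" /var/log/maltrail/error.log

and to confirm it is still being written, watch it live:

tail -f /var/log/maltrail/error.log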


Not sure in my case, I just noticed it. Meanwhile, Maltrail development committed a patch. @mimugmail, is there an advised way to apply the patch to the plugin? I don't think opnsense-patch can be applied against another repo?
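I guess the commit could also be fetched from the upstream maltrail repo and applied by hand; an untested sketch, with <commit-hash> standing in for the actual commit:

fetch -o /tmp/maltrail-fix.diff https://github.com/stamparm/maltrail/commit/<commit-hash>.diff
cd /usr/local/share/maltrail && patch -p1 < /tmp/maltrail-fix.diff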



@mimugmail

Apparently changing from `PROCESS_COUNT` to `USE_MULTIPROCESSING` solved my problem with very high CPU usage.

Does it make any sense to you?

ref.: https://github.com/cloudfence/plugins/commit/e878c035a465882d186e5c9181f827b5c21e177d
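For reference, the change appears to boil down to which knob the rendered maltrail.conf sets; roughly like this (values illustrative, see the commit above for the real diff):

# before: one capture process per core
PROCESS_COUNT 4
# after: a single process unless multiprocessing is explicitly enabled
#USE_MULTIPROCESSING true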
Cloudfence Open Source Team

February 20, 2020, 06:02:52 PM #9 Last Edit: February 20, 2020, 06:09:28 PM by juliocbc
Tests with an APU2e4 (quad-core / 4 GB RAM) - OPNsense 19.7.8-amd64

Running with:

- 100 Mb + 60 Mb WANs
- 4 VLANs
- 1 site-to-site OpenVPN
- Plugins: Nginx, Let's Encrypt, Proxy (with Cloudfence's web filter), HAProxy, FTP Proxy
- ARP table: 51 entries

Maltrail with only the sensor running (we use a dedicated Maltrail server here) and a custom configuration:
PROCESS_COUNT 1
#USE_MULTIPROCESSING true
DISABLE_CPU_AFFINITY false


CPU stats (every 3 min) - Load average (1 core): 94%
Memory: 15%
Cloudfence Open Source Team

Maltrail server stats:

Cloudfence Open Source Team

Quote from: juliocbc on February 20, 2020, 06:02:52 PM
Maltrail with only the sensor running (we use a dedicated Maltrail server here) and a custom configuration:
PROCESS_COUNT 1
#USE_MULTIPROCESSING true
DISABLE_CPU_AFFINITY false

CPU stats (every 3 min) - Load average (1 core): 94%
Memory: 15%

Currently maltrail.conf uses a fixed $CPU_CORES; shall I make this configurable?

Hi Michael!

I've edited the Jinja2 template to run some tests.
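For anyone wanting to reproduce the test: I just pinned the rendered value to a literal. Assuming the plugin's template lives under /usr/local/opnsense/service/templates (verify the exact path on your box), the edit amounts to changing a line like

PROCESS_COUNT {{ CPU_CORES }}

to

PROCESS_COUNT 1

(the variable name here is illustrative; check the actual template.)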
Cloudfence Open Source Team

This seemed to be fixed for a while, but now I am back to high CPU with maltrail.

 8212   uwait    37.50%   python3 /usr/local/share/maltrail/sensor.py (python3.7){python3.7}
49973   CPU3     36.38%   python3 /usr/local/share/maltrail/sensor.py (python3.7){python3.7}
61436   CPU2     35.60%   /usr/local/sbin/openvpn --config /var/etc/openvpn/client2.conf
34952   select   34.77%   python3 /usr/local/share/maltrail/sensor.py (python3.7){python3.7}
23576   select   33.50%   python3 /usr/local/share/maltrail/sensor.py (python3.7){python3.7}
74367   select   23.78%   /usr/local/bin/suricata -D --netmap --pidfile /var/run/suricata.pid -c /usr/local/etc/suricata/suricata.yaml{W#01-em3}

The system is a 4-core APU; total memory use is only about 20%.

Forgot to mention that I am running 20.1.8.

Usually this only happens when downloading new feeds after a restart ... how long did you wait?
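If you want to see whether the sensor is still busy with the feed download, you can sample its CPU over time, e.g.:

while true; do ps -ax -o pid,%cpu,command | grep '[s]ensor.py'; sleep 30; done

(the [s] keeps grep from matching itself). If usage stays high long after the feeds are in, it's probably not the update.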