OPNsense Forum
Archive => 18.7 Legacy Series => Topic started by: Julien on July 21, 2018, 10:44:15 pm
-
Dear All,
I have a hardware box that is continuously running at 90-99% CPU, which causes a lot of packet loss on the WAN side.
I have checked the I/O activity:
/0 /1 /2 /3 /4 /5 /6 /7 /8 /9 /10
Load Average ||||
/0% /10 /20 /30 /40 /50 /60 /70 /80 /90 /100
cpu user|XXX
nice|
system|XX
interrupt|
idle|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
/0% /10 /20 /30 /40 /50 /60 /70 /80 /90 /100
ada0 MB/s
tps|XXXX
pass0 MB/s
tps|
and the interrupt and CPU usage:
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0K 32K RUN 1 158:26 91.77% [idle{idle: cpu1}]
11 root 155 ki31 0K 32K RUN 0 160:42 79.55% [idle{idle: cpu0}]
10682 root 21 0 112M 22368K accept 1 0:04 2.05% /usr/local/bin/php-cgi
12 root -60 - 0K 400K CPU0 0 2:17 0.48% [intr{swi4: clock (0)}]
12 root -92 - 0K 400K WAIT 0 0:29 0.24% [intr{irq259: em1:rx0}]
88373 root 20 0 20076K 3804K CPU1 1 0:00 0.22% top -aSH
21352 root 20 0 49640K 8628K kqread 1 0:43 0.12% /usr/local/sbin/lighttpd -f /var/etc/lighty-webConfigurator.conf
423 root 52 0 132M 32328K accept 0 0:20 0.08% /usr/local/bin/python2.7 /usr/local/opnsense/service/configd.py console{python2.7}
46510 root 20 0 1061M 6584K select 1 0:01 0.08% /usr/local/sbin/openvpn --config /var/etc/openvpn/client2.conf
12 root -92 - 0K 400K WAIT 0 0:05 0.05% [intr{irq262: em2:rx0}]
18706 root 20 0 1049M 2760K select 0 0:04 0.03% /usr/local/sbin/apinger -c /var/etc/apinger.conf
62443 root 20 0 1091M 6864K select 1 0:00 0.03% sshd: root@pts/0 (sshd)
16 root -16 - 0K 16K pftm 0 0:05 0.03% [pf purge]
12 root -92 - 0K 400K WAIT 1 0:03 0.03% [intr{irq260: em1:tx0}]
17 root -16 - 0K 16K - 1 0:03 0.01% [rand_harvestq]
968 root 29 0 97112K 22380K select 1 46:15 0.01% /usr/local/bin/python2.7 /usr/local/opnsense/scripts/netflow/flowd_aggregate.py
4 root -16 - 0K 32K - 1 0:05 0.01% [cam{doneq0}]
4692 root 20 0 1051M 3028K select 0 0:03 0.01% /usr/local/sbin/syslogd -s -c -c -P /var/run/syslog.pid -l /var/dhcpd/var/run/log -f /var/
85922 root 20 0 1051M 6124K select 0 0:02 0.01% /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid{ntpd}
12 root -92 - 0K 400K WAIT 1 0:01 0.01% [intr{irq263: em2:tx0}]
29963 dhcpd 20 0 24732K 8788K select 1 0:01 0.01% /usr/local/sbin/dhcpd -user dhcpd -group dhcpd -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf
12 root -88 - 0K 400K WAIT 0 0:03 0.01% [intr{irq19: ahci0}]
0 root -4 - 0K 320K - 0 0:02 0.01% [kernel{/ trim}]
18 root -16 - 0K 48K psleep 1 0:00 0.00% [pagedaemon{pagedaemon}]
94464 root 20 0 38816K 5740K kqread 1 0:01 0.00% /usr/local/sbin/lighttpd -f /var/etc/lighttpd-acme-challenge.conf
14975 root 20 0 1053M 2824K bpf 1 0:01 0.00% /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
42547 root 20 0 1061M 6592K select 1 0:00 0.00% /usr/loc
The firewall is running just two simple rules: any to any on the LAN, and OpenVPN on the WAN. Nothing else.
I just want to make sure we are not dealing with faulty hardware.
Can someone please point me in the right direction on what to check?
Thank you
-
Did you cut off part of the screen? The system looks 99% idle.
-
Thank you for your answer, but I am not sure I understand what you mean.
Which screen has been cut off?
-
The top two lines in your second 'capture' show the following:
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0K 32K RUN 1 158:26 91.77% [idle{idle: cpu1}]
11 root 155 ki31 0K 32K RUN 0 160:42 79.55% [idle{idle: cpu0}]
Which, in my limited experience of FreeBSD, would seem to indicate that cpu1 is 91.77% idle and cpu0 is 79.55% idle. Is that what it's showing, and why do you think that your 'hardware box' is running at 90-99% CPU usage?
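To make that reading concrete, here is a minimal Python sketch that takes the two idle-thread lines from the capture and reports each CPU's approximate busy share as 100 minus the idle WCPU. The helper name `busy_per_cpu` and the regular expression are assumptions made for illustration; they are not part of top or OPNsense.

```python
import re
from typing import Dict

# The two idle-thread lines copied from the `top -aSH` capture in this thread.
TOP_LINES = """\
11 root 155 ki31 0K 32K RUN 1 158:26 91.77% [idle{idle: cpu1}]
11 root 155 ki31 0K 32K RUN 0 160:42 79.55% [idle{idle: cpu0}]
"""

# Hypothetical pattern: the WCPU column followed by the kernel idle thread name.
IDLE_RE = re.compile(r"(\d+(?:\.\d+)?)%\s+\[idle\{idle: (cpu\d+)\}\]")


def busy_per_cpu(top_output: str) -> Dict[str, float]:
    """Return the approximate busy percentage per CPU (100 minus idle WCPU)."""
    busy = {}
    for line in top_output.splitlines():
        match = IDLE_RE.search(line)
        if match:
            idle_pct = float(match.group(1))
            busy[match.group(2)] = round(100.0 - idle_pct, 2)
    return busy


if __name__ == "__main__":
    for cpu, pct in sorted(busy_per_cpu(TOP_LINES).items()):
        print(f"{cpu}: about {pct}% busy")
```

On this capture both CPUs come out far below the 90-99% reported in the GUI, which is why the snapshot alone does not point at a CPU problem.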
-
Hi Bill,
The timing of the capture was just off. Idle drops to 0.1% and then jumps back again.
In the GUI it shows the CPU running at around 99%.
Because it is so busy, it causes a lot of packet drops on the WAN side, and I need to identify what causes this.
When we ping the ISP IP, it shows timeouts:
64 bytes from 66.88.99.0: icmp_seq=0 ttl=64 time=3.902 ms
64 bytes from 66.88.99.0: icmp_seq=1 ttl=64 time=1.032 ms
64 bytes from 66.88.99.0: icmp_seq=2 ttl=64 time=1.369 ms
Request timed out.
64 bytes from 66.88.99.0: icmp_seq=3 ttl=64 time=1.187 ms
64 bytes from 66.88.99.0: icmp_seq=4 ttl=64 time=1.123 ms
64 bytes from 66.88.99.0: icmp_seq=5 ttl=64 time=1.335 ms
Request timed out.
64 bytes from 66.88.99.0: icmp_seq=6 ttl=64 time=1.099 ms
64 bytes from 66.88.99.0: icmp_seq=7 ttl=64 time=2.227 ms
64 bytes from 66.88.99.0: icmp_seq=8 ttl=64 time=1.191 ms
64 bytes from 66.88.99.0: icmp_seq=9 ttl=64 time=1.060 ms
We just need to know where to look: the ISP or the firewall.
Thank you
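For anyone who wants to put a number on the loss, the ping excerpt above can be tallied with a short Python sketch. The function name `packet_loss` is hypothetical, and the calculation assumes every "Request timed out." line stands for exactly one lost probe.

```python
# Ping excerpt copied from the post above.
PING_OUTPUT = """\
64 bytes from 66.88.99.0: icmp_seq=0 ttl=64 time=3.902 ms
64 bytes from 66.88.99.0: icmp_seq=1 ttl=64 time=1.032 ms
64 bytes from 66.88.99.0: icmp_seq=2 ttl=64 time=1.369 ms
Request timed out.
64 bytes from 66.88.99.0: icmp_seq=3 ttl=64 time=1.187 ms
64 bytes from 66.88.99.0: icmp_seq=4 ttl=64 time=1.123 ms
64 bytes from 66.88.99.0: icmp_seq=5 ttl=64 time=1.335 ms
Request timed out.
64 bytes from 66.88.99.0: icmp_seq=6 ttl=64 time=1.099 ms
64 bytes from 66.88.99.0: icmp_seq=7 ttl=64 time=2.227 ms
64 bytes from 66.88.99.0: icmp_seq=8 ttl=64 time=1.191 ms
64 bytes from 66.88.99.0: icmp_seq=9 ttl=64 time=1.060 ms
"""


def packet_loss(ping_output: str):
    """Return (replies, timeouts, loss percentage) for a ping transcript.

    Assumption: each 'timed out' line represents one lost probe.
    """
    replies = 0
    timeouts = 0
    for line in ping_output.splitlines():
        if "bytes from" in line:
            replies += 1
        elif "timed out" in line.lower():
            timeouts += 1
    total = replies + timeouts
    loss_pct = round(100.0 * timeouts / total, 1) if total else 0.0
    return replies, timeouts, loss_pct


if __name__ == "__main__":
    replies, timeouts, loss_pct = packet_loss(PING_OUTPUT)
    print(f"{replies} replies, {timeouts} timeouts, {loss_pct}% loss")
```

The replies that do arrive take 1-4 ms, so the excerpt shows intermittent loss rather than a slow link. It does not by itself say whether the drops happen on the firewall or upstream.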
-
Your system looks idle based on what you've shared.
You should see something above that you cut off when you ran your top command:
last pid: 52990; load averages: 0.29, 0.20, 0.13 up 6+21:42:25 09:25:27
49 processes: 1 running, 48 sleeping
CPU: 0.0% user, 0.0% nice, 0.0% system, 0.1% interrupt, 99.9% idle
Mem: 24M Active, 3645M Inact, 729M Wired, 437M Buf, 3462M Free
Swap: 8192M Total, 8192M Free
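As a rough illustration of how to read that header, the `CPU:` summary line can be split into its fields and the overall busy share taken as 100 minus the idle value. This Python sketch uses the sample line quoted above; `parse_cpu_line` and `busy_percent` are hypothetical helper names, not part of top.

```python
import re
from typing import Dict

# The CPU summary line from the sample top header quoted above.
CPU_LINE = "CPU: 0.0% user, 0.0% nice, 0.0% system, 0.1% interrupt, 99.9% idle"


def parse_cpu_line(line: str) -> Dict[str, float]:
    """Map each state name (user, nice, system, interrupt, idle) to its percentage."""
    pairs = re.findall(r"(\d+(?:\.\d+)?)%\s+(\w+)", line)
    return {name: float(value) for value, name in pairs}


def busy_percent(line: str) -> float:
    """Overall busy share, taken as 100 minus the idle percentage."""
    return round(100.0 - parse_cpu_line(line)["idle"], 1)


if __name__ == "__main__":
    print(parse_cpu_line(CPU_LINE))
    print(f"busy: {busy_percent(CPU_LINE)}%")
```

If the box really sat at 90-99% CPU, the same line would show a small idle value and a correspondingly large user, system, or interrupt share.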
-
Thank you for your answer.
Do you know what might cause the drops to the ISP gateway?
-
More info would help, such as the type of connection, e.g. a bridged modem.
As an example, I have a dual-WAN setup with two DSL modems in bridge mode: one goes to the incumbent carrier and the second to a reseller of the incumbent carrier's service. The incumbent uses ADSL2+ whereas the reseller is G.dmt, both fast path. The incumbent line does exactly what you describe, and when I log into that modem (I documented how to set up rules to do this in the tutorials), I see the SNR margin has dropped considerably. This never lasts for more than a minute and happens only several times a week. The point is that it is in no way the fault of OPNsense. Your system appears to be happily gliding along as far as load goes.
-
Could perhaps the dot-zero IP-address (66.88.99.0) be the culprit?
https://labs.ripe.net/Members/stephane_bortzmeyer/all-ip-addresses-are-equal-dot-zero-addresses-are-less-equal
-
Usually a dot-zero address represents an entire subnet, e.g. 66.88.99.0/length
-
Only if the subnet size is /24 or smaller.
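The dependence on subnet size can be checked with Python's standard `ipaddress` module. This is only a sketch: the prefix lengths tried here are assumptions for illustration, since the thread never states the ISP's actual prefix, and `is_network_address` is a hypothetical helper name.

```python
import ipaddress


def is_network_address(address: str, prefix_len: int) -> bool:
    """True if `address` is the network (all-zero host bits) address of its subnet."""
    addr = ipaddress.ip_address(address)
    # strict=False masks off the host bits instead of raising an error.
    network = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
    return addr == network.network_address


if __name__ == "__main__":
    # Assumed prefix lengths; the real one for 66.88.99.0 is not given in the thread.
    for prefix_len in (23, 24, 25):
        kind = "network address" if is_network_address("66.88.99.0", prefix_len) else "usable host address"
        print(f"66.88.99.0/{prefix_len}: {kind}")
```

With a /23, 66.88.99.0 falls inside 66.88.98.0/23 as an ordinary host address, while with a /24 or /25 it is the network address itself.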
-
Thank you guys, we got this fixed.
It was an ISP issue and they have already fixed it.
Thank you for your support.
-
Sorry for kicking up an old thread, but I was wondering if the ISP told you what the problem was? I'm having these same symptoms and I can't seem to pin them down.
Thanks,