Dear All,
I have a hardware box that is continuously running at 90-99% CPU, which causes a lot of packet loss on the WAN side.
I have checked the I/O operations:
               /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
Load Average   ||||
          /0%  /10  /20  /30  /40  /50  /60  /70  /80  /90  /100
cpu  user|XXX
     nice|
   system|XX
interrupt|
     idle|XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
          /0%  /10  /20  /30  /40  /50  /60  /70  /80  /90  /100
ada0  MB/s
      tps|XXXX
pass0 MB/s
      tps|
and the interrupt CPU usage:
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0K 32K RUN 1 158:26 91.77% [idle{idle: cpu1}]
11 root 155 ki31 0K 32K RUN 0 160:42 79.55% [idle{idle: cpu0}]
10682 root 21 0 112M 22368K accept 1 0:04 2.05% /usr/local/bin/php-cgi
12 root -60 - 0K 400K CPU0 0 2:17 0.48% [intr{swi4: clock (0)}]
12 root -92 - 0K 400K WAIT 0 0:29 0.24% [intr{irq259: em1:rx0}]
88373 root 20 0 20076K 3804K CPU1 1 0:00 0.22% top -aSH
21352 root 20 0 49640K 8628K kqread 1 0:43 0.12% /usr/local/sbin/lighttpd -f /var/etc/lighty-webConfigurator.conf
423 root 52 0 132M 32328K accept 0 0:20 0.08% /usr/local/bin/python2.7 /usr/local/opnsense/service/configd.py console{python2.7}
46510 root 20 0 1061M 6584K select 1 0:01 0.08% /usr/local/sbin/openvpn --config /var/etc/openvpn/client2.conf
12 root -92 - 0K 400K WAIT 0 0:05 0.05% [intr{irq262: em2:rx0}]
18706 root 20 0 1049M 2760K select 0 0:04 0.03% /usr/local/sbin/apinger -c /var/etc/apinger.conf
62443 root 20 0 1091M 6864K select 1 0:00 0.03% sshd: root@pts/0 (sshd)
16 root -16 - 0K 16K pftm 0 0:05 0.03% [pf purge]
12 root -92 - 0K 400K WAIT 1 0:03 0.03% [intr{irq260: em1:tx0}]
17 root -16 - 0K 16K - 1 0:03 0.01% [rand_harvestq]
968 root 29 0 97112K 22380K select 1 46:15 0.01% /usr/local/bin/python2.7 /usr/local/opnsense/scripts/netflow/flowd_aggregate.py
4 root -16 - 0K 32K - 1 0:05 0.01% [cam{doneq0}]
4692 root 20 0 1051M 3028K select 0 0:03 0.01% /usr/local/sbin/syslogd -s -c -c -P /var/run/syslog.pid -l /var/dhcpd/var/run/log -f /var/
85922 root 20 0 1051M 6124K select 0 0:02 0.01% /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid{ntpd}
12 root -92 - 0K 400K WAIT 1 0:01 0.01% [intr{irq263: em2:tx0}]
29963 dhcpd 20 0 24732K 8788K select 1 0:01 0.01% /usr/local/sbin/dhcpd -user dhcpd -group dhcpd -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf
12 root -88 - 0K 400K WAIT 0 0:03 0.01% [intr{irq19: ahci0}]
0 root -4 - 0K 320K - 0 0:02 0.01% [kernel{/ trim}]
18 root -16 - 0K 48K psleep 1 0:00 0.00% [pagedaemon{pagedaemon}]
94464 root 20 0 38816K 5740K kqread 1 0:01 0.00% /usr/local/sbin/lighttpd -f /var/etc/lighttpd-acme-challenge.conf
14975 root 20 0 1053M 2824K bpf 1 0:01 0.00% /usr/local/sbin/filterlog -i pflog0 -p /var/run/filterlog.pid
42547 root 20 0 1061M 6592K select 1 0:00 0.00% /usr/loc
The firewall is running just two simple firewall rules: any to any on the LAN and OpenVPN on the WAN, nothing else.
I just want to make sure we are not dealing with faulty hardware.
Can someone please point me in the right direction on what to check?
Thank you
Did you cut off part of the screen? The system looks 99% idle.
Quote from: Animosity022 on July 22, 2018, 12:15:25 AM
Did you cut off part of the screen? The system looks 99% idle.
Thank you for your answer, but I am not sure I understand what you mean.
Which screen has been cut off?
Quote from: Julien on July 22, 2018, 11:12:46 AM
Thank you for your answer, but I am not sure I understand what you mean.
Which screen has been cut off?
The top two lines in your second 'capture' show the following:
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0K 32K RUN 1 158:26 91.77% [idle{idle: cpu1}]
11 root 155 ki31 0K 32K RUN 0 160:42 79.55% [idle{idle: cpu0}]
Which, in my limited experience of FreeBSD, would seem to indicate that cpu1 is 91.77% idle and cpu0 is 79.55% idle. Is that what it's showing, and why do you think that your 'hardware box' is running at 90-99% CPU usage?
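To make the arithmetic behind that reading explicit, here is a minimal Python sketch. It assumes the WCPU value on each [idle{idle: cpuN}] row is the share of time that core spent idle, so the busy share is roughly 100 minus that value. The function name busy_per_core and the regular expression are only illustrative, and the two rows are copied from the capture in this thread.

```python
import re

# Two rows copied from the "top -aSH" capture posted in this thread.
TOP_ROWS = """\
11 root 155 ki31 0K 32K RUN 1 158:26 91.77% [idle{idle: cpu1}]
11 root 155 ki31 0K 32K RUN 0 160:42 79.55% [idle{idle: cpu0}]
"""

# Matches the WCPU column followed by the per-core idle thread name.
IDLE_ROW = re.compile(r"(\d+(?:\.\d+)?)%\s+\[idle\{idle: cpu(\d+)\}\]")


def busy_per_core(top_text):
    """Return {core number: busy percent}, taking busy = 100 - idle WCPU.

    Assumption: the idle thread's WCPU approximates the idle share of
    that core at the moment of the capture.
    """
    busy = {}
    for line in top_text.splitlines():
        match = IDLE_ROW.search(line)
        if match:
            idle_pct = float(match.group(1))
            core = int(match.group(2))
            busy[core] = round(100.0 - idle_pct, 2)
    return busy


if __name__ == "__main__":
    for core, pct in sorted(busy_per_core(TOP_ROWS).items()):
        print(f"cpu{core}: about {pct}% busy")
```

On these two rows that works out to roughly 20% busy on cpu0 and 8% busy on cpu1, which is a long way from 90-99%. It is a single snapshot, so it says nothing about short spikes between refreshes.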
Quote from: phoenix on July 22, 2018, 01:55:40 PM
Which, in my limited experience of FreeBSD, would seem to indicate that cpu1 is 91.77% idle and cpu0 is 79.55% idle. Is that what it's showing, and why do you think that your 'hardware box' is running at 90-99% CPU usage?
Hi bill,
The timing of the capture was just not right. Idle drops to 0.1% and then jumps back up again.
In the GUI the CPU shows as running at around 99%.
Because it is so busy, it causes a lot of packet drops on the WAN side, and I need to identify what causes this.
When we ping the ISP IP, it shows timeouts:
64 bytes from 66.88.99.0: icmp_seq=0 ttl=64 time=3.902 ms
64 bytes from 66.88.99.0: icmp_seq=1 ttl=64 time=1.032 ms
64 bytes from 66.88.99.0: icmp_seq=2 ttl=64 time=1.369 ms
Request timed out.
64 bytes from 66.88.99.0: icmp_seq=3 ttl=64 time=1.187 ms
64 bytes from 66.88.99.0: icmp_seq=4 ttl=64 time=1.123 ms
64 bytes from 66.88.99.0: icmp_seq=5 ttl=64 time=1.335 ms
Request timed out.
64 bytes from 66.88.99.0: icmp_seq=6 ttl=64 time=1.099 ms
64 bytes from 66.88.99.0: icmp_seq=7 ttl=64 time=2.227 ms
64 bytes from 66.88.99.0: icmp_seq=8 ttl=64 time=1.191 ms
64 bytes from 66.88.99.0: icmp_seq=9 ttl=64 time=1.060 ms
We just need to know where to look: the ISP or the firewall.
Thank you.
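The loss in the ping output above can be put into a number with a short Python sketch. It assumes every "Request timed out." line stands for exactly one lost probe and every line containing "time=" is one reply. The helper name summarize_ping is hypothetical, and the sample text is the output pasted in this thread.

```python
# Ping output as pasted in this thread.
PING_OUTPUT = """\
64 bytes from 66.88.99.0: icmp_seq=0 ttl=64 time=3.902 ms
64 bytes from 66.88.99.0: icmp_seq=1 ttl=64 time=1.032 ms
64 bytes from 66.88.99.0: icmp_seq=2 ttl=64 time=1.369 ms
Request timed out.
64 bytes from 66.88.99.0: icmp_seq=3 ttl=64 time=1.187 ms
64 bytes from 66.88.99.0: icmp_seq=4 ttl=64 time=1.123 ms
64 bytes from 66.88.99.0: icmp_seq=5 ttl=64 time=1.335 ms
Request timed out.
64 bytes from 66.88.99.0: icmp_seq=6 ttl=64 time=1.099 ms
64 bytes from 66.88.99.0: icmp_seq=7 ttl=64 time=2.227 ms
64 bytes from 66.88.99.0: icmp_seq=8 ttl=64 time=1.191 ms
64 bytes from 66.88.99.0: icmp_seq=9 ttl=64 time=1.060 ms
"""


def summarize_ping(text):
    """Return (probes, timeouts, loss percent, average RTT in ms).

    Assumption: one "timed out" line equals one lost probe.
    """
    rtts = []
    timeouts = 0
    for line in text.splitlines():
        if "time=" in line:
            # Take the number between "time=" and the trailing " ms".
            rtts.append(float(line.split("time=")[1].split()[0]))
        elif "timed out" in line.lower():
            timeouts += 1
    probes = len(rtts) + timeouts
    loss_pct = 100.0 * timeouts / probes if probes else 0.0
    avg_rtt = sum(rtts) / len(rtts) if rtts else None
    return probes, timeouts, loss_pct, avg_rtt


if __name__ == "__main__":
    probes, timeouts, loss_pct, avg_rtt = summarize_ping(PING_OUTPUT)
    print(f"{timeouts} of {probes} probes lost ({loss_pct:.1f}%)")
    print(f"average RTT of the replies: {avg_rtt:.3f} ms")
```

Under that assumption the sample shows 2 lost probes out of 12, close to 17% loss, while the replies that do arrive stay in the low single-digit milliseconds. Running the same count against a second target, for example a host beyond the ISP gateway, would help tell a gateway that deprioritises ICMP apart from real loss on the line.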
Your system looks idle based on what you've shared.
You should see something above that you cut off when you ran your top command:
last pid: 52990; load averages: 0.29, 0.20, 0.13 up 6+21:42:25 09:25:27
49 processes: 1 running, 48 sleeping
CPU: 0.0% user, 0.0% nice, 0.0% system, 0.1% interrupt, 99.9% idle
Mem: 24M Active, 3645M Inact, 729M Wired, 437M Buf, 3462M Free
Swap: 8192M Total, 8192M Free
Quote from: Animosity022 on July 22, 2018, 03:25:52 PM
Your system looks idle based on what you've shared.
Thank you for your answer.
Do you maybe know what causes the drops to the ISP gateway?
More info would help, such as what type of connection you have, e.g. a bridged modem.
As an example, I have a dual-WAN setup with two DSL modems in bridge mode; one goes to the incumbent carrier and the second to a reseller of the incumbent carrier's service. The incumbent uses ADSL2+ whereas the reseller is G.dmt, both fast path. The incumbent line does exactly what you describe, and when I log into that modem (I documented how to set up rules to do this in the tutorials), I see the SNR margin has dropped considerably. This never lasts for more than a minute, and it only happens several times a week. The point is that this is in no way the fault of OPNsense. Your system appears to be happily gliding along as far as load goes.
Could perhaps the dot-zero IP-address (66.88.99.0) be the culprit?
https://labs.ripe.net/Members/stephane_bortzmeyer/all-ip-addresses-are-equal-dot-zero-addresses-are-less-equal
Usually a dot-zero address represents an entire subnet, e.g. 66.88.99.0/length
Quote from: Davesworld on July 22, 2018, 11:00:45 PM
Usually a dot-zero address represents an entire subnet, e.g. 66.88.99.0/length
Only if the subnet size is /24 or smaller.
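A quick way to check that for this particular address is Python's standard ipaddress module. The sketch below is only illustrative: the ISP's real prefix length is not known from this thread, so the lengths tried are assumptions, and is_network_address is a hypothetical helper name.

```python
import ipaddress


def is_network_address(addr, prefix_len):
    """True if addr is the network address (all host bits zero) of the
    subnet with the given prefix length that contains it."""
    # strict=False lets ip_network() mask off any host bits for us.
    network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
    return ipaddress.ip_address(addr) == network.network_address


if __name__ == "__main__":
    # Prefix lengths chosen for illustration; the real one is unknown.
    for prefix_len in (22, 23, 24, 25):
        verdict = is_network_address("66.88.99.0", prefix_len)
        print(f"66.88.99.0/{prefix_len}: network address = {verdict}")
```

With a /24 or /25 the address 66.88.99.0 is the network address itself. With a /23 or /22 the containing network starts at 66.88.98.0 or 66.88.96.0, so 66.88.99.0 is an ordinary host address inside it.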
Thank you guys, we got this fixed.
It was an ISP issue and they have fixed it already.
Thank you for your support.
Sorry for kicking up an old thread, but I was wondering if the ISP told you what the problem was? I'm having these same symptoms and I can't seem to pin them down.
Thanks,