As the title says, I have to reboot my firewall every ~7 days because the OOM killer kicks in and kills all processes, leaving only the kernel and nothing else (SSH and the web interface are killed too).
OPNsense 22.1.2_1-amd64
FreeBSD 13.0-STABLE
OpenSSL 1.1.1m 14 Dec 2021
Please find attached the results of these commands, executed at an uptime of 30h:
df -h
vmstat -z | tail +3 | awk -F '[:,] *' 'BEGIN { total=0; cache=0; used=0 } {u = $2 * $4; c = $2 * $5; t = u + c; cache += c; used += u; total += t; name=$1; gsub(" ", "_", name); print t, name, u, c} END { print total, "TOTAL", used, cache } ' | sort -n | perl -a -p -e 'while (($j, $_) = each(@F)) { 1 while s/^(-?\d+)(\d{3})/$1,$2/; print $_, " "} print "\n"' | column -t
vmstat -o | sort -nr | head -n 3000
vmstat -m | sort -rk3
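As an aside for anyone reading along: the awk/perl one-liner above just multiplies each zone's item size by its used and free counts and totals them. Here is a rough Python equivalent (my own illustrative sketch, not from the thread; the sample data below is made up):

```python
import re

def zone_totals(vmstat_z_output):
    """Per-zone memory from `vmstat -z`: used = size * used-count,
    cache = size * free-count, mirroring the awk one-liner."""
    rows, total_used, total_cache = [], 0, 0
    for line in vmstat_z_output.splitlines():
        fields = re.split(r'[:,]\s*', line.strip())
        if len(fields) < 5:
            continue  # header or blank line
        try:
            size, used_cnt, free_cnt = int(fields[1]), int(fields[3]), int(fields[4])
        except ValueError:
            continue  # non-numeric fields, e.g. a stray header
        name = fields[0].replace(' ', '_')
        used, cache = size * used_cnt, size * free_cnt
        rows.append((used + cache, name, used, cache))
        total_used += used
        total_cache += cache
    rows.sort()
    rows.append((total_used + total_cache, 'TOTAL', total_used, total_cache))
    return rows

# Made-up sample in the FreeBSD `vmstat -z` layout
sample = """ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
UMA Kegs:               384,      0,     230,      10,     230,   0,   0
nvlist:                  64,      0,   22030,     100, 66258359,   0,   0"""
for total, name, used, cache in zone_totals(sample):
    print(total, name, used, cache)
```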
more attachments in the next posts
vmstat -o | sort -nr | head -n 3000
vmstat -z | tail +3 | awk -F '[:,] *' 'BEGIN { total=0; cache=0; used=0 } {u = $2 * $4; c = $2 * $5; t = u + c; cache += c; used += u; total += t; name=$1; gsub(" ", "_", name); print t, name, u, c} END { print total, "TOTAL", used, cache } ' | sort -n | perl -a -p -e 'while (($j, $_) = each(@F)) { 1 while s/^(-?\d+)(\d{3})/$1,$2/; print $_, " "} print "\n"' | column -t
df -h
Some screenshots over time of
top -o size
first one is just after boot
more top
more top
more top
So let's start with something directly relevant... you have /var MFS enabled, yes?
Cheers,
Franco
I already posted the result of `df -h`
As a second confirmation, please find attached a screenshot of the web GUI.
here's the output of `mount`
/dev/gpt/rootfs on / (ufs, local, noatime, soft-updates)
devfs on /dev (devfs)
devfs on /var/dhcpd/dev (devfs)
devfs on /var/unbound/dev (devfs)
here's the memory load after 53h uptime
last pid: 63460; load averages: 0.28, 0.39, 0.44 up 2+04:54:04 11:49:44
53 processes: 1 running, 52 sleeping
CPU: 1.1% user, 0.0% nice, 16.3% system, 0.0% interrupt, 82.6% idle
Mem: 39M Active, 989M Inact, 458M Wired, 249M Buf, 2465M Free
Sorry, I did not look at the attached files / only searched with the browser's find function.
At least it's not /var/log overflowing main memory if /var MFS is off.
What's your configuration? Which heavy tasks do you have enabled? OpenVPN, IPsec, Intrusion Detection (with or without IPS?), Unbound blocklists, Web Proxy running, additional plugins...
Cheers,
Franco
Firewall is used just as Wireguard VPN concentrator, so:
OpenVPN: No
IPsec: No
Intrusion Detection (with or without IPS?): No and no
Unbound blocklists: No
Web Proxy running: No
additional plugins...
os-dyndns (installed) 1.27_3 179KiB OPNsense Dynamic DNS Support
os-iperf (installed) 1.0_1 24.6KiB OPNsense Connection speed tester
os-vmware (installed) 1.5_1 610B OPNsense VMware tools
os-wireguard (installed) 1.10 47.1KiB OPNsense WireGuard VPN service
os-zabbix-agent (installed) 1.11 50.1KiB OPNsense Zabbix monitoring agent
Only os-wireguard is truly essential; in order from most to least important:
wireguard
zabbix-agent
dyndns
iperf3
vmware
top after ~54h
last pid: 35841; load averages: 0.31, 0.47, 0.41 up 2+06:21:38 13:17:18
54 processes: 1 running, 53 sleeping
CPU: 0.0% user, 0.0% nice, 15.8% system, 0.0% interrupt, 84.2% idle
Mem: 80M Active, 998M Inact, 466M Wired, 256M Buf, 2406M Free
wireguard-kmod installed?
Cheers,
Franco
Yes, wireguard-kmod 0.0.20211105 is installed, but I find it hard to believe it is the cause of the problem.
I've been facing this problem since early 2021.
The latest change I made to the system was switching from the userland wireguard-go implementation to the kernel driver, which happened at the beginning of March.
While this greatly improved CPU usage, it didn't solve the problem: I used to have OOM problems and I still have them.
Please find attached a zabbix graph before and after switching the wireguard implementation.
G.C.
steady rise continues
last pid: 42872; load averages: 0.59, 0.46, 0.45 up 2+07:23:49 14:19:29
54 processes: 1 running, 53 sleeping
CPU: 0.2% user, 0.0% nice, 10.1% system, 0.0% interrupt, 89.7% idle
Mem: 49M Active, 992M Inact, 476M Wired, 263M Buf, 2432M Free
Output of `pkg info`.
EDIT: `top -o size` after 3 days 6 hours
Could you please write the text directly into your postings instead of attachments and screenshots? That would be way more convenient for everybody trying to read what you write.
There are code tags for that:
This is pasted code
or command output.
Thanks,
Patrick
Roger that. I was just trying to collect data manually even while on the go, so I used screenshots. I'll try to stick to text data.
Now I've wrapped up a shell script to collect `vmstat -m` over time; hopefully I'll get some good data for further analysis. Please redirect me if there's a better way to debug this.
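The poster's shell script isn't shown; a minimal collector along the same lines could look like the following Python sketch (my reconstruction, not the actual script; it assumes the usual FreeBSD `vmstat -m` columns and skips malloc type names that contain spaces):

```python
import datetime
import subprocess
import time

def parse_vmstat_m(text):
    """Map malloc type -> MemUse in KB from `vmstat -m` output.

    Assumes the FreeBSD column layout `Type InUse MemUse Requests Size(s)`,
    with MemUse printed as e.g. `2979K`. Type names containing spaces
    misalign the columns and are simply skipped in this sketch.
    """
    usage = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[2].endswith('K') and parts[2][:-1].isdigit():
            usage[parts[0]] = int(parts[2][:-1])
    return usage

def collect_forever(path='/tmp/vmstat-m.csv', interval=60):
    """Append one timestamped CSV row per malloc type every `interval` seconds.

    The path and interval are arbitrary choices for illustration.
    """
    while True:
        out = subprocess.check_output(['vmstat', '-m'], text=True)
        stamp = datetime.datetime.now().isoformat(timespec='seconds')
        with open(path, 'a') as f:
            for mtype, kb in parse_vmstat_m(out).items():
                f.write(f'{stamp},{mtype},{kb}\n')
        time.sleep(interval)
```

Running `collect_forever()` on the firewall would accumulate a CSV that is easy to plot later.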
In the meantime, here's latest top -o size
last pid: 36860; load averages: 0.44, 0.42, 0.46 up 3+07:50:54 14:46:34
53 processes: 1 running, 52 sleeping
CPU: 0.2% user, 0.0% nice, 13.1% system, 0.0% interrupt, 86.7% idle
Mem: 51M Active, 1041M Inact, 547M Wired, 331M Buf, 2314M Free
Just a guess: is your OPNsense UI accessible from the Internet? Possibly someone is probing the UI for bugs, causing php-cgi processes with high memory consumption to pile up?
Nope, the only port reachable from the Internet is the VPN one, and tcpdump confirms that there's no unexpected traffic.
I'd expect peaks in memory, but all I see is a continuous slow leak
last pid: 70854; load averages: 0.29, 0.46, 0.48 up 3+12:26:21 19:22:01
57 processes: 3 running, 54 sleeping
CPU: 13.4% user, 0.0% nice, 33.3% system, 0.0% interrupt, 53.3% idle
Mem: 65M Active, 1066M Inact, 558M Wired, 340M Buf, 2265M Free
Being curious about your issue... Can you post the output of
vmstat -z | tr ':' ',' | sort -nk4 -t','
We see that your wired memory is increasing slowly over time; does it stabilize at some point, or does it really end up consuming both free and inactive? Mine is stable around 800M. It varies between 8 and 12% of total memory (8G in my case, so 800M is around 10%). Do you know how big the numbers were right before an OOM (including free and inactive)?
We also see your free memory being converted into inactive memory. This does not seem abnormal at first glance; inactive memory can be reused if need be.
You should not be running into an OOM with those numbers, unless there is something else eating up the memory, or something trying to allocate very rapidly right before the OOM that we do not see here.
Also, there is indeed a shift in memory from free to inactive that was less present in pre-22 versions, but I never ran into an OOM because of it; then again, I have 8GB of RAM, not 4GB, so I'm likely less prone to OOM.
What does your Reporting/Health look like (if you have it)?
Attached is mine (System/Memory from Reporting/Health) for the last 77 days (inverse turned on, resolution high), each peak is usually a reboot or an upgrade/reboot.
Quote from: RedVortex on May 17, 2022, 11:40:54 PM
Being curious about your issue... Can you post the output of
vmstat -z | tr ':' ',' | sort -nk4 -t','
Thanks for stopping by on this issue. Here's the output: https://termbin.com/0uzj
Quote from: RedVortex
We see that your wired memory is increasing slowly over time, does it stabilize at some point or really ends up consuming both free and inactive ? Mine is stable around 800M. It varies between 8 to 12% of the total memory 8G in my case so 800M is around 10%). Do you know how big the numbers were right before an OOM (including free and inactive) ?
It doesn't stabilize over time; it keeps growing until the OOM killer kills all processes, leaving only the kernel alive. Before switching to the WireGuard kmod it was killing wireguard-go too, and thus all VPN connections, while now it leaves the VPN alive but no other services available.
I'm not sure about the numbers before OOM; my external monitor (Zabbix) records "Available memory is defined as free+cached+buffers memory", and the last OOM happened when this value was 1.5GB, so that doesn't really explain anything.
Quote from: RedVortex
We also see your free being converted into inactive memory. This does not seem abnormal at first glance... This inactive memory can be reused if need be.
This can be explained by my manually starting and killing an iperf3 server in a tmux terminal. It consumes a lot of memory and releases it afterwards. This is not the cause of the OOM, as I am aware of this behaviour and the problem happens even at night when no iperf3 is running and the admin is sleeping :'( EDIT: I can't really be sure of this; I just retried and saw no massive spike in memory usage.
Quote from: RedVortex
You should not be running into an OOM with those numbers unless there is something else eating up the memory or trying to allocate something very rapidly right before the OOM that we do not see here that would explain the OOM.
There's still a possibility that the problem is caused not by a steady rise but by a massive spike in memory usage, but so far the steady rise of wired memory seems to be the only player in the field.
I've been collecting the output of `vmstat -m` by the minute over the last hours. Please find attached the resulting plot of the "MemUse" column, with the constant features filtered out.
I can share the python code that generates this if required.
Just a question: I was told to monitor `vmstat -m`, but you're suggesting to monitor `vmstat -z` instead. Which one should I use? Thanks
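The "filtering out the constant features" step is simple enough to sketch (my reconstruction, not the poster's actual Python):

```python
def changing_series(history):
    """Keep only the malloc types whose MemUse actually varies over time.

    `history` maps type name -> list of MemUse samples (KB), one per minute.
    """
    return {name: vals for name, vals in history.items()
            if len(set(vals)) > 1}

def plot(history):
    """Plot only the varying series; matplotlib usage is one possible choice."""
    import matplotlib.pyplot as plt
    for name, vals in changing_series(history).items():
        plt.plot(vals, label=name)
    plt.legend()
    plt.ylabel('MemUse (K)')
    plt.show()
```

With per-minute samples, flat series like `devbuf` drop out and a leaking type such as `nvlist` stands out immediately.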
Quote from: RedVortex on May 18, 2022, 12:11:13 AM
Also, there is indeed a shift in memory from free to inactive that was less present in pre-22 version but I never ran into an OOM because of it, then again I have 8GB of ram, not 4GB so I'm likely less prone to OOM.
What does your Reporting/Health looks like (if you have it) ?
Attached is mine (System/Memory from Reporting/Health) for the last 77 days (inverse turned on, resolution high), each peak is usually a reboot or an upgrade/reboot.
I only re-enabled the monitoring service recently, as I tried to disable non-essential services before diving into the problem in detail, but I can share the memory report for the last 60h.
It seems that, among all the `vmstat -m` variables, nvlist is the only one showing almost linear growth.
Does it mean anything to you?
I feel like there's not enough traction for problems like kernel leaks and out-of-memory :-\
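One simple way to quantify an "almost linear" growth like this (my own suggestion, not something used in the thread) is a least-squares fit over the per-minute samples; a leak shows up as a clearly positive slope:

```python
def linear_fit(samples):
    """Least-squares slope and intercept for y over x = 0..n-1 (no numpy needed)."""
    n = len(samples)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up MemUse readings (KB), one per sample interval
samples = [100, 112, 119, 131, 140]
slope, intercept = linear_fit(samples)
print(f'growth of about {slope:.1f} K per sample')
```

A near-constant positive slope across the whole capture is exactly the "almost linear" pattern the nvlist plot showed.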
I'm wondering about something in regards to your nvlist.
Could you stop the Zabbix monitoring and agent, make sure it doesn't run anymore, even locally, and see if it still leaks?
I wonder if something is grabbing a list of states or something similar, and there might be a leak in there somewhere; nvlist could be used for something like that.
Something along those lines: https://bugs.freebsd.org/bugzilla//show_bug.cgi?id=255971
Before shutting down zabbix-agentd, I'll attach an updated chart displaying the linear growth of nvlist up to now.
You nailed it! After stopping zabbix-agentd, the wired memory consumption stopped growing and the nvlist chart went flat. No memory was released, though: it's still at 611MB even after the process was killed.
It's still a bug though... Hopefully they'll fix it kernel-side somewhere in the next FreeBSD release. That was fun to troubleshoot though, not something we casually dig into every day 😊
Sure, this is not easy to spot at all.
Not only does top not show it linked to zabbix-agentd, but nvlist is hard to inspect.
Do you know how I can open an issue on this? Is this a FreeBSD thing, or a Zabbix thing?
Hmmm, I'm not sure where I would open the bug, to be honest. This definitely looks like a FreeBSD bug; Zabbix only "uses" something provided by the OS/kernel that "should" work properly from what we know. Unless it is Zabbix itself that leaks this in some way in its code, but it seems to be kernel-related.
If I were to open a bug, I would try to reproduce it in a simple fashion, like a command-line script or something easily reproducible, similar to what they did here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=255971, ideally without using Zabbix, to make sure it's not Zabbix itself that somehow leaks this nvlist thing (I wouldn't know how)...
And then open a bug with FreeBSD and/or OPNsense. What they fixed in the above bug may not have made its way yet to the FreeBSD version we are running in OPNsense. I'm not sure how to determine whether the version we have contains that bugfix, or even whether it is really relevant to the bug you found here, but it certainly looks very similar to yours. Since they discovered this bad behaviour in 13.0-STABLE, and that's what we're running in the latest OPNsense, maybe we don't have the bugfix yet; I'm not sure how to determine this 100%.
That being said, I would most likely start by opening a bug with OPNsense, considering this ends up crashing the whole thing while you're not doing anything "funky" and you're using released versions of both the Zabbix agent and OPNsense, so you're probably not alone out there with this issue. I would put a link to this forum post in the ticket, fully explain what triggers the bug (running the Zabbix agent) and what happens (nvlist leak leading to OOM and crash), and probably also link the FreeBSD bug, asking if it may be related in some way.
If you know which metric in the Zabbix monitoring triggers this leak, that would be helpful; this way they could easily replicate, troubleshoot and fix it. On your side, you could also exclude this metric from the agent for now and monitor everything else, instead of nothing at all.
Hopefully this is already fixed by the previous bug and we just need an updated version of FreeBSD, or at least a patched one.
Interestingly, I just tried this (similar to what is in the FreeBSD bug ticket):
pfctl -sa
And wrapping it in a small script:
echo "before: " ; vmstat -m | grep "InUse\|nvlist" ; pfctl -sa > /dev/null ; echo "after:" ; vmstat -m | grep "InUse\|nvlist"
And that seems to leak. I'm not pushing it to see how far it will go, but it does not seem to release the InUse and MemUse so far.
echo "before: " ; vmstat -m | grep "InUse\|nvlist" ; pfctl -sa > /dev/null ; echo "after:" ; vmstat -m | grep "InUse\|nvlist"
before:
Type InUse MemUse Requests Size(s)
nvlist 22030 2979K 66258359 16,32,64,128,256,2048,4096,8192
after:
Type InUse MemUse Requests Size(s)
nvlist 22036 2980K 66975304 16,32,64,128,256,2048,4096,8192
This command also seems to increase InUse, and it never goes down with FreeBSD 13.1 (testing the OPNsense 22.7 kernel right now), so I'm not sure the leak triggered by the Zabbix agent monitoring will be fixed by a more recent kernel or OS release, unfortunately.
You should probably open a bug with OPNsense in case it's something they can or need to fix on their side for Zabbix, and/or something regarding nvlist, if you need this working.
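To put a number on the leak per `pfctl -sa` run, one can diff the nvlist line between two `vmstat -m` snapshots like the ones above (an illustrative helper, not from the thread):

```python
def nvlist_delta(before, after):
    """Growth of the nvlist InUse count and MemUse (KB) between two
    `vmstat -m` snapshots, each given as raw command output."""
    def grab(text):
        for line in text.splitlines():
            parts = line.split()
            if parts and parts[0] == 'nvlist':
                return int(parts[1]), int(parts[2].rstrip('K'))
        raise ValueError('no nvlist line found')
    (i0, m0), (i1, m1) = grab(before), grab(after)
    return i1 - i0, m1 - m0

# The snapshots quoted earlier in the thread
before = ("Type InUse MemUse Requests Size(s)\n"
          "nvlist 22030 2979K 66258359 16,32,64,128,256,2048,4096,8192\n")
after = ("Type InUse MemUse Requests Size(s)\n"
         "nvlist 22036 2980K 66975304 16,32,64,128,256,2048,4096,8192\n")
print(nvlist_delta(before, after))  # → (6, 1)
```

Six allocations and about 1K leaked for a single `pfctl -sa`, which adds up quickly when an agent polls every few seconds.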
It's rather strange... the fixes are already in 22.1 which means there are more leaks in the ioctl code.
Cheers,
Franco
The leak seems to be created by the "pfctl -s info" diagnostics only, which are gathered by a number of things in the core system. I'm trying to see if this is an easy patch.
Cheers,
Franco
https://github.com/opnsense/src/commit/4d3b9e4a34
# echo "before: " ; vmstat -m | grep "InUse\|nvlist" ; pfctl -sa > /dev/null ; echo "after:" ; vmstat -m | grep "InUse\|nvlist"
before:
Type InUse MemUse Requests Size(s)
nvlist 0 0K 200866 16,32,64,128,256,2048,4096,8192
after:
Type InUse MemUse Requests Size(s)
nvlist 0 0K 320016 16,32,64,128,256,2048,4096,8192
Good enough for now? :)
For anyone willing to try:
# opnsense-update -kzr 22.1.8-nvlist2
# opnsense-shell reboot
Cheers,
Franco
After more digging, https://github.com/opnsense/src/commit/79e0c974ae85 seems to have been on the FreeBSD main branch only since April, which is a bit unfortunate. I thought it would fix all cases, but there is another leak in the code, which was merged into main via https://reviews.freebsd.org/D35385 yesterday.
We'll most likely be offering both patches in 22.1.9 to avoid further issues with this particular memory leak, as the core system regularly polls pfctl as well: not as fast as Zabbix, but still problematic.
Cheers,
Franco
Thanks a lot for all the patches and digging, Franco!
I can't test the patches myself since I'm on the test release for FreeBSD 13.1 (22.7...); hopefully arkanoid can test them, especially with Zabbix, since it seemed to leak a lot, maybe from places other than the ones we found...
I'm very glad you could nail it down; memory leaks can be very tricky to figure out sometimes. It was another fun bug hunt :)