Show posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.

Messages - arkanoid

#1
Thanks for the answer, you're right.

I'm giving up on the idea of serving the web config internally on port 443.

thanks
#2
Hello

I want the OPNsense web config served on the LAN and WireGuard interfaces, leaving WAN for HAProxy. But if a WireGuard interface is selected as a listening interface for HTTPS in "System: Settings: Administration", it is ignored at boot; it only works when applied manually after boot.

"sockstat -4 -l" shows that lighttpd is NOT listening on the WireGuard interface after boot.

So I have a configuration that works before rebooting, and works again after boot if and only if I reapply the very same listening interface preferences.

How can I fix this?
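Until the boot-order issue is fixed, a workaround can be sketched in shell (all names here are assumptions: lighttpd is the server OPNsense uses for the GUI, 10.2.0.1 is an example tunnel address, and `configctl webgui restart` is the usual way to re-apply the listeners):

```shell
#!/bin/sh
# Check whether the GUI server is bound to the expected address, based on
# `sockstat -4 -l` output (columns: USER COMMAND PID FD PROTO LOCAL FOREIGN).
gui_listening_on() {
    # $1 = sockstat output, $2 = expected listen address
    printf '%s\n' "$1" | awk -v addr="$2" \
        '$2 == "lighttpd" && index($0, addr ":") { found = 1 } END { exit !found }'
}

# Intended use from a boot hook (the hook path is an assumption, e.g. a
# script under /usr/local/etc/rc.syshook.d/start/):
#   gui_listening_on "$(sockstat -4 -l)" 10.2.0.1 || configctl webgui restart
```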

Thanks
#3
Thanks! I'll follow GitHub thread, then.
#4
The current WireGuard integration kills all existing connection states on the WireGuard network(s) as soon as "apply" is clicked in the web GUI, for example when adding or removing a peer on an existing network. This is quite disruptive.

WireGuard is capable of applying a new configuration to a running interface via the syncconf command; see:

https://man.freebsd.org/cgi/man.cgi?query=wg-quick&apropos=0&sektion=0&manpath=FreeBSD+12.2-RELEASE+and+Ports&arch=default&format=html


https://serverfault.com/questions/1101002/wireguard-client-addition-without-restart
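The pattern from the wg-quick man page above can be sketched as follows (wg0 is an example interface name; the process substitution needs a shell that supports it, such as bash):

```shell
# `wg-quick strip` removes wg-quick-only settings (Address, DNS, ...) so the
# remainder can be fed to `wg syncconf`, which only applies the differences
# and leaves existing sessions on unchanged peers alone.
wg syncconf wg0 <(wg-quick strip wg0)
```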
#5
I can confirm that this works, and you don't need to prepare or download anything locally. Every step can be accomplished using the cloud console.

Thanks a lot! Wiping /dev/sda while running an OS mounted on it is wild!
#6
Hello

Like the title says, it seems that OPNsense is trying to start zabbix_agentd before the interfaces it listens on are up.

I work around this by clicking the play button in the home dashboard after the firewall is up.

The question is: what's the proper way to fix this?
- listen on all local IPs but add a firewall rule (removes the need for specific binding completely)
- reorder service start or add preconditions (how?)
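The first option can be sketched as a config fragment (the path and the server address below are assumptions; check where your install keeps zabbix_agentd.conf):

```
# /usr/local/etc/zabbix/zabbix_agentd.conf (example path)
# Bind to the wildcard address so agent startup no longer races the
# interfaces; restrict who may talk to the agent with Server= plus a
# firewall rule instead of a specific binding.
ListenIP=0.0.0.0
Server=192.0.2.10
```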

Considering it's an OPNsense package and there's even a web GUI for it, I'd say it deserves a more general solution.

Thanks
#7
Sure, it's not easy at all to spot this.

Not only does top not show it as linked to zabbix_agentd, but nvlist is also hard to inspect.

Do you know how I can open an issue on this? Is this a FreeBSD thing or a Zabbix thing?
#8
You nailed it! After stopping zabbix_agentd, the wired memory consumption stopped growing and the nvlist chart went flat. No memory was released though: it's still at 611MB even after the process was killed.
#9
Before shutting down zabbix_agentd, I'll attach an updated chart displaying the linear growth of nvlist up to now.
#10
I feel like there's not enough traction for problems like kernel leaks and out-of-memory conditions  :-\
#11
It seems that among all the vmstat -m types, nvlist is the only one showing almost linear growth.

Does it mean anything to you?
#12
Quote from: RedVortex on May 18, 2022, 12:11:13 AM
Also, there is indeed a shift in memory from free to inactive that was less present in pre-22 version but I never ran into an OOM because of it, then again I have 8GB of ram, not 4GB so I'm likely less prone to OOM.

What does your Reporting/Health look like (if you have it)?

Attached is mine (System/Memory from Reporting/Health) for the last 77 days (inverse turned on, resolution high), each peak is usually a reboot or an upgrade/reboot.

I've only re-enabled the monitoring service lately, as I had disabled non-essential services before diving into the problem in detail, but I can share the memory report for the last 60 hours.
#13
Quote from: RedVortex on May 17, 2022, 11:40:54 PM
Being curious about your issue... Can you post the output of

vmstat -z | tr ':' ',' | sort -nk4 -t','

Thanks for stopping by on this issue. Here's the output: https://termbin.com/0uzj

Quote from: RedVortex
We see that your wired memory is increasing slowly over time, does it stabilize at some point or really ends up consuming both free and inactive? Mine is stable around 800M. It varies between 8 and 12% of the total memory (8G in my case, so 800M is around 10%). Do you know how big the numbers were right before an OOM (including free and inactive)?

It doesn't stabilize over time; it keeps going until the OOM killer kills all processes, leaving only the kernel alive. Before switching to the WireGuard kmod it was killing wireguard-go too, and with it all VPN connections; now it leaves the VPN alive but no other services available.

I'm not sure about the numbers before the OOM; my external monitor (Zabbix) records "Available memory is defined as free+cached+buffers memory", and the last OOM happened when this value was 1.5GB, so that doesn't really explain anything.

Quote from: RedVortex
We also see your free being converted into inactive memory. This does not seem abnormal at first glance... This inactive memory can be reused if need be.

This could be explained by my manually starting and killing an iperf3 server in a tmux terminal. It consumes a lot of memory and releases it afterwards. This is not the cause of the OOM, as I am aware of this behaviour and the problem happens even at night when no iperf3 is running and the admin is sleeping  :'(
EDIT: I can't really be sure of this; I just retried and saw no massive spike in memory usage.

Quote from: RedVortex
You should not be running into an OOM with those numbers unless there is something else eating up the memory or trying to allocate something very rapidly right before the OOM that we do not see here that would explain the OOM.

There's still a possibility that the problem is caused not by a steady rise but by a massive spike in memory usage, but so far it seems that the steady rise of wired memory is the only player in the field.


I've been collecting the output of `vmstat -m` every minute over the last few hours. Please find attached the resulting plot of the "MemUse" column, with the constant series filtered out.

I can share the python code that generates this if required.
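The per-type reduction behind such a plot can be sketched in shell (a sketch over sample data; the column layout assumed is FreeBSD's `vmstat -m` Type/InUse/MemUse/Requests table):

```shell
#!/bin/sh
# Reduce a `vmstat -m` snapshot to "type memuse" pairs (MemUse in K) so that
# successive snapshots can be diffed or plotted. Sample data stands in for
# live output here so the extraction is easy to verify.
sample='         Type InUse MemUse Requests  Size(s)
       nvlist 12345   611K  500000  16,32,64,128
         temp    10     1K     100  16'

# Skip the header, strip the trailing K, print "type memuse".
printf '%s\n' "$sample" | awk 'NR > 1 { sub(/K$/, "", $3); print $1, $3 }'
```

In practice you would pipe live `vmstat -m` output through the same awk filter instead of the sample text.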

Just a question: I've been told to monitor vmstat -m, but you're suggesting monitoring vmstat -z instead. Which one should I use? Thanks
#14
Nope, the only port reachable from the Internet is the VPN one, and tcpdump confirms that there's no unexpected traffic.

I'd expect peaks in memory, but all I see is a continuous slow leak:

last pid: 70854;  load averages:  0.29,  0.46,  0.48    up 3+12:26:21  19:22:01
57 processes:  3 running, 54 sleeping
CPU: 13.4% user,  0.0% nice, 33.3% system,  0.0% interrupt, 53.3% idle
Mem: 65M Active, 1066M Inact, 558M Wired, 340M Buf, 2265M Free
#15
Roger that. I was just trying to collect data manually, even while on the go, so I used screenshots. I'll stick with text data from now on.

Now I've wrapped up a shell script to collect

vmstat -m

over time; hopefully I'll get some good data for further analysis. Please redirect me if there's a better way to debug this.
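Such a collector can be sketched as follows (a sketch; the log path and the once-a-minute interval are examples):

```shell
#!/bin/sh
# Append timestamped `vmstat -m` snapshots to a log file, one per interval.
collect_vmstat() {
    log="$1" count="$2" interval="$3"
    i=0
    while [ "$i" -lt "$count" ]; do
        date -u '+%Y-%m-%dT%H:%M:%SZ' >> "$log"
        vmstat -m >> "$log" 2>/dev/null || true
        i=$((i + 1))
        sleep "$interval"
    done
}

# e.g. one snapshot per minute for a day:
#   collect_vmstat /var/log/vmstat-m.log 1440 60
```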

In the meantime, here's the latest top -o size output:

last pid: 36860;  load averages:  0.44,  0.42,  0.46  up 3+07:50:54  14:46:34
53 processes:  1 running, 52 sleeping
CPU:  0.2% user,  0.0% nice, 13.1% system,  0.0% interrupt, 86.7% idle
Mem: 51M Active, 1041M Inact, 547M Wired, 331M Buf, 2314M Free