Weird, very high memory utilisation with v25

Started by sunbeam60, February 20, 2025, 01:42:27 PM

February 20, 2025, 01:42:27 PM Last Edit: February 20, 2025, 03:39:20 PM by sunbeam60 Reason: Typo in headline
Since updating to 25.x I'm seeing very different memory utilisation behaviour. I only noticed this after the router stopped routing traffic late last night. A reboot brought memory usage back down to normal levels, but it started creeping upwards again.

I've updated to 25.1.1 today, but not seeing a difference in the behaviour.

Rather than try to describe it, it's easier looking at this set of images: https://imgur.com/a/opn-25-x-memory-utilisation-pattern-strange-minFixB

I am wondering if unbound could be the culprit as it seems to have very high memory utilisation (see last image in the imgur post). I'm wondering if there's new cache behaviour in DNS lookups - i.e. the memory pattern grows as DNS requests are cached.
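One way to test the unbound theory is to look at how much memory its caches report rather than guessing from the process size. The sketch below is only an illustration: it assumes statistics in "key=value" form, one per line, with cache sizes under keys starting with "mem.cache." (as I understand the output of "unbound-control stats_noreset" with extended statistics enabled to look). The sample values are made up, and parse_stats and cache_bytes are hypothetical helper names.

```python
def parse_stats(text):
    """Parse "key=value" lines into a dict of floats, skipping anything else."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # not a key=value line
        try:
            stats[key] = float(value)
        except ValueError:
            continue  # non-numeric value, ignore
    return stats


def cache_bytes(stats):
    """Sum every counter whose key starts with "mem.cache." (assumed naming)."""
    return sum(v for k, v in stats.items() if k.startswith("mem.cache."))


# Made-up sample, not taken from any real router.
SAMPLE_STATS = """\
total.num.queries=42
mem.cache.rrset=1000
mem.cache.message=500
"""

if __name__ == "__main__":
    print(cache_bytes(parse_stats(SAMPLE_STATS)), "bytes in caches")
```

If the cache total stays small while overall memory keeps climbing, the DNS cache is probably not the cause.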

Am I the only one seeing this on the 25.x series?

I would say that from your process list, there seem to be a boatload of configd.py processes with a "console" parameter in "wait" state. I have only one of those processes running. So it seems that there is some process that hangs and gets restarted without the old process actually stopping.
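To put a number on "a boatload", the process list can be counted instead of eyeballed. This is a minimal sketch, assuming input lines of the form "<pid> <state> <command...>" such as "ps -axo pid,state,command" would print. The count_processes helper and the sample process list are made up for illustration.

```python
from collections import Counter


def count_processes(ps_output, needle):
    """Count processes whose command line contains `needle`, grouped by state.

    Assumes each data line looks like "<pid> <state> <command...>".
    """
    counts = Counter()
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not parts[0].isdigit():
            continue  # skip the header and malformed lines
        _pid, state, command = parts
        if needle in command:
            counts[state] += 1
    return counts


# Made-up sample process list for illustration only.
SAMPLE_PS = """\
  PID STAT COMMAND
  101 Ss   /usr/local/bin/python3 /usr/local/opnsense/service/configd.py
  202 I    /usr/local/bin/python3 /usr/local/opnsense/service/configd.py console
  203 I    /usr/local/bin/python3 /usr/local/opnsense/service/configd.py console
  300 S    /usr/local/sbin/unbound -c /var/unbound/unbound.conf
"""

if __name__ == "__main__":
    print(count_processes(SAMPLE_PS, "configd.py console"))
```

A count that keeps rising between runs would support the idea that old processes are not exiting before new ones are started.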

Maybe you should look at the log files to see what goes wrong...

Potentially, there are corrupt RRD or Netflow databases that cause processes to run away. You could try resetting these.
Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 770 up, Bufferbloat A

Logs seem full of these errors:
error in configd communication
Traceback (most recent call last):
  File "/usr/local/sbin/configctl", line 65, in exec_config_cmd
    line = sock.recv(65536).decode()
           ^^^^^^^^^^^^^^^^
ConnectionResetError: [Errno 54] Connection reset by peer
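When a log is "full of" one error it helps to know whether it is the only one. The sketch below tallies exception names in a chunk of log text. It assumes Python-style tracebacks whose final line starts with a name ending in "Error" or "Exception" followed by a colon. The tally_exceptions helper is hypothetical, and the second error in the sample is invented for contrast.

```python
import re
from collections import Counter

# Matches an exception name followed by a colon, e.g.
# "ConnectionResetError: [Errno 54] Connection reset by peer".
EXC_RE = re.compile(r"\b([A-Z][A-Za-z]*(?:Error|Exception)):")


def tally_exceptions(log_text):
    """Return a Counter of exception names found in the log text."""
    return Counter(EXC_RE.findall(log_text))


# Sample log text: the first two entries mirror the error quoted above,
# the third is made up.
SAMPLE_LOG = """\
error in configd communication Traceback (most recent call last): ConnectionResetError: [Errno 54] Connection reset by peer
error in configd communication Traceback (most recent call last): ConnectionResetError: [Errno 54] Connection reset by peer
error in configd communication Traceback (most recent call last): BrokenPipeError: [Errno 32] Broken pipe
"""

if __name__ == "__main__":
    for name, count in tally_exceptions(SAMPLE_LOG).most_common():
        print(name, count)
```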

Since 25.x I've had frequent errors in the WebGUI, so it's clearly all pointing to one place. Not sure what to do to resolve it, though, other than a fresh install, which I'd be loath to do.


Hmm, sort of solved.

I noticed a LOT of "/usr/local/bin/php /usr/local/etc/rc.newwanipv6 pppoe0 force" processes in the activity (top) list.

I don't have an IPv6 address from my ISP (Zen UK) - one can be requested, but I never did.

Disabling IPv6 on the WAN interface completely (which I believe was auto-enabled on upgrading to 25.x - I certainly don't remember ever enabling it) removed all of these and the memory is no longer growing.
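To confirm that memory has really stopped growing, rather than just growing more slowly, a trend can be fitted to a few readings. This is a rough sketch, assuming you have noted (seconds since boot, used MB) pairs by hand or from a monitoring graph. The growth_rate helper and both sample series are made up.

```python
def growth_rate(samples):
    """Least-squares slope of (seconds, used_mb) samples, in MB per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one point in time")
    return num / den * 3600  # convert MB per second to MB per hour


# Made-up readings: one series that climbs, one that is flat.
LEAKING = [(0, 1000), (3600, 1100), (7200, 1200)]
STABLE = [(0, 1000), (3600, 1000), (7200, 1000)]

if __name__ == "__main__":
    print("leaking:", growth_rate(LEAKING), "MB/h")
    print("stable:", growth_rate(STABLE), "MB/h")
```

A slope near zero over several hours is a better sign than a single low reading shortly after a reboot.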


I am also seeing this since updating from 24.7.12 to 25.1.1. Memory usage jumps to 96-99% within a couple of minutes of boot. Looking at top -o size, I can see this: imgur

Logs seem to be full of errors relating to various processes failing or being killed too. See extract from log here: pastebin

Any and all help greatly appreciated!


Small update on this. I ended up reinstalling from scratch and restoring my config. That dealt with the immediate memory leak, but I still saw usage steadily increasing over time until - as sunbeam60 mentioned - I disabled IPv6 on my WAN interface (I am also with Zen in the UK). Memory usage is now holding and stable.

I have an image of the problematic install, if useful.

There's no problematic install; reinstallation was not needed here. There's a PPPoE / IPv6 issue that is apparent on the Zen UK network, with a known mitigation.

If Franco doesn't see this thread by Monday, it is best to open an issue on GitHub (opnsense/core) to see what the path forward is, or whether it stays as is with the known mitigation until Zen UK changes things on their end.

I'm not sure that's completely right. I tried disabling DHCPv6 on WAN before reinstalling, and it didn't solve the RAM utilisation issue.

I think there are at least two issues at play here. One is the Zen issue, which is resolved by disabling DHCPv6 on WAN. But I think there is also another issue causing the box to chew through all of its RAM within 30 seconds or so of boot in some circumstances.

Quote from: hedders on February 22, 2025, 07:04:20 AM
Logs seem to be full of errors relating to various processes failing or being killed too. See extract from log here: pastebin

I saw a million configd exits too. Definitely the same issue, but what is the chicken and what is the egg I shan't say.

Quote from: newsense on February 22, 2025, 04:08:03 PM
There's a pppoe / IPv6 issue that is apparent on Zen UK network with a known mitigation.

When you say "there's a pppoe / IPv6 issue apparent on Zen UK" is this something that's been acknowledged elsewhere or are you speculating? I did try to search this forum before posting my original post on this, but all I could find was this: https://docs.opnsense.org/manual/how-tos/IPv6_ZenUK.html

While this certainly talks about how to set up IPv6 for Zen UK in the best way, it didn't directly acknowledge a problem, so I'm wondering what I missed.

Quote from: hedders on February 22, 2025, 07:04:20 AM
Logs seem to be full of errors relating to various processes failing or being killed too. See extract from log here: pastebin


I don't think the pastebin indicates an issue; I would think it is more likely to be an artefact of the connectivity issue.

Quote from: sunbeam60 on February 20, 2025, 07:51:58 PM
Logs seem full of these errors:
error in configd communication
Traceback (most recent call last):
  File "/usr/local/sbin/configctl", line 65, in exec_config_cmd
    line = sock.recv(65536).decode()
           ^^^^^^^^^^^^^^^^
ConnectionResetError: [Errno 54] Connection reset by peer

This appears to be Zen resetting your connection for whatever reason