Hey hi hello,
I'm having an issue with the webgui that I could use some help debugging.
~~system~~
OPNsense 25.7.8-amd64
FreeBSD 14.3-RELEASE-p5
OpenSSL 3.0.18
I'm running this as a VM on Proxmox.
~~primary symptom~~
I can't access the webgui. I can restart all services, or just restart the webgui with configctl, but within a couple of minutes access times out again. I've been looking through tons of logs to see if I can make sense of what's happening.
Besides that, everything else works. I still have ssh access, and networking is working as expected. The issue is mostly transparent until I need to reconfigure something.
~~configd logs~~
The first big red flag is that when I tail the configd logs while access is down, I see a lot of invocations of "list shells", "list locales", and "Stream CPU stats", followed by "Script action terminated by other end". When I restart webgui, I get a big dump of termination logs, and the whole cycle repeats until I eventually lose access.
My gut says something is requesting these resources, not getting them, and re-requesting. A queue fills up, eventually terminations occur, and the whole system is locked waiting for something to resolve. I'm just not sure what could be requesting these resources while the dashboard isn't loaded.
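In case it helps anyone compare notes, here is a rough sketch of how those log messages could be tallied to see which one dominates. It only does substring matching on the phrases I quoted above and assumes nothing else about the configd log format. The function name tally_configd_messages is just something I made up for illustration.

```python
from collections import Counter

# Message fragments copied from the log lines quoted in this post.
PATTERNS = (
    "list shells",
    "list locales",
    "Stream CPU stats",
    "Script action terminated by other end",
)


def tally_configd_messages(lines):
    """Count how many lines contain each fragment (case-insensitive).

    `lines` is any iterable of strings, e.g. an open log file.
    No assumption is made about timestamps or other fields.
    """
    counts = Counter({pattern: 0 for pattern in PATTERNS})
    for line in lines:
        lowered = line.lower()
        for pattern in PATTERNS:
            if pattern.lower() in lowered:
                counts[pattern] += 1
    return counts
```

Passing it an open handle on whichever configd log file is being tailed, once while the GUI works and once after it locks up, should show whether the termination messages track the number of requests or pile up on their own.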
~~other notes~~
sockstat | grep configd shows at least 60 open connections from python to a slightly smaller number of individual php-cgi processes. lighttpd shows a single process holding somewhere in the 700s of connections, mostly pointing to my local IP and a debian host. I don't think I have 700 active connections. I also have north of 1000 php sessions listed in /var/lib/php/sessions.
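For reference, this is roughly how the counts above could be reproduced from raw sockstat output. It is only a sketch and assumes the usual column order (USER, COMMAND, PID, FD, PROTO, LOCAL ADDRESS, FOREIGN ADDRESS). The helper names sockets_per_process and peers_for_command are hypothetical, not anything that ships with OPNsense.

```python
from collections import Counter


def sockets_per_process(sockstat_text):
    """Count socket rows per (command, pid) pair."""
    counts = Counter()
    for line in sockstat_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and rows too short to parse.
        if len(fields) < 5 or fields[0] == "USER":
            continue
        command, pid = fields[1], fields[2]
        counts[(command, pid)] += 1
    return counts


def peers_for_command(sockstat_text, command):
    """Count remote hosts across one command's TCP sockets."""
    peers = Counter()
    for line in sockstat_text.splitlines():
        fields = line.split()
        if len(fields) < 7 or fields[1] != command:
            continue
        if not fields[4].startswith("tcp"):
            continue
        # Assumed foreign address shape is host:port; drop the port.
        host = fields[6].rsplit(":", 1)[0]
        peers[host] += 1
    return peers
```

The text can come from subprocess.run(["sockstat"], capture_output=True, text=True).stdout. Calling peers_for_command(text, "lighttpd").most_common(5) should then show which hosts account for most of those 700-odd connections.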
I think I've exhausted my current skillset for debugging and could use some guidance on where to probe from here.
Thanks!