webgui unavailable after upgrade 23.1.8

Started by frispete, May 31, 2023, 07:17:22 PM

May 31, 2023, 07:17:22 PM Last Edit: May 31, 2023, 09:35:24 PM by frispete
[version info corrected!]

Hello,

After today's upgrade to 23.1.8 (I'm not exactly sure from which version; the last update was around March), the login screen is displayed fine, but attempting to log in results in a 503.

Connecting via ssh, I can see the lighttpd process running. I also checked /var/etc/lighty-webConfigurator.conf and confirmed that it picked up my non-standard https port. Killing and restarting the lighttpd process brings the webgui back.

The effect is reproducible after reboot: Reboot msg -> Login -> 503 -> kill -> manual restart -> webgui.
Everything else seems to be working fine so far.

I'm probably too new to OPNsense to further debug this issue without a helping hand.

Just noticed that a 23.1.9 update is available.

Applied it and rebooted (to test whether the behaviour has changed).

Unfortunately not. Proper webgui is available after manual restart of lighttpd.
While at it, isn't the webgui a service? Right now, I start the webgui with:

/usr/local/sbin/lighttpd -f /var/etc/lighty-webConfigurator.conf

I guess it would be cleaner to restart the appropriate service, but nothing caught my eye in the output of service -l.
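The kill-and-restart step can be wrapped in a small shell function. This is only a sketch built around the lighttpd command line quoted above: the names restart_webgui, run and the DRY_RUN switch are made up for illustration, and it assumes that pkill -f finds the running process by its config path. On OPNsense, configctl webgui restart should be the service-level way to do the same thing, but treat that as an assumption to verify on your version.

```shell
#!/bin/sh
# Hypothetical helper: stop the web GUI lighttpd and start it again.
# The config path is the one quoted in the post above.
CONF="/var/etc/lighty-webConfigurator.conf"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

restart_webgui() {
    # pkill -f matches the pattern against the full command line
    # (assumed here to contain the config path).
    run pkill -f "lighttpd -f $CONF" || true
    # Give the old process a moment to release the listening port.
    run sleep 1
    run /usr/local/sbin/lighttpd -f "$CONF"
}
```

With DRY_RUN=1 set, restart_webgui only prints the three commands it would run, which is a safe way to check them before using it over ssh.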

The system is a pretty standard Intel(R) Celeron(R) N5105 box with four I225 NICs (WAN, LAN, DMZ in use), running OPNsense 23.1.9-amd64.

Hmm, this looks interesting (from /var/log/lighttpd/latest.log):

<27>1 2023-05-31T23:00:25+02:00 miller.lisa.loc lighttpd 65088 - [meta sequenceId="1"] (/usr/obj/usr/ports/www/lighttpd/work/lighttpd-1.4.71/src/server.c.1216) [note] graceful shutdown started
<27>1 2023-05-31T23:00:25+02:00 miller.lisa.loc lighttpd 65001 - [meta sequenceId="2"] (/usr/obj/usr/ports/www/lighttpd/work/lighttpd-1.4.71/src/server.c.1909) server started (lighttpd/1.4.71)
<27>1 2023-05-31T23:00:27+02:00 miller.lisa.loc lighttpd 65088 - [meta sequenceId="3"] (/usr/obj/usr/ports/www/lighttpd/work/lighttpd-1.4.71/src/server.c.2308) server stopped by UID = 0 PID = 44602
<27>1 2023-05-31T23:00:27+02:00 miller.lisa.loc lighttpd 65001 - [meta sequenceId="4"] (/usr/obj/usr/ports/www/lighttpd/work/lighttpd-1.4.71/src/gw_backend.c.281) establishing connection failed: socket: unix:/tmp/php-fastcgi.socket-1: No such file or directory
<27>1 2023-05-31T23:00:27+02:00 miller.lisa.loc lighttpd 65001 - [meta sequenceId="5"] (/usr/obj/usr/ports/www/lighttpd/work/lighttpd-1.4.71/src/gw_backend.c.281) establishing connection failed: socket: unix:/tmp/php-fastcgi.socket-0: No such file or directory
<27>1 2023-05-31T23:00:27+02:00 miller.lisa.loc lighttpd 65001 - [meta sequenceId="6"] (/usr/obj/usr/ports/www/lighttpd/work/lighttpd-1.4.71/src/gw_backend.c.1007) all handlers for /index.php? on .php are down.
<27>1 2023-05-31T23:00:30+02:00 miller.lisa.loc lighttpd 65001 - [meta sequenceId="7"] (/usr/obj/usr/ports/www/lighttpd/work/lighttpd-1.4.71/src/gw_backend.c.358) gw-server re-enabled: unix:/tmp/php-fastcgi.socket-1  0 /tmp/php-fastcgi.socket
<27>1 2023-05-31T23:00:30+02:00 miller.lisa.loc lighttpd 65001 - [meta sequenceId="8"] (/usr/obj/usr/ports/www/lighttpd/work/lighttpd-1.4.71/src/gw_backend.c.358) gw-server re-enabled: unix:/tmp/php-fastcgi.socket-0  0 /tmp/php-fastcgi.socket
<27>1 2023-05-31T23:02:59+02:00 miller.lisa.loc lighttpd 65001 - [meta sequenceId="1"] (/usr/obj/usr/ports/www/lighttpd/work/lighttpd-1.4.71/src/server.c.2308) server stopped by UID = 0 PID = 24407
<27>1 2023-05-31T23:03:04+02:00 miller.lisa.loc lighttpd 3112 - [meta sequenceId="2"] (/usr/obj/usr/ports/www/lighttpd/work/lighttpd-1.4.71/src/server.c.1909) server started (lighttpd/1.4.71)


After the restart, the backend connection fails for some reason!
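
To condense a log like the one above to the backend sockets that were missing, a filter along these lines may help. It is a minimal sketch that assumes the "establishing connection failed" message format shown in the excerpt; failed_backends is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read lighttpd log text on stdin and print each
# distinct unix socket path that lighttpd failed to connect to.
failed_backends() {
    # Keep only the path between "unix:" and the following colon.
    sed -n 's/.*establishing connection failed: socket: unix:\([^:]*\):.*/\1/p' |
        sort -u
}
```

Usage: failed_backends < /var/log/lighttpd/latest.log. For the excerpt above this lists /tmp/php-fastcgi.socket-0 and /tmp/php-fastcgi.socket-1.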


Thanks, Franco!

webgui is listening on LAN interface only, and this setting wasn't changed since early setup of OPNsense.

Is the (LAN) interface setup somehow racing with webgui start here?
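
One way to probe the race hypothesis is to check which addresses the listening interface actually has at a given moment, for example right after boot. The following is a rough sketch that assumes FreeBSD-style ifconfig output; list_addresses is a hypothetical helper name, and igc1 in the usage line is only a placeholder for the LAN interface.

```shell
#!/bin/sh
# Hypothetical helper: read ifconfig output on stdin and print every
# IPv4/IPv6 address found (including link-local ones), one per line.
list_addresses() {
    awk '$1 == "inet" || $1 == "inet6" { print $2 }'
}
```

Usage: ifconfig igc1 | list_addresses. Empty output would mean the interface had no address to listen on at that point.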

My best guess: the IPv6 mode is set to track WAN, which makes this setup highly unreliable, as was pointed out many, many years ago:

https://github.com/opnsense/core/issues/1347#issuecomment-347696172

For emphasis:

There will not be a lot of sanity checking. To stress this point, if all manually configured interfaces do not have a single IP listening address, the service will refuse to start as opposed to falling back to listen on all interfaces...

Use at your own risk. It's hard to recover without other precautions like console access, auto-console login, etc.


Cheers,
Franco