SSL errors on console...

Started by ThomasE, March 13, 2025, 09:40:14 AM

Hi there,

we're running an OPNsense 24.10 and we believe that the following problem comes from our captive portal setup:

We're literally getting hundreds of SSL error messages logged to the console every minute, which makes it basically unusable. We can (and in most cases do) log in via SSH and do everything from there, so it's not that big of an issue, but it is my understanding that only really critical messages should be logged to the console. Unless I'm completely mistaken and those messages really do indicate that something is going badly wrong and needs fixing, I'm looking for a way to get rid of them by sending them to a log file only.

Here are a few examples of what's being logged:


error: 0A000417: SSL routines::sslv3 alert illegal parameter
error: 0A000102: SSL routines::unsupported protocol
error: 0A0000EB: SSL routines::no application protocol
error: 0A000076: SSL routines::no suitable signature algorithm
error: 0A00010B: SSL routines::wrong version number

Those seem to be by far the most frequent error messages, but there are other ones, too. They come at a rate of ~200 per minute, so it's really quite a bit. The system itself seems to be working just fine, except maybe for a somewhat high load average that we think is unrelated to this problem.
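
To see which errors dominate, the console output could be tallied by error code. The following is only a minimal Python sketch, assuming the messages have been captured to a text file in the format quoted above; the function name and the file name in the usage comment are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as
#   error: 0A000417: SSL routines::sslv3 alert illegal parameter
# (optional spaces after the colons are tolerated).
SSL_ERROR = re.compile(r"error:\s*([0-9A-Fa-f]{8}):\s*SSL routines::(.+?)\s*$")


def tally_ssl_errors(lines):
    """Count occurrences of each (error code, reason) pair in the given lines."""
    counts = Counter()
    for line in lines:
        match = SSL_ERROR.search(line)
        if match:
            counts[(match.group(1).upper(), match.group(2))] += 1
    return counts


# Usage (hypothetical file name):
#   with open("console-capture.txt") as fh:
#       for (code, reason), n in tally_ssl_errors(fh).most_common():
#           print(n, code, reason)
```

Lines that do not look like an OpenSSL error are simply ignored, so the capture does not need to be cleaned up first.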

Any suggestions what we could do about it?

Thanks
Thomas
:)

Hi Thomas,

First we need to find out what service prints these log messages, but I'm not sure how... no process ID or anything similar?


Cheers,
Franco

The process is lighttpd, /usr/obj/usr/ports/www/lighttpd/work/lighttpd-1.4.76/src/mod_openssl.c.{3510|3470}. (I found those two numbers in a single screenshot. There may be others. ;-)

Sounds related to https://github.com/opnsense/core/issues/6689 then.  Glenn posted a workaround, but nobody tested it so far.


Cheers,
Franco

Great, I'd be more than happy to give it a try and tell you what happened. ;-)

I found the options in

/var/etc/lighttpd-cp-zone-0.conf

where they're unset. Of course I could simply edit that file, but I'm afraid that change will be overwritten as soon as I make any changes to the CP configuration, if not sooner. So what would be the best place to make this setting persistent?

Thanks!
Thomas

Hi Thomas,

This coming from a busy captive portal instance is probably more likely than being emitted from the admin GUI.

You can try this patch:

# opnsense-patch https://github.com/opnsense/core/commit/c4c3a8654c

Reapply the settings of the captive portal zone to activate them.


Cheers,
Franco

Hi Franco,

looks like that helped... a little bit... maybe... ;-)

There are still a lot of messages, but the rate seems to be somewhat lower, down by roughly a third I would guess. I do assume that this is because of the applied patch, not because of less traffic on the captive portal. Is it safe to simply increase those numbers until everything works? By that I mean applying a factor of 2, 4, 8 or perhaps 16 at most, not hundreds or thousands. ;-)

Again, it depends on how busy your captive portal instance is. Feel free to increase these numbers and see what happens :)


Cheers,
Franco

Well, I think it's safe to assume those limits are there for a reason ranging from "Nothing. You may just waste some resources that nobody really cares about on any modern system." to "Setting this too high may eventually crash the whole thing." - and since I'm on a production system, I prefer being a bit too careful rather than sorry. ;-)

Anyway, it looks like "Mission accomplished!" to me. I kept doubling those numbers until the messages disappeared. I ended up with

## number of file descriptors (leave off for lighty loaded sites)
server.max-fds         = 131072

## maximum concurrent connections the server will accept (1/2 of server.max-fds)
server.max-connections = 65536

which I suppose gives you an idea of just how busy our CP is. ;-)
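
The doubling approach, with server.max-connections kept at half of server.max-fds as the config comment suggests, can be sketched as follows. This is an illustration only: the helper names are invented, and the starting value of 8192 in the usage comment is an assumption, not the value shipped by the patch.

```python
def lighttpd_fd_settings(max_fds):
    """Render both directives, keeping max-connections at half of max-fds."""
    if max_fds < 2:
        raise ValueError("max_fds must be at least 2")
    return (
        f"server.max-fds         = {max_fds}\n"
        f"server.max-connections = {max_fds // 2}\n"
    )


def doubling_steps(start, ceiling):
    """Yield start, 2*start, 4*start, ... while the value does not exceed ceiling."""
    if start < 1:
        raise ValueError("start must be positive")
    value = start
    while value <= ceiling:
        yield value
        value *= 2


# Usage (8192 is a hypothetical starting point):
#   for fds in doubling_steps(8192, 131072):
#       print(lighttpd_fd_settings(fds))
```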

Besides, I'm not sure if it has something to do with it or because it's Friday afternoon, but the load average on the whole system seems to be down significantly - from ~15 to ~3.

Thanks and have a nice weekend!
Thomas


Sadly, I just found out why there are no more error messages: The whole CP is down... :-(

According to the log files, the service is starting normally, but ports 8000 and 9000 are closed. We checked the limits (ulimit -a) for root and www users - no problem there. Any ideas?
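
A quick way to confirm from another machine that the two portal ports really are closed is a plain TCP connect test. A minimal Python sketch, assuming the firewall address is reachable from where it runs; the address in the usage comment is a placeholder, not our real one.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Usage (192.0.2.1 is a documentation placeholder address):
#   for port in (8000, 9000):
#       print(port, "open" if port_is_open("192.0.2.1", port) else "closed")
```

This only tells whether something accepts connections on the port; it does not say whether that something is a working lighttpd instance.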

But there are even more strange things...

ls -lh /var/etc/lighttpd-cp-zone-0.conf
-rw-r--r--  1 root wheel  1.2M Mar 24 08:15 /var/etc/lighttpd-cp-zone-0.conf

Notice the file size. At the beginning of the file, there's one extremely long line consisting entirely of invisible characters. After that, there's this:

              # ssl enabled, redirect to https
                       
#############################################################################################
###  Captive portal zone 0 lighttpd.conf  BEGIN
###  -- listen on port 8000 for primary (ssl) connections
###  -- forward on port 9000 for plain http redirection
#############################################################################################
#
#### modules to load
server.modules           = ( "mod_expire",
                             "mod_auth",
                             "mod_redirect",
                             "mod_access",
                             "mod_deflate",
                             "mod_status",
                             "mod_rewrite",
                             "mod_proxy",
                             "mod_setenv",

Nothing suspicious after that. Now I don't think this has anything to do with our original problem, but it still looks strange to me...
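
To find out what the invisible characters in that first line actually are, the raw bytes can be counted. A minimal Python sketch, assuming the file is read in binary mode; the function name is made up for illustration.

```python
from collections import Counter


def summarize_first_line(data):
    """Return (length, Counter of byte values) for the first line of data."""
    first_line = data.split(b"\n", 1)[0].rstrip(b"\r")
    return len(first_line), Counter(first_line)


# Usage:
#   with open("/var/etc/lighttpd-cp-zone-0.conf", "rb") as fh:
#       length, counts = summarize_first_line(fh.read())
#   print("first line:", length, "bytes")
#   for byte, n in counts.most_common(5):
#       print(f"  0x{byte:02x}: {n}")
```

A line made up of spaces and tabs would show up as 0x20 and 0x09, while NUL bytes (0x00) would point to something else entirely.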