Sensei - External elastic search - socket/open file descriptors exhaustion

Started by serbans, August 08, 2021, 03:51:41 PM

Hi everybody!

I have the following issue with Sensei 1.9.3 using an external Elasticsearch database. The number of open sockets from OPNsense to ES grows by roughly one TCP socket per second, and the sockets are not being closed on either side at the same rate as they are created (they show as connected on both OPNsense and ES). After a period of time, this leads to exhaustion of open file descriptors on the ES side.
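For anyone who wants to watch this on the ES side, here is a minimal Python sketch that reads the per-node file descriptor counters from the Elasticsearch node stats API (`_nodes/stats/process`). The base URL `http://localhost:9200` and the helper names `fd_usage` and `fetch_stats` are assumptions for illustration; adjust them (and add authentication if your cluster needs it).

```python
import json
from urllib.request import urlopen


def fd_usage(stats: dict) -> dict:
    """Map node name -> (open, max, fraction used).

    `stats` is the parsed JSON body of GET /_nodes/stats/process.
    Nodes that do not report usable counters are skipped.
    """
    usage = {}
    for node_id, node in stats.get("nodes", {}).items():
        proc = node.get("process", {})
        open_fds = proc.get("open_file_descriptors")
        max_fds = proc.get("max_file_descriptors")
        if open_fds is None or max_fds is None or open_fds < 0 or max_fds <= 0:
            continue
        usage[node.get("name", node_id)] = (open_fds, max_fds, open_fds / max_fds)
    return usage


def fetch_stats(base_url: str = "http://localhost:9200") -> dict:
    """Fetch process stats from the cluster (base_url is an assumed default)."""
    with urlopen(f"{base_url}/_nodes/stats/process", timeout=10) as resp:
        return json.load(resp)
```

Calling `fd_usage(fetch_stats())` every minute or so and logging the result should show whether the open descriptor count climbs steadily toward the limit.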

I have opened a ticket with SunnyValley as well, but I am wondering whether there is some mitigation on the ES side. I tried setting a lower TCP keepalive interval, but that mainly helps connections passing through a firewall avoid state table timeouts, and I do not think that is the case here.
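To put a number on the growth from the OPNsense side, a rough Python sketch like the one below can tally connections to the ES port by TCP state from saved `netstat -an` output. The function name `count_by_state` and the default port 9200 are assumptions, and the parser only expects the usual six-column layout (it accepts both the FreeBSD `addr.port` and the Linux `addr:port` forms).

```python
from collections import Counter


def count_by_state(netstat_output: str, port: int = 9200) -> Counter:
    """Count TCP connections per state whose remote port equals `port`."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expect: proto, recv-q, send-q, local address, foreign address, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign, state = fields[4], fields[5]
        # FreeBSD prints addr.port, Linux prints addr:port; normalise, then take the last part
        remote_port = foreign.replace(":", ".").rsplit(".", 1)[-1]
        if remote_port == str(port):
            states[state] += 1
    return states
```

If the ESTABLISHED count keeps rising between two runs taken a few minutes apart, the connections really are being left open rather than lingering in TIME_WAIT.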

thanks a lot,
Serban

@serbans, we have received the ticket. At first sight, this looks like a socket leak. We're digging deeper and will get back to you soon.

Might well be the same issue as this one I had earlier: https://forum.opnsense.org/index.php?topic=23786.0

I've moved to a local Elasticsearch, as there wasn't really any progress in finding out why it was hogging that much memory over time, but I guess there is a chance the underlying issue is the same as for your TCP sockets.

A short update here:

Received a patch from Sensei about 3-5 days after the ticket was created on the system (thanks!). I have not applied it due to an OpSec issue: I was given an executable with a .py extension to replace a Python script (which is, in a way, a big no-no).

I was told that the changes will be included in version 1.10, which is supposed to move (partly) to a new language, hence the executable.

Decided to wait for the official release. Will update then again if the issue is solved.

Do you use Kibana for visualization of the different reports / Sensei dashboards? Do you have any to share? :D

https://github.com/psychogun/zenarmor-kibana-dashboards

Running OPNsense through Proxmox
4 x Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (1 Socket)
24 GB RAM

New version 1.10 apparently solves the issue of the file descriptors exhaustion.

thanks koushun for the dashboards, really interesting stuff !!

Thanks a lot !
Serban