Sensei on OPNsense - Application based filtering

Started by mb, August 25, 2018, 03:38:14 AM

In version 0.8 beta 7 on the netmap kernel, I experience a tremendous slowdown in DNS resolution and packet loss to internet resources.
Proxmox enthusiast @home, bare metal @work.

I have had the same problem for over a week now. At the moment I'm using Sensei in xlarge mode and have set the DHCP lease time to 8 hours default and 10 hours max.

This seems to help stabilize the occurrences.

What on earth does DHCP lease time have in common with packet loss?

I just started testing and noticed the slowdown. In my case disabling cloud threat intel solved this.
maybe this helps.

Quote from: SchylgeICT on April 03, 2019, 09:03:14 PM
I just started testing and noticed the slowdown. In my case disabling cloud threat intel solved this.
maybe this helps.

I can confirm that: cloud threat intel causes a noticeable delay in DNS queries. It seems the cloud servers are not stable enough, since I see packet loss. As a workaround, use the OPNsense built-in intrusion detection with ET Pro Telemetry (can be installed as a plugin). It's free if you let your firewall send anonymous statistics (why not?).
Other than that, Sensei is an amazing product!

I can confirm too ;) We'll be shipping 0.8.0.beta8 tomorrow. It has several fixes which we expect to address this issue.

Plus, it has tagged (trunk) VLAN interface support :)

Quote from: mdurkin on March 30, 2019, 09:05:01 AM
Anyone having problems blocking YouTube using 0.8.0.beta7? I used app control but it has no effect. Other controls seem to work fine. It's a shame, as the reason I installed it was to try this out!
Anyone else tried blocking YouTube?

Hi mdurkin,

Many thanks for reporting this. I have now checked several deployments, and blocking appears to work there. Let me contact you; something in your environment might be triggering this.


Hello! Four of my graphs are suddenly showing nothing. "Egress New Connections by App Over Time" and "Egress New Connections by Source Over Time" say "No Egress New Connection." "New Connections & Unique Remote Hosts" says "No New Connection & Unique Remote Host" and "Unique Local Hosts over Time" says "No Local Host." I just updated to 0.8.0.beta7, and I have also stopped and restarted the Sensei Packet Engine and Elasticsearch services. Any thoughts on what might have gone wrong or how to fix it?

Thanks!

Hi OPNsense4ever,

Many thanks for trying Sensei & reporting the issue.

We changed a field type in Elasticsearch. The new query format is not compatible with the data type in the old indexes, which is why those "histogram" graphs show no data.

As new activity is recorded over time, they'll get back to normal, within a couple of days at most.

April 05, 2019, 02:57:41 PM #249 Last Edit: April 05, 2019, 03:03:21 PM by mb
Dear Sensei users,

We shipped 0.8.0.beta8 yesterday. This update brings VLAN-tagged interface support and fixes several issues with beta7. All beta7 users are encouraged to update to beta8.

With regard to the Cloud infrastructure, we decided to take the following steps to improve availability:

1. Independent cloud queries:

Currently we're using the DNS infrastructure to communicate with our Cloud backend systems. Since we're redirecting DNS traffic, the cloud systems also have to act as a recursive DNS server. Recursion is not within the scope of the Sensei project, so we cannot always guarantee the best DNS response time.

This is why, starting with 0.8.0.beta9, we'll be doing the cloud threat intelligence lookups with an independent, purpose-built query.

2. New cloud servers for US-West, US-East and Asia.

To improve cloud response time and distribute load, we'll be introducing new servers for the Asia, US-West and US-East regions.

These changes will have the following benefits:

1. Improved availability
2. Improved response times (from avg 100ms to as low as 5ms)
3. You'll be able to continue using your local DNS servers.
4. You'll be able to use other DNS-based solutions (like Pi-hole) in conjunction with Sensei.

We plan to have this ready before 0.8 rc1, so hopefully we'll ship it with beta9 in two weeks.

Hi!

Just a curious question. Did you consider using Apache Lucene as the DB backend instead of Elasticsearch?
I use Lucene in several projects (mostly Bitnami) and it's a very scalable and fast backend. It can be used in a "lightweight" scenario as well as in an "enterprise" one. It may solve the low-memory hardware problem.
I'm just thinking out loud :)

Hi Archanfel80,

Many thanks for the suggestion. Actually, we didn't consider this as an option - I wasn't aware that Lucene had a lightweight option.

Currently we're evaluating TimescaleDB and InfluxDB. We'll also have a look at the Lucene lightweight option. Any pointers on this for me?

Hi!

I mostly played with heap sizes and buffer sizes. Lower values result in lower memory usage at the cost of performance (slower queries) because of the increased disk I/O.
TimescaleDB is a good choice too. I'm not sure about InfluxDB; I had to use it in the past, but it caused too many headaches. It's not easy to operate.
Elasticsearch memory consumption can also be limited. In a low-user (<100) scenario that stores no more than 3 days of data, the whole system's memory usage is below 2GB. I have been running Sensei on a 2GB board for almost a week now: a small office with only 8 users and 3 days of stored data. The boss just wants to see what the workers do, so he checks the Sensei reports at the end of the day. The whole system's memory consumption is below 2GB. I use the default 2GB swap in OPNsense, but not a single byte of it is used. I had to disable the Sensei health check because it stopped the engine from time to time, but there have been no issues so far. I also have a bigger system, a college with students: many more users, much more data, and 3 days of stored history, and memory usage is just a bit above 4GB. I think the 8GB minimum recommended RAM is a bit high. I don't have any system that uses that much.

What if Sensei detected the available system memory (including the optional swap file), grayed out the big scenarios like 500 users, limited the maximum data history time, etc., so the user can't pick a big scenario that breaks the system?
For example, with a 2GB system: 25 users max, 3 days of history
4GB system: 100 users max, 7 days of history
etc. And you can limit Elasticsearch memory usage too.
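
To make the suggestion concrete, here is a minimal sketch of how such a memory-based limit could look. It is only an illustration, not how Sensei works: the names `ScenarioLimit`, `TIERS`, `limit_for` and `selectable` are made up for this example, and the table contains only the two tiers mentioned above (the "etc." tiers are deliberately not filled in). Whether swap is counted in `memory_gb` is left to the caller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScenarioLimit:
    """Hypothetical tier: the largest scenario offered for a given memory size."""
    min_memory_gb: float   # smallest system this tier applies to
    max_users: int         # largest deployment size left selectable
    max_history_days: int  # longest report history left selectable


# Only the two tiers named in the post, ordered by memory size.
TIERS = (
    ScenarioLimit(min_memory_gb=2, max_users=25, max_history_days=3),
    ScenarioLimit(min_memory_gb=4, max_users=100, max_history_days=7),
)


def limit_for(memory_gb: float) -> Optional[ScenarioLimit]:
    """Return the largest tier the system qualifies for, or None if it is below all tiers."""
    best = None
    for tier in TIERS:
        if memory_gb >= tier.min_memory_gb:
            best = tier
    return best


def selectable(users: int, history_days: int, memory_gb: float) -> bool:
    """True if the requested scenario fits the tier for this system (else gray it out)."""
    limit = limit_for(memory_gb)
    if limit is None:
        return False
    return users <= limit.max_users and history_days <= limit.max_history_days
```

With this table, `selectable(500, 3, 2.0)` is false, so the 500-user scenario would be grayed out on a 2GB system, while `selectable(25, 3, 2.0)` is true. A real implementation would need more tiers and some tolerance, since a "2GB" board usually reports slightly less than 2GB of usable memory.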

And a quick report: after beta8 the cloud threat query time is a bit better, but it still causes a delay that users notice.

Keep up the good work :)

Hi Archanfel80,

Many thanks for sharing your experience. Indeed, we found this very helpful.

Now I'm thinking we might be over-optimizing. We were trying to keep the memory usage for Sensei and the DB below 1GB for small deployments, like 25 users. We are also trying to provide at least a month of history.

If the median minimum RAM size for small OPNsense deployments is 2GB, your suggestion looks very viable.

Let's do a quick Twitter poll:

https://twitter.com/sunnyvalley/status/1115109250479476737

With regard to beta8, glad to hear that it looks better. We've received similar feedback from several other users. Hopefully, we will be solving the remaining issue with Cloud with beta9.

Hi!

I think keeping the RAM usage below 1GB would be a bit hard.
This is my smallest scenario: very low activity, Sensei active on only one interface, around 8-10 users.

https://imgur.com/a/t8Bk8qg

This is actually a VM. The RAM usage is below 2GB but higher than 1GB; I can't keep it below that. Of course, this is the OS and Sensei RAM usage together. OPNsense uses 300-800MB of RAM depending on the scenario, so 2GB usage with Sensei means Sensei uses 1-1.5GB of RAM with low-end settings.
A 2GB board should handle this, even with a swap file.
I think you can try to reach ~1GB RAM usage for a small scenario; that should satisfy the low-end hardware users :)
