OPNsense Forum

English Forums => 24.7, 24.10 Legacy Series => Topic started by: molnart on December 29, 2024, 01:48:27 PM

Title: Netflow excessive CPU and disk I/O usage
Post by: molnart on December 29, 2024, 01:48:27 PM
I have had repeated issues with netflow ever since I first installed OPNsense about 4 years ago.
Usually I notice it through system alerts that my SSD temperatures have gone off the charts. When troubleshooting, I see netflow (flowd_aggregate.py) producing disk I/O of around 200 MB/s, accompanied by high CPU usage.

Restarting the netflow service does not help. Restarting OPNsense does not help either; disk and CPU usage are the same afterwards. Also, for a long time I had the feeling that rebooting OPNsense takes ages. Now I know it's because tar is apparently archiving the netflow files, running for almost 10 minutes.

The only thing that helps is resetting the netflow data altogether, which I have to do once every few months. But looking at the contents of /var/netflow, the sqlite databases are not that big.


--- /var/netflow -------------------------------------------------------------------------------------------------------
                                 /..
    4.4 GiB [#################]  src_addr_details_086400.sqlite
    1.4 GiB [#####            ]  dst_port_086400.sqlite
    1.2 GiB [####             ]  dst_port_086400.sqlite-journal
  419.1 MiB [#                ]  src_addr_086400.sqlite
  121.8 MiB [                 ]  interface_000030.sqlite
   36.5 MiB [                 ]  src_addr_000300.sqlite
   17.3 MiB [                 ]  dst_port_003600.sqlite
   15.0 MiB [                 ]  dst_port_000300.sqlite
   13.1 MiB [                 ]  interface_000300.sqlite
   13.0 MiB [                 ]  src_addr_003600.sqlite
    1.5 MiB [                 ]  interface_003600.sqlite
  136.0 KiB [                 ]  interface_086400.sqlite
   12.0 KiB [                 ]  metadata.sqlite
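A listing like the one above can also be produced with a few lines of Python. This is only a minimal sketch: the helper names (`human`, `netflow_usage`, `leftover_journals`, `report`) are made up for illustration and are not part of OPNsense. It also flags `*.sqlite-journal` files, since SQLite keeps a rollback journal next to a database while a write transaction is open, so a journal of more than 1 GiB may point to a very large or interrupted write.

```python
from pathlib import Path


def human(n: float) -> str:
    """Format a byte count using binary units, capped at GiB."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        if n < 1024 or unit == "GiB":
            return f"{n:.1f} {unit}"
        n /= 1024


def netflow_usage(directory):
    """Return (file name, size in bytes) pairs, largest first."""
    files = [
        (p.name, p.stat().st_size)
        for p in Path(directory).iterdir()
        if p.is_file()
    ]
    return sorted(files, key=lambda item: item[1], reverse=True)


def leftover_journals(directory):
    """Return the names of SQLite rollback journals found in the directory."""
    return sorted(p.name for p in Path(directory).glob("*.sqlite-journal"))


def report(directory="/var/netflow"):
    """Print sizes and any journals (the path is the one from the post)."""
    for name, size in netflow_usage(directory):
        print(f"{human(size):>12}  {name}")
    for name in leftover_journals(directory):
        print(f"journal present: {name}")
```

Calling `report()` on the firewall prints the files largest first, followed by any journal files it finds.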


I stumbled upon this thread https://forum.opnsense.org/index.php?topic=19786.0 claiming it's caused by IPv6, but IPv6 is disabled in my config.

Is there any long-term solution for this? Like moving netflow data to an external database or something?
Title: Re: Netflow excessive CPU and disk I/O usage
Post by: Patrick M. Hausen on December 29, 2024, 02:07:02 PM
Netflow creates a protocol entry of every single connection. On a busy gateway what you observe is just expected. It's a heck of a lot of data, so there is no "solution".

You could set up an external network management system and netflow aggregator and send the data there instead of processing it locally. Most products are commercial, though. I am still investigating if there is any open source tool I can use.
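To give an idea of what the receiving side of such a setup deals with, here is a minimal sketch of a collector that only reads the 24-byte header of each NetFlow version 5 export datagram. It assumes the firewall is configured to export version 5 to this host. The UDP port 2055 and the function names are arbitrary choices for illustration, and a real aggregator would also decode the flow records that follow the header.

```python
import socket
import struct

# NetFlow v5 header: version, count, sys_uptime, unix_secs, unix_nsecs,
# flow_sequence, engine_type, engine_id, sampling_interval (24 bytes total).
V5_HEADER = struct.Struct("!HHIIIIBBH")


def parse_v5_header(datagram: bytes) -> dict:
    """Decode the header of one NetFlow v5 export datagram."""
    if len(datagram) < V5_HEADER.size:
        raise ValueError("datagram shorter than a NetFlow v5 header")
    (version, count, sys_uptime, unix_secs, unix_nsecs,
     flow_sequence, engine_type, engine_id, sampling) = V5_HEADER.unpack_from(datagram)
    if version != 5:
        raise ValueError(f"not a NetFlow v5 datagram (version {version})")
    return {
        "version": version,
        "count": count,                # number of flow records in this datagram
        "unix_secs": unix_secs,        # export time, seconds since the epoch
        "flow_sequence": flow_sequence,
    }


def listen(host="0.0.0.0", port=2055):
    """Print a one-line summary per datagram (port 2055 is an assumption)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, (sender, _port) = sock.recvfrom(65535)
            header = parse_v5_header(data)
            print(f"{sender}: {header['count']} flows, seq {header['flow_sequence']}")
```

Running `listen()` on a separate machine and pointing the exporter at it shows how many flow records arrive per datagram, which gives a rough feel for the data volume before committing to a full product.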
Title: Re: Netflow excessive CPU and disk I/O usage
Post by: molnart on December 29, 2024, 02:22:47 PM
In that case it looks to me like unoptimized logic on the OPNsense side. I think this could be solved by defining data retention periods, downsampling older data, etc. Unfortunately it doesn't look like netflow has had any significant development in the past few years.
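The kind of downsampling meant here could look roughly like the sketch below. The row layout `(timestamp, key, octets)` and the function name are hypothetical and do not reflect the actual OPNsense schema or code. Rows newer than a cutoff are kept as they are, and older rows are merged into coarser time buckets.

```python
from collections import defaultdict


def downsample(rows, now, keep_fine_for, coarse_bucket):
    """Merge rows older than `keep_fine_for` seconds into coarse buckets.

    rows: iterable of (timestamp, key, octets) tuples (hypothetical layout).
    Returns a sorted list: merged old rows first, then untouched recent rows.
    """
    cutoff = now - keep_fine_for
    recent = []
    merged = defaultdict(int)
    for ts, key, octets in rows:
        if ts >= cutoff:
            recent.append((ts, key, octets))
        else:
            # Align the timestamp to the start of its coarse bucket.
            bucket_start = ts - ts % coarse_bucket
            merged[(bucket_start, key)] += octets
    coarse = [(ts, key, octets) for (ts, key), octets in merged.items()]
    return sorted(coarse) + sorted(recent)
```

For example, with `keep_fine_for=3600` and `coarse_bucket=3600`, everything older than one hour collapses to one row per key per hour, so the row count for old data stops growing with the number of connections.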
Title: Re: Netflow excessive CPU and disk I/O usage
Post by: Patrick M. Hausen on December 29, 2024, 02:35:57 PM
Netflow is standardized. It works exactly as designed and there is nothing to change because that would break interoperability. It was invented by Cisco and of course routers were expected to shovel the data off the chassis into some aggregator system with enough storage and processing power.

Possibly the aggregator and monitor built into OPNsense could be improved, agreed. But any serious high bandwidth scenario will need a dedicated machine and product for that, anyway.

There are better protocols today I guess. sFlow seems to be popular.
Title: Re: Netflow excessive CPU and disk I/O usage
Post by: jaykumar2005 on January 07, 2025, 02:32:25 PM
Quote from: Patrick M. Hausen on December 29, 2024, 02:07:02 PM
Netflow creates a protocol entry of every single connection. On a busy gateway what you observe is just expected. It's a heck of a lot of data, so there is no "solution".

You could set up an external network management system and netflow aggregator and send the data there instead of processing it locally. Most products are commercial, though. I am still investigating if there is any open source tool I can use.


I use Elastiflow (renamed to NetObserve). They have a free tier license which is good enough for homelab use.

https://www.elastiflow.com/basic-license
Title: Re: Netflow excessive CPU and disk I/O usage
Post by: Patrick M. Hausen on January 07, 2025, 02:48:33 PM
Quote from: jaykumar2005 on January 07, 2025, 02:32:25 PM
I use Elastiflow (renamed to NetObserve). They have a free tier license which is good enough for homelab use.
Thanks! I'll give it a spin.
Title: Re: Netflow excessive CPU and disk I/O usage
Post by: molnart on January 13, 2025, 10:36:15 PM
I have just installed Graylog and it's able to process netflow data, but setting up visualizations looks like a lot of work.