OPNsense Forum

Archive => 16.7 Legacy Series => Topic started by: Silverstar on August 29, 2016, 11:02:35 am

Title: [solved] NetFlow disk usage
Post by: Silverstar on August 29, 2016, 11:02:35 am
Hi folks,

It seems that NetFlow uses unlimited disk space... or am I missing a way to stop it from filling up my disk completely?

Best,
Silverstar
Title: Re: NetFlow disk usage
Post by: Silverstar on August 31, 2016, 09:41:00 am
Sorry for bumping this up, but the disk usage keeps growing... is there really only the option to reset the NetFlow data manually, with no rotation or space limit?

Thx,
Silverstar
Title: Re: NetFlow disk usage
Post by: franco on September 01, 2016, 09:42:39 am
Hi there,

The solution is to do it manually for now, correct. It's under Reporting: Settings: "Reset Netflow Data".

I think there is rotation, but no retention policy:

% ls -lah /var/log/flowd.log*
-rw-------  1 root  wheel   5.1M Sep  1 09:38 /var/log/flowd.log
-rw-------  1 root  wheel    11M Aug 31 21:51 /var/log/flowd.log.000001

It would help to know your system's parameters, available size, your current log size and your expected capacity limitations to see how to address this issue concretely in a feature request. :)


Cheers,
Franco
Title: Re: NetFlow disk usage
Post by: Silverstar on September 05, 2016, 05:40:19 pm
Hi Franco,

today I deleted a 14G flowd.log...
The numbered files are all 11M each, but the main file grows without limit...

Code:
root@fw:~ # du -hs /var/log/*
544K    /var/log/dhcpd.log
4.0K    /var/log/dmesg.today
4.0K    /var/log/dmesg.yesterday
544K    /var/log/filter.log
 14G    /var/log/flowd.log
 11M    /var/log/flowd.log.000001
 11M    /var/log/flowd.log.000002
 11M    /var/log/flowd.log.000003
[...]

My system is running on a 42G SSD.

It would be nice if we could set a rotation in the config section (GUI) instead of resetting all the data from time to time.

Best,
Silverstar
Title: Re: NetFlow disk usage
Post by: AdSchellevis on September 05, 2016, 08:03:56 pm
Hi Silverstar,

It looks like flowd_aggregate isn't running on your end. The flowd.log file is used as a staging area for Insight, but if the aggregation process isn't running, the file doesn't get rotated either (it normally rotates at roughly 10 MB).

With the following command you can check the status of the service:
Code:
service flowd_aggregate status
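
Since flowd.log is normally rotated at around 10 MB, a file far above that size suggests the aggregator has stopped. A minimal sketch of such a check in POSIX sh follows; the function name log_exceeds_limit and the 100 MB threshold in the usage comment are assumptions for illustration, not OPNsense defaults.

```shell
# Hypothetical helper: succeed (exit status 0) if a file is larger than a limit.
# $1 = path to the log file, $2 = limit in bytes.
log_exceeds_limit() {
    file=$1
    limit=$2
    # A missing file cannot be oversized.
    [ -f "$file" ] || return 1
    # wc -c is portable; the arithmetic expansion strips any padding.
    size=$(( $(wc -c < "$file") ))
    [ "$size" -gt "$limit" ]
}

# Example usage (the 100 MB threshold is an assumed value):
# if log_exceeds_limit /var/log/flowd.log $((100 * 1024 * 1024)); then
#     echo "flowd.log is oversized; check: service flowd_aggregate status"
# fi
```

Run from cron, a check like this could give an early warning before the disk fills up.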

The latest version of flowd_aggregate has an automatic repair option to recover after a crash, which should prevent this in the future.
The best thing to do now is probably to remove flowd.log* and restart flowd and the aggregator.
Code:
service flowd_aggregate stop
service flowd restart
service flowd_aggregate start
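
The whole recovery (stop the aggregator, remove the log files, restart both services) could be wrapped in one function. This is only a sketch: the names run and reset_flowd_logs and the DRY_RUN switch are hypothetical, and the ordering of the steps is one plausible reading of the advice above. By default it only prints what it would do.

```shell
# Print the command when DRY_RUN is 1 (the default here), otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: reset the flowd staging logs in $1 (default /var/log).
reset_flowd_logs() {
    logdir=${1:-/var/log}
    run service flowd_aggregate stop
    # Remove the oversized staging file and its rotated copies.
    for f in "$logdir"/flowd.log*; do
        if [ -e "$f" ]; then
            run rm -f "$f"
        fi
    done
    run service flowd restart
    run service flowd_aggregate start
}
```

Calling reset_flowd_logs prints the planned commands; setting DRY_RUN=0 first (as root on the firewall) would actually execute them.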


Best regards,

Ad
Title: Re: NetFlow disk usage
Post by: Poof on September 07, 2016, 05:17:22 am
Just giving a +1. flowd_aggregate crashes on my instance after ~3 hours. I just checked and found myself with a 3.6GB flowd.log file. :/ It has filled my disk a few times, so frustrating!
Title: Re: NetFlow disk usage
Post by: Silverstar on September 17, 2016, 05:44:22 pm
Hi Ad,

I've done as you suggested: deleted the flowd.log* files from /var/log/, stopped flowd_aggregate, restarted flowd and started flowd_aggregate again.
But after a few seconds to a minute, flowd_aggregate isn't running anymore according to service flowd_aggregate status :(

It seems something is broken here!

Best,
Silverstar
Title: Re: NetFlow disk usage
Post by: AdSchellevis on September 18, 2016, 11:30:12 am
Hi Silverstar,

Can you try to run flowd_aggregate manually?

Code:
service flowd_aggregate stop
/usr/local/opnsense/scripts/netflow/flowd_aggregate.py console

Then wait for it to exit (that should take about the same amount of time as before), then post the output here, including any messages in /var/log/system.log

Code:
clog /var/log/system.log
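
The system log also contains plenty of unrelated entries, so it can help to filter for the aggregator's messages only. A small sketch, assuming the entries carry the string "flowd_aggregate" in the process tag; the function name aggregate_errors is made up for illustration.

```shell
# Hypothetical helper: read log lines on stdin and print only the
# flowd_aggregate entries, so crash tracebacks stand out.
aggregate_errors() {
    # grep exits non-zero when nothing matches; treat that as "no errors".
    grep -F 'flowd_aggregate' || true
}

# Example usage on the firewall:
# clog /var/log/system.log | aggregate_errors
```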

The latest version of our software should try to run an automatic repair on the sqlite files it's using, so maybe you're experiencing something completely different here.

Just to be sure, you are using OPNsense 16.7.3?

Best regards,

Ad

Title: Re: NetFlow disk usage
Post by: franco on September 18, 2016, 04:03:36 pm
The database repair code is not in 16.7.3; it will be available in 16.7.4 with the patch below. You can, however, install the code by running the following command in the console:

# opnsense-patch 2bcdb42

https://github.com/opnsense/core/commit/2bcdb42

Maybe this will help.


Cheers,
Franco
Title: Re: NetFlow disk usage
Post by: Silverstar on September 19, 2016, 07:44:12 pm
Hi Ad,
hi Franco,

I did as you suggested.
The script took about 15 min. of runtime.
The script itself didn't output anything on command line.

Here are the lines from clog /var/log/system.log around the runtime of the script.

Code:
Sep 19 18:47:13 fw flowd_aggregate.py: flowd aggregate died with message Traceback (most recent call last):   File "/usr/local/opnsense/scripts/netflow/flowd_aggregate.py", line 145, in run     aggregate_flowd(do_vacuum)   File "/usr/local/opnsense/scripts/netflow/flowd_aggregate.py", line 85, in aggregate_flowd     stream_agg_object.cleanup(do_vacuum)   File "/usr/local/opnsense/scripts/netflow/lib/aggregate.py", line 277, in cleanup     self._update_cur.execute('delete from timeserie where mtime < :expire', {'expire': expire_timestamp}) DatabaseError: database disk image is malformed
Sep 19 19:00:47 fw sshd[27366]: Accepted keyboard-interactive/pam for root from 10.0.220.6 port 58747 ssh2
Sep 19 19:01:29 fw opnsense: /index.php: Successful login for user 'root' from: 10.0.220.6
Sep 19 19:01:30 fw configd.py: [77c5f0d8-d529-41ff-984f-7557f8675e5a] IPsec list ip address pools
Sep 19 19:01:30 fw configd.py: [e6dd1c66-804c-4ae5-88d3-a3cef8923bad] IPsec list status
Sep 19 19:01:48 fw configd.py: [6a6cffb1-080c-4548-b0f3-dff2d05f9e29] IPsec list ip address pools
Sep 19 19:01:48 fw configd.py: [e2736d1d-46e4-4a1d-8241-545efce8fe43] IPsec list status
Sep 19 19:02:00 fw configd.py: [459569e5-d465-4d9c-bbdf-fa8368b01b9a] IPsec list ip address pools
Sep 19 19:02:00 fw configd.py: [4b5aa42b-8674-4223-8dc2-c2b2296d1a14] IPsec list status
Sep 19 19:02:16 fw configd.py: [85324443-fcf2-462f-a97f-d9b17745bfd6] IPsec list ip address pools
Sep 19 19:02:16 fw configd.py: [bfaaefbb-8b08-4852-987f-c0f365880cd4] IPsec list status
Sep 19 19:02:43 fw configd.py: [63b45754-fd76-4591-97b6-5f5186ac8274] show system activity
Sep 19 19:03:30 fw configd.py: [c0f49f71-c15f-472f-a9b3-298b0f728057] show system activity
Sep 19 19:03:46 fw configd.py: [7c52b133-a3c3-414f-8689-10341fbd9e0c] IPsec list ip address pools
Sep 19 19:03:46 fw configd.py: [164c2aef-31ba-4027-9ac3-dd620ed8428e] IPsec list status
Sep 19 19:04:17 fw configd.py: [8893657c-2c7e-439b-ad75-6ea64422b0ba] show system activity
Sep 19 19:04:26 fw configd.py: [0e6a1c96-63fa-4cc2-b05c-6c21074e5880] show system activity
Sep 19 19:04:38 fw configd.py: [b8cff5d9-4feb-4a83-8740-f8ba7a59e2a2] show system activity
Sep 19 19:04:40 fw configd.py: [e13586a8-0759-4056-8ae7-c3ffb3aaa968] show system activity
Sep 19 19:04:54 fw configd.py: [a5496d29-1f1f-4cca-a45f-51c66577f11a] IPsec list ip address pools
Sep 19 19:04:54 fw configd.py: [49150607-f6be-4e57-aa27-b8c2012fb27e] IPsec list status
Sep 19 19:04:59 fw kernel: A,4282023658,2141745024,65535,,
Sep 19 19:05:03 fw configd.py: [07711743-c7e3-4e83-a6a9-5c20ba109e17] show system activity
Sep 19 19:05:11 fw configd.py: [c9ddda98-20de-42c7-bcf9-dcb281f0d66b] show system activity
Sep 19 19:05:20 fw configd.py: [c02e0b17-bf6d-4df7-b95d-bd3b95641c70] show system activity
Sep 19 19:06:32 fw configd.py: [0bf487aa-95d3-41f5-a9f1-242644e758be] show system activity
Sep 19 19:08:06 fw configd.py: [ad232804-d3f1-4f09-8765-38e109e23ea1] show system activity
Sep 19 19:08:32 fw configd.py: [5d5657d4-9698-424b-b2b7-43174b248eb3] show system activity
Sep 19 19:09:55 fw configd.py: [638a3cd1-291c-4d47-89bf-e2a0d001065c] show system activity
Sep 19 19:10:38 fw configd.py: [d8f4c6aa-0429-4c6b-ba20-e48800ab4dbb] show system activity
Sep 19 19:10:49 fw configd.py: [b56d4d76-5b78-4eb8-b4a7-b6663e488d3a] show system activity
Sep 19 19:10:55 fw configd.py: [586829ad-69a3-469f-89ac-9b0dbe7de498] show system activity
Sep 19 19:11:14 fw configd.py: [1e0a5b66-b48c-4eab-88f4-b4c118d5d7e8] show system activity
Sep 19 19:11:20 fw configd.py: [4c3b2b8f-63d8-4a8f-b40d-c53e3d5f3ca4] show system activity
Sep 19 19:11:53 fw configd.py: [e826f334-cec2-42e0-8b3b-b2359ccaa5d7] show system activity
Sep 19 19:12:14 fw configd.py: [d835988a-5a82-492d-a79c-996d5ade3f59] show system activity
Sep 19 19:12:20 fw configd.py: [4c8b40f2-c5c9-4f0c-b38a-174b616bbfd2] show system activity
Sep 19 19:12:24 fw configd.py: [bfa1c6e4-cc41-43dd-a3cf-077a880fe732] show system activity
Sep 19 19:13:02 fw configd.py: [e97ad08c-d85a-4cdf-98d6-e7eb2096842c] show system activity
Sep 19 19:13:31 fw configd.py: [6ce5d6c5-5b04-4086-8aa9-f36e25e41048] show system activity
Sep 19 19:14:12 fw configd.py: [c81fa60a-579f-4d62-bec3-d1cbf48147b3] show system activity
Sep 19 19:15:04 fw configd.py: [4c34ddf3-53d9-42eb-96d6-14a80f5b137a] show system activity
Sep 19 19:15:13 fw configd.py: [b1b1ee07-476a-4803-aaed-94dc5c761b76] show system activity
Sep 19 19:15:16 fw configd.py: [1a4c6ea8-2291-44b3-9b9c-7ed821ba7724] show system activity
Sep 19 19:15:20 fw configd.py: [dfa0ff61-8f80-429f-a869-07f5dbd650d1] show system activity
Sep 19 19:15:23 fw configd.py: [d47433f9-9030-43e4-be91-1a162db6212e] show system activity
Sep 19 19:15:27 fw configd.py: [a20e7c6f-ce5b-4965-8a25-27210c616320] show system activity
Sep 19 19:15:42 fw configd.py: [e85dee2d-98af-4fc3-9ed8-c6c81b74df33] show system activity
Sep 19 19:16:14 fw configd.py: [30be9ba0-9d89-442c-9491-70387982dbed] show system activity
Sep 19 19:16:23 fw configd.py: [5a666136-8741-412f-bb6d-176e231d09d0] show system activity
Sep 19 19:16:47 fw kernel: A,1209682972,961485202,65535,,
Sep 19 19:16:48 fw configd.py: [6a943954-07c6-407c-b346-bb7aaaef0afd] show system activity
Sep 19 19:16:55 fw configd.py: [73688291-5017-47e5-b3b1-7730d7c9af08] show system activity
Sep 19 19:16:58 fw configd.py: [8071150b-09ad-4b65-8266-bbca2d424b73] show system activity
Sep 19 19:17:23 fw configd.py: [20a979cc-3b3b-4570-8ad7-c524c2072bbd] show system activity
Sep 19 19:18:16 fw flowd_aggregate.py: flowd aggregate died with message Traceback (most recent call last):   File "/usr/local/opnsense/scripts/netflow/flowd_aggregate.py", line 145, in run     aggregate_flowd(do_vacuum)   File "/usr/local/opnsense/scripts/netflow/flowd_aggregate.py", line 85, in aggregate_flowd     stream_agg_object.cleanup(do_vacuum)   File "/usr/local/opnsense/scripts/netflow/lib/aggregate.py", line 277, in cleanup     self._update_cur.execute('delete from timeserie where mtime < :expire', {'expire': expire_timestamp}) DatabaseError: database disk image is malformed
Sep 19 19:18:27 fw configd.py: [0a32a78b-720c-4528-89a8-5cb7f9c9ef91] show system activity
Sep 19 19:18:42 fw configd.py: [d34ade08-63d2-4329-b463-591ddb55624a] IPsec list ip address pools
Sep 19 19:18:42 fw configd.py: [c692d1bb-95e2-43f0-88ed-3c97ed164b6b] IPsec list status

The system version is 16.7.2

Will take the update to 16.7.3 and the patch from Franco as the next steps and keep you guys posted...

Thanks,
Silverstar
Title: Re: NetFlow disk usage
Post by: Silverstar on September 19, 2016, 10:31:21 pm
Update & patch done.
service flowd_aggregate keeps running for now.
Produces log files at 11MB each.

I'll keep you posted if rotation kicks in...

Best,
Silverstar
Title: Re: NetFlow disk usage
Post by: Silverstar on September 20, 2016, 11:44:01 am
Seems to be solved :)
Thank you guys!

Rotation works and keeps 10 files.
Working file never exceeds 10 MB.
Code:
root@fw:/var/log # ls -alh flowd.log*
-rw-------  1 root  wheel   5.1M Sep 20 11:39 flowd.log
-rw-------  1 root  wheel    11M Sep 20 11:36 flowd.log.000001
-rw-------  1 root  wheel    11M Sep 20 11:30 flowd.log.000002
-rw-------  1 root  wheel    11M Sep 20 11:24 flowd.log.000003
-rw-------  1 root  wheel    12M Sep 20 11:18 flowd.log.000004
-rw-------  1 root  wheel    12M Sep 20 11:12 flowd.log.000005
-rw-------  1 root  wheel    11M Sep 20 11:06 flowd.log.000006
-rw-------  1 root  wheel    11M Sep 20 11:01 flowd.log.000007
-rw-------  1 root  wheel    11M Sep 20 10:55 flowd.log.000008
-rw-------  1 root  wheel    11M Sep 20 10:49 flowd.log.000009
-rw-------  1 root  wheel    11M Sep 20 10:43 flowd.log.000010
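
With 10 rotated files of about 11 to 12 MB each plus a working file of up to roughly 10 MB, the steady-state footprint should stay around 130 MB at most. A sketch for verifying that by summing the files; the function name flowd_log_bytes is hypothetical.

```shell
# Hypothetical helper: print the combined size in bytes of flowd.log and
# its rotated copies in the given directory (default: /var/log).
flowd_log_bytes() {
    logdir=${1:-/var/log}
    total=0
    for f in "$logdir"/flowd.log*; do
        # Skip the unexpanded pattern when no file matches.
        if [ -f "$f" ]; then
            total=$(( total + $(wc -c < "$f") ))
        fi
    done
    echo "$total"
}

# Example usage: print the total in whole megabytes.
# echo "$(( $(flowd_log_bytes) / 1024 / 1024 )) MB"
```

A result far above the expected range would hint that rotation has stopped again.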

Best,
Silverstar
Title: Re: [solved] NetFlow disk usage
Post by: Poof on October 02, 2016, 07:47:39 am
It looks like the crashes are back. I implemented the patch referenced here but I have another 16GB+ flowd.log file. :( No logs in the system log either.
Title: Re: [solved] NetFlow disk usage
Post by: Poof on November 04, 2016, 01:12:52 am
This issue is still happening.

OPNsense 16.7.7-amd64
FreeBSD 10.3-RELEASE-p11
OpenSSL 1.0.2j 26 Sep 2016

root@vpn:/var/log # ps wwwaux | grep -i flow
root   73815  37.5  0.5 122160 32864  -  Ds   Mon05PM  1796:05.42 /usr/local/bin/python2.7 /usr/local/opnsense/scripts/netflow/flowd_aggregate.py
_flowd 85758   0.2  0.0  12360  2412  -  Ds   Mon05PM    13:34.98 flowd: net (flowd)
root   85525   0.0  0.0  12360  1560  -  Is   Mon05PM     0:00.00 flowd: monitor (flowd)
root   82449   0.0  0.0  18752  2196  0  S+    5:06PM     0:00.00 grep -i flow
root@vpn:/var/log # ls -alh flow*
-rw-------  1 root  wheel    33G Nov  3 17:06 flowd.log
root@vpn:/var/log #
Title: Re: [solved] NetFlow disk usage
Post by: nibblerrick on November 05, 2016, 05:17:54 pm
I think I've run into the same problem here. I updated to the current version last week, and now it seems that OPNsense crashes from time to time.
I have a 1.3G flowd.log; up to Sep. 29 the logs are 11 MB each. flowd_aggregate seems to constantly eat up one entire core of the server.
I've disabled NetFlow now and everything seems back to normal.
Title: Re: [solved] NetFlow disk usage
Post by: Silverstar on November 05, 2016, 07:33:42 pm
I never ran into that issue again.
I updated to 16.7.7 today and had to reset the NetFlow data to get Insight working again.
For me flowd.log never exceeds 12 MB and there are 10 flowd.log.0000xx files kept in rotation.
Each of them is about 12 MB in size.
All I've noticed, since /usr/local/opnsense/scripts/netflow/flowd_aggregate.py now runs constantly, is the high CPU consumption of that script...
Title: Re: [solved] NetFlow disk usage
Post by: nibblerrick on November 06, 2016, 12:52:12 pm
Then I'll reset the flowd data and try again. I think I'll see within a day whether it goes haywire or runs smoothly. Thanks.
Title: Re: [solved] NetFlow disk usage
Post by: nibblerrick on November 09, 2016, 07:45:28 pm
After deleting the big flowd.log and starting the service again with the current version, everything keeps running smoothly!