OPNsense Forum
Archive => 16.7 Legacy Series => Topic started by: Silverstar on August 29, 2016, 11:02:35 am
-
Hi folks,
It seems that NetFlow uses unlimited disk space... or am I missing a way to stop it from filling my disk up to the limit?
Best,
Silverstar
-
Sorry for bumping this, but the disk usage keeps growing... Is the only option really to reset the NetFlow data manually, with no rotation or space limit?
Thx,
Silverstar
-
Hi there,
The solution is to do it manually for now, correct. It's under Reporting: Settings: "Reset Netflow Data".
I think there is rotation, but no retention policy:
% ls -lah /var/log/flowd.log*
-rw------- 1 root wheel 5.1M Sep 1 09:38 /var/log/flowd.log
-rw------- 1 root wheel 11M Aug 31 21:51 /var/log/flowd.log.000001
It would help to know your system's parameters, available size, your current log size and your expected capacity limitations to see how to address this issue concretely in a feature request. :)
Cheers,
Franco
-
Hi Franco,
Today I deleted a 14G flowd.log...
The numbered files are all 11M, but the main file grows without limit...
root@fw:~ # du -hs /var/log/*
544K /var/log/dhcpd.log
4.0K /var/log/dmesg.today
4.0K /var/log/dmesg.yesterday
544K /var/log/filter.log
14G /var/log/flowd.log
11M /var/log/flowd.log.000001
11M /var/log/flowd.log.000002
11M /var/log/flowd.log.000003
[...]
My system is running on a 42G SSD.
It would be nice if we could configure rotation in the config section (GUI) instead of resetting all the data from time to time.
Best,
Silverstar
-
Hi Silverstar,
It looks like flowd_aggregate isn't running on your end. The flowd.log file is used as a staging area for Insight, and if the aggregation process isn't running, the file doesn't get rotated either (normally it rotates roughly every 10 MB).
With the following command you can check the status of the service:
service flowd_aggregate status
The latest version of flowd_aggregate has an automatic repair option to recover after a crash, which should prevent this in the future.
The best thing to do now is probably to remove flowd.log* and restart flowd and the aggregator:
service flowd_aggregate stop
service flowd restart
service flowd_aggregate start
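A stalled aggregator shows up as a flowd.log that keeps growing far past the usual rotation size, so a small check like the one below could flag it before the disk fills. This is only a sketch: the /var/log/flowd.log path and the roughly 10 MB rotation size come from this thread, while the function name and the slack factor are made up for illustration and are not OPNsense settings.

```python
import os

# Rotation size taken from this thread (about 10 MB).
ROTATE_BYTES = 10 * 1024 * 1024
# Hypothetical safety margin so a file that is just about to rotate is not flagged.
SLACK_FACTOR = 5


def staging_log_stalled(path="/var/log/flowd.log", limit=ROTATE_BYTES * SLACK_FACTOR):
    """Return True if the staging log exists and is larger than `limit` bytes."""
    try:
        return os.path.getsize(path) > limit
    except OSError:
        # A missing or unreadable file gives us nothing to flag.
        return False
```

Run from cron or by hand, a True result would be the cue to look at service flowd_aggregate status.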
Best regards,
Ad
-
Just giving a +1. flowd_aggregate crashes on my instance after ~3 hours. I just checked and found myself with a 3.6 GB flowd.log file. :/ It has filled my disk a few times, so frustrating!
-
Hi Ad,
I've done as you suggested, deleted the flowd.log* files from /var/log/, stopped flowd_aggregate, restarted flowd and started flowd_aggregate again.
But after a few seconds to a minute, flowd_aggregate is no longer running according to service flowd_aggregate status :(
It seems something is broken here!
Best,
Silverstar
-
Hi Silverstar,
Can you try to run flowd_aggregate manually?
service flowd_aggregate stop
/usr/local/opnsense/scripts/netflow/flowd_aggregate.py console
Then wait for it to exit (it should take about the same amount of time) and post the output here, including any messages in /var/log/system.log:
clog /var/log/system.log
The latest version of our software should try to run an automatic repair on the sqlite files it's using, so maybe you're experiencing something completely different here.
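To see whether one of those sqlite files is damaged in the first place, SQLite's built-in PRAGMA integrity_check can be run against it. The sketch below is not the repair code from flowd_aggregate, just a standalone check; the function name is made up, and the path to the database file has to be supplied by you, since this thread does not say where the files live. It is probably wise to stop flowd_aggregate before running it.

```python
import sqlite3


def sqlite_file_is_healthy(db_path):
    """Return True if SQLite's own integrity check reports 'ok' for db_path."""
    try:
        # Note: connecting to a path that does not exist creates an empty
        # database, so pass the path of an existing file.
        conn = sqlite3.connect(db_path)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        # Raised for errors such as "database disk image is malformed"
        # and for files that are not SQLite databases at all.
        return False
    return rows == [("ok",)]
```

On a large database the check can take a while, because it walks the whole file.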
Just to be sure, you are using OPNsense 16.7.3 ?
Best regards,
Ad
-
The database repair code is not in 16.7.3; it will be available in 16.7.4 with the patch below. You can, however, install it now by running the following command in the console:
# opnsense-patch 2bcdb42
https://github.com/opnsense/core/commit/2bcdb42
Maybe this will help.
Cheers,
Franco
-
Hi Ad,
hi Franco,
I did as you suggested.
The script ran for about 15 minutes.
It didn't print anything on the command line.
Here are the lines from clog /var/log/system.log from around the time the script was running:
Sep 19 18:47:13 fw flowd_aggregate.py: flowd aggregate died with message Traceback (most recent call last): File "/usr/local/opnsense/scripts/netflow/flowd_aggregate.py", line 145, in run aggregate_flowd(do_vacuum) File "/usr/local/opnsense/scripts/netflow/flowd_aggregate.py", line 85, in aggregate_flowd stream_agg_object.cleanup(do_vacuum) File "/usr/local/opnsense/scripts/netflow/lib/aggregate.py", line 277, in cleanup self._update_cur.execute('delete from timeserie where mtime < :expire', {'expire': expire_timestamp}) DatabaseError: database disk image is malformed
Sep 19 19:00:47 fw sshd[27366]: Accepted keyboard-interactive/pam for root from 10.0.220.6 port 58747 ssh2
Sep 19 19:01:29 fw opnsense: /index.php: Successful login for user 'root' from: 10.0.220.6
Sep 19 19:01:30 fw configd.py: [77c5f0d8-d529-41ff-984f-7557f8675e5a] IPsec list ip address pools
Sep 19 19:01:30 fw configd.py: [e6dd1c66-804c-4ae5-88d3-a3cef8923bad] IPsec list status
Sep 19 19:01:48 fw configd.py: [6a6cffb1-080c-4548-b0f3-dff2d05f9e29] IPsec list ip address pools
Sep 19 19:01:48 fw configd.py: [e2736d1d-46e4-4a1d-8241-545efce8fe43] IPsec list status
Sep 19 19:02:00 fw configd.py: [459569e5-d465-4d9c-bbdf-fa8368b01b9a] IPsec list ip address pools
Sep 19 19:02:00 fw configd.py: [4b5aa42b-8674-4223-8dc2-c2b2296d1a14] IPsec list status
Sep 19 19:02:16 fw configd.py: [85324443-fcf2-462f-a97f-d9b17745bfd6] IPsec list ip address pools
Sep 19 19:02:16 fw configd.py: [bfaaefbb-8b08-4852-987f-c0f365880cd4] IPsec list status
Sep 19 19:02:43 fw configd.py: [63b45754-fd76-4591-97b6-5f5186ac8274] show system activity
Sep 19 19:03:30 fw configd.py: [c0f49f71-c15f-472f-a9b3-298b0f728057] show system activity
Sep 19 19:03:46 fw configd.py: [7c52b133-a3c3-414f-8689-10341fbd9e0c] IPsec list ip address pools
Sep 19 19:03:46 fw configd.py: [164c2aef-31ba-4027-9ac3-dd620ed8428e] IPsec list status
Sep 19 19:04:17 fw configd.py: [8893657c-2c7e-439b-ad75-6ea64422b0ba] show system activity
Sep 19 19:04:26 fw configd.py: [0e6a1c96-63fa-4cc2-b05c-6c21074e5880] show system activity
Sep 19 19:04:38 fw configd.py: [b8cff5d9-4feb-4a83-8740-f8ba7a59e2a2] show system activity
Sep 19 19:04:40 fw configd.py: [e13586a8-0759-4056-8ae7-c3ffb3aaa968] show system activity
Sep 19 19:04:54 fw configd.py: [a5496d29-1f1f-4cca-a45f-51c66577f11a] IPsec list ip address pools
Sep 19 19:04:54 fw configd.py: [49150607-f6be-4e57-aa27-b8c2012fb27e] IPsec list status
Sep 19 19:04:59 fw kernel: A,4282023658,2141745024,65535,,
Sep 19 19:05:03 fw configd.py: [07711743-c7e3-4e83-a6a9-5c20ba109e17] show system activity
Sep 19 19:05:11 fw configd.py: [c9ddda98-20de-42c7-bcf9-dcb281f0d66b] show system activity
Sep 19 19:05:20 fw configd.py: [c02e0b17-bf6d-4df7-b95d-bd3b95641c70] show system activity
Sep 19 19:06:32 fw configd.py: [0bf487aa-95d3-41f5-a9f1-242644e758be] show system activity
Sep 19 19:08:06 fw configd.py: [ad232804-d3f1-4f09-8765-38e109e23ea1] show system activity
Sep 19 19:08:32 fw configd.py: [5d5657d4-9698-424b-b2b7-43174b248eb3] show system activity
Sep 19 19:09:55 fw configd.py: [638a3cd1-291c-4d47-89bf-e2a0d001065c] show system activity
Sep 19 19:10:38 fw configd.py: [d8f4c6aa-0429-4c6b-ba20-e48800ab4dbb] show system activity
Sep 19 19:10:49 fw configd.py: [b56d4d76-5b78-4eb8-b4a7-b6663e488d3a] show system activity
Sep 19 19:10:55 fw configd.py: [586829ad-69a3-469f-89ac-9b0dbe7de498] show system activity
Sep 19 19:11:14 fw configd.py: [1e0a5b66-b48c-4eab-88f4-b4c118d5d7e8] show system activity
Sep 19 19:11:20 fw configd.py: [4c3b2b8f-63d8-4a8f-b40d-c53e3d5f3ca4] show system activity
Sep 19 19:11:53 fw configd.py: [e826f334-cec2-42e0-8b3b-b2359ccaa5d7] show system activity
Sep 19 19:12:14 fw configd.py: [d835988a-5a82-492d-a79c-996d5ade3f59] show system activity
Sep 19 19:12:20 fw configd.py: [4c8b40f2-c5c9-4f0c-b38a-174b616bbfd2] show system activity
Sep 19 19:12:24 fw configd.py: [bfa1c6e4-cc41-43dd-a3cf-077a880fe732] show system activity
Sep 19 19:13:02 fw configd.py: [e97ad08c-d85a-4cdf-98d6-e7eb2096842c] show system activity
Sep 19 19:13:31 fw configd.py: [6ce5d6c5-5b04-4086-8aa9-f36e25e41048] show system activity
Sep 19 19:14:12 fw configd.py: [c81fa60a-579f-4d62-bec3-d1cbf48147b3] show system activity
Sep 19 19:15:04 fw configd.py: [4c34ddf3-53d9-42eb-96d6-14a80f5b137a] show system activity
Sep 19 19:15:13 fw configd.py: [b1b1ee07-476a-4803-aaed-94dc5c761b76] show system activity
Sep 19 19:15:16 fw configd.py: [1a4c6ea8-2291-44b3-9b9c-7ed821ba7724] show system activity
Sep 19 19:15:20 fw configd.py: [dfa0ff61-8f80-429f-a869-07f5dbd650d1] show system activity
Sep 19 19:15:23 fw configd.py: [d47433f9-9030-43e4-be91-1a162db6212e] show system activity
Sep 19 19:15:27 fw configd.py: [a20e7c6f-ce5b-4965-8a25-27210c616320] show system activity
Sep 19 19:15:42 fw configd.py: [e85dee2d-98af-4fc3-9ed8-c6c81b74df33] show system activity
Sep 19 19:16:14 fw configd.py: [30be9ba0-9d89-442c-9491-70387982dbed] show system activity
Sep 19 19:16:23 fw configd.py: [5a666136-8741-412f-bb6d-176e231d09d0] show system activity
Sep 19 19:16:47 fw kernel: A,1209682972,961485202,65535,,
Sep 19 19:16:48 fw configd.py: [6a943954-07c6-407c-b346-bb7aaaef0afd] show system activity
Sep 19 19:16:55 fw configd.py: [73688291-5017-47e5-b3b1-7730d7c9af08] show system activity
Sep 19 19:16:58 fw configd.py: [8071150b-09ad-4b65-8266-bbca2d424b73] show system activity
Sep 19 19:17:23 fw configd.py: [20a979cc-3b3b-4570-8ad7-c524c2072bbd] show system activity
Sep 19 19:18:16 fw flowd_aggregate.py: flowd aggregate died with message Traceback (most recent call last): File "/usr/local/opnsense/scripts/netflow/flowd_aggregate.py", line 145, in run aggregate_flowd(do_vacuum) File "/usr/local/opnsense/scripts/netflow/flowd_aggregate.py", line 85, in aggregate_flowd stream_agg_object.cleanup(do_vacuum) File "/usr/local/opnsense/scripts/netflow/lib/aggregate.py", line 277, in cleanup self._update_cur.execute('delete from timeserie where mtime < :expire', {'expire': expire_timestamp}) DatabaseError: database disk image is malformed
Sep 19 19:18:27 fw configd.py: [0a32a78b-720c-4528-89a8-5cb7f9c9ef91] show system activity
Sep 19 19:18:42 fw configd.py: [d34ade08-63d2-4329-b463-591ddb55624a] IPsec list ip address pools
Sep 19 19:18:42 fw configd.py: [c692d1bb-95e2-43f0-88ed-3c97ed164b6b] IPsec list status
The system version is 16.7.2
Will take the update to 16.7.3 and the patch from Franco as the next steps and keep you guys posted...
Thanks,
Silverstar
-
Update & patch done.
service flowd_aggregate keeps running for now.
Produces log files at 11MB each.
I'll keep you posted on whether rotation kicks in...
Best,
Silverstar
-
Seems to be solved :)
Thank you guys!
Rotation works and keeps 10 files.
Working file never exceeds 10 MB.
root@fw:/var/log # ls -alh flowd.log*
-rw------- 1 root wheel 5.1M Sep 20 11:39 flowd.log
-rw------- 1 root wheel 11M Sep 20 11:36 flowd.log.000001
-rw------- 1 root wheel 11M Sep 20 11:30 flowd.log.000002
-rw------- 1 root wheel 11M Sep 20 11:24 flowd.log.000003
-rw------- 1 root wheel 12M Sep 20 11:18 flowd.log.000004
-rw------- 1 root wheel 12M Sep 20 11:12 flowd.log.000005
-rw------- 1 root wheel 11M Sep 20 11:06 flowd.log.000006
-rw------- 1 root wheel 11M Sep 20 11:01 flowd.log.000007
-rw------- 1 root wheel 11M Sep 20 10:55 flowd.log.000008
-rw------- 1 root wheel 11M Sep 20 10:49 flowd.log.000009
-rw------- 1 root wheel 11M Sep 20 10:43 flowd.log.000010
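With 10 rotated files of 11 to 12 MB each plus the working file, the steady-state footprint should stay somewhere around 130 MB. A rough way to confirm that on a running box is sketched below; the function name is invented for this example, and only the /var/log location and the flowd.log* naming come from the listing above.

```python
import glob
import os


def flowd_log_footprint(log_dir="/var/log"):
    """Return (rotated_file_count, total_bytes) for flowd.log and its rotations."""
    paths = glob.glob(os.path.join(log_dir, "flowd.log*"))
    # Everything except the working file itself counts as a rotated file.
    rotated = [p for p in paths if os.path.basename(p) != "flowd.log"]
    total = sum(os.path.getsize(p) for p in paths)
    return len(rotated), total
```

A total far above that estimate, or more rotated files than expected, would suggest that rotation or cleanup is not working.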
Best,
Silverstar
-
It looks like the crashes are back. I implemented the patch referenced here but I have another 16GB+ flowd.log file. :( No logs in the system log either.
-
This issue is still happening.
OPNsense 16.7.7-amd64
FreeBSD 10.3-RELEASE-p11
OpenSSL 1.0.2j 26 Sep 2016
root@vpn:/var/log # ps wwwaux | grep -i flow
root 73815 37.5 0.5 122160 32864 - Ds Mon05PM 1796:05.42 /usr/local/bin/python2.7 /usr/local/opnsense/scripts/netflow/flowd_aggregate.py
_flowd 85758 0.2 0.0 12360 2412 - Ds Mon05PM 13:34.98 flowd: net (flowd)
root 85525 0.0 0.0 12360 1560 - Is Mon05PM 0:00.00 flowd: monitor (flowd)
root 82449 0.0 0.0 18752 2196 0 S+ 5:06PM 0:00.00 grep -i flow
root@vpn:/var/log # ls -alh flow*
-rw------- 1 root wheel 33G Nov 3 17:06 flowd.log
root@vpn:/var/log #
-
I think I have run into the same problem here. I updated to the current version last week, and now it seems that OPNsense crashes from time to time.
I have a 1.3G flowd.log; up until Sep. 29 the logs were 11 MB each. flowd_aggregate seems to constantly eat up one full core of the server.
I have disabled NetFlow for now and everything seems to be back to normal.
-
I never ran into that issue again.
I updated to 16.7.7 today and had to reset the NetFlow data to get Insight working again.
For me, flowd.log never exceeds 12 MB, and 10 flowd.log.0000xx files are kept in rotation.
Each of them is about 12 MB in size.
The only thing I have noticed is that /usr/local/opnsense/scripts/netflow/flowd_aggregate.py, which now runs constantly, uses a lot of CPU...
-
Then I'll reset the flowd data and try again. I think I'll see within a day whether it goes crazy or runs smoothly. Thanks.
-
After deleting the big flowd.log and starting the service again with the current version, everything keeps running smoothly!