OPNsense Forum

Archive => 23.1 Legacy Series => Topic started by: bob9744 on January 31, 2023, 07:02:15 PM

Title: Unbound logging/cpu usage
Post by: bob9744 on January 31, 2023, 07:02:15 PM
Hi all,

Just a quick observation - I've had unbound logging reporting (like the graphs and table btw! Table in particular helped me track down a config issue) running, and it seems like, over time (around 36 hours now) it's grabbing more of the cpu more frequently (python 3.9 process).

Is this expected as the dataset grows? Is this going to become more noticeable, or (36 hours on) have I hit a steady state?

Thanks!
Title: Re: Unbound logging/cpu usage
Post by: L00pback0 on January 31, 2023, 08:49:17 PM
I've noticed a steady increase in CPU after enabling the unbound reports as well.
Title: Re: Unbound logging/cpu usage
Post by: L00pback0 on January 31, 2023, 08:57:56 PM
Attached a screenshot showing the usage climbing over the last 24hrs.
Title: Re: Unbound logging/cpu usage
Post by: craig on January 31, 2023, 10:11:50 PM
I have also seen quite a steady increase in CPU usage with htop showing the logger as the culprit
Title: Re: Unbound logging/cpu usage
Post by: tuto2 on February 01, 2023, 08:34:12 AM
Hi all,

Thanks for the observations. I'm aware of this and investigating if some optimizations can be applied given the constraints of the environment. Keep an eye on the next minor release.

You can always opt-out by switching off the collection of statistics.

Quote from: bob9744 on January 31, 2023, 07:02:15 PM
Is this expected as the dataset grows? Is this going to become more noticeable, or (36 hours on) have I hit a steady state?

Well, if you'd like to keep it running for a while longer you can deduce this for yourself and maybe share it here.

Cheers,
Stephan
Title: Re: Unbound logging/cpu usage
Post by: franco on February 01, 2023, 09:01:29 AM
> You can always opt-out by switching off the collection of statistics.

That's a strange thing to say for an opt-in service :D

But what Stephan means here is that reporting facilities like NetFlow/Insight and now Unbound DNS statistics require more computation time which might require more capable hardware so switching these on should only be done with that constraint in mind.


Cheers,
Franco
Title: Re: Unbound logging/cpu usage
Post by: tuto2 on February 01, 2023, 09:45:49 AM
If you'd like to provide feedback you can run

# opnsense-patch 44e9dc25b

And see if it contributes to an overall reduction.
Title: Re: Unbound logging/cpu usage
Post by: Fright on February 01, 2023, 09:56:55 AM
@tuto2
Hi!
I'm sorry, I'm not good at databases, but it seems to me that the process of cleanup is not optimized? Could this be the reason?
IMHO the index may not help here because of the data conversion (the time column is defined as an integer and the time data is converted during cleanup).
and is conversion even necessary if we compare two epoch values?
thanks  :)
Title: Re: Unbound logging/cpu usage
Post by: tuto2 on February 01, 2023, 10:19:36 AM
Quote from: Fright on February 01, 2023, 09:56:55 AM
@tuto2
Hi!

Hey there :)

Quote from: Fright on February 01, 2023, 09:56:55 AM
but it seems to me that the process of cleanup is not optimized? Could this be the reason?

If you're referring to truncating the database, this is insignificant (measurement show a mean of 0.009% of total processing time) compared to the biggest limiting factor, which is the fact that a connect() call on the database increases linearly with the size of the actual db.

Quote from: Fright on February 01, 2023, 09:56:55 AM
IMHO the index may not help here because of the data conversion (the time column is defined as an integer and the time data is converted during cleanup).

Removing index shows no increase in performance, and may adversely impact query performance on the reporting side.


Quote from: Fright on February 01, 2023, 09:56:55 AM
and is conversion even necessary if we compare two epoch values?

What conversion are we talking about here?


The problem lies in the fact that we cannot hold open a connection due to write concurrency limitations, I'm not sure we can improve a whole lot there. The most obvious improvement would be to plateau earlier by reducing the amount of data kept.
Title: Re: Unbound logging/cpu usage
Post by: Fright on February 01, 2023, 12:48:27 PM
thanks for answers! (and sorry for the delay)
QuoteRemoving index shows no increase in performance
I was not clear, probably: I'm just not sure that the idx_query index is actually used when executing the DELETE statement, although in theory it should be faster using the index.
QuoteWhat conversion are we talking about here?
"to_timestamp(time) < to_timestamp(?) "
this line made me question (for some reason it seems to me that the to_timestamp function will prevent using the index)?
Quotebiggest limiting factor, which is the fact that a connect() call on the database increases linearly with the size of the actual db.
understood thanks!
I'm growing the db (for now it's too small for the changes to be noticeable in runtime) and will try to play around )
Title: Re: Unbound logging/cpu usage
Post by: bob9744 on February 01, 2023, 01:32:22 PM
Quote from: franco on February 01, 2023, 09:01:29 AM
But what Stephan means here is that reporting facilities like NetFlow/Insight and now Unbound DNS statistics require more computation time which might require more capable hardware so switching these on should only be done with that constraint in mind.

Cheers,
Franco

So, I'm running with a J6412 and 8gb of ram - it's hard to tell whether those cpu spikes actually adversely impact anything important - given the nature of traffic on my network, I'm betting not. That said, would it ease cpu usage to farm the logs out to a different machine?

Thanks!
Title: Re: Unbound logging/cpu usage
Post by: franco on February 01, 2023, 01:50:55 PM
Don't think export is in the current scope as you would also have to add import facilities on another node or connect to a remote database. The way this is now is to provide the shortest path as a standard builtin.


Cheers,
Franco
Title: Re: Unbound logging/cpu usage
Post by: Fright on February 02, 2023, 11:35:57 AM
@tuto2
Hi!
So db has grown  :D. 50 MB and over 1.2 million entries.
I compared (with the profiling) old record cleanup operations with the type conversion and with the direct epoch comparison.
Result in attachment.
So imho in this part it is still possible to speed up something?
Regarding the connect() time (thanks for the hint. I read the docs and didn't get excited  ::) ), the only thing that comes to mind is to make the "timedelta" configurable (so that we can reduce the size of the data for realy high-load servers)?

ps. it was also noticed that the size of the database file is not shrinked after deletion, and I did not find such an option in dukdb. only export/import remains?
Title: Re: Unbound logging/cpu usage
Post by: tuto2 on February 02, 2023, 01:03:46 PM
Quote from: Fright on February 02, 2023, 11:35:57 AM
So imho in this part it is still possible to speed up something?

Thanks for testing! Seems there is something to be gained there indeed. I'll include it as part of the deconstruction process :D

Quote from: Fright on February 02, 2023, 11:35:57 AM
Regarding the connect() time (thanks for the hint. I read the docs and didn't get excited  ::) ), the only thing that comes to mind is to make the "timedelta" configurable (so that we can reduce the size of the data for realy high-load servers)?

That or reducing it statically are both good options, but as a general rule of thumb I'd like to prevent adding more toggles and dials for now.

Quote from: Fright on February 02, 2023, 11:35:57 AM
ps. it was also noticed that the size of the database file is not shrinked after deletion, and I did not find such an option in dukdb. only export/import remains?

Not sure how this plateaus as time progresses, maybe some more observation time is needed there. Locally I have some ~3.4m entries (yes, a device was misbehaving) and a db size of ~124mb which is about a month old (4th of january). So this seems to fall within expectations when compared to your size.
Title: Re: Unbound logging/cpu usage
Post by: Fright on February 02, 2023, 08:15:19 PM
@tuto2
thanks!
QuoteSeems there is something to be gained there indeed.
Looks like the index is gone now ) but on the other hand, it looks like the efficiency of the index was purely theoretical - after dropping the index, I don't see any spikes that I tried to associate with its non-use (and it looks like I fooled myself a little - there was a notebook with an open dashboard and a lot of widgets with constantly updated data). so I apologize for the noise about the indexes  :-[
Quotemaybe some more observation time is needed there
I'll keep watching but it doesn't look good so far: I set the timedelta to 4 hours. now there are 550 thousand records in the table, but the file continues to grow (56MB now)
Title: Re: Unbound logging/cpu usage
Post by: y0y0y0 on February 14, 2023, 08:23:26 AM
Hello

where do i find this database?
if it is not in a ramfs can i move it that directory?

regards /y
Title: Re: Unbound logging/cpu usage
Post by: Moonshine on February 14, 2023, 04:34:13 PM
FWIW, I ran into rising CPU with Unbound stats enabled as well.  They eventually leveled out, but I ended up turning them off since it was substantial.  Not sure if the patch mentioned is worth testing or not?



Title: Re: Unbound logging/cpu usage
Post by: franco on February 14, 2023, 05:54:45 PM
23.1.1 will address this tomorrow.


Cheers,
Franco
Title: Re: Unbound logging/cpu usage
Post by: bob9744 on February 15, 2023, 12:10:48 AM
Quote from: franco on February 14, 2023, 05:54:45 PM
23.1.1 will address this tomorrow.


Cheers,
Franco

Awesome - thanks!
Title: Re: Unbound logging/cpu usage
Post by: bob9744 on February 15, 2023, 08:40:12 PM
Btw, looks like things are performing better now - reporting is working like a champ without the router feeling like it's slowing down - thanks again!