fq_codel console flood

Started by doktornotor, February 23, 2024, 05:30:51 PM

This serial console flood seems to be a new feature coming with 24.1.x... (Happens with the FQ-CoDel shaper and heavy uploads - backups to Wasabi object storage).


fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 258
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 125
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 125
fq_codel_enqueue over limit
fq_codel_enqueue maxidx = 125
fq_codel_enqueue over limit

February 23, 2024, 05:32:51 PM #1 Last Edit: February 23, 2024, 05:34:55 PM by Seimus
What do you mean by serial console?

fq_codel_enqueue usually appears when your queue is full; it's referring to the FQ-CoDel limit in the Shaper > Pipe, if I am not wrong (at least this is what I was seeing when playing with that setting).

It appeared in the log on previous versions as well.

Regards,
S.
Networking is love. You may hate it, but in the end, you always come back to it.

OPNSense HW
APU2D2 - deceased
N5105 - i226-V | Patriot 2x8G 3200 DDR4 | L 790 512G - VM HA(SOON)
N100   - i226-V | Crucial 16G  4800 DDR5 | S 980 500G - PROD

Serial console: RS-232. Null modem. Headless box. Cisco console cable. Whatever.

Yeah, the queue is probably full. I don't exactly need to have hundreds of these lines scrolling there. Any tips on how to mute the noise?

Ah, that's what you meant by serial.

Well, personally I haven't experienced it on the serial connection yet; I was rather seeing this in the OPNsense log. However, you can adjust the queue: if you are using the FQ-CoDel shaper, adjust your limit. Mine is set to the default (blank) and I still get the same performance, and a bufferbloat test shows A to A+.

Basically, try to tune the queue, as it looks like yours is too small to cope with the traffic you are trying to send through.

Regards,
S.

https://forum.opnsense.org/index.php?topic=31410.msg173387#msg173387

...comes directly from the kernel, so fine-tune it or live with it...
kind regards
chemlud
____
"The price of reliability is the pursuit of the utmost simplicity."
C.A.R. Hoare

felix eichhorns premium katzenfutter mit der extraportion energie

A router is not a switch - A router is not a switch - A router is not a switch - A rou....

Quote from: chemlud on February 23, 2024, 05:45:52 PM
https://forum.opnsense.org/index.php?topic=31410.msg173387#msg173387

...comes directly from the kernel, so fine-tune it or live with it...

Well, I don't see anything worth tuning; the shaper works just fine. I just did not see this noise before, and nothing changed regarding the shaper config or the WAN backup jobs' traffic volume/bandwidth.

Filing an upstream bug does not really seem worth the effort, given my previous experience with that. I never understood the thinking behind these "wheee, let's spit this random super-important noise to the screen" ideas.

Cheers.

No offence intended, maybe this helps

https://forums.freebsd.org/threads/is-there-a-way-to-suppress-the-screen-output-of-alert-messages.79152/

but no idea if this works in OPNsense at all... ;-)
kind regards
chemlud

That's for the stock FreeBSD syslogd, but it could also be done with syslog-ng... probably even in a much more granular way.
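
As a rough sketch (the source name below is a placeholder, not the identifier from OPNsense's generated syslog-ng config), a filter that matches these kernel messages and discards them could look like this:

    # Hypothetical snippet - adjust the source name to the real config.
    filter f_fqcodel_noise {
        message("fq_codel_enqueue");
    };

    log {
        source(s_kernel);           # placeholder for the kernel log source
        filter(f_fqcodel_noise);
        flags(final);               # no destination: match, drop, stop here
    };

Though if the kernel is printing straight to the serial console, syslog-ng would only quiet the log files, not the console itself.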

I've got some anecdotal evidence on my system for a similar issue; I think the code displayed was different.

I had this happen only once, after the upgrade from 23 to release 24.
What I did was revisit some of my shaper settings; in particular, under pipes/advanced I lowered my FQ-CoDel quantum from 2700 to 2400 for the download pipe. The upload pipe I did not change; it has the default value for that setting.
A bufferbloat benchmark with the 2700 setting behaved strangely performance-wise on release 24 (it was good on release 23), and after lowering the number the benchmark results were much better.
Since then I have not seen the flood of messages again.

I use similar quantum values as xPliZit_xs. My testing is usually between 1500 and 2800, and my sweet spot is around 2400-2600.

This is tested just now with quantum 2400 and the limit at default:
https://www.waveform.com/tools/bufferbloat?test-id=52c440c7-0976-4889-9891-862b4f03f448

Also, as mentioned, I don't think quantum is causing those logs; "enqueue" is related to the queue size, which is set by the limit.

https://man.freebsd.org/cgi/man.cgi?query=ipfw&sektion=8&format=html

       quantum m
              specifies the quantum (credit) of the scheduler. m is the
              number of bytes a queue can serve before being moved to the
              tail of the old queues list. The default is 1514 bytes, and
              the maximum acceptable value is 9000 bytes.

       limit m
              specifies the hard size limit (in units of packets) of all
              queues managed by an instance of the scheduler. The default
              value of m is 10240 packets, and the maximum acceptable
              value is 20480 packets.
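
For reference, a minimal sketch of how those two knobs sit on a raw ipfw/dummynet scheduler (the bandwidth value here is made up, and on OPNsense the equivalent is generated from the Shaper GUI rather than typed by hand):

    # Illustration only - values are placeholders, not a recommended config.
    ipfw pipe 1 config bw 500Mbit/s
    ipfw sched 1 config pipe 1 type fq_codel quantum 1514 limit 10240
    ipfw queue 1 config sched 1    # attach a flowset to the fq_codel scheduler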


The funny thing is, in most if not all of the guides everybody says that quantum & limit should be set alike, but in reality many people leave limit at the default; as you can see, the default limit is quite large. If it is set too low you will get the "enqueue" messages. Also, the "300 per 100M" rule is not fully true; it's a nice reference, but it works best for small packets. That's why many people have reported bad bufferbloat and performance using this rule.

Basically, once you fill up the queue you will TAIL DROP. If you have a good enough internet connection and your ISP is able to keep up with the traffic on your line, the default limit and a tuned quantum can handle it.


@doktornotor
Personally, I do not know if you can get rid of this log. Maybe Franco or somebody on the BSD forum would know. But keep in mind that if you get those logs, you are oversubscribing your queue, which results in tail drops.

Regards,
S.




I have no idea why you think a quantum of 2400 is good. (I am one of the authors of the fq_codel algorithm).

I started a project recently to try to collect all the complaints about fq_codel and shaping and see whether they point to a bug in the implementation or not. It was an outgrowth of the Reddit thread here:
https://www.reddit.com/r/opnsense/comments/18z68ec/anyone_who_hasnt_configured_codel_limiters_do_it/

So far the OpenBSD implementation is proving out OK, but I have not had the time, money or chops to get someone to rigorously test the FreeBSD (OPNsense) version enough to prove to me that it is indeed implemented correctly. Anyone out there capable of running some serious flent tests? I outlined basic tests to the OpenBSD person on the case here: https://marc.info/?l=openbsd-tech&m=170782792023898&w=2
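
For anyone willing to try, a typical starting point (the server name here is just an example - pick a netserver near you) would be the RRUL test:

    # Example flent run; adjust host, length and title to taste.
    flent rrul -p all_scaled -l 60 -H netperf-eu.bufferbloat.net \
        -t "opnsense-24.1-fq_codel" -o rrul.png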

I am perversely happy that a probable source of speed complaints about it is that it does overly aggressive logging under stress, as reported in this bug here today.

The FreeBSD bug is being tracked here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276890

The quantum should be set to the MTU, or about 300 at lower rates if you can spare the CPU.

"Basically once you fill out the queue you will TAIL DROP. If you have a good enough internet connection and your ISP is able to keep up with the traffic on your line the default set limit and tuned quantum can handle it."

While I have not looked deeply at the fq_codel code for FreeBSD yet, the Linux, ns-3 and ns-2 implementations all seek out the biggest queue and HEAD drop from that. Now, depending on how pf is implemented, perhaps it is dropping elsewhere?

Head dropping from the fattest queue is a big feature; however, I can think of numerous ways that it could be done incorrectly - for example, Linux fq_codel increments the count variable on every overflow so as to hand the AQM more bite against that flow.
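
As a rough illustration (a simplified sketch in C, not the actual Linux or FreeBSD source), the overflow behaviour described above boils down to something like this:

    /* Sketch of "drop from the head of the fattest flow on overflow".
     * Packet allocation/freeing is simplified for the example. */
    #include <stdlib.h>

    struct pkt { struct pkt *next; };

    struct flow {
        struct pkt *head, *tail;   /* per-flow FIFO */
        int backlog;               /* packets queued in this flow */
        int codel_count;           /* CoDel drop count for this flow */
    };

    static void drop_on_overflow(struct flow *flows, int nflows)
    {
        struct flow *fat = &flows[0];
        for (int i = 1; i < nflows; i++)        /* seek out the biggest queue */
            if (flows[i].backlog > fat->backlog)
                fat = &flows[i];

        struct pkt *victim = fat->head;         /* HEAD drop, not tail drop */
        if (victim == NULL)
            return;
        fat->head = victim->next;
        if (fat->head == NULL)
            fat->tail = NULL;
        fat->backlog--;
        fat->codel_count++;                     /* give the AQM more bite on this flow */
        free(victim);
    }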

Quote from: dtaht on February 24, 2024, 06:07:13 PM
I am perversely happy that a probable source of speed complaints about it is that it does overly aggressive logging under stress, as reported in this bug here today.

The FreeBSD bug is being tracked here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276890

;D :D As noted. I'm pretty happy with the results. Except for the log noise.

Thanks for the link.

Quote from: dtaht on February 24, 2024, 06:16:37 PM
The quantum should be set to the MTU, or about 300 at lower rates if you can spare the CPU.

"Basically once you fill out the queue you will TAIL DROP. If you have a good enough internet connection and your ISP is able to keep up with the traffic on your line the default set limit and tuned quantum can handle it."

While I have not looked deeply at the fq_codel code for FreeBSD yet, the Linux, ns-3 and ns-2 implementations all seek out the biggest queue and HEAD drop from that. Now, depending on how pf is implemented, perhaps it is dropping elsewhere?

Head dropping from the fattest queue is a big feature; however, I can think of numerous ways that it could be done incorrectly - for example, Linux fq_codel increments the count variable on every overflow so as to hand the AQM more bite against that flow.

Hi dtaht,
thank you for the input, all the more since you are actually one of the people behind fq_codel. I am likewise not fully sure from which queue the drop would happen here; I don't have insight into the BSD/pf implementation. But what you are saying makes sense to me.

In regards to quantum: yes, it also makes sense to have it set to the MTU value; that's why the default is 1514 B, at least that was my own understanding and reasoning for why the default value is set like this. The reason I, as well as many other users, mentioned a quantum of 2400 is that we may not always see a good or "expected" outcome with it left at the default. I usually run the shaper with fq_codel either with default values or with the quantum set to around 2400. For me the default values tend to work well, but when "tuning" the quantum, meaning setting it beyond the MTU value, I can see a slightly better performance outcome.

performance for me = throughput/latency/jitter

However, there is a catch: all these tuning tests are usually run against bufferbloat testing sites like dslreports.com or waveform.com, and it is questionable how precise such testing can be.

That said, at least in my case I can leave fq_codel set to defaults for all the parameters on my 500/30 line and have a very pleasant experience. I just like to tune & squeeze to get the most out of a feature that I can :)

Regards,
S.


The only time I see "weird behaviour" (not sure what to call it) is when my WAN connection is fully loaded for a longer time, such as downloading a 100 GB game from Steam; in this case I tend to see latency increase on the probes in my monitoring system that track destinations on the internet.

Regards,
S.