Suggestion for Bufferbloat fix. Fibre to the Home. No PPPoE.

Started by cookiemonster, December 01, 2025, 07:09:25 PM

Hi. Another "help with bufferbloat" thread.
I am currently on OPN 25.1.12-amd64 on a VM that has been running fine and has been updated a few major releases. All good.
At some point in the past, perhaps 2 years ago, I followed one of the threads here to get decent bufferbloat help. It worked fine and I got a B on the Waveform site, with only the "low latency gaming" measure getting a "!". That was good enough for me; I don't do gaming. I only need video (MS Teams / Zoom) to work reliably when needed.
My ISP package is fibre to the premise at 520 Down / 72 Up. Their speeds are normally consistent.

I had what seemed like some buffering last week and went to check my settings. I realised I perhaps needed to reconfigure, so I a) read a few recent posts (up to about 24 months old); b) checked the current docs. I admit I can't understand the current guidance in the "limit" note of the docs, or the reference to the bug.

I decided to set it up per docs and made note of what I had first.
Result: consistent C grades, including after reboots when changing the flows.

I went back to what I had and still mostly C, sometimes B.

So this is the background. Can someone make a suggestion what values to use?
These are the values I had before the change, but now the results _appear_ worse. And yes, it doesn't make a whole lot of sense, but I'm looking for another set of eyes in case I've stared at this too long.

Download pipe

Enabled X
Bandwidth 490
Bandwidth Metric Mbit/s
Queue
Mask (none)
Buckets
Scheduler type FlowQueue-CoDel
Enable CoDel
(FQ-)CoDel target
(FQ-)CoDel interval
(FQ-)CoDel ECN X
FQ-CoDel quantum
FQ-CoDel limit 20480
FQ-CoDel flows 8192
Enable PIE
Delay 1
Description Download pipe

 
Download queue

Enabled X
Pipe Download pipe
Weight 100
mask destination
Buckets
Enable CoDel
(FQ-)CoDel target
(FQ-)CoDel interval
(FQ-)CoDel ECN X
Enable PIE
Description Download queue


Download rule

Enabled X
Sequence 1
Interface WAN
Interface 2 None
Protocol ip
Max Packet Length
Source any
Invert source
Src-port any
Destination any
Invert destination
Dst-port any
DSCP Nothing selected
Direction in
Target Download queue
Description Download rule
 

The mask in the Download queue should be (none). Also, you should define the Upstream side of things as well.
Intel N100, 4* I226-V, 2* 82559, 16 GByte, 500 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

Is a downstream shaper (particularly a single queue) likely to have the effect you want? I used downstream shapers in the past, but my purpose was to control offered load by adding latency, using multiple queues on a CBQ shaper. I didn't bother after my link passed 10Mb; it did help at 6-10Mb.

I'd think a simple fair queue with no shaper would be the best option for you. I don't know the best way to accomplish that - perhaps open the pipe beyond 520Mb/s (toward single-station LAN speed). I haven't looked at the fq-codel implementation in... a while. The one I recall used a flow hash, and you could set the number of bits (up to 16, I believe). It looks like the ipfw implementation has that limit (65536). I'd think more can't hurt - fewer (potential) collisions. I wouldn't expect any negatives, but you never can tell. PIE just sounds like a RED implementation - I can't see that it'd have much if any effect, as I wouldn't expect your queue depths/times to reach discard levels.

Of course, you could have upstream issues, at any point in the path.

Quote from: meyergru on December 01, 2025, 07:28:42 PMThe mask in the Download queue should be (none). Also, you should define the Upstream side of things as well.
Yes, I tried with that removed, as per the docs. Still bad.
Anything else you can spot?
Edit: p.s. uploads seem very good in the bufferbloat tests but I can add them to the thread no problem. I wanted to keep it as tidy as possible.

Quote from: pfry on December 01, 2025, 08:18:47 PMIs a downstream shaper (particularly a single queue) likely to have the effect you want? I used downstream shapers in the past, but my purpose was to control offered load by adding latency, using multiple queues on a CBQ shaper. I didn't bother after my link passed 10Mb; it did help at 6-10Mb.

I'd think a simple fair queue with no shaper would be the best option for you. I don't know the best way to accomplish that - perhaps open the pipe beyond 520Mb/s (toward single-station LAN speed). I haven't looked at the fq-codel implementation in... a while. The one I recall used a flow hash, and you could set the number of bits (up to 16, I believe). It looks like the ipfw implementation has that limit (65536). I'd think more can't hurt - fewer (potential) collisions. I wouldn't expect any negatives, but you never can tell. PIE just sounds like a RED implementation - I can't see that it'd have much if any effect, as I wouldn't expect your queue depths/times to reach discard levels.

Of course, you could have upstream issues, at any point in the path.
You mean set it up as per the docs https://docs.opnsense.org/manual/how-tos/shaper_bufferbloat.html ?
But I can try to see if I follow the thinking and open the pipe beyond 520 Mbps, to see what happens. Thanks for the idea.
Going a little mad with this at the moment.

Thing is, I have decent (for me) 520 Mbps bandwidth. Normally I wouldn't bother with shaping, but I seem to get the odd bit of buffering now after this change I made. Frustratingly, it is not better, i.e. back to normal, after restoring the previous settings.

To make it factual, here are two test results I just ran:
BUFFERBLOAT GRADE
B

LATENCY
Unloaded 26 ms
Download Active +39 ms
Upload Active +0 ms

SPEED
↓ Download 259.5 Mbps
↑ Upload 66.9 Mbps

Second:
BUFFERBLOAT GRADE
B

Your latency increased moderately under load.

LATENCY
Unloaded 21 ms
Download Active +42 ms
Upload Active +0 ms

SPEED
↓ Download 262.4 Mbps
↑ Upload 66.8 Mbps
==
So it's giving me Bs at the moment. Is this a "good enough, leave it alone" result? Tomorrow it might give me Cs, though. I'll keep checking.

Cookie,

Looking at your original configuration in the very first post, it looks to be misaligned with the docs.

Please align the configuration exactly with the official documentation. It was tested on several different configurations (HW + WANs) and it's designed to provide a proper baseline with minimal configuration. That usually results in B or higher scores, provided you at least set the BW properly.

The main point of a properly configured FQ_C is to set the BW correctly and to have Pipes and Queues for both Download and Upload. The rest of the parameters should be used for fine tuning.

Quote from: cookiemonster on December 01, 2025, 07:09:25 PMI admit I can't understand the current way to use the "limit" note of the docs, the reference to the bug.
Prior to OPN 25.7.8 there was a BUG that caused CPU hogging due to excessive logging when the limit queue was exceeded. So the advice was to leave Limit blank. Franco did FIX this (well, at least on the OPN side). So it is now safe and beneficial to use the Limit queue; set it to 1000 for speeds under 10 Gbit/s.

I updated the docs as well; the PR was merged, and the site will show it once Ad recompiles the docs:
https://github.com/opnsense/docs/pull/811/files

-----------

Alright, let's dissect this:

Quote from: pfry on December 01, 2025, 08:18:47 PMI'd think a simple fair queue with no shaper would be the best option for you. I don't know the best way to accomplish that - perhaps open the pipe beyond 520Mb/s (toward single-station LAN speed).

Your QoS/shaping should be implemented on the interface where you want to control the bottleneck, i.e. as close as possible to the source of the bufferbloat. An FQ as such doesn't handle bufferbloat in any way; FQ only shares the BW equally amongst all the flows. To control bufferbloat you need an AQM (FQ_Codel, FQ_PIE) or an SQM (CAKE).
Another point: you should not set your Pipe to more than you actually have; this introduces issues. You cannot give out what you don't have, in our case BW. By setting the BW higher than you have, you end up back in bufferbloat land, latency goes haywire, and you hand control back to the ISP.
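As a rough illustration of why the pipe should sit below the line rate (the 5 % headroom figure below is a common rule of thumb, not from the docs; the rates are the ones from the first post):

```python
def shaper_bandwidth(line_rate_mbps: float, headroom: float = 0.05) -> float:
    """Suggest a pipe bandwidth a bit below the measured line rate.

    headroom: fraction of the line rate given up so that the shaper,
    not the upstream ISP device, owns the bottleneck queue.
    """
    return round(line_rate_mbps * (1.0 - headroom), 1)

# 520 Mbit/s down with ~5 % headroom lands close to the 490 used above
print(shaper_bandwidth(520))   # 494.0
print(shaper_bandwidth(72))    # 68.4
```

The exact percentage is a tuning knob; the point is only that the shaper must be the slowest hop, otherwise the queue builds up where you cannot manage it.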


Quote from: pfry on December 01, 2025, 08:18:47 PMI haven't looked at the fq-codel implementation in... a while. The one I recall used a flow hash, and you could set the number of bits (up to 16, I believe).
FQ_C creates internal flow queues per 5-tuple using a hash. Because of the stochastic nature of hashing, multiple flows may end up hashed into the same slot. This can be controlled by the flows parameter in FQ_C.
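A quick back-of-the-envelope sketch (plain birthday-problem math, not FQ_C's actual hash function) of why a larger flows value means fewer collisions:

```python
def expected_occupied(buckets: int, flows: int) -> float:
    """Expected number of distinct buckets hit when `flows` flows
    are hashed uniformly at random into `buckets` slots."""
    return buckets * (1 - (1 - 1 / buckets) ** flows)

def expected_collisions(buckets: int, flows: int) -> float:
    """Expected number of flows that share a bucket with an earlier flow."""
    return flows - expected_occupied(buckets, flows)

# 500 concurrent flows: collisions shrink as the flows parameter grows
for buckets in (1024, 8192, 65535):
    print(buckets, round(expected_collisions(buckets, 500), 1))
```

With 65535 buckets, 500 flows collide only a couple of times on average, which is why maxing out the flows parameter is cheap insurance.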

Quote from: pfry on December 01, 2025, 08:18:47 PMIt looks like the ipfw implementation has that limit (65536). I'd think more can't hurt - fewer (potential) collisions. I wouldn't expect any negatives, but you never can tell.
This is a very bad idea if we are talking about the limit parameter. Limit is effectively the queue size for the internal flows created by FQ_C. If you have a long queue but cannot process the packets in it in time, you create latency. Because FQ_C is an AQM, it measures the sojourn time of each packet in the queue and, once it is exceeded, either marks or drops the packet. But having too big a queue is still bad overall: we want to tail-drop packets when we cannot handle them, not store them.

For reference: the limit parameter maxes out at 20480, the flows parameter at 65535.
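To put numbers on why a huge limit hurts, here is a small sketch (assuming full 1500-byte packets and the 490 Mbit/s pipe from the first post): a full queue of `limit` packets takes limit × 1500 × 8 / bandwidth seconds to drain, so a freshly arrived packet can wait that long.

```python
def worst_case_queue_delay_ms(limit_pkts: int, bw_mbps: float,
                              pkt_bytes: int = 1500) -> float:
    """Time to drain a completely full queue of `limit_pkts`
    packets through a pipe shaped to `bw_mbps`."""
    bits_queued = limit_pkts * pkt_bytes * 8
    return bits_queued / (bw_mbps * 1e6) * 1000

print(round(worst_case_queue_delay_ms(20480, 490)))  # 502 (ms)
print(round(worst_case_queue_delay_ms(1000, 490)))   # 24 (ms)
```

Half a second of potential queuing at limit 20480 versus ~24 ms at limit 1000 is exactly the difference between bufferbloat and a tight queue.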

Setting the flows parameter higher is not a bad idea; the desired outcome is to have as few flows as possible overlapping into the same queue. The higher this parameter is set, the more memory it takes (in reality not that much).

Rule of thumb:
Limit > below 10 Gbit/s, a good starting point is around 1000 (usable since 25.7.8)
Flows > if possible, set to the max, 65535

Quote from: pfry on December 01, 2025, 08:18:47 PMPIE just sounds like a RED implementation - I can't see that it'd have much if any effect, as I wouldn't expect your queue depths/times to reach discard levels.
I really don't want to go into PIE too much (e.g. FQ_PIE; it works similarly to FQ_C but has a different use case), so I will just say this:

PIE
- Probabilistic, gradual
- Used in ISP networks, broadband, general traffic

CoDel
- Adaptive, based on packet age
- Low-latency applications, real-time traffic
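For intuition on "adaptive, based on packet age": a simplified sketch of CoDel's published control law (not the actual ipfw/dummynet code). Once a packet's sojourn time has stayed above target for a full interval, CoDel drops, and each further drop is scheduled interval/sqrt(count) later, so drops get progressively more frequent while the queue stays too long:

```python
from math import sqrt

def next_drop_intervals_ms(interval_ms: float = 100.0, drops: int = 5):
    """Spacing between successive CoDel drops while sojourn time
    stays above target: interval / sqrt(count) for count = 1, 2, ..."""
    return [interval_ms / sqrt(count) for count in range(1, drops + 1)]

print([round(t, 1) for t in next_drop_intervals_ms()])
# [100.0, 70.7, 57.7, 50.0, 44.7]
```

The gently tightening schedule is what lets CoDel react to standing queues without hammering bursty traffic, which is the "adaptive" part of the comparison above.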

Regards,
S.

Networking is love. You may hate it, but in the end, you always come back to it.

OPNSense HW
APU2D2 - deceased
N5105 - i226-V | Patriot 2x8G 3200 DDR4 | L 790 512G - VM HA(SOON)
N100   - i226-V | Crucial 16G  4800 DDR5 | S 980 500G - PROD

Quote from: Seimus on Today at 02:42:26 AMRule of thumb:
Limit > below 10 Gbit/s should be around 1000 (usable since 25.7.8)
Flows > if possible, set to the max, 65535

Seimus,

If we have 4 pipes (up/down for the control plane, up/down for data), then what is the recommended flows value? Still 65535 for each pipe?


Never mind, I forgot that the control pipes are not using FQ_Codel anyway; they are QFQ, as per the shaping documentation.