yet another 10G routing cry :)

Started by antst, June 30, 2024, 03:49:01 PM

My hardware setup is simple:
a Proxmox host with an X710, which is virtualized with SR-IOV.
My WAN is a VLAN over XGS-PON, so I have an XGS-PON module in the SFP+ cage.
On Proxmox I configured the VLAN on an X710 VF and pass that VF to the VM as a PCIe device.
The router VM is OPNsense 24.1.9_4.

What I see: download throughput is perfect and hits the line speed of my contract (8 Gbps).
But upload is stuck in the 1.6-1.8 Gbps range no matter what I try.
It behaves the same whether I generate traffic locally (with multithreaded iperf3 or speedtest) or route traffic from one of the client hosts.

In principle, it is exactly the same with pfSense. So I'd assume this is a generic FreeBSD issue.

But I also tried replacing the router with a VyOS VM. Under Linux I see full speed when traffic originates from the VyOS host, but... the same 1.7 Gbps upload when it routes traffic from a client.
(To rule out local network infrastructure issues: there is no problem getting full 10G speed between either router, OPN or VyOS, and one of the clients, in both directions. I also measured by routing traffic from the Proxmox host, so no external infrastructure is involved.)

Looking at VyOS with locally generated traffic, I really doubt that I have an issue with the SFP+, XGS-PON, etc.
But then... why is routing in one direction on VyOS just as poor as with OPN?
And the CPU is more than capable of doing it. In fact, on bare metal with a C3558 CPU I was able to get 3.5G both ways, and in this case I have an i5-12600H. My CPU load is peanuts in both cases, VyOS and OPN.

So, does anyone have suggestions as to what it could be? :)

Investigating further, I see that a default Ubuntu 24.04 install (in a similarly configured VM) also shows this imbalance between download and upload speeds (when measured from the VM directly to the WAN).

Somehow, only VyOS pushes upload to where it should be...

And... in case anyone is interested:
the answer is simple: BBR. This is what VyOS uses by default, and it is the only congestion control algorithm that lets me reach maximum upload throughput on my setup.
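
For anyone who wants to check the effect of BBR on a single Linux end host without touching the system-wide default, here is a minimal sketch. It relies on the `TCP_CONGESTION` socket option, which Python exposes on Linux; the helper name `set_congestion_control` is made up for illustration, and the call only succeeds if the `tcp_bbr` module is available on that host. It affects only sockets opened by the process itself, not traffic forwarded by a router.

```python
import socket


def set_congestion_control(sock, name="bbr"):
    """Try to select a TCP congestion control algorithm for one socket.

    Returns the name of the algorithm actually in effect, or "" when the
    platform does not expose TCP_CONGESTION.
    """
    opt = getattr(socket, "TCP_CONGESTION", None)
    if opt is None:
        return ""  # option not available on this platform
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, name.encode())
    except OSError:
        pass  # algorithm not loaded or not permitted: keep the default
    try:
        raw = sock.getsockopt(socket.IPPROTO_TCP, opt, 16)
    except OSError:
        return ""
    # The kernel returns a NUL-padded name, e.g. b"cubic\x00\x00..."
    return raw.split(b"\x00", 1)[0].decode()


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        in_effect = set_congestion_control(s, "bbr")
        print(in_effect or "TCP_CONGESTION not supported here")
```

If the printed name is not `bbr`, the host fell back to its default (typically `cubic`), which matches the "module is missing" situation.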


I could not load tcp_bbr; it looks like the module is missing.
But I did go the same way, just more simply: I set max_tx_rate for the X710.
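
A rough sketch of what such a cap could look like, assuming it is applied on the Proxmox (Linux) host with iproute2, where `max_tx_rate` for a VF is given in Mbps. The interface name `enp1s0f0` and VF index `0` are placeholders, not values from this setup; 5.2 Gbps is the limit reported later in this thread. The helper only builds the command so it can be inspected before anything is run as root.

```python
import shlex


def vf_tx_rate_cmd(pf, vf, gbps):
    """Build the iproute2 command that caps a VF's transmit rate.

    pf   -- name of the physical function interface (placeholder here)
    vf   -- index of the virtual function
    gbps -- desired cap in Gbps; iproute2 expects Mbps
    """
    mbps = int(round(gbps * 1000))
    return ["ip", "link", "set", "dev", pf,
            "vf", str(vf), "max_tx_rate", str(mbps)]


if __name__ == "__main__":
    cmd = vf_tx_rate_cmd("enp1s0f0", 0, 5.2)
    # Print instead of executing; run it manually as root if it looks right.
    print(shlex.join(cmd))
```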

And from what I see (maybe I am wrong), things like BBR matter on the end devices: when I set BBR on a Linux box behind the router, it can utilize the channel properly, regardless of the router's settings.
So in this case, at the OPN level, it has to be solved with a traffic limiter using FQ_CoDel. But I think that is still not fixed in BSD, and the limit still hits a signed or unsigned int32 (so about 2 Gbps or 4 Gbps).

And, just in case someone runs into the same issue:
1) I compiled a custom kernel with BBR support. It solves the issue for traffic originating from the router, but not for routed traffic. So, not the answer.
2) I rechecked the traffic limiting story, and indeed, dummynet is still not fixed, although the bug has been open for 8 years or so. The maximum configurable throughput is therefore still about 4 Gbps, as defined by a uint32, so FQ_CoDel, which would be the proper answer, can't be used.
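
The 2 Gbps and 4 Gbps figures follow directly from the integer widths. A quick check, assuming (as described above) that the limiter stores its bandwidth as a 32-bit count of bits per second:

```python
INT32_MAX = 2**31 - 1    # largest signed 32-bit value
UINT32_MAX = 2**32 - 1   # largest unsigned 32-bit value


def to_gbps(bits_per_second):
    """Convert bits per second to Gbps (decimal, 1 Gbps = 1e9 bit/s)."""
    return bits_per_second / 1e9


def fits_uint32(gbps):
    """True if a rate in Gbps can be stored as bit/s in a uint32."""
    return gbps * 1e9 <= UINT32_MAX


if __name__ == "__main__":
    print(f"signed int32 ceiling:   {to_gbps(INT32_MAX):.2f} Gbps")   # 2.15
    print(f"unsigned int32 ceiling: {to_gbps(UINT32_MAX):.2f} Gbps")  # 4.29
    print("5.2 Gbps fits in uint32:", fits_uint32(5.2))  # False
```

So even a 5.2 Gbps limit, let alone the full 8 Gbps, cannot be expressed under that assumption.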

End of story. There is no solution other than limiting the TX rate of the NIC (for those that support it) well below the maximum value. In my case, with an 8 Gbps capable channel, I can avoid bufferbloat only if I limit it to 5.2 Gbps; everything above that is not stable.
Another option is to run the VM on top of a Linux host, use virtio-net, and deal with traffic shaping at the level of the Linux bridge.