default hw.vtnet.csum_disable=1 causes network slowness

Started by jauling, February 16, 2025, 02:09:04 AM

Quote from: Patrick M. Hausen on March 06, 2025, 06:56:33 PM
Thank you for bringing this link to my attention. I have been following this problem for 2-3 years now and I cannot remember having come across this one.

As a baseline, do you agree that hw.vtnet.csum_disable=1 is a sensible default, given that without that setting using FreeBSD as a router inside KVM does not work at all? At least that's what I observe and what led me to suggest this default.

Or does it work for you without hw.vtnet.csum_disable=1, just not up to the performance figures that Linux achieves?


Hi Patrick,

Thank you for looking into this.

Regarding your question, in my opinion, hw.vtnet.csum_disable=1 is counterproductive both theoretically and practically. Theoretically, because OPNsense exposes the hardware offload options in the main web interface under Interfaces > Settings, yet this obscure, undocumented default tunable overrides those explicit UI settings without any warning. Furthermore, those UI settings already have offloading disabled by default.

In practical terms, I spent a considerable part of two working days struggling to diagnose the discrepancy between versions 24 and 25. Only after encountering a Reddit post by someone doing a similar investigation did I discover that hw.vtnet.csum_disable=1 was the culprit.
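For anyone retracing this kind of diagnosis, here is a minimal Python sketch of checking the tunable from a shell on the firewall. It assumes that `sysctl hw.vtnet` is readable on the system and prints lines in the usual `name: value` form; the helper names (`parse_sysctl`, `checksum_offload_disabled`, `read_vtnet_tunables`) are made up for illustration and are not part of OPNsense or FreeBSD.

```python
import subprocess


def parse_sysctl(text: str) -> dict[str, str]:
    """Parse sysctl output of the form 'name: value' into a dict."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip lines without a 'name: value' separator
            values[name.strip()] = value.strip()
    return values


def checksum_offload_disabled(values: dict[str, str]) -> bool:
    """True if the tunable discussed in this thread is set to 1."""
    return values.get("hw.vtnet.csum_disable") == "1"


def read_vtnet_tunables() -> dict[str, str]:
    """Run `sysctl hw.vtnet` (assumed to be available) and parse the result."""
    out = subprocess.run(
        ["sysctl", "hw.vtnet"], capture_output=True, text=True, check=True
    ).stdout
    return parse_sysctl(out)
```

Calling `checksum_offload_disabled(read_vtnet_tunables())` on the box would then show whether the tunable is in effect, independently of what the web interface displays.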

As a side note, the comment in the bug thread describing this as an "optimization" is alarming. An issue that causes broken routing/forwarding or a three- to six-fold loss in performance (in terms of bandwidth and CPU usage) isn't just an optimization problem—it's a serious bug.

Kind Regards,
Vasco

Quote from: vsc on March 07, 2025, 12:26:42 PM
As a side note, the comment in the bug thread describing this as an "optimization" is alarming. An issue that causes broken routing/forwarding or a three- to six-fold loss in performance (in terms of bandwidth and CPU usage) isn't just an optimization problem; it's a serious bug.

The comment refers to the KVM side not calculating the checksum, in order to save that effort.
The bug is that FreeBSD does not handle KVM's partial checksums.

The problem has some "management attention" now. Whether that leads to anything is difficult to predict in an open source project.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

March 07, 2025, 12:50:04 PM #17 Last Edit: March 07, 2025, 12:57:23 PM by meyergru
The "optimization" remark was for the vtnet subsystem in Linux/KVM. It refers to being able to pass 64K at a time between KVM hosts and Linux VMs, where it works automagically.

The problem is that the FreeBSD vtnet driver currently does not handle the specific flags needed to checksum packets correctly when they are routed.
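To illustrate what a "partial checksum" means here, a minimal Python sketch follows. It is not the vtnet driver's code. It assumes the convention used with the virtio `VIRTIO_NET_HDR_F_NEEDS_CSUM` flag: the TCP/UDP checksum field holds only the pseudo-header sum, and `csum_start` / `csum_offset` say where summing begins and where the result goes. Whoever forwards the packet onto a real wire has to finish the calculation. The function names are hypothetical.

```python
import struct

VIRTIO_NET_HDR_F_NEEDS_CSUM = 1  # flag value from the virtio-net header definition


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones'-complement sum as used by the Internet checksum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def complete_partial_checksum(packet: bytes, csum_start: int, csum_offset: int) -> bytes:
    """Finish a partial checksum.

    `packet` is assumed to carry the pseudo-header sum in its checksum field.
    Summing from csum_start to the end therefore covers pseudo-header plus
    segment; the complement of that sum is the final checksum.
    """
    csum = (~ones_complement_sum(packet[csum_start:])) & 0xFFFF
    pos = csum_start + csum_offset
    return packet[:pos] + struct.pack("!H", csum) + packet[pos + 2:]
```

For a bare UDP segment, `csum_start` would be 0 and `csum_offset` 6 (the position of the checksum field in the UDP header). A router that forwards such a packet without this completion step sends out a packet whose checksum the final receiver will reject.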

(Too late, Patrick just wrote it. The sad part is that it must be fixed upstream in FreeBSD, and because the bug manifests predominantly in routers, it might get dismissed as "downstream" only.)

Using a default value that makes something broken work seems more appropriate than using a value that definitely breaks things, IMHO.
Actually, the limit imposed by doing the checksumming in software is somewhere above 1 GBit/s, so I would argue that the percentage of people complaining about a default that breaks vtnet is bigger than the percentage of people who cannot reach speeds above 1.5 GBit/s.

Most people get KVM virtualisation wrong even without a hurdle like this.

However, until this is finally fixed, maybe the implications of the setting should be documented in the official docs. @Monviech to the rescue!

I have included a discussion of this in my HOWTO.
Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 770 up, Bufferbloat A