10GB LAN Performance

Started by johnoatwork, December 03, 2021, 11:21:39 AM

January 02, 2022, 09:30:43 AM #15 Last Edit: January 02, 2022, 09:37:51 AM by johnoatwork
Oh, and for those who might come across this, here is my /boot/loader.conf.local for the X710-DA2. I'm not sure it's optimal, and I'm always interested in suggestions on how to improve it.


kern.ipc.nmbclusters=1000000
kern.ipc.nmbjumbop=524288
hw.intr_storm_threshold=10000
net.inet.tcp.tso=0
net.isr.maxthreads=-1
net.isr.bindthreads=1
dev.ixl.0.iflib.override_qs_enable=1
dev.ixl.1.iflib.override_qs_enable=1
dev.ixl.0.iflib.override_nrxqs=128
dev.ixl.1.iflib.override_nrxqs=128
dev.ixl.0.iflib.override_ntxqs=128
dev.ixl.1.iflib.override_ntxqs=128
dev.ixl.0.iflib.override_nrxds=128
dev.ixl.1.iflib.override_nrxds=128
dev.ixl.0.iflib.override_ntxds=128
dev.ixl.1.iflib.override_ntxds=128
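
Since the per-port lines above have to be kept in sync by hand, here is a minimal Python sketch that checks whether ixl0 and ixl1 carry the same overrides. The function names are made up for illustration, and it only understands simple key=value lines of the kind shown above, not every loader.conf syntax.

```python
def per_port_overrides(text):
    """Group dev.ixl.<unit>.<name>=<value> lines by port unit number."""
    ports = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line.startswith("dev.ixl.") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        parts = key.strip().split(".", 3)  # dev, ixl, unit, rest of the name
        if len(parts) < 4 or not parts[2].isdigit():
            continue
        ports.setdefault(int(parts[2]), {})[parts[3]] = value.strip().strip('"')
    return ports


def mismatches(ports):
    """Return the tunable names whose values differ between ports."""
    names = set()
    for settings in ports.values():
        names.update(settings)
    return sorted(
        name for name in names
        if len({settings.get(name) for settings in ports.values()}) > 1
    )


# Excerpt of the file above, used as sample input.
SAMPLE = """
dev.ixl.0.iflib.override_qs_enable=1
dev.ixl.1.iflib.override_qs_enable=1
dev.ixl.0.iflib.override_nrxds=128
dev.ixl.1.iflib.override_nrxds=128
"""

if __name__ == "__main__":
    print(mismatches(per_port_overrides(SAMPLE)))
```

An empty list means both ports are configured identically; any name it prints is set differently (or on one port only).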

May 29, 2022, 12:47:34 AM #16 Last Edit: May 29, 2022, 09:16:11 AM by rungekutta
Quote from: rungekutta on December 18, 2021, 10:36:54 AM
After all my woes (https://forum.opnsense.org/index.php?topic=25263.15) I managed to get forwarding performance up to ~5 Gb/s through the Chelsio T520-SO-CR and Ryzen hardware, so it's a bit weird that your performance is so low after having followed similar steps. Will be interesting to hear your results on Intel X710. And as mentioned on Linux also. NB that's a side project for me as well - setting up a minimal Debian 11 with routing and firewall through nftables, also unbound and dhcp server etc. Not got as far as live-testing it yet but curious how it will perform in comparison.

Ok so I'm opening up this thread again now, because I've done exactly this. The results are kind of interesting.

Note first and foremost that I'm a big fan of OpnSense. The admin GUI is superb and it has really served me well (and continues to do so) and helped me get up the curve on networking stuff.

All that said, I think the testing reveals some differences between Linux and FreeBSD, or at least FreeBSD as configured in OpnSense. My Linux setup is a minimal Debian 11 VM in Proxmox with nftables for firewall & routing and dnsmasq for DNS & DHCP. Dnsmasq forwards DNS to unbound as a resolver. There are a bunch of rules that control traffic between the various internal networks, and a bunch of DNAT forwarding to services on the DMZ, with hairpin.

I applied minimal tuning: increased some TCP buffers, etc. I don't know whether that made any difference.

Out of the box, iperf3 runs against 2 other servers on the internal network give a solid 9.4-9.8 Gb/s while Debian still sits 95% idle. NAT routing performance out on the 10Gb WAN (using fast.com and speedtest.com) varies between 6 and 8 Gb/s depending on client and time of day, while the Debian VM idles at 98% (!).
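
For anyone who wants to compare numbers like these across runs, here is a minimal Python sketch that pulls the throughput out of an iperf3 report. It assumes a TCP test run with iperf3's JSON output option (-J); the sample report is a made-up, heavily trimmed stand-in, not output from my machines.

```python
import json


def summarize_iperf3(report_text):
    """Return (sent_gbps, received_gbps) from an iperf3 -J TCP report."""
    end = json.loads(report_text)["end"]
    sent = end["sum_sent"]["bits_per_second"] / 1e9
    received = end["sum_received"]["bits_per_second"] / 1e9
    return sent, received


# Hypothetical report containing only the fields read above.
SAMPLE_REPORT = json.dumps({
    "end": {
        "sum_sent": {"bits_per_second": 9.6e9},
        "sum_received": {"bits_per_second": 9.4e9},
    }
})

if __name__ == "__main__":
    sent, received = summarize_iperf3(SAMPLE_REPORT)
    print(f"sent {sent:.2f} Gb/s, received {received:.2f} Gb/s")
```

In practice the input would be the saved output of something like `iperf3 -c <server> -J`, with the real report carrying many more fields than the sample.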

Note that I'm not running suricata or any fancy metrics or instrumentation (only nftables stats).

Still, this is quite some difference. I never managed to get much more than 5-6Gb/s through OpnSense on the same hardware, and the CPU had to work much harder too.

Maybe it partly comes down to kernel optimizations for Ryzen? In any case, I wasn't expecting quite such a difference.

Would OpnSense ever consider re-basing on Linux? I realize it would be a non-trivial exercise... The TrueNAS folks did it though...

It may just come down to the BSDs lacking driver support for many things, which is where Linux shines, since its community is much, much bigger. That said, I can get a solid 9.8 Gb/s NATed over OPNsense, running an older Xeon here. I get 9.8 Gb/s from OPNsense to my Unraid server running iperf3 (using -P8 for 8 parallel streams), but Unraid server to OPNsense is only ~5-6 Gb/s, sometimes 4, though I think that has more to do with Unraid at that point, since I'm able to do 10 Gb/s one way.

OPNsense is BSD based. You should try Vyatta if you are looking for Linux. No need to change what's Rockin'.

I also am able to sustain 9.8Gbps for minutes without any resource use... LOL

Good to hear those speeds are achievable. What NICs do you guys use? Did you have to fiddle with tunables in order to get the performance?

Fwiw I looked at Vyatta also but didn't really see the point. Nftables in itself is straightforward enough so not so much gained vs a vanilla Debian - where you also get more flexibility. In both cases losing out vs OpnSense's awesome gui.

lol... Let me understand this... you like the GUI and want to change the entire underlay to something else because of the GUI???? LOL

I like Ferraris, so let's make all 18 Wheeler trucks look like Ferrari! LOL

Indeed, I like the product including its gui and plug-in ecosystem and its community. And I am raising the question whether in the long run it would be better off based on Linux than BSD. I understand it's a sensitive topic for some. Alas iXsystems and Netgate both seem to be heading in that direction, so it's not like nobody ever thought of it before. But feel free to lol if that makes you feel better ;-). Or maybe add some actual thoughts on the topic.

You seem not to understand the BSD ecosystem. It's not your fault and that's OK.

No worries though.... :)

Quote from: rungekutta on June 06, 2022, 10:56:34 AM
Good to hear those speeds are achievable. What NICs do you guys use? Did you have to fiddle with tunables in order to get the performance?

Fwiw I looked at Vyatta also but didn't really see the point. Nftables in itself is straightforward enough so not so much gained vs a vanilla Debian - where you also get more flexibility. In both cases losing out vs OpnSense's awesome gui.

Mellanox ConnectX-3 10gb SFP dual port here, 1 to WAN and 1 to my LAN. No tunables set up.

Quote from: lilsense on June 06, 2022, 11:49:44 PM
You seem not to understand the BSD ecosystem. It's not your fault and that's OK.
Thank you for your thoughtful contribution to the topic.

Quote from: jclendineng on June 07, 2022, 12:15:27 AM
Mellanox ConnectX-3 10gb SFP dual port here, 1 to WAN and 1 to my LAN. No tunables set up.

That's interesting. I have Chelsio NICs, which are supposedly well supported, but I had to mess around with tunables and settings before I managed to get netmap to run in native mode and offer half decent performance. https://forum.opnsense.org/index.php?topic=25263.0

Quote from: rungekutta on June 07, 2022, 07:27:17 PM
Quote from: jclendineng on June 07, 2022, 12:15:27 AM
Mellanox ConnectX-3 10gb SFP dual port here, 1 to WAN and 1 to my LAN. No tunables set up.

That's interesting. I have Chelsio NICs, which are supposedly well supported, but I had to mess around with tunables and settings before I managed to get netmap to run in native mode and offer half decent performance. https://forum.opnsense.org/index.php?topic=25263.0

Further testing is showing suboptimal speeds. One direction is 9.4 Gb/s or so (fine); the other direction is 5-6 Gb/s. I think that comes down to single-thread performance. I have a good (older) CPU in a rack server and iperf doesn't even touch it, but I'm thinking single thread is the issue, as iperf is single-threaded even with -P set: that only sets the number of streams, and (my understanding is) multiple streams are still all handled on a single thread, per the original dev.
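
One way to test the single-thread theory is to run several iperf3 processes side by side, each against its own server port, and add up the results. Below is a minimal Python sketch of that idea. The server address, process count and helper names are placeholders I made up, and each port needs its own `iperf3 -s -p <port>` listener on the other end. As far as I know, iperf3 3.16 and later run one thread per stream, so this workaround mainly matters on older builds.

```python
import shlex


def parallel_iperf3_commands(server, processes=4, base_port=5201, seconds=10):
    """Build one iperf3 client command per process, each on its own port."""
    commands = []
    for i in range(processes):
        port = base_port + i  # 5201 is iperf3's default port
        commands.append(
            ["iperf3", "-c", server, "-p", str(port), "-t", str(seconds), "-J"]
        )
    return commands


def total_gbps(per_process_bps):
    """Sum per-process bits-per-second figures into a single Gb/s number."""
    return sum(per_process_bps) / 1e9


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address standing in for the real server.
    for cmd in parallel_iperf3_commands("192.0.2.10"):
        print(shlex.join(cmd))
```

The sketch only prints the commands; they would be launched at the same time (for example with subprocess.Popen) and the bits_per_second value from each JSON report fed into total_gbps. If the summed figure is well above the single-process one, the limit was iperf rather than the firewall.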

July 17, 2022, 01:43:10 PM #27 Last Edit: July 17, 2022, 02:16:13 PM by lilsense
Quote from: rungekutta on June 07, 2022, 07:22:10 PM
Quote from: lilsense on June 06, 2022, 11:49:44 PM
You seem not to understand the BSD ecosystem. It's not your fault and that's OK.
Thank you for your thoughtful contribution to the topic.
Just to clarify this, Netgate's TNSR is Linux based and NOT free. Netgate is doing this for $$$$. TrueNAS is doing it for a totally different reason, and that's got nothing to do with FreeBSD having slow networking speeds.

Here's an example of a driver you can install on FreeBSD, bnxt, to get 100G:
Broadcom BCM57454 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb Ethernet

And!!!

Here's how Netflix is using FreeBSD:
https://papers.freebsd.org/2021/eurobsdcon/gallatin-netflix-freebsd-400gbps/

here's the YT:
https://www.youtube.com/watch?v=_o-HcG8QxPc

I can't wait for this year's EuroBSDcon for this topic:
The "other" FreeBSD optimizations used by Netflix to serve video at 800Gb/s from a single server

July 21, 2022, 06:37:35 PM #28 Last Edit: July 21, 2022, 06:49:45 PM by rungekutta
Thanks for demystifying that earlier comment.  ;)

No doubt FreeBSD, other Unixes and Linux are all capable of producing great results. And those Netflix stats are impressive. However, note also that the use case is very specific. They stream static files from SSDs and have carefully optimized and tuned everything along the way, from software to o/s to hardware and drivers. In some cases they have found and removed bottlenecks and submitted the fixes back to FreeBSD (e.g. async sendfile). They have also worked closely with AMD and Mellanox and others.

I'm sure some of that has benefitted FreeBSD more broadly, but I don't know how relevant it is to most users on this forum, who are trying to get good filtering, routing and forwarding performance out of a range of different hardware, from small appliances to enterprise, and on a relatively complex setup that involves netmap, automatically generated firewall rules, and additional software such as suricata layered on top. That's quite a different gig, and the complexity of it, in combination with the range of available x86 hardware, is presumably why this forum is so full of people reporting such a wide range of experience from OpnSense in terms of performance.

Btw, if you're looking for other extreme examples, Linux reached a packet forwarding rate of 1 terabit per second back in 2017 ;)
https://www.fiercetelecom.com/telecom/linux-foundation-s-fd-io-virtual-switch-project-doubles-packet-throughput-to-terabit-speeds

For the regular user though, I'm not sure this is any more relevant than the Netflix example. And I notice this is going off-topic as well - my intention was not to start an o/s war. Sorry. Will stop there.