APU2D4 very low throughput 1Gbit

Started by Burschi, May 24, 2021, 03:26:16 PM

Alright then, and as @Ricardo mentioned, is this clock boost only for 1 core?

Again, I want to say that I don't own one of these devices, but I think a lot of the configs posted here will not work with later versions of OPNsense (20.7 and 21.1). Both OPNsense 20.7+ and pfSense 2.5+ use FreeBSD 12.x as their base. FreeBSD 12.x uses iflib for NIC queues and no longer contains many of the old tunables that we would have used in FreeBSD 11.x.

Because of this, most of the configs being posted here will not have any impact.

There are still some tunables that you can set on the igb NIC driver, primarily disabling flow control and disabling EEE. These are the "new" tunables needed in the FreeBSD 12.x series:
dev.igb.X.fc (X is the interface number)
dev.igb.X.eee_control (X is the interface number)

Setting both of these to 0 should disable the respective features.

If you wish to check which options are available for the igb NICs, you can run the following at an SSH console:
sysctl -a | grep igb

You will notice that if you run this command, there are now many different configurable settings that do not match any of the previously used configs that we relied on in FreeBSD 11.x.
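Since these two settings have to be repeated for every igb interface, a small helper can generate the lines. This is only a sketch: the function name igb_tunables is made up for illustration, and it assumes the interfaces are numbered contiguously from 0 (an APU2 would typically show igb0 to igb2, but check your own sysctl output first).

```shell
#!/bin/sh
# Hypothetical helper (not part of OPNsense or FreeBSD): print the
# flow-control and EEE tunables for igb0 .. igb(N-1) in name=value form.
# Assumption: interface numbers are contiguous and start at 0.
igb_tunables() {
    count="$1"
    i=0
    while [ "$i" -lt "$count" ]; do
        printf 'dev.igb.%d.fc=0\n' "$i"
        printf 'dev.igb.%d.eee_control=0\n' "$i"
        i=$((i + 1))
    done
}

# Example: a box with three igb ports
igb_tunables 3
```

The printed names and values can then be entered one by one as tunables in the GUI, or compared against what sysctl reports after a reboot.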


It would be great to highlight these pitfalls. The average APU owner goes to Teklager.se or to some random Calomel article, e.g. https://calomel.org/freebsd_network_tuning.html, and applies performance-optimization sysctls that were relevant only for an older FreeBSD 10 or 11 release. The current FreeBSD / HardenedBSD / OPNsense releases run 12.x and benefit close to 0% from them.

OK, after trying all (well, many...) options and tunables, I'm under the impression that the following is the important part:

Quote from: opnfwb on May 31, 2021, 05:12:24 PM
[...]
Because of this, most of the configs being posted here will not have any impact.
[...]
You will notice that if you run this command, there are now many different configurable settings that do not match any of the previously used configs that we relied on in FreeBSD 11.x.
^Any hints on this from the experts?

While I do not consider myself an expert :D I do think Teklager actually left a hint. They specifically say no tuning is needed on pfSense 2.5 (which is FreeBSD 12.x based). What this really means is that anything 12.x based, including OPNsense, will respond in a similar fashion.

Teklager also goes on to show single thread transfer tests with lower performance values when using pfSense 2.5 compared to pfSense 2.4 (and the FreeBSD 11.x tweaks).

Miroco posted the link with Teklager hinting at this.
Quote from: miroco on May 28, 2021, 01:56:04 AM
https://teklager.se/en/knowledge-base/apu2-1-gigabit-throughput-pfsense/

"Gigabit config for pfSense 2.5.0. No tweaks are required! Don't follow any of the information listed below for pfSense 2.4.5."

At this point I would try these 4 things and report back. It's also important to make sure that the iperf tests you run are pushing traffic through the firewall (have the client on LAN, and another server on WAN). Don't just host iperf on one of the firewall interfaces.

In your tunables, set the following:
hw.ibrs_disable: 1 (turns off the IBRS Spectre mitigation; only do this to test throughput, there are security implications)
vm.pmap.pti: 0 (turns off page table isolation; only do this to test throughput, there are security implications)
dev.igb.0.eee_control: 0 (disable Energy Efficient Ethernet, do this for all igb interfaces present on the device)
dev.igb.0.fc: 0 (disable Flow Control, do this for all igb interfaces present on the device)


Set those tunables and reboot. Then re-run the throughput tests and see if there is an improvement. All traffic shaping and the Netflow Insight plugin on OPNsense should also be disabled during these tests.

Thanks for the effort @opnfwb, but it only yielded ~30Mb/sec; with iperf3 -c <ip> -P 6 I get around 70Mb/sec...

I think the next thing to try, just to rule out some weird inconsistency, would be to attempt the same tests on the latest pfSense 2.5 and report back. If you're seeing the same limited throughput on the same platform that Teklager benchmarked, then there has to be some other piece of the puzzle missing here. Maybe firmware or some other oddity?

Hey, I tried again with all settings reverted (somehow my box hung after setting the tunables, so I had to use the serial console...), and it says 20-25 Mb/sec even with -P 6 (and also for lower and higher values).
I'm not sure if I can easily do the pfSense install, since I have set up VLANs across my home network; I was thinking about buying a 4-port NIC and going VM (enough power to rule that out), since I have Proxmox running anyway.
Maybe.

I don't know, man. I have a 600 Mbps down, 40 Mbps up connection and I can use the full speed with an APU2 (4 GB RAM).

Oh, yes, I don't use Suricata.
OPNsense HW:

Minisforum Venus series UN100C, 16 GB RAM, 512 GB SSD
T-bao N9N Pro, 16 GB RAM, 512 GB SSD

I have set up OPNsense on my Proxmox host with an Intel I350 quad NIC (passthrough) and my original configuration from the APU2D4, and there I get 112 MBit/sec. So it seems to be related to the CPU? What is wrong with my APU and/or configuration?

You're not alone in looking for drastic improvements, Burschi. Apologies for polluting your thread. I'll create my own if you prefer.
I'm using an APU4D4 I just put together. I've noticed the same problem on OPNsense. I've updated the board with the latest firmware and I have the latest OPNsense 21.1.6.
LAN clients via a gigabit switch pull 1 Gbps client to client, as a baseline.
Iperf against the firewall only gets 390 Mbps.
Iperf through the firewall from public iperf servers hovers around 290 Mbps.
Basic default rules and only two manual ones to catch stray DNS traffic. No Suricata.

So far the only tunables I've added are:
hw.igb.rx_process_limit="-1"
hw.igb.tx_process_limit="-1"
legal.intel_igb.license_ack="1"

I'm planning on adding all the ones admodovaris kindly offered and reporting back. I plan on putting them in a loader.conf.local and testing before adding them as tunables in the UI, also merging in the ones suggested by opnfwb (thank you).
I will add these:

amdtemp_load="YES"
ahci_load="YES"
aesni_load="YES"
if_igb_load="YES"
flowd_enable="YES"
flowd_aggregate_enable="YES"
legal.intel_igb.license_ack="1"
legal.intel_ipw.license_ack=1
legal.intel_iwi.license_ack=1
# this is the magic. If you don't set this, queues won't be utilized properly
# allow multiple processes for receive/transmit processing
hw.igb.rx_process_limit="-1"
hw.igb.tx_process_limit="-1"

net.pf.states_hashsize=2097152

hw.igb.num_queues=0

hw.igb.enable_aim=1

hw.igb.enable_msix=1
hw.pci.enable_msix=1
hw.igb.rx_process_limit="-1"
hw.igb.tx_process_limit="-1"

vm.pmap.pti=0
hw.ibrs_disable=1

hint.p4tcc.0.disabled=1
hint.acpi_throttle.0.disabled=1
hint.acpi_perf.0.disabled=1
dev.igb.0.eee_control=0
dev.igb.0.fc=0

hint.p4tcc.1.disabled=1
hint.acpi_throttle.1.disabled=1
hint.acpi_perf.1.disabled=1
dev.igb.1.eee_control=0
dev.igb.1.fc=0

hint.p4tcc.2.disabled=1
hint.acpi_throttle.2.disabled=1
hint.acpi_perf.2.disabled=1
dev.igb.2.eee_control=0
dev.igb.2.fc=0

hint.p4tcc.3.disabled=1
hint.acpi_throttle.3.disabled=1
hint.acpi_perf.3.disabled=1
dev.igb.3.eee_control=0
dev.igb.3.fc=0
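A merged list like this easily ends up with the same key set more than once. A quick way to spot that before rebooting is sketched below. The function name dup_keys is made up for illustration, and the sketch assumes one name=value entry per line with comments starting with #.

```shell
#!/bin/sh
# Hypothetical helper: read name=value lines on stdin and print every key
# that is set more than once (printed once, when its second occurrence is seen).
# Comment lines and lines without "=" are skipped.
dup_keys() {
    awk -F= '
        /^[ \t]*#/ || !/=/ { next }
        {
            key = $1
            gsub(/[ \t]/, "", key)          # tolerate "name = value" spacing
            if (++seen[key] == 2) print key
        }
    '
}

# Small demo with three lines in the style of the list above
printf 'hw.igb.rx_process_limit="-1"\nhw.igb.enable_aim=1\nhw.igb.rx_process_limit="-1"\n' | dup_keys
```

To check a real file, feed it on stdin, for example dup_keys < /boot/loader.conf.local (adjust the path if your file lives elsewhere).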

Update:
I'm running OPNsense in a VM on Proxmox with an Intel 4-port NIC passed through. I have max throughput with iperf3 (>100 MB/s), although performance is drastically decreased when using Sensei, Suricata or ntopng (~60 MB/s).

I'm going ahead with the virtualized OPNsense, keeping the APU2D4 as a backup.
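The thread mixes megabits and megabytes per second, so here is a small conversion for comparing these figures with the Mbps numbers quoted earlier. It is only a sketch: the function name is made up, and it assumes decimal units (1 byte = 8 bits, no 1024-based prefixes).

```shell
#!/bin/sh
# Hypothetical helper: convert whole megabytes per second to megabits per second.
# Assumption: decimal units, so the factor is simply 8.
mbytes_to_mbits() {
    echo $(( $1 * 8 ))
}

echo "100 MB/s = $(mbytes_to_mbits 100) Mbit/s"
echo "60 MB/s = $(mbytes_to_mbits 60) Mbit/s"
```

Under that assumption, 100 MB/s works out to 800 Mbit/s and 60 MB/s to 480 Mbit/s.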

Thank you all for your help!

Hi, was wondering how to see and adjust these tunables:

net.inet.tcp.maxtcptw
net.inet.ip.dummynet.max_chain_len
net.inet.ip.fastforwarding

I can't find them using sysctl -a, and adding an entry in loader.conf.local or in the system tunables on OPNsense via the GUI just shows a message at boot along the lines of: sysctl: unknown oid 'net.inet.ip.fastforwarding'.

I'm on OPNsense 21.1, 64-bit.
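A quick way to test whether the running kernel knows a given OID at all, before adding it as a tunable, is sketched below. The helper name oid_exists is made up for illustration. It relies only on sysctl -n returning a non-zero exit status for an unknown OID.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the running kernel exposes the OID,
# fail otherwise. Output and error messages from sysctl are discarded.
oid_exists() {
    sysctl -n "$1" >/dev/null 2>&1
}

for oid in net.inet.tcp.maxtcptw \
           net.inet.ip.dummynet.max_chain_len \
           net.inet.ip.fastforwarding; do
    if oid_exists "$oid"; then
        echo "present: $oid"
    else
        echo "missing: $oid"
    fi
done
```

An OID reported as missing has either been removed in this release or, possibly, belongs to a kernel module that is not currently loaded, so it is worth checking both before giving up on it.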

(Sorry for not being relevant to the topic, but a lot of tunables were being discussed so I thought someone here might know. Thanks).

Yup, a lot of tunables have been dropped in newer versions of FreeBSD.