Squeezing APU2 - Performance

Started by mfpck, September 03, 2020, 04:38:24 PM

September 03, 2020, 04:38:24 PM Last Edit: September 03, 2020, 04:40:17 PM by mfpck
Hi,

I already tried a few things, but I want to start from scratch, so I reset OPNsense 20.7.2 to factory defaults. Apart from running the wizard, everything is untouched.

These are the BIOS settings (v4.12.0.3):
Boot order - type letter to move device to top.

  a USB
  b SDCARD
  c mSATA
  d SATA
  e mPCIe1 SATA1 and SATA2
  f iPXE (disabled)


  r Restore boot order defaults
  n Network/PXE boot - Currently Disabled
  u USB boot - Currently Enabled
  t Serial console - Currently Enabled
  k Redirect console output to COM2 - Currently Disabled
  o UART C - Currently Enabled
  p UART D - Currently Enabled
  m Force mPCIe2 slot CLK (GPP3 PCIe) - Currently Disabled
  h EHCI0 controller - Currently Disabled
  l Core Performance Boost - Currently Enabled
  i Watchdog - Currently Disabled
  j SD 3.0 mode - Currently Disabled
  g Reverse order of PCI addresses - Currently Disabled
  v IOMMU - Currently Disabled
  y PCIe power management features - Currently Disabled
  w Enable BIOS write protect - Currently Disabled
  x Exit setup without save
  s Save configuration and exit



First I connected my MacBook via Cat6 to the LAN port of the APU board, and I am wondering a bit about the ping times:
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=1.037 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.798 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.988 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=1.153 ms
64 bytes from 192.168.1.1: icmp_seq=11 ttl=64 time=1.072 ms
64 bytes from 192.168.1.1: icmp_seq=12 ttl=64 time=0.755 ms
64 bytes from 192.168.1.1: icmp_seq=13 ttl=64 time=1.318 ms
64 bytes from 192.168.1.1: icmp_seq=14 ttl=64 time=0.911 ms
64 bytes from 192.168.1.1: icmp_seq=15 ttl=64 time=0.624 ms
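
To put a number on those ping times, here is a small Python sketch (not part of the original test, just an illustration) that pulls the round-trip times out of the lines above and prints min/avg/max. The variable names are made up for this example.

```python
import re
from statistics import mean

# Ping lines copied verbatim from the output above.
ping_output = """\
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=1.037 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.798 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.988 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=1.153 ms
64 bytes from 192.168.1.1: icmp_seq=11 ttl=64 time=1.072 ms
64 bytes from 192.168.1.1: icmp_seq=12 ttl=64 time=0.755 ms
64 bytes from 192.168.1.1: icmp_seq=13 ttl=64 time=1.318 ms
64 bytes from 192.168.1.1: icmp_seq=14 ttl=64 time=0.911 ms
64 bytes from 192.168.1.1: icmp_seq=15 ttl=64 time=0.624 ms
"""

# Extract the value after "time=" on every line, in milliseconds.
rtts = [float(t) for t in re.findall(r"time=([\d.]+) ms", ping_output)]

print(f"n={len(rtts)} min={min(rtts):.3f} avg={mean(rtts):.3f} max={max(rtts):.3f} ms")
```

For these nine samples the average comes out just under 1 ms.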


Then I ran iperf3 against the APU in its default setup:
./iperf3 -c 192.168.1.1             
Connecting to host 192.168.1.1, port 5201
[  4] local 192.168.1.100 port 61123 connected to 192.168.1.1 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec  35.2 MBytes   295 Mbits/sec                 
[  4]   1.00-2.00   sec  34.5 MBytes   289 Mbits/sec                 
[  4]   2.00-3.00   sec  34.8 MBytes   292 Mbits/sec                 
[  4]   3.00-4.00   sec  35.4 MBytes   297 Mbits/sec                 
[  4]   4.00-5.00   sec  35.5 MBytes   298 Mbits/sec                 
[  4]   5.00-6.00   sec  35.4 MBytes   297 Mbits/sec                 
[  4]   6.00-7.00   sec  35.4 MBytes   297 Mbits/sec                 
[  4]   7.00-8.00   sec  35.8 MBytes   301 Mbits/sec                 
[  4]   8.00-9.00   sec  35.5 MBytes   298 Mbits/sec                 
[  4]   9.00-10.00  sec  35.5 MBytes   298 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec   353 MBytes   296 Mbits/sec                  sender
[  4]   0.00-10.00  sec   353 MBytes   296 Mbits/sec                  receiver


And again with -P 2 -t 20:
./iperf3 -c 192.168.1.1 -p 5201 -P 2 -t 20
Connecting to host 192.168.1.1, port 5201
[  4] local 192.168.1.100 port 61125 connected to 192.168.1.1 port 5201
[  6] local 192.168.1.100 port 61126 connected to 192.168.1.1 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec  34.5 MBytes   290 Mbits/sec                 
[  6]   0.00-1.00   sec  35.1 MBytes   294 Mbits/sec                 
[SUM]   0.00-1.00   sec  69.7 MBytes   584 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]   1.00-2.00   sec  33.8 MBytes   284 Mbits/sec                 
[  6]   1.00-2.00   sec  34.6 MBytes   290 Mbits/sec                 
[SUM]   1.00-2.00   sec  68.4 MBytes   574 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]   2.00-3.00   sec  34.0 MBytes   286 Mbits/sec                 
[  6]   2.00-3.00   sec  35.1 MBytes   294 Mbits/sec                 
[SUM]   2.00-3.00   sec  69.1 MBytes   580 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]   3.00-4.00   sec  33.7 MBytes   282 Mbits/sec                 
[  6]   3.00-4.00   sec  34.7 MBytes   291 Mbits/sec                 
[SUM]   3.00-4.00   sec  68.3 MBytes   573 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]   4.00-5.00   sec  34.2 MBytes   287 Mbits/sec                 
[  6]   4.00-5.00   sec  34.9 MBytes   293 Mbits/sec                 
[SUM]   4.00-5.00   sec  69.1 MBytes   580 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]   5.00-6.00   sec  33.9 MBytes   284 Mbits/sec                 
[  6]   5.00-6.00   sec  34.6 MBytes   290 Mbits/sec                 
[SUM]   5.00-6.00   sec  68.5 MBytes   575 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]   6.00-7.00   sec  32.8 MBytes   275 Mbits/sec                 
[  6]   6.00-7.00   sec  33.3 MBytes   279 Mbits/sec                 
[SUM]   6.00-7.00   sec  66.1 MBytes   555 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]   7.00-8.00   sec  33.6 MBytes   282 Mbits/sec                 
[  6]   7.00-8.00   sec  34.4 MBytes   289 Mbits/sec                 
[SUM]   7.00-8.00   sec  68.1 MBytes   571 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]   8.00-9.00   sec  33.4 MBytes   280 Mbits/sec                 
[  6]   8.00-9.00   sec  33.5 MBytes   281 Mbits/sec                 
[SUM]   8.00-9.00   sec  66.9 MBytes   561 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]   9.00-10.00  sec  33.0 MBytes   277 Mbits/sec                 
[  6]   9.00-10.00  sec  33.6 MBytes   282 Mbits/sec                 
[SUM]   9.00-10.00  sec  66.6 MBytes   559 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]  10.00-11.00  sec  34.3 MBytes   287 Mbits/sec                 
[  6]  10.00-11.00  sec  34.9 MBytes   293 Mbits/sec                 
[SUM]  10.00-11.00  sec  69.1 MBytes   580 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]  11.00-12.00  sec  34.0 MBytes   286 Mbits/sec                 
[  6]  11.00-12.00  sec  34.9 MBytes   293 Mbits/sec                 
[SUM]  11.00-12.00  sec  69.0 MBytes   578 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]  12.00-13.00  sec  33.8 MBytes   284 Mbits/sec                 
[  6]  12.00-13.00  sec  34.3 MBytes   288 Mbits/sec                 
[SUM]  12.00-13.00  sec  68.1 MBytes   572 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]  13.00-14.00  sec  32.6 MBytes   274 Mbits/sec                 
[  6]  13.00-14.00  sec  33.1 MBytes   277 Mbits/sec                 
[SUM]  13.00-14.00  sec  65.7 MBytes   551 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]  14.00-15.00  sec  26.6 MBytes   223 Mbits/sec                 
[  6]  14.00-15.00  sec  27.0 MBytes   226 Mbits/sec                 
[SUM]  14.00-15.00  sec  53.6 MBytes   449 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]  15.00-16.00  sec  27.0 MBytes   226 Mbits/sec                 
[  6]  15.00-16.00  sec  27.3 MBytes   229 Mbits/sec                 
[SUM]  15.00-16.00  sec  54.3 MBytes   456 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]  16.00-17.00  sec  27.6 MBytes   231 Mbits/sec                 
[  6]  16.00-17.00  sec  27.9 MBytes   234 Mbits/sec                 
[SUM]  16.00-17.00  sec  55.5 MBytes   465 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]  17.00-18.00  sec  27.7 MBytes   232 Mbits/sec                 
[  6]  17.00-18.00  sec  27.9 MBytes   234 Mbits/sec                 
[SUM]  17.00-18.00  sec  55.6 MBytes   466 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]  18.00-19.00  sec  27.6 MBytes   231 Mbits/sec                 
[  6]  18.00-19.00  sec  27.8 MBytes   233 Mbits/sec                 
[SUM]  18.00-19.00  sec  55.4 MBytes   465 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[  4]  19.00-20.00  sec  27.6 MBytes   231 Mbits/sec                 
[  6]  19.00-20.00  sec  27.7 MBytes   232 Mbits/sec                 
[SUM]  19.00-20.00  sec  55.3 MBytes   464 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-20.00  sec   636 MBytes   267 Mbits/sec                  sender
[  4]   0.00-20.00  sec   636 MBytes   267 Mbits/sec                  receiver
[  6]   0.00-20.00  sec   647 MBytes   271 Mbits/sec                  sender
[  6]   0.00-20.00  sec   647 MBytes   271 Mbits/sec                  receiver
[SUM]   0.00-20.00  sec  1.25 GBytes   538 Mbits/sec                  sender
[SUM]   0.00-20.00  sec  1.25 GBytes   538 Mbits/sec                  receiver
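
The combined rate drops noticeably from second 14 onwards. A rough Python sketch (an illustration only, with a hypothetical helper `first_drop` and an arbitrary 10% threshold) that locates the drop in the per-second [SUM] values above and compares the averages before and after it:

```python
from statistics import mean

# Per-second [SUM] rates in Mbit/s, copied from the -P 2 -t 20 run above.
sum_mbits = [584, 574, 580, 573, 580, 575, 555, 571, 561, 559,
             580, 578, 572, 551, 449, 456, 465, 466, 465, 464]

def first_drop(rates, threshold=0.9):
    """Return the index of the first interval whose rate is below
    threshold * (mean of all earlier intervals), or None if there is none.
    The 0.9 threshold is an assumption chosen for this example."""
    for i in range(1, len(rates)):
        if rates[i] < threshold * mean(rates[:i]):
            return i
    return None

drop_at = first_drop(sum_mbits)
before = mean(sum_mbits[:drop_at])   # average before the drop
after = mean(sum_mbits[drop_at:])    # average from the drop onwards
overall = mean(sum_mbits)            # should match iperf3's 20 s summary

print(f"drop starts at second {drop_at}")
print(f"before: {before:.0f} Mbit/s, after: {after:.0f} Mbit/s, overall: {overall:.0f} Mbit/s")
print(f"relative drop: {100 * (before - after) / before:.1f} %")
```

The overall average of the per-second values agrees with the 538 Mbits/sec that iperf3 reports for the whole run, so the lower 20-second figure is explained by the slower last six seconds rather than by the two streams failing to scale.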


Can somebody confirm similar results?

The CPU boost seems to scale dynamically up to 1400 MHz without the tunables set, and it seems to have no effect on the iperf tests, but maybe on proxy and IDP!?
https://github.com/pcengines/apu2-documentation/blob/master/docs/apu_CPU_boost.md

Does somebody know?

Further, I was wondering why I was not able to get similar results based on this:
https://teklager.se/en/knowledge-base/opnsense-performance-optimization/

Anybody?






September 04, 2020, 02:02:01 PM #1 Last Edit: September 04, 2020, 02:06:06 PM by hushcoden
I did the same test with my Windows 10 laptop and my apu2e4 (v4.12.0.3), and I also wasn't able to get near 1 Gb/s (and I did follow the performance settings suggested by TekLager)  :-[

Max speed was 734 Mb/s.

I tested 10 times with option -P 10 -- screenshot attached

September 04, 2020, 07:07:05 PM #2 Last Edit: September 04, 2020, 07:09:07 PM by mfpck
Hi again,

My APU is still untouched, and based on your post I ran the test again with -P 10 as well:
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec   111 MBytes  92.9 Mbits/sec                  sender
[  4]   0.00-10.00  sec   109 MBytes  91.3 Mbits/sec                  receiver
[  6]   0.00-10.00  sec   119 MBytes   100 Mbits/sec                  sender
[  6]   0.00-10.00  sec   119 MBytes  99.9 Mbits/sec                  receiver
[  8]   0.00-10.00  sec   135 MBytes   114 Mbits/sec                  sender
[  8]   0.00-10.00  sec   135 MBytes   114 Mbits/sec                  receiver
[ 10]   0.00-10.00  sec  91.5 MBytes  76.7 Mbits/sec                  sender
[ 10]   0.00-10.00  sec  91.4 MBytes  76.7 Mbits/sec                  receiver
[ 12]   0.00-10.00  sec  42.9 MBytes  36.0 Mbits/sec                  sender
[ 12]   0.00-10.00  sec  42.8 MBytes  35.9 Mbits/sec                  receiver
[ 14]   0.00-10.00  sec  44.8 MBytes  37.6 Mbits/sec                  sender
[ 14]   0.00-10.00  sec  44.7 MBytes  37.5 Mbits/sec                  receiver
[ 16]   0.00-10.00  sec   175 MBytes   147 Mbits/sec                  sender
[ 16]   0.00-10.00  sec   175 MBytes   146 Mbits/sec                  receiver
[ 18]   0.00-10.00  sec   162 MBytes   136 Mbits/sec                  sender
[ 18]   0.00-10.00  sec   162 MBytes   136 Mbits/sec                  receiver
[ 20]   0.00-10.00  sec   110 MBytes  92.6 Mbits/sec                  sender
[ 20]   0.00-10.00  sec   110 MBytes  92.6 Mbits/sec                  receiver
[ 22]   0.00-10.00  sec  91.0 MBytes  76.3 Mbits/sec                  sender
[ 22]   0.00-10.00  sec  90.9 MBytes  76.3 Mbits/sec                  receiver
[SUM]   0.00-10.00  sec  1.06 GBytes   908 Mbits/sec                  sender
[SUM]   0.00-10.00  sec  1.05 GBytes   906 Mbits/sec                  receiver

iperf Done.
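
As a sanity check on that summary, here is a minimal Python sketch (illustration only; the regex and names are my own, not anything from iperf3 itself) that parses the sender lines above, adds up the ten streams and compares the result with the [SUM] line:

```python
import re

# Sender lines copied from the -P 10 summary above.
summary = """\
[  4]   0.00-10.00  sec   111 MBytes  92.9 Mbits/sec                  sender
[  6]   0.00-10.00  sec   119 MBytes   100 Mbits/sec                  sender
[  8]   0.00-10.00  sec   135 MBytes   114 Mbits/sec                  sender
[ 10]   0.00-10.00  sec  91.5 MBytes  76.7 Mbits/sec                  sender
[ 12]   0.00-10.00  sec  42.9 MBytes  36.0 Mbits/sec                  sender
[ 14]   0.00-10.00  sec  44.8 MBytes  37.6 Mbits/sec                  sender
[ 16]   0.00-10.00  sec   175 MBytes   147 Mbits/sec                  sender
[ 18]   0.00-10.00  sec   162 MBytes   136 Mbits/sec                  sender
[ 20]   0.00-10.00  sec   110 MBytes  92.6 Mbits/sec                  sender
[ 22]   0.00-10.00  sec  91.0 MBytes  76.3 Mbits/sec                  sender
[SUM]   0.00-10.00  sec  1.06 GBytes   908 Mbits/sec                  sender
"""

# Captures: stream id (or SUM), rate in Mbit/s, direction.
LINE = re.compile(
    r"\[\s*(\d+|SUM)\]\s+[\d.]+-[\d.]+\s+sec\s+[\d.]+\s+[MG]Bytes"
    r"\s+([\d.]+)\s+Mbits/sec\s+(sender|receiver)"
)

streams = {}      # stream id -> sender rate in Mbit/s
reported = None   # rate from the [SUM] line
for stream_id, rate, _direction in LINE.findall(summary):
    if stream_id == "SUM":
        reported = float(rate)
    else:
        streams[int(stream_id)] = float(rate)

total = sum(streams.values())
print(f"{len(streams)} streams, total {total:.1f} Mbit/s, iperf3 reports {reported:.0f} Mbit/s")
print(f"slowest stream {min(streams.values())} Mbit/s, fastest {max(streams.values())} Mbit/s")
```

The per-stream rates add up to roughly the reported sum (the small difference is rounding in the printed values), but the individual streams are spread quite unevenly, from 36 to 147 Mbit/s.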


And now I am really confused, because everything is at default without any tuning at all  ???

Or does the reset to factory defaults not reset the tunables?

Otherwise the general advice is not to tune at all ;-) or we do not understand iperf.....


Is somebody using the APU as a raw WireGuard site-to-site device and able to share a brief summary, optionally with throughput benchmark stats from e.g. iperf3?

I will do that pretty soon....