PC Engines APU2 1Gbit traffic not achievable

Started by Ricardo, July 27, 2018, 12:24:54 PM

Quote from: mimugmail on October 09, 2018, 01:35:15 PM
Quote from: ricsip on September 13, 2018, 01:15:38 PM

Results (iperf -P 1 == single flow):
1)->firewall disabled, NAT disabled: can easily transmit 890-930 Mbit from WAN-->LAN and vice versa. CPU load is approx. 65% INT on one core and 10-30% INT on another; the rest is idle. Throughput is stable, with very little variation.
2)->firewall enabled, NAT disabled: this time it peaks at 740-760 Mbit from WAN-->LAN and vice versa. CPU load is 100% INT on one core plus 20% INT on another; the rest is idle. Occasionally I get strange drops to around 560 Mbit or around 630 Mbit.
3)->firewall enabled, NAT enabled: LAN-->WAN: approx. 650-720 Mbit; WAN-->LAN: around 460 Mbit constantly (100% + 20% INT).

Results for 2) and 3) are not really consistent and vary greatly between iperf sessions. So do the CPU load characteristics (sometimes a lower INT load gives higher throughput, other times double the INT load gives much lower throughput).
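
To put the three scenarios side by side, here is a rough sketch that expresses each one as a percentage of the firewall-off baseline. The figures are copied from the results above; using the midpoint of each reported range is my own simplification, not something the original poster did.

```python
# Throughput ranges in Mbit/s, copied from the results quoted above.
# WAN->LAN in scenario 3 was reported as a single figure (~460 Mbit).
RESULTS = {
    "fw off, NAT off": (890, 930),
    "fw on, NAT off": (740, 760),
    "fw on, NAT on, LAN->WAN": (650, 720),
    "fw on, NAT on, WAN->LAN": (460, 460),
}


def midpoint(rng):
    """Midpoint of a (low, high) range; a simplification, not a measured mean."""
    low, high = rng
    return (low + high) / 2


def relative_to_baseline(results, baseline_key):
    """Express every scenario as a percentage of the baseline scenario."""
    base = midpoint(results[baseline_key])
    return {
        name: round(100 * midpoint(rng) / base, 1)
        for name, rng in results.items()
    }


if __name__ == "__main__":
    for name, pct in relative_to_baseline(RESULTS, "fw off, NAT off").items():
        print(f"{name}: {pct}% of baseline")
```

With these inputs, enabling the firewall costs roughly a fifth of the baseline, and WAN-->LAN with NAT ends up at about half of it.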



I got exactly the same results. After this I tried enabling hardware offloading on the NIC, but the system doesn't boot anymore, even after a reinstall. I'll have to dig through this later this week.

Similar results with vanilla 11.1, now upgrading to 11.2


October 18, 2018, 10:40:21 AM #62 Last Edit: October 18, 2018, 10:43:54 AM by miroco
An unintentional double post.

Quote from: mimugmail on October 17, 2018, 10:47:39 AM
Similar results with vanilla 11.1, now upgrading to 11.2

Same with 11.2. I'll now install OPNsense on similar hardware to see if it's related to the hardware.

Thanks for the constant status updates :) Eagerly waiting for your results.

By the way: please don't forget that there is a currently known issue in coreboot 4.8.x regarding CPU downclocking:
https://github.com/pcengines/coreboot/issues/196

So make sure the poor performance is not because the APU lowers its clock rate to 600 MHz, instead of 1 GHz, after a couple of minutes of uptime :)
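
A minimal sketch of how one could check for this before and after a test run. It assumes a FreeBSD-based system where the cpufreq driver is attached and exposes `dev.cpu.0.freq` (in MHz); if that sysctl is missing on your install, the read will fail. The 90% tolerance is an arbitrary choice of mine.

```python
import subprocess

NOMINAL_MHZ = 1000  # 1 GHz, the expected APU2 clock mentioned above


def parse_freq(sysctl_output):
    """Parse the output of `sysctl -n dev.cpu.0.freq` (a bare integer, MHz)."""
    return int(sysctl_output.strip())


def is_downclocked(freq_mhz, nominal_mhz=NOMINAL_MHZ, tolerance=0.9):
    """True if the reported clock is clearly below the nominal clock."""
    return freq_mhz < nominal_mhz * tolerance


def read_freq_mhz():
    """Read the current CPU frequency.

    Assumption: dev.cpu.0.freq exists on this system (cpufreq attached).
    """
    result = subprocess.run(
        ["sysctl", "-n", "dev.cpu.0.freq"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_freq(result.stdout)
```

Calling `is_downclocked(read_freq_mhz())` right after boot and again after a few minutes of iperf traffic would show whether the clock has dropped in between.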


But I'm running 4.0.18?


I tested an old Sophos UTM with an Atom N540 processor and got only 500-600 Mbit in all directions, with either 1 or 10 streams. I'm searching for a device quite comparable to the APU :)


October 20, 2018, 07:19:18 AM #67 Last Edit: October 20, 2018, 07:21:38 AM by ruffy91
i210 has software configurable flow control. Maybe the configuration is not that good?
The following registers are defined for the implementation of flow control:
• CTRL.RFCE field is used to enable reception of legacy flow control packets and reaction to them
• CTRL.TFCE field is used to enable transmission of legacy flow control packets
• Flow Control Address Low, High (FCAL/H) - 6-byte flow control multicast address
• Flow Control Type (FCT) - 16-bit field to indicate flow control type
• Flow Control bits in Device Control (CTRL) register - Enables flow control modes
• Discard PAUSE Frames (DPF) and Pass MAC Control Frames (PMCF) in RCTL - controls the forwarding of control packets to the host
• Flow Control Receive Threshold High (FCRTH0) - A 13-bit high watermark indicating receive buffer fullness. A single watermark is used in link FC mode.
• DMA Coalescing Receive Threshold High (FCRTC) - A 13-bit high watermark indicating receive buffer fullness when in DMA coalescing and Tx buffer is empty. The value in this register can be higher than value placed in the FCRTH0 register since the watermark needs to be set to allow for only receiving a maximum sized Rx packet before XOFF flow control takes effect and reception is stopped (refer to Table 3-28 for information on flow control threshold calculation).
• Flow Control Receive Threshold Low (FCRTL0) - A 13-bit low watermark indicating receive buffer emptiness. A single watermark is used in link FC mode.
• Flow Control Transmit Timer Value (FCTTV) - a set of 16-bit timer values to include in transmitted PAUSE frame. A single timer is used in link FC mode.
• Flow Control Refresh Threshold Value (FCRTV) - 16-bit PAUSE refresh threshold value
• RXPBSIZE.Rxpbsize field is used to control the size of the receive packet buffer

The datasheet has very detailed descriptions on how flow control works: https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/i210-ethernet-controller-datasheet.pdf
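
To illustrate how the high and low watermarks (FCRTH0 / FCRTL0) interact, here is a simplified model of the XOFF/XON hysteresis. This is only a sketch of the general idea, not the i210's actual logic: the fill levels and thresholds are arbitrary illustrative units, and timers such as FCTTV and FCRTV are ignored.

```python
def pause_events(fill_levels, high_watermark, low_watermark):
    """Return (index, "XOFF" | "XON") events for a sequence of buffer fill levels.

    XOFF is emitted when the fill level reaches the high watermark;
    XON is emitted once it has drained to the low watermark again.
    """
    if low_watermark >= high_watermark:
        raise ValueError("low watermark must be below high watermark")

    events = []
    paused = False
    for index, fill in enumerate(fill_levels):
        if not paused and fill >= high_watermark:
            events.append((index, "XOFF"))  # ask the link partner to stop sending
            paused = True
        elif paused and fill <= low_watermark:
            events.append((index, "XON"))  # buffer drained, sender may resume
            paused = False
    return events
```

The gap between the two watermarks is what keeps the NIC from flapping between pause and resume on every packet; setting them too close together, or the high watermark too low, would throttle the sender more often than necessary.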



Played with FC and tried mixing the setup with TSO, LRO and XCSUM again: always the same result.
Found this one:

https://elatov.github.io/2017/04/pfsense-on-netgate-apu4-1gb-testing/

I don't have any other ideas for now.
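
For anyone who wants to repeat the offload tests systematically, here is a sketch that enumerates every on/off combination of the offloads mentioned above as `ifconfig` argument lists. The interface name `igb0` is an assumption, checksum offload is split into `rxcsum` and `txcsum`, and the commands are only generated, not executed. Given the boot problem reported earlier in this thread after enabling hardware offloading, treat them with care.

```python
from itertools import product

# ifconfig capability flags; a leading "-" disables the capability.
OFFLOADS = ("tso", "lro", "rxcsum", "txcsum")


def ifconfig_commands(interface, offloads=OFFLOADS):
    """Build one ifconfig argument list per on/off combination of offloads."""
    commands = []
    for states in product((True, False), repeat=len(offloads)):
        flags = [name if on else f"-{name}" for name, on in zip(offloads, states)]
        commands.append(["ifconfig", interface, *flags])
    return commands


if __name__ == "__main__":
    # "igb0" is a hypothetical interface name; adjust to your system.
    for command in ifconfig_commands("igb0"):
        print(" ".join(command))
```

Running an iperf test after each combination and recording the result next to the flags makes it easier to see whether any single offload changes the outcome.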

I tried a test kernel from franco, which might ship with 19.1, and got a slightly better rate: from 480 Mbit to 510 Mbit. OK, last test for today :)

I think such a small difference can easily be random variation between test runs. I saw similar variations myself running on the same OS.
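
One crude way to check this is to repeat each test several times and compare the gap between the averages against the run-to-run spread. The sketch below is a rule of thumb I made up for illustration, not a proper significance test; the factor of 2 is arbitrary.

```python
from statistics import mean, stdev


def difference_exceeds_noise(before, after, factor=2.0):
    """True if the gap between the means is larger than `factor` times
    the larger sample standard deviation of the two sets of runs.

    Heuristic only: it needs at least two runs per set and says nothing
    about statistical significance.
    """
    if len(before) < 2 or len(after) < 2:
        raise ValueError("need at least two runs per kernel")
    noise = max(stdev(before), stdev(after))
    return abs(mean(after) - mean(before)) > factor * noise
```

Feed it your own per-run iperf results in Mbit for the old and new kernel. A single 480 vs. 510 Mbit pair cannot be judged this way, which is exactly the point: one run per kernel is not enough to tell.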

Anyway, thanks for your support; at least I know it's not just me. Practically all PC Engines APU2 owners should consider something different for a 1 Gbit WAN, if OPNsense is to be installed on the board, of course. :-)

Quote from: ricsip on October 22, 2018, 01:30:33 PM

Anyway, thanks for your support; at least I know it's not just me. Practically all PC Engines APU2 owners should consider something different for a 1 Gbit WAN, if OPNsense is to be installed on the board, of course. :-)

Why? It easily achieves 1 Gbit with multiple streams. Why would someone need 1 Gbit on a single stream?
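
A likely reason multiple streams do better, consistent with the single saturated core in the results earlier in the thread: multi-queue NICs generally hash each flow onto one receive queue, so a single flow ends up being handled by one core. The sketch below shows the idea only. It uses CRC32 as a stand-in for the NIC's real hash, and the addresses, ports and queue count of 4 are assumptions for illustration.

```python
import zlib


def queue_for_flow(src_ip, src_port, dst_ip, dst_port, n_queues=4):
    """Map a flow's address/port tuple to a receive queue index.

    CRC32 is a stand-in here; real hardware uses its own hash function.
    """
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % n_queues


if __name__ == "__main__":
    # Hypothetical addresses; 5001 is used as the server port for illustration.
    for src_port in range(50000, 50010):
        queue = queue_for_flow("192.0.2.10", src_port, "198.51.100.20", 5001)
        print(f"source port {src_port} -> queue {queue}")
```

Every packet of one flow produces the same key and therefore the same queue, while flows with different source ports can land on different queues and be processed in parallel.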

October 22, 2018, 02:02:23 PM #74 Last Edit: October 22, 2018, 02:05:27 PM by ricsip
Do you have any chance to access a PPPoE-based WAN or a PPPoE WAN simulator? I also have trouble reaching 1 Gbit even with multiple streams if PPPoE is used for the WAN connection. I have already given up hope for 1 Gbit single-flow performance, but even multi-flow performance is quite low. When I connect a PC directly to the same PPPoE WAN (no OPNsense router/firewall in front of the PC), I can achieve much higher speeds.