Messages - pjdouillard

#2
Quote from: mimugmail on February 13, 2020, 04:26:17 PM
The reason why probably no dev answered is that maybe none of the devs has either an APU or such high bandwidth. Keep in mind that this is a community project. I myself only have VDSL100... I have no idea how to help because I can't reproduce it.

Maybe you can start by installing a fresh pfSense, run sysctl -a with the output to a file, do the same for OPNsense, and then diff them. Maybe pf has some other defaults.

Keep in mind that pfSense has about a 100x bigger community, so the chance that one guy with an APU and enough knowledge solves this and reports the fix (not the problem) upstream is 100x higher.
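For what it's worth, the suggested comparison would look something like this. The file names and the tunable values below are illustrative stand-ins so the snippet runs standalone, not real pfSense/OPNsense defaults:

```shell
# Illustrative sketch only: on the real boxes you would run
#   sysctl -a | sort > <name>-sysctl.txt
# on each firewall and copy both files to one machine. The two files
# written here contain made-up sample values, not real defaults.
printf 'kern.ipc.maxsockbuf: 2097152\nnet.inet.tcp.sendspace: 32768\n' > pfsense-sysctl.txt
printf 'kern.ipc.maxsockbuf: 2097152\nnet.inet.tcp.sendspace: 65536\n' > opnsense-sysctl.txt
# Show only the tunables whose defaults differ (diff exits 1 when they do):
diff pfsense-sysctl.txt opnsense-sysctl.txt || true
```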

Since you didn't read the whole thread, I will make it short for you:
-pfSense has the same problem on the same APU, and no one in that community has found a fix. Everything posted elsewhere has been tested, and none of it provides a REAL single-thread / single-stream solution.
-You don't need 1+ Gbps ISP bandwidth to reproduce the problem: a local network with CAT5E Ethernet cables between two physical PCs will do the job.
-If the devs don't have access to a PC Engines APU, I will send them one for free if they care to fix the problem.
#3
Quote from: franco on February 13, 2020, 11:57:30 AM
I don't want to be unfriendly, but I'm definitely going to close this thread if people keep comparing apples and oranges.


Cheers,
Franco

Hello Franco,

I disagree: this isn't an apples-to-oranges comparison. As this thread drags on (it started in July 2018 and there is still no resolution), comparing other firewalls with OPNsense on the SAME hardware is the only thing we can do "on our side" to try to narrow down the issue. And up to now, not a single dev has offered any help in this thread, any explanation of why we might be having the issue, or any path to resolution.

PC Engines hardware has been used by a lot of people around the world (privately and commercially) for many years, since before OPNsense was forked. It fills a segment of the market that commercial brands can't match at the same price (reliability and low power usage). So we want to maximize our investment AND use OPNsense, because we like/prefer it over other firewalls. Trying to muzzle or threaten us by closing the thread isn't the right direction, imo, and isn't what I expect from the OPNsense forum; it's a reason many of us left "that other well-known firewall" for OPNsense. We are not whining, but we are somewhat fed up with the lack of help or feedback from the people making OPNsense.

So, back to the thread itself: since other (Linux-based) firewalls are able to max out gigabit speed on any NIC of the PC Engines APU2, we are all puzzled as to why OPNsense isn't capable of doing it. FreeBSD has the best TCP/IP stack of all the *NIXes out there, so what is the problem?

We are not all operating-system developers, and thus are not equipped to check what's going on while a transfer is occurring on the APU2's NICs. Is there an issue between FreeBSD/HardenedBSD and the APU2's Intel NICs? Is FreeBSD/HardenedBSD unable to boost the AMD CPU to its 1.4 GHz turbo frequency? Anything else?

We post on these forums hoping to get answers from the devs themselves on issues we encounter, like this one. So please, don't turn into that other company; instead, maybe forward the questions to the dev team so they can take a look.

Thank you for your understanding.
#4
The same hardware running a Linux-based OS (IPFire or OpenWRT) is able to max out those 1 Gbps NICs without problems (see the post on the previous page).
#5
If you followed the thread, you know the APU2 can easily do 500+ Mbps with no tweaking. The issue is handling 1 Gbps links: OPNsense tops out well below that, while IPFire and OpenWRT achieve 900+ Mbps on the same hardware.

When I have time I will try your settings, just to see what happens.
#6
APU4B4 info (bought in June 2018):
root@OPNsense02:~ # dmidecode -t BIOS
# dmidecode 3.2
Scanning /dev/mem for entry point.
SMBIOS 2.8 present.

Handle 0x0000, DMI type 0, 26 bytes
BIOS Information
        Vendor: coreboot
        Version: v4.11.0.1
        Release Date: 12/09/2019
        ROM Size: 8192 kB
        Characteristics:
                PCI is supported
                PC Card (PCMCIA) is supported
                BIOS is upgradeable
                Selectable boot is supported
                ACPI is supported
                Targeted content distribution is supported
        BIOS Revision: 4.11
        Firmware Revision: 0.0


APU4D4 info (bought in November 2019):
root@OPNsense:~ # dmidecode -t BIOS
# dmidecode 3.2
Scanning /dev/mem for entry point.
SMBIOS 2.8 present.

Handle 0x0000, DMI type 0, 26 bytes
BIOS Information
        Vendor: coreboot
        Version: v4.11.0.1
        Release Date: 12/09/2019
        ROM Size: 8192 kB
        Characteristics:
                PCI is supported
                PC Card (PCMCIA) is supported
                BIOS is upgradeable
                Selectable boot is supported
                ACPI is supported
                Targeted content distribution is supported
        BIOS Revision: 4.11
        Firmware Revision: 0.0


#7
Same results for the APU4D4.

The BIOS on both PC Engines boards is the latest, but the CPU frequency seems capped at 1 GHz, which would explain why we only get around ~650 Mbps at best on gigabit links. The AMD GX-412TC can do 1.4 GHz on boost.
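For what it's worth, the numbers are consistent with a clock-bound bottleneck. Assuming single-flow forwarding is CPU-bound and scales roughly linearly with core clock (a big assumption, stated up front), the ~650 Mbps ceiling at 1000 MHz predicts almost exactly gigabit wire speed at the GX-412TC's 1.4 GHz boost clock:

```shell
# Back-of-the-envelope only: assumes forwarding throughput scales linearly
# with core clock. ~650 Mbps ceiling at 1000 MHz implies, at 1400 MHz:
echo $(( 650 * 1400 / 1000 ))   # -> 910 (Mbps)
```

That ~910 Mbps happens to match what the Linux-based firewalls report on the same board, which is consistent with (though not proof of) the frequency cap being the culprit.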
#8
I ran the commands you suggested on the console, and here is what I got from an APU4B4:

root@OPNsense02:~ # sysctl dev.cpu.0.freq_levels
dev.cpu.0.freq_levels: 1000/1008 800/831 600/628


Idle:
root@OPNsense02:~ # sysctl dev.cpu.0.freq
dev.cpu.0.freq: 1000


Under load:
root@OPNsense02:~ # sysctl dev.cpu.0.freq
dev.cpu.0.freq: 1000


So the frequency didn't really change. Now with powerd running, here is the output, where you can see the maximum frequency is still 1000 MHz:
root@OPNsense02:~ # sudo powerd -v
powerd: unable to determine AC line status
load   4%, current freq 1000 MHz ( 0), wanted freq  968 MHz
load   7%, current freq 1000 MHz ( 0), wanted freq  937 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq  907 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq  878 MHz
load   7%, current freq 1000 MHz ( 0), wanted freq  850 MHz
load   6%, current freq 1000 MHz ( 0), wanted freq  823 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq  797 MHz
changing clock speed from 1000 MHz to 800 MHz
load   0%, current freq  800 MHz ( 1), wanted freq  772 MHz
load   4%, current freq  800 MHz ( 1), wanted freq  747 MHz
load   6%, current freq  800 MHz ( 1), wanted freq  723 MHz
load   0%, current freq  800 MHz ( 1), wanted freq  700 MHz
load   0%, current freq  800 MHz ( 1), wanted freq  678 MHz
load   3%, current freq  800 MHz ( 1), wanted freq  656 MHz
load   5%, current freq  800 MHz ( 1), wanted freq  635 MHz
load   0%, current freq  800 MHz ( 1), wanted freq  615 MHz
load   0%, current freq  800 MHz ( 1), wanted freq  600 MHz
changing clock speed from 800 MHz to 600 MHz
load  10%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   5%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   8%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   7%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   3%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   5%, current freq  600 MHz ( 2), wanted freq  600 MHz
load  11%, current freq  600 MHz ( 2), wanted freq  600 MHz
load 143%, current freq  600 MHz ( 2), wanted freq 2000 MHz
changing clock speed from 600 MHz to 1000 MHz
load 130%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load  85%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 107%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 101%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 100%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 106%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 100%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1937 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1876 MHz
load   4%, current freq 1000 MHz ( 0), wanted freq 1817 MHz
load   6%, current freq 1000 MHz ( 0), wanted freq 1760 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1705 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1651 MHz
load   5%, current freq 1000 MHz ( 0), wanted freq 1599 MHz
load   8%, current freq 1000 MHz ( 0), wanted freq 1549 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1500 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1453 MHz
load   5%, current freq 1000 MHz ( 0), wanted freq 1407 MHz
load   8%, current freq 1000 MHz ( 0), wanted freq 1363 MHz
load   4%, current freq 1000 MHz ( 0), wanted freq 1320 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1278 MHz
load   3%, current freq 1000 MHz ( 0), wanted freq 1238 MHz
load   9%, current freq 1000 MHz ( 0), wanted freq 1199 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1161 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1124 MHz
load   3%, current freq 1000 MHz ( 0), wanted freq 1088 MHz
load   8%, current freq 1000 MHz ( 0), wanted freq 1054 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1021 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq  989 MHz
load  15%, current freq 1000 MHz ( 0), wanted freq  958 MHz
load   7%, current freq 1000 MHz ( 0), wanted freq  928 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq  899 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq  870 MHz
load   4%, current freq 1000 MHz ( 0), wanted freq  842 MHz
load   6%, current freq 1000 MHz ( 0), wanted freq  815 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq  789 MHz
changing clock speed from 1000 MHz to 800 MHz
load   0%, current freq  800 MHz ( 1), wanted freq  764 MHz
load   6%, current freq  800 MHz ( 1), wanted freq  740 MHz
load   6%, current freq  800 MHz ( 1), wanted freq  716 MHz
load   0%, current freq  800 MHz ( 1), wanted freq  693 MHz
load   0%, current freq  800 MHz ( 1), wanted freq  671 MHz
load   6%, current freq  800 MHz ( 1), wanted freq  650 MHz
load   4%, current freq  800 MHz ( 1), wanted freq  629 MHz
load   5%, current freq  800 MHz ( 1), wanted freq  609 MHz
load   0%, current freq  800 MHz ( 1), wanted freq  600 MHz
changing clock speed from 800 MHz to 600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   7%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   7%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   4%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   7%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   8%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   7%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   9%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   7%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   7%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   3%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   7%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   3%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   5%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   9%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   3%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   7%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load  10%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load  10%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   7%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   7%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   4%, current freq  600 MHz ( 2), wanted freq  600 MHz
load  75%, current freq  600 MHz ( 2), wanted freq 1200 MHz
changing clock speed from 600 MHz to 1000 MHz
load 293%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 364%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 382%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 373%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 254%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 248%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 250%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 269%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 370%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 345%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 282%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 250%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 276%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 254%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 251%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 258%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 267%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 273%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 238%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 270%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 267%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 273%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 264%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 276%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 249%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 241%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 254%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 266%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 254%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 250%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 247%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 257%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 288%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 263%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 241%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 273%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 257%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 264%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 256%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 263%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 256%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 254%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 257%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 248%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 263%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 261%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 264%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 261%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 254%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 261%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 261%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 272%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 241%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 254%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 247%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 260%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 258%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 244%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 251%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load  85%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load   4%, current freq 1000 MHz ( 0), wanted freq 1937 MHz
load 138%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 316%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 322%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 322%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 307%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 316%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 330%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 331%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 313%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 313%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 325%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 325%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 319%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 316%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 322%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 316%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 316%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 335%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 332%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 342%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 317%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 338%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 326%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 330%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 313%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 337%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 391%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 394%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 397%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 397%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 397%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 394%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 394%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 397%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 397%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 319%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 100%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 101%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 103%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 112%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 108%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 100%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 105%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 110%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 172%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 208%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 201%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 210%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 185%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 204%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 185%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 203%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 190%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 136%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 104%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 100%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load 103%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load  96%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load  13%, current freq 1000 MHz ( 0), wanted freq 1937 MHz
load   8%, current freq 1000 MHz ( 0), wanted freq 1876 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1817 MHz
load   4%, current freq 1000 MHz ( 0), wanted freq 1760 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1705 MHz
load   8%, current freq 1000 MHz ( 0), wanted freq 1651 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1599 MHz
load   4%, current freq 1000 MHz ( 0), wanted freq 1549 MHz
load   3%, current freq 1000 MHz ( 0), wanted freq 1500 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1453 MHz
load   3%, current freq 1000 MHz ( 0), wanted freq 1407 MHz
load   7%, current freq 1000 MHz ( 0), wanted freq 1363 MHz
load   4%, current freq 1000 MHz ( 0), wanted freq 1320 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1278 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1238 MHz
load   7%, current freq 1000 MHz ( 0), wanted freq 1199 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1161 MHz
load   4%, current freq 1000 MHz ( 0), wanted freq 1124 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1088 MHz
load   5%, current freq 1000 MHz ( 0), wanted freq 1054 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq 1021 MHz
load   4%, current freq 1000 MHz ( 0), wanted freq  989 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq  958 MHz
load   4%, current freq 1000 MHz ( 0), wanted freq  928 MHz
load   3%, current freq 1000 MHz ( 0), wanted freq  899 MHz
load   4%, current freq 1000 MHz ( 0), wanted freq  870 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq  842 MHz
load   4%, current freq 1000 MHz ( 0), wanted freq  815 MHz
load   0%, current freq 1000 MHz ( 0), wanted freq  789 MHz
changing clock speed from 1000 MHz to 800 MHz
load   6%, current freq  800 MHz ( 1), wanted freq  764 MHz
load   0%, current freq  800 MHz ( 1), wanted freq  740 MHz
load   5%, current freq  800 MHz ( 1), wanted freq  716 MHz
load   3%, current freq  800 MHz ( 1), wanted freq  693 MHz
load   4%, current freq  800 MHz ( 1), wanted freq  671 MHz
load   0%, current freq  800 MHz ( 1), wanted freq  650 MHz
load   4%, current freq  800 MHz ( 1), wanted freq  629 MHz
load   3%, current freq  800 MHz ( 1), wanted freq  609 MHz
load   9%, current freq  800 MHz ( 1), wanted freq  600 MHz
changing clock speed from 800 MHz to 600 MHz
load   4%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   4%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   3%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   3%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   4%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   4%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   6%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   5%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   3%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   4%, current freq  600 MHz ( 2), wanted freq  600 MHz
load   0%, current freq  600 MHz ( 2), wanted freq  600 MHz
load  25%, current freq  600 MHz ( 2), wanted freq  600 MHz
^Ctotal joules used: 73.271


I will post back with the APU4D4 and see the difference.
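A `powerd -v` capture like the one above can be condensed with a short awk one-liner. The log file name is assumed, and the lines written below are just samples from the capture so the snippet runs standalone; in practice you would redirect `powerd -v` output to the file. The point it surfaces: powerd repeatedly *wants* 2000 MHz, but the highest frequency ever granted is 1000 MHz:

```shell
# Sample lines standing in for a real `powerd -v > powerd.log` capture:
cat > powerd.log <<'EOF'
load 143%, current freq  600 MHz ( 2), wanted freq 2000 MHz
load 400%, current freq 1000 MHz ( 0), wanted freq 2000 MHz
load   4%, current freq 1000 MHz ( 0), wanted freq  968 MHz
EOF
# Field 5 is the granted frequency, field 11 the requested one:
awk '/current freq/ {
    if ($5 > maxgrant) maxgrant = $5      # highest frequency actually granted
    if ($11 > maxwant) maxwant = $11      # highest frequency powerd asked for
}
END { printf "max wanted: %d MHz, max granted: %d MHz\n", maxwant, maxgrant }' powerd.log
```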
#9
You're totally right, and I've fixed it.
#10
I wouldn't say NAT always reduces throughput; it depends on the device.
APUs and a lot of other cheap, low-powered devices do have issues with NAT, yes. It's the main reason I ditched several consumer-grade routers when I got 1 Gbps fiber at home four years ago. Back then, only the Linksys 3200ACM could keep up the speed with NAT active... until, mysteriously (like for hundreds of other people who posted on the Linksys forums), connections started to drop randomly and Internet connectivity became a nightmare.

That's when I started looking for something better, and I ended up with pfSense on an SG-3100 two years ago. All my problems were solved, and still are to this day.

#11
I haven't had time to set up the APU, but I re-did the same test under ESXi because I was curious what performance I could reach.

The ESXi 6.7 host is a Ryzen 2700X with 32 GB of RAM and its storage on a networked FreeNAS. All four VMs were running on it with 2 vCPUs and 4 GB of RAM each.

Direct iperf3 from svr1 to svr2 across the virtual switch gave ~24 Gbps.
The same flow, with svr1 passing through fw1 (NAT + rules) and then fw2 (NAT + rules) before reaching svr2, gave ~4 Gbps.

That's a far cry from what I achieved on faster hardware under VirtualBox, lol.

On another subject: with this ESXi setup, the automatic NAT rules weren't generated on either firewall for some reason (they were under VirtualBox). I find that odd. A few weeks ago, while giving a class at the college using OPNsense to set up OpenVPN with my students, I saw internal network addresses reaching my firewall's WAN port. The day before I hadn't seen this, and I hadn't changed the setup, so I blamed VirtualBox... but now I see the same behavior under ESXi, and I wonder if there is an issue with automatic outbound NAT rule generation. What could cause this behavior?
#12
Be cautious about what you read regarding a single-threaded process being the limiting factor.

When a single traffic flow enters a device, an ASIC does the heavy lifting most of the time. The work that follows (analyzing, routing, NATing that traffic) is mostly done by one CPU core (one thread) further up the processing stack.
That work cannot be efficiently distributed (parallelized) across many threads for a single flow: the destination is the same for all of them, so they would have to wait for each other, slowing down other traffic flows that need processing.

When multiple traffic flows enter the same device, the other CPU cores are of course used to handle the load.

The only ways to optimize or accelerate a single traffic flow on a CPU core are:
-good, optimized network code
-the appropriate network drivers to 'talk' to the NIC
-a faster CPU core (i.e., a higher frequency in GHz/MHz)

The same (wrong) thinking shows up with link aggregation: people bundle 4 x 1 Gbps links together and expect their single-flow speed to be 4 Gbps, then are surprised to see it is still only 1 Gbps, because each individual link is still 1 Gbps. With multiple traffic flows, the compound traffic can reach 4 Gbps, because all four 1 Gbps links are then in use.
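The link-aggregation point can be sketched in a few lines. This is a toy model, with a plain checksum standing in for the real LACP hash, not actual switch code:

```shell
# Toy illustration (not real LACP code): an aggregation group picks the
# member link by hashing flow identifiers, so every packet of one flow
# rides the same physical link and one flow tops out at one link's speed.
flow_link() {
    # hash src+dst addresses down to one of 4 member links
    echo $(( $(printf '%s' "$1$2" | cksum | cut -d ' ' -f 1) % 4 ))
}
flow_link 10.0.0.1 10.0.0.2   # same flow...
flow_link 10.0.0.1 10.0.0.2   # ...always lands on the same link
flow_link 10.0.0.5 10.0.0.9   # a different flow may land on another link
```

Because the hash input is the flow's addresses, adding links adds capacity across flows, never within one.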

I hope that clears up some confusion.

But in the end, there is definitely something not running properly in both OPNsense and pfSense on those APU boards.
The APU's hardware is fine; many others and I have shown that.
So what remains is:
a) bad drivers for the Intel i210/i211 NICs
b) bad code optimization (the code itself, or the inability to make the CPU core reach its 1.4 GHz turbo speed)
c) both a & b

The Netgate SG-3100 I have uses an ARM Cortex-A9, a dual-core CPU running at 1.6 GHz, and it's able to sustain 1 Gbps. And we saw above that pfSense is somewhat faster than OPNsense on the APU. IMO, we are really facing a FreeBSD NIC driver issue with the Intel i210/i211 chipset.
#13
I've set up another test lab (under VirtualBox) to test iperf3 speed between two Ubuntu servers, each behind an OPNsense 19.7.8 (fresh update from tonight!). All VMs use 4 vCPUs and 4 GB of RAM.

-First iperf3 test (60 seconds, 1 traffic flow):
With SVR1 and SVR2 connected directly, the virtual switch yields ~2.4 Gbps of bandwidth.

-Second iperf3 test (60 seconds, 1 traffic flow):
This time, SVR1 is behind FW1 and SVR2 is behind FW2. Both FW1 and FW2 are connected directly to the same virtual switch, with minimal rules set to allow iperf3 connectivity between SVR1 and SVR2, and both firewalls NAT outbound connectivity. The result: ~380 Mbps.

-Third iperf3 test, with PPPoE (60 seconds, 1 traffic flow):
FW1 has the PPPoE server plugin installed and configured. FW2 is the PPPoE client initiating the connection. The result: ~380 Mbps.

-Fourth iperf3 test, with PPPoE (60 seconds, 2 traffic flows): ~380 Mbps

-Fifth iperf3 test, with PPPoE (60 seconds, 4 traffic flows): ~390 Mbps

So unless I missed something, PPPoE connectivity doesn't affect network speed, as I mentioned earlier.

I will try to replicate the same setup with 2 x APU2 and post back the performance I get.
#14
My APU2 is connected via a CAT6a Ethernet cable to the ISP's modem, which in turn is connected via another CAT6a cable to the fiber optic transceiver.  The connection from the ISP's modem onward is done via PPPoE (which I don't manage - it's set up automatically by the ISP).

So the APU2 isn't handling the PPPoE connectivity (as it would have in the typical DSL scenario of 15 years ago), which is a good thing.  But even if your setup requires the APU2 to perform the PPPoE connection itself, that doesn't really impact the transmission speed.
#15
This topic hasn't received much love in the last few months, but I can attest the issue is still present: OPNsense 19.7 on the APU2 cannot reach 1 Gbps from WAN to LAN with the default setup on a single traffic flow.

So I dug around, found a few threads here and there about this, and finally landed on this topic, to which I am replying.  I saw that many did some tests, saw the proposed solution at TekLager, etc., but they don't really address the single-flow issue.

I've read about the single-threaded vs multi-threaded behavior of the *BSDs vs Linux, but single-flow traffic will only use one thread anyway, so I had to discard that as a probable cause too.

I then decided to run my own tests and see whether this was related to a single APU2 or all of them.  I've tested 3 x APU2 with different firewalls, and this is the speed I get from https://speedtest.net (with NATing enabled, of course):

OPNsense    down: ~500 Mbps    up: ~500 Mbps
pfSense     down: ~700 Mbps    up: ~700 Mbps
OpenWRT     down: ~910 Mbps    up: ~910 Mbps
IPFire      down: ~910 Mbps    up: ~910 Mbps

pfSense on Netgate SG-3100    down: ~910 Mbps    up: ~910 Mbps

My gaming PC (8700K) connected directly to the ISP's modem    down: ~915 Mbps    up: ~915 Mbps

I also did some tests virtualizing all these firewalls (except OpenWRT) on my workstation (AMD 3950X) with VirtualBox (a Type 2 hypervisor - not the best, I know; I didn't have time to set something up on the ESXi cluster), and you can subtract ~200 Mbps from all the speeds above.  That means that even virtualized, IPFire is faster than both OPNsense and pfSense running on the APU2.  I also saw that all of them use only ONE thread, and roughly the same amount of CPU%, while the transfer is going on.

My conclusions so far are these:
-The PC Engines APU2 is not the issue - this is probably a driver problem in OPNsense/pfSense.
-Single-threaded handling of a single traffic flow is not the issue either, since some firewalls are able to max out the speed on one thread.
-pfSense is still based on FreeBSD, which has one of the best network stacks in the world, but it might not be using the proper drivers for the NICs on the APU - that's my feeling, but I can't verify it.
-OPNsense is now based on HardenedBSD (a fork of FreeBSD) which adds lots of exploit mitigations directly into the code.  Those security enhancements might be the cause of the APU2's slow transfer speed.  OPNsense installed on premise on a ten-year-old Xeon X5650 (2.66 GHz) can run at 1 Gbps without breaking a sweat, so maybe a few MHz more are all OPNsense needs to max out that 1 Gbps pipe.
-OpenWRT and IPFire are Linux based, and they benefit from a much broader 'workforce' optimizing everything around them.  Their NICs are probably detected properly with the right drivers in use, and the way Linux works could also speed everything up a bit more.  And the Linux kernel is a dragster compared to the FreeBSD kernel (sorry FreeBSD, but I still love you - I'm wearing your t-shirt today!).

My next step, if I have time, would be to run direct internal speed tests with iperf3 so I have another speed chart to refer to.

Edit: FreeBSD vs HardenedBSD Features Comparison https://hardenedbsd.org/content/easy-feature-comparison

Edit 2: Another thing that came to mind is the ability of the running OS (in our case OPNsense) to 'turbo' the cores up to 1.4 GHz on the AMD GX-412TC CPU that the APU2 uses.  The base frequency is 1 GHz, but with turbo it can reach 1.4 GHz.  I am running the latest 4.10 firmware, but I can't (don't know how to) verify which frequency is being used during a transfer.  That could really explain the difference in transfer speed, and why OPNsense can't max out a 1 Gbps link while others can.  How to upgrade the BIOS on the APU2: https://teklager.se/en/knowledge-base/apu-bios-upgrade/
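For anyone who wants to check this: FreeBSD exposes the current clock via `sysctl dev.cpu.0.freq` (MHz) and the available P-states via `dev.cpu.0.freq_levels` (MHz/milliwatt pairs), so you can poll the former while an iperf3 transfer is running.  Here is a small Python sketch that parses the freq_levels output format - the sample string is made up for a GX-412TC-like CPU, the real values on an APU2 will differ:

```python
# Parse the "MHz/milliwatts" pairs reported by FreeBSD's
# `sysctl -n dev.cpu.0.freq_levels`.  Sample value is illustrative only;
# run the sysctl on the APU2 itself to get the real list.
sample_levels = "1400/1045 1200/912 1000/744 800/511 600/366"

def parse_freq_levels(levels):
    """Return (mhz, milliwatts) tuples, highest frequency first."""
    pairs = [entry.split("/") for entry in levels.split()]
    return sorted(((int(mhz), int(mw)) for mhz, mw in pairs), reverse=True)

levels = parse_freq_levels(sample_levels)
max_mhz = levels[0][0]
# If dev.cpu.0.freq never reports this top level during a transfer,
# the turbo speed is not being reached.
print(f"max available: {max_mhz} MHz")
```

Comparing the live `dev.cpu.0.freq` reading against the top entry would tell us whether OPNsense is stuck at the 1 GHz base clock.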