OPNsense Forum

English Forums => Hardware and Performance => Topic started by: thowe on January 30, 2021, 12:09:42 pm

Title: [SOLVED] APU - Is your CPU clock stuck at low frequency after some time?
Post by: thowe on January 30, 2021, 12:09:42 pm
The APU2/3/4 boards are well suited for use as routers under OPNsense: inexpensive, reliable hardware with an open-source BIOS. Accordingly, many users here in the forum run them.

The AMD GX-412TC SoC they use is certainly much slower than an i3 or i5 system. However, I can say from experience that it is more than sufficient in many cases and more performance is simply not necessary. That is why I prefer such a cost-effective and power-saving device whenever possible.

However, questions have been piling up here and in other forums over the past few months as to why APU-based firewall performance dips and doesn't seem to be enough for some, while it's fast enough for others in a similar context.

Of course, this always depends on the individual case: What services are enabled on the firewall? Does WAN have to be connected via PPPoE? How large is the MTU? Are the correct tunables set? Etc.

I now wonder whether there is one more factor. I have two APU2 boards in use (APU2C4 and APU2E4). I noticed that they are really responsive and powerful after a reboot. After some time (several hours), they become noticeably sluggish and the CPU utilization suddenly seems higher.

I then noticed that after a reboot, the APU can scale the frequency up to the nominal maximum, which is 1GHz. After a while, however, the maximum achievable frequency seems to be reduced to 600MHz and stays there until the next reboot. After the reboot, 1GHz can be reached again for a certain time, until the 600MHz limit suddenly applies again.

I would be interested to know what the situation looks like with your APU. If you want to participate, you can post your observations here. In the following I describe how you can determine the maximum busy clock.


You need console or SSH access to your OPNsense:

All the measurements are done on the console or in an SSH shell on the OPNsense. If you do not have access to the console, you can set up SSH access as follows:
Note: After taking the measurements, the access can be deactivated again for security reasons.


You need to install and use the tool turbostat:

The measurements are done with the tool turbostat, which can be installed as follows:

Code: [Select]
pkg add http://pkg0.isc.freebsd.org/FreeBSD:12:amd64/latest/All/turbostat-4.17_2.txz
rehash

Before the first measurement, you have to load the kernel module cpuctl once:

Code: [Select]
kldload cpuctl
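
The load step can also be made conditional, so that it is safe to repeat. The following is only a sketch: it assumes that the presence of /dev/cpuctl0 (the device turbostat complains about in the output further down) is a good enough check, and the variables CPUCTL_DEV and KLDLOAD are illustrative names that are not part of the original post.

```shell
#!/bin/sh
# Sketch: load cpuctl only if its device node is missing.
# CPUCTL_DEV and KLDLOAD are illustrative overrides (assumptions, not from
# the original post) so the logic can be dry-run without kldload.
CPUCTL_DEV="${CPUCTL_DEV:-/dev/cpuctl0}"
KLDLOAD="${KLDLOAD:-kldload}"

ensure_cpuctl() {
    if [ -e "$CPUCTL_DEV" ]; then
        # Device node exists, so the module is already usable.
        echo "cpuctl already available"
    elif "$KLDLOAD" cpuctl; then
        echo "cpuctl loaded"
    else
        echo "could not load cpuctl" >&2
        return 1
    fi
}
```

Calling ensure_cpuctl as root before starting turbostat should then be enough.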

A measurement series is started as follows:
Code: [Select]
turbostat --interval 3
After that the tool prints the CPU statistics every 3 seconds.

After a reboot everything runs normally and the output shows that the Bzy_MHz is near 1GHz:
Code: [Select]
root@router:~ # turbostat --interval 3
turbostat version 17.06.23 - Len Brown <lenb@kernel.org>
turbostat: /dev/cpuctl0 missing, kldload cpuctl: No such file or directory
root@router:~ # kldload cpuctl
root@router:~ # turbostat --interval 3
turbostat version 17.06.23 - Len Brown <lenb@kernel.org>
CPUID(0): AuthenticAMD 13 CPUID levels; family:model:stepping 0xf:30:1 (15:48:1)
CPUID(1): SSE3 MONITOR - - - TSC MSR - -
CPUID(6): APERF, No-TURBO, No-DTS, No-PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, No-EPB
CPUID(7): No-SGX
NSFOD /sys/devices/system/cpu/cpu1/cpufreq/scaling_driver
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       355     38.76   915     998     170
0       0       247     28.12   880     998     32
1       1       222     25.44   871     998     69
2       2       414     44.66   928     998     43
3       3       536     56.82   943     998     26
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       313     34.39   910     998     334
0       0       410     44.19   928     998     184
1       1       336     36.68   915     998     80
2       2       236     26.66   885     998     49
3       3       269     30.02   898     998     21
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       267     29.60   904     998     247
0       0       520     54.88   947     998     68
1       1       289     31.59   914     998     82
2       2       127     15.56   813     998     42
3       3       135     16.36   825     998     55
^C

When the OPNsense has been running for a few hours, the output shows that the Bzy_MHz is capped at about 600MHz (even under maximum load):
Code: [Select]
root@router:~ # turbostat --interval 3
turbostat version 17.06.23 - Len Brown <lenb@kernel.org>
CPUID(0): AuthenticAMD 13 CPUID levels; family:model:stepping 0xf:30:1 (15:48:1)
CPUID(1): SSE3 MONITOR - - - TSC MSR - -
CPUID(6): APERF, No-TURBO, No-DTS, No-PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, No-EPB
CPUID(7): No-SGX
NSFOD /sys/devices/system/cpu/cpu1/cpufreq/scaling_driver
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       52      8.61    599     998     162
0       0       78      13.01   599     998     37
1       1       37      6.21    599     998     69
2       2       44      7.40    599     998     35
3       3       47      7.82    599     998     21
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       57      9.56    598     997     208
0       0       87      14.48   598     996     51
1       1       45      7.53    598     996     79
2       2       55      9.15    599     998     62
3       3       42      7.09    599     998     16
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       59      9.81    599     999     297
0       0       94      15.67   600     1000    67
1       1       45      7.56    600     1000    104
2       2       60      9.97    599     998     68
3       3       36      6.01    599     998     58
^C
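
To compare two captures like the ones above without reading every row, the package-wide summary rows (those with "-" in the Core and CPU columns) can be reduced to a single number. This is a minimal sketch that assumes the column layout shown in the outputs above; the function name max_bzy_mhz is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the highest package-wide Bzy_MHz found in turbostat output.
# Reads captured turbostat text on stdin.
max_bzy_mhz() {
    awk '
        # Header row: remember which column holds Bzy_MHz.
        $1 == "Core" {
            for (i = 1; i <= NF; i++) if ($i == "Bzy_MHz") col = i
            next
        }
        # Summary row ("-" for Core and CPU): track the maximum value.
        col && $1 == "-" && $2 == "-" {
            if ($col + 0 > max) max = $col + 0
        }
        # Print nothing at all if no header row was seen.
        END { if (col) print max + 0 }
    '
}
```

With the output saved to a file (the file name is up to you), `max_bzy_mhz < turbostat.log` prints one value. A result that stays near 600 while the box is busy matches the stuck state described above.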

When you report your observations, it would be interesting to know which BIOS version you have installed (can be conveniently viewed in the Hardware Information widget) and whether you have the Core Performance Boost feature set to enabled or disabled in the BIOS (in newer BIOSes, the default value is enabled).
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: hushcoden on January 31, 2021, 02:34:04 pm
I've got an APU2E4 running firmware v4.13.0.2 and the core performance boost feature is enabled - OPNsense 21.1-amd64

Code: [Select]
root@hush:/ # turbostat --interval 3
turbostat version 17.06.23 - Len Brown <lenb@kernel.org>
CPUID(0): AuthenticAMD 13 CPUID levels; family:model:stepping 0xf:30:1 (15:48:1)
CPUID(1): SSE3 MONITOR - - - TSC MSR - -
CPUID(6): APERF, No-TURBO, No-DTS, No-PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, No-EPB
CPUID(7): No-SGX
NSFOD /sys/devices/system/cpu/cpu2/cpufreq/scaling_driver
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       436     46.23   944     998     2631
0       0       397     42.67   929     998     153
1       1       432     45.78   944     998     2270
2       2       404     42.63   947     998     99
3       3       513     53.83   953     998     109
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       431     45.38   951     998     671
0       0       351     37.51   936     998     48
1       1       527     55.12   957     998     454
2       2       375     39.53   950     998     65
3       3       472     49.38   955     998     104
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       314     32.95   954     998     1267
0       0       204     22.01   925     998     171
1       1       358     37.30   959     998     701
2       2       264     27.91   947     998     195
3       3       431     44.56   967     998     200
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       301     31.59   953     998     1600
0       0       219     23.64   928     998     67
1       1       306     31.95   957     998     1278
2       2       170     18.10   937     998     132
3       3       510     52.68   967     998     123
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       269     27.81   966     998     783
0       0       279     29.08   960     998     145
1       1       350     36.08   969     998     374
2       2       235     24.30   966     998     135
3       3       211     21.78   969     998     129
^C
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: Patrick M. Hausen on January 31, 2021, 03:35:12 pm
Looks like this on my apu4d4:
Code: [Select]
Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz IRQ
- - 87 14.47 599 998 7731
0 0 28 4.71 599 998 1023
1 1 42 6.95 599 998 2394
2 2 147 24.60 599 998 543
3 3 129 21.62 599 998 3771

BIOS is coreboot v4.13.0.2.
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: thowe on January 31, 2021, 08:39:22 pm
Thanks! Interesting!

@hushcoden: Your CPU seems to run normally - up to 1GHz. How long had it been running when you ran turbostat?

@pmhausen: Your CPU seems to be stuck at 600 MHz. So are mine some time after a reboot.

The question is: What are the differences between the setups?

What I can say: I have tried multiple BIOS versions without any difference. And even disabling Core Performance Boost in the BIOS didn't help. The CPU is still stuck at 600MHz after one or two hours.

Maybe we will find the root cause.
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: Patrick M. Hausen on January 31, 2021, 09:15:58 pm
Code: [Select]
root@opnsense:~ # uptime
 9:14PM  up  5:34, 1 user, load averages: 0.34, 0.42, 0.36
root@opnsense:~ # turbostat --interval 2
turbostat version 17.06.23 - Len Brown <lenb@kernel.org>
CPUID(0): AuthenticAMD 13 CPUID levels; family:model:stepping 0xf:30:1 (15:48:1)
CPUID(1): SSE3 MONITOR - - - TSC MSR - -
CPUID(6): APERF, No-TURBO, No-DTS, No-PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, No-EPB
CPUID(7): No-SGX
NSFOD /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver
Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz IRQ
- - 409 43.61 938 998 7590
0 0 143 15.96 896 998 564
1 1 528 55.93 944 998 2895
2 2 494 52.45 942 998 822
3 3 472 50.11 941 998 3309
^C

What I changed: I configured powerd to use "Maximum" instead of "Hiadaptive".

I'll check again tomorrow in the morning.
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: thowe on January 31, 2021, 09:21:09 pm
That is interesting! Thanks.
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: hushcoden on January 31, 2021, 10:04:49 pm
I've attached my power savings settings and my tunables.

Code: [Select]
root@hush:/ # uptime
 9:02PM  up 2 days,  2:11, 1 user, load averages: 1.01, 0.91, 0.85
root@hush:/ # turbostat --interval 3
turbostat version 17.06.23 - Len Brown <lenb@kernel.org>
CPUID(0): AuthenticAMD 13 CPUID levels; family:model:stepping 0xf:30:1 (15:48:1)
CPUID(1): SSE3 MONITOR - - - TSC MSR - -
CPUID(6): APERF, No-TURBO, No-DTS, No-PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, No-EPB
CPUID(7): No-SGX
NSFOD /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       323     34.65   932     998     4694
0       0       368     39.41   935     998     89
1       1       394     42.28   932     998     4448
2       2       331     35.47   934     998     89
3       3       197     21.45   921     998     68
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       178     19.21   926     998     1916
0       0       236     25.55   926     998     6
1       1       162     17.25   940     998     1742
2       2       190     20.53   924     998     154
3       3       124     13.52   914     998     14
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       219     23.63   927     998     3277
0       0       295     31.81   927     998     87
1       1       258     27.77   929     998     3176
2       2       120     12.70   941     998     13
3       3       204     22.25   916     998     1
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       138     14.68   938     998     4244
0       0       146     15.74   930     998     100
1       1       115     12.15   944     998     3978
2       2       163     17.21   945     998     166
3       3       128     13.63   936     998     0
^C
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: Patrick M. Hausen on February 01, 2021, 08:41:42 am
@thowe: update as promised.
Code: [Select]
Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz IRQ
- - 418 44.10 947 998 7750
0 0 236 25.09 940 998 534
1 1 238 26.08 911 998 1829
2 2 554 58.06 954 998 2052
3 3 643 67.16 958 998 3335

I immediately suspected powerd when I read your post, but changing the settings did not have an immediate effect without a reboot, so it took me some time to have an opportunity for that.

@hushcoden is not running powerd at all ... which would fit the picture.
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: thowe on February 01, 2021, 10:17:03 am
This would mean that the problem occurs in connection with powerd.

The short-term recommendation for maximum performance would be to switch off powerd or set it to the profile Maximum. The downside is that more energy is consumed than necessary.

I have a ticket open with the BIOS developers and will provide the information there, because the goal should be that we can use the power dynamically without burning energy senselessly.
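
For anyone trying the powerd settings mentioned above, FreeBSD's cpufreq sysctls give a quick view of what has been requested. This is only a sketch under assumptions: it presumes a cpufreq driver is attached so that dev.cpu.0.freq and dev.cpu.0.freq_levels exist, the helper names are made up, and the example levels in the comment are illustrative rather than measured on an APU. Note also that the sysctl reports the level set through cpufreq, which may not match the effective clock that turbostat measures, so it complements turbostat and does not replace it.

```shell
#!/bin/sh
# Sketch: compare the currently set frequency with the highest level offered.
# freq_levels is a space-separated list of "MHz/milliwatts" pairs, for
# example "1000/924 800/760 600/571" (illustrative values only).
top_freq_level() {
    # Split the list into one pair per line, keep the MHz part, take the max.
    printf '%s\n' $1 | cut -d/ -f1 | sort -n | tail -n 1
}

at_top_level() {
    # $1 = current frequency in MHz, $2 = freq_levels string
    [ "$1" -eq "$(top_freq_level "$2")" ]
}

# On the firewall itself this could be used as:
#   at_top_level "$(sysctl -n dev.cpu.0.freq)" \
#                "$(sysctl -n dev.cpu.0.freq_levels)" \
#       && echo "at top level" || echo "below top level"
```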
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: thowe on February 01, 2021, 05:44:03 pm
I have now also set my powerd to maximum and restarted. Can confirm that the problem does not occur.

Code: [Select]
Core    CPU     Avg_MHz Busy%   Bzy_MHz TSC_MHz IRQ
-       -       523     55.64   940     998     536
0       0       431     46.42   927     999     194
1       1       486     52.01   935     998     123
2       2       619     65.20   950     998     92
3       3       557     58.92   945     998     127
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: Ricardo on February 06, 2021, 02:30:16 pm
This damn AMD Jaguar SoC still has a lot of problems even after so many years of maturity (this exact 600MHz stuck problem seems to recur every single year, even though it was "solved" in coreboot multiple times).
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: thowe on February 06, 2021, 04:19:53 pm
I don't think it is clear what the cause is this time. Powerd? SOC? BIOS? Something else?
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: Patrick M. Hausen on February 06, 2021, 05:05:16 pm
If it works as expected without powerd then my personal solution is "don't run it, then". I mean, the APU series of devices are such low power platforms anyway, there's probably not much to gain.
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: dave on February 10, 2021, 06:48:54 pm
A ~12 watt device left running 24/7 is gonna cost you ~£10/year... so....
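
The arithmetic behind that estimate can be sketched as follows. The price per kWh is a hypothetical input chosen for illustration and is not a figure from this thread; the function name is made up as well.

```shell
#!/bin/sh
# Sketch: yearly energy and cost for a device drawing a constant wattage.
# The price per kWh passed in is an assumption, not a value from the thread.
yearly_cost() {
    # $1 = watts, $2 = price per kWh
    awk -v w="$1" -v p="$2" 'BEGIN {
        kwh = w * 24 * 365 / 1000      # watts * hours per year -> kWh
        printf "%.2f kWh/year, %.2f per year\n", kwh, kwh * p
    }'
}

yearly_cost 12 0.10   # prints "105.12 kWh/year, 10.51 per year"
```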

Thing I'd like to know is why, in spite of the newer BIOSes, the CPU still does boost to 1.4GHz....?
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: thowe on February 10, 2021, 11:00:58 pm
Quote
Thing I'd like to know is why, in spite of the newer BIOSes, the CPU still does boost to 1.4GHz....?

Yes, this should be the case if it is enabled in the BIOS. However, it is not easy to verify since the boost is CPU-internal and can hardly be observed from the outside. The easiest way to find out is with benchmarks, e.g. with stress-ng. There are typical bogo ops values for CPB on and off.
Title: Re: APU - Is your CPU clock stuck at low frequency after some time?
Post by: thowe on November 23, 2021, 09:28:35 am
Update:

Since BIOS version v4.14.0.1 (June 2021) the problem seems to be fixed:

Quote
with the recent v4.14.0.1 we have fixed some issues related to CPU boost and C-states which may help with the problem of idling CPUs and stuck frequencies. It should also improve the stability of the BSD systems.

coreboot didn't include the core C6 (CC6) save state memory in the memory map. OS could accidentally access this memory and overwrite core states. CC6 is required for CPU boost to work and is a lower power state for a core.

I can confirm that with the current version v4.14.0.6 I don't see the problem anymore. That is, with the current BIOS you can turn on powerd under OPNsense on an APU2 and leave it on Hiadaptive. The CPU adapts well to the load and always goes back to maximum frequency if necessary.

If you don't need gigabit internet, the APU2 is still a well-proven and stable base for OPNsense. I have two of them.
Title: Re: [SOLVED] APU - Is your CPU clock stuck at low frequency after some time?
Post by: thowe on February 16, 2022, 01:52:19 pm
For quite some time it really did look like the problem was solved.

To be sure, I just logged into the firewall again and looked.

Unfortunately, I find that despite the new firmware, the problem is still there.

The BIOS developers have been informed. In the meantime it is better to turn off powerd or set it to Maximum.
Title: Re: [SOLVED] APU - Is your CPU clock stuck at low frequency after some time?
Post by: johnmcallister on February 28, 2024, 04:27:23 am
I don't think I'm necro'ing this thread since I'm using identical hardware (APU2E4), just on a later version of OPNsense which may have broken something.

My system:
Code: [Select]
root@myopnsenserouter:~ # uname -a
FreeBSD myopnsenserouter.mydomain.com 13.2-RELEASE-p10 FreeBSD 13.2-RELEASE-p10 stable/24.1-n254984-f7b006edfa8 SMP amd64

I installed turbostat, and after successfully loading the kernel module:

Code: [Select]
kldload cpuctl

when running turbostat it executes and appears to start OK, but after 5 seconds it core-dumps:

Code: [Select]
root@myopnsenserouter:~ # turbostat
turbostat version 17.06.23 - Len Brown <lenb@kernel.org>
CPUID(0): AuthenticAMD 13 CPUID levels; family:model:stepping 0xf:30:1 (15:48:1)
CPUID(1): SSE3 MONITOR - - - TSC MSR - -
CPUID(6): APERF, No-TURBO, No-DTS, No-PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, No-EPB
CPUID(7): No-SGX
NSFOD /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver
Floating exception (core dumped)


Here is the dmesg output:
Code: [Select]
pid 65735 (turbostat), jid 0, uid 0: exited on signal 8 (core dumped)
My intuition is that the device tree or device names have moved & turbostat is simply crashing because it can't find the device to sample?

Are there any further hints or comments on how to fix this, or on whether turbostat's functionality has been broken by changes in OPNsense over the past couple of years?
Title: Re: [SOLVED] APU - Is your CPU clock stuck at low frequency after some time?
Post by: johnmcallister on February 28, 2024, 04:38:06 am
Replying further, as I found another relevant thread on this turbostat core-dump issue:

https://forum.opnsense.org/index.php?topic=30148.msg145554#msg145554

In my case, I think I did install the most-recent version of turbostat, but it's still crashing. Oh well, I guess no resolution at this time:

Code: [Select]
root@myopnsenserouter:~ #  file /usr/local/sbin/turbostat
/usr/local/sbin/turbostat: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 13.2, FreeBSD-style, stripped

root@myopnsenserouter:~ # uname -a
FreeBSD myopnsenserouter.mydomain.com 13.2-RELEASE-p10 FreeBSD 13.2-RELEASE-p10 stable/24.1-n254984-f7b006edfa8 SMP amd64