Temperature: Dashboard Temps differ massively from CLI

fastboot · December 01, 2024, 02:15:53 PM

Hi folks,

I am running a Protectli 6630 with a "12th Gen Intel(R) Core(TM) i3-1215U (6 cores, 8 threads)".

When I reload the dashboard I get the following:

CPU 0 = 81°C
CPU 1 = 81°C
CPU 2 = 58°C
CPU 3 = 58°C
CPU 4 = 79°C
CPU 5 = 79°C
CPU 6 = 79°C
CPU 7 = 79°C
CPU 8 = 79°C

When the dashboard is just idling, it's:

CPU 0 = 58°C
CPU 1 = 58°C
CPU 2 = 54°C
CPU 3 = 54°C
CPU 4 = 58°C
CPU 5 = 58°C
CPU 6 = 58°C
CPU 7 = 58°C
CPU 8 = 58°C

In the CLI I issued two commands, which also differ.

sysctl dev.cpu | grep temperature | sort

dev.cpu.0.temperature: 40.0C
dev.cpu.1.temperature: 40.0C
dev.cpu.2.temperature: 35.0C
dev.cpu.3.temperature: 35.0C
dev.cpu.4.temperature: 40.0C
dev.cpu.5.temperature: 40.0C
dev.cpu.6.temperature: 40.0C
dev.cpu.7.temperature: 40.0C

sysctl -a | grep temperature | sort

dev.cpu.0.temperature: 49.0C
dev.cpu.1.temperature: 49.0C
dev.cpu.2.temperature: 41.0C
dev.cpu.3.temperature: 42.0C
dev.cpu.4.temperature: 50.0C
dev.cpu.5.temperature: 52.0C
dev.cpu.6.temperature: 52.0C
dev.cpu.7.temperature: 52.0C

1. Could you please tell me why the Dashboard and CLI temperatures differ so much?
2. I do not understand the difference between the CLI commands.
3. On which values can I rely?

Thanks a lot in advance.

Cheers,

fb

meyergru · December 01, 2024, 04:31:29 PM

Use your google-fu and you will find a ton of threads about it.

AhnHEL · December 01, 2024, 06:57:35 PM

I think meyergru is tired of repeating his answers and he's not wrong.

You are using CPU cycles to run those commands, and one is a lot more intensive than the other, which is why the temps differ. That answers questions 1 and 2.

To answer your third, I would rely more on "sysctl dev.cpu | grep temperature | sort", since it shows the temperature with the command itself skewing the results the least.
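
A minimal sketch for putting a number on that skew, assuming both readings have been saved to files first. The function name `temp_delta` and the file names are hypothetical; the line format is the one shown in the first post.

```shell
# temp_delta BEFORE AFTER
# Both files hold lines such as "dev.cpu.0.temperature: 40.0C".
# Prints how much each sensor rose between the two readings.
temp_delta() {
    awk -F': ' '
        { sub(/C$/, "", $2) }                  # drop the trailing unit
        NR == FNR { before[$1] = $2; next }    # first file: remember the baseline
        $1 in before { printf "%s: %+.1fC\n", $1, $2 - before[$1] }
    ' "$1" "$2"
}

# Usage on the firewall (file names are hypothetical):
#   sysctl dev.cpu | grep temperature > before.txt
#   sysctl -a      | grep temperature > after.txt
#   temp_delta before.txt after.txt
```

With the numbers from the first post, dev.cpu.0 goes from 40.0C to 49.0C, so the line for that core would read `+9.0C`.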

meyergru · December 01, 2024, 07:42:42 PM

I did not want to sound rude, but sometimes I wish people would actually use the forum search or the tutorial section before asking questions. For obvious reasons, this particular topic has just made it in here, as point 17.

fastboot · December 03, 2024, 10:09:30 PM

Oh, I don't mind.

Actually, the answers in the other threads do not really answer my questions. But well...
Maybe that's down to my lack of knowledge of Unix distros. In fact, I come from the Linux side. There is no issue at all with lm-sensors, for instance. It doesn't matter how many times I run it, or even whether I have it inside whatever dashboard. I am literally reporting real-time data from xx servers to Grafana without any tangible or measurable effect.

So surely I am still wondering why this has such a huge effect. For me it's around 30-40°C for a simple temperature view.
So in fact the data is useless for keeping an eye on the temperatures. Sad but true.

At least this link enlightened me a bit: https://forum.opnsense.org/index.php?topic=41759.0. But not fully, tbh.

And nice, that I made it in your top #17. Lemme try to go at least to the top three :p

AhnHEL · December 04, 2024, 02:22:56 AM

rickyricky does a good job of explaining it.

https://forum.opnsense.org/index.php?topic=36234.msg220563#msg220563

Quote from: rickyricky on November 24, 2024, 10:48:46 PM
you're querying the same values, but one method parses MUCH less data than the other...

Here is the dev.cpu method...
```
root@router-02:~ # sysctl dev.cpu | wc -l
273
```

Looking for the temperature via dev.cpu only needs to export and grep through 273 lines.

Here is the sysctl -a method...
```
root@router-02:~ # sysctl -a | wc -l
16497
```

sysctl -a has to export 16,000+ more lines than the command that only looks at the CPU values, and grep then has to go through all of those 16k lines to find the ones that match.

The 2nd command still finishes quickly, but it causes enough additional CPU load that the temperature has already risen by the time grep filters it out.
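
The arithmetic behind those two line counts, as a small sketch. `compare_lines` is a hypothetical helper name; the two numbers are the `wc -l` results quoted above and will differ on other machines.

```shell
# compare_lines SMALL LARGE: prints the extra lines and the size ratio.
compare_lines() {
    awk -v small="$1" -v large="$2" 'BEGIN {
        printf "%d extra lines, %.1fx the output\n", large - small, large / small
    }'
}

compare_lines 273 16497
# prints: 16224 extra lines, 60.4x the output
```

So the `sysctl -a` pipeline produces and filters roughly sixty times as much text to find the same handful of temperature lines.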

passeri · December 04, 2024, 02:40:12 AM

Worrying about CPU temperature can be a little obsessive, and a little off the point.

If you are aware of the long-run capacity of your system to dissipate heat, indicated by the average difference between stable CPU and ambient temperatures, then the focus should be on the ambient temperature. Take care of that and, however much its reading may bounce around, the CPU will look after itself.
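
As a rough worked example of that idea, assuming the CPU-to-ambient difference stays about constant at steady state (a simplification). The function name and all numbers below are hypothetical, not measurements from this thread.

```shell
# expected_cpu_temp AMBIENT DELTA
# Steady-state estimate: CPU temperature = ambient + (usual CPU-minus-ambient gap).
expected_cpu_temp() {
    awk -v ambient="$1" -v delta="$2" 'BEGIN { printf "%.1f\n", ambient + delta }'
}

# Hypothetical: a box idling at 35 °C in a 22 °C room has a gap of 13 °C.
# The same box in a 30 °C room would then be expected to idle near:
expected_cpu_temp 30 13
# prints: 43.0
```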

OPNenthu · December 04, 2024, 05:36:23 AM

Some scenarios in which I think this might not be benign:

1) Newcomers to OPNsense, especially those with newly purchased devices, who don't have a prior idea of what a normal baseline temperature is. They come away with a false impression.

2) Small fanless devices (Intel N-series, for example) with limited thermal margins may be pushed closer to throttle limits.

3) Hardware vendor support channels receiving reports about temperature hikes on devices running recent OPNsense.

Having said that, I can see the great discussions and progress on the GitHub issue links. Many thanks :)

franco · December 04, 2024, 09:36:16 AM

> 1) Newcomers to OPNsense, especially those with newly purchased devices, who don't have a prior idea of what a normal baseline temperature is. They come away with a false impression.

OK, but this is directly tied to how some modern hardware seems to be built. This hasn't been an issue for a decade. My biggest gripe is that changing the OS to accommodate how *some* hardware dissipates heat, in order to match user expectations, is a hilarious premise. But the sensor is really this hot, so what reality do we want to live in?

> 2) Small fanless devices (Intel N-series, for example) with limited thermal margins may be pushed closer to throttle limits.

I think this is exactly my point. The N series in particular has been the source of a number of funky issues due to design and build choices. Fixing this in software has become the trend.

> 3) Hardware vendor support channels receiving reports about temperature hikes on devices running recent OPNsense.

But they would on any hardware given the circumstances? After all it *is* the correct temperature? Are we disputing this again?

We're changing this for 25.1, although some people have complained about the approach. This really cannot be fixed in any sensible way; the change only masks a real-world hardware quirk.

Cheers,
Franco

Patrick M. Hausen · December 04, 2024, 09:44:59 AM

Why not change the widget title to "CPU temperature" and use dev.cpu exclusively?

fastboot · December 04, 2024, 10:33:08 AM

Quote from: Patrick M. Hausen on December 04, 2024, 09:44:59 AM
Why not change the widget title to "CPU temperature" and use dev.cpu exclusively?

Very good point! I was just thinking of creating a widget like that for myself. Because at the end of the day I want to rely on data. If the data is not correct, I don't need it.

The other point about new hardware. Yes, my hardware is brand new. But actually I am not really a newcomer. It's just the hardware. I even discussed the temperatures with the vendor, like if they are normal, or if this is an issue with OPNsense itself.

Meanwhile I did some tests with a Linux distro, lm-sensors and stress-ng. The behavior is absolutely normal. I reach ~80°C when I max out the cores for 30+ minutes. At idle it's in the range of 30-35°C. I do this with any new hardware I build. Well, this one I did not build myself, but you get my point, I suppose. When I stress-test the CPU, both fans also speed up and cool the system.

I really do appreciate the effort of the OPNsense team to create this lovely piece of software. I really do. But I also appreciate accuracy and data I can rely on.

Patrick M. Hausen · December 04, 2024, 10:52:30 AM

Quote from: fastboot on December 04, 2024, 10:33:08 AM
Because at the end of the day I want to rely on data. If the data is not correct, I don't need it.

[...]

I really do appreciate the effort of the OPNsense team to create this lovely piece of software. I really do. But I also appreciate accuracy and data I can rely on.

But the data shown is correct. At the moment the dashboard is rendered, the CPU temperature is higher because of the processing taking place to display the dashboard.

franco · December 04, 2024, 10:53:35 AM

> Why not change the widget title to "CPU temperature" and use dev.cpu exclusively?

These are the ones with the "bad" reading? :) You may be thinking of solving an adjacent issue which also plays into the fact the kernel fakes amdtemp-per-CPU temperatures for no apparent reason.

Cheers,
Franco

Patrick M. Hausen · December 04, 2024, 10:58:14 AM

Quote from: franco on December 04, 2024, 10:53:35 AM
> Why not change the widget title to "CPU temperature" and use dev.cpu exclusively?

These are the ones with the "bad" reading? :) You may be thinking of solving an adjacent issue which also plays into the fact the kernel fakes amdtemp-per-CPU temperatures for no apparent reason.

I suggested this because the processing effort of `sysctl -a | grep`, run in an attempt to catch every sensor that might be present, is apparently what increases the temperature, while most people will be satisfied with monitoring CPU temperature alone.

Interestingly I notice almost nothing of this effect on my hardware:

root@opnsense:~ # sysctl dev.cpu.0.temperature
dev.cpu.0.temperature: 44.1C
root@opnsense:~ # sysctl -a | grep dev.cpu.0.temperature
dev.cpu.0.temperature: 44.6C

franco · December 04, 2024, 11:11:20 AM

# sysctl dev.cpu | grep temperature

This may be an option, at the cost of losing track of all other sensors, but only if it doesn't heat everything up for the people experiencing this? ;)
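
A minimal sketch of what consuming only the dev.cpu subtree could look like, assuming the line format shown earlier in the thread. `temp_summary` is a hypothetical helper, not anything OPNsense ships; it does the filtering in a single awk process and reports the coolest and hottest core.

```shell
# temp_summary: reads sysctl output on stdin, keeps only the temperature
# lines, and prints the number of sensors plus the lowest and highest reading.
temp_summary() {
    awk -F': ' '
        /\.temperature: / {
            t = $2; sub(/C$/, "", t); t += 0   # "40.0C" -> 40
            if (n == 0 || t < min) min = t
            if (n == 0 || t > max) max = t
            n++
        }
        END { if (n) printf "sensors=%d min=%.1fC max=%.1fC\n", n, min, max }
    '
}

# Usage on the firewall:
#   sysctl dev.cpu | temp_summary
```

Fed the eight dev.cpu readings from the first post, this would print `sensors=8 min=35.0C max=40.0C`.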
