CPU temp reporting

Started by Dimi3, October 01, 2023, 09:39:00 AM

@meyergru It looks like the second CPU0 belongs to dev.pchtherm.0, but then why does the widget label it as CPU0?
AhnHEL (Angel)

I merged this now since I haven't seen a better approach yet:

https://github.com/opnsense/core/commit/eded37411f

It would be nice if configd had a "prefetch" and "serve expired" type of metric but in the average case this is more than enough. A suboptimal workaround can be discussed, but it will always look a bit strange in my opinion (running random actions at boot to fix an edge case on certain hardware).
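To illustrate what "prefetch" and "serve expired" could mean here, a minimal Python sketch follows. It is not configd code: the class name `ServeExpiredCache`, its methods, and the injected clock are all hypothetical. The idea is that readers always get the last known value immediately (flagged as stale once the TTL has passed), while the slow fetch only runs when `refresh()` is called out of band, for example once at startup as a prefetch.

```python
import time
from typing import Any, Callable, Optional, Tuple


class ServeExpiredCache:
    """Hypothetical cache that keeps serving the last value after it expires."""

    def __init__(self, fetch: Callable[[], Any], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # slow operation, e.g. collecting sensor data
        self._ttl = ttl          # seconds until a value counts as expired
        self._clock = clock      # injectable so the behaviour can be tested
        self._entry: Optional[Tuple[float, Any]] = None

    def refresh(self) -> None:
        """Run the slow fetch and store the result ("prefetch" when called early)."""
        self._entry = (self._clock(), self._fetch())

    def get(self) -> Tuple[Any, bool]:
        """Return (value, is_stale) without fetching, unless the cache is empty."""
        if self._entry is None:
            self.refresh()       # cold cache: the caller has to wait once
        stamp, value = self._entry
        return value, (self._clock() - stamp) > self._ttl
```

With this shape, a caller that sees `is_stale` set can still display the old value and leave the refresh to a timer or background job.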


Cheers,
Franco

@franco I applied the patch. Can it omit the tjmax graph outputs? It's not easy on the eyes. I'm still having an issue with the pchtherm.0 section that shows up as another CPU0.
AhnHEL (Angel)

Can you dump your

# configctl system sensors

here?

The 100 degree ones look like thresholds from the Intel CPU, maybe. I don't have the hardware to verify.

The widget is another issue, as the patch only deals with the backend. It would be nicer if the widget did not do too much magic guesstimating where a reading comes from, and perhaps made similar readings collapsible.

But we can get there step by step. Thanks for testing!
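As a rough illustration of grouping similar readings so they could be collapsed, here is a small Python sketch. It only assumes the `dev.<driver>.<unit>.<metric>` naming visible in the sensor names in this thread; the function name `group_sensors` and the `("other", 0)` fallback bucket are hypothetical, and this is not how the widget currently works.

```python
import re
from collections import defaultdict

# dev.<driver>.<unit>.<metric>, e.g. dev.cpu.10.temperature
SENSOR_RE = re.compile(r"^dev\.(?P<driver>[a-z]+)\.(?P<unit>\d+)\.(?P<metric>.+)$")


def group_sensors(names):
    """Group sensor names by (driver, unit), sorted with numeric unit order."""
    groups = defaultdict(list)
    for name in names:
        match = SENSOR_RE.match(name)
        if match is None:
            # Unknown naming scheme: keep the full name in a catch-all bucket.
            groups[("other", 0)].append(name)
            continue
        key = (match["driver"], int(match["unit"]))
        groups[key].append(match["metric"])
    # Sorting tuples puts cpu 2 before cpu 10, unlike plain string sorting.
    return dict(sorted(groups.items()))
```

Keeping the driver name in the key means `dev.cpu.0` and `dev.pchtherm.0` end up in separate groups even though both have unit 0.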


Cheers,
Franco





Thank you, yes.

# configctl system sensors
dev.cpu.0.coretemp.tjmax
dev.cpu.0.temperature
dev.cpu.1.coretemp.tjmax
dev.cpu.1.temperature
dev.cpu.10.coretemp.tjmax
dev.cpu.10.temperature
dev.cpu.11.coretemp.tjmax
dev.cpu.11.temperature
dev.cpu.2.coretemp.tjmax
dev.cpu.2.temperature
dev.cpu.3.coretemp.tjmax
dev.cpu.3.temperature
dev.cpu.4.coretemp.tjmax
dev.cpu.4.temperature
dev.cpu.5.coretemp.tjmax
dev.cpu.5.temperature
dev.cpu.6.coretemp.tjmax
dev.cpu.6.temperature
dev.cpu.7.coretemp.tjmax
dev.cpu.7.temperature
dev.cpu.8.coretemp.tjmax
dev.cpu.8.temperature
dev.cpu.9.coretemp.tjmax
dev.cpu.9.temperature
dev.pchtherm.0.ctt
dev.pchtherm.0.pmtemp
dev.pchtherm.0.t0temp
dev.pchtherm.0.t1temp
dev.pchtherm.0.t2temp
dev.pchtherm.0.temperature
AhnHEL (Angel)

Thanks, I added this one to ignore the coretemp threshold:

https://github.com/opnsense/core/commit/d314680

What I really dislike is the FreeBSD kernel faking the CPU temperatures for amdtemp, which isn't even per core but was made this way to match the coretemp behaviour (which is also quite bulky with lots of cores).

pchtherm(4) is also annoying with only one reading and the rest being thresholds which are hard to separate from each other.
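For illustration only, separating thresholds from live readings could look like the Python sketch below. The suffix list is an assumption taken from the sensor names posted above and is not the rule used in the linked commit; `THRESHOLD_SUFFIXES` and `drop_thresholds` are hypothetical names.

```python
# Suffixes assumed to be thresholds rather than live readings, based only on
# the sensor names shown in this thread (not on the linked commit).
THRESHOLD_SUFFIXES = (
    ".coretemp.tjmax",
    ".ctt",
    ".pmtemp",
    ".t0temp",
    ".t1temp",
    ".t2temp",
)


def drop_thresholds(names):
    """Return only the sensor names that do not end in a threshold suffix."""
    # str.endswith accepts a tuple and matches if any suffix matches.
    return [name for name in names if not name.endswith(THRESHOLD_SUFFIXES)]
```

A fixed suffix list like this is brittle, which is the difficulty with pchtherm: nothing in the names themselves marks a value as a threshold.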


Cheers,
Franco

'dev.pchtherm.0.temperature' is the only relevant reading, so the others should just be omitted/ignored for the widget. Then again, if the dev.cpu readings are available, do we even need pchtherm at all for the widget?
AhnHEL (Angel)

Yes, the whole point was being inclusive of unknown sensors that may be useful, and as you can see it's a wild west of interesting and not so interesting metrics in incoherent form.

I think we should make a default filter in the widget and go from there. Users can then adjust if needed. Adding too much glue in the backend is probably not a good approach (and it already filters a bit much for technical reasons, oh well).
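A default filter that users can adjust could be sketched as an ordered rule list where the last matching rule wins. This is a Python illustration under assumptions, not the widget's implementation: `DEFAULT_RULES`, `is_visible`, and the specific patterns are hypothetical and only reflect the sensors discussed in this thread.

```python
from fnmatch import fnmatchcase

# (pattern, show) pairs; later rules override earlier ones.
# Hypothetical defaults: show everything, hide tjmax and pchtherm thresholds,
# but keep the single pchtherm temperature reading.
DEFAULT_RULES = [
    ("*", True),
    ("*.coretemp.tjmax", False),
    ("dev.pchtherm.*", False),
    ("dev.pchtherm.*.temperature", True),
]


def is_visible(name, user_rules=()):
    """Apply the default rules, then the user's rules; the last match decides."""
    visible = True
    for pattern, show in [*DEFAULT_RULES, *user_rules]:
        if fnmatchcase(name, pattern):
            visible = show
    return visible
```

A user who does not want pchtherm at all would add `("dev.pchtherm.*", False)`, and one who wants the tjmax values back would add `("*.tjmax", True)`, without the backend having to change.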


Cheers,
Franco