OPNsense Forum

English Forums => 24.7, 24.10 Legacy Series => Topic started by: retatefw on August 23, 2024, 08:03:42 AM

Title: Thermal Sensors widget display
Post by: retatefw on August 23, 2024, 08:03:42 AM
The Thermal Sensors widget for an i7-13700K processor displays the 24 CPUs, and it also displays a "CPU 50" and a "Zone 0". Does anyone know what "CPU 50" and "Zone 0" are reporting on?

Title: Re: Thermal Sensors widget display
Post by: doktornotor on August 23, 2024, 09:30:12 AM
Zone 0 looks like some normal thermal zone sensor. No idea about CPU 50. Anyway, the widget is only parsing info provided by kernel drivers and the hardware, see /usr/local/opnsense/scripts/system/temperature.sh
Title: Re: Thermal Sensors widget display
Post by: retatefw on August 24, 2024, 01:48:40 AM
Thanks for the pointer to the script. Executing the script returned the following for CPU 50 and Zone 0.

dev.t5nex.0.temperature=79
hw.acpi.thermal.tz0.temperature=27.9C

The t5nex is a thermal sensor on a Chelsio T520-BT ethernet NIC. I believe the temperature for this device is being reported in Fahrenheit based on a quick search (widget is reporting it as "C").
Title: Re: Thermal Sensors widget display
Post by: doktornotor on August 24, 2024, 09:19:55 AM
Quote from: retatefw on August 24, 2024, 01:48:40 AM
The t5nex is a thermal sensor on a Chelsio T520-BT ethernet NIC. I believe the temperature for this device is being reported in Fahrenheit based on a quick search (widget is reporting it as "C").

Well, I would not be so sure. (Use your finger or - safer - IR thermometer.) They do run HOT. Around 70C definitely not unusual. 79F ~= 26C: very unlikely.
Title: Re: Thermal Sensors widget display
Post by: franco on August 24, 2024, 09:30:17 AM
Well, the "C" temperature readings from sysctl are actually stored in Kelvin internally, so

> hw.acpi.thermal.tz0.temperature=27.9C

and

> dev.t5nex.0.temperature=79 (Fahrenheit)

seem to be about the same temperature.

> Around 70C definitely not unusual. 79F ~= 26C: very unlikely

And the conversion roughly matches between the two. But all of this could be coincidence.


Cheers,
Franco
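
For reference, the two readings discussed above can be compared with a small conversion sketch. The function names f_to_c and c_to_f are made up for illustration and are not part of temperature.sh or the widget; this only shows the arithmetic behind the "79F ~= 26C" estimate.

```shell
#!/bin/sh
# Hypothetical helpers (not from OPNsense): plain unit conversions via awk.

# Fahrenheit to Celsius, one decimal place.
f_to_c() {
    awk -v f="$1" 'BEGIN { printf "%.1f\n", (f - 32) * 5 / 9 }'
}

# Celsius to Fahrenheit, one decimal place.
c_to_f() {
    awk -v c="$1" 'BEGIN { printf "%.1f\n", c * 9 / 5 + 32 }'
}

f_to_c 79     # prints 26.1  (if the NIC value were Fahrenheit)
c_to_f 27.9   # prints 82.2  (the ACPI zone reading in Fahrenheit)
c_to_f 79     # prints 174.2 (if the NIC value is already Celsius)
```

So a Fahrenheit interpretation would put the NIC within a couple of degrees of the ACPI zone, while a Celsius interpretation puts it around 174F.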
Title: Re: Thermal Sensors widget display
Post by: franco on August 24, 2024, 09:32:14 AM
The "portable" way to get all valid Kelvin (actual temperature) readings is this...

# sysctl -aF | awk -F ": " '$2 ~ "^IK" { print $1 }' | grep -v "\._" | sort

Pretty sure it will stop listing dev.t5nex.0.temperature then.


Cheers,
Franco
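
As a side note on what the "IK" format in the command above implies: as far as sysctl(9) describes it, an IK value is an integer in tenths of a Kelvin by default, which sysctl then prints as Celsius. A minimal sketch of that scaling, assuming the default deciKelvin factor (the helper name dk_to_c is hypothetical and the input value is an example, not a reading from this system):

```shell
#!/bin/sh
# Hypothetical helper: convert a raw deciKelvin integer (assumed IK default
# scaling) to degrees Celsius with two decimal places.
dk_to_c() {
    awk -v dk="$1" 'BEGIN { printf "%.2f\n", dk / 10 - 273.15 }'
}

dk_to_c 3010   # 301.0 K, prints 27.85
```

A driver that exports a plain integer without the IK format gives no such hint about its unit, which is why a bare value like 79 has to be checked against the hardware.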
Title: Re: Thermal Sensors widget display
Post by: doktornotor on August 24, 2024, 09:37:29 AM
Looking elsewhere (1 (https://forum.netgate.com/topic/168790/chelsio-t520-temp/3), 2 (https://forum.netgate.com/topic/100796/resolved-new-chelsio-t520-running-very-hot)) - as said, getting those to <30C would be quite a cooling effort.

Title: Re: Thermal Sensors widget display
Post by: franco on August 24, 2024, 09:42:31 AM
Then it's coincidence indeed or maybe even a bug. Someone should convert those temp readings to use the IK formatter in the kernel to be sure.


Cheers,
Franco
Title: Re: Thermal Sensors widget display
Post by: retatefw on August 25, 2024, 05:07:27 AM
Quote from: doktornotor on August 24, 2024, 09:19:55 AM
Quote from: retatefw on August 24, 2024, 01:48:40 AM
The t5nex is a thermal sensor on a Chelsio T520-BT ethernet NIC. I believe the temperature for this device is being reported in Fahrenheit based on a quick search (widget is reporting it as "C").

Well, I would not be so sure. (Use your finger or - safer - IR thermometer.) They do run HOT. Around 70C definitely not unusual. 79F ~= 26C: very unlikely.

IR thermometer confirmed the heatsink temperature in the 79C range which is well beyond the specified operating range maximum of 55C for the card. I replaced the card with an identical T520-BT that I had in spares, and it is running at 53C.

Thanks for the input on this and the creation of the temperature widget which has allowed me to identify an apparently failing card that I had no reason to suspect. The Chelsio cards are 5+ years old so I will retire both of them.
Title: Re: Thermal Sensors widget display
Post by: doktornotor on August 25, 2024, 10:02:48 AM
Quote from: retatefw on August 25, 2024, 05:07:27 AM
IR thermometer confirmed the heatsink temperature in the 79C range which is well beyond the specified operating range maximum of 55C for the card. I replaced the card with an identical T520-BT that I had in spares, and it is running at 53C.

Just noting here, those specs are for "ambient" temperature. The vendor claims they'll survive up to 125C (measured by the sensor on the card). I guess if you manage to remove the heatsink, do some cleaning and apply some good thermal paste, they'll keep running a lot cooler for quite a while. IMHO they should be shipped with a fan, but that'd cost a couple of extra bucks... ::)
Title: Re: Thermal Sensors widget display
Post by: Patrick M. Hausen on August 25, 2024, 12:55:08 PM
The 10G copper interfaces on my Supermicro boards also run at 70 Celsius and above, although the systems are generally well ventilated.