SMART not working correctly?

Started by 0xDEADC0DE, February 10, 2025, 11:39:19 PM

I have the SMART status of my HDD on my dashboard.

It shows OK, but the disk has many errors. I only noticed when I updated to 25.1 and the update took around 1 hour to install and reboot.
Is this a bug in the SMART plugin?

Error 623 occurred at disk power-on lifetime: 23008 hours (958 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 e0 f5 18 40  Error: UNC at LBA = 0x0018f5e0 = 1635808

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 48 e0 f5 18 0a 08   1d+02:41:17.786  READ FPDMA QUEUED
  60 08 48 e0 f5 18 0a 08   1d+02:41:17.785  READ FPDMA QUEUED
  60 40 38 e8 73 01 00 08   1d+02:41:17.785  READ FPDMA QUEUED
  61 40 30 e8 73 01 00 08   1d+02:41:17.785  WRITE FPDMA QUEUED
  61 40 28 a8 0b 00 00 08   1d+02:41:17.784  WRITE FPDMA QUEUED

Error 622 occurred at disk power-on lifetime: 23008 hours (958 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 01 e0 f5 18 40  Error: UNC at LBA = 0x0018f5e0 = 1635808

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 40 e0 f5 18 0a 08   1d+02:40:41.447  READ FPDMA QUEUED
  60 08 38 c8 60 17 0a 08   1d+02:40:41.442  READ FPDMA QUEUED
  60 08 30 f8 04 1c 0a 08   1d+02:40:41.438  READ FPDMA QUEUED
  60 40 28 e8 be 16 0a 08   1d+02:40:41.433  READ FPDMA QUEUED
  60 08 20 e8 3b c3 0d 08   1d+02:40:41.423  READ FPDMA QUEUED

Error 621 occurred at disk power-on lifetime: 23008 hours (958 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 01 5f e7 2f 4a  Error: UNC 1 sectors at LBA = 0x0a2fe75f = 170911583

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 01 5f e7 2f 0a 08   1d+02:40:30.984  READ DMA
  c8 00 01 5f e7 2f 0a 08   1d+02:40:24.877  READ DMA
  c8 00 01 5f e7 2f 0a 08   1d+02:40:18.729  READ DMA
  c8 00 01 5f e7 2f 0a 08   1d+02:40:12.657  READ DMA
  c8 00 01 5f e7 2f 0a 08   1d+02:40:06.568  READ DMA

Error 620 occurred at disk power-on lifetime: 23008 hours (958 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 01 5f e7 2f 4a  Error: UNC 1 sectors at LBA = 0x0a2fe75f = 170911583

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 01 5f e7 2f 0a 08   1d+02:40:24.877  READ DMA
  c8 00 01 5f e7 2f 0a 08   1d+02:40:18.729  READ DMA
  c8 00 01 5f e7 2f 0a 08   1d+02:40:12.657  READ DMA
  c8 00 01 5f e7 2f 0a 08   1d+02:40:06.568  READ DMA
  c8 00 01 5e e7 2f 0a 08   1d+02:40:00.468  READ DMA

Error 619 occurred at disk power-on lifetime: 23008 hours (958 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 01 5f e7 2f 4a  Error: UNC 1 sectors at LBA = 0x0a2fe75f = 170911583

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 01 5f e7 2f 0a 08   1d+02:40:18.729  READ DMA
  c8 00 01 5f e7 2f 0a 08   1d+02:40:12.657  READ DMA
  c8 00 01 5f e7 2f 0a 08   1d+02:40:06.568  READ DMA
  c8 00 01 5e e7 2f 0a 08   1d+02:40:00.468  READ DMA
  c8 00 01 5e e7 2f 0a 08   1d+02:39:54.317  READ DMA
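
For readers who want to check the numbers in the log above, here is a minimal sketch of how the register dump maps to the reported LBA and how the power-on hours split into days and hours. It assumes the 28-bit LBA register layout (low nibble of DH, then CH, CL, SN), which reproduces the values smartctl printed here. The function names are made up for illustration.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine the SN/CL/CH/DH registers into a 28-bit LBA.

    Assumption: only the low nibble of DH carries LBA bits 24-27;
    the high nibble holds flag bits such as the LBA-mode bit.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def split_hours(hours):
    """Split power-on hours into (days, remaining hours)."""
    return divmod(hours, 24)


# Errors 623 and 622: registers "e0 f5 18 40"
print(lba28_from_registers(0xE0, 0xF5, 0x18, 0x40))  # 1635808
# Errors 621 to 619: registers "5f e7 2f 4a"
print(lba28_from_registers(0x5F, 0xE7, 0x2F, 0x4A))  # 170911583
# "23008 hours (958 days + 16 hours)"
print(split_hours(23008))  # (958, 16)
```

All five logged errors are UNC (uncorrectable read) errors, and they hit only two distinct LBAs, with the same sector being retried several times.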

It looks like the disk/storage subsystem has an actual problem.
This can happen when a connection is not making good contact, for example after the device or a cable was moved.
My suggestion is to back up the config as soon as possible and then reseat the connections. Power the machine down, reseat the connectors to ensure a firm connection, then power it on again. The config backup is in case the disk is actually dying and this is its last action; in that case you would need to reinstall and restore the configuration file.
Naturally, if you still have shell access, you can ask SMART as well:
$ sudo smartctl -a /dev/ada0

The SMART plugin only shows the status that smartctl itself would report (which you have omitted). If smartctl shows OK despite those past errors, the GUI will show the same. So the correct question would probably be: why does my drive show an OK status despite having obvious errors?
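
To see the overall status and the error log side by side, something like the following could be run over the saved text of `smartctl -a /dev/ada0`. It is only a sketch: the health and error-count line formats are assumed from typical smartctl ATA output (the health line was not part of the paste above), and `summarize_smart` is a hypothetical helper, not part of the plugin.

```python
import re


def summarize_smart(report):
    """Pull the overall health verdict and error-log numbers from smartctl text.

    Assumed line formats (typical for ATA drives, not taken from this thread):
      "SMART overall-health self-assessment test result: PASSED"
      "ATA Error Count: 623 (device log contains only the most recent five errors)"
    """
    health = re.search(
        r"overall-health self-assessment test result:\s*(\S+)", report
    )
    count = re.search(r"ATA Error Count:\s*(\d+)", report)
    # This header format does appear in the log pasted in this thread.
    logged = re.findall(
        r"^Error (\d+) occurred at disk power-on lifetime", report, re.MULTILINE
    )
    return {
        "health": health.group(1) if health else None,
        "error_count": int(count.group(1)) if count else None,
        "logged_errors": [int(n) for n in logged],
    }
```

A result with `health` set to `PASSED` alongside a non-empty `logged_errors` list would be exactly the situation described here: the verdict and the error log are separate pieces of the report, and the dashboard only reflects the verdict.
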
Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 770 up, Bufferbloat A

Quote from: cookiemonster on Today at 11:06:11 AM
[...]
Naturally, if you still have shell access, you can ask SMART as well:
[...]

...Or "Services: SMART" from the GUI.

I never look at "Health"; I mostly look at "Attributes".
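
As a rough illustration of checking the attributes rather than the health verdict, the sketch below scans the attribute table from `smartctl -a` (or `smartctl -A`) for a few sector-related counters with a non-zero raw value. The choice of attributes and the helper name `nonzero_watched` are assumptions for illustration; attribute names vary by vendor, and a raw value is not always a plain integer.

```python
# Attribute names as smartctl commonly prints them for ATA drives.
# Assumption: the drive reports these names; some vendors use others.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)


def nonzero_watched(attr_table, watched=WATCHED):
    """Return {attribute name: raw value} for watched attributes above zero.

    Expects rows in the usual column order:
    ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    found = {}
    for line in attr_table.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        # Skip raw values that are not plain integers.
        if name in watched and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found
```

An empty result does not prove the disk is healthy; it only means none of the watched counters is above zero in the text that was passed in.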