Hello,
We are facing the same issue as https://forum.opnsense.org/index.php?topic=26661.0 with version 22.7.4-amd64. OPNsense runs for about 20 hours and then stops responding.
Error messages:
- Solaris: warning pool has encountered an uncorrectable io error suspended
- the console shows some CAM errors: device not ready, AHCI reset, CAM timeout...
It really looks like a hardware problem, but we ran a long smartctl test on our disk and it reported no errors.
Do you know how to resolve this issue?
Cheers
Those zfs errors indicate either a bad disk or a bad connection to the drive. What SMART tests did you run? I would try reseating the cables connecting the disk for a start.
I ran this one: smartctl -t long /dev/ada0
zroot is on /dev/ada0p4 (there is only one drive per host).
I will try reseating the cables. But I run two servers in HA, and the error has appeared at random on both.
What is the hardware you are using for the servers? Also, brand and model of the drives? Are you using a raid controller?
The problem happened again this morning.
Hardware is custom:
No RAID (AHCI SATA)
Base Board Information
Manufacturer: MSI
Product Name: H81M-E34 (MS-7817)
CPU x1
Version: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz
Voltage: 1.2 V
External Clock: 100 MHz
Max Speed: 3800 MHz
Current Speed: 3700 MHz
Memory Device x2
Size: 4 GB
Type: DDR3
Type Detail: Synchronous
Speed: 1333 MT/s
Manufacturer: 0420
Rank: 1
Configured Memory Speed: 1333 MT/s
Minimum Voltage: 1.35 V
Maximum Voltage: 1.5 V
Configured Voltage: 1.5 V
Disk x1
SSD 64G
Just before the host crashed, I saw ZFS errors. I don't see them after a reboot. I need to reinstall:
pool: zroot
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
scan: scrub repaired 0B in 00:00:06 with 0 errors on Mon Sep 26 14:48:34 2022
config:
NAME        STATE     READ WRITE CKSUM
zroot       ONLINE       0     0     0
  ada0p4    ONLINE       0 4.22G     0
errors: Permanent errors have been detected in the following files:
<metadata>:<0x0>
<metadata>:<0x8416>
<metadata>:<0x34>
<metadata>:<0x43>
<metadata>:<0x46>
zroot/ROOT/default:<0x0>
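For anyone who wants to watch for this condition before the host locks up, here is a rough sketch of how the READ/WRITE/CKSUM columns of the `zpool status` output above could be checked from a script. The function names `parse_count` and `devices_with_errors` are made up for illustration, and the assumption that abbreviated counters such as `4.22G` use powers of 1024 is mine, not something stated in this thread. The parsed value is only approximate, because the abbreviation is rounded.

```python
import re

# Assumption: abbreviated counters (e.g. "4.22G") use powers of 1024.
_SUFFIX = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_count(text):
    """Convert a counter such as '0' or '4.22G' to an approximate number."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT]?)", text)
    if match is None:
        raise ValueError(f"unrecognised counter: {text!r}")
    value, suffix = match.groups()
    return float(value) * _SUFFIX.get(suffix, 1)


def devices_with_errors(status_text):
    """Return {device: (read, write, cksum)} for rows with a nonzero counter."""
    bad = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True  # device rows start after this header
            continue
        if not in_config:
            continue
        if line.startswith("errors:"):
            break  # end of the device table
        if len(fields) < 5:
            continue  # blank or incomplete row
        try:
            counts = tuple(parse_count(f) for f in fields[2:5])
        except ValueError:
            continue  # not a device row
        if any(counts):
            bad[fields[0]] = counts
    return bad


# Sample text copied from the output posted in this thread.
SAMPLE = """\
pool: zroot
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
zroot ONLINE 0 0 0
ada0p4 ONLINE 0 4.22G 0
errors: Permanent errors have been detected in the following files:
"""

print(devices_with_errors(SAMPLE))
```

On a live system the text would come from running `zpool status` rather than from a pasted sample. Here only ada0p4 is flagged, with a very large write error count and no read or checksum errors.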
My guess is the disk is bad and needs replacing. You can try a reinstall, but I expect the problems to return.
Hello,
I just want to thank you because I changed both disks and I don't have this error anymore.
Strangely, those disks have no problem with another OS...
Everything is running well now, after losing one day because of the option Firewall/Settings/Advanced/Disable reply-to ;D
Glad to hear you got things working.
Another OS may not care about disk errors...until you experience data loss/corruption...