System stopped working during upgrade to 24.1.9

Started by ddt3, June 27, 2024, 11:42:41 AM

June 27, 2024, 11:42:41 AM (Last Edit: July 04, 2024, 08:29:48 AM by ddt3)
I am running:
OPNsense 24.1.8-amd64 on an Intel(R) Celeron(R) N5105 @ 2.00GHz (4 cores, 4 threads).

Using the webgui I tried to upgrade from 24.1.8 to 24.1.9, but the upgrade failed.
Now when I retry the update it shows:
***GOT REQUEST TO CHECK FOR UPDATES***
Currently running OPNsense 24.1.8 at Thu Jun 27 11:00:38 CEST 2024
Fetching changelog information, please wait... done
Updating OPNsense repository catalogue...
Fetching meta.conf: . done
Fetching packagesite.pkg: .......... done
Processing entries: .......... done
OPNsense repository update completed. 845 packages processed.
Updating mimugmail repository catalogue...
Waiting for another process to update repository mimugmail
All repositories are up to date.
Checking integrity... done (0 conflicting)
Your packages are up to date.
Checking for upgrades (42 candidates): .......... done
Processing candidates (42 candidates): .......... done
Checking integrity...Child process pid=81326 terminated abnormally: Bus error
***DONE***
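
As a side note, the same update can be attempted directly from a root shell using the stock updater; a minimal sketch (opnsense-update is the standard OPNsense command-line updater, and -c is its check-only mode):

opnsense-update -c    # only check whether an update is available
opnsense-update       # fetch and apply all available updates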


Logging into the console and trying from the command line, I get this error:
Enter an option: 12

Fetching change log information, please wait... done

This will automatically fetch all available updates and apply them.

Proceed with this action? [y/N]: y

/usr/local/opnsense/scripts/shell/firmware.sh: less: Input/output error


Using a shell I get the same error when running "less", while "uptime" still works (presumably uptime was already cached in memory, while less had to be read back from the failing disk):
root@OPNsense:~ # which uptime
/usr/bin/uptime
root@OPNsense:~ # which less
/usr/bin/less
root@OPNsense:~ # less
/usr/bin/less: Input/output error.
root@OPNsense:~ # uptime
11:04AM  up 15 days, 10:24, 2 users, load averages: 0.29, 0.31, 0.26


I am afraid my disk is failing, so I wanted to look at smartctl, but no device is shown in the webgui:

[screenshot: the SMART page in the web GUI lists no devices]

On the command line I have no idea which device to use:
root@OPNsense:~ # geom disk list
root@OPNsense:~ #
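
For what it is worth, when geom comes up empty there are a few other standard FreeBSD ways to enumerate disks; a quick sketch (all three are stock FreeBSD tools, nothing OPNsense-specific):

sysctl kern.disks       # the kernel's current list of disk devices
camcontrol devlist      # SATA/SAS/USB devices attached via CAM
nvmecontrol devlist     # NVMe controllers and their namespaces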


dmesg does show messages like this:
g_vfs_done():gpt/rootfs[WRITE(offset=114130190336, length=262144)]error = 6
g_vfs_done():gpt/rootfs[WRITE(offset=114156208128, length=32768)]error = 6
g_vfs_done():gpt/rootfs[WRITE(offset=114156470272, length=294912)]error = 6
g_vfs_done():gpt/rootfs[WRITE(offset=114156994560, length=294912)]error = 6
vm_fault: pager read error, pid 550 (ruby31)
vm_fault: pager read error, pid 4922 (ruby31)
vm_fault: pager read error, pid 11023 (ruby31)
vm_fault: pager read error, pid 21105 (ruby31)
vm_fault: pager read error, pid 26145 (ruby31)
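
For reference, "error = 6" is errno 6, which on FreeBSD is ENXIO ("Device not configured"): the kernel could no longer reach the disk at all. This can be checked against the system headers:

grep -w ENXIO /usr/include/sys/errno.h
# should print: #define ENXIO 6 /* Device not configured */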


I admit I am a total n00b when it comes to FreeBSD, so is my disk indeed dying (or already dead)?

It is my experience that once a disk is failing, it is best to first get all the information off the system while it is still running: a system that keeps running despite disk issues often will not come back up after a reboot because of those same issues. On top of that, without my OPNsense server I would not be connected to the internet (which makes it hard to search Google, download OPNsense ISOs, etc.), so I put off addressing the issue for as long as I could (I was working from home, and although things appeared messed up, I still had internet).
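
As a rough sketch of what "getting the information off first" can look like while the box is still up (/conf/config.xml is the standard OPNsense configuration file; the IP address is only a placeholder):

cp /conf/config.xml /root/config-backup.xml    # keep a local copy of the config
scp root@192.168.1.1:/conf/config.xml .        # or pull it to another machine over SSH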

Unfortunately the system kept degrading over the following days and finally my internet connection broke: DNS and DHCP stopped working (though OPNsense would still respond to ping). I had everything prepared and was ready to re-install the system, but since I no longer had anything to lose, I first simply rebooted it.

This fixed the issue; after the reboot the disk showed up again:
root@OPNsense:~ # geom disk list
Geom name: nvd0
Providers:
1. Name: nvd0
   Mediasize: 128035676160 (119G)
   Sectorsize: 512
   Mode: r3w3e8
   descr: Hoodisk SSD
   lunid: 6479a747c030213d
   ident: M1YLCKC21272049
   rotationrate: 0
   fwsectors: 0
   fwheads: 0

root@OPNsense:~ #


And the SMART plugin again shows usable information:

[screenshot: the SMART plugin output in the web GUI]
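
A quick way to double-check this from the shell (the SMART plugin wraps smartmontools, so smartctl reports the same data; /dev/nvme0 is an assumption for this box):

smartctl -H /dev/nvme0    # overall health verdict only
smartctl -a /dev/nvme0    # full SMART attributes and error log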



So I have since changed the title of this post to:
"System stopped working during upgrade to 24.1.9"

Since no one has replied to this post, am I apparently the only one to have experienced this?