Failed upgrade 23.1.11_2 to 23.7 resulting in can't load 'kernel'

Started by mfalkvidd, February 26, 2026, 01:43:09 PM

I'm trying to catch up with the latest release. Two weeks ago I upgraded from v22 to v23. I had to dig out a serial cable so I could (re)configure the interfaces over serial, but apart from that the upgrade went fine.

Today I initiated the update from 23.1.11_2 to 23.7 through the web UI. It indicated that base, kernel and packages needed update.
The web UI stalled at the kernel update.
An hour later I could still access the internet from my devices (through OPNsense). OPNsense still responded to ping, but ssh hung (no connection timeout, no connection refused, just hanging indefinitely). Same with the web interface.
The serial console gave me a login prompt, but after providing the password nothing happened.

I pulled the power, waited a bit and powered on again. Got this:

PC Engines apu4
coreboot build 20212402
BIOS version v4.13.0.4
4080 MB ECC DRAM

SeaBIOS (version rel-1.12.1.3-0-g300e8b70)

Press F10 key now for boot menu

Booting from Hard Disk...



            /  __  |/ ___ |/ __  |
            | |  | | |__/ | |  | |___  ___ _ __  ___  ___
            | |  | |  ___/| |  | / __|/ _ \ '_ \/ __|/ _ \
            | |__| | |    | |  | \__ \  __/ | | \__ \  __/
            |_____/|_|    |_| /__|___/\___|_| |_|___/\___|

 +-----------------------------------------+     @@@@@@@@@@@@@@@@@@@@@@@@@@@@
 |                                         |   @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 |  1. Boot Multi user [Enter]             |   @@@@@                    @@@@@
 |  2. Boot Single user                    |       @@@@@            @@@@@
 |  3. Escape to loader prompt             |    @@@@@@@@@@@       @@@@@@@@@@@
 |  4. Reboot                              |         \\\\\         /////
 |  5. Cons: Serial                        |   ))))))))))))       (((((((((((
 |                                         |         /////         \\\\\
 |  Options:                               |    @@@@@@@@@@@       @@@@@@@@@@@
 |  6. Kernel: default/kernel (1 of 2)     |       @@@@@            @@@@@
 |  7. Boot Options                        |   @@@@@                    @@@@@
 |                                         |   @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 |                                         |   @@@@@@@@@@@@@@@@@@@@@@@@@@@@
 +-----------------------------------------+
   Autoboot in 0 seconds. [Space] to pause     23.1 ``Quintessential Quail'' |

Loading kernel...
Failed to load kernel 'kernel'
can't load 'kernel'

can't load 'kernel'

I proceeded with boot kernel.old, which was successful (except that it was still on 23.1, of course). This error was noted during boot, though:
swapon: adding /dev/ada0p3 as swap device
.ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib /usr/local/lib/compat/pkg /usr/local/lib/compat/pkg /usr/local/lib/ipsec /usr/local/lib/perl5/5.32/mach/CORE
32-bit compatibility ldconfig path:
done.
>>> Invoking early script 'upgrade'
!!!!!!!!!!!! ATTENTION !!!!!!!!!!!!!!!
! A critical upgrade is in progress. !
! Please do not turn off the system. !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Version number mismatch, aborting.
    Kernel: 13.1
    Base:   13.2
>>> Invoking early script 'configd'
Starting configd.
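
For anyone who wants to watch for this condition in a captured console log, the two version lines above can be picked out with a few lines of Python. This is only a sketch: the function name parse_versions is made up for illustration, and the regular expression assumes the exact "Kernel:" / "Base:" layout shown in the output above.

```python
import re

def parse_versions(log_text):
    """Collect the 'Kernel:' and 'Base:' version lines from captured boot output.

    Hypothetical helper: it assumes the line layout shown in the log above.
    """
    pairs = re.findall(r"^\s*(Kernel|Base):\s*(\S+)\s*$", log_text, re.MULTILINE)
    return dict(pairs)

# The three lines below are copied from the boot output in this post.
log = """Version number mismatch, aborting.
    Kernel: 13.1
    Base:   13.2
"""

versions = parse_versions(log)
mismatch = versions.get("Kernel") != versions.get("Base")
print(versions, "mismatch" if mismatch else "match")
```

With the output from this boot it reports a mismatch, which fits having booted kernel.old.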

There is 3.3GB available space on zroot. I did a scrub two weeks ago which resulted in 0B repaired and 0 errors.

Any ideas on how to proceed from here?

IMO you just need to retry the upgrade since nothing was done. The kernel should apply as long as the disk is dependable. If not it's time for a reinstall anyway (and investigate replacing the disk).


Cheers,
Franco

Thanks. Will do, but through cli instead of web so I can see what is happening.

How can I check the disk, besides scrubbing the pool?

The health audit in the firmware status is a good first check for consistency. I'm not an expert on ZFS disk health monitoring, but there should be ample information here in the forum about it from Patrick et al.


Cheers,
Franco

If a scrub returns no error all data and metadata that is actually on the disk is guaranteed ok.

For possible device errors, end of lifetime notifications, etc. check out Scrutiny:

https://forum.opnsense.org/index.php?topic=48101.0


Honestly I am puzzled nobody ever commented on my HOWTO or came back with questions. Disk monitoring, like temperature and fans (if present) is essential, IMHO.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Quote from: Patrick M. Hausen on February 26, 2026, 08:25:20 PM
If a scrub returns no error all data and metadata that is actually on the disk is guaranteed ok.

For possible device errors, end of lifetime notifications, etc. check out Scrutiny:

https://forum.opnsense.org/index.php?topic=48101.0


Honestly I am puzzled nobody ever commented on my HOWTO or came back with questions. Disk monitoring, like temperature and fans (if present) is essential, IMHO.
Since you mention it, Patrick: I had a disk dying on me, which prompted me to look back at your post. I have now implemented Scrutiny as my first Docker thingie. I've always disliked Docker, but for this, a pretty useful tool without an alternative, I had to bite the bullet and go Docker.
Since then, which has been the last couple of weeks, I have also tried to find a way to send notifications out from Scrutiny. I got stuck, and then, surprising myself, I was able to integrate this Docker thingie with another one called Mattermost. I'm still struggling with the notifications, which were the reason for installing Mattermost, but I'm making progress. Point being, thank you for sharing.

February 26, 2026, 11:05:09 PM #6 Last Edit: February 26, 2026, 11:06:44 PM by Patrick M. Hausen
If you have a volume mounted for the Scrutiny configuration directory, you can create a file named scrutiny.yaml and configure notifications as in this example file:

https://github.com/Starosdev/scrutiny/blob/master/example.scrutiny.yaml

I tried it with plain email - works.

E.g.
notify:
  urls:
    - "mattermost://[username@]mattermost-host/token[/channel]"
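
To make the bracketed parts of that URL pattern concrete, here is a small Python sketch that assembles the string from its pieces. The helper name mattermost_url and all example values (host, token, username, channel) are placeholders invented for illustration; it does no URL-encoding and only follows the pattern shown above, where username and channel are optional.

```python
def mattermost_url(host, token, username=None, channel=None):
    """Assemble mattermost://[username@]mattermost-host/token[/channel].

    Hypothetical helper; the optional parts are simply left out when not given.
    No URL-encoding is applied to any of the pieces.
    """
    user = f"{username}@" if username else ""
    chan = f"/{channel}" if channel else ""
    return f"mattermost://{user}{host}/{token}{chan}"

# Placeholder values only; substitute your own host and token.
minimal = mattermost_url("mm.example.lan", "abc123")
full = mattermost_url("mm.example.lan", "abc123", username="scrutiny", channel="alerts")

print(minimal)  # mattermost://mm.example.lan/abc123
print(full)     # mattermost://scrutiny@mm.example.lan/abc123/alerts
```

The resulting string is what would go under notify: urls: in scrutiny.yaml.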

Ran opnsense-update -u, which completed OK. Rebooted and got the same error as the first time (can't load 'kernel').