Cannot upgrade from 24.1.10_8 to 24.7

Started by Blatancy2409, August 31, 2024, 04:38:37 PM

I initiated an upgrade from the UI, the system rebooted, and now it's in a weird state with base from 24.1 and kernel from 24.7:

base     24.1.8     593.6MiB   OPNsense   BSD2CLAUSE   FreeBSD userland set
kernel   24.7.1     175.2MiB   OPNsense   BSD2CLAUSE   FreeBSD kernel set
pkg      1.19.2_1   14.9MiB    OPNsense   BSD2CLAUSE   Package manager


I rolled back to known working 24.1 via bectl and initiated the upgrade again via SSH, and still the same result. Please advise what to do.

Can you post the output of a health check? Also, how full is the HDD?

Sure.

No disk space problems that I can see:

root@OPNsense:~ # df -h
Filesystem                   Size    Used   Avail Capacity  Mounted on
zroot/ROOT/24.1.10_8         207G    2.3G    205G     1%    /
devfs                        1.0K    1.0K      0B   100%    /dev
/dev/gpt/efiboot0            260M    1.8M    258M     1%    /boot/efi
zroot                        205G     96K    205G     0%    /zroot
zroot/usr/home               205G     96K    205G     0%    /usr/home
zroot/var/crash              205G     96K    205G     0%    /var/crash
zroot/var/audit              205G     96K    205G     0%    /var/audit
zroot/tmp                    205G    624K    205G     0%    /tmp
zroot/usr/ports              205G     96K    205G     0%    /usr/ports
zroot/var/mail               205G    136K    205G     0%    /var/mail
zroot/var/log                206G    1.4G    205G     1%    /var/log
zroot/var/tmp                205G    100K    205G     0%    /var/tmp
zroot/usr/src                205G     96K    205G     0%    /usr/src
devfs                        1.0K    1.0K      0B   100%    /var/dhcpd/dev
devfs                        1.0K    1.0K      0B   100%    /var/unbound/dev
/usr/local/lib/python3.11    207G    2.3G    205G     1%    /var/unbound/usr/local/lib/python3.11
/lib                         207G    2.3G    205G     1%    /var/unbound/lib

root@OPNsense:~ # zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zroot   224G  12.3G   212G        -         -    16%     5%  1.00x    ONLINE  -
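
For anyone who wants to script the "how full is the disk" check, here is a minimal sketch. It assumes one line of `zpool list -H -o name,capacity,health` output as input; the function name `check_pool` and the 80% default threshold are made up for illustration, and the sample values are the ones from the output above.

```shell
#!/bin/sh
# Hypothetical helper: flag a pool that is nearly full or not ONLINE.
# $1: one line of "zpool list -H -o name,capacity,health" output
# $2: warning threshold in percent (assumed default: 80)
check_pool() {
    line=$1
    threshold=${2:-80}
    echo "$line" | awk -v t="$threshold" '{
        cap = $2
        sub(/%/, "", cap)                      # strip the percent sign
        status = (cap + 0 >= t || $3 != "ONLINE") ? "WARN" : "OK"
        printf "%s: %s (%s%% used, %s)\n", status, $1, cap, $3
    }'
}

# Values taken from the zpool list output above.
check_pool "zroot 5% ONLINE"
```

On the firewall itself this could be fed with `check_pool "$(zpool list -H -o name,capacity,health zroot)"`.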



Health Audit log also looks normal:

***GOT REQUEST TO AUDIT HEALTH***
Currently running OPNsense 24.1.10_8 at Sat Aug 31 11:02:32 EDT 2024
>>> Root file system: zroot/ROOT/24.1.10_8
>>> Check installed kernel version
Version 24.1.8 is correct.
>>> Check for missing or altered kernel files
No problems detected.
>>> Check installed base version
Version 24.1.8 is correct.
>>> Check for missing or altered base files
No problems detected.
>>> Check installed repositories
OPNsense
>>> Check installed plugins
os-api-backup 1.1
os-ddclient 1.22
os-igmp-proxy 1.5_2
os-iperf 1.0_1
os-mdns-repeater 1.1_1
os-net-snmp 1.5_4
os-nut 1.8.1_2
os-realtek-re 1.0
os-telegraf 1.12.11
os-udpbroadcastrelay 1.0_3
os-upnp 1.5_6
os-wol 2.4_2
>>> Check locked packages
No locks found.
>>> Check for missing package dependencies
Checking all packages: .......... done
>>> Check for missing or altered package files
Checking all packages: .......... done
>>> Check for core packages consistency
Core package "opnsense" has 68 dependencies to check.
Checking packages: ..................................................................... done
***DONE***

Looks good. Did you try upgrading from another mirror? Anything unusual in the output while the upgrade is running?

Nothing unusual. I can try a different mirror.

Will upgrading via serial help?

I hooked up the serial console and this is what I see:

After the restart, the update script executes and then complains:

/usr/local/lib/ipsec /usr/local/lib/perl5/5.36/mach/CORE
32-bit compatibility ldconfig path:
done.
>>> Invoking early script 'upgrade'
!!!!!!!!!!!! ATTENTION !!!!!!!!!!!!!!!
! A critical upgrade is in progress. !
! Please do not turn off the system. !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
/usr/local/sbin/opnsense-update: /var/cache/opnsense-update/.sets.pending/base-freebsd-version/bin/freebsd-version: Permission denied
!!!!!!!!!!!! ATTENTION !!!!!!!!!!!!!!!
! A critical upgrade is in progress. !
! Please do not turn off the system. !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Installing packages-24.7-amd64.tar...
pkg-1.19.2_1: already unlocked
Updating OPNsense repository catalogue...
pkg-static: Repository OPNsense has a wrong packagesite, need to re-create database
Fetching meta.conf: . done
Fetching packagesite.pkg: .......... done
Processing entries:
pkg-static: wrong architecture: FreeBSD:14:amd64 instead of FreeBSD:13:amd64
pkg-static: repository OPNsense contains packages with wrong ABI: FreeBSD:14:amd64
Processing entries... done
Unable to update repository OPNsense
Error updating repositories!
Rebooting now.
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 0 0 0 0 0 0 0 done
All buffers synced.
Uptime: 34s
uhub1: detached
uhub0: detached
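
The two `pkg-static` errors in this log are consistent with the base set still being on the 24.1 level (FreeBSD 13) while the 24.7 repository is built for FreeBSD 14. A minimal sketch for comparing two ABI strings follows; the function names `abi_major` and `abi_check` are hypothetical, and the ABI strings are copied from the log above.

```shell
#!/bin/sh
# Hypothetical helper: extract the FreeBSD major version from a pkg ABI
# string such as "FreeBSD:14:amd64" (format as seen in the log above).
abi_major() {
    echo "$1" | cut -d: -f2
}

# Hypothetical helper: compare the local ABI with the repository ABI.
abi_check() {
    local_abi=$1
    repo_abi=$2
    if [ "$(abi_major "$local_abi")" = "$(abi_major "$repo_abi")" ]; then
        echo "match"
    else
        echo "mismatch: local $(abi_major "$local_abi"), repo $(abi_major "$repo_abi")"
    fi
}

# ABI strings from the pkg-static error messages.
abi_check "FreeBSD:13:amd64" "FreeBSD:14:amd64"
```

On a live system, the local value should be readable with `pkg config ABI`, and `freebsd-version -ku` prints the installed kernel and userland versions for comparison.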



/usr/local/sbin/opnsense-update: /var/cache/opnsense-update/.sets.pending/base-freebsd-version/bin/freebsd-version: Permission denied

That's the issue, but I'm not sure what's going on... wrong permissions on /var for root??


Cheers,
Franco
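
One thing worth ruling out (not confirmed anywhere in this thread): root usually only gets "Permission denied" when executing a file if the file has no execute bit or the filesystem is mounted `noexec` (on ZFS, the `exec=off` property). A minimal sketch that checks a `mount` output line for that flag; the function name `has_noexec` and the sample lines are made up for illustration and are not from the poster's system.

```shell
#!/bin/sh
# Hypothetical helper: report whether a mount(8) output line carries noexec.
has_noexec() {
    case "$1" in
        *noexec*) echo "noexec" ;;
        *)        echo "exec allowed" ;;
    esac
}

# Illustrative sample lines only; real input would come from `mount`.
has_noexec "zroot/var/tmp on /var/tmp (zfs, local, noatime, noexec, nosuid, nfsv4acls)"
has_noexec "zroot/ROOT/24.1.10_8 on / (zfs, local, noatime, nfsv4acls)"
```

On the box itself, `mount | grep noexec` and `zfs get -r exec zroot` would show the real state.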

Here is the ls:

root@OPNsense:~ # ls -al /var/cache/opnsense-update/
total 10
drwxr-x---  3 root  wheel  3 Aug 31 08:50 .
drwxr-xr-x  6 root  wheel  6 May 28 08:51 ..
drwxr-x---  2 root  wheel  4 Aug 31 08:50 77836


I didn't touch anything in /var

August 31, 2024, 10:28:38 PM #8 Last Edit: August 31, 2024, 10:36:39 PM by jcsp101
I tried to update from 24.1 to 24.7 multiple times and got the same weird issue after reboot that OP described (kernel showing an update to 24.1 from 24.7), as well as internet not working.

I tried it from the Web UI first, then via SSH, and also tested a live boot.

Updating from web UI - the update "completes" but after reboot incorrect versions show and the Internet doesn't work (though local access to devices like home assistant does) and a kernel "update" is available from 24.7 to 24.1.

Updating via the SSH shell menu (option 12) - after reboot, same as above: no internet, and the web UI shows incorrect versions

I also tried a live boot of the new version and importing my current config, which also fails in that internet doesn't work, though this might be because of services not aligning with settings (I use unbound and adguard DNS both as packages on OPNsense)

my ssd has tons of space - just found this thread so haven't tried another mirror yet, but i did try a health audit which yielded a checksum mismatch with my adguard home package though i don't think this is related (i assume it's since i hit update in the adguard webUI rather than letting opnsense package manager update adguard).

Additional note:
I first noticed this some weeks ago, and i wasn't even on 24.1.10 yet - and one thing i tried was declining the major update and instead trying to do the latest minor update (to 24.1.10), and that one was successful.

Quote from: Blatancy2409 on August 31, 2024, 10:10:12 PM
Here is the ls:

root@OPNsense:~ # ls -al /var/cache/opnsense-update/
total 10
drwxr-x---  3 root  wheel  3 Aug 31 08:50 .
drwxr-xr-x  6 root  wheel  6 May 28 08:51 ..
drwxr-x---  2 root  wheel  4 Aug 31 08:50 77836


I didn't touch anything in /var

Before anything, make a copy of the config file first.

Since you know how to use bectl, what's the output of this command?

opnsense-update -bkr 24.7.3


If it exits cleanly and asks for a reboot, do as it says, then SSH in and check for updates to sync the rest of the packages.


Whatever output you can post here - no matter how irrelevant it may seem - could be helpful. This is what I would do anyway before either

a) Franco chimes in with a better solution
OR
b) time is of the essence and installing 24.7 from scratch and importing the config is something that cannot be avoided anymore
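
For the "make a copy of the config file first" step, a minimal sketch, assuming the configuration lives at the usual OPNsense location `/conf/config.xml`; the function name `backup_file` and the backup directory are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy a file into a backup directory, adding a
# timestamp suffix so earlier copies are not overwritten.
backup_file() {
    src=$1
    dest_dir=$2
    [ -f "$src" ] || { echo "missing: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    dest="$dest_dir/$(basename "$src").$(date +%Y%m%d-%H%M%S)"
    # -p preserves mode and timestamps; print the new path on success.
    cp -p "$src" "$dest" && echo "$dest"
}

# Intended use on the firewall (paths are assumptions):
#   backup_file /conf/config.xml /root/config-backups
```

In the command above, `-b` and `-k` select the base and kernel sets and `-r` names the release to fetch, so a config copy taken beforehand stays untouched by it. Copying the file off the box as well (for example with scp) is safer than keeping it only on the same disk.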

Quote from: jcsp101 on August 31, 2024, 10:28:38 PM
I tried to update from 24.1 to 24.7 multiple times and got the same weird issue after reboot that OP described (kernel showing an update to 24.1 from 24.7), as well as internet not working.

I tried it from the Web UI first, then via SSH, and also tested a live boot.

Updating from web UI - the update "completes" but after reboot incorrect versions show and the Internet doesn't work (though local access to devices like home assistant does) and a kernel "update" is available from 24.7 to 24.1.

Updating via the SSH shell menu (option 12) - after reboot, same as above: no internet, and the web UI shows incorrect versions

I also tried a live boot of the new version and importing my current config, which also fails in that internet doesn't work, though this might be because of services not aligning with settings (I use unbound and adguard DNS both as packages on OPNsense)

my ssd has tons of space - just found this thread so haven't tried another mirror yet, but i did try a health audit which yielded a checksum mismatch with my adguard home package though i don't think this is related (i assume it's since i hit update in the adguard webUI rather than letting opnsense package manager update adguard).

Additional note:
I first noticed this some weeks ago, and i wasn't even on 24.1.10 yet - and one thing i tried was declining the major update and instead trying to do the latest minor update (to 24.1.10), and that one was successful.

It is unlikely you have the same issue, so it's better to open another thread for this. It could be AGH (AdGuard Home) causing the lack of Internet, and that may be an easy fix allowing you to recover and complete the 24.7 upgrade.

Quote from: newsense on August 31, 2024, 11:40:02 PM

Before anything, make a copy of the config file first.

Since you know how to use bectl, what's the output of this command?

opnsense-update -bkr 24.7.3


If it exits cleanly and asks for a reboot, do as it says, then SSH in and check for updates to sync the rest of the packages.


Whatever output you can post here - no matter how irrelevant it may seem - could be helpful. This is what I would do anyway before either

a) Franco chimes in with a better solution
OR
b) time is of the essence and installing 24.7 from scratch and importing the config is something that cannot be avoided anymore

I really appreciate you guys looking into this! Thank you!

I know how to use bectl. I'm quite nerdy, but on the Linux side of things, not BSD :)

As far as I understand, this command triggers the update; why is it 24.7.3, not 24.7_9?

I prepared two USB drives already to install a fresh 24.7, haha, but I really want to avoid the downtime.
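
On the version question: the health audit earlier in the thread reports base and kernel 24.1.8 as "correct" on a system running 24.1.10_8, which suggests the base/kernel sets carry their own version numbers, separate from the core package, and that the `_N` suffix is a package revision. A small sketch that splits such a string; the function name `split_version` is hypothetical, and treating a missing suffix as revision 0 is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: split "24.1.10_8" into release and package revision.
split_version() {
    v=$1
    case "$v" in
        *_*) echo "release=${v%%_*} revision=${v#*_}" ;;
        *)   echo "release=$v revision=0" ;;   # assumption: no suffix means 0
    esac
}

split_version "24.1.10_8"
split_version "24.7.3"
```

Under that reading, `opnsense-update -bkr` takes a set release like 24.7.3, while 24.7_9 would name a revision of the core package.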

Quote from: newsense on August 31, 2024, 11:44:50 PM
Quote from: jcsp101 on August 31, 2024, 10:28:38 PM
I tried to update from 24.1 to 24.7 multiple times and got the same weird issue after reboot that OP described (kernel showing an update to 24.1 from 24.7), as well as internet not working.

I tried it from the Web UI first, then via SSH, and also tested a live boot.

Updating from web UI - the update "completes" but after reboot incorrect versions show and the Internet doesn't work (though local access to devices like home assistant does) and a kernel "update" is available from 24.7 to 24.1.

Updating via the SSH shell menu (option 12) - after reboot, same as above: no internet, and the web UI shows incorrect versions

I also tried a live boot of the new version and importing my current config, which also fails in that internet doesn't work, though this might be because of services not aligning with settings (I use unbound and adguard DNS both as packages on OPNsense)

my ssd has tons of space - just found this thread so haven't tried another mirror yet, but i did try a health audit which yielded a checksum mismatch with my adguard home package though i don't think this is related (i assume it's since i hit update in the adguard webUI rather than letting opnsense package manager update adguard).

Additional note:
I first noticed this some weeks ago, and i wasn't even on 24.1.10 yet - and one thing i tried was declining the major update and instead trying to do the latest minor update (to 24.1.10), and that one was successful.

It is unlikely you have the same issue, so it's better to open another thread for this. It could be AGH (AdGuard Home) causing the lack of Internet, and that may be an easy fix allowing you to recover and complete the 24.7 upgrade.

OK will do

Quote from: Blatancy2409 on September 01, 2024, 04:07:58 AM
I really appreciate you guys looking into this! Thank you!

I know how to use bectl. I'm quite nerdy, but on the Linux side of things, not BSD :)

As far as I understand, this command triggers the update; why is it 24.7.3, not 24.7_9?

I prepared two USB drives already to install a fresh 24.7, haha, but I really want to avoid the downtime.

My goal is to try and get you on 24.7 as efficiently as possible - if it works. Clearly the standard method is failing for you right now.

If you reboot cleanly into 24.7.x, you're halfway there. The packages can be updated afterwards while the system is online, followed by a final reboot for a clean start with everything up to date.


The weird permission issue in /var is clearly non-standard, and I'm wondering whether it was caused by an accident or whether something related to the drive is making ZFS behave in an unexpected way, leading to that permission error.

Skipping any update in a released sequence of updates has proven to be problematic for devices running software as "firmware", such as OPNsense. A question for the OPNsense maintainers is whether OPNsense should prevent skipping even minor update releases, to ensure that every update can assume all prior updates have been applied in chronological sequence.

OPNsense's update logic could enforce this whenever a user attempts to skip over an update (often to jump directly to a major release such as 24.7).