BUG: ZFS RAIDZ BOOT!

Started by macafee, September 24, 2025, 03:53:29 PM

Recently I did a fresh install of OPNsense 25.7 using ZFS raidz1 across 3 disks for booting.
Unfortunately, disk0 failed and I rebooted the system today.
Now the system can't boot because the boot files in the disk1 EFI partition are wrong.
Those files are completely different from the ones in the disk0 EFI partition, and the disk1 EFI partition is only 779KB.
The disk3 EFI partition was not recognized.
I had to run newfs_msdos -F32 -c 1 on the disk1 EFI partition and copy /boot/loader.efi to it to fix the boot problem.
After that, I could boot into the system and replace disk0 with a new disk.
What an amazing BUG this is!!!
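
For anyone hitting the same problem, the repair described above can be sketched roughly as follows. This is a minimal sketch and not the exact commands the poster ran: the device name /dev/ada1p1 and the mount point /mnt are assumptions (check `gpart show` for the real ESP), and the target file names are taken from the bootcode-update sample output later in this thread. The function only prints the commands so they can be reviewed before anything is formatted.

```shell
#!/bin/sh
# Print (do not run) the commands to rebuild one EFI system partition.
# $1: ESP device, e.g. /dev/ada1p1 (assumed name, verify with `gpart show`)
# $2: temporary mount point (assumed, defaults to /mnt)
esp_repair_cmds() {
    esp_dev=$1
    mnt=${2:-/mnt}
    cat <<EOF
newfs_msdos -F 32 -c 1 $esp_dev
mount -t msdosfs $esp_dev $mnt
mkdir -p $mnt/efi/boot $mnt/efi/freebsd
cp /boot/loader.efi $mnt/efi/boot/bootx64.efi
cp /boot/loader.efi $mnt/efi/freebsd/loader.efi
umount $mnt
EOF
}

# Example: show the plan for the (assumed) second disk.
esp_repair_cmds /dev/ada1p1 /mnt
```

Note that newfs_msdos destroys whatever is on the partition, so the device name is worth double-checking before running any of the printed lines.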

September 24, 2025, 06:18:54 PM #1 Last Edit: September 24, 2025, 06:31:17 PM by BrandyWine
Mini-pc N150 i226v x520, FREEDOM

Quote from: BrandyWine on September 24, 2025, 06:18:54 PM
Can you fit in a 4th drive?

Also, perhaps related --> https://forums.freebsd.org/threads/openzfs-2-3-on-14-3-stable-boot-fails-with-zfs-unsupported-feature-org-openzfs-raidz_expansion-after-zpool-upgrade.98998/
I don't have a 4th disk.
That thread is about raidz_expansion and gptzfsboot, but I use UEFI boot without raidz_expansion.

To some degree I think this is a bug in the FreeBSD ZFS installer offering these modes, because our hybrid ZFS installer is a modified copy of it.


Cheers,
Franco

September 29, 2025, 03:58:35 PM #4 Last Edit: September 29, 2025, 09:10:20 PM by Patrick M. Hausen
I know this is not a very satisfactory situation, but in FreeBSD the maintenance and update of the boot loaders on your disks is currently left to the administrator to perform manually.

After a new installation you can e.g. copy the first two partitions (EFI and legacy boot loader) from the first disk to all other ones with "dd".
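
As a rough illustration of the "dd" approach, here is a small sketch that prints one dd command per partition and target disk. The disk names ada0, ada1 and ada2 are assumptions, and it presumes every disk has the same partition layout with the EFI partition at index 1 and the legacy boot partition at index 2. It prints the commands instead of running them, since a wrong "of=" target is destructive.

```shell
#!/bin/sh
# Print (do not run) dd commands that clone partitions 1 and 2
# of a source disk onto each target disk.
# $1: source disk; remaining args: target disks (all names are assumptions).
# Assumes identical partition layouts on all disks.
clone_boot_parts_cmds() {
    src=$1
    shift
    for dst in "$@"; do
        for part in 1 2; do
            echo "dd if=/dev/${src}p${part} of=/dev/${dst}p${part} bs=1m"
        done
    done
}

# Example: copy from the first disk to the other two raidz1 members.
clone_boot_parts_cmds ada0 ada1 ada2
```

The source EFI partition should not be mounted read-write while it is being copied, otherwise the copy may be inconsistent.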

After a zpool upgrade you must also update the boot loaders or risk the system becoming unbootable.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Quote from: Patrick M. Hausen on September 29, 2025, 03:58:35 PM
After a zpool upgrade you must also update the boot loaders or risk the system becoming unbootable.

I think it's better to update the bootloaders before doing a zpool upgrade.
A zpool upgrade may introduce new features. Old bootloaders don't understand the new features, which leads to unbootable systems.
New bootloaders are backwards compatible; they understand all features that came before.
So, in my opinion, upgrade the bootloaders before running zpool upgrade.
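
The ordering can be sketched as a small plan generator: boot code on every disk first, pool upgrade last. This is only a sketch under assumptions: the disk names are made up, the pool name zroot and the boot partition index 2 follow the sample output elsewhere in this thread, and it covers only the legacy GPT/ZFS boot code. On UEFI systems the loader.efi copy on each EFI partition would need refreshing before the pool upgrade as well.

```shell
#!/bin/sh
# Print (do not run) an update plan: boot code on every disk first,
# pool upgrade last.
# $1: pool name; remaining args: member disks (assumed names).
# Assumes the legacy boot partition is index 2 on every disk.
upgrade_plan() {
    pool=$1
    shift
    for disk in "$@"; do
        echo "gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 2 $disk"
    done
    # Only after every disk carries the new boot code:
    echo "zpool upgrade $pool"
}

# Example: a two-disk boot mirror.
upgrade_plan zroot ada0 ada1
```

Keeping the pool upgrade as the final step means a power loss halfway through still leaves an old pool that both old and new boot code can read.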


Yes, I meant in the same session, and definitely before rebooting :-)

Yep, I was covering the case of "oh crap, lost power/rebooted before I could update the bootloaders".
Also, if your boot device is a mirror, make sure to update both disks.
(Yes, I've hit both over the course of using ZFS)

September 29, 2025, 11:24:57 PM #8 Last Edit: September 30, 2025, 12:14:37 AM by Jose
This is one of the reasons I created a `bootcode-update` utility. I use it to update the boot mirrors on my FreeBSD hosts whenever I upgrade the zroot zpool or whenever freebsd-update brings new GPT/EFI bootcode. It also worked on RAIDZx when I tested it some time ago.

The `bootcode-update` utility can be found HERE, so you can see how it works and write your own script to automate updates.

Be aware that this utility is experimental and I haven't updated it in a while, though I still use it at my own risk on my FreeBSD host and on OPNsense. It does not yet support the new FreeBSD EFI gpt/label layout, only gpt/id. I'm thinking about updating it.

Also be aware that updating the bootcode (GPT, EFI or ZFS) may prevent very old Boot Environments from booting as expected. In that case the user needs to mount those BEs manually and update the bootcode files in them, or roll the bootcode back on the disks. If the zroot zpool was upgraded, that alone may also prevent old BEs from booting.

So always make sure you really want to update them.

Regards
OPNSense on Bhyve VM set with 2CPU, 4GB-RAM, 120GB-ZFS, Transparent Filtering Bridge(TFB).
Intel i5-2390T with 32GB-RAM and Intel I350-T4(2-Ports Passthrough for OPNsense + VirtIO).
System running Jails, MEDIA/SMB/NFS/SSH servers etc.., ZFS-Mirrors for boot and storage.

September 30, 2025, 01:52:59 PM #9 Last Edit: September 30, 2025, 02:01:27 PM by Jose
Hello, I've updated the `bootcode-update` utility to support GPT labels for compatibility with later FreeBSD releases, in case someone wants to play with it in a VM.

Sample output from my FreeBSD host:
root@nas-mserver: ~# bootcode-update -v
bootcode-update 0.3.6
root@nas-mserver: ~# bootcode-update -e

UEFI Partition: [ ada0p1 ]
Disk Serial:    [ TNS519GYXXXXXX ]
Proceed with EFI bootcode update for the following geom: [ada0p1] (Y/n)?: y
Proceeding...
=> Updating EFI bootcode on ada0p1
/boot/loader.efi -> /boot/efi/efi/boot/bootx64.efi
/boot/loader.efi -> /boot/efi/efi/freebsd/loader.efi
=> Success!


UEFI Partition: [ ada1p1 ]
Disk Serial:    [ 140817TM85A3TDXXXXXX ]
Proceed with EFI bootcode update for the following geom: [ada1p1] (Y/n)?: y
Proceeding...
=> Updating EFI bootcode on ada1p1
/boot/loader.efi -> /tmp/boot_esp/efi/boot/bootx64.efi
/boot/loader.efi -> /tmp/boot_esp/efi/freebsd/loader.efi
=> Success!


Sample output from my OPNsense VM:
root@fw-opnsense:~ # uname -a
FreeBSD fw-opnsense.arpa 14.3-RELEASE-p2 FreeBSD 14.3-RELEASE-p2 stable/25.7-n271676-ab2281de1853 SMP amd64
root@fw-opnsense:~ # bootcode-update -v
bootcode-update 0.3.6
root@fw-opnsense:~ # bootcode-update -e

UEFI Partition: [ vtbd0p1 ]
Disk Serial:    [ BHYVE-125E-B3XX-XXXX ]
Proceed with EFI bootcode update for the following geom: [vtbd0p1] (Y/n)?: y
Proceeding...
=> Updating EFI bootcode on vtbd0p1
/boot/loader.efi -> /boot/efi/efi/boot/bootx64.efi
/boot/loader.efi -> /boot/efi/efi/freebsd/loader.efi
=> Success!

Sample output updating GPT/ZFSBOOT:
root@nas-mserver: ~# bootcode-update -g

Boot Partition: [ ada0p2 ]
Disk Serial:    [ TNS519GYXXXXXX ]
Pool Member:    [ zroot: '/dev/ada0p4' ]
Proceed with GPT/ZFS bootcode update for the following geom: [ada0p2] (Y/n)?: y
Proceeding...
=> Updating GPT/ZFS bootcode on ada0p2
partcode written to ada0p2
bootcode written to ada0
=> Success!


Boot Partition: [ ada1p2 ]
Disk Serial:    [ 140817TM85A3TDXXXXXX ]
Pool Member:    [ zroot: '/dev/ada1p4' ]
Proceed with GPT/ZFS bootcode update for the following geom: [ada1p2] (Y/n)?: y
Proceeding...
=> Updating GPT/ZFS bootcode on ada1p2
partcode written to ada1p2
bootcode written to ada1
=> Success!

Regards

Nice, thank you. May consider picking this up in core in the future if boot code incompatibilities are to become more common.


Cheers,
Franco

Quote from: franco on September 30, 2025, 01:56:23 PM
Nice, thank you. May consider picking this up in core in the future if boot code incompatibilities are to become more common.


Cheers,
Franco

Hi Franco, I've edited the previous post and added the output for the GPT/ZFSBOOT bootcode update as well, for reference.

Regards