OPNsense Forum
English Forums => General Discussion => Topic started by: ajm on February 12, 2022, 03:14:11 pm
-
I feel a bit ashamed having to ask this here, but I've not found any references to the correct way under OPNsense to mount a second zpool at system start.
The system is already root-on-ZFS, and the second zpool was created without problems and mounted (see output below); however, after a reboot it's no longer mounted.
I'm unfamiliar with the OPNsense system startup, so I'm unsure of the 'correct' way to do this. Perhaps a 'syshook' script, or modifying /etc/fstab? Any advice gratefully received..
root@a-fw:~ # camcontrol devlist
<ULTIMATE CF CARD Ver7.01C> at scbus0 target 0 lun 0 (pass0,ada0)
<CT2000MX500SSD1 M3CR043> at scbus1 target 0 lun 0 (pass1,ada1)
root@a-fw:~ # gpart create -s GPT /dev/ada1
ada1 created
root@a-fw:~ # gpart add -t freebsd-zfs -a 4k /dev/ada1
ada1p1 added
root@a-fw:~ # gpart modify -l tank -i 1 /dev/ada1
ada1p1 modified
root@a-fw:~ # gpart show
=> 40 30408256 ada0 GPT (14G)
40 1024 1 freebsd-boot (512K)
1064 984 - free - (492K)
2048 4194304 2 freebsd-swap (2.0G)
4196352 26210304 3 freebsd-zfs (12G)
30406656 1640 - free - (820K)
=> 40 3907029088 ada1 GPT (1.8T)
40 3907029088 1 freebsd-zfs (1.8T)
root@a-fw:~ # zpool create -m /tank tank /dev/ada1p1
root@a-fw:~ # zfs set atime=off tank
root@a-fw:~ # zpool list
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
tank 1.81T 432K 1.81T - - 0% 0% 1.00x ONLINE -
zroot 12G 805M 11.2G - - 2% 6% 1.00x ONLINE -
root@a-fw:~ # mount
zroot/ROOT/base-setup on / (zfs, local, noatime, nfsv4acls)
devfs on /dev (devfs)
zroot on /zroot (zfs, local, noatime, nfsv4acls)
zroot/tmp on /tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/var/audit on /var/audit (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/usr/src on /usr/src (zfs, local, noatime, nfsv4acls)
zroot/usr/ports on /usr/ports (zfs, local, noatime, nosuid, nfsv4acls)
zroot/usr/home on /usr/home (zfs, local, noatime, nfsv4acls)
zroot/var/crash on /var/crash (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/mail on /var/mail (zfs, local, nfsv4acls)
zroot/var/log on /var/log (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/tmp on /var/tmp (zfs, local, noatime, nosuid, nfsv4acls)
devfs on /var/dhcpd/dev (devfs)
devfs on /var/unbound/dev (devfs)
tank on /tank (zfs, local, noatime, nfsv4acls)
-
TBH I'm scratching my head a bit now..
None of the zfs or zpool commands, such as 'zpool list' or 'zpool status', return ANY info about the new pool.
The disk and partition appear to be available to the system, so I don't understand why ZFS isn't finding the new pool.
root@a-fw:~ # geom -t
Geom Class Provider
ada0 DISK ada0
ada0 PART ada0p1
ada0p1 DEV
ada0p1 LABEL gpt/gptboot0
gpt/gptboot0 DEV
ada0 PART ada0p2
ada0p2 DEV
swap SWAP
ada0 PART ada0p3
ada0p3 DEV
zfs::vdev ZFS::VDEV
ada0 DEV
ada1 DISK ada1
ada1 PART ada1p1
ada1p1 DEV
ada1p1 LABEL gpt/tank
gpt/tank DEV
ada1 DEV
-
Aha ! zpool import DOES find it, and reports errors. I will clean the disk off and try again..
root@a-fw:~ # zpool import
pool: tank
id: 17258291557216105619
state: UNAVAIL
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
config:
tank UNAVAIL insufficient replicas
ada1 UNAVAIL invalid label
pool: tank
id: 8003397822133176640
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
tank ONLINE
ada1p1 ONLINE
-
OK, done that, and now I'm back to where I started. The new pool doesn't mount at boot, but if I do a 'zpool import', it then mounts as expected.
-
Further checks look OK, I think?
It's listed in the cachefile and has 'canmount' set, so it should be mounted at boot, no?
root@a-fw:~ # zpool get cachefile
NAME PROPERTY VALUE SOURCE
tank cachefile - default
zroot cachefile - default
root@a-fw:~ # zfs get all tank
NAME PROPERTY VALUE SOURCE
tank type filesystem -
tank creation Sat Feb 12 15:04 2022 -
tank used 10.6M -
tank available 1.76T -
tank referenced 10.1M -
tank compressratio 1.00x -
tank mounted yes -
tank quota none default
tank reservation none default
tank recordsize 128K default
tank mountpoint /tank local
tank sharenfs off default
tank checksum on default
tank compression off default
tank atime off local
tank devices on default
tank exec on default
tank setuid on default
tank readonly off default
tank jailed off default
tank snapdir hidden default
tank aclmode discard default
tank aclinherit restricted default
tank createtxg 1 -
tank canmount on default
tank xattr on default
tank copies 1 default
tank version 5 -
tank utf8only off -
tank normalization none -
tank casesensitivity sensitive -
tank vscan off default
tank nbmand off default
tank sharesmb off default
tank refquota none default
tank refreservation none default
tank guid 6254459496930362475 -
tank primarycache all default
tank secondarycache all default
tank usedbysnapshots 0B -
tank usedbydataset 10.1M -
tank usedbychildren 516K -
tank usedbyrefreservation 0B -
tank logbias latency default
tank objsetid 54 -
tank dedup off default
tank mlslabel none default
tank sync standard default
tank dnodesize legacy default
tank refcompressratio 1.00x -
tank written 10.1M -
tank logicalused 10.2M -
tank logicalreferenced 10.0M -
tank volmode default default
tank filesystem_limit none default
tank snapshot_limit none default
tank filesystem_count none default
tank snapshot_count none default
tank snapdev hidden default
tank acltype nfsv4 default
tank context none default
tank fscontext none default
tank defcontext none default
tank rootcontext none default
tank relatime off default
tank redundant_metadata all default
tank overlay on default
tank encryption off default
tank keylocation none default
tank keyformat none default
tank pbkdf2iters 0 default
tank special_small_blocks 0 default
-
root@a-fw:~ # cat /usr/local/etc/rc.loader.d/20-zfs
# ZFS standard environment requirements
kern.geom.label.disk_ident.enable="0"
kern.geom.label.gptid.enable="0"
vfs.zfs.min_auto_ashift=12
opensolaris_load="YES"
zfs_load="YES"
Why are 'kern.geom.label.disk_ident.enable' and 'kern.geom.label.gptid.enable' disabled?
(Edit: I think the reason is to suppress the display of long GPTID strings.)
When (re)creating the zpool, I opted to use a GPT label instead of the partition number.
(Clutching at straws here.. :))
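If the boot-time import does read a cache file, explicitly pinning the pool to one might be worth a try (a sketch, untested here; which path the boot code actually reads is exactly the open question):
# point the pool at an explicit cache file and confirm it took
zpool set cachefile=/etc/zfs/zpool.cache tank
zpool get cachefile tank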
-
Not sure how this works. We do mount using "mount -a", but maybe that has historically ignored ZFS because it only looks at /etc/fstab? At least that file has the auto-mount flag.
Cheers,
Franco
-
Hrm, we also use "zfs mount -va". According to the manual page, -a does mount all:
Mount all available ZFS file systems. Invoked automatically as part of the boot process if configured
Cheers,
Franco
-
Thanks.
On this 22.1 system, boot does not auto-mount my second zpool 'tank'.
Manually executing 'zfs mount -va' doesn't either; only 'zpool import tank' does:
root@a-fw:~ # zfs mount -va
root@a-fw:~ # mount
zroot/ROOT/22.1-base on / (zfs, local, noatime, nfsv4acls)
devfs on /dev (devfs)
zroot/var/mail on /var/mail (zfs, local, nfsv4acls)
zroot/tmp on /tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/var/audit on /var/audit (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot on /zroot (zfs, local, noatime, nfsv4acls)
zroot/usr/home on /usr/home (zfs, local, noatime, nfsv4acls)
zroot/var/crash on /var/crash (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/usr/src on /usr/src (zfs, local, noatime, nfsv4acls)
zroot/var/log on /var/log (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/tmp on /var/tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/usr/ports on /usr/ports (zfs, local, noatime, nosuid, nfsv4acls)
devfs on /var/dhcpd/dev (devfs)
devfs on /var/unbound/dev (devfs)
root@a-fw:~ # zpool import tank
root@a-fw:~ # mount
zroot/ROOT/22.1-base on / (zfs, local, noatime, nfsv4acls)
devfs on /dev (devfs)
zroot/var/mail on /var/mail (zfs, local, nfsv4acls)
zroot/tmp on /tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/var/audit on /var/audit (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot on /zroot (zfs, local, noatime, nfsv4acls)
zroot/usr/home on /usr/home (zfs, local, noatime, nfsv4acls)
zroot/var/crash on /var/crash (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/usr/src on /usr/src (zfs, local, noatime, nfsv4acls)
zroot/var/log on /var/log (zfs, local, noatime, noexec, nosuid, nfsv4acls)
zroot/var/tmp on /var/tmp (zfs, local, noatime, nosuid, nfsv4acls)
zroot/usr/ports on /usr/ports (zfs, local, noatime, nosuid, nfsv4acls)
devfs on /var/dhcpd/dev (devfs)
devfs on /var/unbound/dev (devfs)
tank on /tank (zfs, local, noatime, nfsv4acls)
I'll do some further testing on FreeBSD 13.0.
PS. Further on this: after a reboot, 'zfs mount -va' doesn't mount the 2nd zpool. If I import it with 'zpool import tank', then unmount it with 'umount /tank', and THEN run 'zfs mount -va', it IS mounted successfully.
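To spell that sequence out:
# after a reboot:
zfs mount -va        # does nothing for tank
zpool import tank    # imports AND mounts /tank
umount /tank         # unmount again; the pool stays imported
zfs mount -va        # now tank IS mounted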
-
@franco which version of ZFS is used on 22.1?
I ask as I'm pretty sure the mounting requires the ZFS 'canmount' property and a 'mountpoint' property. I'll have a dig at the relevant man pages to confirm.
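Both are quick to check, e.g.:
# canmount and mountpoint on the pool's root dataset
zfs get canmount,mountpoint tank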
-
Sorry I was thinking of a dataset. The question was for a pool, my mistake.
-
I think the problem is that you have used an invalid character in the label. Normally, when creating GPT labels, they are made different from the device names, i.e. not the /dev/daX type, and I think that is the reason you get the error
ada1 UNAVAIL invalid label
If you recreate the label without the forward slash, I think it will work as intended. I imagine there are dmesg logs showing attempts to mount it at boot.
-
Hi, thx for the reply. Sorry for any confusion, but that error was resolved by clearing the disk and then re-creating the pool; see the following post.
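For the record, 'clearing the disk' was along these lines (a destructive sketch; it wipes everything on ada1):
zpool labelclear -f /dev/ada1    # remove the stale ZFS label on the raw disk
gpart destroy -F ada1            # drop the old partition table
...followed by the gpart/zpool create steps from the first post.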
The problem seems to be that ZFS does not identify the second pool attached to the system, so no attempt to mount it is made at system startup.
I've not yet been able to establish the mechanism by which ZFS gets this info, in order to debug further.
Do you have an OPNsense system there with 2+ pools mounted ?
-
I created a FreeBSD 13.0-RELEASE-p7 boot disk for the machine and did some comparative testing vs OPNsense 22.1.
I found that, as expected, after doing a 'zpool import tank' and rebooting, the 'tank' pool was mounted at boot under FreeBSD, but not under OPNsense.
I did various other tests, including destroying and recreating the pool and exporting/importing under both OPNsense 22.1 and FreeBSD 13.0. The ZFS startup script under '/etc/rc.d' is the same, and the contents of '/etc/zfs/zpool.cache' were the same on both systems.
The only apparent difference between the two systems was the failure to auto-mount under OPNsense.
So that I could get on with my project, I've created a hacky syshook script to import (& mount) the pool, but it would be good to have this fixed properly.
I suppose the next step would be to get some more low-level debug info about ZFS startup, but I guess that would be done using a debug kernel?
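In case it's useful to anyone, the workaround is roughly this (a sketch; the path assumes the rc.syshook.d 'start' hook layout, and the file must be executable):
#!/bin/sh
# /usr/local/etc/rc.syshook.d/start/50-tank
# import (and thereby mount) the second pool if it isn't already present
if ! zpool list tank >/dev/null 2>&1; then
    zpool import tank
fi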
-
For historic reasons, and to avoid uncontrollable results, we don't go through /etc/rc.d for our boot sequence.
/etc/rc.d/zfs zfs_start_main() seems to run 'zfs mount -va' as expected and follows up with 'zfs share -a', but I'm unsure if that would be the relevant difference (the manual page doesn't explain what "sharing" means).
If this is hidden in scripting, I suspect FreeBSD does more than it should, or "zfs mount -a" doesn't do its job as documented?
About the version:
# zfs version
zfs-2.1.2-FreeBSD_gaf88d47f1
zfs-kmod-2.1.2-FreeBSD_gaf88d47f1
Cheers,
Franco
-
PS: relevant bit might be "zpool" script https://github.com/opnsense/src/commit/74e2b24f2c369
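From memory, the start logic in that script is roughly this (paraphrased, not verbatim):
# import every pool recorded in the first readable cache file,
# without mounting any datasets yet (-N); mounting happens later
for cachefile in /etc/zfs/zpool.cache /boot/zfs/zpool.cache; do
    if [ -r "$cachefile" ]; then
        zpool import -c "$cachefile" -a -N
        break
    fi
done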
-
Thanks franco.
This is an interesting one.
@ajm On my OPN I only have the single root pool. It's a very small APU device; I've not added another disk because I don't like the idea of running striped storage, i.e. no redundancy, on a firewall with limited CPU and memory resources.
I don't have a way to test for you, unfortunately.
But I do have a server based on FreeBSD 12 with multiple pools.
One thing I notice, though I can't tell if it is part of the problem: my understanding is that on FreeBSD the zpool import is done by scanning GEOM devices. On my OPN and non-OPN geom listings, all providers that are in a ZFS pool have a "zfs::vdev ZFS::VDEV" attribute, and it seems to be missing from your 'geom -t' listing for ada1, where tank is.
My OPN pool:
@OPNsense:~ % geom -t
Geom Class Provider
ada0 DISK ada0
ada0 PART ada0p1
ada0p1 LABEL gpt/efiboot0
gpt/efiboot0 DEV
ada0p1 LABEL msdosfs/EFISYS
msdosfs/EFISYS DEV
ada0p1 DEV
ada0 PART ada0p2
ada0p2 LABEL gpt/gptboot0
gpt/gptboot0 DEV
ada0p2 DEV
ada0 PART ada0p3
ada0p3 DEV
swap SWAP
ada0 PART ada0p4
ada0p4 DEV
zfs::vdev ZFS::VDEV
ada0 DEV
One of my storage systems:
~]$ geom -t
Geom Class Provider
da0 DISK da0
da0 PART da0p1
da0p1 LABEL gpt/sysboot0
gpt/sysboot0 DEV
da0p1 DEV
da0 PART da0p2
da0p2 LABEL gpt/swap0
gpt/swap0 DEV
swap SWAP
da0p2 DEV
da0 PART da0p3
da0p3 LABEL gpt/sysdisk0
gpt/sysdisk0 DEV
zfs::vdev ZFS::VDEV
da0p3 DEV
da0 DEV
da1 DISK da1
da1 PART da1p1
da1p1 DEV
zfs::vdev ZFS::VDEV
da1 DEV
da2 DISK da2
da2 PART da2p1
da2p1 DEV
zfs::vdev ZFS::VDEV
da2 DEV
da3 DISK da3
da3 PART da3p1
da3p1 DEV
zfs::vdev ZFS::VDEV
da3 DEV
da4 DISK da4
da4 PART da4p1
da4p1 DEV
da4 PART da4p2
da4p2 DEV
da4 PART da4p3
da4p3 DEV
da4 DEV
da5 DISK da5
da5 PART da5p1
da5p1 LABEL gpt/sysboot1
gpt/sysboot1 DEV
da5p1 DEV
da5 PART da5p2
da5p2 LABEL gpt/swap1
gpt/swap1 DEV
da5p2 DEV
da5 PART da5p3
da5p3 LABEL gpt/sysdisk1
gpt/sysdisk1 DEV
da5p3 DEV
da5 DEV
da6 DISK da6
da6 PART da6p1
da6p1 LABEL gpt/HPE_Disk1
gpt/HPE_Disk1 DEV
zfs::vdev ZFS::VDEV
da6p1 DEV
da6 DEV
da7 DISK da7
da7 PART da7p1
da7p1 LABEL gpt/PCK96S7X
gpt/PCK96S7X DEV
zfs::vdev ZFS::VDEV
da7p1 DEV
da7 DEV
da8 DISK da8
da8 PART da8p1
da8p1 LABEL gpt/PCJPJYRX
gpt/PCJPJYRX DEV
zfs::vdev ZFS::VDEV
da8p1 DEV
da8 DEV
da9 DISK da9
da9 PART da9p1
da9p1 LABEL gpt/PCK93TSX
gpt/PCK93TSX DEV
zfs::vdev ZFS::VDEV
da9p1 DEV
da9 DEV
cd0 DISK cd0
cd0 DEV
gzero ZERO gzero
gzero DEV
I think it relates to gpart usage. I'll see if I can dig something out of the command you used.
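A quick way to spot which providers GEOM currently tags as ZFS vdevs (a sketch):
# -B1 shows the provider line directly above each vdev tag
geom -t | grep -B1 'zfs::vdev'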
-
Yeah, thanks guys.
The info posted previously was a WIP; some of it is 'stale'. I'd already focussed on 'geom -t' and the contents of '/etc|boot/zfs/zpool.cache' as possible causes, but they seem to check out OK (see below). This was by comparison with a 'working' FreeBSD system.
I think I'll need to dig further into ZFS to make any more progress, but work and other stuff will get in the way for a bit. So I've worked around the issue for now, got the pool 'auto-mounting', and got my jails running.
Re. the choice of a single disk: this is by design a highly resource-constrained system where every watt-hour is counted, but the benefits of ZFS over other fs options (mainly COW and all it brings) make it a no-brainer even on a single SSD. The data will get backed up.
root@a-fw:~ # geom -t
Geom Class Provider
ada0 DISK ada0
ada0 PART ada0p1
ada0p1 DEV
ada0p1 LABEL gpt/gptboot0
gpt/gptboot0 DEV
ada0 PART ada0p2
ada0p2 DEV
swap SWAP
ada0 PART ada0p3
zfs::vdev ZFS::VDEV
ada0p3 DEV
ada0 DEV
ada1 DISK ada1
ada1 PART ada1p1
ada1p1 DEV
ada1p1 LABEL gpt/tank
zfs::vdev ZFS::VDEV
gpt/tank DEV
ada1 DEV
root@a-fw:~ # zdb -U /etc/zfs/zpool.cache
tank:
version: 5000
name: 'tank'
state: 0
txg: 2157
pool_guid: 3111308251436133108
errata: 0
hostid: 3119175440
hostname: 'a-fw.<fqdn redacted>'
com.delphix:has_per_vdev_zaps
vdev_children: 1
vdev_tree:
type: 'root'
id: 0
guid: 3111308251436133108
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 12494104300729996690
path: '/dev/gpt/tank'
whole_disk: 1
metaslab_array: 256
metaslab_shift: 34
ashift: 12
asize: 2000394125312
is_log: 0
create_txg: 4
com.delphix:vdev_zap_leaf: 129
com.delphix:vdev_zap_top: 130
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
zroot:
version: 5000
name: 'zroot'
state: 0
txg: 136118
pool_guid: 11119205119676167574
errata: 0
hostname: ''
com.delphix:has_per_vdev_zaps
vdev_children: 1
vdev_tree:
type: 'root'
id: 0
guid: 11119205119676167574
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 11612196972070245027
path: '/dev/ada0p3'
whole_disk: 1
metaslab_array: 256
metaslab_shift: 29
ashift: 12
asize: 13414957056
is_log: 0
create_txg: 4
com.delphix:vdev_zap_leaf: 130
com.delphix:vdev_zap_top: 131
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
-
One thing struck me about zpool.cache: the 'working' pool (zroot) metadata reported by 'zdb -U' has a null hostname, whereas the non-working pool (tank) has a FQDN. I was wondering at what point in the boot the system knows its FQDN; if that's after ZFS loads, maybe that would explain why ZFS is failing to mount the 2nd pool. Clutching at straws again..!
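A way to check might be to compare the host's runtime hostid with what the pool label records (a sketch):
# the tank label above stores hostid 3119175440, i.e. 0xb9ead710
sysctl -n kern.hostid
zdb -U /etc/zfs/zpool.cache | grep -E 'hostid|hostname'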
-
After poking at ZFS-based images, I think that zfs-import is probably something that could be run, but under the assumption that all available pools are actually meant to be auto-loaded. It's like ignoring /etc/fstab and just probing all devices for something to load, rather than loading what was configured.
I'm not sure which is better.
Cheers,
Franco
-
Thx, I'm well aware my workaround is a bit hacky.
However, as it's likely this will be the only OPN host I'll want to do this on, I can't really justify looking any closer at it right now, as there are so many other things still 'to-do'.
-
Well, I'm just trying to say: if there is a per-pool setting for auto-mount, maybe that would be the nicest thing to probe and import on demand? :)
Cheers,
Franco
-
Hmmm... 'If'....
TBH, when I was looking at this the other day, I got the impression I'd reached the limit of readily-available documentation and knowledge, and that to go any further I would risk plunging down a rabbit-hole that I really didn't need to.
I'd love to help improve OPNsense by taking the initiative on 'issues' I encounter, but sometimes you have to say to yourself, 'Life's too short for that!' (I've just turned 59 :( )
PS. Woohoo! I'm now a 'junior member'! I wish I felt a bit more junior than I do ;)
-
You're doing good, don't worry. :)
Ok, so back to the "canmount" idea. I found:
https://docs.oracle.com/cd/E19253-01/819-5461/gdrcf/index.html
# zpool import -aN
# zfs mount -va
The -N for import would be important to avoid mounting something we shouldn't. The theory would be we could get the canmount datasets to mount this way.
What do you think?
-
Ok, so I may be tempted to have another look..
BTW:
Karma: 1024 = 1 KiloKarma !
Nice !
PS. I confess that as a long-term Sun admin during the 90's-00's, I have an almost biological inability to have anything to do with O*****. So I prefer to use docs from the Open* world :)
-
Good find. I use another stripped-down FreeBSD distro for my storage, and all pools get mounted without problems on reboots, which is why this one has me scratching my head. I'll ask their devs if they can tell me what mechanism is used, and I'll post when I hear.
-
@franco the pool and dataset import is here: https://sourceforge.net/p/xigmanas/code/HEAD/tree/trunk/etc/rc.d/zfs (BSD license). So it's a very similar approach, but it does as you wondered: it uses -f to force.
-
Thanks. It's a bit strange that it imports/mounts by force and then unmounts again. It's more or less 'import -N' then... let's try that in our code.
Cheers,
Franco
-
https://github.com/opnsense/core/commit/51bdcb64ac
# opnsense-patch 51bdcb64ac
Let me know how that works for you.
Cheers,
Franco
-
Thanks! To test this patch, I took the following actions:
1. Commented out the 'zpool import tank' in my syshook script.
2. Created a new BE, rebooted, applied the patch to the new BE, rebooted.
I noted the following message during boot:
Mounting filesystems...
cannot import 'tank': pool was previously in use from another system.
Last accessed by a-fw.<domain redacted> (hostid=b9ead710) at Thu Feb 24 15:46:17 2022
The pool can be imported, use 'zpool import -f' to import the pool.
I then tried a manual import, without -f, and this succeeded.
So it still seems to me that the problem is due to the pool being recognised as being from 'another system', although it is not. It's as if the host doesn't know its FQDN at the point the initial 'zpool import' happens.
Puzzlingly, although 'zdb -U /boot/zfs/zpool.cache' includes the hostname, neither 'zpool get all' nor 'zfs get all' do (nor do they list the hostid), so I don't know where that's being stored on the pool.
-
Alternatively, is the pool-to-hostname/hostid association kept on the host (not the pool), and is that in zpool.cache?
There seem to be two paths in use for this file: /boot/zfs/zpool.cache and /etc/zfs/zpool.cache.
On my host, the former contains data only on zroot, while the latter contains data on both zroot and the second pool. Maybe the wrong path is being used at boot? It seems a bit odd that there are two different files with the same name...
PS. I just booted the host up into FreeBSD 13.0, and it's the same there: two different files with the same name, where the one under /boot only has the root pool data in it and the one under /etc has both.
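A quick way to compare what each file knows about (a sketch):
# pool names sit at column 0 in zdb's output
zdb -U /boot/zfs/zpool.cache | grep '^[a-z]'
zdb -U /etc/zfs/zpool.cache | grep '^[a-z]'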
-
After doing a couple of checks under FreeBSD 13.0, I exported the second pool 'tank', and shut the host down.
I then rebooted back into OPNsense 22.1 with the patch. The second pool was imported OK at boot time.
I then ran:
root@a-fw:~ # zdb -U /etc/zfs/zpool.cache
tank:
version: 5000
name: 'tank'
state: 0
txg: 206460
pool_guid: 3111308251436133108
errata: 0
hostname: ''
<snip>
The 'hostname' is null.
I then exported the pool, reimported it, then checked again:
root@a-fw:~ # zpool export tank
root@a-fw:~ # zpool import tank
root@a-fw:~ # zdb -U /etc/zfs/zpool.cache
tank:
version: 5000
name: 'tank'
state: 0
txg: 206553
pool_guid: 3111308251436133108
errata: 0
hostid: 3119175440
hostname: 'a-fw.<domain redacted>'
<snip>
The 'hostname' is the FQDN.
I suppose this makes sense, as in the first case the hostname had not yet been set when the filesystems were being mounted.
-
Thanks for testing. This level of sanity checking seems a bit excessive within ZFS, but it is what it is.
Looking at some older tickets like https://github.com/openzfs/zfs/issues/2794, it looks like it could be the host id rather than the host name that is causing this. The file is under /etc/hostid, and now that I understand you are booting into FreeBSD on the same machine, could that bouncing be the cause?
One approach to test would be to unify both files.
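Something like this, perhaps (back up first; an experiment, not a confirmed fix):
# make the boot-side cache match the complete one
cp /boot/zfs/zpool.cache /boot/zfs/zpool.cache.orig
cp /etc/zfs/zpool.cache /boot/zfs/zpool.cache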
Cheers,
Franco
-
The implementations I've seen scan both locations for the cache file; I think that is normal.
More pertinent: the default I've always seen is that 'zpool create' leaves the 'hostname' property empty. We can see that on the zroot pool of an OPN ZFS installation.
@ajm are you sure it wasn't added when creating the pool, or afterwards? Is it worth a test from a fresh install, maybe on a VM?
I'm going to see if I can set one up myself.
-
Update for myself.
Created a VM with the 22.1 ISO and installed with ZFS on root.
root@OPNsense:~ # zdb -U /boot/zfs/zpool.cache
zroot:
version: 5000
name: 'zroot'
state: 0
txg: 4
pool_guid: 5855889087766899018
errata: 0
hostid: 350323563
hostname: 'OPNsense.localdomain'
com.delphix:has_per_vdev_zaps
vdev_children: 1
vdev_tree:
type: 'root'
id: 0
guid: 5855889087766899018
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 7374185842686849048
path: '/dev/ada0p3'
whole_disk: 1
metaslab_array: 67
metaslab_shift: 28
ashift: 12
asize: 3109552128
is_log: 0
create_txg: 4
com.delphix:vdev_zap_leaf: 65
com.delphix:vdev_zap_top: 66
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
I need to create a second pool and verify if it also populates the hostname and edit this post.
Update with findings after adding a second pool:
root@OPNsense:~ # truncate -s 64M poolA
root@OPNsense:~ # ls
.cshrc .lesshst .profile .vimrc
.history .login .shrc poolA
root@OPNsense:~ # zpool create -m /tank tank /root/poolA
root@OPNsense:~ # zpool list
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
tank 48M 456K 47.6M - - 5% 0% 1.00x ONLINE -
zroot 2.75G 1.02G 1.73G - - 1% 37% 1.00x ONLINE -
root@OPNsense:~ # ls /etc/zfs/
compatibility.d/ zpool.cache
root@OPNsense:~ # zdb -U /boot/zfs/zpool.cache
zroot:
version: 5000
name: 'zroot'
state: 0
txg: 4
pool_guid: 5855889087766899018
errata: 0
hostid: 350323563
hostname: 'OPNsense.localdomain'
com.delphix:has_per_vdev_zaps
vdev_children: 1
vdev_tree:
type: 'root'
id: 0
guid: 5855889087766899018
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 7374185842686849048
path: '/dev/ada0p3'
whole_disk: 1
metaslab_array: 67
metaslab_shift: 28
ashift: 12
asize: 3109552128
is_log: 0
create_txg: 4
com.delphix:vdev_zap_leaf: 65
com.delphix:vdev_zap_top: 66
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
root@OPNsense:~ # zdb -U /etc/zfs/
compatibility.d/ zpool.cache
root@OPNsense:~ # zdb -U /etc/zfs/zpool.cache
tank:
version: 5000
name: 'tank'
state: 0
txg: 4
pool_guid: 379491550502733551
errata: 0
hostid: 350323563
hostname: 'OPNsense.localdomain'
com.delphix:has_per_vdev_zaps
vdev_children: 1
vdev_tree:
type: 'root'
id: 0
guid: 379491550502733551
create_txg: 4
children[0]:
type: 'file'
id: 0
guid: 12194019493862066782
path: '/root/poolA'
metaslab_array: 67
metaslab_shift: 24
ashift: 12
asize: 62390272
is_log: 0
create_txg: 4
com.delphix:vdev_zap_leaf: 65
com.delphix:vdev_zap_top: 66
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
zroot:
version: 5000
name: 'zroot'
state: 0
txg: 248
pool_guid: 5855889087766899018
errata: 0
hostname: ''
com.delphix:has_per_vdev_zaps
vdev_children: 1
vdev_tree:
type: 'root'
id: 0
guid: 5855889087766899018
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 7374185842686849048
path: '/dev/ada0p3'
whole_disk: 1
metaslab_array: 67
metaslab_shift: 28
ashift: 12
asize: 3109552128
is_log: 0
create_txg: 4
com.delphix:vdev_zap_leaf: 65
com.delphix:vdev_zap_top: 66
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
root@OPNsense:~ #
-
In my installs I don't have the "hostid" property in either cache file.
If this truly is the protection against importing a foreign pool, it makes sense that it needs to be imported manually once and exported cleanly to get the native hostid for the current host embedded?
Or does it stay "sticky" to the hostid of the host where the pool was created?
Cheers,
Franco
-
That's what has me scratching my head, franco.
Default zroot pool from install: my live OPN doesn't have a "hostid", and "hostname" is empty. What I'm trying to do is verify whether the current code works; so far this looks like an edge case. I'd be worried if OPN had to modify code that should just work, and maybe a method of installation like converting from vanilla FreeBSD creates some odd artifacts, like the second cache file.
I'm going in open-minded, but in the hope that a clean installation will just work as intended.
On a test VM I just spun up, both attributes "hostid" and "hostname" are populated. I haven't created the second pool yet.
-
It's not problematic per se, but the issue I see is this: since we use bsdinstall to install ZFS, and the hostid isn't there even though we set /etc/hostid on the first boot, even on the installer, I'm not sure how this happens... maybe a hostid tool is used that doesn't return anything for install media?
At least for subsequent pools the hostid is set?
Cheers,
Franco
-
I'm stumped. I've rebooted the test VM with the second pool, and that second pool is not mounted. Seems to confirm the report.
I am not sure the hostid and hostname values, or the lack of them, are a problem. All my little tests seem to show is that the second pool appears in only one of the cache files, but I'd be very surprised if that wasn't normal behaviour.
Feels like we need an opinion from someone with a better understanding of ZFS.
Anything else you think I could do to diagnose further, just let me know.
-
Looking at https://serverfault.com/questions/954479/always-do-zpool-export-for-easier-and-or-more-reliable-recovery, maybe this is solely due to a missing manual export?
From what I can tell, the installer on FreeBSD (and therefore for us regarding ZFS) exports its pool at the end of the installation. We do the same for the ZFS image support recently added to our tools, which we have tested without an issue.
From what I can find WRT FreeBSD rc scripting, it never exports its pools, so that would point to the same conclusion?
I'd think a manual 'zpool import -f' should do the trick. If not, an export is worth an additional try, but then it simply must work?
Cheers,
Franco
-
Exactly. If vanilla FreeBSD rc does not export it as part of the shutdown sequence, then a -f would be required.
But that led to the initial report that errored and suspected hostid or hostname, no?
I think I need to pull your patch into my test VM and try, right?
-
Yes, that would help.
Cheers,
Franco
-
I've done the patching, followed by a reboot, and unfortunately the second pool did not get mounted/imported.
There are no errors I can see in dmesg or /var/log/system/{date}.log.
The behaviour is the same, though: 'zpool import'
with any option, i.e. -Na or -f, fails with "no pools to import" or similar (I have had to connect and disconnect from VPNs throughout the day).
What I'm unable to do, due to time, is follow my gut. The ZFS cache files are the same, and I think the problem is with device labels, but I can't confirm due to time. I think we should log an issue on GitHub to investigate.
-
Thanks guys.
Sorry I've not been able to put more time into it, due to things 'IRL'.
-
I know this is an older thread, but it's unsettled.
I was having the exact same problem after adding a second SSD to my system (Protectli 6WD).
I had created/recreated, imported, etc... It was at the point where, if I imported it manually after booting, it worked, but after rebooting it was not mounted.
What I did that appears to have fixed the issue is (the name of my pool is ext):
## Import the pool
$ sudo zpool import ext
## Verify that it's there
$ df -kh /ext
Filesystem Size Used Avail Capacity Mounted on
ext 430G 104K 430G 0% /ext
And then manually exported it
$ sudo zpool export ext
And then rebooted. It's been automatically mounting ever since.
My best guess is that since the pool wasn't mounted by the boot process, the system doesn't cleanly shut it down when it reboots.
YMMV, but hey, if it works...
Fratz
-
For context, the discussed change was added in 22.1.4:
https://github.com/opnsense/changelog/blob/master/community/22.1/22.1.4#L20
The pool requires one manual import/export if it was not a local pool before.
Cheers,
Franco