OPNsense Forum

Archive => 19.7 Legacy Series => Topic started by: SimpleRezo on October 10, 2019, 11:37:50 am

Title: Nano image bug since 19.7 on APU
Post by: SimpleRezo on October 10, 2019, 11:37:50 am
Hi

I have hit the same issue several times with OPNsense 19.7 on an APU3D4 with a 16 GB SSD.
 
It works at first, but after a few reboots (3 or 4) it no longer boots because it cannot mount the root partition:

Code: [Select]
Trying to mount root from ufs:/dev/ufs/OPNsense_Nano [rw]...
WARNING: / was not properly dismounted
random: unblocking device.
Setting hostuuid: dd974d29-fee3-11e6-b15e-000db950feac.
Setting hostid: 0xc348cc7e.
Starting file system checks:
/dev/ufs/OPNsense_Nano: CYLINDER GROUP 0: BAD MAGIC NUMBER
/dev/ufs/OPNsense_Nano: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY.
uhub1: 4 ports with 4 removable, self powered
Automatic file system check failed; help!
ERROR: ABORTING BOOT (sending SIGTERM to parent)!

So I booted a FreeBSD live system from USB and tried to fix the issue, but fsck cannot repair the filesystem:
Code: [Select]
# fsck -y /dev/ada0a
** /dev/ada0a
** Last Mounted on /
** Phase 1 - Check Blocks and Sizes
CYLINDER GROUP 0: BAD MAGIC NUMBER
UNEXPECTED SOFT UPDATE INCONSISTENCY

REBUILD CYLINDER GROUP? yes
[ this error repeats 32 times, for cylinder groups 0 to 31 ]
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
UNREF DIR  I=494302  OWNER=root MODE=40755
SIZE=512 MTIME=Sep 19 15:00 2019
RECONNECT? yes

NO lost+found DIRECTORY
CREATE? yes

CYLINDER GROUP 0: BAD MAGIC NUMBER
UNEXPECTED SOFT UPDATE INCONSISTENCY

REBUILD CYLINDER GROUP? yes

fsck_ffs: bad inode number 0 to ginode

If I re-run fsck, the 'BAD MAGIC NUMBER' errors are still there!
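
To check objectively whether a second pass changes anything, the output of each run can be saved and the affected cylinder groups counted. This is only a minimal sketch, assuming the log lines look like the ones above; `count_bad_cg` and the log file names are hypothetical helpers, not part of fsck:

```shell
#!/bin/sh
# Hypothetical helper: count the distinct cylinder groups that a saved fsck
# log reports with "BAD MAGIC NUMBER". Assumes lines of the form
# "CYLINDER GROUP 0: BAD MAGIC NUMBER", optionally prefixed by a device name.
count_bad_cg() {
    sed -n 's/.*CYLINDER GROUP \([0-9][0-9]*\): BAD MAGIC NUMBER.*/\1/p' "$1" |
        sort -un | wc -l | tr -d ' '
}

# Example usage from the live system (not run here; log names are illustrative):
#   fsck -y /dev/ada0a > run1.log 2>&1
#   fsck -y /dev/ada0a > run2.log 2>&1
#   echo "run 1: $(count_bad_cg run1.log), run 2: $(count_bad_cg run2.log)"
```

If both runs report the same count, the rebuild is evidently not being persisted.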

Any ideas why this is happening (and again on different hardware!) and how to fix or prevent it?
Title: Re: Disk issue on reboot
Post by: SimpleRezo on October 21, 2019, 01:14:24 pm
After a lot of tests, it seems to be related to the "nano" images since 19.7.
Maybe it's related to the partition scheme, which seems to have changed in 19.7.
It could also be specific to the hardware we are using (PC Engines APU).

With 19.7:
Code: [Select]
# bunzip2 -c OPNsense-19.7-OpenSSL-nano-amd64.img.bz2 | dd of=/dev/ada0 bs=64M
0+727313 records in
48+0 records out
3221225472 bytes transferred in 249.678551 secs (12901491 bytes/sec)
# fsck /dev/ada0a
** /dev/ada0a
** Last Mounted on
** Phase 1 - Check Blocks and Sizes
CYLINDER GROUP 0: BAD MAGIC NUMBER
REBUILD CYLINDER GROUP? [yn] ^C

***** FILE SYSTEM MARKED DIRTY *****

With 19.1:
Code: [Select]
# bunzip2 -c OPNsense-19.1.4-OpenSSL-nano-amd64.img.bz2 | dd of=/dev/ada0 bs=64M
0+738237 records in
48+0 records out
3221225472 bytes transferred in 234.867398 secs (13715081 bytes/sec)
root@boxlive:/tmp # fsck /dev/ada0a
** /dev/ada0a
** Last Mounted on
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
35209 files, 271164 used, 513915 free (11 frags, 64238 blocks, 0.0% fragmentation)

***** FILE SYSTEM IS CLEAN *****
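
One way to rule out a bad write as the cause is to read the data back from the disk and compare it with the image. This is a minimal sketch, assuming the image has first been decompressed to a regular file; `verify_written` is a hypothetical helper and the file name in the usage comment is only illustrative:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the first N bytes of the target equal
# the image, where N is the image size. The target may be larger than the
# image (for example a whole disk).
verify_written() {
    image=$1
    target=$2
    size=$(wc -c < "$image" | tr -d ' ')
    want=$(cksum < "$image")
    got=$(head -c "$size" "$target" | cksum)
    [ "$want" = "$got" ]
}

# Example usage (not run here):
#   bunzip2 -k OPNsense-19.7-OpenSSL-nano-amd64.img.bz2
#   verify_written OPNsense-19.7-OpenSSL-nano-amd64.img /dev/ada0 && echo "write OK"
```

On a raw disk device, reads may need to be sector-aligned; if `head` fails there, reading back with `dd` and a suitable block size is an alternative. A matching checksum would point at the image contents rather than at the write itself.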
Title: Re: Nano image bug since 19.7 on APU
Post by: maweber on November 20, 2019, 11:53:11 pm
Same here, after an upgrade.
Did you find a solution?
Title: Re: Nano image bug since 19.7 on APU
Post by: SimpleRezo on November 26, 2019, 01:37:12 am
No, unfortunately. For now we are deploying with 19.1 and then upgrading to 19.7 to avoid the issue...
Title: Re: Nano image bug since 19.7 on APU
Post by: maweber on November 26, 2019, 07:47:22 am
I managed to boot into the current USB serial installer, preselect the config from the internal APU SSD (side-loaded from within the installer), have it confirmed on screen, and install the fresh version with the old config over the internal SSD.
I was astonished that the old filesystem was still intact.
Title: Re: Nano image bug since 19.7 on APU
Post by: bobbis on November 27, 2019, 09:14:31 pm
Same here: running OPNsense 19.7.6 on a USB thumb drive, I upgraded to 19.7.7 from the console via SSH, and after the reboot it no longer boots...
Title: Re: Nano image bug since 19.7 on APU
Post by: 172pilot on June 25, 2020, 08:36:21 pm
Any update on this? I'm installing on a WatchGuard X750 and am having the same problem. After a reboot or two, I seem to be getting some corruption that isn't fixed (regardless of whether the shutdown is orderly or not). Based on this thread, I did a full upgrade from the console, and on the first boot afterwards it was completely unable to boot. Very strange!

TIA..
Title: Re: Nano image bug since 19.7 on APU
Post by: franco on June 26, 2020, 10:15:46 am
When on 20.7 we might try to change the utility that is used to build nano images. Chances are:

The utility behaves better on 12.1, or

another utility behaves better in general.

Read all the code here:

https://github.com/opnsense/tools/blob/master/build/nano.sh#L61-L68

Suggestions to make this instantly better and retain compat for everyone would be a way forward.


Cheers,
Franco
Title: Re: Nano image bug since 19.7 on APU
Post by: SimpleRezo on July 27, 2020, 05:13:05 pm
=> https://github.com/opnsense/tools/issues/189
Title: Re: Nano image bug since 19.7 on APU
Post by: franco on July 28, 2020, 12:01:14 pm
Have you tried the 20.7.r1 image? "-f" is a fix we did for too few inodes being created on the system (which Python does not like very much, as it has too many files). Removing this to fix something else would be reckless.
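
To get a feel for how many inodes a file-heavy tree such as a Python installation needs, its entries can simply be counted. This is a rough sketch only; `count_tree_entries` is a hypothetical helper and not something the build script uses:

```shell
#!/bin/sh
# Hypothetical helper: upper-bound estimate of the inodes a directory tree
# needs. Every file, directory and symlink found counts as one entry; hard
# links are counted once per name, so the real inode count can be lower.
# Assumes no file names contain newlines.
count_tree_entries() {
    find "$1" | wc -l | tr -d ' '
}

# Example usage (path is illustrative):
#   count_tree_entries /usr/local/lib
```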


Cheers,
Franco
Title: Re: Nano image bug since 19.7 on APU
Post by: SimpleRezo on August 04, 2020, 05:50:59 pm
We have tried the 20.7 release; the nano image has the same issue.