Messages - chindokae

#1
25.7 Series / Re: pkg update crashes and bricks firewall
September 10, 2025, 04:33:25 AM
Quote from: BrandyWine on September 10, 2025, 12:57:21 AM
Quote from: chindokae on September 10, 2025, 12:08:11 AM
>>> Check for core packages consistency
Core package "opnsense" not known to package database.

I think this issue has been spoken to before by OPNsense folks here on the forum. Need to fix this item.
I think the fix was to install it from cli.

Maybe start with
pkg check -d
pkg clean -an (check and see what dry run says)
pkg clean -a (cleans the cache out)
pkg install -f opnsense
reboot

Then re-run the audit. If it's all good then use GUI update to see what it does.

Also use search feature, usually comes back with something good.
https://forum.opnsense.org/index.php?topic=48599.msg245505#msg245505


None of those things had any effect yesterday, and I tend to let Chat do my searching these days, although I did start here with a search. As always, knowing the root cause of a problem and searching for that works a lot better than having to work from the initial presentation of the problem to its eventual resolution. Makes you look smarter, too.

The resolution to this was purging the sqlite database files - which failed many times yesterday - then trying again today after 25.7.3 was released, reinitializing the local database with packagesite, then working through the sequence of update steps that eventually resolved the issue and got it patching via the GUI again.
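
The exact commands were not posted, so the following is only a sketch of what such a recovery sequence might look like, not the steps that were actually run. It assumes the stock pkg database path /var/db/pkg and the flags `pkg clean -a -y`, `pkg bootstrap -f`, and `pkg update -f`; the `run` and `recover_pkg_db` helpers are hypothetical names. By default it only prints what it would do.

```shell
#!/bin/sh
# Hypothetical recovery sketch (assumption: not the poster's actual steps).
# Prints each command unless RUN=1 is set in the environment.
PKG_DB_DIR="${PKG_DB_DIR:-/var/db/pkg}"

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

recover_pkg_db() {
    # Move the local package database aside rather than deleting it.
    run mv "$PKG_DB_DIR/local.sqlite" "$PKG_DB_DIR/local.sqlite.bak"
    # Empty the package cache.
    run pkg clean -a -y
    # Reinstall pkg itself, then refetch the repository catalogue.
    run pkg bootstrap -f
    run pkg update -f
    # Reinstall the core package the audit reported as unknown.
    run pkg install -f opnsense
}
```

Calling `recover_pkg_db` with RUN unset gives a dry run that can be read before anything is changed; `RUN=1 recover_pkg_db` would execute the commands for real.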

This problem started with patching and ended with it. No amount of trying to update worked while 25.7.2 was the latest release on the repos, but as soon as 25.7.3 showed up, the recommended recovery techniques worked and I could patch from the console, the boot menu, or the GUI. I copied the steps out of history, and today, they worked.

It is nice to see that the cpu-microcode-intel package now deals with the firmware issue.  Now this product works like the other major Unix distros.

I think I'll go with Occam's Razor on this one and say the cause was trying to update to 25.7.2.



#2
25.7 Series / Re: pkg update crashes and bricks firewall
September 10, 2025, 12:08:11 AM
Quote from: BrandyWine on September 09, 2025, 04:11:10 AM
Quote from: chindokae on September 09, 2025, 01:22:29 AM
To recap
How about run the audit from the Gui.


If you mean the one on the update page:

Version 25.7 is correct.
>>> Check for missing or altered kernel files
No problems detected.
>>> Check installed base version
Version 25.7 is correct.
>>> Check for missing or altered base files
No problems detected.
>>> Check installed repositories
OPNsense (Priority: 11)
>>> Check installed plugins
os-cpu-microcode-intel 1.1
>>> Check locked packages
No locks found.
>>> Check for missing package dependencies
Checking all packages: ........ done
>>> Check for missing or altered package files
Checking all packages: ........ done
>>> Check for core packages consistency
Core package "opnsense" not known to package database.

The packagesite package has been upgraded a few times. It reports 898 packages and no errors, but even though it appears 25.7.3 is out as of today, it couldn't be installed until one last update tonight, and now it is downloading 171 packages.
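
For anyone who saves the audit output to a file, a small filter can pull out the failing core package names. This is a minimal sketch that assumes the message format is exactly as pasted above; `unknown_core_packages` is a hypothetical helper name.

```shell
# Print the name of every core package the audit reports as unknown.
# Reads audit text on stdin; prints nothing if no such line is present.
unknown_core_packages() {
    sed -n 's/^Core package "\([^"]*\)" not known to package database\.$/\1/p'
}
```

Usage would be something like `unknown_core_packages < audit.txt`, where audit.txt is wherever the audit text was saved.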
#3
25.7 Series / Re: pkg update crashes and bricks firewall
September 09, 2025, 01:22:29 AM
To recap - I am running the latest microcode, 0x1d, and I applied the recommended sysctl settings at boot.

I have cleaned out the corrupt files in the /var/db/pkg and /var/cache/pkg directories and bootstrapped pkg again.

I had to reinstall it (pkg) for no apparent reason, since I didn't remove it, but it went in cleanly.

I was able to install smartmontools and os-cpu-microcode-intel.

I downloaded images today that are labeled 25.7 and appear to be running that, but the updated versions I see in the repo are not available for download.

pkg says I am up to date even though I see what appear to be two new revisions, 25.7.2 and 25.7.1.

I can only check for updates from the command line because letting the GUI do that causes it to lock up.

All the segmentation faults, etc, are now fixed from the cli.

The repo is not above reproach - the smartmontools package did not install via pkg install because it tried to whack an entire /man folder. It tried to move /usr/local/share/man/.pkgtemp.man8.W8KvzBQqO5Xf over /usr/local/share/man/man8, which contains a lot of manpage files. It wasn't trying to copy into it; it tried to replace it and was thankfully blocked. I installed the package out of the cache, but that is hardly a smooth install.

That is all the time I can afford to put into this problem today. I was *supposed* to be working on hardening the QNAP when this happened. New RAM is on the porch and I need to put that in and leave well enough alone on this problem.

Edit: RAM went in without a fight, Gott sei dank.

To finish off the day's status and for sake of completeness:

I ran long and short smartctl tests and they both completed without error.  dmesg does not show any indication of physical corruption of the disk, no read or write faults, no bad block error, etc.

The initial crash happened during an attempt to update via the GUI and that still isn't right.  Even if it doesn't crash the system, it breaks the httpd service and that requires a reboot, but that is not as bad as the beeping black screen of death.

I have mitigated the presumed firmware issue but have not achieved full operation.  I still see no indication of any hardware faults.   The runaway update error may just have recursed away all available RAM and not crashed the stack.  Either way, it's going down.

Given the inconsistencies in patching even from the command line I have to wonder if the repos got a second bad update.   It happens.
#4
25.7 Series / Re: pkg update crashes and bricks firewall
September 08, 2025, 11:41:24 PM
I cleaned out the pkg data and caches and got it bootstrapped again.   I can install packages and search for things:

pkg search -Q name opnsense
opnsense-25.7.2
Name           : opnsense
Comment        : OPNsense community release

So from the command line it seems OK, but from the UI, touching "check for updates" freezes the web GUI. That requires a reboot from the shell.

Is there an updated image available to install 25.7.2?   
#5
25.7 Series / Re: pkg update crashes and bricks firewall
September 08, 2025, 11:04:32 PM
Well, maybe microcode and mitigations aren't the problem after all:

dmesg | grep micro
[1] CPU microcode: updated from 0xe to 0x1d

sysctl hw.ibrs_disable vm.pmap.pti
hw.ibrs_disable: 0
vm.pmap.pti: 1
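
The loaded microcode revision can be read straight out of that dmesg line. A minimal sketch, assuming the line format matches the one shown above; `microcode_revision` is a hypothetical helper name.

```shell
# Print the microcode revision from the last "updated from X to Y" line.
# Reads dmesg text on stdin; prints nothing if no update was logged.
microcode_revision() {
    sed -n 's/.*CPU microcode: updated from 0x[0-9a-fA-F]* to \(0x[0-9a-fA-F]*\).*/\1/p' | tail -n 1
}
```

Usage: `dmesg | microcode_revision`. Empty output would suggest the microcode was not updated during this boot.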

The sqlite database is not corrupt physically this time, semantically, IDK.

***GOT REQUEST TO CHECK FOR UPDATES***
Currently running OPNsense 25.7 (amd64) at Mon Sep  8 21:00:10 UTC 2025
Fetching changelog information, please wait... done
Updating OPNsense repository catalogue...
Child process pid=11217 terminated abnormally: Segmentation fault
Child process pid=11850 terminated abnormally: Segmentation fault
Child process pid=13706 terminated abnormally: Segmentation fault
self: No packages available to install matching 'opnsense'
***DONE***
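
To see how often the child processes die across repeated attempts, the PIDs can be extracted from a saved copy of the log. This is a sketch that assumes the log lines look exactly like the ones above; `segfault_pids` is a hypothetical helper name.

```shell
# Print the PID of every child process that died with a segmentation fault.
# Reads update log text on stdin, one PID per output line.
segfault_pids() {
    sed -n 's/^Child process pid=\([0-9][0-9]*\) terminated abnormally: Segmentation fault$/\1/p'
}
```

Piping the result through `wc -l` gives the number of crashes in one run.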
#6
25.7 Series / Re: pkg update crashes and bricks firewall
September 08, 2025, 10:52:26 PM
Quote from: Patrick M. Hausen on September 08, 2025, 10:48:49 PM
Install the os-cpu-microcode-intel plugin and reboot.

Thanks. 
#7
25.7 Series / Re: pkg update crashes and bricks firewall
September 08, 2025, 10:41:40 PM
Just to update, it is booted in multiuser mode and running fine. The sqlite package database got trashed. I may try the update again later. I did manage to get smartmontools installed and so far it is clean: no logged historical errors, and short and long tests both completed without error. Due to the lack of dmidecode I can't locate the microcode version, but it is likely to be 0x1c. Due to the lack of devcpu-data I can't easily patch it at runtime, something that OpenBSD and Linux handle automatically.
#8
25.7 Series / Re: pkg update crashes and bricks firewall
September 08, 2025, 09:53:42 PM
Quote from: Patrick M. Hausen on September 08, 2025, 08:58:50 PM
The microcode update must be run from the OS at every boot - it's not persistent. If the manufacturer of your system offers a BIOS update, by all means install that first.

ZFS is more robust than any other file system existing. That's really just a fact. So it would be interesting to know more about your epic install failure. How much memory does your box have?

16 GB.   It normally runs with about 14 GB free.   The failure was just an infinite hang at the end of the install.  The epic part was figuring out how to clean all the ZFS residue off the disk - as I didn't know what to expect having never used it before.  dd ~ 100GB usually cleans all, but not with ZFS. 
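
A likely reason a dd over the start of the disk does not clear ZFS: as far as I know, ZFS keeps four copies of its 256 KiB vdev label, two at the start of the device and two at the very end. The sketch below computes where they would sit for a given device size; `zfs_label_offsets` is a hypothetical helper name, and the size is that of whatever device or partition ZFS was given, not necessarily the whole disk.

```shell
# zfs_label_offsets SIZE_BYTES
# Print the byte offsets of the four 256 KiB vdev labels on a device
# of SIZE_BYTES. POSIX sh has no local variables, so "label" and
# "size" are globals.
zfs_label_offsets() {
    label=262144                      # 256 KiB per label
    size=$(( $1 / label * label ))    # size aligned down to a label boundary
    echo 0                            # label 0: start of device
    echo "$label"                     # label 1: right after label 0
    echo $(( size - 2 * label ))      # label 2: second to last slot
    echo $(( size - label ))          # label 3: last slot
}
```

The last two offsets are the ones a partial dd from the front never reaches. `zpool labelclear -f` on the device is the usual tool for wiping all four.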
#9
25.7 Series / Re: pkg update crashes and bricks firewall
September 08, 2025, 09:48:58 PM
The system barked on 'chown: dhcpd: illegal group name', so where do you come up with "there is no group named dhcp"?

cat /etc/group | grep -i dhcp

_dhcp

It also has a leading underscore.

root@OPNsense:/var/db/pkg # chown dhcp:dhcp local.sqlite.bak
chown: dhcp: illegal group name
root@OPNsense:/var/db/pkg # chown dhcp:_dhcp local.sqlite.bak
chown: dhcp: illegal user name

I am primarily a Red Hat admin. There are some quirks in BSD I am unfamiliar with, but this doesn't appear to be one of them.
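
Since grep -i dhcp matches substrings, it cannot tell dhcp, dhcpd, and _dhcp apart. An exact check on the name field settles which ones exist. A minimal sketch, assuming the usual colon-separated layout of /etc/passwd and /etc/group; `name_exists` is a hypothetical helper name.

```shell
# name_exists FILE NAME
# Succeed only if NAME is exactly the first colon-separated field
# of some line in FILE (for example /etc/passwd or /etc/group).
name_exists() {
    awk -F: -v name="$2" '$1 == name { found = 1 } END { exit !found }' "$1"
}
```

For example, `name_exists /etc/group _dhcp && echo yes` would print yes on this box, while the same check for dhcp or dhcpd would print nothing, which matches the chown errors above.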


#10
25.7 Series / Re: pkg update crashes and bricks firewall
September 08, 2025, 08:52:44 PM
Thanks, that is a viable explanation.

However, I did try ZFS first and that was an epic install failure.   It may not like that FS any better than the other.

I will have to push whatever action I take to the right a bit, primary user one is using the network and it had better not go down again or there will be something bad for dinner.

I will try to install Linux and see if I can get that to push the firmware.
#11
25.7 Series / Re: pkg update crashes and bricks firewall
September 08, 2025, 08:04:41 PM
sysctl -a | grep hw.model
hw.model: Intel(R) N100
hw.clockrate: 806
hw.ncpu: 4


Intel Alder Lake-M

Please explain how the CPU can cause layer 7 problems. 

I may not have been clear, but there were no hardware problems last week, last month, or last quarter, and nothing has changed.  I update monthly when I update all the Linux machines and the offline repos.   

I see nothing in dmesg or any other log to indicate a hardware issue, and there were no errors in /var/log/installer.log.   All in all, the system is operating nominally, as long as I don't try to update packages.

There are errors in the logs like:

/system_advanced_admin.php: The command '/usr/sbin/chown -R dhcpd:dhcpd '/var/dhcpd'' returned exit code '1', the output was 'chown: dhcpd: illegal group name'

Which is true: there is no group named dhcp, and the user name is _dhcp.

Other than that the logs are fairly clean - and nothing that is going to crash a BSD kernel - which in my experience is nearly impossible.

The update program is running entirely in user space, and if it is able to crash the kernel then there is a very good chance that whatever caused this is exploitable.

I sent two complete error reports. If that doesn't suffice, I could try running it in a VM to see how that goes. 

#12
25.7 Series / Re: pkg update crashes and bricks firewall
September 08, 2025, 07:03:52 PM
Well, I should add that this hardware has been running OPNsense for nearly a year and was already at 25.7 when I made what should have been a minor change to Unbound DNS, setting it to use secure DNS at Cloudflare. That caused the loss of all host DNS resolution capability, and when I disabled that service and went back to DHCP-based DNS, the GUI locked up and the firewall was essentially bricked. It would try to reboot if I used the power button but would not bring up the LAN interface.

I sent two bug reports under the email address associated with this account; all necessary information should be there. As I indicated, I am not willing to take down the network again, I need it for work. From a practical point of view, no camera I have at the house has the frame rate necessary to capture the errors anyway. They were just a blur.

The core system firmware boots and runs fine as long as I don't try to update packages.  I am running through the firewall now and using the web GUI as I write this.   

To me that doesn't look anything like hardware. That looks like a bad package.
#13
25.7 Series / pkg update crashes and bricks firewall
September 08, 2025, 06:32:00 PM
I just installed a fresh copy of 25.7 on my hardware firewall. The installation was normal: configured LAN IP and DHCP scope, set root password, logged in with laptop on the LAN interface. Went to the update firmware page and allowed it to patch. It crashed and got stuck on the bug reporting page and would not leave it. I submitted it twice, then forced a reboot with a short press on the power switch, which usually does a graceful shutdown. Not this time: it gave three short beeps and powered off in less than 3 seconds. When I powered it back up it never appeared on the network, and pushing the power button gave the same immediate power off.

Rebuilt it on the workbench but this time updated from the console.  The core system was fully up to date but packages needed updating.  I tried updating from the console and it started normally, downloaded the packages, then crashed immediately when it tried to apply them.  The screen was flooded with thousands of errors, far too fast to read, then it rebooted itself. 

Unlike the attempt from the web GUI, it did not brick itself.  The first package was lighttpd, I believe.   I am not in the mood to retry the update to find out.

This should be easily reproducible as no customizations or settings other than LAN IP and LAN DHCP scope were applied.