Kernel panic after upgrade

Started by tamer, February 01, 2019, 09:51:22 PM

February 10, 2019, 10:09:12 PM #30 Last Edit: February 10, 2019, 11:12:16 PM by newsense
Quote from: lattera on February 08, 2019, 11:00:34 PM

You can also set them for only the current boot by escaping to the loader prompt in the bootloader. When you see the OPNsense boot menu, select option 3, then type:

set vm.pmap.pti="0"
set hw.ibrs_disable="1"
boot

Doing this does disable the Meltdown/Spectre mitigations, but only for that one boot, just to see if that's the problem.

This was required on the following CPU running VBox 6.0.4. The OPNsense upgrade would otherwise freeze the Win10 host on reboot into the 19.1 kernel, and the upgrade would not continue. By applying the commands on each reboot, the upgrade completed without issues.
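
If retyping these at the loader prompt on every reboot gets tedious, the same two tunables could probably be made persistent. A rough sketch only: set_tunable is a hypothetical helper name (not an OPNsense tool), and /boot/loader.conf.local as the place for local overrides is an assumption, so check how your install handles tunables before relying on it. Keep in mind this leaves the Meltdown/Spectre mitigations off until the lines are removed again.

```shell
#!/bin/sh
# Hypothetical helper: write NAME="VALUE" to a loader config file,
# replacing any earlier line for the same tunable (safe to run twice).
set_tunable() {
    file=$1; name=$2; value=$3
    touch "$file" || return 1
    tmp=$(mktemp) || return 1
    # Keep every line whose key differs from NAME (exact match, no regex).
    awk -F= -v n="$name" '$1 != n' "$file" > "$tmp"
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Assumed usage on the firewall (as root); the path is an assumption:
#   set_tunable /boot/loader.conf.local vm.pmap.pti 0
#   set_tunable /boot/loader.conf.local hw.ibrs_disable 1
```

Running it a second time with the same name just rewrites the one line instead of stacking duplicates.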

For reference, this is the host CPU info:

Cores 6
Threads 6
Name AMD Phenom II X6 1070T
Code Name Thuban
Package Socket AM3 (938)
Technology 45nm
Specification AMD Phenom II X6 1075T Processor
Family F
Extended Family 10
Model A
Extended Model A
Stepping 0
Revision PH-E0
Instructions MMX (+), 3DNow! (+), SSE, SSE2, SSE3, SSE4A, AMD 64, NX, VMX

Wrong quote above; fixed now to reflect the correct one by lattera.

I have given up, as it is not possible to get support for upgrading OPNsense on the Dell PowerEdge R410 that I am running.

I have a paid support subscription, but only got this answer:
"We do not have a Dell R410 to test. We have tested it on the hardware we sell and on these the upgrade works well."

I do not consider this professional support and had actually expected better when I decided to move from Cisco to OPNsense.

Now I have bitten into the bitter apple and ordered "hardware Deciso sells" to run OPNsense on, as this seems to be the only safe way of having a system which can be kept up to date.





Quote from: Aloist on February 11, 2019, 02:09:00 PM
[...]

I have a paid support subscription, but only got this answer:
"We do not have a Dell R410 to test. We have tested it on the hardware we sell and on these the upgrade works well."

[...]
I completely agree with you: that is not a professional answer. I understand they cannot have every piece of hardware in the world, but paid support must offer more than that!

I have a spare R410 lying around. The RAM in it is dead. If you can hold out until I can buy new RAM (within a week or so), I'd be happy to test out on my (currently dead) R410.

February 11, 2019, 04:10:36 PM #35 Last Edit: February 11, 2019, 04:25:26 PM by Aloist
Quote from: lattera on February 11, 2019, 03:23:30 PM
I have a spare R410 lying around. The RAM in it is dead. If you can hold out until I can buy new RAM (within a week or so), I'd be happy to test out on my (currently dead) R410.

That is kind of you, thank you.
But as I have now ordered the appliance from Deciso, I will move from the Dell R410 (we have a lot of older Dell PowerEdges) to the supported hardware.

I would not be able to rely on your test, because originally, when I first installed OPNsense 18.7 on the R410, kernel crashes also happened, most likely due to RAID controller issues. I documented what I did to fix this:
Had trouble installing OPNsense on w99; it crashes, most likely due
to driver issues with the Dell PERC.
I can catch the USB boot process and enter menu option 3 to set boot
parameters.
There, I add
set hw.mfi.mrsas_enabled=1
and boot.

It may work. Afterwards, I must log in as installer, pw = opnsense

It worked. After reboot, I can enter shell and see with
dmesg|more
that the Megaraid SAS driver is active.
With command
mfiutil show config
I can then see that the RAID configuration is properly recognized.
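
For anyone repeating the dmesg check above, scrolling through dmesg|more can be narrowed down. A small sketch, assuming the controller shows up in dmesg as lines starting with a device name such as mfi0: or mrsas0: (raid_devices is a hypothetical helper name, and the exact device names on a given box are an assumption):

```shell
#!/bin/sh
# Hypothetical helper: read dmesg text on stdin and print each attached
# mfi/mrsas device name once (e.g. "mfi0" or "mrsas0").
raid_devices() {
    grep -E -o '^(mfi|mrsas)[0-9]+:' | tr -d ':' | sort -u
}

# Assumed usage on the firewall:
#   dmesg | raid_devices
```

An empty result would suggest neither driver attached to the controller.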


It is a pity to give up the R410, as it has completely new disks and would have run for many years to come. Still, I have other work to do and cannot spend time with unstable support situations for an essential piece of hardware.

The Deciso hardware has no RAID and no dual power supply, and is therefore inferior in that respect.

I may keep the R410 as a backup. Once it is no longer a critical component, I can afford to risk version upgrades on it.



@Aloist

If it is such an ultra-critical device for you, maybe you should invest in a CARP cluster, possibly with the Dell as the slave for the new device.

Even with updates, if the primary gets messed up, the secondary can take over until the first is fixed.

For 99% of workloads it would probably be best to virtualize the device for quick BMR (bare-metal restore) backup. For failed updates, snapshot restore is an extra bonus.

Virtualization would also make a test environment for major changes and updates feasible, especially with a spare server lying around, as you said.

Quote from: namezero111111 on February 12, 2019, 08:16:35 AM
@Aloist

If it is such an ultra-critical device for you, maybe you should invest in a CARP cluster, possibly with the Dell as the slave for the new device.

It is only the office firewall. But I work a lot outside of the office.

I like to trust that if I do a software update on a device in the office, it reboots and after a few minutes it will be up again.

If the system software update is so bad that a reboot ends in a kernel panic, it never comes up again. I would have to be physically there and reinstall from cold. This is what I fear, not a hardware failure.

We use RAID disks on all essential systems, because disk failure is the most frequent hardware failure, typically after several years of 24/7 use. Everything else (power supply, RAM, CPU) fails much more rarely, in my 40+ years of IT experience.

Hello,

I have the same problem with the kernel panic.
See the attached screenshot.
My CPU is an i3-8100T on a Fujitsu D3633-S mainboard.
I have tried the two kernel parameters, but the panic stays the same.
The USB keyboard is dead after that, so I cannot run the bt command.

Does anyone have a solution?

I have also encountered this problem when going from 18.7.x to 19.1.x.

I am running OPNsense in a VM under Proxmox VE 5.3-9 with UEFI for the first time and have encountered this error. I have NOT encountered it with regular BIOS (SeaBIOS), either on 18.7.x to 19.1.x upgrades or on new 19.1.x installs.
After I changed the VM from UEFI to BIOS (SeaBIOS), the OPNsense 19.1.x VM booted from HDD or ISO CD-ROM with no issues.

Also, the snippet below is in the release notes and may be the cause of this issue:

Migration notes and minor incompatibilities to look out for:

o Gateway health graphs may need a manual reset due to the Apinger to Dpinger migration.  Apinger is no longer available.
o Intrusion detection GeoIP rules are automatically deactivated and need to be manually migrated to firewall alias GeoIP.
o Quagga plugin has been superseded by FRR plugin.  A binary quagga package has been conserved for the time being.
o Please read the FRR documentation with regard to the required system tunables[8].
o Bhyve UEFI boot may fail as a guest.  The problem is being investigated.
o SNMP plugin has been superseded by Net-SNMP plugin.


So from what I can see, it seems to be a general UEFI issue.
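
To make reports in this thread easier to compare, it may help to state how a working install was booted. A minimal sketch, assuming the kernel exposes the machdep.bootmethod sysctl (I believe FreeBSD-based systems of this generation do, but treat that as an assumption); boot_method is just an illustrative function name:

```shell
#!/bin/sh
# Print how the running system was booted: "UEFI", "BIOS", or "unknown"
# when the sysctl is missing (not FreeBSD-based, or an older kernel).
boot_method() {
    m=$(sysctl -n machdep.bootmethod 2>/dev/null)
    echo "${m:-unknown}"
}

boot_method
```

This only works on a system that actually boots, so it is something to run on the 18.7 install before upgrading, or after a successful workaround.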

I thought the bhyve UEFI boot issue was worked around by passing the -w flag to bhyve. Are there other issues related to booting OPNsense 19.1 in bhyve?

Lattera:

I do not know about the bhyve UEFI issue; I just noticed it in the release notes. I am having an issue with UEFI myself, and from past experience the Dell RXXX series of servers used UEFI by default. So I assumed UEFI might be an issue in general, not just in the bhyve implementation.

February 13, 2019, 11:51:49 PM #42 Last Edit: February 14, 2019, 02:30:59 AM by bunchofreeds
@TheGrandWazoo

I think it is UEFI-specific too, which would make sense considering the broad spectrum of hardware and virtual platforms being impacted.

Comparing the contents of the 18.7 and 19.1 ISOs, the files supporting EFI boot have been updated, namely /BOOT/BOOT1.EFI and /BOOT/LOADER.EFI.

https://wiki.freebsd.org/UEFI
https://www.freebsdfoundation.org/freebsd-uefi-secure-boot/ (a bit easier to understand, but it focuses on Secure Boot)


Looks to be a FreeBSD or HardenedBSD issue?

Edit: Tried a Hyper-V install using HardenedBSD-11-STABLE-v1100056.13-amd64-bootonly.
Kernel panic at the same point. Similar output, unresponsive.


Hi all.

I have just completed a clean install of OPNsense v19.1, but had to work around a problem that appears to be similar to that discussed above. However the workaround for me doesn't seem to fit well with some of the suggestions above, laying the blame at the feet of the UEFI Boot process.

So, for the record... I have just installed v19.1 (vga) on a QOTOM-Q190G4 box (physical hardware).

Installation attempts failed by hanging after the kernel loaded its collection of kernel modules and the "Booting" line of text appeared. Nothing further was shown on the screen although USB drive activity continued for some time. I have seen kernel panic messages on Linux consoles, but I don't know whether a BSD kernel panic message would be expected on the display at this point, if there was such an event here. So, with nothing on the display (and nothing yet configured, so no network testing possible) this could conceivably be the same kernel panic referenced in this forum thread.

The fix:  So far, this box's BIOS had been set to traditional MBR BIOS boot (non-UEFI). Somewhat unexpectedly after the comments above, once the system BIOS setting was changed to use UEFI boot, the boot process (and subsequent live operation and then installation) of OPNsense v19.1 went through without a hitch.

Whether this is related to the kernel panics discussed above or not, I can't be sure. However it does confirm a surprising and similar boot problem, and the corresponding solution, at least for this pairing of hardware and OPNsense release. Hopefully that may assist some others.

There is a little bit of strangeness here: it affects some devices and not others. I use Qotoms too (i5 versions), and what you say about the installation appearing to continue is true. If you had connected the serial port and looked there, you may have seen the console output appear. This is something I do by default now when doing clean installs: connect HDMI and also have a serial link to my PC!
OPNsense 24.7 - Qotom Q355G4 - ISP - Squirrel 1Gbps.