OPNsense Forum

Archive => 23.1 Legacy Series => Topic started by: CrazyBebop on March 14, 2023, 12:45:28 PM

Title: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: CrazyBebop on March 14, 2023, 12:45:28 PM
Hello!

I recently bought one of those mini PC firewalls off of Amazon. I was able to get 23.1 installed and my config imported, and it works perfectly until I attempt to upgrade to 23.1.3. After a successful upgrade message, the system reboots successfully.


A couple of seconds later, even though I can already reach the OPNsense web GUI, a bunch of text flies down the little monitor I use for the firewall and the system immediately reboots into the "FreeBSD" screen, which boots the system back up. After a few seconds it crashes again, and this loops on and on.

I don't think it's a failing disk, because it's a brand new NVMe drive. The drive could be faulty out of the box, but I doubt it: even now, without updating, everything is running perfectly and has been for 7 hours. Everything only breaks when I update.

Does anyone have any suggestions?
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: CrazyBebop on March 14, 2023, 10:40:03 PM
No one? :(
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: Patrick M. Hausen on March 14, 2023, 10:42:32 PM
Film it with your mobile phone and then post some evidence, i.e. the kernel panic message that is probably occurring right before the reboot.

Without more information it is simply impossible to tell.
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: CrazyBebop on March 14, 2023, 10:48:55 PM
Quote from: pmhausen on March 14, 2023, 10:42:32 PM
Film it with your mobile phone and then post some evidence, i.e. the kernel panic message that is probably occurring right before the reboot.

Without more information it is simply impossible to tell.

I'll do that tonight, thanks.
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: develishh on March 16, 2023, 08:31:26 AM
Hello, I am experiencing a very similar problem.
Mine behaves like this:


After updating to 23.1.3_4  (Fresh install, only restored config from backup):

It might be worth noting that this only happens when connected to WAN; it has never crashed on me when there was no Internet connection.
Proof video: https://youtu.be/xvG1fJo8QVg
I'm currently running version 23.1, which seems to be stable but does not allow me to install updates (Update required).
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: rcmcronny on March 16, 2023, 01:33:45 PM
Hi,

I have this too. It also happened on 23.1.2 (check for updates -> kernel crash and reboot). The box is headless, so it is hard to know more.
By updating immediately after a reboot, I was able to update to 23.1.3.
Now the reboots happen without any interaction. I heard the "boot up sound" yesterday at 11pm and this morning around 5:45am.

I will follow this thread and try to find a serial or other console to get more output if needed.

Ronny
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: silverspy18 on March 16, 2023, 10:58:48 PM
My apologies, I posted to the wrong thread. Please delete if possible.
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: develishh on March 20, 2023, 11:18:27 PM
I have managed to find a workaround which stops the system from crashing until the next reboot.
The problem seems to be caused by driver issues with the new kernel (educated guess). Booting with the old kernel has completely solved my issue.

Workaround:
The system should now run stable.
This is not permanent. OPNsense will boot the newer kernel on the next reboot.
If anyone knows how to make this permanent, I would greatly appreciate a reply.


Other things I have tried (just want to mention for documentation purposes):
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: rcmcronny on March 21, 2023, 06:58:47 AM
Hi,

Thanks for this information.

I also checked my SATA SSD, as I suspected it was the cause, but it is healthy too; no issues to see here. I use an APU2C4 with BIOS v4.17.0.3, if that is relevant.

For the moment I live with the reboots, as it's my second internet link. I have to move in the next few weeks and will then check in detail.

@opnsense, any hints or info for us here?

Ronny

Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: meyergru on March 21, 2023, 08:11:52 AM
What is your common factor? I226-V as NIC? This does not apply for the APU2C4, so: PPPoE on WAN?

The I226-V FreeBSD drivers are fairly fresh, pfSense does not even support those yet.

And after several I225 generations ridden with problems, there is plenty of indication on several other platforms (Windows, Linux) right now that I226-V might be just as unstable hardware-wise, just google for "I226-V connection drop".
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: rcmcronny on March 21, 2023, 09:10:11 AM
Quote from: meyergru on March 21, 2023, 08:11:52 AM
What is your common factor? I226-V as NIC? This does not apply for the APU2C4, so: PPPoE on WAN?

Yes, PPPoE on the WAN side for me. The network ports are "Intel(R) I210".

And this was stable as hell. For me it started with 23.1.2 and happened every time I did an update check. Now with 23.1.3 it happens more often.

It's headless, so I really have to dig my serial cable out of my "big box" to see more. But it really is a crash with a reboot, not a simple "connection drop".

Ronny
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: meyergru on March 21, 2023, 10:08:55 AM
What is a connection drop on one OS may as well manifest as a kernel crash on another, just saying.

However, there are few reports of kernel crashes just because of using PPPoE. I have three OPNsense 23.1.3 installations running over it and have had no problems at all.

I wonder if other reports share a common factor in hardware where it is more likely to have crashes than with a user-level process like mpd5. For now, information about probable causes in this thread is scarce (e.g. "those mini PC firewalls off of Amazon" use either I210, I211, I225 or I226 or even Realtek) and only shows common symptoms (i.e. kernel crashes).
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: develishh on March 21, 2023, 11:38:27 AM
I use a USB-to-Ethernet adapter for my WAN interface.
I'm also using PPPoE on my WAN interface.

root@OPNsense:~ # sysctl -a | grep -E 'dev.(rgephy|em|ure).*.%desc:'
dev.rgephy.0.%desc: RTL8251/8153 1000BASE-T media interface
dev.ure.0.%desc: Realtek USB 10/100/1000 LAN, class 0/0, rev 3.00/31.00, addr 3
dev.em.0.%desc: Intel(R) I219-LM SPT-H(2)


The NIC inside the adapter seems to be made by Realtek.
My LAN interface is an Intel(R) I219-LM.
Prior to the update it was also running 100% stable for me.

I'm not using one of those "mini PC firewalls off of Amazon". I have a small form factor PC by HP (Elitedesk 800 G2; Intel i3-6100).
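
For anyone who wants to post comparable driver information, the `sysctl` filter above can be wrapped in a small helper. This is only a sketch: the function name `nic_descriptions` is made up for illustration, and the driver list (`em`, `igb`, `igc`, `re`, `ure`, `rgephy`) is an assumption based on the NIC families mentioned in this thread, so extend it if your driver is missing.

```shell
#!/bin/sh
# nic_descriptions (hypothetical helper): read `sysctl -a` output on stdin and
# keep only the "%desc" lines of the NIC/PHY drivers named in this thread.
# The driver list is an assumption; add your own driver name if it is missing.
nic_descriptions() {
    grep -E '^dev\.(em|igb|igc|re|ure|rgephy)\.[0-9]+\.%desc:'
}

# Usage on the firewall itself:
#   sysctl -a | nic_descriptions
```

Unlike the one-liner above, the dots in the pattern are escaped and the unit number is matched explicitly, so unrelated `dev.*` entries are less likely to slip through.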
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: aiglauer on March 23, 2023, 01:23:22 AM
I am getting a similar reboot loop on OPNsense 23.1.4_1-amd64, which is running within Proxmox with NICs passed through (Mellanox ConnectX-3 and Intel I219). This system has been running fine for nearly a year, and I cannot see any obvious underlying hardware issues. The PC itself is an Intel i5-7600 (consumer PC, not an all-in-one).

Observing the crash in the Proxmox console, it happens too fast to read any messages (it goes from console prompt to 'Guest not running'). I don't know OPNsense/FreeBSD well enough to find the relevant log files (/var/log/dmesg shows nothing of interest).

Happy to help debug, just tell me where to look (log files etc).
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: steely.wing on April 06, 2023, 12:27:54 AM
I have this issue too. I installed OPNsense on a mini PC. After setting up the WAN using a VLAN and waiting several minutes, OPNsense crashes and reboots; several minutes after the reboot, it reboots again.
If I bring the WAN interface down, it does not crash for several hours.
I have run memtest and a disk check; there are no issues.
I can't find a report of this issue on GitHub. Maybe we should open a GitHub issue?
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: develishh on April 08, 2023, 01:46:10 AM
It looks like that may be our only option...
The only problem is that we haven't even figured out what is causing the crash.
Let's start by finding out if we are actually getting the same page fault.

This command will print the most relevant information:
tar -xf /var/crash/textdump.tar.last -O | grep Fatal -A 21

It would be great if everyone who is experiencing this or something similar could paste the output of this command (or a screenshot) here. I attached mine in this post:
[Screenshot-2023-04-08.png]
(In order to avoid having to take my router apart I passed the SSD through to a VM)
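
In case the dump file is missing on some systems, the same extraction can be wrapped in a check first. This is just a sketch under assumptions: the function name `show_panic` is made up, and the default path is simply the one from the command above.

```shell
#!/bin/sh
# show_panic (hypothetical helper): print the "Fatal ..." line and the 21 lines
# after it from a textdump archive, or report that no archive was found.
# The default path is the one used in the command above.
show_panic() {
    dump="${1:-/var/crash/textdump.tar.last}"
    if [ ! -r "$dump" ]; then
        echo "No readable textdump at $dump (the crash may not have been saved)." >&2
        return 1
    fi
    # Stream every archive member to stdout and keep the panic context.
    tar -xf "$dump" -O | grep -A 21 'Fatal'
}

# Usage on the firewall itself:
#   show_panic
#   show_panic /path/to/another/textdump.tar
```

Passing the path as an argument also makes it possible to inspect an archive that was copied off the router.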
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: myradon on April 14, 2023, 11:48:07 AM
I've got the same problem. I bought a Loksing X86-P2 mini PC 5 (https://www.loksing.com.cn/products/x86-p2-software-route-n4000-j4105-j4125-mini-host-6w-low-power-consumption-quad-core-quad-thread-intelligent-hardware-fanless-energy-saving-microcomputer-computer) with an Intel J4125 CPU, Intel I226-V NICs and a Samsung NVMe drive, running the (latest) OPNsense 23.1.5_4. I've configured VLAN and IPS in promiscuous mode.

The monitor either just goes blank, or I see a shutdown with various services stopping and even a speaker beep. It happens after 10 minutes, after 3 hours or within a couple of seconds.

As suggested, I rolled back to the previous kernel. It doesn't make any difference.
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: neonknight on April 18, 2023, 07:36:22 PM
Same issue here. Fresh install on a pcengines apu2c4 and a brand new apu6b4 (which I only bought because I thought that old apu2 was faulty ::) ). I updated the BIOS of said apu6b4 to the latest release. OPNsense was updated to 23.1.5_4 (the update is mandatory, otherwise os-igmp-proxy cannot be installed).

Crash is perfectly reproducible with this command:
curl https://3mdeb.com/open-source-firmware/pcengines/apu6/apu6_v4.17.0.3.rom > /dev/null

Upon crash I get loads of serial console output. I attached the beginning of it.

Using the old kernel (FreeBSD 13.1-RELEASE-p5 stable/23.1-n250372-c4ad069e50a) helps, thanks @develishh for the suggestion.

I'm unable to extract the crash output, as the file does not exist. My system probably crashes before it can be written.
tar -xf /var/crash/textdump.tar.last -O | grep Fatal -A 21
tar: Error opening archive: Failed to open '/var/crash/textdump.tar.last'
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: dcdata on April 20, 2023, 10:46:38 AM
Thank you everyone. I'm relieved to have found this post.
The exact same issue is being experienced on my side using OPNsense in a KVM virtual machine.
Booting the old kernel resolves the problem. (13.1-RELEASE-p5 FreeBSD 13.1-RELEASE-p5 stable/23.1-n250372-c4ad069e50a SMP amd64)
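
To compare kernels between posts, the build identifier can be pulled out of the version string. A small sketch, assuming the identifier always has the `stable/<series>-n<number>-<hash>` shape seen in the strings quoted in this thread; the function name `kernel_build_id` is made up for illustration.

```shell
#!/bin/sh
# kernel_build_id (hypothetical helper): read a kernel version string on stdin
# and print only the "stable/<series>-n<number>-<hash>" part.
# The pattern is an assumption derived from the strings quoted in this thread.
kernel_build_id() {
    grep -oE 'stable/[0-9.]+-n[0-9]+-[0-9a-f]+'
}

# Usage on the firewall itself:
#   uname -v | kernel_build_id
```

If the output matches `stable/23.1-n250372-c4ad069e50a`, the system is running the older kernel that several posters here report as stable.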
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: marcoj90 on April 24, 2023, 09:35:36 PM
I'm not alone... :D

I have the exact same problem on two completely different systems.

1. System:
A VM running on unRAID with direct access to the hardware (Intel + Broadcom NICs).

This system had been running for about 4 years. But after an update I got these reboot problems. At first I thought it was a dead NIC, so I bought a couple of different Broadcom NICs and also tried a virtual network interface ... without success.

2. System: Mini PC, Celeron J4125, 4 x I225-V (bare-metal installation)

Here I got the same reboot problem. So I spent my whole weekend analysing it a bit more and did about 18 fresh installations on that device alone.

As far as I can tell, the freshly downloaded version 23.1 is absolutely stable. It runs without any problems, but also without extensions (updates required). As soon as the updates are installed, the system crashes right after boot completes, but only if the WAN interface is connected. If the Ethernet cable is removed from WAN, the system runs stable ... for some time (a few minutes; the maximum in my case was about 20 min). Switching between the interfaces doesn't change anything.

This even happens if I don't restore any configuration: just a fresh install, interface assignment and a static IP on LAN. After the update to the newest version the problems persist.

To me it looks like some sort of I/O problem.

If I just connect one PC to the LAN interface, connect WAN and don't use the internet connection, it seems to work until I try to configure something or use the internet connection. Sometimes it worked for a few MB before crashing.

And this happens the same way on both systems.

Additional Information:
Because I was waiting for the mini PC, I used a different router as gateway and ran the OPNsense VM with the WAN cable unplugged for OpenVPN, DHCP, DNS and DDClient for about 3 weeks without any crashes. Only the WAN Ethernet cable was disconnected.
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: neonknight on April 26, 2023, 08:58:12 AM
I updated my spare hardware (which crashed too) to 23.1.6. A few simple tests indicate the bug might be gone. I haven't been able to test thoroughly and haven't tried with my production system (which I don't dare to update yet and risk losing the "good old working kernel").
Has anyone else updated and can confirm?
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: develishh on May 03, 2023, 09:43:59 PM
I have updated about a week ago and the problem seems to be resolved.  :D
Title: Re: Upgrade from 23.1->23.1.3 kernel panic/crashing
Post by: bikemike on May 09, 2023, 09:34:55 PM
I am having the same issue with random reboots. There are many, many threads on this in the forums, dating back to previous versions and up to the current one, with no real solution. I posted this thread, but there has not been a lot of movement:

https://forum.opnsense.org/index.php?topic=33583.msg162367#msg162367

I was going to try updating the BIOS on my APU1D4, but I am not hopeful, since others in this thread have not seen improvement afterwards. I wish the OPNsense developers would put some attention on this. I did, however, reach the four-day mark before things crashed this morning (twice). The output looks very similar to the YT video someone posted in a previous comment, but it is not related to update checks. Coming from pfSense, where I had literally zero stability issues, to random reboots nearly every other day is super frustrating.