How to auto reset after kernel panic?

Started by birdwood, December 16, 2024, 01:07:40 PM

Hi there,

I am running OPNsense in a VM on a system with a quad-core J6412, 16 GB RAM and an SSD.

I was getting daily kernel panics with IDS enabled. Without IDS I still get a kernel panic at least weekly.

After a kernel panic the system doesn't reset; it just stays in the debugger. Isn't it meant to reset automatically?

I tried "ddb script kdb.enter.panic=reset" in the console, but no luck.

How can I get the system to reset itself?
Thanks.



Yes, it auto-resets by default?


Cheers,
Franco

Quote from: franco on December 16, 2024, 04:10:14 PM
Yes, it auto-resets by default?

Hmm, that's what I thought... but I am being greeted with a screen like this that requires a manual reset.


That's a shallow backtrace without much to go on. Does this maybe happen during bootup when the automatic reboot was not configured yet?


Cheers,
Franco

Hi Franco,

I've attached my system log from the day of the last kernel panic. I redacted a few things.

I think the system went down at 18:39. I did a manual reset from the debugger at 18:51, and then things are logged after that. There are no errors in boot.log.

Would the dmesg log be of use?

Thanks for looking at this. Any further thoughts?


Well, here is why your ddb is not restarting the system. Your UFS file system was corrupted, and something is stuck in /tmp that prevents writing to it:

<13>1 2024-12-15T18:51:25+11:00 opnsense.domain.com kernel - - [meta sequenceId="176"] <118>** SU+J Recovering /dev/gpt/rootfs

<13>1 2024-12-15T18:51:25+11:00 opnsense.domain.com kernel - - [meta sequenceId="198"] <118>rm: /tmp/template_sample: Invalid argument

<13>1 2024-12-15T18:51:25+11:00 opnsense.domain.com kernel - - [meta sequenceId="199"] <118>/usr/local/etc/rc.subr.d/crashdump: cannot create /tmp/ddb.conf: No such file or directory
<13>1 2024-12-15T18:51:25+11:00 opnsense.domain.com kernel - - [meta sequenceId="200"] <118>Configuring crash dump device: /dev/gpt/swapfs
<13>1 2024-12-15T18:51:25+11:00 opnsense.domain.com kernel - - [meta sequenceId="201"] <118>usage: ddb capture [-M core] [-N system] print
<13>1 2024-12-15T18:51:25+11:00 opnsense.domain.com kernel - - [meta sequenceId="202"] <118>       ddb capture [-M core] [-N system] status
<13>1 2024-12-15T18:51:25+11:00 opnsense.domain.com kernel - - [meta sequenceId="203"] <118>       ddb script scriptname
<13>1 2024-12-15T18:51:25+11:00 opnsense.domain.com kernel - - [meta sequenceId="204"] <118>       ddb script scriptname=script
<13>1 2024-12-15T18:51:25+11:00 opnsense.domain.com kernel - - [meta sequenceId="205"] <118>       ddb scripts
<13>1 2024-12-15T18:51:25+11:00 opnsense.domain.com kernel - - [meta sequenceId="206"] <118>       ddb unscript scriptname
<13>1 2024-12-15T18:51:25+11:00 opnsense.domain.com kernel - - [meta sequenceId="207"] <118>       ddb pathname
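
For anyone who wants to check their own system for the same symptom, here is a minimal shell sketch. It assumes you have a saved copy of the system log; the function names and the log path are hypothetical and only for illustration. The first function greps for the two messages quoted above (ddb.conf could not be written, and ddb then printed its usage text). The second runs "ddb scripts", the subcommand listed in that usage text, to show which scripts are registered on the running system.

```shell
#!/bin/sh
# Sketch only: function names and the log path argument are hypothetical.

# Print log lines showing that the crash dump setup failed at boot:
# either /tmp/ddb.conf could not be created, or ddb printed its usage text.
find_ddb_setup_errors() {
    grep -E 'cannot create /tmp/ddb\.conf|usage: ddb capture' "$1"
}

# List the ddb scripts registered on the running system.
# Guarded so the sketch does nothing harmful where ddb(8) is not installed.
show_ddb_scripts() {
    if command -v ddb >/dev/null 2>&1; then
        ddb scripts
    else
        echo "ddb not available on this host"
    fi
}
```

Usage would be something like "find_ddb_setup_errors /path/to/system.log" followed by "show_ddb_scripts". If the first prints matches and the second shows no kdb.enter.panic entry, the panic script was presumably never loaded at boot, which would fit the machine staying in the debugger.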

At this point I would just recommend reinstalling the system, and you should consider using ZFS.

Other crashes or power loss may have damaged the file system to begin with. Keep in mind that local NetFlow reporting also puts immense strain on the file system and the disk itself.


Cheers,
Franco

Oh!!! I think some time ago we had a power cut and the modem got fried, so this could be related.

I have reinstalled with ZFS and turned local NetFlow capture off.

Will see how it goes for a couple of weeks.

Thank you and happy holidays! :)