Kernel panics after upgrade to R1

Started by computeralex92, July 16, 2024, 08:21:29 PM

Just happened on RC2.  Fresh install via image and restored config.

Fatal trap 12: page fault while in kernel mode


cpuid = 5; apic id = 05
fault virtual address   = 0x0
fault code      = supervisor read data, page not present
instruction pointer   = 0x20:0xffffffff80ddaf27
stack pointer           = 0x28:0xfffffe00e334fbe0
frame pointer           = 0x28:0xfffffe00e334fd10
code segment      = base 0x0, limit 0xfffff, type 0x1b
         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags   = interrupt enabled, resume, IOPL = 0
current process      = 12 (swi1: netisr 5)
rdi: fffff801c527a300 rsi: fffff80236ff5b00 rdx: fffff8042e3e2800

Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 06
fault virtual address   = 0x0
fault code      = supervisor read data, page not present
instruction pointer   = 0x20:0xffffffff80ddaf27
stack pointer           = 0x28:0xfffffe00e334abe0
frame pointer           = 0x28:0xfffffe00e334ad10
code segment      = base 0x0, limit 0xfffff, type 0x1b
         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags   = interrupt enabled, resume, IOPL = 0
current process      = 12 (swi1: netisr 6)
rdi: fffff801c527a300 rsi: fffff8023b6af040 rdx: fffff803b289a000
rcx: fffffe00b5d0f240  r8: 000000000000006b  r9: 3232395231a4ebd5
rax: 0000000000000000 rbx: fffff80001a73740 rbp: fffffe00e334ad10
r10: fffff80001a73740 r11: fffffe00e334a570 r12: fffff801c5621782
r13: fffff801c562179a r14: fffffe00e334abfc r15: fffff80017874800
trap number      = 12
panic: page fault
cpuid = 6
time = 1721416041
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00e334a8d0
vpanic() at vpanic+0x131/frame 0xfffffe00e334aa00
panic() at panic+0x43/frame 0xfffffe00e334aa60
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00e334aac0
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00e334ab10
calltrap() at calltrap+0x8/frame 0xfffffe00e334ab10
--- trap 0xc, rip = 0xffffffff80ddaf27, rsp = 0xfffffe00e334abe0, rbp = 0xfffffe00e334ad10 ---
ip6_forward() at ip6_forward+0x2a7/frame 0xfffffe00e334ad10
ip6_input() at ip6_input+0x11f/frame 0xfffffe00e334adf0
swi_net() at swi_net+0x138/frame 0xfffffe00e334ae60
ithread_loop() at ithread_loop+0x257/frame 0xfffffe00e334aef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe00e334af30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e334af30
--- trap 0x4d3efdb8, rip = 0xaba227f72fb510cb, rsp = 0x53c53af5127eeb0e, rbp = 0xec1658a0e86e8d54 ---
KDB: enter: panic
panic.txt: page fault
version.txt: FreeBSD 14.1-RELEASE-p2 stable/24.7-n267755-f257b8d7e144 SMP

I have an IPv6 rule that was forwarding to a remote IPv6 address. As soon as I disabled that rule, the crashes seem to have stopped.
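
For anyone reading the trace above: in a KDB backtrace, the frame printed right after the first "--- trap ..." marker is where the fault happened, here ip6_forward+0x2a7, and trap 0xc is the same "Fatal trap 12" page fault. Below is a minimal Python sketch for pulling that frame and the panic time out of a pasted trace. It is not an OPNsense or FreeBSD tool; the function names (faulting_frame, panic_time) are made up for illustration, and the sample is a shortened copy of the lines from this post.

```python
import re
from datetime import datetime, timezone

# Matches lines such as:
#   ip6_forward() at ip6_forward+0x2a7/frame 0xfffffe00e334ad10
FRAME_RE = re.compile(r"^(\w+)\(\) at (\w+)\+(0x[0-9a-f]+)/frame (0x[0-9a-f]+)$")
# Matches lines such as:
#   --- trap 0xc, rip = 0xffffffff80ddaf27, rsp = ..., rbp = ... ---
TRAP_RE = re.compile(r"^--- trap (0x[0-9a-f]+), rip = (0x[0-9a-f]+)")
TIME_RE = re.compile(r"^time = (\d+)$")


def faulting_frame(lines):
    """Return (function, offset) of the first frame after the first trap marker."""
    seen_trap = False
    for raw in lines:
        line = raw.strip()
        if TRAP_RE.match(line):
            seen_trap = True
            continue
        match = FRAME_RE.match(line)
        if seen_trap and match:
            return match.group(2), int(match.group(3), 16)
    return None


def panic_time(lines):
    """Return the 'time = <epoch>' value from the panic message as a UTC datetime."""
    for raw in lines:
        match = TIME_RE.match(raw.strip())
        if match:
            return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
    return None


# Shortened copy of the trace pasted in this post.
SAMPLE = """\
time = 1721416041
KDB: stack backtrace:
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00e334ab10
calltrap() at calltrap+0x8/frame 0xfffffe00e334ab10
--- trap 0xc, rip = 0xffffffff80ddaf27, rsp = 0xfffffe00e334abe0, rbp = 0xfffffe00e334ad10 ---
ip6_forward() at ip6_forward+0x2a7/frame 0xfffffe00e334ad10
ip6_input() at ip6_input+0x11f/frame 0xfffffe00e334adf0
"""

if __name__ == "__main__":
    lines = SAMPLE.splitlines()
    print(faulting_frame(lines))
    print(panic_time(lines).isoformat())
```

The function name and offset alone only narrow things down. Mapping the offset to a source line still needs the debug kernel and the dump.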

@danderson Looking for a core dump using the debug kernel for this if you can help out as well.


Thanks,
Franco

@franco

OK, I installed the debug kernel, rebooted, then enabled my IPv6 forward rule and made it crash. In the crash report I see /var/crash/kernel.0:
File too big to process. It will not be submitted automatically. Which file(s) do you want me to grab?

I'll put them on my OneDrive and send a link to franco@opnsense.org, if I remember the address correctly.

Yes, email is correct. Only need the kernel.0, splendid! :)


Had my first crash with RC2 (debug kernel). I submitted the report and put a link to the kernel.0 in the notes.

10+ hours uptime on RC2 here, on all FWs.

It was pretty late: kernel.0 is the wrong file, I need the vmcore.0 instead. Sorry.


Cheers,
Franco

@franco

vmcore.0 file shared via link in your inbox now.

July 20, 2024, 09:36:30 PM #85 Last Edit: July 22, 2024, 09:10:40 AM by franco
So danderson's report was about https://github.com/opnsense/src/commit/9cb6d71f6a

There may be one more, but it would be easier to base further work on a new kernel build on Monday that incorporates the above commit.


Cheers,
Franco

@danderson @csutcliff and anybody else who would like to help:

# opnsense-update -zkr 24.7.r2_2


Cheers,
Franco

Quote from: franco on July 22, 2024, 09:11:53 AM
@danderson @csutcliff and anybody else who would like to help:

# opnsense-update -zkr 24.7.r2_2


Cheers,
Franco

This may not be worth much, since the crashes so far were on bare metal only, but I am running this on an OPNsense VM. So far all good.

I have it running on 2 FWs, so far so good.

Can't test on the others as I lost access there due to a Zerotier issue that seems to have been introduced in RC1/RC2 - sent you an email about it.

July 22, 2024, 02:37:21 PM #89 Last Edit: July 22, 2024, 02:48:57 PM by danderson
@franco

Updated the kernel and rebooted, then repeated the steps that previously caused a crash: no crash this time. I'll keep running this kernel for the day unless you want us to try the 24.7.r2_3 kernel.

Quote from: franco on July 22, 2024, 09:11:53 AM
@danderson @csutcliff and anybody else who would like to help:

# opnsense-update -zkr 24.7.r2_2


Cheers,
Franco