Kernel panic after upgrade

Started by Hendre, July 28, 2024, 02:29:01 PM

Hey,

Try to gather the vmcore file using the 24.7.1 debug kernel:

# opnsense-update -zkr 24.7.1-dbg


Cheers,
Franco

Unfortunately no luck installing the kernel. I tried 24.7.2 as well but same error.

sudo opnsense-update -zkr 24.7.1-dbg

Fetching kernel-24.7.1-dbg-amd64.txz: ..[fetch: https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/sets/kernel-24.7.1-dbg-amd64.txz.sig: Not Found] failed, no signature found
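The fetch error above names the exact signature URL that was missing. As a rough way to see which files a mirror actually serves, here is a minimal Python sketch. The helper names `set_urls` and `http_status` are made up for illustration, the base path is copied verbatim from the error message, and it is only an assumption that the mirror answers HEAD requests. It does not say what the right set name or directory would be.

```python
import urllib.error
import urllib.request


def set_urls(base, name):
    """Return (set_url, signature_url) for a set file under `base`."""
    set_url = f"{base.rstrip('/')}/{name}"
    return set_url, set_url + ".sig"


def http_status(url, timeout=10):
    """Send a HEAD request and return the HTTP status code."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 404 and friends arrive as exceptions; report the code instead
        return err.code


# Base path and file name taken from the error output in this thread
BASE = "https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/sets"
SET_URL, SIG_URL = set_urls(BASE, "kernel-24.7.1-dbg-amd64.txz")

# To probe the mirror (needs network access):
# for url in (SET_URL, SIG_URL):
#     print(http_status(url), url)
```

If the set itself is also missing, the file name or directory is wrong rather than just the signature.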

Applying patch from this forum https://forum.opnsense.org/index.php?topic=42081.0 fixes the issue. Seeing if it holds over the next hours / days without panic.

I'm having the same issue:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 5; apic id = 11
instruction pointer = 0x20:0xffffffff80d7c723
stack pointer         = 0x28:0xfffffe00c7148b90
frame pointer         = 0x28:0xfffffe00c7148bf0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (swi1: netisr 0)
rdi: fffff80001a73740 rsi: 000000000300000a rdx: 35b04bd7a137a137
rcx: ffffffff83a15000  r8: 000000000000c544  r9: 0000000000000005
rax: ffffffffff32ed00 rbx: 000000000000f023 rbp: fffffe00c7148bf0
r10: 000000000000000a r11: fffffe00219d2c30 r12: 000000000000c544
r13: fffff80001a73740 r14: fffffe00219d8c38 r15: 00000000020013ac
trap number = 9
panic: general protection fault
cpuid = 5
time = 1724463849
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00c71488d0
vpanic() at vpanic+0x131/frame 0xfffffe00c7148a00
panic() at panic+0x43/frame 0xfffffe00c7148a60
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00c7148ac0
calltrap() at calltrap+0x8/frame 0xfffffe00c7148ac0
--- trap 0x9, rip = 0xffffffff80d7c723, rsp = 0xfffffe00c7148b90, rbp = 0xfffffe00c7148bf0 ---
in_pcblookup_hash_smr() at in_pcblookup_hash_smr+0x43/frame 0xfffffe00c7148bf0
in_pcblookup_mbuf() at in_pcblookup_mbuf+0x18/frame 0xfffffe00c7148c10
tcp_input_with_port() at tcp_input_with_port+0x4f6/frame 0xfffffe00c7148d80
tcp_input() at tcp_input+0xb/frame 0xfffffe00c7148d90
ip_input() at ip_input+0x268/frame 0xfffffe00c7148df0
swi_net() at swi_net+0x138/frame 0xfffffe00c7148e60
ithread_loop() at ithread_loop+0x257/frame 0xfffffe00c7148ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe00c7148f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00c7148f30
--- trap 0x83480824, rip = 0x4816ebc033047500, rsp = 0xc4834800000001b8, rbp = 0x4c89481024548948 ---
KDB: enter: panic
panic.txt: general protection fault
version.txt: FreeBSD 14.1-RELEASE-p3 ixl_revert-n267779-6ca05616b9e9 SMP
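For anyone comparing several of these panic dumps, a small Python sketch like the one below can pull out the panic time and the frame that was running when the trap hit. It assumes the dump uses the same `name() at name+0xOFF/frame 0x...` layout as the trace above. The function names `parse_panic` and `faulting_frame` are made up for illustration, and the sample is an excerpt of the lines posted here.

```python
import re
from datetime import datetime, timezone

# Matches KDB frame lines such as:
#   tcp_input() at tcp_input+0xb/frame 0xfffffe00c7148d90
FRAME_RE = re.compile(r"^(\w+)\(\) at (\w+)\+0x([0-9a-f]+)/frame 0x[0-9a-f]+$")
TIME_RE = re.compile(r"^time = (\d+)$")


def parse_panic(text):
    """Return (panic time in UTC or None, list of (function, offset))."""
    when = None
    frames = []
    for raw in text.splitlines():
        line = raw.strip()
        m = FRAME_RE.match(line)
        if m:
            frames.append((m.group(2), int(m.group(3), 16)))
            continue
        m = TIME_RE.match(line)
        if m:
            when = datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)
    return when, frames


def faulting_frame(frames):
    """Return the first frame after calltrap, or None if there is none."""
    names = [name for name, _ in frames]
    if "calltrap" in names:
        idx = names.index("calltrap") + 1
        if idx < len(frames):
            return frames[idx]
    return None


# Excerpt of the trace posted in this thread
SAMPLE = """\
time = 1724463849
KDB: stack backtrace:
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00c7148ac0
calltrap() at calltrap+0x8/frame 0xfffffe00c7148ac0
--- trap 0x9, rip = 0xffffffff80d7c723, rsp = 0xfffffe00c7148b90, rbp = 0xfffffe00c7148bf0 ---
in_pcblookup_hash_smr() at in_pcblookup_hash_smr+0x43/frame 0xfffffe00c7148bf0
in_pcblookup_mbuf() at in_pcblookup_mbuf+0x18/frame 0xfffffe00c7148c10
tcp_input_with_port() at tcp_input_with_port+0x4f6/frame 0xfffffe00c7148d80
"""

when, frames = parse_panic(SAMPLE)
print(when, faulting_frame(frames))
```

This only summarises the text dump. It is not a replacement for the vmcore from a debug kernel, which is still needed to look at the actual kernel state.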



A lot of:
<7>cannot forward src XXXXXXX, dst XXXXXXX, nxt 6, rcvif vlan0.20, outif pppoe0
<6>pid 34938 (php), jid 0, uid 0: exited on signal 10 (no core dump - bad address)


I can't test the kernel in the PPPoE thread because I already have a patched kernel :D but I will be happy to troubleshoot this one if needed.

Quote from: furfix on August 24, 2024, 07:34:28 PM
I can't test the kernel in the PPPoE thread because I already have a patched kernel :D but I will be happy to troubleshoot this one if needed.

Hmmm, the patches in that thread do not touch the kernel at all.  ???

August 24, 2024, 08:14:23 PM #20 Last Edit: August 24, 2024, 08:16:04 PM by furfix
maybe the patch fixes something that is currently triggering the panic?

Just updated my 4-port 2.5GbE mini PC with the latest update and it's bricked. If I use the CLI and revert to kernel ver2 of 2 instead of 1 I can get into the GUI, but it won't let me do anything and won't connect to the internet. Seems DHCP is broken.

People cross-posting all over with no information attached. I don't know how many times we've tried to say please do not. Make your own threads or find the exact match with the details you wanted to post.

@Hendre I followed up via mail. I posted a garbled update command, but since your issue is gone I've given a few hints as to what it could have been.


Cheers,
Franco

Quote from: furfix on August 24, 2024, 08:14:23 PM
maybe the patch fixes something that is currently triggering the panic?

Well maybe - was my point. You can apply the PPPoE patches regardless of any patched kernel.

Quote from: franco on August 24, 2024, 09:20:49 PM
People cross-posting all over with no information attached. I don't know how many times we've tried to say please do not. Make your own threads or find the exact match with the details you wanted to post.

@Hendre I followed up via mail. I posted a garbled update command, but since your issue is gone I've given a few hints as to what it could have been.


Cheers,
Franco

I would love to provide more info, but I'm not as versed as many on here and this is my first issue with OPNsense. I've also tried to reinstall and DNS just will not work, so nothing can be routed.

Open a ticket, explain "bricked": does it boot, can you log in, do the firmware updates work, health audit ok, other things you want to say. Send me a PM with the link to the post so I will follow up there. Sometimes it's too busy to reply to all threads.


Cheers,
Franco

I went back to my original 24.1 config on 24.7.3 and figured out udp broadcast relay plugin was somehow causing the panic. Disabled and installed plugin, now all is working perfectly on 24.7.3. This was tough but finally got there.

Thanks for the support Franco.

udpbroadcastrelay is now at version 1.1 in 24.7.3, but this is just a side note.

I still suspect a kernel issue. I really need that vmcore file from the debug kernel.


Cheers,
Franco

I'll find time for it, just happy I finally have a functional setup. Strangely udp broadcast relay plugin still shows 1.0 on 24.7.3 for me ...

The plugin is not the same thing as the third-party package that contains the actual software, so the two have different versions.


Cheers,
Franco