Kernel panics after upgrade to R1

Started by computeralex92, July 16, 2024, 08:21:29 PM

July 16, 2024, 08:21:29 PM Last Edit: July 18, 2024, 07:34:52 PM by computeralex92
Hello,

After updating today from 24.1.10 to 24.7.r1, I had several kernel panics:

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address = 0x20
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80c1dfd0
stack pointer         = 0x28:0xffffffff82841df0
frame pointer         = 0x28:0xffffffff82841e00
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 7 (pf purge)
rdi: 0000000000000000 rsi: 0000000000000000 rdx: fffff80001d15740
rcx: fffff80001d15740  r8: 0000000000003000  r9: 000000000000000f
rax: 0000000000000000 rbx: 0000000000000000 rbp: ffffffff82841e00
r10: fffff801f0ef8000 r11: 000000008083bf61 r12: 0000000000000000
r13: fffff80001d15740 r14: 0000000000000000 r15: 000000000001432c
trap number = 12
panic: page fault
cpuid = 3
time = 1721152911
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff82841ae0
vpanic() at vpanic+0x131/frame 0xffffffff82841c10
panic() at panic+0x43/frame 0xffffffff82841c70
trap_fatal() at trap_fatal+0x40b/frame 0xffffffff82841cd0
trap_pfault() at trap_pfault+0x46/frame 0xffffffff82841d20
calltrap() at calltrap+0x8/frame 0xffffffff82841d20
--- trap 0xc, rip = 0xffffffff80c1dfd0, rsp = 0xffffffff82841df0, rbp = 0xffffffff82841e00 ---
turnstile_broadcast() at turnstile_broadcast+0x40/frame 0xffffffff82841e00
__mtx_unlock_sleep() at __mtx_unlock_sleep+0x73/frame 0xffffffff82841e30
pf_unlink_state() at pf_unlink_state+0x338/frame 0xffffffff82841e70
pf_purge_expired_states() at pf_purge_expired_states+0x178/frame 0xffffffff82841ec0
pf_purge_thread() at pf_purge_thread+0x13b/frame 0xffffffff82841ef0
fork_exit() at fork_exit+0x7f/frame 0xffffffff82841f30
fork_trampoline() at fork_trampoline+0xe/frame 0xffffffff82841f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
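
To compare a panic like this with an upstream bug report, it can help to reduce the backtrace to the frames that were running when the fault hit. The following is only a sketch: it assumes the panic text has been saved to a file (the name `panic.txt` is hypothetical) and that the backtrace has the same shape as the one above, with the faulting frames listed between the first and second `--- trap` lines.

```sh
# Print the function names of the frames that faulted, innermost first.
# Reads the saved panic text on stdin. Frames before the first "--- trap"
# line belong to the panic machinery itself and are skipped.
faulting_frames() {
    awk '
        /^--- trap/ { seen++; next }          # count trap separator lines
        seen == 1 && / at / {                 # frames between 1st and 2nd separator
            sub(/\(\).*/, "")                 # keep only the function name
            print
        }
    '
}

# Usage (panic.txt is a hypothetical file holding the text above):
#   faulting_frames < panic.txt
```

For the panic above this would list turnstile_broadcast first, followed by __mtx_unlock_sleep, pf_unlink_state and the rest of the pf purge thread's call chain.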


My first impression was that it only happens directly after a reboot, but now, after some hours without any issue, it happened without any interaction from my side.
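
To line up a panic with uptime or other logs, the `time = 1721152911` field in the panic output can be converted to a readable date, since it is a Unix epoch timestamp. A small sketch, assuming either a BSD or a GNU `date` is available (the function name `epoch_to_utc` is made up for illustration):

```sh
# Convert a Unix epoch timestamp to a UTC date string.
# BSD date accepts the epoch via -r; GNU date accepts it via -d @EPOCH,
# so try the BSD form first and fall back to the GNU form.
epoch_to_utc() {
    date -u -r "$1" '+%Y-%m-%d %H:%M:%S UTC' 2>/dev/null ||
        date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}

epoch_to_utc 1721152911   # prints 2024-07-16 18:01:51 UTC
```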

I will try disabling some tunables carried over from 24.1 that are currently not required; for example, the microcode update is still active (and it seems the boot process tries to apply it...):

CPU microcode: updated from 0xe to 0x17
CPU: Intel(R) N100 (806.40-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0xb06e0  Family=0x6  Model=0xbe  Stepping=0


I reported the last two panics via the issue reporter; hopefully this helps in finding the issue.

Thanks,
Alex

I can confirm this issue as well and have also submitted a report.

Probably https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279899 and sadly the usual behaviour from the usual suspects at this point.


Cheers,
Franco

Thanks Franco for the update and reaching out to FreeBSD.
Is it correct that there is no way to disable pfsync completely? (I checked the man pages and didn't find any tunable, etc.)

We were wondering: does this also crash with the beta kernel? Because it sort of indicates that it didn't before.

# opnsense-update -kr 24.7.b


Cheers,
Franco

Quote from: franco on July 16, 2024, 08:44:49 PM
Probably https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279899 and sadly the usual behaviour from the usual suspects at this point.


Cheers,
Franco

...not again please....
Networking is love. You may hate it, but in the end, you always come back to it.

OPNSense HW
APU2D2 - deceased
N5105 - i226-V | Patriot 2x8G 3200 DDR4 | L 790 512G - VM HA(SOON)
N100   - i226-V | Crucial 16G  4800 DDR5 | S 980 500G - PROD

Quote from: franco on July 16, 2024, 09:19:11 PM
We were wondering: does this also crash with the beta kernel? Because it sort of indicates that it didn't before.

# opnsense-update -kr 24.7.b


Cheers,
Franco

Let's try it out ;-)
Already downloaded, reboot is happening in a sec.

Quote from: computeralex92 on July 16, 2024, 09:32:59 PM
Let's try it out ;-)
Already downloaded, reboot is happening in a sec.

I'm now running the following kernel:


FreeBSD OPNsense.localdomain 14.1-RELEASE FreeBSD 14.1-RELEASE stable/24.7-n267717-cf61c67cb34 SMP amd64


I will keep you updated, but no panic happened directly after the reboot.
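
When several people compare results, it is easy to mix up which kernel is actually running after a reboot. A minimal sketch for pulling the branch and commit identifier out of `uname -a` output, assuming the identifier starts with `stable/` as in the line above (the function name `kernel_ident` is made up for illustration):

```sh
# Print the branch/commit field (e.g. stable/24.7-n267717-cf61c67cb34)
# from uname-style output read on stdin.
kernel_ident() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^stable\//) print $i }'
}

# Usage on the running system:
#   uname -a | kernel_ident
```

Posting this one field alongside a crash or no-crash report makes it clear whether two reports refer to the same kernel build.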

I've been running the beta on a VM and on a Protectli bare metal since it was released and experienced no crashes.

Both are now on the R1 kernel; I will report if anything comes up (uptime is ~2 hours and going strong).

I've installed the 24.7 beta from the ISO in a Proxmox VM and updated to RC1. I'm testing now... no crash so far.

Quote from: computeralex92 on July 16, 2024, 09:37:57 PM
I'm now running the following kernel:


FreeBSD OPNsense.localdomain 14.1-RELEASE FreeBSD 14.1-RELEASE stable/24.7-n267717-cf61c67cb34 SMP amd64


I will keep you updated, but no panic happened directly after the reboot.

I did the same, and also no panic after the reboot.

Pardon me for asking: when I look up pfsync, it deals with high availability. Do you have this set up? The reason I ask is that I don't have it set up and I'm trying to determine whether I should upgrade and test. I'd rather wait if it's also impacting those without HA.

Have been using it for an hour so far and no crash.

This is Proxmox virtualized... not bare metal

July 17, 2024, 01:38:00 AM #13 Last Edit: July 17, 2024, 01:47:10 AM by newsense
Mkay...quick update.

I had reboots on the physical firewalls; the virtualized one is stable.

I moved the physical ones to 24.7.b for now, where one of them ran just fine for a month, and I'm keeping an eye on them.


No HA here either, just to make it clear.



Quote from: franco on July 16, 2024, 09:19:11 PM
We were wondering: does this also crash with the beta kernel? Because it sort of indicates that it didn't before.

# opnsense-update -kr 24.7.b


Cheers,
Franco
Same here. I kept getting crashes every couple of minutes with RC1, so I updated to 24.7.b. It's been an hour with no crash. Love the dashboard, but the widgets are not resizable?