OPNsense Forum

English Forums => 24.7, 24.10 Legacy Series => Topic started by: computeralex92 on July 16, 2024, 08:21:29 PM

Title: Kernel panics after upgrade to R1
Post by: computeralex92 on July 16, 2024, 08:21:29 PM
Hello,

After updating today from 24.1.10 to 24.7.r1, I had some kernel panics:

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address = 0x20
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80c1dfd0
stack pointer         = 0x28:0xffffffff82841df0
frame pointer         = 0x28:0xffffffff82841e00
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 7 (pf purge)
rdi: 0000000000000000 rsi: 0000000000000000 rdx: fffff80001d15740
rcx: fffff80001d15740  r8: 0000000000003000  r9: 000000000000000f
rax: 0000000000000000 rbx: 0000000000000000 rbp: ffffffff82841e00
r10: fffff801f0ef8000 r11: 000000008083bf61 r12: 0000000000000000
r13: fffff80001d15740 r14: 0000000000000000 r15: 000000000001432c
trap number = 12
panic: page fault
cpuid = 3
time = 1721152911
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff82841ae0
vpanic() at vpanic+0x131/frame 0xffffffff82841c10
panic() at panic+0x43/frame 0xffffffff82841c70
trap_fatal() at trap_fatal+0x40b/frame 0xffffffff82841cd0
trap_pfault() at trap_pfault+0x46/frame 0xffffffff82841d20
calltrap() at calltrap+0x8/frame 0xffffffff82841d20
--- trap 0xc, rip = 0xffffffff80c1dfd0, rsp = 0xffffffff82841df0, rbp = 0xffffffff82841e00 ---
turnstile_broadcast() at turnstile_broadcast+0x40/frame 0xffffffff82841e00
__mtx_unlock_sleep() at __mtx_unlock_sleep+0x73/frame 0xffffffff82841e30
pf_unlink_state() at pf_unlink_state+0x338/frame 0xffffffff82841e70
pf_purge_expired_states() at pf_purge_expired_states+0x178/frame 0xffffffff82841ec0
pf_purge_thread() at pf_purge_thread+0x13b/frame 0xffffffff82841ef0
fork_exit() at fork_exit+0x7f/frame 0xffffffff82841f30
fork_trampoline() at fork_trampoline+0xe/frame 0xffffffff82841f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic


At first it seemed to happen only directly after a reboot, but now, after some hours without any issue, it happened without any interaction from my side.

I will try to disable some tunables carried over from 24.1 that are currently not required; e.g. the microcode update is still active (and it seems the boot process tries to apply it...):

CPU microcode: updated from 0xe to 0x17
CPU: Intel(R) N100 (806.40-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0xb06e0  Family=0x6  Model=0xbe  Stepping=0


I reported the last two panics via the issue reporter; hopefully this helps in finding the issue.

Thanks,
Alex
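
As a reading aid for the panic text above, here is a minimal Python sketch (not part of the original post) that pulls out a few fields: the faulting address, the process name, and the `time =` value, which is assumed here to be a Unix epoch timestamp. The helper name `summarize_panic` and the regular expressions are illustrative assumptions, not an OPNsense or FreeBSD tool.

```python
import re
from datetime import datetime, timezone

# Excerpt of the panic message posted above.
PANIC_TEXT = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address = 0x20
fault code = supervisor read data, page not present
current process = 7 (pf purge)
trap number = 12
panic: page fault
cpuid = 3
time = 1721152911
"""

def summarize_panic(text):
    """Extract a few fields from a panic message (hypothetical helper)."""
    fault = re.search(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)", text)
    proc = re.search(r"current process\s*=\s*\d+ \((.+?)\)", text)
    when = re.search(r"^time = (\d+)$", text, re.MULTILINE)
    return {
        "fault_address": int(fault.group(1), 16),
        "process": proc.group(1),
        # Assumption: the value is seconds since the Unix epoch.
        "utc_time": datetime.fromtimestamp(int(when.group(1)), tz=timezone.utc),
    }

summary = summarize_panic(PANIC_TEXT)
print(summary["process"], hex(summary["fault_address"]), summary["utc_time"].isoformat())
# prints: pf purge 0x20 2024-07-16T18:01:51+00:00
```

A fault address this close to zero (0x20), together with the zeroed rdi and rax registers, is commonly a sign of a structure field being read through a NULL pointer. That reading is an interpretation, not something the post states.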
Title: Re: Kernel panics after upgrade to R1
Post by: Vasco on July 16, 2024, 08:43:04 PM
I can confirm this issue as well and have also submitted a report.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 16, 2024, 08:44:49 PM
Probably https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279899 and sadly the usual behaviour from the usual suspects at this point.


Cheers,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: computeralex92 on July 16, 2024, 09:18:18 PM
Thanks Franco for the update and reaching out to FreeBSD.
Is it correct that there is no way to disable pfsync completely? (I checked the man pages and didn't find any tunable etc.)
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 16, 2024, 09:19:11 PM
We were wondering if this also crashes with the beta kernel? Because it sort of indicates that it didn't before.

# opnsense-update -kr 24.7.b


Cheers,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: Seimus on July 16, 2024, 09:27:24 PM
Quote from: franco on July 16, 2024, 08:44:49 PM
Probably https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279899 and sadly the usual behaviour from the usual suspects at this point.


Cheers,
Franco

...not again please....
Title: Re: Kernel panics after upgrade to R1
Post by: computeralex92 on July 16, 2024, 09:32:59 PM
Quote from: franco on July 16, 2024, 09:19:11 PM
We were wondering if this also crashes with the beta kernel? Because it sort of indicates that it didn't before.

# opnsense-update -kr 24.7.b


Cheers,
Franco

Let's try it out ;-)
Already downloaded, reboot is happening in a sec.
Title: Re: Kernel panics after upgrade to R1
Post by: computeralex92 on July 16, 2024, 09:37:57 PM
Quote from: computeralex92 on July 16, 2024, 09:32:59 PM
Quote from: franco on July 16, 2024, 09:19:11 PM
We were wondering if this also crashes with the beta kernel? Because it sort of indicates that it didn't before.

# opnsense-update -kr 24.7.b


Cheers,
Franco

Let's try it out ;-)
Already downloaded, reboot is happening in a sec.

I'm now running the following kernel:


FreeBSD OPNsense.localdomain 14.1-RELEASE FreeBSD 14.1-RELEASE stable/24.7-n267717-cf61c67cb34 SMP amd64


I will keep you updated, but directly after the reboot no panic happened.
Title: Re: Kernel panics after upgrade to R1
Post by: newsense on July 16, 2024, 09:53:35 PM
I've been running the beta on a VM and on a Protectli bare metal since it was released and experienced no crashes.

Both are now on the R1 kernel, will report if anything comes up (uptime is ~2 hours running strong)
Title: Re: Kernel panics after upgrade to R1
Post by: muchacha_grande on July 16, 2024, 11:11:55 PM
I've installed 24.7 beta from the ISO into Proxmox VM and updated to RC1... Now I'm testing... no crash for now.
Title: Re: Kernel panics after upgrade to R1
Post by: Vasco on July 16, 2024, 11:29:37 PM
Quote from: computeralex92 on July 16, 2024, 09:37:57 PM
Quote from: computeralex92 on July 16, 2024, 09:32:59 PM
Quote from: franco on July 16, 2024, 09:19:11 PM
We were wondering if this also crashes with the beta kernel? Because it sort of indicates that it didn't before.

# opnsense-update -kr 24.7.b


Cheers,
Franco

Let's try it out ;-)
Already downloaded, reboot is happening in a sec.

I'm now running the following kernel:


FreeBSD OPNsense.localdomain 14.1-RELEASE FreeBSD 14.1-RELEASE stable/24.7-n267717-cf61c67cb34 SMP amd64


I will keep you updated, but directly after the reboot no panic happened.

did the same and also no panic after reboot
Title: Re: Kernel panics after upgrade to R1
Post by: madj42 on July 16, 2024, 11:35:34 PM
Pardon me for asking: when I look up pfsync, it deals with high availability. Do you have this set up? The reason I ask is that I don't have it set up and I'm trying to determine if I should upgrade and test. I'd rather wait if it's impacting those without HA too.
Title: Re: Kernel panics after upgrade to R1
Post by: muchacha_grande on July 17, 2024, 12:00:02 AM
Have been using it for an hour so far and no crash.

This is Proxmox virtualized... not bare metal
Title: Re: Kernel panics after upgrade to R1
Post by: newsense on July 17, 2024, 01:38:00 AM
Mkay...quick update.

I had reboots on the physical FWs, the virtualized one is stable.

Moved the physical ones to 24.7.b for now, where one of them ran just fine for a month, and I'm keeping an eye on them.


No HA here either, just to make it clear.


Title: Re: Kernel panics after upgrade to R1
Post by: depc80 on July 17, 2024, 01:51:01 AM
Quote from: franco on July 16, 2024, 09:19:11 PM
We were wondering if this also crashes with the beta kernel? Because it sort of indicates that it didn't before.

# opnsense-update -kr 24.7.b


Cheers,
Franco
Same here. It kept crashing every couple of minutes with RC1, so I updated to 24.7.b. It's been an hour and no crash. Love the dashboard, but widgets are not resizable?
Title: Re: Kernel panics after upgrade to R1
Post by: newsense on July 17, 2024, 01:54:27 AM
>>> widgets are not resizable ?

They are if you unlock the dashboard in the upper right corner
Title: Re: Kernel panics after upgrade to R1
Post by: depc80 on July 17, 2024, 02:44:44 AM
I tried that; it's not working vertically. I want to see the full service list, so I extend it down a bit, hit refresh, and it's back to how it was before.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 17, 2024, 07:03:49 AM
Can we keep this thread to the core of the subject?

Let's bisect this then if BETA is good. I'll have a new kernel in a bit.


Cheers,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: computeralex92 on July 17, 2024, 07:14:53 AM
Quote from: franco on July 17, 2024, 07:03:49 AM
Can we keep this thread to the core of the subject?

Let's bisect this then if BETA is good. I'll have a new kernel in a bit.


Cheers,
Franco

Until now no panic with the beta kernel...
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 17, 2024, 07:30:53 AM
> Until now no panic with the beta kernel...

Good, here is the next one:

# opnsense-update -zkr 24.7.b_15


Cheers,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: computeralex92 on July 17, 2024, 07:43:44 AM
Quote from: franco on July 17, 2024, 07:30:53 AM
> Until now no panic with the beta kernel...

Good, here is the next one:

# opnsense-update -zkr 24.7.b_15


Cheers,
Franco

So far no panic after reboot:

FreeBSD OPNsense.localdomain 14.1-RELEASE-p1 FreeBSD 14.1-RELEASE-p1 n267732-007d9fa5c015 SMP amd64
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 17, 2024, 07:57:10 AM
Ok, second confirmation would be nice. This is going to be a weird one if it's in the later commits leading up to RC1.


Cheers,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: newsense on July 17, 2024, 08:38:49 AM
b15 crashed immediately for me
Title: Re: Kernel panics after upgrade to R1
Post by: depc80 on July 17, 2024, 08:46:28 AM
After reboot, 24.7.b_15 crashed twice for me, but since then it has been working fine. Submitted the report, but I'm not sure it was sent because of the second crash.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 17, 2024, 08:49:28 AM
Ok guys I really have conflicting crash reports with different panics. If we screw up the bisect because our goal is "crash" we just produce heat and waste time. If you can send your crash reports on _15 so I can check...
Title: Re: Kernel panics after upgrade to R1
Post by: newsense on July 17, 2024, 09:24:01 AM
Managed to get this for now...



<118>Root file system: zroot/ROOT/24.7.r1-b15Kernel
<118>Wed Jul 17 05:44:21 GMT 2024
<118>
<118>*** OPNsense.localdomain: OPNsense 24.7.r1 ***
<118>
...........................

kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0x20
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80c1e520
stack pointer         = 0x28:0xfffffe0109632df0
frame pointer         = 0x28:0xfffffe0109632e00
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 7 (pf purge)
rdi: 0000000000000000 rsi: 0000000000000000 rdx: fffff8000906a000
rcx: fffff8000906a000  r8: ffffffff827e0490  r9: 0000000000000014
rax: 0000000000000000 rbx: 0000000000000000 rbp: fffffe0109632e00
r10: fffff801c8cae840 r11: 000000007ffc94e4 r12: 0000000000000000
r13: fffff8000906a000 r14: 0000000000000000 r15: 0000000000016d25
trap number = 12
panic: page fault
cpuid = 1
time = 1721195385
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0109632ae0
vpanic() at vpanic+0x131/frame 0xfffffe0109632c10
panic() at panic+0x43/frame 0xfffffe0109632c70
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe0109632cd0
trap_pfault() at trap_pfault+0x46/frame 0xfffffe0109632d20
calltrap() at calltrap+0x8/frame 0xfffffe0109632d20
--- trap 0xc, rip = 0xffffffff80c1e520, rsp = 0xfffffe0109632df0, rbp = 0xfffffe0109632e00 ---
turnstile_broadcast() at turnstile_broadcast+0x40/frame 0xfffffe0109632e00
__mtx_unlock_sleep() at __mtx_unlock_sleep+0x73/frame 0xfffffe0109632e30
pf_unlink_state() at pf_unlink_state+0x338/frame 0xfffffe0109632e70
pf_purge_expired_states() at pf_purge_expired_states+0x178/frame 0xfffffe0109632ec0
pf_purge_thread() at pf_purge_thread+0x13b/frame 0xfffffe0109632ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe0109632f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0109632f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
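
To check whether two reports describe the same panic, it can help to compare only the function names after the trap frame and ignore the addresses. The sketch below does that for the frames from the first post and from this one. The function name `panic_signature`, the `depth` parameter, and the choice of three frames are assumptions made for illustration, not an existing tool.

```python
import re

FRAME_RE = re.compile(r"^(\w+)\(\) at ")

def panic_signature(backtrace, depth=3):
    """Return the first `depth` function names after the trap marker."""
    lines = backtrace.splitlines()
    # Frames before the first '--- trap' line belong to the panic handler.
    for i, line in enumerate(lines):
        if line.startswith("--- trap"):
            lines = lines[i + 1:]
            break
    names = []
    for line in lines:
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(1))
    return tuple(names[:depth])

# Frames from the first post (24.7.r1 kernel).
FIRST = """\
calltrap() at calltrap+0x8/frame 0xffffffff82841d20
--- trap 0xc, rip = 0xffffffff80c1dfd0, rsp = 0xffffffff82841df0, rbp = 0xffffffff82841e00 ---
turnstile_broadcast() at turnstile_broadcast+0x40/frame 0xffffffff82841e00
__mtx_unlock_sleep() at __mtx_unlock_sleep+0x73/frame 0xffffffff82841e30
pf_unlink_state() at pf_unlink_state+0x338/frame 0xffffffff82841e70
"""

# Frames from this post (24.7.b_15 kernel); only the addresses differ.
SECOND = """\
calltrap() at calltrap+0x8/frame 0xfffffe0109632d20
--- trap 0xc, rip = 0xffffffff80c1e520, rsp = 0xfffffe0109632df0, rbp = 0xfffffe0109632e00 ---
turnstile_broadcast() at turnstile_broadcast+0x40/frame 0xfffffe0109632e00
__mtx_unlock_sleep() at __mtx_unlock_sleep+0x73/frame 0xfffffe0109632e30
pf_unlink_state() at pf_unlink_state+0x338/frame 0xfffffe0109632e70
"""

print(panic_signature(FIRST))
print(panic_signature(FIRST) == panic_signature(SECOND))
```

Both traces reduce to the same three names, which is why this report counts as the same pf purge panic even though the instruction and stack pointers differ between kernels.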
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 17, 2024, 09:34:45 AM
Ok great, next one is:

# opnsense-update -zkr 24.7.b_7
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 17, 2024, 10:16:20 AM
I'm joining the fun. Just had two of those crashes in a row after working for at least 20 hours straight. Can't say for _7 right now, but installed it too.


Cheers,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: newsense on July 17, 2024, 10:17:36 AM
Testing b_7, uptime 20 minutes on one FW, but as I mentioned this crash seems random and not always immediately after boot.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 17, 2024, 10:24:27 AM
Yep, the worst part is actually knowing it's "good" because the bug might just be hiding ;)


Cheers,
Franco
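
One way to put a rough number on that worry is to assume, purely for illustration, that panics on a bad kernel arrive as a Poisson process with a constant average rate. Neither the model nor the rate below comes from the thread; the function names are hypothetical.

```python
import math

def prob_no_crash(rate_per_hour, hours):
    """P(no panic within `hours`) on a bad kernel, under a Poisson assumption."""
    return math.exp(-rate_per_hour * hours)

def hours_for_confidence(rate_per_hour, confidence):
    """Uptime after which a bad kernel would have crashed with probability `confidence`."""
    return -math.log(1.0 - confidence) / rate_per_hour

# Hypothetical rate: one panic every 4 hours on average.
rate = 1 / 4
print(round(prob_no_crash(rate, 1), 3))             # chance a bad kernel survives 1 hour
print(round(hours_for_confidence(rate, 0.95), 1))   # hours of uptime for 95% confidence
```

With that made-up rate, a bad kernel would still survive a one-hour test most of the time. Real rates clearly vary: one box in this thread ran for at least 20 hours before its first panic, so short uptimes say little about whether a kernel is good.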
Title: Re: Kernel panics after upgrade to R1
Post by: newsense on July 17, 2024, 11:21:54 AM
_15 was really broken on the 3rd FW, couldn't curl the kernel, opnsense-patch would timeout eventually complaining it cannot verify the sig - which was bonkers.

Managed to winscp the kernel and sig file and then I installed it with -zkr 24.7.b_7 -l /foldername and will see what happens.

The other two are happy for now on _7, with 60 and 80 minutes of uptime respectively.


Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 17, 2024, 12:11:20 PM
Let's just try this one:

# opnsense-update -zkr 24.7.r1_2

I placed a bet...


Cheers,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: computeralex92 on July 17, 2024, 12:28:48 PM
Quote from: franco on July 17, 2024, 12:11:20 PM
Let's just try this one:

# opnsense-update -zkr 24.7.r1_2

I placed a bet...


Cheers,
Franco

Just installed it, until now no problems or panic.
Title: Re: Kernel panics after upgrade to R1
Post by: Vasco on July 17, 2024, 12:35:06 PM
Quote from: franco on July 17, 2024, 12:11:20 PM
Let's just try this one:

# opnsense-update -zkr 24.7.r1_2

I placed a bet...


Cheers,
Franco

Sorry for partially missing the tests. Submitted a crash report after boot with this one but didn't see a "panic" in dmesg.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 17, 2024, 02:53:13 PM
I've replaced the original kernel by including this https://github.com/opnsense/src/commit/de60ffe06fd6

It may or may not be the right one, but it looks promising and I want to avoid people catching the bad one as best we can.


Cheers,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: newsense on July 17, 2024, 06:26:31 PM
Just finished moving all 3 boxes to r1_2.

b_7 ran on all 3 with ~8 hours of uptime.
Title: Re: Kernel panics after upgrade to R1
Post by: planetf1 on July 17, 2024, 09:29:19 PM
Just to say I've not had any kernel panics with the initial 24.7.r1 - uptime 14 hours
n100 miniPC running Proxmox 8.2.4 16GB ram
opnsense 24.7.r1 in an 8GB VM with 1xintel i226v passthrough (wan) and 1 proxmox/linux bridge
Connection is pppoe, dual stack ipv4/v6
Simple config - using unbound, suricata (lan), crowdsec

All is 'just working'. Not seeing any unexpected kernel issues.

Nice job :-)
Title: Re: Kernel panics after upgrade to R1
Post by: newsense on July 17, 2024, 09:37:34 PM
The kernel panics only happened on bare metal, virtualized worked ok.

Just a heads up for the other kernel testers, if you're on r1_2  from snapshots and check for updates the 24.7.r1 kernel will be installed and cause a reboot just because of the name change, otherwise it is the same kernel.


14.1-RELEASE-p2 FreeBSD 14.1-RELEASE-p2 stable/24.7-n267750-de60ffe06fd6 SMP amd64

Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 17, 2024, 10:12:22 PM
de60ffe06fd6 is the relevant part of the hotfixed kernel, yep

Since _15 was bad and _7 was good it was just a matter of an educated guess and it looks like we found it. Thanks all for the help!


Cheers,
Franco
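
The search described here is a bisection over kernel snapshots. A minimal sketch follows, assuming the numeric suffix in names like `24.7.b_15` counts commits after the beta kernel. The builds reported earlier, n267717 for 24.7.b and n267732 for 24.7.b_15, differ by exactly 15, which supports that reading but does not prove it. The function names and the culprit offset in the last line are hypothetical.

```python
def next_candidate(good, bad):
    """Midpoint between the last known-good and first known-bad offsets."""
    if bad - good <= 1:
        return None  # adjacent: `bad` is the first bad commit
    return (good + bad) // 2

def bisect(good, bad, is_bad):
    """Narrow (good, bad) until adjacent; `is_bad(n)` reports a test result."""
    while (mid := next_candidate(good, bad)) is not None:
        if is_bad(mid):
            bad = mid
        else:
            good = mid
    return bad

# State reported in the thread: 24.7.b (offset 0) good, 24.7.b_15 bad.
print(next_candidate(0, 15))   # 7, matching the b_7 kernel tested next
print(next_candidate(7, 15))   # 11, the step the educated guess skipped

# Hypothetical culprit at offset 12, only to exercise the loop.
print(bisect(0, 15, lambda n: n >= 12))  # 12
```

With an intermittent panic, every "good" verdict is only probabilistic, so each step needs enough uptime to be trusted. Reading the remaining commits between _7 and _15 and picking a likely candidate, as was done here, shortcuts the last few steps.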
Title: Re: Kernel panics after upgrade to R1
Post by: computeralex92 on July 18, 2024, 08:04:58 AM
The new kernel is working for me without any issue.
Thanks all for testing and debugging this problem!

Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: franco on July 18, 2024, 08:10:16 AM
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279899#c15

Of course, what looks like a proper fix found its way to FreeBSD's stable/14 branch yesterday afternoon ;)


Cheers,
Franco
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: ProximusAl on July 18, 2024, 08:26:57 AM
I love your comment on there Franco.......

A man after my own heart!!!

Excellent job!
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: franco on July 18, 2024, 08:44:27 AM
Thanks, but it's safe to assume the people that matter in this won't appreciate the candidness. Still how does the old saying go? "Do good things and talk about it" is what I'd like to see.

Here's an amended kernel with the proper fix. I also have it on my box so fingers crossed.

# opnsense-update -zkr 24.7.r1_5


Cheers,
Franco
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: newsense on July 18, 2024, 10:02:42 AM
Moved the fleet to the 5th amendment
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: newsense on July 18, 2024, 10:40:12 AM
With regard to the bug, and after reading the thread on bugs.freebsd.org, I still can't say I understand why it appeared to work just fine on multiple virtual environments but triggered relatively quickly on bare metal... Given its nature I would have expected a similar and consistent crash regardless of where it was running.
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: Seimus on July 18, 2024, 11:08:35 AM
Quote from: franco on July 18, 2024, 08:44:27 AM
Thanks, but it's safe to assume the people that matter in this won't appreciate the candidness. Still how does the old saying go? "Do good things and talk about it" is what I'd like to see.

Here's an amended kernel with the proper fix. I also have it on my box so fingers crossed.

# opnsense-update -zkr 24.7.r1_5


Cheers,
Franco

There is as well >
Quote
"Karma is extremely efficient, if one is extremely patient"

Many thanks Franco for taking care of this!

Regards,
S.
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: franco on July 18, 2024, 11:12:43 AM
There is room for locking-related issues in pf state handling, especially since it's actively being worked on (and I've seen a number of fixes that confirm this). A mildly related change just exposed this one by allowing a previously untaken path to break it, but it could also mean there are more of these issues in other places still. Whether they manifest only on hardware, or due to specific traffic patterns or configuration, or through plain race conditions between the state cleanup kernel thread and active state handling, is unclear.


Cheers,
Franco
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: almodovaris on July 18, 2024, 03:45:11 PM
Probably the ones with Intel Ethernet adapters reported no crashes. I have Realtek. I installed kernel 24.7.r1_7 and it crashed the moment I started a computer on the LAN side. Maybe it does not like Zenarmor blocking some website.
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: csutcliff on July 18, 2024, 04:17:11 PM
I have Intel NICs and it is still crashing every few hours with the _5 kernel. (Sent crash reports.)
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: almodovaris on July 18, 2024, 04:19:12 PM
Yup, I sent two crash reports, one with _5 and the other _7. Or so I think, since I had bectl-ed beforehand to 24.1 stable before sending the crash reports.
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: franco on July 18, 2024, 04:55:45 PM
I haven't seen any crash report with the particular stack trace today matching any of _2, _5 or _7 so far. Also no crash on my main production box.


Cheers,
Franco
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: almodovaris on July 18, 2024, 05:12:59 PM
Yup, 24.7 did not notice the crash. But bectl-ing to 24.1 and rebooting did see a crash (twice). I don't know if it can see the crash from another bectl.
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: csutcliff on July 18, 2024, 06:19:57 PM
Quote from: franco on July 18, 2024, 04:55:45 PM
I haven't seen any crash report with the particular stack trace today matching any of _2, _5 or _7 so far. Also no crash on my main production box.


Cheers,
Franco

I've sent two, one on _5 and one on _7. Have no idea if they made it to you since there is no feedback after sending. I did wait until the wan was up before submitting (since 24.7 the pppoe connection takes a few minutes to come up after reboot)

Edit: just realised you are meaning there is nothing matching this specific crash.
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: franco on July 18, 2024, 07:20:27 PM
Quote from: almodovaris on July 18, 2024, 05:12:59 PM
Yup, 24.7 did not notice the crash. But bectl-ing to 24.1 and rebooting did see a crash (twice). I don't know if it can see the crash from another bectl.

Not sure about 24.1? We were trying to find the regression between the 24.7.b and 24.7.r1 kernels, so 24.1.x kernels are very far away from this (FreeBSD 13 vs. 14).


Cheers,
Franco
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: franco on July 18, 2024, 07:22:08 PM
Quote from: csutcliff on July 18, 2024, 06:19:57 PM
Edit: just realised you are meaning there is nothing matching this specific crash.

Yes, just keep sending if you see one and I'll recheck later. The latest test kernel is

# opnsense-update -zkr 24.7.r1_7

Which may help with two other panics seen before on the 24.7.b kernels.


Cheers,
Franco
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: computeralex92 on July 18, 2024, 07:34:41 PM
Sorry that I was not able to test the kernels today, but now I'm back with kernel 24.7.r1_7...
No panic after reboot; let's see how it is performing.

Regarding the NIC topic:
I'm running on an Intel N100 with Intel I226 NICs.
Title: Re: [SOLVED] Kernel panics after upgrade to R1
Post by: almodovaris on July 18, 2024, 08:32:58 PM
Quote from: franco on July 18, 2024, 07:22:08 PM
Yes, just keep sending if you see one and I'll recheck later. The latest test kernel is
If 24.1 can see the crash from 24.7, then both crashes are from 24.7. But, again, I don't know if it can report the crashes from another bectl.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 18, 2024, 10:08:08 PM
Hmm, ok but that makes searching for these hard because I'm pre-filtering for 24.7 user agent string.
Title: Re: Kernel panics after upgrade to R1
Post by: newsense on July 18, 2024, 10:24:23 PM
Only noticed r1_7 about 75 minutes ago, applied on the 3 FWs and working fine so far from a crashing perspective
Title: Re: Kernel panics after upgrade to R1
Post by: almodovaris on July 18, 2024, 11:07:37 PM
Reported by icnl at home dot nl.

The bectl with 24.7 crashed twice. The bectl with 24.1 filled the crash reports. AFAIK 24.1 did not crash, ever. It's a fairly new installation (two days old).

But, okay, it can have misleading data about the installed software.
Title: Re: Kernel panics after upgrade to R1
Post by: csutcliff on July 18, 2024, 11:34:19 PM
just sent another crash report for 24.7.r1_7
Title: Re: Kernel panics after upgrade to R1
Post by: almodovaris on July 19, 2024, 02:01:11 AM
And, yup, if the bectl with 24.1 cannot see the crash from another bectl, I have no idea why it prompted me to send the crash reports.
Title: Re: Kernel panics after upgrade to R1
Post by: newsense on July 19, 2024, 07:44:31 AM
Quote from: csutcliff on July 18, 2024, 11:34:19 PM
just sent another crash report for 24.7.r1_7

Just to make sure we're on the same page here, there can be crashes in programs that you can report from the GUI, restart said program and everything else on the FW continues working normally. The OPNsense team receives the crash reports and the issue is fixed one way or another and available shortly in an update.


This thread is about kernel panics on 24.7.r1 and the OS being rebooted automatically --  of which I had none for the last few kernels I tested on 3 FWs.

Uptime on 24.7.r1_7 is now over 10 hours.



Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 19, 2024, 10:54:20 AM
Yes there were a few PHP crash reports as well and we fixed them where we could.

It looks like the pf state unlink stuff works fine now but there's still a strange panic so we will go ahead with RC2 and I've also uploaded a debug kernel to work on the remaining panic because there's nothing that sticks out in the code about this one (ip6_input() related).


Cheers,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: gtwop on July 19, 2024, 12:03:05 PM
Installed RC2. Lobby: Dashboard is blank no info at all.
Title: Re: Kernel panics after upgrade to R1
Post by: gtwop on July 19, 2024, 12:21:14 PM
Here is a screenshot.

Other than that, all the rest is functioning properly.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 19, 2024, 12:47:18 PM
I wouldn't call it blank but I also wouldn't want to start discussing this in a kernel panic thread. Thanks.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 19, 2024, 02:02:06 PM
So now we're at RC2. I'm hoping someone running into the ip6_input() panic will try the associated debug kernel to be able to share a core dump.

Only use the command if you are sure about your panic:

# opnsense-update -kr dbg-24.7.r2

The debug kernel will be detected by the reboot and configure itself to produce a core dump instead of a text dump. The dump files will not submit due to their size, so putting them on a file share would be the best option for us to grab it.


Thanks,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: csutcliff on July 19, 2024, 02:02:59 PM
Quote from: newsense on July 19, 2024, 07:44:31 AM
Quote from: csutcliff on July 18, 2024, 11:34:19 PM
just sent another crash report for 24.7.r1_7

Just to make sure we're on the same page here, there can be crashes in programs that you can report from the GUI, restart said program and everything else on the FW continues working normally. The OPNsense team receives the crash reports and the issue is fixed one way or another and available shortly in an update.


This thread is about kernel panics on 24.7.r1 and the OS being rebooted automatically --  of which I had none for the last few kernels I tested on 3 FWs.

Uptime on 24.7.r1_7 is now over 10 hours.

Yes I'm talking about kernel crashes where it dumps page after page of debug into the screen and reboots, I'm submitting the reports it has generated after reboot which do include kernel dump info etc. had 2 kernel crashes with _7 yesterday but none so far today.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 19, 2024, 02:35:33 PM
csutcliff, you are our only hope!

You're a prime candidate for the RC2 debug kernel since you have that non-obvious ip6_input() crash.


Cheers,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: Patrick M. Hausen on July 19, 2024, 02:37:51 PM
Help us, Obi-Franco Kenobi  :)
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 19, 2024, 02:55:45 PM
How about that weekend ear worm? https://www.youtube.com/watch?v=AYMlad4e3Q4
Title: Re: Kernel panics after upgrade to R1
Post by: Patrick M. Hausen on July 19, 2024, 02:59:13 PM
I have a bad feeling about this ...
Title: Re: Kernel panics after upgrade to R1
Post by: csutcliff on July 19, 2024, 03:14:14 PM
Quote from: franco on July 19, 2024, 02:35:33 PM
csutcliff, you are our only hope!

You're a prime candidate for the RC2 debug kernel since you have that non-obvious ip6_input() crash.


Cheers,
Franco

Thank you, I wasn't sure if I was "the one" since I didn't save the output from the crashes, only submitted it to you

I'm already on the rc2 but I'll install that debug kernel now.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 19, 2024, 06:01:01 PM
Thanks. Don't forget to reboot before waiting for the crash. :)


Cheers,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: danderson on July 19, 2024, 09:11:34 PM
Just happened on RC2.  Fresh install via image and restored config.

Fatal trap 12: page fault while in kernel mode


cpuid = 5; apic id = 05
fault virtual address   = 0x0
fault code      = supervisor read data, page not present
instruction pointer   = 0x20:0xffffffff80ddaf27
stack pointer           = 0x28:0xfffffe00e334fbe0
frame pointer           = 0x28:0xfffffe00e334fd10
code segment      = base 0x0, limit 0xfffff, type 0x1b
         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags   = interrupt enabled, resume, IOPL = 0
current process      = 12 (swi1: netisr 5)
rdi: fffff801c527a300 rsi: fffff80236ff5b00 rdx: fffff8042e3e2800

Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 06
fault virtual address   = 0x0
fault code      = supervisor read data, page not present
instruction pointer   = 0x20:0xffffffff80ddaf27
stack pointer           = 0x28:0xfffffe00e334abe0
frame pointer           = 0x28:0xfffffe00e334ad10
code segment      = base 0x0, limit 0xfffff, type 0x1b
         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags   = interrupt enabled, resume, IOPL = 0
current process      = 12 (swi1: netisr 6)
rdi: fffff801c527a300 rsi: fffff8023b6af040 rdx: fffff803b289a000
rcx: fffffe00b5d0f240  r8: 000000000000006b  r9: 3232395231a4ebd5
rax: 0000000000000000 rbx: fffff80001a73740 rbp: fffffe00e334ad10
r10: fffff80001a73740 r11: fffffe00e334a570 r12: fffff801c5621782
r13: fffff801c562179a r14: fffffe00e334abfc r15: fffff80017874800
trap number      = 12
panic: page fault
cpuid = 6
time = 1721416041
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00e334a8d0
vpanic() at vpanic+0x131/frame 0xfffffe00e334aa00
panic() at panic+0x43/frame 0xfffffe00e334aa60
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00e334aac0
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00e334ab10
calltrap() at calltrap+0x8/frame 0xfffffe00e334ab10
--- trap 0xc, rip = 0xffffffff80ddaf27, rsp = 0xfffffe00e334abe0, rbp = 0xfffffe00e334ad10 ---
ip6_forward() at ip6_forward+0x2a7/frame 0xfffffe00e334ad10
ip6_input() at ip6_input+0x11f/frame 0xfffffe00e334adf0
swi_net() at swi_net+0x138/frame 0xfffffe00e334ae60
ithread_loop() at ithread_loop+0x257/frame 0xfffffe00e334aef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe00e334af30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e334af30
--- trap 0x4d3efdb8, rip = 0xaba227f72fb510cb, rsp = 0x53c53af5127eeb0e, rbp = 0xec1658a0e86e8d54 ---
KDB: enter: panic
panic.txt: page fault
version.txt: FreeBSD 14.1-RELEASE-p2 stable/24.7-n267755-f257b8d7e144 SMP
Title: Re: Kernel panics after upgrade to R1
Post by: danderson on July 19, 2024, 09:20:18 PM
I have an IPv6 rule that was forwarding to a remote IPv6 address; as soon as I disabled that rule, it seems to have stopped crashing.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 19, 2024, 09:58:25 PM
@danderson Looking for a core dump using the debug kernel for this if you can help out as well.


Thanks,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: danderson on July 19, 2024, 10:17:19 PM
@franco

OK, I installed the debug kernel and then rebooted, then enabled my IPv6 forward rule and made it crash. In the crash report I see "/var/crash/kernel.0: File too big to process. It will not be submitted automatically." What file(s) do you want me to grab?

I'll put them on my OneDrive and send you a link at franco@opnsense.org, if I remember the address correctly.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 19, 2024, 10:22:24 PM
Yes, email is correct. Only need the kernel.0, splendid! :)
Title: Re: Kernel panics after upgrade to R1
Post by: danderson on July 19, 2024, 10:31:56 PM
email with link sent.
Title: Re: Kernel panics after upgrade to R1
Post by: csutcliff on July 20, 2024, 01:18:21 AM
had my first crash with rc2 (debug kernel), submitted the report and put a link to the kernel.0 in the notes.
Title: Re: Kernel panics after upgrade to R1
Post by: newsense on July 20, 2024, 03:32:59 AM
10+ hours uptime on r2 here, on all FWs.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 20, 2024, 06:33:37 PM
It was pretty late, kernel.0 is the wrong file... need the vmcore.0 instead. Sorry.


Cheers,
Franco
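
For anyone collecting the dump, here is a small sketch that picks the memory images (`vmcore.N`) out of a crash directory listing and skips `kernel.N` and other files. Only the names `kernel.0` and `vmcore.0` and the `/var/crash` path come from this thread; the function names and the sample listing are assumptions.

```python
import re
from pathlib import Path

VMCORE_RE = re.compile(r"^vmcore\.(\d+)$")

def find_vmcores(names):
    """Return vmcore file names sorted by dump number, newest last."""
    hits = []
    for name in names:
        match = VMCORE_RE.match(name)
        if match:
            hits.append((int(match.group(1)), name))
    return [name for _, name in sorted(hits)]

def list_crash_dir(path="/var/crash"):
    """List vmcore files in a crash directory; empty if it does not exist."""
    directory = Path(path)
    if not directory.is_dir():
        return []
    return find_vmcores(p.name for p in directory.iterdir())

# Hypothetical listing, for illustration only.
sample = ["kernel.0", "vmcore.0", "info.0", "vmcore.1", "minfree"]
print(find_vmcores(sample))  # ['vmcore.0', 'vmcore.1']
```

Calling `list_crash_dir()` on the firewall itself would apply the same filter to the real directory.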
Title: Re: Kernel panics after upgrade to R1
Post by: danderson on July 20, 2024, 07:11:52 PM
@franco

vmcore.0 file shared via link in your inbox now.
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 20, 2024, 09:36:30 PM
So danderson's report was about https://github.com/opnsense/src/commit/9cb6d71f6a

There may be one more, but it would be easier to base that work on a new kernel build on Monday, which incorporates the above commit.


Cheers,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 22, 2024, 09:11:53 AM
@danderson @csutcliff and anybody else who would like to help:

# opnsense-update -zkr 24.7.r2_2


Cheers,
Franco
Title: Re: Kernel panics after upgrade to R1
Post by: Seimus on July 22, 2024, 10:04:29 AM
Quote from: franco on July 22, 2024, 09:11:53 AM
@danderson @csutcliff and anybody else who would like to help:

# opnsense-update -zkr 24.7.r2_2


Cheers,
Franco

Even if this is not worth much (as the crashes so far were on bare metal only), I am running this on an OPNsense VM. So far all good.
Title: Re: Kernel panics after upgrade to R1
Post by: newsense on July 22, 2024, 10:17:54 AM
I have it running on 2 FWs, so far so good.

Can't test on the others as I lost access there due to a Zerotier issue that seems to have been introduced in RC1/RC2 - sent you an email about it.
Title: Re: Kernel panics after upgrade to R1
Post by: danderson on July 22, 2024, 02:37:21 PM
@franco

updated kernel and rebooted, did the same steps previously done to cause a crash and no crash this time. I'll keep running this kernel for the day unless you want us to try the 24.7.r2_3 kernel.

Quote from: franco on July 22, 2024, 09:11:53 AM
@danderson @csutcliff and anybody else who would like to help:

# opnsense-update -zkr 24.7.r2_2


Cheers,
Franco

Title: Re: Kernel panics after upgrade to R1
Post by: franco on July 22, 2024, 03:15:54 PM
Yay, thanks. The _3 is just for OpenVPN DCO.


Cheers,
Franco