I own an OPNsense DEC-2750. I was on a previous version of OPNsense Business Edition for about a year without issues. I upgraded to 24.10.1 a couple of weeks ago and started getting daily kernel panics and crashes/reboots. Some of the crashes required power cycling the unit to recover.
After reading some forum posts, I tried reinstalling the kernel and syslog-ng. This seemed to fix the issue for about a week, but I just had another hard crash requiring a power cycle. I connected a serial console to monitor the DEC-2750, and it's a solid stream of these messages until I reboot:
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address = 0x1
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80c0af1
stack pointer = 0x28:0xfffffe000f6bc530
frame pointer = 0x28:0xfffffe000f6bc630
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 31708 (python3.11)
kernel trap 12 with interrupts disabled
Does anyone have any suggestions? I sent the logs to OPNsense via the issue reporting dialog. This is the only firewall I use at home, and dropping internet every 12 hours (and having to power cycle) is a real bummer.
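Since the panic text only shows up on the serial console here, it may help to save the console session to a file and compare the key fields from one crash to the next (for example, whether the current process is always python3.11). The following is a minimal sketch, not an official OPNsense tool: the function names and the log file path are hypothetical, and it assumes the capture contains lines shaped like the ones posted above.

```shell
#!/bin/sh
# Summarize panic records from a saved serial console capture.
# Assumption: the file given as $1 is whatever your terminal program logged.

# Print only the fields that are useful when comparing crashes.
summarize_panic() {
    log="$1"
    # Serial captures often carry CR line endings, so strip them first.
    tr -d '\r' < "$log" |
        grep -E '^(Fatal trap|fault virtual address|fault code|instruction pointer|current process)'
}

# Count how often each process name appears as the "current process".
panic_processes() {
    log="$1"
    # "current process = 31708 (python3.11)" becomes "python3.11".
    tr -d '\r' < "$log" |
        sed -n 's/^current process[^(]*(\(.*\))[[:space:]]*$/\1/p' |
        sort | uniq -c | awk '{print $1, $2}'
}
```

Usage would look like `summarize_panic console.log` and `panic_processes console.log`, where `console.log` is a placeholder for your own capture file.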
I'm on the 670 on the latest business version with 42 days uptime.
Hopefully someone can help find the issue.
Same here, 24.10.1 on a DEC750 and at this moment 53 days uptime ...
Quote from: opn_leo on January 19, 2025, 02:35:05 PM.....
Anyone have any suggestions. Sent the logs to Opnsense via the issue reporting dialog. This is the only firewall I use at home and dropping internet every 12 hours (and having to power cycle) is a real bummer.
Can you post the output of this command please:
ls -ltrh /var/crash
Quote from: newsense on January 20, 2025, 02:31:29 AM
Can you post the output of this command please:
ls -ltrh /var/crash
That is an empty directory
Is the disk full? Otherwise, a kernel crash should leave something in that directory.
The disk is only 1% used, so there is plenty of space.
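The checks discussed above (is the crash directory empty, is the disk full) can be bundled into one small script. This is a rough sketch under a few assumptions: the function name `crash_dir_report` is made up for illustration, `/var/crash` is the directory already named in this thread, and the `dumpon -l` step only runs where that FreeBSD utility is present. An empty directory with plenty of free space may point to no dump device being configured, but that is a guess to verify, not a diagnosis.

```shell
#!/bin/sh
# Report what is in a crash directory and how full its filesystem is.
# Defaults to /var/crash, the directory checked earlier in the thread.
crash_dir_report() {
    dir="${1:-/var/crash}"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # Count regular files such as vmcore.N or info.N written by savecore(8).
    count=$(find "$dir" -type f | wc -l | tr -d ' ')
    echo "files: $count"
    # Show usage of the filesystem that holds the directory.
    df -h "$dir"
    # On FreeBSD, dumpon -l prints the configured dump device;
    # /dev/null means none is set, so no dump can be saved.
    if command -v dumpon >/dev/null 2>&1; then
        echo "dump device: $(dumpon -l)"
    fi
}
```

Running `crash_dir_report` with no argument checks `/var/crash`; pass another path to inspect a different directory.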
In what environment is this unit deployed? It's a rack mount unit so I imagine it's installed in a rack?
Is the grounding pin of the appliance connected to the rack?
Is there a UPS used?
Are there any temperature issues? Fan working?
Quote from: Monviech (Cedrik) on January 20, 2025, 03:49:36 PM
In what environment is this unit deployed? It's a rack mount unit so I imagine it's installed in a rack?
Rack mounted
Quote: Is the grounding pin of the appliance connected to the rack?
Yes, grounding pin used and powered by properly grounded rack mount PDU
Quote: Is there a UPS used?
Yes, PDU is plugged into UPS. All other devices on UPS are functioning fine
Quote: Are there any temperature issues? Fan working?
Yes, the fan is working. There are no issues with the temperature of the DEC2750. I check the temperature health via the GUI and everything is very stable.
Still getting weekly kernel panics, dmesg from most recent crash attached.