Messages - opn_leo

#1
Still getting weekly kernel panics, dmesg from most recent crash attached. 


#2
Quote from: Monviech (Cedrik) on January 20, 2025, 03:49:36 PM
In what environment is this unit deployed? It's a rack mount unit so I imagine it's installed in a rack?

Rack mounted

Quote: Is the grounding pin of the appliance connected to the rack?

Yes, the grounding pin is connected, and the unit is powered by a properly grounded rack-mount PDU.

Quote: Is there a UPS used?

Yes, the PDU is plugged into a UPS. All other devices on the UPS are functioning fine.

Quote: Are there any temperature issues? Fan working?

Yes, the fan is working. There are no temperature issues with the DEC2750; I check the temperature health via the GUI and everything is very stable.
#3
Disk only 1% used, plenty of space
#4
Quote from: newsense on January 20, 2025, 02:31:29 AM
Quote from: opn_leo on January 19, 2025, 02:35:05 PM
.....
Does anyone have any suggestions? I sent the logs to OPNsense via the issue reporting dialog. This is the only firewall I use at home, and dropping internet every 12 hours (and having to power cycle) is a real bummer.

Can you post the output of this command please:

  ls -ltrh /var/crash


That is an empty directory

#5
I own an OPNsense DEC-2750. I was on a previous version of OPNsense Business Edition for about a year without issues. I upgraded to 24.10.1 a couple of weeks ago and started getting daily kernel panics and crashes/reboots. Some of the crashes required power cycling the unit to recover.

After reading some forum posts, I tried reinstalling the kernel and syslog-ng. This seemed to fix the issue for about a week, but I just had another hard crash requiring a power cycle. I connected a serial console to monitor the DEC-2750, and it's a solid stream of these messages until I reboot.

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0x1
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80c0af1
stack pointer           = 0x28:0xfffffe000f6bc530
frame pointer           = 0x28:0xfffffe000f6bc630
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 31708 (python3.11)
kernel trap 12 with interrupts disabled
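
For anyone who wants to compare several of these dumps (for example, to see whether the faulting process or address is the same each time), here is a minimal Python sketch. It assumes the serial console output has been saved as text; `parse_trap` is a hypothetical helper name, and the parsing relies only on the "key = value" layout visible in the dump above, not on any official format description.

```python
import re

# Matches "key = value" lines as they appear in the trap output above.
# Continuation lines that start with "=" (no key) are deliberately not matched.
FIELD_RE = re.compile(r"^\s*([a-z][a-z ]*?)\s*=\s*(.+?)\s*$")


def parse_trap(text):
    """Collect the fields of one 'Fatal trap' block into a dict.

    Hypothetical helper for illustration only.
    """
    fields = {}
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Fatal trap"):
            # Keep the headline (trap number and description) as-is.
            fields["trap"] = stripped
            continue
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields
```

Passing the text of one dump to `parse_trap` returns entries such as `"current process"` and `"fault virtual address"`, which can then be compared between crashes.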

Does anyone have any suggestions? I sent the logs to OPNsense via the issue reporting dialog. This is the only firewall I use at home, and dropping internet every 12 hours (and having to power cycle) is a real bummer.
#6
I wanted to post a message in the forum in case anyone else has this issue.  I have resolved the problem after a weekend of trial and error.  These forums were very helpful in the troubleshooting process!

I have a DEC2750 purchased in April 2023 running the -24 BIOS version. I was running 23.10.2 Business Edition for many months and then last weekend decided to upgrade to 23.10.3. After upgrading, the system would crash randomly every several hours. I searched the logs and there were no corresponding error messages, just reboot messages all of a sudden. I first tried disabling Unbound because there are several threads about it causing random reboots, but that did not fix the issue.

I eventually decided to update to the -28 BIOS version as a last resort. During the BIOS install, the updater got stuck at 98%. I let the system sit for an hour, hoping it would get to 100%, but it never did. I was worried I had bricked my DEC2750, but upon reboot everything appeared fine and it was running the new -28 BIOS version.

After updating the BIOS, the random reboots stopped. I re-enabled Unbound and it has been very stable ever since. If you're having random issues with DEC hardware, see if there is a BIOS update!