Crashes after update to 24.10.2 business edition

Started by martin87, February 07, 2025, 07:33:50 AM

After updating my DEC2752 to 24.10.2, my firewall crashes with this panic:

db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00be38a380
vpanic() at vpanic+0x131/frame 0xfffffe00be38a4b0
panic() at panic+0x43/frame 0xfffffe00be38a510
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00be38a570
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00be38a5c0
calltrap() at calltrap+0x8/frame 0xfffffe00be38a5c0
--- trap 0xc, rip = 0xffffffff80d053a7, rsp = 0xfffffe00be38a690, rbp = 0xfffffe00be38a6b0 ---
rn_walktree() at rn_walktree+0x77/frame 0xfffffe00be38a6b0
pfr_get_addrs() at pfr_get_addrs+0x122/frame 0xfffffe00be38a710
pfioctl() at pfioctl+0x221e/frame 0xfffffe00be38abf0
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfffffe00be38ac40
vn_ioctl() at vn_ioctl+0xce/frame 0xfffffe00be38acb0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe00be38acd0
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe00be38ad40
sys_ioctl() at sys_ioctl+0xff/frame 0xfffffe00be38ae00
amd64_syscall() at amd64_syscall+0xf9/frame 0xfffffe00be38af30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00be38af30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x23f81cfed5fa, rsp = 0x23f819805568, rbp = 0x23f819805a00 ---
KDB: enter: panic

After this, I rolled back to 24.10.1 with a snapshot, but I am still getting panics:

--- trap 0xc, rip = 0xffffffff80f578ac, rsp = 0xfffffe00b22d6cc0, rbp = 0xfffffe00b22d6cd0 ---
vm_object_terminate() at vm_object_terminate+0xec/frame 0xfffffe00b22d6cd0
vm_object_deallocate() at vm_object_deallocate+0x1ab/frame 0xfffffe00b22d6d10
vm_map_process_deferred() at vm_map_process_deferred+0x92/frame 0xfffffe00b22d6d30
vm_map_remove() at vm_map_remove+0xf9/frame 0xfffffe00b22d6d60
vmspace_exit() at vmspace_exit+0xab/frame 0xfffffe00b22d6d90
exit1() at exit1+0x53a/frame 0xfffffe00b22d6df0
sys_exit() at sys_exit+0xd/frame 0xfffffe00b22d6e00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe00b22d6f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00b22d6f30
--- syscall (1, FreeBSD ELF64, exit), rip = 0x8247780da, rsp = 0x8209ce2b8, rbp = 0x8209ce2d0 ---
KDB: enter: panic
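
For anyone comparing the two panics: in both traces, the frame listed right after the "--- trap 0xc" marker is the kernel function that was running when the fault hit (trap 0xc is a page fault on amd64, which matches the trap_pfault() frame in the first trace). Below is a minimal Python sketch for pulling that frame out of a pasted backtrace. It assumes the trace uses the same line format as the ones above; the function name faulting_frame and the two regexes are my own illustration, not part of any OPNsense or FreeBSD tool.

```python
import re

# One backtrace frame, e.g. "rn_walktree() at rn_walktree+0x77/frame 0xfffffe00be38a6b0"
FRAME_RE = re.compile(r"^(\w+)\(\) at (\w+)\+(0x[0-9a-f]+)/frame (0x[0-9a-f]+)$")
# Trap marker, e.g. "--- trap 0xc, rip = 0xffffffff80d053a7, ..."
TRAP_RE = re.compile(r"^--- trap (0x[0-9a-f]+), rip = (0x[0-9a-f]+)")

# Excerpts copied from the two panics posted above.
FIRST_TRACE = """\
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00be38a5c0
calltrap() at calltrap+0x8/frame 0xfffffe00be38a5c0
--- trap 0xc, rip = 0xffffffff80d053a7, rsp = 0xfffffe00be38a690, rbp = 0xfffffe00be38a6b0 ---
rn_walktree() at rn_walktree+0x77/frame 0xfffffe00be38a6b0
pfr_get_addrs() at pfr_get_addrs+0x122/frame 0xfffffe00be38a710
pfioctl() at pfioctl+0x221e/frame 0xfffffe00be38abf0
"""

SECOND_TRACE = """\
--- trap 0xc, rip = 0xffffffff80f578ac, rsp = 0xfffffe00b22d6cc0, rbp = 0xfffffe00b22d6cd0 ---
vm_object_terminate() at vm_object_terminate+0xec/frame 0xfffffe00b22d6cd0
vm_object_deallocate() at vm_object_deallocate+0x1ab/frame 0xfffffe00b22d6d10
"""


def faulting_frame(trace):
    """Return (trap_number, function, offset) for the first frame that
    follows a trap marker, or None if the trace has no such frame."""
    lines = [line.strip() for line in trace.splitlines()]
    for i, line in enumerate(lines):
        trap = TRAP_RE.match(line)
        if trap is None:
            continue
        # The first frame line after the marker is where the fault occurred.
        for candidate in lines[i + 1:]:
            frame = FRAME_RE.match(candidate)
            if frame is not None:
                return (int(trap.group(1), 16),
                        frame.group(1),
                        int(frame.group(3), 16))
        return None
    return None


if __name__ == "__main__":
    for name, trace in (("first", FIRST_TRACE), ("second", SECOND_TRACE)):
        print(name, faulting_frame(trace))
```

Run against the two traces, this reports rn_walktree for the first panic and vm_object_terminate for the second, so the two crashes fault in different kernel code paths (the pf table lookup versus VM object teardown on process exit).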

I don't know what to do anymore... I had similar problems after updates, see https://forum.opnsense.org/index.php?topic=44514.msg222656#msg222656

I only have these problems with my new DEC2752. My old self-built Xeon appliance doesn't have them.

Does anyone have an idea? I'm already so frustrated.

Thank you




Update:

I think it's a hardware issue. I did a clean install, but the kernel panics are still present.
As a test, I set up my old C2758 board with 24.10.2 and my config, and it has been working fine for 2 days.

I will contact Deciso to ask for help, since I still have warranty.

Can you tell us what plugins you are using? We are investigating instability related to rn_walktree() on FreeBSD 14, which also seems to involve a lot of connections or high packet throughput.


Cheers,
Franco

I'm using os-apcupsd, os-cpu-microcode-amd, os-dmidecode, os-mdns-repeater, os-theme-cicada, os-zabbix-agent.

I am already in contact with your colleagues. One of them suggested trying a clean install without plugins.
I did that today, and now I have to wait and see. But I don't understand why the same configuration runs on another appliance without any errors.

Could it be an AMD processor issue? That's the difference between your two devices, 2752 is AMD.

If I was at a different point, I'd spin up an HP T740 to see if the V1756b processor gives any trouble. I'm still working on unlocking the BIOS on the batch I just bought so it's going to be a bit, waiting for new BIOS chips to arrive so I have things I can sacrifice (if needed). Also need to solder a pogo pin adapter so I don't have to pull the chips every time I want to fool with it.

Maybe I can roll out my old T620+ and give this a check. Is there a way I can download the Business version and install it without using my license key? I don't want things getting confused with that key because we have 2 more years on it.

Quote: Could it be an AMD processor issue? That's the difference between your two devices, 2752 is AMD.

Deciso thinks it's more likely a software issue and that I should do a clean install without the plugins.
My appliance has now been running stable for 24 hours. It's still strange that it runs stable on the Intel device.

I'll probably update mine Wednesday night, too much going on during the days for me to supervise it, have to hope it works when I get in on Thursday morning. It's a Xeon server I pulled out of retirement until they can get me budget for a 2770 (or a new Supermicro server which is going to be about the same price).

The current suspicion is around os-mdns-repeater plugin, which apparently causes a lot of packets and associated states and lookup operations.


Cheers,
Franco

Quote from: franco on February 12, 2025, 08:08:21 AM
The current suspicion is around os-mdns-repeater plugin, which apparently causes a lot of packets and associated states and lookup operations.


Cheers,
Franco

After a clean install and removing all plugins, my device has been running for two days without a crash.

Quote from: franco on February 12, 2025, 08:08:21 AM
The current suspicion is around os-mdns-repeater plugin, which apparently causes a lot of packets and associated states and lookup operations.


Cheers,
Franco

We have a DEC2750 v2, and we use the os-mdns-repeater plugin. Should we hold off on installing 24.10.2 for now?
--
Regards,
   Evert

It depends. This is relevant for all of 24.7 / 24.10 in general as it is the first one running on FreeBSD 14.1.


Cheers,
Franco

I'm currently getting random crashes, but I don't have the os-mdns-repeater plugin. I have noticed that unbound sometimes stopped working before the actual reload.

These are the only plugins I have on my appliance:
os-acme-client (installed)       4.7     787KiB    3   OPNsense   ACME Client
os-ddclient (installed)          1.26    136KiB    3   OPNsense   Dynamic DNS client
os-etpro-telemetry (installed)   1.7_5   50.3KiB   2   OPNsense   ET Pro Telemetry Edition
os-OPNBEcore (installed)         1.4_4   151KiB    1   OPNsense   OPNsense Business Edition add-ons
os-theme-vicuna (installed)      1.48    5.27MiB   3   OPNsense   The vicuna theme - blue sapphire

Update: I have bypassed unbound by sending all DNS requests to my DCs and from there directly to the internet. The appliance has been stable for me ever since.


I see that a hotfix has been released, version 24.10.2_6.

Does this hotfix resolve the issues in this thread?
--
Regards,
   Evert