Messages - martin87

#1
Quote from: franco on February 12, 2025, 08:08:21 AM
The current suspicion is around os-mdns-repeater plugin, which apparently causes a lot of packets and associated states and lookup operations.


Cheers,
Franco

After a clean install and removing all plugins, my device has been running for two days without a crash.
#2
Quote
Could it be an AMD processor issue? That's the difference between your two devices, 2752 is AMD.

Deciso thinks it's more likely a software issue and that I should do a clean install without the plugins.
My appliance has now been running stable for 24 hours. It's still strange that it runs stable on the Intel device.
#3
I'm using os-apcupsd, os-cpu-microcode-amd, os-dmidecode, os-mdns-repeater, os-theme-cicada, os-zabbix-agent.

I am already in contact with one of your colleagues. He gave me the hint to try a clean install without plugins.
I did that today, now I have to wait and see. But I don't understand why the same configuration runs on another appliance without any errors.
#4
Update:

I think it's a hardware issue. I did a clean install, but the kernel panics are still present.
As a test, I set up my old C2758 board with 24.10.2 and my config, and it has been working fine for 2 days.

I will contact Deciso to ask for help, because I still have warranty.
#5
After updating my DEC2752 to 24.10.2, my firewall crashes with this panic:

db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00be38a380
vpanic() at vpanic+0x131/frame 0xfffffe00be38a4b0
panic() at panic+0x43/frame 0xfffffe00be38a510
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00be38a570
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00be38a5c0
calltrap() at calltrap+0x8/frame 0xfffffe00be38a5c0
--- trap 0xc, rip = 0xffffffff80d053a7, rsp = 0xfffffe00be38a690, rbp = 0xfffffe00be38a6b0 ---
rn_walktree() at rn_walktree+0x77/frame 0xfffffe00be38a6b0
pfr_get_addrs() at pfr_get_addrs+0x122/frame 0xfffffe00be38a710
pfioctl() at pfioctl+0x221e/frame 0xfffffe00be38abf0
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfffffe00be38ac40
vn_ioctl() at vn_ioctl+0xce/frame 0xfffffe00be38acb0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe00be38acd0
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe00be38ad40
sys_ioctl() at sys_ioctl+0xff/frame 0xfffffe00be38ae00
amd64_syscall() at amd64_syscall+0xf9/frame 0xfffffe00be38af30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00be38af30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x23f81cfed5fa, rsp = 0x23f819805568, rbp = 0x23f819805a00 ---
KDB: enter: panic

After this, I rolled back to 24.10.1 with a snapshot, but I am still getting panics:

--- trap 0xc, rip = 0xffffffff80f578ac, rsp = 0xfffffe00b22d6cc0, rbp = 0xfffffe00b22d6cd0 ---
vm_object_terminate() at vm_object_terminate+0xec/frame 0xfffffe00b22d6cd0
vm_object_deallocate() at vm_object_deallocate+0x1ab/frame 0xfffffe00b22d6d10
vm_map_process_deferred() at vm_map_process_deferred+0x92/frame 0xfffffe00b22d6d30
vm_map_remove() at vm_map_remove+0xf9/frame 0xfffffe00b22d6d60
vmspace_exit() at vmspace_exit+0xab/frame 0xfffffe00b22d6d90
exit1() at exit1+0x53a/frame 0xfffffe00b22d6df0
sys_exit() at sys_exit+0xd/frame 0xfffffe00b22d6e00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe00b22d6f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00b22d6f30
--- syscall (1, FreeBSD ELF64, exit), rip = 0x8247780da, rsp = 0x8209ce2b8, rbp = 0x8209ce2d0 ---
KDB: enter: panic
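
To compare panics like the two above, it can help to pull out the function that faulted, i.e. the first frame printed after the `--- trap` line. Below is a minimal sketch, assuming the ddb backtrace format shown in this post; the helper name `faulting_function` is hypothetical and not part of any OPNsense or FreeBSD tool.

```python
import re
from typing import Optional

# Matches a ddb backtrace frame such as
# "rn_walktree() at rn_walktree+0x77/frame 0xfffffe00be38a6b0".
FRAME_RE = re.compile(r"^(\w+)\(\) at \w+\+0x[0-9a-f]+/frame 0x[0-9a-f]+$")


def faulting_function(trace: str) -> Optional[str]:
    """Return the function in the first frame after the first '--- trap' line."""
    seen_trap = False
    for line in trace.splitlines():
        line = line.strip()
        if line.startswith("--- trap"):
            seen_trap = True
            continue
        if seen_trap:
            match = FRAME_RE.match(line)
            if match:
                return match.group(1)
    return None


# Excerpt of the first panic in this post.
first_panic = """\
calltrap() at calltrap+0x8/frame 0xfffffe00be38a5c0
--- trap 0xc, rip = 0xffffffff80d053a7, rsp = 0xfffffe00be38a690, rbp = 0xfffffe00be38a6b0 ---
rn_walktree() at rn_walktree+0x77/frame 0xfffffe00be38a6b0
pfr_get_addrs() at pfr_get_addrs+0x122/frame 0xfffffe00be38a710
"""
print(faulting_function(first_panic))  # rn_walktree
```

Run against the second trace, the same helper returns `vm_object_terminate`, which makes it easy to see that the two panics fault in different kernel code paths.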

I don't know what to do anymore... I had similar problems after updates, see https://forum.opnsense.org/index.php?topic=44514.msg222656#msg222656

I only have these problems with my new DEC 2752. My old self-built Xeon appliance doesn't have these problems.

Does anyone have an idea? I'm already so frustrated

Thank you



#6
Quote from: franco on December 11, 2024, 01:31:58 PM
Can we establish if it still crashes with the same issues first before suggesting a debug kernel which could surface another issue? :)

Plus the 24.7.10 debug kernel has the bad pf state double-free behaviour...


Cheers,
Franco

Ok, I'll update to 24.7.10_2 tomorrow and see if it crashes again. I will report...
#7
Ok, my option would be to swap the SSD and RAM as a test, but unfortunately that's not possible because of the warranty seal.

I would like to test the debug kernel and help to find the error.

How can I install it and how do I get to the core dump in the event of a crash?
#8
I copied the entire crash report. Does that help you?
#9
I ran memtest, and it found no errors. Or does that not mean anything? Should I contact Deciso? I still have warranty.

#10
Update:

I did another complete reinstall, but this time I updated manually from 24.7.0 to 24.7.8. It has now been running stable for two days.
When I update to 24.7.10_2 and then revert to 24.7.8, it crashes with kernel panics.

Does anyone have an idea?
#11
Since the last update I have had multiple kernel panics with the latest stable kernel ("stable/24.7-n267981-8375762712f"):

ddb.txt
db:0:kdb.enter.default>  run lockinfo
db:1:lockinfo> show locks
No such command; use "help" to list available commands
db:1:lockinfo>  show alllocks
No such command; use "help" to list available commands
db:1:lockinfo>  show lockedvnods
Locked vnodes
db:0:kdb.enter.default>  show pcpu
cpuid        = 0
dynamic pcpu = 0x124b080
curthread    = 0xfffff801c075e740: pid 13298 tid 101433 critnest 1 "pfctl"
curpcb       = 0xfffff801c075ec60
fpcurthread  = 0xfffff801c075e740: pid 13298 "pfctl"
idlethread   = 0xfffff800016c1740: tid 100003 "idle: cpu0"
self         = 0xffffffff82c10000
curpmap      = 0xfffff8004d906398
tssp         = 0xffffffff82c10384
rsp0         = 0xfffffe00c060f000
kcr3         = 0x58246000
ucr3         = 0x1e6c7a000
scr3         = 0x1e6c7a000
gs32p        = 0xffffffff82c10404
ldt          = 0xffffffff82c10444
tss          = 0xffffffff82c10434
curvnet      = 0xfffff800011a8b80
db:0:kdb.enter.default>  bt
Tracing pid 13298 tid 101433 td 0xfffff801c075e740
kdb_enter() at kdb_enter+0x33/frame 0xfffffe00c060e4b0
panic() at panic+0x43/frame 0xfffffe00c060e510
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00c060e570
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00c060e5c0
calltrap() at calltrap+0x8/frame 0xfffffe00c060e5c0
--- trap 0xc, rip = 0xffffffff80d053f7, rsp = 0xfffffe00c060e690, rbp = 0xfffffe00c060e6b0 ---
rn_walktree() at rn_walktree+0x77/frame 0xfffffe00c060e6b0
pfr_get_addrs() at pfr_get_addrs+0x122/frame 0xfffffe00c060e710
pfioctl() at pfioctl+0x221e/frame 0xfffffe00c060ebf0
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfffffe00c060ec40
vn_ioctl() at vn_ioctl+0xce/frame 0xfffffe00c060ecb0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe00c060ecd0
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe00c060ed40
sys_ioctl() at sys_ioctl+0xff/frame 0xfffffe00c060ee00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe00c060ef30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00c060ef30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x1cd74181d5fa, rsp = 0x1cd73c4fac58, rbp = 0x1cd73c4fb0f0 ---


After that I reverted with "opnsense-update -kr 24.7.8" to "stable/24.7-n267939-fd5bc7f34el". With this kernel I also get multiple kernel panics:

ddb.txt
db:0:kdb.enter.default>  run lockinfo
db:1:lockinfo> show locks
No such command; use "help" to list available commands
db:1:lockinfo>  show alllocks
No such command; use "help" to list available commands
db:1:lockinfo>  show lockedvnods
Locked vnodes
db:0:kdb.enter.default>  show pcpu
cpuid        = 2
dynamic pcpu = 0xfffffe008e461080
curthread    = 0xfffff80193b42740: pid 51194 tid 101692 critnest 1 "python3.11"
curpcb       = 0xfffff80193b42c60
fpcurthread  = 0xfffff80193b42740: pid 51194 "python3.11"
idlethread   = 0xfffff800016c2740: tid 100005 "idle: cpu2"
self         = 0xffffffff82c12000
curpmap      = 0xffffffff81b81670
tssp         = 0xffffffff82c12384
rsp0         = 0xfffffe00bce3a000
kcr3         = 0xae3d1000
ucr3         = 0xffffffffffffffff
scr3         = 0x59533000
gs32p        = 0xffffffff82c12404
ldt          = 0xffffffff82c12444
tss          = 0xffffffff82c12434
curvnet      = 0
db:0:kdb.enter.default>  bt
Tracing pid 51194 tid 101692 td 0xfffff80193b42740
kdb_enter() at kdb_enter+0x33/frame 0xfffffe00bce39a00
panic() at panic+0x43/frame 0xfffffe00bce39a60
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00bce39ac0
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00bce39b10
calltrap() at calltrap+0x8/frame 0xfffffe00bce39b10
--- trap 0xc, rip = 0xffffffff80baf29b, rsp = 0xfffffe00bce39be0, rbp = 0xfffffe00bce39be0 ---
unlock_rw() at unlock_rw+0xb/frame 0xfffffe00bce39be0
_vm_page_busy_sleep() at _vm_page_busy_sleep+0xc3/frame 0xfffffe00bce39c20
vm_object_page_remove() at vm_object_page_remove+0x141/frame 0xfffffe00bce39c80
vm_map_entry_delete() at vm_map_entry_delete+0xf5/frame 0xfffffe00bce39cc0
vm_map_delete() at vm_map_delete+0x7b/frame 0xfffffe00bce39d30
vm_map_remove() at vm_map_remove+0x96/frame 0xfffffe00bce39d60
vmspace_exit() at vmspace_exit+0xab/frame 0xfffffe00bce39d90
exit1() at exit1+0x53a/frame 0xfffffe00bce39df0
sys_exit() at sys_exit+0xd/frame 0xfffffe00bce39e00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe00bce39f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00bce39f30
--- syscall (1, FreeBSD ELF64, exit), rip = 0x8265650da, rsp = 0x820d6b048, rbp = 0x820d6b060 ---
db:0:kdb.enter.default>  ps


I did a complete reinstall, but it didn't change anything. Is it possibly a hardware problem, and should I contact Deciso? My hardware is a new DEC2752, bought in July 2024. I still have warranty.
At the moment my old reserve hardware is running with 24.7.8. This system is running stable.
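
For triage, the `curthread` line of `show pcpu` names the process that was running when the kernel panicked. The following is a rough sketch, assuming the ddb.txt layout shown in this post; `crashing_process` is a hypothetical helper name, not an existing tool.

```python
import re
from typing import Optional, Tuple

# Matches e.g.
# curthread    = 0xfffff801c075e740: pid 13298 tid 101433 critnest 1 "pfctl"
CURTHREAD_RE = re.compile(
    r'^curthread\s*=\s*0x[0-9a-f]+: pid (\d+) tid \d+ critnest \d+ "([^"]+)"$',
    re.MULTILINE,
)


def crashing_process(ddb_text: str) -> Optional[Tuple[int, str]]:
    """Return (pid, process name) from the first curthread line, if any."""
    match = CURTHREAD_RE.search(ddb_text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


# Excerpt of the first dump in this post.
sample = '''cpuid        = 0
dynamic pcpu = 0x124b080
curthread    = 0xfffff801c075e740: pid 13298 tid 101433 critnest 1 "pfctl"
curpcb       = 0xfffff801c075ec60
'''
print(crashing_process(sample))  # (13298, 'pfctl')
```

For the two dumps above this gives `pfctl` and `python3.11`, so the panics were hit from two unrelated userland processes.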
#12
Quote from: newsense on December 04, 2024, 08:30:35 PM
No need to reinstall, nothing to gain doing that.

If none of the .10 kernels work for you simply go back to .8 until everything is sorted out.


# opnsense-update -kr 24.7.8

# opnsense-shell reboot


Ok, then I'll wait until everything is sorted out. The "route_del_fix-n267981-8375762712f" kernel runs stable.
#13
I sent the crash report via GUI yesterday. Or should I post it here?

Now it would be interesting to know why the system crashes with "stable/24.7-n267981-8375762712f" and not with "route_del_fix-n267981-8375762712f". Otherwise I'll just do a clean install tomorrow.
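
The two kernel strings above differ only in the branch label. A small sketch for checking that, assuming the `uname -v` string has the shape `<branch>-n<number>-<commit>`; the names `KernelBuild`, `parse_build` and `same_source` are hypothetical, and reading the `n...` field as a commit counter is an assumption.

```python
from typing import NamedTuple


class KernelBuild(NamedTuple):
    branch: str  # e.g. "stable/24.7" or "route_del_fix"
    count: str   # e.g. "n267981" (assumed to be a commit counter)
    commit: str  # e.g. "8375762712f"


def parse_build(version: str) -> KernelBuild:
    """Split '<branch>-n<number>-<commit>' from the right, so a '-' in the branch is safe."""
    branch, count, commit = version.rsplit("-", 2)
    return KernelBuild(branch, count, commit)


def same_source(a: str, b: str) -> bool:
    """True when both build strings name the same commit, whatever the branch label."""
    return parse_build(a).commit == parse_build(b).commit


print(same_source("stable/24.7-n267981-8375762712f",
                  "route_del_fix-n267981-8375762712f"))  # True
```

If this prints `True`, both kernels were built from the same commit, so a difference in behaviour would have to come from something other than the kernel source.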
#14
The panic occurs about 1 hour after the update to "stable/24.7-n267981-8375762712f".
After that I changed to "route_del_fix-n267981-8375762712f", which is stable for me. I tried the "stable/24.7" kernel one more time, but after about 1 hour it crashed again.

Yesterday with "stable/24.7-n267979-0d692990122" I got this:

--- trap 0x9, rip = 0xffffffff80d053f7, rsp = 0xfffffe00b2e4c690, rbp = 0xfffffe00b2e4c6b0 ---
rn_walktree() at rn_walktree+0x77/frame 0xfffffe00b2e4c6b0
pfr_get_addrs() at pfr_get_addrs+0x122/frame 0xfffffe00b2e4c710
pfioctl() at pfioctl+0x221e/frame 0xfffffe00b2e4cbf0
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfffffe00b2e4cc40
vn_ioctl() at vn_ioctl+0xce/frame 0xfffffe00b2e4ccb0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe00b2e4ccd0
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe00b2e4cd40
sys_ioctl() at sys_ioctl+0xff/frame 0xfffffe00b2e4ce00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe00b2e4cf30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00b2e4cf30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0xc46b237d5fa, rsp = 0xc46af1d9bf8, rbp = 0xc46af1da090 ---
#15
Quote from: franco on December 04, 2024, 07:07:48 PM
Quote from: martin87 on December 04, 2024, 06:57:36 PM
After "opnsense-update -fk" uname -v shows me "stable/24.7-n267981-8375762712f"
With this kernel the system crashes too. 

After "opnsense-update -zkr 24.7.10-state" uname -v shows me "route_del_fix-n267981-8375762712f".
This kernel runs stable. I'm using the default mirror.

I highlighted the commit hashes to emphasise that the builds are in fact the same.


Cheers,
Franco

Ok, thank you, but with "stable/24.7-n267981-8375762712f" it crashes with:

--- trap 0xc, rip = 0xffffffff80f6be42, rsp = 0xfffffe00b2899c40, rbp = 0xfffffe00b2899c50 ---
vm_radix_lookup_unlocked() at vm_radix_lookup_unlocked+0x62/frame 0xfffffe00b2899c50
vm_fault() at vm_fault+0x85d/frame 0xfffffe00b2899d70
vm_fault_trap() at vm_fault_trap+0x4d/frame 0xfffffe00b2899dc0
trap_pfault() at trap_pfault+0x1be/frame 0xfffffe00b2899e10
trap() at trap+0x4ab/frame 0xfffffe00b2899f30
calltrap() at calltrap+0x8/frame 0xfffffe00b2899f30
--- trap 0xc, rip = 0x82213a86e, rsp = 0x820f06990, rbp = 0x820f069a0 ---