Hello community,
I have a problem with my OPNsense box after updating from 24.7.1 to 24.7.2. After the reboot I get a kernel panic:
Mounting filesystems...
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff804d7de7
stack pointer = 0x28:0xfffffe00715ddb20
frame pointer = 0x28:0xfffffe00715ddb40
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 31 (zpool)
rdi: fffff8000378c000 rsi: 0000000000020005 rdx: 000000000000000b
rcx: fffff80003768900 r8: 0000000000000001 r9: 0000000000000000
rax: 0000000000000000 rbx: fffff8000378c000 rbp: fffffe00715ddb40
r10: 0000000000000016 r11: fffff8004ff73520 r12: 0000000000002000
r13: 0000000000020005 r14: fffff8000378b700 r15: fffff8000378b600
trap number = 12
panic: page fault
cpuid = 0
time = 1009843225
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00715dd810
vpanic() at vpanic+0x131/frame 0xfffffe00715dd940
panic() at panic+0x43/frame 0xfffffe00715dd9a0
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00715dda00
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00715dda50
calltrap() at calltrap+0x8/frame 0xfffffe00715dda50
--- trap 0xc, rip = 0xffffffff804d7de7, rsp = 0xfffffe00715ddb20, rbp = 0xfffffe00715ddb40 ---
agp_close() at agp_close+0x57/frame 0xfffffe00715ddb40
giant_close() at giant_close+0x68/frame 0xfffffe00715ddb90
devfs_close() at devfs_close+0x4b3/frame 0xfffffe00715ddc00
VOP_CLOSE_APV() at VOP_CLOSE_APV+0x1d/frame 0xfffffe00715ddc20
vn_close1() at vn_close1+0x14c/frame 0xfffffe00715ddc90
vn_closefile() at vn_closefile+0x3d/frame 0xfffffe00715ddce0
devfs_close_f() at devfs_close_f+0x2a/frame 0xfffffe00715ddd10
_fdrop() at _fdrop+0x11/frame 0xfffffe00715ddd30
closef() at closef+0x24a/frame 0xfffffe00715dddc0
closefp_impl() at closefp_impl+0x58/frame 0xfffffe00715dde00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe00715ddf30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00715ddf30
--- syscall (6, FreeBSD ELF64, close), rip = 0x18d8eaaf52ba, rsp = 0x18d8f2980d88, rbp = 0x18d8f2980da0 ---
KDB: enter: panic
[ thread pid 31 tid 100264 ]
Stopped at kdb_enter+0x33: movq $0,0xfd9962(%rip)
A clean reinstall of 24.7 with a config backup works, but after updating to 24.7.2 again, the kernel panic shows up again.
I have tried a different memory module without any success.
Any ideas what I can do?
Kind Regards
Marian
Suricata? Zenarmor? Virtualized?
No Suricata, no Zenarmor, and it is a hardware machine: an old Gateprotect GPO 150.
Seeing the crashing process is "zpool" here is an educated guess:
https://github.com/opnsense/core/commit/37003d1d5793b03
That's going to be a fun one... I suspect if you install UFS it's fine.
Cheers,
Franco
Quote: Seeing the crashing process is "zpool" here is an educated guess:
https://github.com/opnsense/core/commit/37003d1d5793b03
That's going to be a fun one... I suspect if you install UFS it's fine.
Hi Franco.
Is this, from your perspective, a generic problem that hits all ZFS-based installations, and would you recommend skipping 24.7.2 in that case?
Best regards
Robert
Hi Robert,
I hope not. It looks like a fringe kernel issue with the OP's hardware (the AGP slot in particular). It doesn't surface on FreeBSD because ZPOOL_IMPORT_PATH has not been bootstrapped there ever since FreeBSD changed ZFS implementations in version 13, so it will likely continue to go unnoticed.
We do have a debug kernel, but it requires the system to boot up first. If we can manage to get a core dump we can probably apply a bandaid and report to FreeBSD.
That being said I see no reason to revoke the ZPOOL_IMPORT_PATH. All hell would have broken loose already if it was a major problem. But even then I still don't think an environment variable should crash a user system ever.
Cheers,
Franco
Ok, thanks. Then I will collect some more courage in the next days and do the upgrade afterwards. :-)
Another one: https://forum.opnsense.org/index.php?topic=42387.msg209391#msg209391
I am having the same error, what information do you need?
My box is/was a BARRACUDA BMF220A.
I don't think this device has an AGP port?
Quote from: franco on August 22, 2024, 08:55:44 AM
I suspect if you install UFS it's fine.
Cheers,
Franco
So reinstalling OPNSense with UFS filesystem is the fix?
Quote from: TestUserPleaseIgnore on August 22, 2024, 04:30:23 PM
My box is/was a BARRACUDA BBS190A.
I don't think this device has an AGP port?
Well, it probably has some on-board graphics which presents as AGP to the system.
https://github.com/freebsd/freebsd-src/blob/main/sys/dev/agp/agp.c#L829
https://man.freebsd.org/cgi/man.cgi?query=agp&sektion=4&format=html
I think even the $14 on Ebay is too much for this kind of HW.
https://www.msi.com/Motherboard/N3150I-ECO/
For a good LMAO, see this video: https://www.youtube.com/watch?v=BKDRnu7KAKw - this thing looks like a serious fire hazard to me and another WTF from Barracuda.
Quote from: rackenthogg on August 22, 2024, 05:05:13 PM
So reinstalling OPNSense with UFS filesystem is the fix?
If it's the above hardware, I'd reinstall it into a shredder.
I've got the hardware that I've got, if you want to donate a new box to me I'd gladly take it.
Quote from: doktornotor on August 22, 2024, 05:05:57 PM
Quote from: rackenthogg on August 22, 2024, 05:05:13 PM
So reinstalling OPNSense with UFS filesystem is the fix?
If it's the above hardware, I'd reinstall it into a shredder.
Well, it is definitely not.
Quote from: doktornotor on August 22, 2024, 05:05:57 PM
Quote from: TestUserPleaseIgnore on August 22, 2024, 04:30:23 PM
My box is/was a BARRACUDA BBS190A.
I don't think this device has an AGP port?
Well, it probably has some on-board graphics which presents as AGP to the system.
https://github.com/freebsd/freebsd-src/blob/main/sys/dev/agp/agp.c#L829
https://man.freebsd.org/cgi/man.cgi?query=agp&sektion=4&format=html
I think even the $14 on Ebay is too much for this kind of HW.
https://www.msi.com/Motherboard/N3150I-ECO/
For a good LMAO, see this video: https://www.youtube.com/watch?v=BKDRnu7KAKw - this thing looks like a serious fire hazard to me and another WTF from Barracuda.
Quote from: rackenthogg on August 22, 2024, 05:05:13 PM
So reinstalling OPNSense with UFS filesystem is the fix?
If it's the above hardware, I'd reinstall it into a shredder.
Sorry, I got the SKU wrong - it's a BMF220A
https://servers4less.com/bmf220a-barracuda-im-firewall-220-1-x-vga-1-x-keyboard-1-x-10-100base-tx/?srsltid=AfmBOooz7IMXsHzVaYZvXeeg0x6_VUo4LlyN17G3oL46h4fffM9uepXL
Update: I wiped the disk and booted OPNsense 27.2 from USB. Then I logged in as installer and selected the UFS option. After that the screen was bombarded with fast-scrolling messages (too fast to read anything) and the box rebooted.
I repeated the same steps. After selecting the UFS install I paused the screen messages using the "Pause" key, but before I could focus my camera on the display, the box rebooted anyway.
Edit 2: ZFS install mode with config restore proceeds without problems, but after the 24.7.2 update the same kernel crash happens again. Selecting UFS install mode results in the stream of error messages shown below.
Edit: I managed to take a quick paparazzo-style photo, so here is part of the fast-scrolling stream of messages:
(https://i.postimg.cc/yNQgyD5T/DSC02834.jpg) (https://postimg.cc/v1nZQZG1)
Hi,
I have a Watchguard XTM505. 3GB DDR2 using single SATA Samsung 870 EVO 256GB SSD.
This is happening on my hardware as well, and I had no previous issues with any other OPNsense versions.
Nothing additional added or configured in OPNsense after installation.
24.7 clean install works
24.7.1 worked
24.7.2 upgrade from 24 then same issue as OP.
Just wanted to add additional information. Screenshot attached as well
Quote from: TestUserPleaseIgnore on August 22, 2024, 05:18:23 PM
Sorry, I got the SKU wrong - it's a BMF220A
https://servers4less.com/bmf220a-barracuda-im-firewall-220-1-x-vga-1-x-keyboard-1-x-10-100base-tx/?srsltid=AfmBOooz7IMXsHzVaYZvXeeg0x6_VUo4LlyN17G3oL46h4fffM9uepXL
I don't dare to Google it. ;D
Quote from: benkill15 on August 22, 2024, 05:25:36 PM
Just wanted to add additional information. Screenshot attached as well
Pasting the serial console output would be a whole lot better than the screenshot.
Not sure what to make of this. The defect we talk about with the OP is not in any image we offer.
The screenshot is out of context since it scrolls forever with irrelevant stack traces.
Cheers,
Franco
Mine was identical to the OP's. Maybe one of us could upgrade again to 24.7.2 and try to get you logs?
Quote from: doktornotor on August 22, 2024, 05:26:47 PM
Quote from: benkill15 on August 22, 2024, 05:25:36 PM
Just wanted to add additional information. Screenshot attached as well
Pasting the serial console output would be a whole lot better than the screenshot.
Hi, understood and agreed. It was what I had at the time so will get home later tonight and post full output.
Same here on an HP/Compaq and a Toshiba disk.
It panics at the exact same instruction (the movq) as with OP. I have no additional info to offer.
At another installation, 24.7.2 is running fine after an earlier update, I was told.
Quote from: benkill15 on August 22, 2024, 05:25:36 PM
24.7 clean install works
24.7.1 worked
24.7.2 upgrade from 24 then same issue as OP.
I've tested the whole thing on another hardware box. Same kernel crash/panic thing happens after update to 24.7.2
Let's bring a bit of structure in these unclear +1 posts.
Are you using ZFS? How old is the hardware you are using or is it a VM? Does this panic occur due to the 24.7.2 kernel or 24.7.2 core package? I know it's difficult with the panic but we need more data points than "24.7.2 is not working" now.
Thanks,
Franco
Fix that worked for me:
1. Wipe disk (without this installer barked later about some UUIDs and other disk related stuff).
2. Start installer, during install select "Other Modes" menu option and manually create UFS filesystem.
3. After install restore config backup.
4. Update to 24.7.2 (I did it from shell, and missing plugins were installed, too)
What was weird is that I was not asked for reboot after updating to 24.7.2 and adding plugins.
bump: I had the same issue. took photos of the logs (spoiler alert: they look about the same as everyone else's) but I'm not gonna include them unless asked because I don't think they'll be of much use.
tried reboots, legacy kernel, safe mode, etc. no dice. after a reinstall to 24.7 it all worked, but updating to 24.7.2 brought about the same issue. sticking on 24.7 for now. if there's any logs or sysinfo I can offer to help with this issue I'm happy to, but I don't have the time to try updating and tinkering again to help with bugfixing
Quote from: emsbro100 on August 22, 2024, 08:46:57 PM
bump: I had the same issue.
Please, read this post: https://forum.opnsense.org/index.php?topic=42373.msg209438#msg209438
Quote from: benkill15 on August 22, 2024, 05:39:14 PM
Quote from: doktornotor on August 22, 2024, 05:26:47 PM
Quote from: benkill15 on August 22, 2024, 05:25:36 PM
Just wanted to add additional information. Screenshot attached as well
Pasting the serial console output would be a whole lot better than the screenshot.
Hi, understood and agreed. It was what I had at the time so will get home later tonight and post full output.
Hardware: WatchGuard XTM505. Xeon L5420. 3GB DDR2. Samsung 870 EVO 256GB.
Posting in serial console output. This was a clean install of 24.7 using ZFS. Nothing configured other than for pure connectivity and tried to update to 24.7.2. I'll be happy to help further if I can.
Mounting filesystems...
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff804d7de7
stack pointer = 0x28:0xfffffe00594aeb20
frame pointer = 0x28:0xfffffe00594aeb40
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 29 (zpool)
rdi: fffff8000383b500 rsi: 0000000000020005 rdx: 000000000000000b
rcx: fffff800037ff780 r8: 0000000000000001 r9: 0000000000000000
rax: 0000000000000000 rbx: fffff8000383b500 rbp: fffffe00594aeb40
r10: 0000000000000016 r11: fffff80003817c60 r12: 0000000000002000
r13: 0000000000020005 r14: fffff8000383a700 r15: fffff8000383a600
trap number = 12
panic: page fault
cpuid = 3
time = 1724369812
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00594ae810
vpanic() at vpanic+0x131/frame 0xfffffe00594ae940
panic() at panic+0x43/frame 0xfffffe00594ae9a0
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00594aea00
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00594aea50
calltrap() at calltrap+0x8/frame 0xfffffe00594aea50
--- trap 0xc, rip = 0xffffffff804d7de7, rsp = 0xfffffe00594aeb20, rbp = 0xfffffe00594aeb40 ---
agp_close() at agp_close+0x57/frame 0xfffffe00594aeb40
giant_close() at giant_close+0x68/frame 0xfffffe00594aeb90
devfs_close() at devfs_close+0x4b3/frame 0xfffffe00594aec00
VOP_CLOSE_APV() at VOP_CLOSE_APV+0x1d/frame 0xfffffe00594aec20
vn_close1() at vn_close1+0x14c/frame 0xfffffe00594aec90
vn_closefile() at vn_closefile+0x3d/frame 0xfffffe00594aece0
devfs_close_f() at devfs_close_f+0x2a/frame 0xfffffe00594aed10
_fdrop() at _fdrop+0x11/frame 0xfffffe00594aed30
closef() at closef+0x24a/frame 0xfffffe00594aedc0
closefp_impl() at closefp_impl+0x58/frame 0xfffffe00594aee00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe00594aef30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00594aef30
--- syscall (6, FreeBSD ELF64, close), rip = 0x1805a05842ba, rsp = 0x1805a7b73d88, rbp = 0x1805a7b73da0 ---
KDB: enter: panic
[ thread pid 29 tid 100230 ]
Stopped at kdb_enter+0x33: movq $0,0xfd9962(%rip)
db>
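For anyone who wants to check whether their own trace really matches the ones posted here, the following is a minimal sketch, not an official tool. It assumes the console output was saved to a text file and that frame lines have the "name() at name+0x.../frame 0x..." shape seen above; the function name backtrace_functions and the file names in the usage note are made up for illustration.

```sh
#!/bin/sh
# Print one function name per line from a KDB stack backtrace read on stdin.
# Only frame lines of the form "func() at func+0xNN/frame 0x..." are matched;
# register dumps, "--- trap ..." separators and other text are skipped.
backtrace_functions() {
    sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)() at .*/\1/p'
}
```

Usage would be something like `backtrace_functions < panic-a.txt > a.list` and `backtrace_functions < panic-b.txt > b.list`, followed by `diff a.list b.list`. An empty diff means both boxes went through the same call chain, even though the frame addresses differ per machine.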
Quote from: franco on August 22, 2024, 08:00:18 PM
Let's bring a bit of structure in these unclear +1 posts.
Are you using ZFS? How old is the hardware you are using or is it a VM? Does this panic occur due to the 24.7.2 kernel or 24.7.2 core package? I know it's difficult with the panic but we need more data points than "24.7.2 is not working" now.
ZFS: yes
HW: HP/Compaq dc7800
Age: unknown, but at least a couple of years, probably five or so.
VM: No.
Kernel or corepackage: I have no clue. The kernel panic stack is in zpool, so I would say there is certainly a problem in the kernel, probably in ZFS. See the dumps others have provided, mine is similar as far as I can tell.
The problem was reproducible and consistently fails with same output.
Glad to help, if I can.
Please let me know what I can do, but I am currently reinstalling, so I can no longer reproduce the error in the same config.
Edit 1:
Additional information: During the reboot from live and reinstall, the config importer fails. When it started to read the previous ZFS on the hard disk, it showed a great many kernel error messages that scrolled past so fast I could not read them, and then it rebooted. I.e. even the former 24.7 kernel is not capable of reading the ZFS on my disk anymore.
I would say the ZFS on disk got inconsistent enough to be a total loss.
Edit 2:
Fresh install from live 24.7 with ZFS. Installing from scratch as the HD was wiped.
After the default setup, and running the wizard from the GUI for the initial config, I tried upgrading from the root menu at the console.
The result is exactly the same. After the reboot the system crashes:
"KDB: enter: panic
[thread pid 31 tid 100212]
Stopped at kdb_enter+0x33: movq $0,0xfd996..."
Tracing pid 31
panic() at panic+0x43/frame 0xfffffe007a3169a0
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe007a326a00
trap_pfault() at trap_pfault+0x46/frame 0xfffffe007a316a50
calltrap() at calltrap+0x8/frame 0xfffffe007a316a50
--- trap 0xc, rip = 0xffffffff804d7de7, rsp = 0xfffffe007a316b20, rbp = 0xfffff007a316b40 ---
agp_close() at agp_close+0x57/frame 0xfffffe007a316b40
giant_close() ...
devfs_close() ...
VOP_CLOSE_APV() ...
vn_close1() ...
vn_closefile() ...
etc.
I've been thinking how to approach this. Would someone care to test two images of 24.7.2 -- one with the actual 24.7.2 state and one with the environment var commit reverted?
I think we should do 24.7.3 next week so we need to move this along. We need a way to confirm this precisely and I guess that is the safest way.
Cheers,
Franco
I can do that if you want. My system is out of order anyway...
You have two installation images prepared that I can install? I need .iso images because all I have to install from is a good ol' DVD player. My darn HP refuses to boot from USB sticks.
Links via PM, perhaps?
Quote from: franco on August 23, 2024, 01:20:33 PM
I've been thinking how to approach this. Would someone care to test two images of 24.7.2 -- one with the actual 24.7.2 state and one with the environment var commit reverted?
I think we should do 24.7.3 next week so we need to move this along. We need a way to confirm this precisely and I guess that is the safest way.
Cheers,
Franco
I am willing to test it.
ZFS: yes
HW: Barracuda 220a (Intel Atom based d525)
Age: Based on CPU 10-12 years (Barracuda says it was sold new until 2016)
VM: No.
Kernel or corepackage: default
I just want to "join the club" as well. I too got a kernel panic after upgrading to 24.7.2:
panic: page fault
cpuid = 0
time = 1724410401
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0064e3e810
vpanic() at vpanic+0x131/frame 0xfffffe0064e3e940
panic() at panic+0x43/frame 0xfffffe0064e3e9a0
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe0064e3ea00
trap_pfault() at trap_pfault+0x46/frame 0xfffffe0064e3ea50
calltrap() at calltrap+0x8/frame 0xfffffe0064e3ea50
--- trap 0xc, rip = 0xffffffff804d7de7, rsp = 0xfffffe0064e3eb20, rbp = 0xfffffe0064e3eb40 ---
agp_close() at agp_close+0x57/frame 0xfffffe0064e3eb40
giant_close() at giant_close+0x68/frame 0xfffffe0064e3eb90
devfs_close() at devfs_close+0x4b3/frame 0xfffffe0064e3ec00
VOP_CLOSE_APV() at VOP_CLOSE_APV+0x1d/frame 0xfffffe0064e3ec20
vn_close1() at vn_close1+0x14c/frame 0xfffffe0064e3ec90
vn_closefile() at vn_closefile+0x3d/frame 0xfffffe0064e3ece0
devfs_close_f() at devfs_close_f+0x2a/frame 0xfffffe0064e3ed10
_fdrop() at _fdrop+0x11/frame 0xfffffe0064e3ed30
closef() at closef+0x24a/frame 0xfffffe0064e3edc0
closefp_impl() at closefp_impl+0x58/frame 0xfffffe0064e3ee00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe0064e3ef30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0064e3ef30
--- syscall (6, FreeBSD ELF64, close), rip = 0x3843e84152ba, rsp = 0x3843f837fd88, rbp = 0x3843f837fda0 ---
KDB: enter: panic
[ thread pid 31 tid 100232 ]
Stopped at kdb_enter+0x33: movq $0,0xfd9962(%rip)
db>
This is on a real physical box with an Intel Core Duo CPU E6400 at 2.13 GHz with 2 GB RAM. I have never had any kernel panic on this system before.
Luckily I just replaced some drives in that box, so I reverted to OPNsense 24.7_9-amd64 / FreeBSD 14.1-RELEASE-p2, OpenSSL 3.0.14, which is running happily. Once I change back to the new drives, I get the kernel panic again.
I suggest that you pull this "upgrade" before more people are getting bit by this bug. Best of luck finding it.
Quote from: waxhead on August 23, 2024, 01:44:43 PM
I suggest that you pull this "upgrade" before more people are getting bit by this bug. Best of luck finding it.
I suggest we all agree to find the actual cause first instead of giving blank 20-20 advice.
Thanks,
Franco
Which image types are you guys using here.. DVD or VGA?
Cheers,
Franco
I am using dvd (iso image for DVD+R)
Quote from: franco on August 23, 2024, 02:18:18 PM
Which image types are you guys using here.. DVD or VGA?
Cheers,
Franco
Quote from: franco on August 23, 2024, 01:20:33 PM
I've been thinking how to approach this. Would someone care to test two images of 24.7.2 -- one with the actual 24.7.2 state and one with the environment var commit reverted?
I think we should do 24.7.3 next week so we need to move this along. We need a way to confirm this precisely and I guess that is the safest way.
Cheers,
Franco
Hi. I'm running serial image. I'll be happy to test whatever tonight with whatever links and reinstalls and happy to report back
Quote from: mifi42 on August 23, 2024, 02:28:08 PM
I am using dvd (iso image for DVD+R)
Quote from: franco on August 23, 2024, 02:18:18 PM
Which image types are you guys using here.. DVD or VGA?
Cheers,
Franco
I used VGA.
Quote from: franco on August 23, 2024, 02:18:18 PM
Which image types are you guys using here.. DVD or VGA?
Cheers,
Franco
I can't say 100%, but I would be surprised if I used anything other than the USB installer, i.e. the VGA image.
I have tested the update with the patch reverted. The system boots normally.
HW: Gateprotect GPO 150, Intel(R) Atom(TM) CPU D525 @ 1.80GHz, Samsung SSD, with ZFS, used serial image to install
After the first kernel panic, I did a fresh ZFS install, did a pkg update and pkg upgrade via serial, reverted the patch and rebooted. Seems to work fine so far.
Regards
Marian
@mroess
Sorry, I have been wildly busy today so I couldn't finish the images required. Since you have a system in a good state, I would like to ask the following of you:
1. Install the debug kernel for 24.7.2 and reboot to activate it.
# opnsense-update -zkr dbg-24.7.2
# opnsense-shell reboot
2. Trigger the panic manually which in theory should be:
# env ZPOOL_IMPORT_PATH=/dev zpool import -Na
3. If it panics the system knows the debug kernel was installed and creates a /var/crash/vmcore.0 file which is the one I need to hit the debugger.
4. The system should boot back without issue since the panic trigger was only temporarily forced.
=======
If I can view this in the debugger I can apply a kernel bandaid and issue a new kernel. This seems very hardware specific and likely the only possible panic.
Thanks,
Franco
PS: In fact the process would work for anyone with the issue sitting on 24.7 or 24.7.1 waiting for resolution. I tested the command with truss and it really goes on and pokes everything in /dev for better or worse.
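As a small companion to step 3 above, here is a hedged sketch for checking after the reboot whether a dump was actually written. It only assumes the /var/crash location and the vmcore.N naming mentioned in the post; the helper name list_vmcores is hypothetical and not part of OPNsense.

```sh
#!/bin/sh
# Print every vmcore.* file found in the crash directory (default: /var/crash).
# Prints nothing if no dump exists. The directory argument exists only so the
# helper can be pointed at a copy of the crash directory.
list_vmcores() {
    crash_dir="${1:-/var/crash}"
    for f in "$crash_dir"/vmcore.*; do
        # An unmatched glob stays literal, so test that the file really exists.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}
```

Running `list_vmcores` with no argument after the box is back up should print the dump path if the debug kernel managed to save one; empty output means there is nothing to send yet.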
:o ::)
You'd expect this would be limited to block devices at minimum.
For emphasis:
# sh -c "env ZPOOL_IMPORT_PATH=/dev truss zpool import -Na 2>&1" | grep '/dev'
openat(AT_FDCWD,"/dev/zfs",O_RDWR|O_EXCL|O_CLOEXEC,00) = 3 (0x3)
openat(AT_FDCWD,"/dev/zfs",O_RDWR|O_CLOEXEC,00) = 4 (0x4)
fstatat(AT_FDCWD,"/dev",{ mode=dr-xr-xr-x ,inode=2,size=512,blksize=4096 },0x0) = 0 (0x0)
__realpathat(AT_FDCWD,"/dev","/dev",1024,0) = 0 (0x0)
open("/dev",O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC,05413465340) = 5 (0x5)
openat(AT_FDCWD,"/dev/acpi",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/apm",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/apmctl",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 7 (0x7)
openat(AT_FDCWD,"/dev/audit",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/auditpipe",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/bpf",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/bpf0",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/console",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/consolectl",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/ctty",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) ERR#6 'Device not configured'
openat(AT_FDCWD,"/dev/cuau0",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) ERR#16 'Device busy'
openat(AT_FDCWD,"/dev/cuau0.lock",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/cuau0.init",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 7 (0x7)
openat(AT_FDCWD,"/dev/devctl",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) ERR#16 'Device busy'
openat(AT_FDCWD,"/dev/devctl2",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/devstat",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 8 (0x8)
openat(AT_FDCWD,"/dev/fido",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/full",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 8 (0x8)
openat(AT_FDCWD,"/dev/geom.ctl",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 7 (0x7)
openat(AT_FDCWD,"/dev/hpet0",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/io",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/kbdmux0",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) ERR#16 'Device busy'
openat(AT_FDCWD,"/dev/klog",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) ERR#16 'Device busy'
openat(AT_FDCWD,"/dev/kmem",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/mem",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/midistat",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/mlx5ctl",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/nda0",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/nda0p1",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 7 (0x7)
openat(AT_FDCWD,"/dev/nda0p2",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 8 (0x8)
openat(AT_FDCWD,"/dev/music0",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/nda0p4",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 8 (0x8)
openat(AT_FDCWD,"/dev/netdump",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 9 (0x9)
openat(AT_FDCWD,"/dev/mdctl",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/netmap",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 9 (0x9)
openat(AT_FDCWD,"/dev/nda0p3",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 10 (0xa)
openat(AT_FDCWD,"/dev/null",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/nvd0",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/kbd0",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) ERR#16 'Device busy'
openat(AT_FDCWD,"/dev/nvd0p1",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 9 (0x9)
openat(AT_FDCWD,"/dev/nvd0p2",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 11 (0xb)
openat(AT_FDCWD,"/dev/nvd0p4",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/nvd0p3",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 7 (0x7)
openat(AT_FDCWD,"/dev/nvme0",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/pass0",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) ERR#1 'Operation not permitted'
openat(AT_FDCWD,"/dev/nvme0ns1",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/pci",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/pf",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/pfil",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/random",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/sndstat",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/speaker",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/stderr",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/stdin",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/stdout",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/sysmouse",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/tcp_log",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/ttyu0",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/ttyu0.init",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/ttyu0.lock",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/ttyv0",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/ttyv1",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/ttyv2",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/ttyv3",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/ttyv4",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/ttyv5",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/ttyv6",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/ttyv7",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 7 (0x7)
openat(AT_FDCWD,"/dev/ttyv8",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/ttyv9",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/ttyva",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 7 (0x7)
openat(AT_FDCWD,"/dev/ttyvb",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/ugen0.1",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/ufssuspend",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 7 (0x7)
openat(AT_FDCWD,"/dev/ugen0.2",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/uinput",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 6 (0x6)
openat(AT_FDCWD,"/dev/usbctl",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/urandom",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 7 (0x7)
openat(AT_FDCWD,"/dev/xpt0",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) ERR#1 'Operation not permitted'
openat(AT_FDCWD,"/dev/zero",O_RDONLY|O_NONBLOCK|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/nda0p3",O_RDONLY|O_EXCL|O_CLOEXEC,00) = 5 (0x5)
openat(AT_FDCWD,"/dev/nvd0p3",O_RDONLY|O_EXCL|O_CLOEXEC,00) = 5 (0x5)
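To make a listing like the one above easier to compare between machines, here is a rough sketch that boils truss output down to the /dev nodes that were opened successfully. It assumes the openat(AT_FDCWD,"/dev/...") line shape shown above and that failed calls carry an "ERR#" marker; the function name opened_dev_nodes and the file name in the usage note are invented for this example.

```sh
#!/bin/sh
# Read truss output on stdin and print the unique /dev paths that were
# opened successfully via openat(AT_FDCWD, ...), one per line, sorted.
opened_dev_nodes() {
    sed -n \
        -e '/ERR#/d' \
        -e 's|^openat(AT_FDCWD,"\(/dev/[^"]*\)".*|\1|p' \
        | sort -u
}
```

For example, `opened_dev_nodes < truss.log` on a saved capture. On one of the affected boxes one would presumably expect an AGP-related node to show up in that list, which would fit the agp_close() frame in the panics, but that is an assumption until someone with the hardware confirms it.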
Hmm, awesome. Just reproduced this insane zpool import behavior on XigmaNAS. (The 14.1 RC version).
I guess I'd rather file a ticket there; recycling various desktop-like HW is much more common there.
Good idea, thanks! The problem is this isn't used by default anymore and likely nobody uses this env var. It used to be the default in the old ZFS implementation... That was the starting point of all of this.
To be honest I don't understand the cache files for zfs/zpool which would have been the other way to solve this.
Cheers,
Franco
I've looked at the change you referenced earlier https://github.com/opnsense/core/commit/37003d1d5793b03
but I can't see where the upstream change is, if there's one. Do you have it handy?
Well I certainly don't expect exporting an env. variable to trigger kernel panics. Used or not - if it's broken, unused and unmaintained, just nuke the code... 🤷♂️
@cookiemonster https://github.com/opnsense/core/issues/7553#issuecomment-2186182935
thanks @franco. Shall have a look.
Do I get this correct? "zpool import -a" scans /dev and that makes the kernel panic?
O.K., that could either be a weird broken device driver that is being touched via the device path or maybe even ZFS itself if the zpool is old enough (the current OpenZFS version has new features (https://forum.opnsense.org/index.php?topic=29304.msg206737#msg206737) - and maybe new bugs).
Quote from: meyergru on August 24, 2024, 12:04:32 AM
Do I get this correct? "zpool import -a" scans /dev and that makes the kernel panic?
O.K., that could either be a weird broken device driver that is being touched via the device path
Don't know, but this (https://github.com/freebsd/freebsd-src/blob/5cbb98c8259c48ba22c8359f4c14f5438329ce58/sys/dev/agp/agp.c#L829) reference to the agp driver (https://man.freebsd.org/cgi/man.cgi?query=agp&sektion=4&format=html) and the agp_close() call seems to be the only one in the entire FreeBSD-src repo (https://github.com/freebsd/freebsd-src).
🤷♂️
Quote from: franco on August 23, 2024, 02:18:18 PM
Which image types are you guys using here.. DVD or VGA?
I was using VGA version for all tests.
Throwing my hat in to say I had the same issue as OP down to the hex. Did a clean install (ZFS) of 24.7.0, all's well. Restored config, all good. Update to 24.7.2 and hello page fault again.
Saw the post to try UFS instead of ZFS. Selected install via UFS or whatever the menu option was, and got a bunch of stacktraces before the machine rebooted, believe someone also had that issue.
Selected the other option that's not UFS or ZFS and ended up doing a guided UFS install basically. I forget exactly which option it is because I've been messing with (re)installing OPN for the last 3 hours testing this and I'm not about to do it again since everything is stable now.
Are you using ZFS? Not anymore, since that's why the error is appearing.
How old is the hardware/is it a VM? Bare metal, Core 2 Quad 8300, socket lga 775 was circa 08 I believe. 4 gigs RAM, 1 Intel PCIe NIC with 2 ports and 2 1 port PCI NICs. 120gb SATA SSD. Not that it matters, this is a ZFS error.
Does this panic occur due to the 24.7.2 kernel or 24.7.2 core package? I know it's difficult with the panic but we need more data points than "24.7.2 is not working" now. I don't know how I'd differentiate that during boot to be honest. Since it's the kernel panicking I'm assuming it's the 24.7.2 kernel and not core.
Can't sleep. It's 4 a.m. I looked at "man agp". I looked at sporadic meaningless sys/dev/agp code refactors of the last couple of years. Found this in 2020:
https://github.com/opnsense/src/commit/4f8959b9f4bb
Hmm.
Quote from: doktornotor on August 23, 2024, 10:20:58 PM
Well I certainly don't expect exporting an env. variable to trigger kernel panics. Used or not - if it's broken, unused and unmaintained, just nuke the code... 🤷♂️
"And so it is, just like you said it should be"
https://github.com/opnsense/tools/commit/97f9f368b58
If anyone misses agp they can still kldload at their own peril?
https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/misc/OPNsense-24.7.2-vga-amd64.img.bz2
https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/misc/OPNsense-24.7.2-vga-amd64.img.sig
https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/misc/OPNsense-24.7.2-dvd-amd64.iso.bz2
https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/misc/OPNsense-24.7.2-dvd-amd64.iso.sig
I will replace the kernel on the mirror once someone has confirmed this with one of the images on the actual hardware that the driver attaches to. ZPOOL_IMPORT_PATH has not been reverted; only agp was removed from the kernel so that it cannot hit the bad agp_close(). If this works for the people reporting the crash, we have our answer.
Cheers,
Franco
>>>>> https://github.com/opnsense/src/commit/4f8959b9f4bb <<<<<
Just as Nostradamus predicted in the ICMP thread: "Downstream issue, use a vanilla...ooops nevermind and carry on, nothing to see here"
Hindsight is always 20-20
/FreeBSD<=4
It's a valid point raised in 2020. The man page says this was added in FreeBSD 4.1 (2000). The man page was last updated in 2007. flyboy463 said their hardware was from 2008.
The only question that remains is whether the removal of agp from the kernel is detrimental to using this hardware from 2008 or not. This is a bit of a trick question. :)
Cheers,
Franco
So I was halfway smashed on my initial post here and I am definitely toasty now, but I can say that a clean install of opnsense 24.7.0 and a console upgrade to 24.7.2 on my R720XD completely virtualized (KVM/QEMU) did not produce a pagefault. That hardware is from 2012 with no PCI to speak of AFAIK. Hope this is of some use. I can do more testing if need be if I'm not too hungover tomorrow.
Quote from: franco on August 24, 2024, 04:36:19 AM
"And so it is, just like you said it should be"
https://github.com/opnsense/tools/commit/97f9f368b58
If anyone misses agp they can still kldload at their own peril?
Indeed looks like a good prevention so I suggested that to XigmaNAS folks as well (https://sourceforge.net/p/xigmanas/bugs/484/). If the broken code is not there, it cannot be triggered. 8)
@flyboy463 sorry to have come across this way here. All input is appreciated. It's a hardware-specific issue that just happens to be triggered now with ZFS/zpool-import use due to the use of an environment variable. You can probably crash the affected hardware on FreeBSD 14.1 with something as simple as
# echo > /dev/agpgart
If anyone dares to try, be my guest.
The question still remains if there is some use for the agp kernel module here WRT graphics support / VGA console. If that is the case disabling it by default may have other repercussions for users of the hardware. An alternative would be to avoid presenting the device node /dev/agpgart or fix the actual panic as suggested earlier.
I'm still positive that we should do something other than removing the environment variable for 24.7.3 since the scope of this is very narrow and mitigated by using UFS as far as I can tell.
Cheers,
Franco
Franco, are these hints still applicable?
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187015
If removing the driver altogether is a real issue (not convinced at all), this could mitigate the device creation, seems to me? Also, people affected here could add that from the boot prompt to see if it helps?
I saw these patches. Should be "good".
We haven't seen a dmesg from such a system or the output of "ls /dev/agp*" yet, so that's more or less guesswork.
My favourite solution is removing agp from the kernel now but at least two people with the hardware should test it to be sure.
Cheers,
Franco
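For anyone on the affected hardware who wants to supply the missing data, here is a minimal sketch of how it could be collected. The helper name agp_attach_lines is made up for illustration; dmesg, grep and ls are the standard tools. Listing the node with ls should only stat it rather than open it, unlike the echo example above, but treat that as an assumption on hardware that panics this easily.

```shell
#!/bin/sh
# Filter dmesg-style text on stdin down to the agp attach lines,
# e.g. "agp0: <Intel Pineview SVGA controller> on vgapci0".
# agp_attach_lines is a hypothetical helper name, not an OPNsense tool.
agp_attach_lines() {
    # "|| true" keeps the exit status at 0 when no agp device attached.
    grep -E '^agp[0-9]+:' || true
}

# On a system that still boots (for example 24.7.0):
#   dmesg | agp_attach_lines
#   ls -l /dev/agp*
```

Empty output from the first command would suggest the driver never attached, in which case the agp_close() panic is unlikely to be the one you are seeing.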
Quote from: franco on August 24, 2024, 11:24:29 AM
My favourite solution is removing agp from the kernel now
#metoo
set hint.agp.0.disabled=1
set hint.agp.1.disabled=1
set hint.agp.2.disabled=1
set hint.agp.3.disabled=1
boot
can get you running from the loader shell probably (on most setups,
agp.0 should be enough).
The VGA and DVD images I posted earlier have the agp-disabled kernel.
It can also be installed via
# opnsense-update -zkr 24.7.2
(note the -z for snapshot)
I had this boot tested on non-legacy hardware just to be sure. ;)
Cheers,
Franco
Quote from: doktornotor on August 24, 2024, 11:28:01 AM
Quote from: franco on August 24, 2024, 11:24:29 AM
My favourite solution is removing agp from the kernel now
set hint.agp.0.disabled=1
set hint.agp.1.disabled=1
set hint.agp.2.disabled=1
set hint.agp.3.disabled=1
boot
didn't work on my setup (https://www.ipcstation.net/lanner/fw-7540b)
OK set hint.agp.0.disable=1
OK set hint.agp.1.disable=1
OK set hint.agp.2.disable=1
OK set hint.agp.3.disable=1
OK boot
Loading kernel...
/boot/kernel/kernel text=0x1813e0 text=0xe011a8 text=0x45598c data=0x180+0xe80 data=0x196dc0+0x469240 0x8+0x1a0778+0x8+0x1c5352
Loading configured modules...
/boot/kernel/pflog.ko size 0x3c10 at 0x2166000
loading required module 'pf'
/boot/kernel/pf.ko size 0x8e588 at 0x216a000
/boot/modules/if_re.ko size 0x11d718 at 0x21f9000
/boot/kernel/zfs.ko size 0x5cd5e0 at 0x2317000
/boot/kernel/if_enc.ko size 0x4c20 at 0x28e5000
/boot/kernel/if_bridge.ko size 0xea58 at 0x28ea000
loading required module 'bridgestp'
/boot/kernel/bridgestp.ko size 0x8930 at 0x28f9000
/boot/kernel/opensolaris.ko size 0x1e2c8 at 0x2902000
/boot/kernel/pfsync.ko size 0x11a18 at 0x2921000
/etc/hostid size=0x25
/boot/kernel/if_lagg.ko size 0x165f0 at 0x2933000
loading required module 'if_infiniband'
/boot/kernel/if_infiniband.ko size 0x3540 at 0x294a000
/boot/kernel/carp.ko size 0xfba8 at 0x294e000
/boot/kernel/if_gre.ko size 0xaa30 at 0x295e000
/boot/entropy size=0x1000
KDB: debugger backends: ddb
KDB: current backend: ddb
---<<BOOT>>---
panic: running without device atpic requires a local APIC
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff82d3be20
vpanic() at vpanic+0x131/frame 0xffffffff82d3bf50
panic() at panic+0x43/frame 0xffffffff82d3bfb0
apic_init() at apic_init+0xfc/frame 0xffffffff82d3bfd0
mi_startup() at mi_startup+0xb5/frame 0xffffffff82d3bff0
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at kdb_enter+0x33: movq $0,0xfd9962(%rip)
db>
I'm not quite sure how to perform @franco's suggestion
# opnsense-update -zkr 24.7.2
as my machine is offline :-/
Yes.
Exactly the same thing here. Using UFS instead of ZFS leads to another problem: lots of messages and repeated reboots.
Using UFS is not a workaround, but instead seems to point at another kernel problem.
Quote from: rackenthogg on August 22, 2024, 05:24:18 PM
Update: I wiped the disk and booted OPNsense 27.2 from USB. Then I logged in as installer and selected the UFS option. After that the screen was bombarded with fast-scrolling messages (too fast to read anything) and the box rebooted.
I repeated the same; after selecting the UFS install I paused the screen messages using the "Pause" key, but before I could focus my camera on the display, the box rebooted anyway.
Edit 2: ZFS install mode with config restore proceeds without problems, but after 24.7.2 update the whole kernel crash happens again. Selecting UFS install mode results in stream of error messages shown below.
Edit: I managed to take a quick paparazzo-style photo, so here is the part of fast-scrolling stream of messages:
(https://i.postimg.cc/yNQgyD5T/DSC02834.jpg) (https://postimg.cc/v1nZQZG1)
Hmmm, but that appears to be a completely different issue...
Quote from: klmi on August 24, 2024, 01:25:59 PM
---<<BOOT>>---
panic: running without device atpic requires a local APIC
Well, so ZFS works on the images that I posted? UFS was a workaround for the agp(4) issue. That doesn't mean this hardware doesn't have other issues as well. How can anybody guess or expect that?
So I'll ask again: the hardware where the agp_close() panic was seen works with ZFS install on one of the images that I have posted based on 24.7.2 with agp(4) turned off in the kernel?
Cheers,
Franco
IOW: people who did NOT have the agp_close() call in the backtrace would be better off starting their own thread.
E.g., the APIC one is probably related to a completely different change (first thing on Google) that is incompatible with old hardware: https://lists.freebsd.org/archives/freebsd-current/2023-November/005064.html
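To tell the two panics apart without reading the whole backtrace by eye, something like the following could be run against a saved console or serial capture. This is only a sketch: classify_panic and the log file argument are hypothetical names, and the two match strings are taken from the logs posted in this thread.

```shell
#!/bin/sh
# Classify a saved console log by which panic it contains.
# classify_panic is a hypothetical helper; pass it the path of a text
# file holding the captured console output.
classify_panic() {
    log="$1"
    if grep -qF 'agp_close()' "$log"; then
        echo "agp_close"      # the panic discussed in this thread
    elif grep -qF 'requires a local APIC' "$log"; then
        echo "apic"           # the separate atpic/APIC panic
    else
        echo "other"
    fi
}
```

Only a result of "agp_close" belongs in this thread; anything else is better reported separately.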
Yes, the new images do seem to fix the problem for me.
Thank you.
Procedure:
1) boot Fresh install of 24.7
2) login root
3) shell: opnsense-update -zkr 24.7.2
4) reboot
5) system boots fine, into the GUI
6) update to 24.7.2 from GUI
7) reboot
All is fine and running fine. I expect no other problems, but I have yet to restore the config I had before.
For now: Thank you very much for your support, all.
Quote from: franco on August 24, 2024, 01:39:35 PM
Well, so ZFS works on the images that I posted? UFS was a workaround for the agp(4) issue. That doesn't mean this hardware doesn't have other issues as well. How can anybody guess or expect that?
So I'll ask again: the hardware where the agp_close() panic was seen works with ZFS install on one of the images that I have posted based on 24.7.2 with agp(4) turned off in the kernel?
Cheers,
Franco
Quote from: doktornotor on August 24, 2024, 01:33:56 PM
Hmmm, but that appears to be a completely different issue...
Quote from: klmi on August 24, 2024, 01:25:59 PM
---<<BOOT>>---
panic: running without device atpic requires a local APIC
Sorry, I mixed up the logs. The one I posted was with ACPI turned off, so please ignore it.
Here is the correct one, with agp_close in it:
OK
OK set hint.agp.0.disable=1
OK set hint.agp.1.disable=1
OK set hint.agp.2.disable=1
OK set hint.agp.3.disable=1
OK boot
Loading kernel...
/boot/kernel/kernel text=0x1813e0 text=0xe011a8 text=0x45598c data=0x180+0xe80 data=0x196dc0+0x469240 0x8+0x1a0778+0x8+0x1c5352
Loading configured modules...
/boot/kernel/zfs.ko size 0x5cd5e0 at 0x2166000
/boot/modules/if_re.ko size 0x11d718 at 0x2734000
/etc/hostid size=0x25
/boot/kernel/opensolaris.ko size 0x1e2c8 at 0x2852000
/boot/kernel/if_enc.ko size 0x4c20 at 0x2871000
/boot/kernel/if_gre.ko size 0xaa30 at 0x2876000
/boot/kernel/carp.ko size 0xfba8 at 0x2881000
/boot/kernel/pfsync.ko size 0x11a18 at 0x2891000
loading required module 'pf'
/boot/kernel/pf.ko size 0x8e588 at 0x28a3000
/boot/kernel/pflog.ko size 0x3c10 at 0x2932000
/boot/kernel/if_lagg.ko size 0x165f0 at 0x2936000
loading required module 'if_infiniband'
/boot/kernel/if_infiniband.ko size 0x3540 at 0x294d000
/boot/kernel/if_bridge.ko size 0xea58 at 0x2951000
loading required module 'bridgestp'
/boot/kernel/bridgestp.ko size 0x8930 at 0x2960000
/boot/entropy size=0x1000
KDB: debugger backends: ddb
KDB: current backend: ddb
---<<BOOT>>---
Copyright (c) 1992-2023 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.1-RELEASE-p3 stable/24.7-n267796-c61a3c23fb1 SMP amd64
FreeBSD clang version 18.1.5 (https://github.com/llvm/llvm-project.git llvmorg-18.1.5-0-g617a15a9eac9)
VT(vga): resolution 640x480
CPU: Intel(R) Atom(TM) CPU D525 @ 1.80GHz (1800.10-MHz K8-class CPU)
Origin="GenuineIntel" Id=0x106ca Family=0x6 Model=0x1c Stepping=10
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0x40e31d<SSE3,DTES64,MON,DS_CPL,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE>
AMD Features=0x20100800<SYSCALL,NX,LM>
AMD Features2=0x1<LAHF>
TSC: P-state invariant, performance statistics
real memory = 4294967296 (4096 MB)
avail memory = 4083974144 (3894 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: <020812 APIC1311>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s) x 2 hardware threads
random: unblocking device.
ioapic0: MADT APIC ID 4 != hw id 1
ioapic0 <Version 2.0> irqs 0-23
Launching APs: 3 2 1
random: entropy device external interface
wlan: mac acl policy registered
kbd1 at kbdmux0
WARNING: Device "spkr" is Giant locked and may be deleted before FreeBSD 15.0.
vtvga0: <VT VGA driver>
smbios0: <System Management BIOS> at iomem 0xfa390-0xfa3ae
smbios0: Version: 2.6
aesni0: No AES or SHA support.
acpi0: <020812 RSDT1311>
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 450
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
vgapci0: <VGA-compatible display> port 0x9080-0x9087 mem 0xfe700000-0xfe77ffff,0xd0000000-0xdfffffff,0xfe600000-0xfe6fffff irq 16 at device 2.0 on pci0
agp0: <Intel Pineview SVGA controller> on vgapci0
WARNING: Device "agp" is Giant locked and may be deleted before FreeBSD 15.0.
agp0: aperture size is 256M, detected 8188k stolen memory
vgapci0: Boot video device
vgapci1: <VGA-compatible display> mem 0xfe580000-0xfe5fffff at device 2.1 on pci0
uhci0: <Intel 82801H (ICH8) USB controller USB-D> port 0x9480-0x949f irq 16 at device 26.0 on pci0
uhci0: LegSup = 0x2f00
usbus0 on uhci0
usbus0: 12Mbps Full Speed USB v1.0
uhci1: <Intel 82801H (ICH8) USB controller USB-E> port 0x9400-0x941f irq 21 at device 26.1 on pci0
uhci1: LegSup = 0x2f00
usbus1 on uhci1
usbus1: 12Mbps Full Speed USB v1.0
ehci0: <Intel 82801H (ICH8) USB 2.0 controller USB2-B> mem 0xfe7ff400-0xfe7ff7ff irq 18 at device 26.7 on pci0
usbus2: EHCI version 1.0
usbus2 on ehci0
usbus2: 480Mbps High Speed USB v2.0
pcib1: <ACPI PCI-PCI bridge> irq 22 at device 28.0 on pci0
pci1: <ACPI PCI bus> on pcib1
em0: <Intel(R) 82583V> port 0xbc00-0xbc1f mem 0xfe8e0000-0xfe8fffff,0xfe8dc000-0xfe8dffff irq 16 at device 0.0 on pci1
em0: EEPROM V1.10-0
em0: Using 1024 TX descriptors and 1024 RX descriptors
em0: Using an MSI interrupt
em0: Ethernet address: 00:90:0b:28:55:f0
em0: netmap queues/slots: TX 1/1024, RX 1/1024
pcib2: <ACPI PCI-PCI bridge> irq 23 at device 28.1 on pci0
pci2: <ACPI PCI bus> on pcib2
em1: <Intel(R) 82583V> port 0xcc00-0xcc1f mem 0xfe9e0000-0xfe9fffff,0xfe9dc000-0xfe9dffff irq 17 at device 0.0 on pci2
em1: EEPROM V1.10-0
em1: Using 1024 TX descriptors and 1024 RX descriptors
em1: Using an MSI interrupt
em1: Ethernet address: 00:90:0b:28:55:f1
em1: netmap queues/slots: TX 1/1024, RX 1/1024
pcib3: <ACPI PCI-PCI bridge> irq 20 at device 28.2 on pci0
pci3: <ACPI PCI bus> on pcib3
em2: <Intel(R) 82583V> port 0xdc00-0xdc1f mem 0xfeae0000-0xfeafffff,0xfeadc000-0xfeadffff irq 18 at device 0.0 on pci3
em2: EEPROM V1.10-0
em2: Using 1024 TX descriptors and 1024 RX descriptors
em2: Using an MSI interrupt
em2: Ethernet address: 00:90:0b:28:55:f2
em2: netmap queues/slots: TX 1/1024, RX 1/1024
pcib4: <ACPI PCI-PCI bridge> irq 21 at device 28.3 on pci0
pci4: <ACPI PCI bus> on pcib4
em3: <Intel(R) 82583V> port 0xec00-0xec1f mem 0xfebe0000-0xfebfffff,0xfebdc000-0xfebdffff irq 19 at device 0.0 on pci4
em3: EEPROM V1.10-0
em3: Using 1024 TX descriptors and 1024 RX descriptors
em3: Using an MSI interrupt
em3: Ethernet address: 00:90:0b:28:55:f3
em3: netmap queues/slots: TX 1/1024, RX 1/1024
uhci2: <Intel 82801H (ICH8) USB controller USB-A> port 0x9c00-0x9c1f irq 23 at device 29.0 on pci0
uhci2: LegSup = 0x3f00
usbus3 on uhci2
usbus3: 12Mbps Full Speed USB v1.0
uhci3: <Intel 82801H (ICH8) USB controller USB-B> port 0x9880-0x989f irq 19 at device 29.1 on pci0
uhci3: LegSup = 0x2f00
usbus4 on uhci3
usbus4: 12Mbps Full Speed USB v1.0
uhci4: <Intel 82801H (ICH8) USB controller USB-C> port 0x9800-0x981f irq 18 at device 29.2 on pci0
uhci4: LegSup = 0x2f00
usbus5 on uhci4
usbus5: 12Mbps Full Speed USB v1.0
ehci1: <Intel 82801H (ICH8) USB 2.0 controller USB2-A> mem 0xfe7ff800-0xfe7ffbff irq 23 at device 29.7 on pci0
usbus6: EHCI version 1.0
usbus6 on ehci1
usbus6: 480Mbps High Speed USB v2.0
pcib5: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci5: <ACPI PCI bus> on pcib5
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH8M UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
ata0: <ATA channel> at channel 0 on atapci0
atapci1: <Intel ICH8M SATA300 controller> port 0xac00-0xac07,0xa880-0xa883,0xa800-0xa807,0xa480-0xa483,0xa400-0xa40f,0xa080-0xa08f irq 18 at device 31.2 on pci0
ata2: <ATA channel> at channel 0 on atapci1
ata3: <ATA channel> at channel 1 on atapci1
acpi_button0: <Power Button> on acpi0
ns8250: UART FCR is broken
ns8250: UART FCR is broken
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
ns8250: UART FCR is broken
uart0: console (115200,n,8,1)
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
Timecounter "TSC" frequency 1799999591 Hz quality 1000
Timecounters tick every 1.000 msec
ugen5.1: <Intel UHCI root HUB> at usbus5
ugen0.1: <Intel UHCI root HUB> at usbus0
ugen6.1: <Intel EHCI root HUB> at usbus6
uhub0 on usbus5
ugen1.1: <Intel UHCI root HUB> at usbus1
uhub1 on usbus0
uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
uhub2 on usbus6
uhub2: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus6
uhub3 on usbus1
uhub3: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus5
ugen4.1: <Intel UHCI root HUB> at usbus4
ugen2.1: <Intel EHCI root HUB> at usbus2
ugen3.1: <Intel UHCI root HUB> at usbus3
uhub4 on usbus4
uhub5 on usbus2
uhub4: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus4
uhub5: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
uhub6 on usbus3
uhub6: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus3
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Trying to mount root from zfs:zroot/ROOT/default []...
uhub3: 2 ports with 2 removable, self powered
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub4: 2 ports with 2 removable, self powered
uhub6: 2 ports with 2 removable, self powered
Root mount waiting for: usbus2 usbus6 CAM
uhub5: 4 ports with 4 removable, self powered
uhub2: 6 ports with 6 removable, self powered
ada0 at ata2 bus 0 scbus1 target 0 lun 0
ada0: <INTENSO SSD V0718B0> ACS-2 ATA SATA 3.x device
ada0: Serial Number AA000000000000002900
ada0: 300.000MB/s transfers (SATA 2.x, UDMA5, PIO 512bytes)
ada0: 114473MB (234441648 512 byte sectors)
Root mount waiting for: ada
Mounting filesystems...
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff804d7de7
stack pointer = 0x28:0xfffffe0063322b20
frame pointer = 0x28:0xfffffe0063322b40
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 30 (zpool)
rdi: fffff800038a7100 rsi: 0000000000020005 rdx: 000000000000000b
rcx: fffff80003893400 r8: 0000000000000001 r9: 0000000000000000
rax: 0000000000000000 rbx: fffff800038a7100 rbp: fffffe0063322b40
r10: 0000000000000016 r11: fffff80017f12c60 r12: 0000000000002000
r13: 0000000000020005 r14: fffff800038a6700 r15: fffff800038a6600
trap number = 12
panic: page fault
cpuid = 2
time = 1724500496
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0063322810
vpanic() at vpanic+0x131/frame 0xfffffe0063322940
panic() at panic+0x43/frame 0xfffffe00633229a0
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe0063322a00
trap_pfault() at trap_pfault+0x46/frame 0xfffffe0063322a50
calltrap() at calltrap+0x8/frame 0xfffffe0063322a50
--- trap 0xc, rip = 0xffffffff804d7de7, rsp = 0xfffffe0063322b20, rbp = 0xfffffe0063322b40 ---
agp_close() at agp_close+0x57/frame 0xfffffe0063322b40
giant_close() at giant_close+0x68/frame 0xfffffe0063322b90
devfs_close() at devfs_close+0x4b3/frame 0xfffffe0063322c00
VOP_CLOSE_APV() at VOP_CLOSE_APV+0x1d/frame 0xfffffe0063322c20
vn_close1() at vn_close1+0x14c/frame 0xfffffe0063322c90
vn_closefile() at vn_closefile+0x3d/frame 0xfffffe0063322ce0
devfs_close_f() at devfs_close_f+0x2a/frame 0xfffffe0063322d10
_fdrop() at _fdrop+0x11/frame 0xfffffe0063322d30
closef() at closef+0x24a/frame 0xfffffe0063322dc0
closefp_impl() at closefp_impl+0x58/frame 0xfffffe0063322e00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe0063322f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0063322f30
--- syscall (6, FreeBSD ELF64, close), rip = 0x2510eeb22ba, rsp = 0x25117523d88, rbp = 0x25117523da0 ---
KDB: enter: panic
[ thread pid 30 tid 100262 ]
Stopped at kdb_enter+0x33: movq $0,0xfd9962(%rip)
db>
db>
db>
Shouldn't that be "disabled" instead of "disable"?
The Lanner FW-7540b only has a serial console. Is there a serial image, or can I just flash the VGA one?
Quote from: Patrick M. Hausen on August 24, 2024, 02:47:11 PM
Shouldn't that be "disabled" instead of "disable"?
:-X
OK set hint.agp.0.disabled=1
OK set hint.agp.1.disabled=1
OK set hint.agp.2.disabled=1
OK set hint.agp.3.disabled=1
OK boot
Loading kernel...
/boot/kernel/kernel text=0x1813e0 text=0xe011a8 text=0x45598c data=0x180+0xe80 data=0x196dc0+0x469240 0x8+0x1a0778+0x8+0x1c5352
Loading configured modules...
/boot/entropy size=0x1000
/boot/kernel/if_bridge.ko size 0xea58 at 0x2167000
loading required module 'bridgestp'
/boot/kernel/bridgestp.ko size 0x8930 at 0x2176000
/boot/kernel/carp.ko size 0xfba8 at 0x217f000
/boot/kernel/if_gre.ko size 0xaa30 at 0x218f000
/boot/kernel/if_enc.ko size 0x4c20 at 0x219a000
/boot/kernel/zfs.ko size 0x5cd5e0 at 0x219f000
/boot/kernel/pfsync.ko size 0x11a18 at 0x276d000
loading required module 'pf'
/boot/kernel/pf.ko size 0x8e588 at 0x277f000
/boot/kernel/opensolaris.ko size 0x1e2c8 at 0x280e000
/boot/kernel/pflog.ko size 0x3c10 at 0x282d000
/boot/kernel/if_lagg.ko size 0x165f0 at 0x2831000
loading required module 'if_infiniband'
/boot/kernel/if_infiniband.ko size 0x3540 at 0x2848000
/boot/modules/if_re.ko size 0x11d718 at 0x284c000
/etc/hostid size=0x25
KDB: debugger backends: ddb
KDB: current backend: ddb
---<<BOOT>>---
Copyright (c) 1992-2023 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.1-RELEASE-p3 stable/24.7-n267796-c61a3c23fb1 SMP amd64
FreeBSD clang version 18.1.5 (https://github.com/llvm/llvm-project.git llvmorg-18.1.5-0-g617a15a9eac9)
VT(vga): resolution 640x480
CPU: Intel(R) Atom(TM) CPU D525 @ 1.80GHz (1800.09-MHz K8-class CPU)
Origin="GenuineIntel" Id=0x106ca Family=0x6 Model=0x1c Stepping=10
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0x40e31d<SSE3,DTES64,MON,DS_CPL,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE>
AMD Features=0x20100800<SYSCALL,NX,LM>
AMD Features2=0x1<LAHF>
TSC: P-state invariant, performance statistics
real memory = 4294967296 (4096 MB)
avail memory = 4083978240 (3894 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: <020812 APIC1311>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s) x 2 hardware threads
random: unblocking device.
ioapic0: MADT APIC ID 4 != hw id 1
ioapic0 <Version 2.0> irqs 0-23
Launching APs: 3 2 1
random: entropy device external interface
wlan: mac acl policy registered
kbd1 at kbdmux0
WARNING: Device "spkr" is Giant locked and may be deleted before FreeBSD 15.0.
vtvga0: <VT VGA driver>
smbios0: <System Management BIOS> at iomem 0xfa390-0xfa3ae
smbios0: Version: 2.6
aesni0: No AES or SHA support.
acpi0: <020812 RSDT1311>
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 450
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
vgapci0: <VGA-compatible display> port 0x9080-0x9087 mem 0xfe700000-0xfe77ffff,0xd0000000-0xdfffffff,0xfe600000-0xfe6fffff irq 16 at device 2.0 on pci0
agp0: <Intel Pineview SVGA controller> on vgapci0
WARNING: Device "agp" is Giant locked and may be deleted before FreeBSD 15.0.
agp0: aperture size is 256M, detected 8188k stolen memory
vgapci0: Boot video device
vgapci1: <VGA-compatible display> mem 0xfe580000-0xfe5fffff at device 2.1 on pci0
uhci0: <Intel 82801H (ICH8) USB controller USB-D> port 0x9480-0x949f irq 16 at device 26.0 on pci0
uhci0: LegSup = 0x2f00
usbus0 on uhci0
usbus0: 12Mbps Full Speed USB v1.0
uhci1: <Intel 82801H (ICH8) USB controller USB-E> port 0x9400-0x941f irq 21 at device 26.1 on pci0
uhci1: LegSup = 0x2f00
usbus1 on uhci1
usbus1: 12Mbps Full Speed USB v1.0
ehci0: <Intel 82801H (ICH8) USB 2.0 controller USB2-B> mem 0xfe7ff400-0xfe7ff7ff irq 18 at device 26.7 on pci0
usbus2: EHCI version 1.0
usbus2 on ehci0
usbus2: 480Mbps High Speed USB v2.0
pcib1: <ACPI PCI-PCI bridge> irq 22 at device 28.0 on pci0
pci1: <ACPI PCI bus> on pcib1
em0: <Intel(R) 82583V> port 0xbc00-0xbc1f mem 0xfe8e0000-0xfe8fffff,0xfe8dc000-0xfe8dffff irq 16 at device 0.0 on pci1
em0: EEPROM V1.10-0
em0: Using 1024 TX descriptors and 1024 RX descriptors
em0: Using an MSI interrupt
em0: Ethernet address: 00:90:0b:28:55:f0
em0: netmap queues/slots: TX 1/1024, RX 1/1024
pcib2: <ACPI PCI-PCI bridge> irq 23 at device 28.1 on pci0
pci2: <ACPI PCI bus> on pcib2
em1: <Intel(R) 82583V> port 0xcc00-0xcc1f mem 0xfe9e0000-0xfe9fffff,0xfe9dc000-0xfe9dffff irq 17 at device 0.0 on pci2
em1: EEPROM V1.10-0
em1: Using 1024 TX descriptors and 1024 RX descriptors
em1: Using an MSI interrupt
em1: Ethernet address: 00:90:0b:28:55:f1
em1: netmap queues/slots: TX 1/1024, RX 1/1024
pcib3: <ACPI PCI-PCI bridge> irq 20 at device 28.2 on pci0
pci3: <ACPI PCI bus> on pcib3
em2: <Intel(R) 82583V> port 0xdc00-0xdc1f mem 0xfeae0000-0xfeafffff,0xfeadc000-0xfeadffff irq 18 at device 0.0 on pci3
em2: EEPROM V1.10-0
em2: Using 1024 TX descriptors and 1024 RX descriptors
em2: Using an MSI interrupt
em2: Ethernet address: 00:90:0b:28:55:f2
em2: netmap queues/slots: TX 1/1024, RX 1/1024
pcib4: <ACPI PCI-PCI bridge> irq 21 at device 28.3 on pci0
pci4: <ACPI PCI bus> on pcib4
em3: <Intel(R) 82583V> port 0xec00-0xec1f mem 0xfebe0000-0xfebfffff,0xfebdc000-0xfebdffff irq 19 at device 0.0 on pci4
em3: EEPROM V1.10-0
em3: Using 1024 TX descriptors and 1024 RX descriptors
em3: Using an MSI interrupt
em3: Ethernet address: 00:90:0b:28:55:f3
em3: netmap queues/slots: TX 1/1024, RX 1/1024
uhci2: <Intel 82801H (ICH8) USB controller USB-A> port 0x9c00-0x9c1f irq 23 at device 29.0 on pci0
uhci2: LegSup = 0x2f00
usbus3 on uhci2
usbus3: 12Mbps Full Speed USB v1.0
uhci3: <Intel 82801H (ICH8) USB controller USB-B> port 0x9880-0x989f irq 19 at device 29.1 on pci0
uhci3: LegSup = 0x2f00
usbus4 on uhci3
usbus4: 12Mbps Full Speed USB v1.0
uhci4: <Intel 82801H (ICH8) USB controller USB-C> port 0x9800-0x981f irq 18 at device 29.2 on pci0
uhci4: LegSup = 0x2f00
usbus5 on uhci4
usbus5: 12Mbps Full Speed USB v1.0
ehci1: <Intel 82801H (ICH8) USB 2.0 controller USB2-A> mem 0xfe7ff800-0xfe7ffbff irq 23 at device 29.7 on pci0
usbus6: EHCI version 1.0
usbus6 on ehci1
usbus6: 480Mbps High Speed USB v2.0
pcib5: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci5: <ACPI PCI bus> on pcib5
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH8M UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
ata0: <ATA channel> at channel 0 on atapci0
atapci1: <Intel ICH8M SATA300 controller> port 0xac00-0xac07,0xa880-0xa883,0xa800-0xa807,0xa480-0xa483,0xa400-0xa40f,0xa080-0xa08f irq 18 at device 31.2 on pci0
ata2: <ATA channel> at channel 0 on atapci1
ata3: <ATA channel> at channel 1 on atapci1
acpi_button0: <Power Button> on acpi0
ns8250: UART FCR is broken
ns8250: UART FCR is broken
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
ns8250: UART FCR is broken
uart0: console (115200,n,8,1)
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
Timecounter "TSC" frequency 1799999856 Hz quality 1000
Timecounters tick every 1.000 msec
ugen5.1: <Intel UHCI root HUB> at usbus5
ugen1.1: <Intel UHCI root HUB> at usbus1
ugen2.1: <Intel EHCI root HUB> at usbus2
ugen0.1: <Intel UHCI root HUB> at usbus0
uhub0 on usbus5
uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus5
uhub1 on usbus0
uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
uhub2 on usbus1
ugen6.1: <Intel EHCI root HUB> at usbus6
uhub2: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
ugen4.1: <Intel UHCI root HUB> at usbus4
uhub3 on usbus6
uhub4 on usbus4
ugen3.1: <Intel UHCI root HUB> at usbus3
uhub5 on usbus2
uhub4: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus4
uhub6 on usbus3
uhub6: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus3
uhub5: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
uhub3: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus6
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Trying to mount root from zfs:zroot/ROOT/default []...
uhub1: 2 ports with 2 removable, self powered
uhub6: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub4: 2 ports with 2 removable, self powered
uhub0: 2 ports with 2 removable, self powered
Root mount waiting for: usbus2 usbus6 CAM
uhub5: 4 ports with 4 removable, self powered
ada0 at ata2 bus 0 scbus1 target 0 lun 0
ada0: <INTENSO SSD V0718B0> ACS-2 ATA SATA 3.x device
ada0: Serial Number AA000000000000002900
ada0: 300.000MB/s transfers (SATA 2.x, UDMA5, PIO 512bytes)
ada0: 114473MB (234441648 512 byte sectors)
Root mount waiting for: usbus6 ada
uhub3: 6 ports with 6 removable, self powered
Mounting filesystems...
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff804d7de7
stack pointer = 0x28:0xfffffe00846e7b20
frame pointer = 0x28:0xfffffe00846e7b40
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 30 (zpool)
rdi: fffff800038a7100 rsi: 0000000000020005 rdx: 000000000000000b
rcx: fffff80003893400 r8: 0000000000000001 r9: 0000000000000000
rax: 0000000000000000 rbx: fffff800038a7100 rbp: fffffe00846e7b40
r10: 0000000000000016 r11: fffff80017ce4c60 r12: 0000000000002000
r13: 0000000000020005 r14: fffff800038a6700 r15: fffff800038a6600
trap number = 12
panic: page fault
cpuid = 2
time = 1724503787
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00846e7810
vpanic() at vpanic+0x131/frame 0xfffffe00846e7940
panic() at panic+0x43/frame 0xfffffe00846e79a0
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00846e7a00
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00846e7a50
calltrap() at calltrap+0x8/frame 0xfffffe00846e7a50
--- trap 0xc, rip = 0xffffffff804d7de7, rsp = 0xfffffe00846e7b20, rbp = 0xfffffe00846e7b40 ---
agp_close() at agp_close+0x57/frame 0xfffffe00846e7b40
giant_close() at giant_close+0x68/frame 0xfffffe00846e7b90
devfs_close() at devfs_close+0x4b3/frame 0xfffffe00846e7c00
VOP_CLOSE_APV() at VOP_CLOSE_APV+0x1d/frame 0xfffffe00846e7c20
vn_close1() at vn_close1+0x14c/frame 0xfffffe00846e7c90
vn_closefile() at vn_closefile+0x3d/frame 0xfffffe00846e7ce0
devfs_close_f() at devfs_close_f+0x2a/frame 0xfffffe00846e7d10
_fdrop() at _fdrop+0x11/frame 0xfffffe00846e7d30
closef() at closef+0x24a/frame 0xfffffe00846e7dc0
closefp_impl() at closefp_impl+0x58/frame 0xfffffe00846e7e00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe00846e7f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00846e7f30
--- syscall (6, FreeBSD ELF64, close), rip = 0x31976572ba, rsp = 0x319f2c6d88, rbp = 0x319f2c6da0 ---
KDB: enter: panic
[ thread pid 30 tid 100257 ]
Stopped at kdb_enter+0x33: movq $0,0xfd9962(%rip)
db>
Hello,
Using set hint.agp.0.disabled=1 was enough to bring my server back to life.
Trying to run opnsense-update -zfkr 24.7.2 to force a reinstall from the snapshot returned an error:
Fetching kernel-24.7.2-amd64.txz: .............[fetch: https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/sets/kernel-24.7.2-amd64.txz.sig: Address family for host not supported] failed, no signature found
So I added the -i option, as in opnsense-update -zfikr 24.7.2, and was able to install the snapshot.
Thanks all for the help in this case.
Ok, I fail to understand how it only has serial.
agp0: <Intel Pineview SVGA controller> on vgapci0
You can use this method from 24.7 - https://forum.opnsense.org/index.php?topic=42373.msg209660#msg209660
Or perhaps fix your fat fingers and type the hints properly. Was confirmed by another user to be working.
Quote from: ftonioli on August 24, 2024, 02:51:13 PM
Using set hint.agp.0.disabled=1 was enough to bring my server back to life.
Nice. Thanks for testing.
I am working on it. Will report back. Thanks.
Michiel
Quote from: franco on August 24, 2024, 04:36:19 AM
I will replace the kernel on the mirror when someone confirmed this on one of the images with the actual hardware that the driver attaches to. ZPOOL_IMPORT_PATH has not been reverted, only agp removed from kernel so it cannot hit the bad agp_close(). If this works for the people reporting the crash we have our answer.
Quote from: doktornotor on August 24, 2024, 02:51:58 PM
Ok, I fail to understand how it only has serial.
IIRC Even APU devices have a VGA device somewhere in their chipset but do not have any port connected to that (and probably some more missing components that would drive these signal lines ... whatever).
Quote from: doktornotor on August 24, 2024, 02:53:32 PM
Quote from: ftonioli on August 24, 2024, 02:51:13 PM
Using set hint.agp.0.disabled=1 was enough to bring my server back to life.
Nice. Thanks for testing.
Disabling only agp.0 brought mine back to life as well :-)
Will continue with installing the patch.
I was worried about SERIAL and VGA, as mine doesn't have a video output, only serial.
https://www.amazon.ca/Lanner-FW-7540B-Compact-Desktop-Appliance/dp/B01D3WNJQS
Can confirm:
opnsense-update -zfkr 24.7.2
is working for me; no side effects so far.
Will continue testing. Big thanks and great work, THANK YOU!
Quote from: Patrick M. Hausen on August 24, 2024, 02:58:23 PM
IIRC Even APU devices have a VGA device somewhere in their chipset but do not have any port connected to that (and probably some more missing components that would drive these signal lines ... whatever).
Apparently there is some VGA pinout on the MB - https://www.manualsdir.com/manuals/802785/lanner-fw-7540.html?page=16&original=1
Quote from: klmi on August 24, 2024, 03:00:31 PM
I was worried about SERIAL and VGA, as mine doesn't have a video output, only serial.
It seems to have some nonsense on-board with pins, no normal connector, at least per the above linked manual. (Also wondering how is the BIOS described there used with serial console... huh.)
I think this is the way to get serial console on VGA images - but untested:
set console=comconsole
boot -v
Quote from: klmi on August 24, 2024, 03:06:32 PM
opnsense-update -zfkr 24.7.2
is working for me; no side effects so far.
8)
If someone with the affected hardware wants to follow up the broken agp driver with upstream, I created a bug report for your convenience - see Bug 281035 (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281035). ;D
Will check the progress here in a couple of years. :P
I can confirm the DVD image is fine. I am currently running OPNsense 24.7.2_1 happily.
Thanks a lot for your effort, especially in the middle of the night.
I hope you will sleep better. I am grateful.
Michiel
Quote from: franco on August 24, 2024, 04:36:19 AM
https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/misc/OPNsense-24.7.2-vga-amd64.img.bz2
https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/misc/OPNsense-24.7.2-vga-amd64.img.sig
https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/misc/OPNsense-24.7.2-dvd-amd64.iso.bz2
https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/misc/OPNsense-24.7.2-dvd-amd64.iso.sig
I will replace the kernel on the mirror when someone confirmed this on one of the images with the actual hardware that the driver attaches to. ZPOOL_IMPORT_PATH has not been reverted, only agp removed from kernel so it cannot hit the bad agp_close(). If this works for the people reporting the crash we have our answer.
Cheers,
Franco
Does this mean it's safe to use the firmware update in the GUI at this time?
Uhm, no... those are images for testing, nothing released now. If you have some vintage HW which has /dev/agpgart and running <24.7.2 and do not want to play with kernel hints on boot or potentially reinstalling, just leave it alone. If you cannot live without updating, only use the method mentioned in this post (https://forum.opnsense.org/index.php?topic=42373.msg209660#msg209660).
Quote from: doktornotor on August 24, 2024, 06:01:52 PM
Uhm, no... those are images for testing, nothing released now. If you have some vintage HW which has /dev/agpgart and running <24.7.2 and do not want to play with kernel hints on boot or potentially reinstalling, just leave it alone. If you cannot live without updating, only use the method mentioned in this post (https://forum.opnsense.org/index.php?topic=42373.msg209660#msg209660).
I tested this on a modern CPU (5700G) and it still happened. It's not about hardware age.
Not sure we are on the same page. This thread is about kernel panic with AGP graphics driver specifically.
Quote from: doktornotor on August 24, 2024, 05:00:40 PM
If someone with the affected hardware wants to follow up the broken agp driver with upstream, I created a bug report for your convenience - see Bug 281035 (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281035). ;D
Will check the progress here in a couple of years. :P
...exactly my humor! Keep up with your good work! :-P
When will BSD kick out OPNsense? :-D
@TestUserPleaseIgnore can you clarify a lot on "it still happened"
Thanks,
Franco
Uphill battle, isn't it ...
I have an Acer Veriton S680G with a Core i5 650 and 8 GB of RAM, and I had the original problem. I tested the vga iso while connected to the console via a VGA cable. In fact I just installed it and it is OK: I no longer have the panic problem.
Thanks,
Mohammad
Quote from: franco on August 24, 2024, 04:36:19 AM
If anyone misses agp they can still kldload at their own peril?
https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/misc/OPNsense-24.7.2-vga-amd64.img.bz2
https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/misc/OPNsense-24.7.2-vga-amd64.img.sig
https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/misc/OPNsense-24.7.2-dvd-amd64.iso.bz2
https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/misc/OPNsense-24.7.2-dvd-amd64.iso.sig
I will replace the kernel on the mirror when someone confirmed this on one of the images with the actual hardware that the driver attaches to. ZPOOL_IMPORT_PATH has not been reverted, only agp removed from kernel so it cannot hit the bad agp_close(). If this works for the people reporting the crash we have our answer.
Cheers,
Franco
For what it is worth...
My HP/Compaq, so I was told, does not have an AGP-slot on the mobo. (I have not checked myself.)
Yet, the original 24.7.2 broke on agp_close(), so I am figuring there are mobo's out there that seem to report they have agp, while they do not.
Something in the chipset?
Does that make any sense to anyone?
Quote from: mifi42 on August 25, 2024, 10:48:16 AM
My HP/Compaq, so I was told, does not have an AGP-slot on the mobo. (I have not checked myself.)
Does that make any sense to anyone?
Can attach to some unused on-board junk.
dmesg | grep agp
and see what it detects.
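To make that check a bit more explicit, here is a minimal sketch that filters dmesg text down to the agp attach lines. The function names are hypothetical helpers, not part of OPNsense or FreeBSD, and the pattern assumes attach lines of the form seen in this thread (agp0: <...> on vgapci0).

```shell
#!/bin/sh
# Hypothetical helpers: read dmesg text on stdin and report agp attach lines.

agp_attach_lines() {
    # An attach line looks like: agp0: <Intel Q35 SVGA controller> on vgapci0
    # sort -u collapses repeats left in the buffer by earlier boots.
    grep -E '^agp[0-9]+: <.*> on ' | sort -u
}

report_agp() {
    lines=$(agp_attach_lines)
    if [ -n "$lines" ]; then
        printf 'agp attached:\n%s\n' "$lines"
    else
        printf 'no agp device attached\n'
    fi
}

# Usage on the firewall itself: dmesg | report_agp
```

An attach line only means the driver found a matching device in the chipset; it says nothing about whether a usable video connector exists.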
Quote from: doktornotor on August 25, 2024, 10:53:52 AM
Quote from: mifi42 on August 25, 2024, 10:48:16 AM
My HP/Compaq, so I was told, does not have an AGP-slot on the mobo. (I have not checked myself.)
Does that make any sense to anyone?
Can attach to some unused on-board junk.
dmesg | grep agp
and see what it detects.
It is not clear to me if the result below means agp is in use. However, the console does work, while the test version (24.7.2_1) has agp removed from the kernel. I am clearly out of my comfort zone here.
Is it time to scrap the hardware?
dmesg | grep agp:
agp0: <Intel Q35 SVGA controller> on vgapci0
WARNING: Device "agp" is Giant locked and may be deleted before FreeBSD 15.0.
agp0: aperture size is 256M, detected 6140k stolen memory
Cheers,
Michiel
As noted, there's something on the motherboard detected by the driver. That does not mean it's usable, has any pinout, let alone VGA connector available.
agp0: <Intel Q35 SVGA controller> on vgapci0
Is it time to scrap the hardware?
Are you actually using the VGA console? I.e., do you have a monitor connected to the box itself?
Quote from: doktornotor on August 25, 2024, 12:44:26 PM
Are you actually using the VGA console? I.e., do you have a monitor connected to the box itself?
I do yes. And it is still working. The DVI-port did not work, however.
Michiel
If VGA is still working with the kernel without the AGP driver, then you don't have any other problem and there is no need to scrap anything. You can check what other video hardware it has using tools such as
pciconf -lv
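Since pciconf -lv lists every PCI device, a small filter helps. The sketch below keeps only display-class entries (PCI class 0x03xxxx). The function name is hypothetical, and it assumes the usual pciconf layout: one unindented header line per device followed by indented detail lines.

```shell
#!/bin/sh
# Sketch: keep only display-class stanzas from `pciconf -lv` output on stdin.
display_devices() {
    awk '
        {
            first = substr($0, 1, 1)
            # An unindented line starts a new device stanza; decide whether
            # to show it based on the class code in that header line.
            if (first != " " && first != "\t")
                show = ($0 ~ /class=0x03/)
        }
        show
    '
}

# Usage on the firewall: pciconf -lv | display_devices
```

If a vgapci device still shows up after agp is removed from the kernel, the console can keep working through it, which matches what was reported above.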
@franco
Sorry for the late response. It is hard to schedule a maintenance window with the family ;-)
I have done as requested and installed the dbg kernel. I have reproduced the kernel panic and have the vmcore file.
What is the preferred way of sharing the file?
Regards
Marian
Just to add another hardware config, an upgraded (CPU + memory) WatchGuard XTM 505 also kernel panics on upgrade. There is a legacy VGA header onboard. I have not done any diagnostics.
Same issue here with an old HP Elite 8000 SFF desktop, recommissioned as an awesome firewall until 24.7.2.
I have been doing regular updates for a couple of years with no problem.
Exact same problem - the whole thing crashes upgrading 24.7.1 to 24.7.2.
So annoying - I saved the config, but it needs a re-install of all the plugins etc. and is now a right royal PITA.
Do we have a fix yet?
Quote from: Penetr8or on August 25, 2024, 10:51:06 PM
Do we have a fix yet?
Helps to read the thread... Start here (https://forum.opnsense.org/index.php?topic=42373.msg209659#msg209659) and follow up with the next post.
Quote from: franco on August 22, 2024, 12:21:03 PM
Hi Robert,
I hope not. It looks like a fringe kernel issue with the OP's hardware (AGP slot in particular) that doesn't surface on FreeBSD because ZPOOL_IMPORT_PATH wasn't bootstrapped ever since FreeBSD changed ZFS implementations in version 13 so this will likely remain to go unnoticed.
We do have a debug kernel, but it requires the system to boot up first. If we can manage to get a core dump we can probably apply a bandaid and report to FreeBSD.
That being said I see no reason to revoke the ZPOOL_IMPORT_PATH. All hell would have broken loose already if it was a major problem. But even then I still don't think an environment variable should crash a user system ever.
Cheers,
Franco
Based on this - it's time to move away from the product - 100% experienced the same issue and to think it won't be fixed.....
Quote from: Penetr8or on August 25, 2024, 10:55:48 PM
Based on this - it's time to move away from the product - 100% experienced the same issue and to think it won't be fixed.....
Ok, if reading is too hard even when people link you directly to the solution...
Quote from: doktornotor on August 25, 2024, 10:53:56 PM
Quote from: Penetr8or on August 25, 2024, 10:51:06 PM
Do we have a fix yet?
Helps to read the thread... Start here (https://forum.opnsense.org/index.php?topic=42373.msg209659#msg209659) and follow up with the next post.
Sorry bro - these forums are like a bowl of spaghetti
Okay update...
I panicked more than the kernel :-[
Running on a repurposed HP Elite 8000, circa 2007: C2Duo 3 GHz, 8 GB RAM, 120 GB SSD, 4x1GbE plus onboard NIC.
Migrated from pfSense and it has been running smooth as butter for a couple of years, updating without any issues at all!
So originally the update from 24.7 to 24.7.2 hosed my system completely and I never had a bare-metal backup.
I had been downloading the config file before each update thinking a roll-back would be simple in a catastrophe. ;D
But when I couldn't get the thing to boot without the panic crash - I had no way in.
Downloaded the latest 24.7 from the website and thought, okay, a quick re-install and apply the config I backed up.
Loads of errors ofc because the plugins weren't reinstalled yet - start again lol
Eventually I made a couple of panicky comments here and in the wrong threads and got rightfully roasted, so my apologies to doktornotor and franco who, after I read more of the spaghetti, have clearly made HUGE efforts to help, and it did - thank you!!
My solution so far has been to use the download from the website of the latest vga version [I did download franco's version posted here but didn't need it]
I shelled into the CLI on the console and ran opnsense-update -zkr 24.7.2
Went back to the gui when it rebooted and did the full update - voila!!
Thanks chaps
[added:] - for anyone else -
After I did the updates -
I re-installed acme [no config] -
I then installed zenarmor [ran the initial config with defaults]
THEN - I restored the backup configs I had made religiously and it restored EVERYTHING back to normal, including all the SSL certificates and zenarmor configs
AMAZING - I love this product!!!
Quote from: Penetr8or on August 26, 2024, 12:35:11 AM
I panicked more than the kernel :-[
Good that it's working now... ;D
Ok, it seems disabling agp(4) is viable, although some doubts WRT VGA capability loss remain, at least for me. What's strange is that a cheap hardware vendor buys chipsets with AGP support for building serial appliances, but it was probably the cheapest option. ;)
According to Wikipedia "As of 2013, PCI Express has replaced AGP as the default interface for graphics cards on new systems.". I think that's the benchmark we have to apply for this wager. I'll put this on the agenda for today's developer meeting.
@mroess I don't have any immediate means to have you upload that file. Do you have a dropbox or online drive or something where you could put it?
Cheers,
Franco
Hello all, sorry I'm a bit late to the party. Had the same error as OP. Read the entire thread to help troubleshoot. Tried the "set hint.agp.0.disabled=1" and got an unknown variable error. I'm running OPNsense on a Checkpoint 2200 (more than one to be exact) and all have had this issue. Neither of these has VGA capabilities. I also have one running on a vSphere VM on a Dell R330 and it has the same issue. Attached is the info I could pull when booting from a console as a "single user" and from troubleshooting on the Checkpoint.
Ultimate fix for me:
- Download new Serial image from website and install using ZFS (same as before)
- Restore configs and access router via SSH and web GUI
- Run opnsense-update -zkr 24.7.2 from shell
- Reboot and run update from web GUI
System is back up and so far is stable.
*UPDATE*
So, my VM did not go down, only the 2 CheckPoint Serial devices. So, it does only affect the devices that are serial only. Issue with my VM is a remote administration issue, something else to fix for me..... :-[
Kernel panic and attempt to disable agp (I did try many different variations of this, not just the one below)
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff804d7de7
stack pointer = 0x28:0xfffffe007b1e3b20
frame pointer = 0x28:0xfffffe007b1e3b40
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 30 (zpool)
rdi: fffff800035d9400 rsi: 0000000000020005 rdx: 000000000000000b
rcx: fffff80003797900 r8: 0000000000000001 r9: 0000000000000000
rax: 0000000000000000 rbx: fffff800035d9400 rbp: fffffe007b1e3b40
r10: 0000000000000016 r11: fffff80005d7ec60 r12: 0000000000002000
r13: 0000000000020005 r14: fffff80003761900 r15: fffff80003761a00
trap number = 12
panic: page fault
cpuid = 0
time = 1724695959
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe007b1e3810
vpanic() at vpanic+0x131/frame 0xfffffe007b1e3940
panic() at panic+0x43/frame 0xfffffe007b1e39a0
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe007b1e3a00
trap_pfault() at trap_pfault+0x46/frame 0xfffffe007b1e3a50
calltrap() at calltrap+0x8/frame 0xfffffe007b1e3a50
--- trap 0xc, rip = 0xffffffff804d7de7, rsp = 0xfffffe007b1e3b20, rbp = 0xfffffe007b1e3b40 ---
agp_close() at agp_close+0x57/frame 0xfffffe007b1e3b40
giant_close() at giant_close+0x68/frame 0xfffffe007b1e3b90
devfs_close() at devfs_close+0x4b3/frame 0xfffffe007b1e3c00
VOP_CLOSE_APV() at VOP_CLOSE_APV+0x1d/frame 0xfffffe007b1e3c20
vn_close1() at vn_close1+0x14c/frame 0xfffffe007b1e3c90
vn_closefile() at vn_closefile+0x3d/frame 0xfffffe007b1e3ce0
devfs_close_f() at devfs_close_f+0x2a/frame 0xfffffe007b1e3d10
_fdrop() at _fdrop+0x11/frame 0xfffffe007b1e3d30
closef() at closef+0x24a/frame 0xfffffe007b1e3dc0
closefp_impl() at closefp_impl+0x58/frame 0xfffffe007b1e3e00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe007b1e3f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe007b1e3f30
--- syscall (6, FreeBSD ELF64, close), rip = 0x2123944fb2ba, rsp = 0x2123994a4d88, rbp = 0x2123994a4da0 ---
KDB: enter: panic
[ thread pid 30 tid 100220 ]
Stopped at kdb_enter+0x33: movq $0,0xfd9962(%rip)
db> set hint.agp.0.disabled=1
Unknown variable
db>
Single User boot and trying to discover what was using agp
root@:/ # dmesg | grep agp
agp0: <Intel Pineview SVGA controller> on vgapci0
WARNING: Device "agp" is Giant locked and may be deleted before FreeBSD 15.0.
agp0: aperture size is 256M, detected 8188k stolen memory
agp0: <Intel Pineview SVGA controller> on vgapci0
WARNING: Device "agp" is Giant locked and may be deleted before FreeBSD 15.0.
agp0: aperture size is 256M, detected 8188k stolen memory
agp_close() at agp_close+0x57/frame 0xfffffe007b1efb40
agp0: <Intel Pineview SVGA controller> on vgapci0
WARNING: Device "agp" is Giant locked and may be deleted before FreeBSD 15.0.
agp0: aperture size is 256M, detected 8188k stolen memory
In today's meeting we agreed to go with the disabling of the agp device in the 24.7.3 kernel.
Thanks for a further datapoint that these devices appear to be serial ones without relevant VGA capabilities.
Cheers,
Franco
My console is working fine on the alternative pci device without the AGP driver.
Quote from: franco on August 26, 2024, 10:51:45 PM
In today's meeting we agreed to go with the disabling of the agp device in the 24.7.3 kernel.
Thanks for a further datapoint that these devices appear to be serial ones without relevant VGA capabilities.
Cheers,
Franco
Quote from: doktornotor on August 24, 2024, 11:28:01 AM
#metoo
set hint.agp.0.disabled=1
set hint.agp.1.disabled=1
set hint.agp.2.disabled=1
set hint.agp.3.disabled=1
boot
can get you running from the loader shell probably (on most setups, agp.0 should be enough).
Worked for me on my test (home) setup but with the added gotcha of having to access it blind, as for some reason the (very small cheap) monitor is blank during the boot menu stage (works before and after). Very careful one-finger typing got me there eventually ;D
In case of power outage when I'm not available to restore it, is there a way to add these settings to apply every boot?
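One possible way to make the hint persistent, offered as an untested sketch: the hints are loader variables (which is also why they are entered at the loader prompt and not at the db> debugger prompt), so they can live in a loader configuration file. The path /boot/loader.conf.local is an assumption here; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: append the agp hint to a loader config file once (idempotent).
# The target path is passed in; /boot/loader.conf.local is an assumption.
add_agp_hint() {
    file=$1
    line='hint.agp.0.disabled="1"'
    # -x: whole-line match, -F: fixed string, so reruns add no duplicates.
    if ! grep -qxF "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (as root): add_agp_hint /boot/loader.conf.local
```

Once a kernel with agp removed or patched is installed, the hint is no longer needed and the line can be deleted again.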
Mark from FreeBSD provided a patch so I built a test kernel with agp reenabled and the fix in place.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281035#c8
# opnsense-update -zkr 24.7.3-agp
Just in case: to recover, reboot into kernel.old from the boot menu (if possible).
It looks like the most likely fix at the moment, though. To double-check, test whether agp is already loaded when it boots OK where the initial 24.7.2 would not:
# kldload agp
Should complain about already being loaded. :)
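A read-only variant of the same check, as a sketch: kldstat can be asked whether a module is present without attempting to load it. This assumes the driver registers under the module name agp. KLDSTAT is a hypothetical override variable used only so the function can be exercised on a machine without kldstat.

```shell
#!/bin/sh
# Sketch: succeed if the agp module is loaded or compiled into the kernel.
# KLDSTAT is a hypothetical override; by default the real kldstat is used.
agp_in_kernel() {
    # -q: quiet, only set the exit status; -m: look up by module name.
    "${KLDSTAT:-kldstat}" -q -m agp
}

# Usage on the firewall:
#   agp_in_kernel && echo "agp present" || echo "agp not present"
```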
Though we will probably keep the driver disabled by default unless it creates other problems with 23.7.3. Only one way to find out either way.
Cheers,
Franco
....unless it creates other problems with [s][i][b]23[/b][/i][/s]24.7.3.
You lost one year while writing your post. Please don't take us back to the horrible 2023 with your time travel capacities...
The secret has been revealed. ;)
Of course I mean 24.7.3.
Cheers,
Franco
sorry guys, I'm having a lot of reboots because of Kernel Panics. How can I know if it's related to this AGP issue?
Fatal trap 9: general protection fault while in kernel mode
cpuid = 15; apic id = 2e
instruction pointer = 0x20:0xffffffff810924ee
stack pointer = 0x28:0xfffffe014b0a4bd0
frame pointer = 0x28:0xfffffe014b0a4c00
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 36814 (python3.11)
rdi: fffffe001ea22b00 rsi: 000000000000000f rdx: 00000000000000ed
rcx: 2d8be74f1d661a99 r8: 000007fffffff000 r9: fffff800019c6868
rax: fffff801a04247b0 rbx: fffffe00070c2a08 rbp: fffffe014b0a4c00
r10: 0000000115915425 r11: fffff80000000000 r12: fffffe001ea22b00
r13: 0000000000000000 r14: fffff801a04247a8 r15: fffffe014b0a4c60
trap number = 9
panic: general protection fault
cpuid = 15
time = 1724877840
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe014b0a4910
vpanic() at vpanic+0x131/frame 0xfffffe014b0a4a40
panic() at panic+0x43/frame 0xfffffe014b0a4aa0
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe014b0a4b00
calltrap() at calltrap+0x8/frame 0xfffffe014b0a4b00
--- trap 0x9, rip = 0xffffffff810924ee, rsp = 0xfffffe014b0a4bd0, rbp = 0xfffffe014b0a4c00 ---
pmap_try_insert_pv_entry() at pmap_try_insert_pv_entry+0xbe/frame 0xfffffe014b0a4c00
pmap_copy() at pmap_copy+0x549/frame 0xfffffe014b0a4cb0
vmspace_fork() at vmspace_fork+0xc90/frame 0xfffffe014b0a4d30
fork1() at fork1+0x52e/frame 0xfffffe014b0a4da0
sys_fork() at sys_fork+0x54/frame 0xfffffe014b0a4e00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe014b0a4f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe014b0a4f30
--- syscall (0, FreeBSD ELF64, syscall), rip = 0x826ce51fa, rsp = 0x8204b87d8, rbp = 0x8204b8830 ---
KDB: enter: panic
panic.txt: general protection fault
version.txt: FreeBSD 14.1-RELEASE-p3 ixl_revert-n267779-6ca05616b9e9 SMP
Trying to mount root from zfs:zroot/ROOT/default []...
uhub0: 4 ports with 4 removable, self powered
uhub1: 16 ports with 16 removable, self powered
Root mount waiting for: usbus1
ugen1.2: <MediaTek Inc. WirelessDevice> at usbus1
Dual Console: Serial Primary, Video Secondary
pid 31 (zpool) is attempting to use unsafe AIO requests - not logging anymore
root@opnsense:~ # dmesg | grep agp
root@opnsense:~ #
Not my intention to mix topics, so if this has nothing to do with the AGP thing, I will open a new topic.
That one is completely unrelated to agp.
Quote from: doktornotor on August 29, 2024, 12:08:19 AM
That one is completely unrelated to agp.
I will open a new topic to not mix things here then. thanks buddy!
I tried to follow this whole thread but found it a bit confusing.
For anyone who happens to be running PCengines APU2-series boards, all 3 of my APU2-based opnsense boxes, running ZFS, applied the 24.7.2 update without any problems. No kernel panics or other weirdness.
@furfix
I'm suspecting a bad zpool in your install that is now found since zpool import -Na does what it should.
Cheers,
Franco
Quote from: franco on August 29, 2024, 07:34:10 AM
I'm suspecting a bad zpool in your install that is now found since zpool import -Na does what it should.
What's the best way to check this and hopefully fixing it before attempting an upgrade?
A damaged zpool crashing the kernel? That would be unpredictable and best course of action is probably a reinstall.
FWIW, I'm only speculating on a forum, going by the information given at the time.
Cheers,
Franco
I will have a try today. But first I will save and export my current config... ;)
Quote from: franco on August 28, 2024, 05:39:11 PM
Mark from FreeBSD provided a patch so I built a test kernel with agp reenabled and the fix in place.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281035#c8
# opnsense-update -zkr 24.7.3-agp
Just in case: to recover, reboot into kernel.old from the boot menu (if possible).
It looks like the most likely fix at the moment, though. To double-check, test whether agp is already loaded when it boots OK where the initial 24.7.2 would not:
# kldload agp
Should complain about already being loaded. :)
Though we will probably keep the driver disabled by default unless it creates other problems with 23.7.3. Only one way to find out either way.
Cheers,
Franco
On
opnsense-update -zkr 24.7.3-agp
No signature found.
opnsense-update -zfikr 24.7.3-agp
No update found.
Did you remove the kernel already?
Cheers,
Michiel
No it's here: https://pkg.opnsense.org/FreeBSD:14:amd64/snapshots/sets/
Do you have a mirror selected that is not synced?
Cheers,
Franco
Meh, dumb, I called it 23.7.3-agp
Rebooting as we speak.
Wait for it...
wait for it...
Yes, it booted. The patch seems to work.
Quote from: franco on August 29, 2024, 11:31:05 AM
Meh, dumb, I called it 23.7.3-agp
On the dashboard: FreeBSD 14.1-RELEASE-p3
# kldload agp
module already loaded or in the kernel
Correct.
Michiel
Quote from: mifi42 on August 29, 2024, 11:35:42 AM
Rebooting as we speak.
Wait for it...
wait for it...
Yes, it booted. The patch seems to work.
Quote from: franco on August 29, 2024, 11:31:05 AM
Meh, dumb, I called it 23.7.3-agp
Thanks a lot, I'll let FreeBSD know to commit the patch.
Cheers,
Franco
Quote from: franco on August 28, 2024, 09:17:07 PM
The secret has been revealed. ;)
Of course I mean 24.7.3.
Cheers,
Franco
Hi All,
Update below...
Hardware: Watchguard XTM505. 3GB DDR2 using single SATA Samsung 870 EVO 256GB SSD.
I clean installed 24.7 using serial image and chose ZFS. Configured for connectivity only. Upgraded to 24.7.3 using web GUI without any issues and rebooted normally.
Happy to test anything if anyone needs anything additional
Cheers and thank you to Franco, and the OPNsense team.
Hi All,
the update from 24.7.2 with the ZFS patch removed to 24.7.3 went smoothly. My box boots up normally.
I have verified that the ZPOOL_IMPORT_PATH is in the rc file again.
So the Topic is solved for me.
Regards
Marian
Thanks for confirming! The actual fix is in FreeBSD now:
https://cgit.freebsd.org/src/commit/?id=12500c1428
Cheers,
Franco
Hi,
I can confirm that the update from 24.7 to 24.7.3_1 on Lanner FW-7568 went smoothly.
Thank you!
Confirming that the upgrade to 24.7.3 worked just fine for me as well. Thanks for the fix to everyone involved.