Hi, I installed the latest OPNsense image to my Watchguard XTM 5. This worked fine. Then I applied the updates.
Since then I get a kernel panic. I can still boot the machine by selecting the previous kernel in the bootloader.
The working kernel is:
FreeBSD 10.2-RELEASE-p14 #0 b8ff7a2(stable/16.1): Sun Mar 20 09:38:35 CET 2016
The broken kernel writes this:
Booting...
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2015 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.2-RELEASE-p18 #0 f5a1b2f(stable/16.1): Wed Jun 1 07:38:06 CEST 2016
root@sensey32:/usr/obj/usr/src/sys/SMP i386
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
kernel trap 12 with interrupts disabled
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xf00001af
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0cece69
stack pointer = 0x28:0xc2420d0c
frame pointer = 0x28:0xc2420d0c
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 0 ()
[ thread pid 0 tid 0 ]
Stopped at soreceive+0x9: movl 0x30(%eax),%eax
What are the changes between those two versions? Any idea how to solve this?
Thx in advance...
Hi there,
Not nice... You should be able to boot the old kernel from the boot menu: use option 5 to switch to "kernel.old", then option 1 to boot.
Once it has booted, do not update the system, as that would overwrite the good kernel.
Can you give us a "bt" command output from the ddb prompt of the bad kernel?
The biggest change was ASLR, which should run its init shortly after the point where this stops. I'll have Shawn look at this.
Cheers,
Franco
Here is another dump with bt at the end. The error is related to clnt_dg_soupcall. If you need more info, I'll try to provide it.
FreeBSD 10.2-RELEASE-p18 #0 f5a1b2f(stable/16.1): Wed Jun 1 07:38:06 CEST 2016
root@sensey32:/usr/obj/usr/src/sys/SMP i386
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
kernel trap 12 with interrupts disabled
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xf00001af
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0cece69
stack pointer = 0x28:0xc2420d0c
frame pointer = 0x28:0xc2420d0c
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 0 ()
[ thread pid 0 tid 0 ]
Stopped at soreceive+0x9: movl 0x30(%eax),%eax
db> bt
Tracing pid 0 tid 0 td 0xc1e8f160
soreceive(0,0,c2420d44,c2420d40,c2420d3c,...) at soreceive+0x9/frame 0xc2420d0c
clnt_dg_soupcall() at clnt_dg_soupcall+0xbd/frame 0xc2420d38
begin() at begin+0x22
db> reboot
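A note on reading the trap above: the faulting instruction is `movl 0x30(%eax),%eax`, so the fault virtual address should be the value held in %eax plus the displacement 0x30. A minimal shell sketch of that subtraction, using the numbers from the dump (the function name `base_register` is made up for illustration):

```shell
# Hypothetical helper: recover the base register value from a page fault.
# $1 = fault virtual address, $2 = displacement in the faulting instruction
base_register() {
    printf '0x%x\n' "$(( $1 - $2 ))"
}

# Values taken from the panic output: fault address 0xf00001af, displacement 0x30
base_register 0xf00001af 0x30
```

The backtrace shows soreceive() being called with 0 as its first argument, so the value in %eax was probably loaded through an invalid socket pointer. This is one reading of the trace, not a confirmed diagnosis.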
Hey bugbuster,
Can you tell us a bit more about your setup? Can you tell us if Suricata is enabled and is in IPS mode? What other services are set up? At what point does the kernel panic happen? Before even init runs?
Thanks,
Shawn
Hi lattera,
This happens just 1 second after the kernel starts booting, way before init runs. This is a fresh install. We have some interfaces with DHCP and port forwarding rules enabled.
Intrusion detection is disabled but I don't think that really matters.
Best regards,
Stephan
At the boot loader, could you escape to the loader prompt and then use these commands:
set hardening.pax.aslr.status=0
boot
Does it boot fine for you after doing that?
Disabling ASLR does not change the result. Have there been changes to the compiler? Different flags or a different version?
______ _____ _____
/ __ |/ ___ |/ __ |
| | | | |__/ | | | |___ ___ _ __ ___ ___
| | | | ___/| | | / __|/ _ \ '_ \/ __|/ _ \
| |__| | | | | | \__ \ __/ | | \__ \ __/
|_____/|_| |_| /__|___/\___|_| |_|___/\___|
+============Welcome to OPNsense==========+ @@@@@@@@@@@@@@@@@@@@@@@@@@@@
| | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
| 1. Boot Multi User [Enter] | @@@@@ @@@@@
| 2. Boot [S]ingle User | @@@@@ @@@@@
| 3. [Esc]ape to loader prompt | @@@@@@@@@@@ @@@@@@@@@@@
| 4. Reboot | \\\\\ /////
| | )))))))))))) (((((((((((
| Options: | ///// \\\\\
| 5. [K]ernel: kernel (1 of 2) | @@@@@@@@@@@ @@@@@@@@@@@
| 6. Configure Boot [O]ptions... | @@@@@ @@@@@
| | @@@@@ @@@@@
| | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
| | @@@@@@@@@@@@@@@@@@@@@@@@@@@@
+=========================================+
16.1 ``Crafty Coyote''
-
To get back to the menu, type `menu' and press ENTER
or type `boot' and press ENTER to start FreeBSD.
Type '?' for a list of commands, 'help' for more detailed help.
OK 3
OK set hardening.pax.aslr.status=0
OK boot
/boot/kernel/kernel text=0x11a4a1f data=0x785f48+0x190408 syms=[0x4+0xf7e50+0x4+0x18e08c]
Booting...
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2015 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.2-RELEASE-p18 #0 f5a1b2f(stable/16.1): Wed Jun 1 07:38:06 CEST 2016
root@sensey32:/usr/obj/usr/src/sys/SMP i386
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
kernel trap 12 with interrupts disabled
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xf00001af
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0cece69
stack pointer = 0x28:0xc2420d0c
frame pointer = 0x28:0xc2420d0c
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 0 ()
[ thread pid 0 tid 0 ]
Stopped at soreceive+0x9: movl 0x30(%eax),%eax
db> bt
Tracing pid 0 tid 0 td 0xc1e8f160
soreceive(0,0,c2420d44,c2420d40,c2420d3c,...) at soreceive+0x9/frame 0xc2420d0c
clnt_dg_soupcall() at clnt_dg_soupcall+0xbd/frame 0xc2420d38
begin() at begin+0x22
db>
Given that turning off ASLR didn't change the resulting kernel panic, I don't think ASLR was the issue. I'll let Franco take it from here.
Sorry, busy times for me elsewhere.
We were thinking the kernel may just be damaged and it would need a fresh replacement so...
# cd /tmp
# fetch https://pkg.opnsense.org/sets/kernel-16.1.16-i386.txz
# rm -r /boot/kernel
# tar -C / -xf kernel-16.1.16-i386.txz
# /usr/local/etc/rc.reboot
If that fails, we can try the development kernel for 10.3.
kernel.old will still be there as a fallback.
Cheers,
Franco
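The replacement steps above remove /boot/kernel before unpacking the downloaded set. A slightly more defensive variant, sketched here under assumptions, only removes the old directory once the archive can be listed in full. The function name `replace_kernel` and its optional root argument are hypothetical and not part of OPNsense; the layout of the archive is assumed to match what `tar -C /` in the original commands implies.

```shell
# Hypothetical helper: replace boot/kernel under a root directory, but only
# if the archive can be read completely. Not an OPNsense tool.
# $1 = path to the kernel set archive, $2 = root directory (default /)
replace_kernel() {
    archive=$1
    root=${2:-/}
    # Listing the archive fails on a truncated or corrupt download
    if ! tar -tf "$archive" >/dev/null 2>&1; then
        echo "archive $archive is unreadable, leaving the kernel untouched" >&2
        return 1
    fi
    rm -rf "$root/boot/kernel"
    tar -C "$root" -xf "$archive"
}
```

After fetching the set into /tmp as shown above, it would be called as `replace_kernel /tmp/kernel-16.1.16-i386.txz /`, followed by the same reboot command.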
I did what you suggested.
These are the results.
/boot/kernel/kernel text=0x11a4a1f data=0x785f48+0x190408 syms=[0x4+0xf7e50+0x4+0x18e08c]
Booting...
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2015 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.2-RELEASE-p18 #0 f5a1b2f(stable/16.1): Wed Jun 1 07:38:06 CEST 2016
root@sensey32:/usr/obj/usr/src/sys/SMP i386
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
kernel trap 12 with interrupts disabled
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x6d4bed18
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0fe1b93
stack pointer = 0x28:0xc2420cd4
frame pointer = 0x28:0xab7d61e4
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 0 ()
[ thread pid 0 tid 0 ]
Stopped at xdrmem_putlong_aligned+0x23: orb 0(%edi,%ebp,1),%bh
db>
Where do I find the 10.3 kernel?
I'm going to test yesterday's update first:
https://pkg.opnsense.org/sets/kernel-16.1.18-i386.txz
The kernel from
https://pkg.opnsense.org/sets/kernel-16.1.18-i386.txz
is booting. I did not find release notes for that release. Where do I find the changes?
It's not released yet... this is funky... there have been very few changes, namely:
o src: tzdata updated to 2016e
o src: fix pf fragment timeout
None of which would matter here.