OPNsense Forum

Archive => 21.7 Legacy Series => Topic started by: jbattermann on December 04, 2021, 07:12:26 pm

Title: Sporadic & endless (re)boot loop - Anyone has an idea what might be going on?
Post by: jbattermann on December 04, 2021, 07:12:26 pm
Good afternoon,

My OPNsense system (21.7, currently 21.7.6) has a recurring issue: on some reboots it falls into an endless reboot cycle, panicking shortly into the boot sequence.

The system itself is stable for weeks on end while it is up, but when I or an update triggers a reboot, it sometimes (not always) ends up in a never-ending boot > crash > reboot > crash loop, and the only thing that seems to help is physically powering the system off and back on.

I have a hard time pinpointing the cause (and I know that without configs etc. it may be difficult for others as well), but maybe someone can interpret the System > Firmware > Reporter logs and see what might be going on?

dmesg.boot:
Code:
---<<BOOT>>---
Copyright (c) 2013-2019 The HardenedBSD Project.
Copyright (c) 1992-2019 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 12.1-RELEASE-p21-HBSD #0  1c99b63a2ba(stable/21.7)-dirty: Wed Nov 10 11:17:14 CET 2021
    root@sensey:/usr/obj/usr/src/amd64.amd64/sys/SMP amd64
FreeBSD clang version 8.0.1 (tags/RELEASE_801/final 366581) (based on LLVM 8.0.1)
VT(efifb): resolution 800x600
HardenedBSD: initialize and check features (__HardenedBSD_version 1200059 __FreeBSD_version 1201000).
CPU: Intel(R) Xeon(R) E-2246G CPU @ 3.60GHz (3600.21-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x906ea  Family=0x6  Model=0x9e  Stepping=10
  Features=0xbfebfbff
  Features2=0x7ffafbff
  AMD Features=0x2c100800
  AMD Features2=0x121
  Structured Extended Features=0x29c6fbb
  Structured Extended Features3=0x9c002600
  XSAVE Features=0xf
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
real memory  = 68719476736 (65536 MB)
avail memory = 66566623232 (63482 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 6 CPUs
FreeBSD/SMP: 1 package(s) x 6 core(s) x 2 hardware threads
FreeBSD/SMP Online: 1 package(s) x 6 core(s)
random: unblocking device.
ioapic0  irqs 0-119 on motherboard
Launching APs: 1 2 3 5 4
Timecounter "TSC-low" frequency 1800104924 Hz quality 1000
wlan: mac acl policy registered
random: entropy device external interface
kbd1 at kbdmux0
module_register_init: MOD_LOAD (vesa, 0xffffffff812947f0, 0) error 19
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
000.000056 [4344] netmap_init               netmap: loaded module
[ath_hal] loaded
nexus0
efirtc0:  on motherboard
efirtc0: registered as a time-of-day clock, resolution 1.000000s
cryptosoft0:  on motherboard
acpi0:  on motherboard
acpi0: Power Button (fixed)
cpu0:  on acpi0
hpet0:  iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 24000000 Hz quality 950
Event timer "HPET" frequency 24000000 Hz quality 550
Event timer "HPET1" frequency 24000000 Hz quality 440
Event timer "HPET2" frequency 24000000 Hz quality 440
attimer0:  port 0x40-0x43,0x50-0x53 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1808-0x180b on acpi0
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib1:  irq 16 at device 1.0 on pci0
pci1:  on pcib1
pcib2:  mem 0x81500000-0x8151ffff irq 16 at device 0.0 on pci1
pci2:  on pcib2
pcib3:  irq 16 at device 0.0 on pci2
pci3:  on pcib3
pci3:  at device 0.0 (no driver attached)
pcib4:  irq 17 at device 1.0 on pci2
pci4:  on pcib4
pci4:  at device 0.0 (no driver attached)
pcib5:  irq 18 at device 2.0 on pci2
pci5:  on pcib5
pci5:  at device 0.0 (no driver attached)
pcib6:  irq 16 at device 1.1 on pci0
pci6:  on pcib6
ixl0:  mem 0x4010800000-0x4010ffffff,0x4011008000-0x401100ffff irq 17 at device 0.0 on pci6
ixl0: fw 8.4.66032 api 1.14 nvm 8.40 etid 8000aba4 oem 1.267.0
ixl0: The driver for the device detected a newer version of the NVM image than expected.
ixl0: Please install the most recent version of the network driver.
ixl0: PF-ID[0]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
ixl0: Using 1024 TX descriptors and 1024 RX descriptors
ixl0: Using 6 RX queues 6 TX queues
ixl0: Using MSI-X interrupts with 7 vectors
ixl0: Ethernet address: 3c:fd:fe:9f:62:4c
ixl0: Allocating 8 queues for PF LAN VSI; 6 queues active
ixl0: PCI Express Bus: Speed 8.0GT/s Width x8
ixl0: SR-IOV ready
ixl0: netmap queues/slots: TX 6/1024, RX 6/1024
ixl1:  mem 0x4010000000-0x40107fffff,0x4011000000-0x4011007fff irq 17 at device 0.1 on pci6
ixl1: fw 8.4.66032 api 1.14 nvm 8.40 etid 8000aba4 oem 1.267.0
ixl1: The driver for the device detected a newer version of the NVM image than expected.
ixl1: Please install the most recent version of the network driver.
ixl1: PF-ID[1]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
ixl1: Using 1024 TX descriptors and 1024 RX descriptors
ixl1: Using 6 RX queues 6 TX queues
ixl1: Using MSI-X interrupts with 7 vectors
ixl1: Ethernet address: 3c:fd:fe:9f:62:4d
ixl1: Allocating 8 queues for PF LAN VSI; 6 queues active
ixl1: PCI Express Bus: Speed 8.0GT/s Width x8
ixl1: SR-IOV ready
ixl1: netmap queues/slots: TX 6/1024, RX 6/1024
vgapci0:  port 0x4000-0x403f mem 0x4012000000-0x4012ffffff,0x4000000000-0x400fffffff irq 16 at device 2.0 on pci0
vgapci0: Boot video device
xhci0:  mem 0x4013000000-0x401300ffff irq 16 at device 20.0 on pci0
xhci0: 32 bytes context size, 64-bit DMA
usbus0 on xhci0
usbus0: 5.0Gbps Super Speed USB v3.0
pci0:  at device 20.2 (no driver attached)
sdhci_pci0:  mem 0x4013019000-0x4013019fff irq 19 at device 20.5 on pci0
sdhci_pci0: 1 slot(s) allocated
pci0:  at device 21.0 (no driver attached)
pci0:  at device 21.1 (no driver attached)
pci0:  at device 22.0 (no driver attached)
pci0:  at device 22.1 (no driver attached)
pci0:  at device 22.4 (no driver attached)
ahci0:  port 0x4090-0x4097,0x4080-0x4083,0x4060-0x407f mem 0x81900000-0x81901fff,0x81903000-0x819030ff,0x81902000-0x819027ff irq 16 at device 23.0 on pci0
ahci0: AHCI v1.31 with 8 6Gbps ports, Port Multiplier not supported
ahcich0:  at channel 0 on ahci0
ahcich1:  at channel 1 on ahci0
ahcich2:  at channel 2 on ahci0
ahcich3:  at channel 3 on ahci0
ahcich4:  at channel 4 on ahci0
ahcich5:  at channel 5 on ahci0
ahcich6:  at channel 6 on ahci0
ahcich7:  at channel 7 on ahci0
ahciem0:  at channel 2147483647 on ahci0
device_attach: ahciem0 attach returned 6
pcib7:  irq 16 at device 27.0 on pci0
pci7:  on pcib7
pcib8:  irq 16 at device 27.4 on pci0
pci8:  on pcib8
ix0:  mem 0x4011800000-0x4011bfffff,0x4011c04000-0x4011c07fff irq 16 at device 0.0 on pci8
ix0: Using 2048 TX descriptors and 2048 RX descriptors
ix0: Using 6 RX queues 6 TX queues
ix0: Using MSI-X interrupts with 7 vectors
ix0: allocated for 6 queues
ix0: allocated for 6 rx queues
ix0: Ethernet address: d0:50:99:d9:f0:97
ix0: PCI Express Bus: Speed 8.0GT/s Width x4
ix0: netmap queues/slots: TX 6/2048, RX 6/2048
ix1:  mem 0x4011400000-0x40117fffff,0x4011c00000-0x4011c03fff irq 17 at device 0.1 on pci8
ixl1: Link is up, 10 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: False, Flow Control: None
ixl1: link state changed to UP
ix1: Using 2048 TX descriptors and 2048 RX descriptors
ix1: Using 6 RX queues 6 TX queues
ix1: Using MSI-X interrupts with 7 vectors
ix1: allocated for 6 queues
ix1: allocated for 6 rx queues
ix1: Ethernet address: d0:50:99:d9:f0:96
ix1: PCI Express Bus: Speed 8.0GT/s Width x4
ix1: netmap queues/slots: TX 6/2048, RX 6/2048
pcib9:  irq 16 at device 28.0 on pci0
pci9:  on pcib9
pcib10:  irq 16 at device 0.0 on pci9
pci10:  on pcib10
vgapci1:  port 0x3000-0x307f mem 0x80000000-0x80ffffff,0x81000000-0x8101ffff irq 16 at device 0.0 on pci10
pcib11:  irq 16 at device 29.0 on pci0
pci11:  on pcib11
nvme0:  mem 0x81600000-0x81603fff irq 16 at device 0.0 on pci11
ixl0: Link is up, 10 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: None, Autoneg: False, Flow Control: None
ixl0: link state changed to UP
pci0:  at device 30.0 (no driver attached)
isab0:  at device 31.0 on pci0
isa0:  on isab0
pci0:  at device 31.5 (no driver attached)
acpi_button0:  on acpi0
acpi_tz0:  on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
acpi_syscontainer0:  on acpi0
orm0:  at iomem 0xc0000-0xc7fff pnpid ORM0000 on isa0
atrtc0:  at port 0x70 irq 8 on isa0
atrtc0: Warning: Couldn't map I/O.
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
est0:  on cpu0
Timecounters tick every 1.000 msec
ugen0.1: <0x8086 XHCI root HUB> at usbus0
uhub0: <0x8086 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
nvd0:  NVMe namespace
nvd0: 244198MB (500118192 512 byte sectors)
Trying to mount root from ufs:/dev/gpt/rootfs [rw]...
WARNING: /mnt was not properly dismounted
WARNING: /mnt: mount pending error: blocks 0 files 1
WARNING: /mnt: reload pending error: blocks 0 files 1

/var/crash/info.0:
Code:
Dump header from device: /dev/gpt/swapfs
  Architecture: amd64
  Architecture Version: 4
  Dump Length: 74752
  Blocksize: 512
  Compression: none
  Dumptime: Sat Dec  4 11:47:41 2021
  Hostname:
  Magic: FreeBSD Text Dump
  Version String: FreeBSD 12.1-RELEASE-p21-HBSD #0  1c99b63a2ba(stable/21.7)-dirty: Wed Nov 10 11:17:14 CET 2021
    root@sensey:/usr/obj/usr/src/amd64.amd64/sys/SMP
  Panic String: page fault
  Dump Parity: 3066618741
  Bounds: 0
  Dump Status: good

/var/crash/textdump.tar.0: See https://pastebin.com/407qeQMu
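For anyone triaging several of these dumps, the info.N header can be read programmatically. The following is only a minimal sketch: the function name parse_dump_header is made up for illustration, the sample text is a shortened copy of the info.0 above, and it assumes continuation lines are indented by four or more spaces as in that paste.

```python
# Shortened copy of the info.0 header quoted in this post.
SAMPLE = """\
Dump header from device: /dev/gpt/swapfs
  Architecture: amd64
  Dumptime: Sat Dec  4 11:47:41 2021
  Magic: FreeBSD Text Dump
  Version String: FreeBSD 12.1-RELEASE-p21-HBSD #0  1c99b63a2ba(stable/21.7)-dirty: Wed Nov 10 11:17:14 CET 2021
    root@sensey:/usr/obj/usr/src/amd64.amd64/sys/SMP
  Panic String: page fault
  Dump Status: good
"""


def parse_dump_header(text):
    """Turn 'Key: value' lines of an info.N file into a dict.

    Hypothetical helper: lines indented by 4+ spaces are treated as a
    continuation of the previous value (as with 'Version String' above).
    """
    fields = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        if indent >= 4 and last_key is not None:
            fields[last_key] += " " + line.strip()
            continue
        # Split on the first colon only; values may contain colons.
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        last_key = key.strip()
        fields[last_key] = value.strip()
    return fields


fields = parse_dump_header(SAMPLE)
print(fields["Panic String"])  # page fault
```

On a live system the same function could be fed the contents of /var/crash/info.0 instead of the sample string.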
Title: Re: Sporadic & endless (re)boot loop - Anyone has an idea what might be going on?
Post by: cookiemonster on December 04, 2021, 10:13:26 pm
Hi. Most kernel panics are caused by faulty hardware or hardware with buggy firmware. Pinpointing the culprit by elimination can take a while. As a first step I would reinstall with root on ZFS to rule out the UFS quirks in these scenarios.
Title: Re: Sporadic & endless (re)boot loop - Anyone has an idea what might be going on?
Post by: franco on December 06, 2021, 09:16:02 am
The urndis_ctrl_handle() backtrace suggests an issue with the attached USB networking device [1]. I think you would hit this issue on any OPNsense version.
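A rough way to pull the function names out of such a backtrace is sketched below. It rests on two assumptions that should be checked against the actual dump: that the textdump archive contains a member named ddb.txt, and that backtrace frames look like `name() at name+0x.../frame 0x...`. The helper names are made up for illustration.

```python
import re
import tarfile

# Assumed DDB frame shape: "some_function() at some_function+0x1b/frame 0x..."
FRAME_RE = re.compile(r"^(\w+)\(\) at ", re.MULTILINE)


def backtrace_functions(ddb_text):
    """Return function names from backtrace frames, in order of appearance."""
    return FRAME_RE.findall(ddb_text)


def read_ddb_text(tar_path, member="ddb.txt"):
    """Read one member of a textdump tar as text (member name is an assumption)."""
    with tarfile.open(tar_path) as tar:
        handle = tar.extractfile(member)  # raises KeyError if the member is missing
        if handle is None:
            raise ValueError(f"{member} is not a regular file in {tar_path}")
        return handle.read().decode("utf-8", errors="replace")
```

Something like `backtrace_functions(read_ddb_text("/var/crash/textdump.tar.0"))` would then list the frames, and any name starting with `urndis_` points at the USB network driver.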

USB devices can work reliably on FreeBSD, but only a subset of the available ones actually do. Instead of hunting for these, it might be better to look for PCI(e) alternatives or to switch the hardware altogether for a setup with more hardwired network ports (Intel is mostly good except for the ixgbe 500 series).


Cheers,
Franco

[1] https://www.freebsd.org/cgi/man.cgi?query=urndis&sektion=4
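
To check which network interfaces a box reports at boot, the "Ethernet address" lines in dmesg.boot can be grouped by driver name. This is just a sketch: nics_by_driver is a made-up helper, the sample lines are copied from the dmesg.boot in the first post, and the remark about USB Ethernet usually showing up as a `ue` interface is an assumption to verify on the system in question.

```python
import re

# Lines copied from the dmesg.boot quoted in the first post.
SAMPLE = """\
ixl0: Ethernet address: 3c:fd:fe:9f:62:4c
ixl1: Ethernet address: 3c:fd:fe:9f:62:4d
ix0: Ethernet address: d0:50:99:d9:f0:97
ix1: Ethernet address: d0:50:99:d9:f0:96
"""

# Driver name, unit number, MAC address.
NIC_RE = re.compile(
    r"^(\w+?)(\d+): Ethernet address: ([0-9a-f:]{17})$", re.MULTILINE
)


def nics_by_driver(dmesg_text):
    """Group (interface, MAC) pairs by driver name, e.g. 'ixl' or 'ix'."""
    drivers = {}
    for driver, unit, mac in NIC_RE.findall(dmesg_text):
        drivers.setdefault(driver, []).append((f"{driver}{unit}", mac))
    return drivers


drivers = nics_by_driver(SAMPLE)
print(sorted(drivers))  # ['ix', 'ixl']
# Assumption: a USB Ethernet adapter would normally appear under 'ue'.
print("ue" in drivers)  # False
```

In the quoted boot log only the PCIe ixl and ix ports appear, so a USB network device was presumably attached later in the boot or was not captured in that particular log.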