1
21.1 Legacy Series / Lots of Hyper-V SR-IOV Mellanox boot errors after upgrading to 21.1.8
« on: July 09, 2021, 06:16:05 am »
Running OPNsense in a Hyper-V VM with five SR-IOV enabled network interfaces.
mlx4en_load=YES is set in Tunables.
With 21.1.7 everything was working fine.
After upgrading to 21.1.8 I now see lots and lots of Mellanox Hyper-V errors and boot takes several minutes.
mlx4en_load=YES is set in Tunables.
With 21.1.7 everything was working fine.
After upgrading to 21.1.8 I now see lots and lots of Mellanox Hyper-V errors and boot takes several minutes.
Code: [Select]
Copyright (c) 2013-2019 The HardenedBSD Project.
Copyright (c) 1992-2019 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 12.1-RELEASE-p19-HBSD #0 0c59842367b(stable/21.1)-dirty: Mon Jul 5 15:08:43 CEST 2021
root@sensey:/usr/obj/usr/src/amd64.amd64/sys/SMP amd64
FreeBSD clang version 8.0.1 (tags/RELEASE_801/final 366581) (based on LLVM 8.0.1)
SRAT: Ignoring memory at addr 0x108000000
SRAT: Ignoring memory at addr 0x1000000000
SRAT: Ignoring memory at addr 0x10000000000
SRAT: Ignoring memory at addr 0x20000000000
SRAT: Ignoring memory at addr 0x40000000000
SRAT: Ignoring memory at addr 0x80000000000
VT(efifb): resolution 1024x768
Hyper-V Version: 10.0.17763 [SP2]
Features=0x2e7f<VPRUNTIME,TMREFCNT,SYNIC,SYNTM,APIC,HYPERCALL,VPINDEX,REFTSC,IDLE,TMFREQ>
PM Features=0x0 [C2]
Features3=0xbed7b2<DEBUG,XMMHC,IDLE,NUMA,TMFREQ,SYNCMC,CRASH,NPIEP>
Timecounter "Hyper-V" frequency 10000000 Hz quality 2000
HardenedBSD: initialize and check features (__HardenedBSD_version 1200059 __FreeBSD_version 1201000).
CPU: AMD Ryzen 7 2700 Eight-Core Processor (3194.10-MHz K8-class CPU)
Origin="AuthenticAMD" Id=0x800f82 Family=0x17 Model=0x8 Stepping=2
Features=0x1783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,HTT>
Features2=0xfed83203<SSE3,PCLMULQDQ,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
AMD Features2=0x4003f3<LAHF,CMP,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,Topology>
Structured Extended Features=0x209c01a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA>
XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
AMD Extended Feature Extensions ID EBX=0x2001004<XSaveErPtr>
Hypervisor: Origin = "Microsoft Hv"
real memory = 4294967296 (4096 MB)
avail memory = 4118040576 (3927 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <VRTUAL MICROSFT>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s) x 2 hardware threads
random: unblocking device.
ioapic0 <Version 1.1> irqs 0-23 on motherboard
Launching APs: 1 2 3
random: entropy device external interface
wlan: mac acl policy registered
Timecounter "Hyper-V-TSC" frequency 10000000 Hz quality 3000
kbd0 at kbdmux0
module_register_init: MOD_LOAD (vesa, 0xffffffff8128e7f0, 0) error 19
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
000.000055 [4336] netmap_init netmap: loaded module
[ath_hal] loaded
nexus0
efirtc0: <EFI Realtime Clock> on motherboard
efirtc0: registered as a time-of-day clock, resolution 1.000000s
cryptosoft0: <software crypto> on motherboard
acpi0: <VRTUAL MICROSFT> on motherboard
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_syscontainer0: <System Container> on acpi0
vmbus0: <Hyper-V Vmbus> on acpi_syscontainer0
vmbus_res0: <Hyper-V Vmbus Resource> irq 5 on acpi0
Timecounters tick every 10.000 msec
usb_needs_explore_all: no devclass
vmbus0: version 3.0
hvet0: <Hyper-V event timer> on vmbus0
Event timer "Hyper-V" frequency 10000000 Hz quality 1000
hvkbd0: <Hyper-V KBD> on vmbus0
hvheartbeat0: <Hyper-V Heartbeat> on vmbus0
hvkvp0: <Hyper-V KVP> on vmbus0
hvshutdown0: <Hyper-V Shutdown> on vmbus0
hvtimesync0: <Hyper-V Timesync> on vmbus0
hvtimesync0: RTT
hvvss0: <Hyper-V VSS> on vmbus0
storvsc0: <Hyper-V SCSI> on vmbus0
da0 at storvsc0 bus 0 scbus0 target 0 lun 0
da0: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
da0: 300.000MB/s transfers
da0: Command Queueing enabled
da0: 65536MB (134217728 512 byte sectors)
hn0: <Hyper-V Network Interface> on vmbus0
cd0 at storvsc0 bus 0 scbus0 target 0 lun 1
cd0: <Msft Virtual DVD-ROM 1.0> Removable CD-ROM SPC-3 SCSI device
cd0: 300.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
hn0: got notify, nvs type 128
hn0: Ethernet address: XX:XX:XX:bf:a3:0c
hn0: link state changed to UP
hn1: <Hyper-V Network Interface> on vmbus0
hn1: got notify, nvs type 128
hn1: Ethernet address: XX:XX:XX:bf:a3:0d
hn1: link state changed to UP
hn2: <Hyper-V Network Interface> on vmbus0
hn2: got notify, nvs type 128
hn2: Ethernet address: XX:XX:XX:bf:a3:0e
hn2: link state changed to UP
hn3: <Hyper-V Network Interface> on vmbus0
hn3: got notify, nvs type 128
hn3: Ethernet address: XX:XX:XX:bf:a3:0f
hn3: link state changed to UP
hn4: <Hyper-V Network Interface> on vmbus0
hn4: got notify, nvs type 128
hn4: Ethernet address: XX:XX:XX:bf:a3:10
hn4: link state changed to UP
pcib0: <Hyper-V PCI Express Pass Through> on vmbus0
Trying to mount root from ufs:/dev/gpt/rootfs [rw]...
pci0: <PCI bus> on pcib0
mlx4_core0: <mlx4_core> at device 2.0 on pci0
mlx4_core: Mellanox ConnectX core driver v3.5.1 (April 2019)
mlx4_core: Initializing mlx4_core
mlx4_core0: Detected virtual function - running in slave mode
mlx4_core0: Sending reset
mlx4_core0: Sending vhcr0
mlx4_core0: HCA minimum page size:512
mlx4_core0: Timestamping is not supported in slave mode
mlx4_en mlx4_core0: Activating port:1
mlxen0: Ethernet address: XX:XX:XX:bf:a3:0c
mlx4_en: mlx4_core0: Port 1: Using 4 TX rings
mlx4_en: mlx4_core0: Port 1: Using 4 RX rings
mlxen0: link state changed to DOWN
mlx4_en: mlxen0: Using 4 TX rings
hn0: link state changed to DOWN
mlx4_en: mlxen0: Using 4 RX rings
mlx4_en: mlxen0: Initializing port
pcib1: <Hyper-V PCI Express Pass Through> on vmbus0
pci1: <PCI bus> on pcib1
mlx4_core1: <mlx4_core> at device 2.0 on pci1
mlx4_core: Initializing mlx4_core
mlx4_core1: Detected virtual function - running in slave mode
mlx4_core1: Sending reset
mlx4_core1: Sending vhcr0
mlx4_core1: HCA minimum page size:512
mlx4_core1: Timestamping is not supported in slave mode
mlx4_en mlx4_core1: Activating port:1
mlxen2: Ethernet address: XX:XX:XX:bf:a3:0d
mlx4_en: mlx4_core1: Port 1: Using 4 TX rings
mlxen2: link state changed to DOWN
mlx4_en: mlx4_core1: Port 1: Using 4 RX rings
hn1: link state changed to DOWN
mlx4_en: mlxen2: Using 4 TX rings
mlx4_en: mlxen2: Using 4 RX rings
mlx4_en: mlxen2: Initializing port
pcib2: <Hyper-V PCI Express Pass Through> on vmbus0
pci2: <PCI bus> on pcib2
mlx4_core2: <mlx4_core> at device 2.0 on pci2
mlx4_core: Initializing mlx4_core
mlx4_core2: Detected virtual function - running in slave mode
mlx4_core2: Sending reset
mlx4_core2: Sending vhcr0
mlx4_core2: HCA minimum page size:512
mlx4_core2: Timestamping is not supported in slave mode
mlx4_en mlx4_core2: Activating port:1
mlxen4: Ethernet address: XX:XX:XX:bf:a3:0e
mlx4_en: mlx4_core2: Port 1: Using 4 TX rings
mlx4_en: mlx4_core2: Port 1: Using 4 RX rings
mlxen4: link state changed to DOWN
mlx4_en: mlxen4: Using 4 TX rings
hn2: link state changed to DOWN
mlx4_en: mlxen4: Using 4 RX rings
mlx4_en: mlxen4: Initializing port
pcib3: <Hyper-V PCI Express Pass Through> on vmbus0
pci3: <PCI bus> on pcib3
mlx4_core3: <mlx4_core> at device 2.0 on pci3
mlx4_core: Initializing mlx4_core
mlx4_core3: Detected virtual function - running in slave mode
mlx4_core3: Sending reset
mlx4_core3: Sending vhcr0
mlx4_core3: HCA minimum page size:512
mlx4_core3: Timestamping is not supported in slave mode
mlx4_en mlx4_core3: Activating port:1
mlxen6: Ethernet address: XX:XX:XX:bf:a3:0f
mlx4_en: mlx4_core3: Port 1: Using 4 TX rings
mlx4_en: mlx4_core3: Port 1: Using 4 RX rings
mlxen6: link state changed to DOWN
mlx4_en: mlxen6: Using 4 TX rings
hn3: link state changed to DOWN
mlx4_en: mlxen6: Using 4 RX rings
mlx4_en: mlxen6: Initializing port
pcib4: <Hyper-V PCI Express Pass Through> on vmbus0
pci4: <PCI bus> on pcib4
mlx4_core4: <mlx4_core> at device 2.0 on pci4
mlx4_core: Initializing mlx4_core
mlx4_core4: Detected virtual function - running in slave mode
mlx4_core4: Sending reset
mlx4_core4: Sending vhcr0
mlx4_core4: HCA minimum page size:512
mlx4_core4: Timestamping is not supported in slave mode
mlx4_en mlx4_core4: Activating port:1
mlxen8: Ethernet address: XX:XX:XX:bf:a3:10
mlx4_en: mlx4_core4: Port 1: Using 4 TX rings
mlx4_en: mlx4_core4: Port 1: Using 4 RX rings
mlx4_en: mlxen8: Using 4 TX rings
mlx4_en: mlxen8: Using 4 RX rings
mlxen8: link state changed to DOWN
hn4: link state changed to DOWN
mlx4_en: mlxen8: Initializing port
mlxen0: tso6 disabled due to -txcsum6.
mlxen0: enable txcsum6 first.
mlxen2: tso6 disabled due to -txcsum6.
mlxen2: enable txcsum6 first.
mlxen4: tso6 disabled due to -txcsum6.
mlxen4: enable txcsum6 first.
mlxen6: tso6 disabled due to -txcsum6.
mlxen6: enable txcsum6 first.
mlxen8: tso6 disabled due to -txcsum6.
mlxen8: enable txcsum6 first.
lo0: link state changed to UP
aesni0: <AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS,SHA1,SHA256> on motherboard
hn2: disable IPV6 mbuf hash delivery
mlxen4: link state changed to UP
hn2: link state changed to UP
mlx4_core2: going promisc on 1
hn4: disable IPV6 mbuf hash delivery
mlxen8: link state changed to UP
hn4: link state changed to UP
mlx4_core4: going promisc on 1
hn3: disable IPV6 mbuf hash delivery
mlxen6: link state changed to UP
hn3: link state changed to UP
mlx4_core3: going promisc on 1
hn1: disable IPV6 mbuf hash delivery
mlxen2: link state changed to UP
hn1: link state changed to UP
mlx4_core1: going promisc on 1
hn0: disable IPV6 mbuf hash delivery
mlxen0: link state changed to UP
hn0: link state changed to UP
mlx4_core0: going promisc on 1
hn3: got notify, nvs type 129
mlx4_core1: Internal error detected on the communication channel
mlx4_core2: Internal error detected on the communication channel
mlx4_core3: Internal error detected on the communication channel
mlx4_core4: Internal error detected on the communication channel
mlx4_core3: device is going to be reset
mlx4_core3: VF reset is not needed
mlx4_core3: device was reset successfully
mlx4_en mlx4_core3: Internal error detected, restarting device
mlx4_core4: device is going to be reset
mlx4_core4: VF reset is not needed
mlx4_core4: device was reset successfully
mlx4_en mlx4_core4: Internal error detected, restarting device
mlx4_core1: device is going to be reset
mlx4_core1: VF reset is not needed
mlx4_core1: device was reset successfully
mlx4_en mlx4_core1: Internal error detected, restarting device
mlx4_core2: device is going to be reset
mlx4_core2: VF reset is not needed
mlx4_core2: device was reset successfully
mlx4_en mlx4_core2: Internal error detected, restarting device
mlx4_core2: command 0x5 failed: fw status = 0x1
hn3: got notify, nvs type 128
hn3: got notify, nvs type 128
hn2: got notify, nvs type 129
hn2: got notify, nvs type 128
hn2: got notify, nvs type 128
hn4: got notify, nvs type 129
hn4: got notify, nvs type 128
hn4: got notify, nvs type 128
hn1: got notify, nvs type 129
hn1: got notify, nvs type 128
hn1: got notify, nvs type 128
pflog0: promiscuous mode enabled
mlx4_en: mlxen2: Failed disabling multicast filter
mlx4_en: mlxen4: Failed disabling multicast filter
mlx4_core3: Failed to close slave function
mlx4_core3: Detected virtual function - running in slave mode
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
mlx4_core4: Failed to close slave function
mlx4_core4: Detected virtual function - running in slave mode
mlx4_core1: Failed to close slave function
mlx4_core1: Detected virtual function - running in slave mode
mlx4_core2: Failed to close slave function
mlx4_core2: Detected virtual function - running in slave mode
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
mlx4_core3: recovering from previously mis-behaved VM
mlx4_core4: recovering from previously mis-behaved VM
mlx4_core1: recovering from previously mis-behaved VM
mlx4_core2: recovering from previously mis-behaved VM
mlx4_core3: Communication channel is offline.
mlx4_core3: PF is not responsive, skipping initialization
mlx4_core3: Failed to initialize slave
mlx4_core3: mlx4_restart_one: ERROR: mlx4_load_one failed, pci_name=Mellanox driver (3.5.1), err=-5
mlx4_core3: mlx4_restart_one was ended, ret=-5
mlx4_core4: Communication channel is offline.
mlx4_core4: PF is not responsive, skipping initialization
mlx4_core4: Failed to initialize slave
mlx4_core4: mlx4_restart_one: ERROR: mlx4_load_one failed, pci_name=Mellanox driver (3.5.1), err=-5
mlx4_core4: mlx4_restart_one was ended, ret=-5
mlx4_core1: Communication channel is offline.
mlx4_core1: PF is not responsive, skipping initialization
mlx4_core1: Failed to initialize slave
mlx4_core1: mlx4_restart_one: ERROR: mlx4_load_one failed, pci_name=Mellanox driver (3.5.1), err=-5
mlx4_core1: mlx4_restart_one was ended, ret=-5
mlx4_core2: Communication channel is offline.
mlx4_core2: PF is not responsive, skipping initialization
mlx4_core2: Failed to initialize slave
mlx4_core2: mlx4_restart_one: ERROR: mlx4_load_one failed, pci_name=Mellanox driver (3.5.1), err=-5
mlx4_core2: mlx4_restart_one was ended, ret=-5
mlx4_core2: mlx4_remove_one: interface is down
mlx4_core2: detached
mlx4_core4: mlx4_remove_one: interface is down
mlx4_core4: detached
mlx4_core1: mlx4_remove_one: interface is down
mlx4_core1: detached
mlx4_core3: mlx4_remove_one: interface is down
mlx4_core3: detached
pci3: detached
pcib3: detached
pci4: detached
pcib4: detached
pci2: detached
pcib2: detached
pci1: detached
pcib1: detached
pcib1: <Hyper-V PCI Express Pass Through> on vmbus0
pci1: <PCI bus> on pcib1
mlx4_core1: <mlx4_core> at device 2.0 on pci1
mlx4_core: Initializing mlx4_core
mlx4_core1: Detected virtual function - running in slave mode
mlx4_core1: Sending reset
mlx4_core1: Sending vhcr0
mlx4_core1: HCA minimum page size:512
mlx4_core1: Timestamping is not supported in slave mode
mlx4_en mlx4_core1: Activating port:1
mlxen2: Ethernet address: XX:XX:XX:bf:a3:0f
mlx4_en: mlx4_core1: Port 1: Using 4 TX rings
mlx4_en: mlx4_core1: Port 1: Using 4 RX rings
mlxen2: link state changed to DOWN
mlx4_en: mlxen2: Using 4 TX rings
hn3: link state changed to DOWN
mlx4_en: mlxen2: Using 4 RX rings
mlx4_en: mlxen2: Initializing port
pcib2: <Hyper-V PCI Express Pass Through> on vmbus0
pci2: <PCI bus> on pcib2
mlx4_core2: <mlx4_core> at device 2.0 on pci2
mlx4_core: Initializing mlx4_core
mlx4_core2: Detected virtual function - running in slave mode
mlx4_core2: Sending reset
mlx4_core2: Sending vhcr0
mlx4_core2: HCA minimum page size:512
mlx4_core2: Timestamping is not supported in slave mode
mlx4_en mlx4_core2: Activating port:1
mlxen4: Ethernet address: XX:XX:XX:bf:a3:10
mlx4_en: mlx4_core2: Port 1: Using 4 TX rings
mlx4_en: mlx4_core2: Port 1: Using 4 RX rings
mlx4_en: mlxen4: Using 4 TX rings
mlx4_en: mlxen4: Using 4 RX rings
mlxen4: link state changed to DOWN
hn4: link state changed to DOWN
mlx4_en: mlxen4: Initializing port
pcib3: <Hyper-V PCI Express Pass Through> on vmbus0
pci3: <PCI bus> on pcib3
mlx4_core3: <mlx4_core> at device 2.0 on pci3
mlx4_core: Initializing mlx4_core
mlx4_core3: Detected virtual function - running in slave mode
mlx4_core3: Sending reset
mlx4_core3: Sending vhcr0
mlx4_core3: HCA minimum page size:512
mlx4_core3: Timestamping is not supported in slave mode
mlx4_en mlx4_core3: Activating port:1
mlxen6: Ethernet address: XX:XX:XX:bf:a3:0e
mlx4_en: mlx4_core3: Port 1: Using 4 TX rings
mlx4_en: mlx4_core3: Port 1: Using 4 RX rings
mlxen6: link state changed to DOWN
mlx4_en: mlxen6: Using 4 TX rings
hn2: link state changed to DOWN
mlx4_en: mlxen6: Using 4 RX rings
mlx4_en: mlxen6: Initializing port
pcib4: <Hyper-V PCI Express Pass Through> on vmbus0
pci4: <PCI bus> on pcib4
mlx4_core4: <mlx4_core> at device 2.0 on pci4
mlx4_core: Initializing mlx4_core
mlx4_core4: Detected virtual function - running in slave mode
mlx4_core4: Sending reset
mlx4_core4: Sending vhcr0
mlx4_core4: HCA minimum page size:512
mlx4_core4: Timestamping is not supported in slave mode
mlx4_en mlx4_core4: Activating port:1
mlxen8: Ethernet address: XX:XX:XX:bf:a3:0d
mlx4_en: mlx4_core4: Port 1: Using 4 TX rings
mlx4_en: mlx4_core4: Port 1: Using 4 RX rings
mlx4_en: mlxen8: Using 4 TX rings
mlx4_en: mlxen8: Using 4 RX rings
mlxen8: link state changed to DOWN
hn1: link state changed to DOWN
mlx4_en: mlxen8: Initializing port
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
hn3: disable IPV6 mbuf hash delivery
mlxen2: link state changed to UP
hn3: link state changed to UP
mlx4_core1: going promisc on 1
hn4: disable IPV6 mbuf hash delivery
mlxen4: link state changed to UP
hn4: link state changed to UP
mlx4_core2: going promisc on 1
hn2: disable IPV6 mbuf hash delivery
mlxen6: link state changed to UP
hn2: link state changed to UP
mlx4_core3: going promisc on 1
hn1: disable IPV6 mbuf hash delivery
mlxen8: link state changed to UP
hn1: link state changed to UP
mlx4_core4: going promisc on 1
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled