OPNsense Forum

Archive => 22.1 Legacy Series => Topic started by: Railgun on March 09, 2022, 07:16:36 PM

Title: Broadcom P225P/BCM57414 SR-IOV/VF inconsistent behavior/no longer loading
Post by: Railgun on March 09, 2022, 07:16:36 PM
Hi all,

I've been using OPNsense in a home lab environment for some time now and use the other version in a professional capacity, but the setup I'm building now is new to me and is doing my head in.

I'll try to be brief here. 

In short, I'm building a new server.  EPYC 7282 with a Supermicro H12SSL-i board running ESXi 7.

I have a Broadcom P225P/BCM57414 NIC with SR-IOV enabled. 

The initial deployment of a new OPNsense VM ran into driver issues with the card's virtual functions.  The PCI device listing showed:

none0@pci0:11:0:0: class=0x020000 rev=0x00 hdr=0x00 vendor=0x14e4 device=0x16dc subvendor=0x14e4 subdevice=0x16d7
    vendor     = 'Broadcom Inc. and subsidiaries'
    device     = 'NetXtreme-E Ethernet Virtual Function'
    class      = network
    subclass   = ethernet

none1@pci0:19:0:0: class=0x020000 rev=0x00 hdr=0x00 vendor=0x14e4 device=0x16dc subvendor=0x14e4 subdevice=0x16d7
    vendor     = 'Broadcom Inc. and subsidiaries'
    device     = 'NetXtreme-E Ethernet Virtual Function'
    class      = network
    subclass   = ethernet

none2@pci0:27:0:0: class=0x020000 rev=0x00 hdr=0x00 vendor=0x14e4 device=0x16dc subvendor=0x14e4 subdevice=0x16d7
    vendor     = 'Broadcom Inc. and subsidiaries'
    device     = 'NetXtreme-E Ethernet Virtual Function'
    class      = network
    subclass   = ethernet



...similar to another thread I'd seen.  I was able to manually load the drivers via "kldload if_bnxt" with success, and added it to loader.conf.local, also with success upon a reboot. 
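
For anyone following along, loading the driver at boot can be sketched as below. This is a minimal sketch, not necessarily what was done here: the path /boot/loader.conf.local and the helper name enable_bnxt_at_boot are assumptions, and the only piece taken from the post is the module name if_bnxt. The `<module>_load="YES"` line is the standard FreeBSD loader.conf convention.

```sh
# Hypothetical helper: add the if_bnxt boot-time load line to a loader
# config file, but only if no if_bnxt_load entry exists yet.
# The default path is an assumption; pass another file as $1 to override.
enable_bnxt_at_boot() {
    conf="${1:-/boot/loader.conf.local}"
    # grep -s stays quiet if the file does not exist yet
    if ! grep -qs '^if_bnxt_load=' "$conf"; then
        echo 'if_bnxt_load="YES"' >> "$conf"
    fi
}
```

Calling enable_bnxt_at_boot as root with no argument would write to the assumed default path; running it twice leaves a single entry.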

However, upon rebooting the host, it all went to pot. 

I had already started configuring the new interfaces in the UI, and after the reboot all of the newly created interfaces had disappeared.  I was met with the same "none0@" output as above.  However, when I checked whether the driver had simply failed to load:

kldload if_bnxt
kldload: can't load if_bnxt: module already loaded or in kernel
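
A driver that is loaded while the devices still show up as "none0@" suggests the module is present but did not attach to the virtual functions. One way to confirm which Broadcom functions are still unclaimed is to filter the PCI listing, as in this rough sketch. The helper name list_unattached_broadcom is made up for illustration, and it assumes `pciconf -l` prints the vendor=/device= format shown earlier in this post.

```sh
# Hypothetical helper: read `pciconf -l` (or `pciconf -lv`) output on stdin
# and print every Broadcom (vendor 0x14e4) function that has no driver
# attached, i.e. whose name starts with "none".
list_unattached_broadcom() {
    awk '/^none[0-9]+@/ && / vendor=0x14e4 / {
        sel = $1
        sub(/:$/, "", sel)              # drop the trailing colon of the selector
        dev = ""
        for (i = 2; i <= NF; i++)
            if ($i ~ /^device=/) dev = $i
        print sel, dev
    }'
}
```

Typical use would be `pciconf -l | list_unattached_broadcom`, alongside `kldstat -v | grep -i bnxt` to see whether the driver is a loaded module or part of the kernel.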


I rebooted the host again, only to find that the VM would not start at all, as the NIC in question seemed to have dropped out of SR-IOV mode.  I rebooted once more, and one port was enabled and the other disabled.  Both times ESXi indicated that SR-IOV WAS enabled, but required a reboot.

This was done, but I still saw these interfaces dropping off.  I deleted the loader.conf.local file to prevent the driver from being loaded at boot, rebooted, and tried to load the driver manually again, but kldload indicated it was already loaded.

The only thing I could see from dmesg was

bnxt0: <Broadcom NetXtreme-E Ethernet Virtual Function> mem 0xffa04000-0xffa07fff,0xff900000-0xff9fffff,0xffa00000-0xffa03fff at device 0.0 on pci5
bnxt0: Timeout sending HWRM_VER_GET: (timeout: 1000) seq: 0
bnxt0: attach: hwrm ver get failed
bnxt0: IFDI_ATTACH_PRE failed 60
device_attach: bnxt0 attach returned 60



In other attempts to boot, I saw:

bnxt0: Timeout sending HWRM_RING_ALLOC: (timeout: 2000) seq: 225
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 226
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 227
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 228
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 229
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 230
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 231
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 232
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 233
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 234
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 235
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 236
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 237
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 238
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 239
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 240
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 241
bnxt1: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 15
bnxt1: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 16
bnxt1: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 17
bnxt2: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 15
bnxt2: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 16
bnxt2: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 17
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 242
...



Which seemed to repeat endlessly. 
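
Since the output repeats endlessly, it can help to collapse it into counts per device and firmware command. The following is only a sketch: the function name summarize_hwrm_timeouts is made up, and it assumes the kernel messages have exactly the "bnxtN: Timeout sending HWRM_...:" layout shown above.

```sh
# Hypothetical helper: read kernel messages on stdin and count the
# "Timeout sending HWRM_*" lines per device and per HWRM command.
summarize_hwrm_timeouts() {
    awk '/Timeout sending HWRM_/ {
        dev = $1; sub(/:$/, "", dev)    # "bnxt0:" -> "bnxt0"
        cmd = $4; sub(/:$/, "", cmd)    # "HWRM_FUNC_RESET:" -> "HWRM_FUNC_RESET"
        count[dev " " cmd]++
    }
    END { for (k in count) print k, count[k] }' | sort
}
```

It would be used as `dmesg | summarize_hwrm_timeouts`, which prints one line per device and command with the number of timeouts seen.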

I now have no idea why or how that occurred and am at a loss here.  I'm guessing this is a guest issue.  I'm spinning up various other VMs at the moment, but this was the first one.  I'll see whether the other VMs show any odd behavior as well.
Title: Re: Broadcom P225P/BCM57414 SR-IOV/VF inconsistent behavior/no longer loading
Post by: Railgun on March 12, 2022, 11:35:50 AM
This particular thread can be closed. 

Some odd sequencing somehow got this setup into an odd state.