Messages - Railgun

#1
This is a follow up from https://forum.opnsense.org/index.php?topic=27414.msg133006#msg133006.

The relevant basics:

-EPYC 7282 with a Supermicro H12SSL-i board running ESXi 7.
-Broadcom P225P/BCM57414 NIC with SR-IOV enabled. 

With the basic NIC setup sorted, I'm trying to take this further.  Immediately after setting up VLANs in the UI, I'm met with the following:

vlan0: changing name to 'bnxt2_vlan2'
vlan0: changing name to 'bnxt2_vlan2'
vlan1: changing name to 'bnxt2_vlan3'
vlan1: changing name to 'bnxt2_vlan3'
bnxt2: Timeout sending HWRM_RING_ALLOC: (timeout: 2000) seq: 378
bnxt2: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 379
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 162
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 163
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 164
bnxt1: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 162


The HWRM timeout messages also appear in another FreeBSD-based VM that has had no config changes made.
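
To compare the timeouts across VMs, it can help to count them per interface and per HWRM command. The following is a minimal sketch, assuming the console output has been saved as text; the function name and the sample string are illustrative, and the regex is based only on the message format shown above.

```python
import re
from collections import Counter

# Matches lines such as:
#   bnxt2: Timeout sending HWRM_RING_ALLOC: (timeout: 2000) seq: 378
TIMEOUT_RE = re.compile(
    r"^(?P<dev>bnxt\d+): Timeout sending (?P<cmd>HWRM_\w+): "
    r"\(timeout: (?P<limit>\d+)\) seq: (?P<seq>\d+)"
)


def summarize_hwrm_timeouts(log_text):
    """Count HWRM timeout messages per (device, command) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.match(line.strip())
        if match:
            counts[(match["dev"], match["cmd"])] += 1
    return counts


# Sample taken from the messages quoted in this post.
SAMPLE = """\
vlan0: changing name to 'bnxt2_vlan2'
bnxt2: Timeout sending HWRM_RING_ALLOC: (timeout: 2000) seq: 378
bnxt2: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 379
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 162
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 163
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 164
bnxt1: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 162
"""

for (dev, cmd), n in sorted(summarize_hwrm_timeouts(SAMPLE).items()):
    print(f"{dev} {cmd}: {n}")
```

Running the same summary on each affected VM would show whether every virtual function times out on the same commands, which would fit a host-side cause rather than a per-guest one.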

These are VMs under ESXi 7.  For this particular message, I have set up port groups as follows:

Physical NIC: vmnic2 -> vswitch: 25G-1 -> Port Group: ALL-1 (VL4095)

In this instance, four interfaces have been created on this VM: one mgmt interface using the usual VMXNET adapter, and three via SR-IOV.  Of the SR-IOV interfaces, one allows all VLANs as depicted above and will be tagged within OPNsense, and two sit on tagged port groups where I will not be tagging in the firewall.
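
For reference, the VLAN ID on an ESXi port group determines where tagging happens. The sketch below encodes the usual convention for standard vSwitch port groups (0 = no tagging by the vSwitch, 1-4094 = the vSwitch tags, 4095 = all VLANs passed to the guest). The function name is made up for illustration, VLAN 10 is a hypothetical example not taken from this setup, and how an SR-IOV virtual function honours these settings is an assumption here.

```python
def tagging_mode(vlan_id):
    """Return where VLAN tagging happens for an ESXi port group VLAN ID."""
    if vlan_id == 0:
        return "EST: no tagging by the vSwitch (external switch tagging)"
    if 1 <= vlan_id <= 4094:
        return "VST: vSwitch tags and untags, guest sees untagged frames"
    if vlan_id == 4095:
        return "VGT: all VLANs passed through, guest does the tagging"
    raise ValueError(f"invalid VLAN ID: {vlan_id}")


# ALL-1 from this post uses VLAN 4095, so OPNsense has to tag.
print("ALL-1 (4095):", tagging_mode(4095))
# Hypothetical tagged port group; VLAN 10 is only an example value.
print("example (10):", tagging_mode(10))
```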

Given that this error is seen across multiple VMs, I believe it's possibly a HW setup issue.  However, this is my first time playing with SR-IOV and I'm at a bit of a loss.  I haven't been able to find much about this message or what the actual issue is.
#2
This particular thread can be closed. 

Some odd sequencing somehow got the setup into an odd state.
#3
Hi all,

I've been using OPNsense in a home lab environment for some time now and use the other version in a professional capacity, but the setup I'm doing now is new to me and doing my head in.

I'll try to be brief here. 

In short, I'm building a new server.  EPYC 7282 with a Supermicro H12SSL-i board running ESXi 7.

I have a Broadcom P225P/BCM57414 NIC with SR-IOV enabled. 

The initial deployment of a new OPNsense VM ran into driver issues with the card.  It showed:

none0@pci0:11:0:0: class=0x020000 rev=0x00 hdr=0x00 vendor=0x14e4 device=0x16dc subvendor=0x14e4 subdevice=0x16d7
    vendor     = 'Broadcom Inc. and subsidiaries'
    device     = 'NetXtreme-E Ethernet Virtual Function'
    class      = network
    subclass   = ethernet

none1@pci0:19:0:0: class=0x020000 rev=0x00 hdr=0x00 vendor=0x14e4 device=0x16dc subvendor=0x14e4 subdevice=0x16d7
    vendor     = 'Broadcom Inc. and subsidiaries'
    device     = 'NetXtreme-E Ethernet Virtual Function'
    class      = network
    subclass   = ethernet

none2@pci0:27:0:0: class=0x020000 rev=0x00 hdr=0x00 vendor=0x14e4 device=0x16dc subvendor=0x14e4 subdevice=0x16d7
    vendor     = 'Broadcom Inc. and subsidiaries'
    device     = 'NetXtreme-E Ethernet Virtual Function'
    class      = network
    subclass   = ethernet



...similar to another thread I'd seen.  I was able to manually load the drivers via "kldload if_bnxt" with success, and added it to loader.conf.local, also with success upon a reboot. 
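
The output above looks like the format printed by `pciconf -lv`, where a device name starting with "none" means no driver has attached. As a rough sketch under that assumption, the following picks out unattached devices from saved output; the function name is illustrative and the `vmx0` line is a hypothetical attached device added only for contrast.

```python
import re

# Header lines look like:
#   none0@pci0:11:0:0: class=0x020000 ... vendor=0x14e4 device=0x16dc ...
HEADER_RE = re.compile(
    r"^(?P<name>\w+)@(?P<selector>pci\d+:\d+:\d+:\d+):\s+(?P<fields>.*)$"
)


def unattached_devices(pciconf_text):
    """Return (selector, vendor, device) for entries with no driver attached."""
    found = []
    for line in pciconf_text.splitlines():
        match = HEADER_RE.match(line)
        if not match or not match["name"].startswith("none"):
            continue  # indented detail lines and attached devices are skipped
        fields = dict(f.split("=", 1) for f in match["fields"].split())
        found.append((match["selector"], fields.get("vendor"), fields.get("device")))
    return found


SAMPLE = """\
none0@pci0:11:0:0: class=0x020000 rev=0x00 hdr=0x00 vendor=0x14e4 device=0x16dc subvendor=0x14e4 subdevice=0x16d7
    vendor     = 'Broadcom Inc. and subsidiaries'
    device     = 'NetXtreme-E Ethernet Virtual Function'
vmx0@pci0:3:0:0: class=0x020000 rev=0x01 hdr=0x00 vendor=0x15ad device=0x07b0 subvendor=0x15ad subdevice=0x07b0
none1@pci0:19:0:0: class=0x020000 rev=0x00 hdr=0x00 vendor=0x14e4 device=0x16dc subvendor=0x14e4 subdevice=0x16d7
"""

for selector, vendor, device in unattached_devices(SAMPLE):
    print(f"{selector}: vendor={vendor} device={device} has no driver attached")
```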

However, upon rebooting the host, it all went to pot. 

After I had started configuring the new interfaces within the UI and rebooted, all the newly created interfaces disappeared.  I was met with the same "none0@" output as above.  However, when I checked whether the driver had failed to load:

kldload if_bnxt
kldload: can't load if_bnxt: module already loaded or in kernel


I rebooted the host again, only for the VM to not start at all, as the NIC in question seemed to have dropped out of having SR-IOV enabled.  I rebooted once more, and one port was enabled and one disabled.  Both times ESXi indicated that SR-IOV WAS enabled but required a reboot.

This was done, but the interfaces still dropped off.  I deleted the loader.conf.local file to prevent the driver from being loaded, rebooted, and tried to load the driver manually again, but was told it was already loaded.

The only thing I could see from dmesg was:

bnxt0: <Broadcom NetXtreme-E Ethernet Virtual Function> mem 0xffa04000-0xffa07fff,0xff900000-0xff9fffff,0xffa00000-0xffa03fff at device 0.0 on pci5
bnxt0: Timeout sending HWRM_VER_GET: (timeout: 1000) seq: 0
bnxt0: attach: hwrm ver get failed
bnxt0: IFDI_ATTACH_PRE failed 60
device_attach: bnxt0 attach returned 60
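
The "attach returned 60" value is a FreeBSD errno, and 60 is ETIMEDOUT there, which matches the HWRM_VER_GET timeout just before it. Here is a small sketch for decoding such lines from saved dmesg output; the function name is illustrative, and the errno table is a hand-copied excerpt because Python's own `errno` module reflects whatever OS runs the script.

```python
import re

# Excerpt of FreeBSD errno values (not the full table).
FREEBSD_ERRNO = {
    6: "ENXIO (Device not configured)",
    12: "ENOMEM (Cannot allocate memory)",
    19: "ENODEV (Operation not supported by device)",
    60: "ETIMEDOUT (Operation timed out)",
}

ATTACH_RE = re.compile(r"device_attach: (?P<dev>\w+) attach returned (?P<code>\d+)")


def decode_attach_failures(dmesg_text):
    """Return (device, errno, meaning) for each failed device_attach line."""
    failures = []
    for match in ATTACH_RE.finditer(dmesg_text):
        code = int(match["code"])
        meaning = FREEBSD_ERRNO.get(code, "unknown (not in this excerpt)")
        failures.append((match["dev"], code, meaning))
    return failures


# Sample taken from the dmesg lines quoted in this post.
SAMPLE = """\
bnxt0: Timeout sending HWRM_VER_GET: (timeout: 1000) seq: 0
bnxt0: attach: hwrm ver get failed
bnxt0: IFDI_ATTACH_PRE failed 60
device_attach: bnxt0 attach returned 60
"""

for dev, code, meaning in decode_attach_failures(SAMPLE):
    print(f"{dev}: attach failed with errno {code}, {meaning}")
```

A timeout at attach would also fit the later "module already loaded" message: the driver is present, but the virtual function never answered, so the device stays unattached.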



In other attempts to boot, I saw:

bnxt0: Timeout sending HWRM_RING_ALLOC: (timeout: 2000) seq: 225
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 226
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 227
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 228
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 229
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 230
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 231
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 232
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 233
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 234
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 235
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 236
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 237
bnxt0: Timeout sending HWRM_FUNC_RESET: (timeout: 2000) seq: 238
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 239
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 240
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 241
bnxt1: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 15
bnxt1: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 16
bnxt1: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 17
bnxt2: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 15
bnxt2: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 16
bnxt2: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 17
bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 242
...



This seemed to repeat endlessly.

I now have no idea why or how that occurred and am at a loss here.  I'm guessing this is a guest issue.  I'm spinning up various other VMs at the moment, but this has been the first.  I'll see whether the other VMs show odd behavior as well.