OPNsense Forum

English Forums => General Discussion => Topic started by: morphxyz on August 08, 2024, 09:30:07 AM

Title: OPNsense crash with bnxt driver
Post by: morphxyz on August 08, 2024, 09:30:07 AM
Dear Community

Two days ago our OPNsense suddenly stopped working (pic1).
It's been working great for months and we have rebooted it many times before without issues.
But this time when we booted, the bnxt driver couldn't load (pic2).
Yes, we do have a custom tunable in place, but only this one (pic3).

After another reboot the OPNsense is working smoothly again.
Has anybody ever had a similar event happen with the bnxt driver?
Is there a way to take a deeper look at what actually happened, other than the logs in the web GUI?
Since then we have been a bit afraid to touch it at all.

Would you suggest switching to a natively supported network card?
It's an idea we've had for a while.
Is the error shown in pic1 even the cause of the crash?

Thank you for your thoughts and ideas :-)
Title: Re: OPNsense crash with bnxt driver
Post by: franco on August 08, 2024, 09:41:12 AM
Try 24.7 first. There have most likely been bnxt fixes.


Cheers,
Franco
Title: Re: OPNsense crash with bnxt driver
Post by: morphxyz on August 08, 2024, 09:47:26 AM
Thank you for the swift response, franco.

Will do tonight!
Title: Re: OPNsense crash with bnxt driver
Post by: morphxyz on August 13, 2024, 11:19:25 AM
We updated to 24.7.1.
No issues during the update except with CrowdSec, but we solved that and everything is up to date.

We had another crash of said firewall (or at least of bnxt0 and bnxt1) yesterday though. It only came back after several reboots.

We can't pin the issue down. The first error messages we can find are these:

2024-08-12T18:45:33   Notice   kernel   bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 44051   
2024-08-12T18:45:33   Notice   kernel   bnxt0: Timeout sending HWRM_PORT_QSTATS: (timeout: 2000) seq: 44050   
2024-08-12T18:45:33   Notice   kernel   bnxt1: Timeout sending HWRM_PORT_QSTATS: (timeout: 2000) seq: 24278   
2024-08-12T18:36:00   Error   configctl   error in configd communication Traceback (most recent call last): File "/usr/local/sbin/configctl", line 65, in exec_config_cmd line = sock.recv(65536).decode() ^^^^^^^^^^^^^^^^ TimeoutError: timed out   
2024-08-12T18:26:00   Error   configctl   error in configd communication Traceback (most recent call last): File "/usr/local/sbin/configctl", line 65, in exec_config_cmd line = sock.recv(65536).decode() ^^^^^^^^^^^^^^^^ TimeoutError: timed out   
2024-08-12T18:16:00   Error   configctl   error in configd communication Traceback (most recent call last): File "/usr/local/sbin/configctl", line 65, in exec_config_cmd line = sock.recv(65536).decode() ^^^^^^^^^^^^^^^^ TimeoutError: timed out

These are followed by many more kernel notices about bnxt0 and bnxt1.
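
For anyone who wants to tally notices like these from an exported log instead of scrolling through the web GUI, here is a minimal sketch in Python. It assumes the column layout seen in the excerpt above (timestamp, severity, process, message, separated by whitespace); the function name `count_hwrm_timeouts` and the regular expression are made up for illustration and are not part of OPNsense.

```python
import re
from collections import Counter

# Pattern for the bnxt HWRM timeout notices quoted above. The column layout
# (timestamp, severity, process, message) is an assumption based on that excerpt.
HWRM_TIMEOUT = re.compile(
    r"^(?P<ts>\S+)\s+(?P<severity>\w+)\s+kernel\s+"
    r"(?P<nic>bnxt\d+): Timeout sending (?P<cmd>HWRM_\w+): "
    r"\(timeout: (?P<timeout>\d+)\) seq: (?P<seq>\d+)"
)


def count_hwrm_timeouts(lines):
    """Count HWRM timeout notices per (interface, command) pair."""
    counts = Counter()
    for line in lines:
        match = HWRM_TIMEOUT.match(line.strip())
        if match:  # lines that are not HWRM timeouts are skipped
            counts[(match["nic"], match["cmd"])] += 1
    return counts


# The three kernel notices from the excerpt above.
sample = [
    "2024-08-12T18:45:33   Notice   kernel   bnxt0: Timeout sending HWRM_PORT_PHY_QCFG: (timeout: 2000) seq: 44051",
    "2024-08-12T18:45:33   Notice   kernel   bnxt0: Timeout sending HWRM_PORT_QSTATS: (timeout: 2000) seq: 44050",
    "2024-08-12T18:45:33   Notice   kernel   bnxt1: Timeout sending HWRM_PORT_QSTATS: (timeout: 2000) seq: 24278",
]

for (nic, cmd), n in sorted(count_hwrm_timeouts(sample).items()):
    print(f"{nic} {cmd}: {n}")
```

A per-interface, per-command count makes it easier to see whether one port or both stopped answering firmware requests at the same time.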

Seems to be similar to this post but not identical: https://forum.opnsense.org/index.php?topic=38434.0

We wonder if it's a hardware issue, or maybe Suricata/Sensei?

Any ideas, anyone?
Title: Re: OPNsense crash with bnxt driver
Post by: franco on August 13, 2024, 11:54:41 AM
https://forums.freebsd.org/threads/problem-with-a-broadcom-bcm957504-p425g-card.90771/

It seems to be a general issue in FreeBSD, although it is being claimed we run "heavily modified", so that's what it must be. ;)

https://bugs.freebsd.org/bugzilla/buglist.cgi?quicksearch=bnxt&list_id=716179


Cheers,
Franco
Title: Re: OPNsense crash with bnxt driver
Post by: morphxyz on August 13, 2024, 01:09:00 PM
Great! At least we know what to do now.
Let's hope reassigning the interfaces will be enough after changing the NIC.
Otherwise it's going to be a long night.

We might also just end up running OPNsense inside Proxmox. We've had no issues with those NICs there (virtualized, NOT passed through).

Thank you for the assistance and information, franco!
Title: Re: OPNsense crash with bnxt driver
Post by: franco on August 13, 2024, 01:17:36 PM
I'm willing to take upstream fixes into our kernel. There seems to be a lot of movement in bnxt, but sadly most of it was missed for FreeBSD 14.1.

% git diff --stat upstream/stable/14 sys/dev/bnxt
sys/dev/bnxt/bnxt.h                         |    803 +
sys/dev/bnxt/bnxt_en/bnxt.h                 |   1314 -
sys/dev/bnxt/bnxt_en/bnxt_auxbus_compat.c   |    194 -
sys/dev/bnxt/bnxt_en/bnxt_auxbus_compat.h   |     75 -
sys/dev/bnxt/bnxt_en/bnxt_dcb.c             |    861 -
sys/dev/bnxt/bnxt_en/bnxt_dcb.h             |    127 -
sys/dev/bnxt/bnxt_en/bnxt_ulp.c             |    526 -
sys/dev/bnxt/bnxt_en/bnxt_ulp.h             |    161 -
sys/dev/bnxt/{bnxt_en => }/bnxt_hwrm.c      |   1306 +-
sys/dev/bnxt/{bnxt_en => }/bnxt_hwrm.h      |     24 +-
sys/dev/bnxt/{bnxt_en => }/bnxt_ioctl.h     |      0
sys/dev/bnxt/{bnxt_en => }/bnxt_mgmt.c      |     69 +-
sys/dev/bnxt/{bnxt_en => }/bnxt_mgmt.h      |     31 +-
sys/dev/bnxt/bnxt_re/bnxt_re-abi.h          |    177 -
sys/dev/bnxt/bnxt_re/bnxt_re.h              |   1077 -
sys/dev/bnxt/bnxt_re/ib_verbs.c             |   5498 --
sys/dev/bnxt/bnxt_re/ib_verbs.h             |    632 -
sys/dev/bnxt/bnxt_re/main.c                 |   4467 -
sys/dev/bnxt/bnxt_re/qplib_fp.c             |   3544 -
sys/dev/bnxt/bnxt_re/qplib_fp.h             |    638 -
sys/dev/bnxt/bnxt_re/qplib_rcfw.c           |   1338 -
sys/dev/bnxt/bnxt_re/qplib_rcfw.h           |    354 -
sys/dev/bnxt/bnxt_re/qplib_res.c            |   1226 -
sys/dev/bnxt/bnxt_re/qplib_res.h            |    840 -
sys/dev/bnxt/bnxt_re/qplib_sp.c             |   1234 -
sys/dev/bnxt/bnxt_re/qplib_sp.h             |    432 -
sys/dev/bnxt/bnxt_re/qplib_tlv.h            |    187 -
sys/dev/bnxt/bnxt_re/stats.c                |    773 -
sys/dev/bnxt/bnxt_re/stats.h                |    353 -
sys/dev/bnxt/{bnxt_en => }/bnxt_sysctl.c    |   1097 +-
sys/dev/bnxt/{bnxt_en => }/bnxt_sysctl.h    |      2 -
sys/dev/bnxt/{bnxt_en => }/bnxt_txrx.c      |      0
sys/dev/bnxt/{bnxt_en => }/convert_hsi.pl   |      0
sys/dev/bnxt/{bnxt_en => }/hsi_struct_def.h | 116136 ++++++++++---------------
sys/dev/bnxt/{bnxt_en => }/if_bnxt.c        |   1789 +-
35 files changed, 45919 insertions(+), 101366 deletions(-)

Whether any of this fixes the issue you're currently seeing, I don't know though.


Cheers,
Franco
Title: Re: OPNsense crash with bnxt driver
Post by: Patrick M. Hausen on August 13, 2024, 01:18:46 PM
That unfriendly remark on the forum aside, bnxt and Broadcom in general are tough territory for the FreeBSD project, and it looks like Broadcom doesn't really care to support anything but Windows and Linux. IMHO FreeBSD is not really to blame in this particular case.

Follow e.g. this discussion:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269133

Although a Broadcom employee, Chandrakanth Patil, tried to help and promised a fix, it's still not solved.

I would just stay away from their gear.
Title: Re: OPNsense crash with bnxt driver
Post by: franco on August 13, 2024, 01:20:23 PM
Well, Chandrakanth Patil did author a lot of these changes currently queued up for 14.2.


Cheers,
Franco
Title: Re: OPNsense crash with bnxt driver
Post by: Patrick M. Hausen on August 13, 2024, 01:23:53 PM
*fingers crossed*