SOLVED: mlx4en connection issues

Started by CJ, March 22, 2025, 04:46:13 PM

March 22, 2025, 04:46:13 PM Last Edit: April 27, 2025, 03:42:41 PM by CJ Reason: mark topic as solved
What is everyone's Connect-X card connected to, and how?

I've been running a Connect-X 3 dual port 10G card for several years across multiple versions of OPNsense.  Currently I have one port connected to a QNAP switch and the other connected to a Mikrotik (a recent addition).  Depending on the version and boot circumstances, the ports don't always come up correctly.  But once I go through some variation of link up/down and/or unplugging the DAC, the link is completely stable.

Upgrading to 25.1 caused the port connected to the QNAP to spam Link Down messages to the console during the upgrade process.  Despite this, I was able to log in via the console and issue a link down command on the port.  Surprisingly, that not only stopped the Link Down spam but also immediately brought the link up and in a stable state with no further intervention.  The port connected to the Mikrotik did not appear to have any problems.

Upgrading to 25.1.3 saw the same QNAP connected port start to flap but quickly correct itself with no intervention.

I plan on eventually upgrading to a newer Connect-X card as I upgrade to 25/100G.  In the meantime, I will use this thread to document further issues as they occur, especially as the Mikrotik doesn't seem to be affected.

You may wish to have a read of my post from a few years ago.

https://forum.opnsense.org/index.php?topic=31705.msg153860#msg153860

The TLDR is that the mlx4en doesn't play nicely with QNAP.
I ended up moving to a Netgear 10G switch instead and the mlx4en's have been solid since....

Quote from: ProximusAl on March 24, 2025, 01:49:14 PM
You may wish to have a read of my post from a few years ago.

https://forum.opnsense.org/index.php?topic=31705.msg153860#msg153860

The TLDR is that the mlx4en doesn't play nicely with QNAP.
I ended up moving to a Netgear 10G switch instead and the mlx4en's have been solid since....

I'll have to take a look through that thread.  I was going to say that none of my other mlx4en cards have problems, but upon further consideration, this might explain the problems on my Linux desktop.

I confirmed that OPNsense is not connected through a combo port.  It is using one of the first four ports via a Cisco DAC.

Updating to 25.1.5_4 caused the card to spam link down messages before switching to spamming alternating link up/link down messages.  It took a few iterations of the link down command before the port stopped flapping and became stable.
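
For anyone who wants to script the workaround instead of typing it at the console, here is a rough sketch of the "repeat link down until it settles" approach.  It rests on assumptions that are not stated above: that the command is FreeBSD's `ifconfig <iface> down` followed by `ifconfig <iface> up`, that the port is named `mlxen0`, and that a link counts as stable once `ifconfig` reports `status: active` on several polls in a row.  The function names and timings are made up for illustration.

```python
import re
import subprocess
import time


def link_is_active(ifconfig_output: str) -> bool:
    """Return True if the ifconfig output contains a 'status: active' line."""
    match = re.search(r"^\s*status:\s*(.+)$", ifconfig_output, re.MULTILINE)
    if match is None:
        return False
    return match.group(1).strip() == "active"


def read_status(iface: str) -> str:
    """Run 'ifconfig <iface>' and return its text output."""
    result = subprocess.run(
        ["ifconfig", iface], check=True, capture_output=True, text=True
    )
    return result.stdout


def bounce_until_stable(iface: str, attempts: int = 5,
                        polls: int = 3, interval: float = 2.0) -> bool:
    """Bounce the interface until it stays active for `polls` checks in a row.

    Assumption: 'down' followed by 'up' is the workaround; adjust to taste.
    Returns True once the link looks stable, False after `attempts` tries.
    """
    for _ in range(attempts):
        subprocess.run(["ifconfig", iface, "down"], check=True)
        subprocess.run(["ifconfig", iface, "up"], check=True)
        stable = True
        for _ in range(polls):
            time.sleep(interval)
            if not link_is_active(read_status(iface)):
                stable = False
                break
        if stable:
            return True
    return False


if __name__ == "__main__":
    # 'mlxen0' is a guess at the interface name; check yours with ifconfig.
    ok = bounce_until_stable("mlxen0")
    print("link stable" if ok else "link still flapping")
```

This needs to run as root on the firewall itself, and it only looks at the reported link status, so it says nothing about why the QNAP port misbehaves in the first place.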

Quote from: ProximusAl on March 24, 2025, 01:49:14 PM
You may wish to have a read of my post from a few years ago.

https://forum.opnsense.org/index.php?topic=31705.msg153860#msg153860

The TLDR is that the mlx4en doesn't play nicely with QNAP.
I ended up moving to a Netgear 10G switch instead and the mlx4en's have been solid since....

My upgrade plans got sped up due to the tariffs and I now have a Mikrotik replacing the QNAP.  Initial testing has been positive with no link problems during boot.

OPNsense was plugged into a non-combo port on the QNAP via DAC, so I'm not sure why my results were different from yours.  But I appreciate the tip and am glad I could finally resolve this issue.