WAN Interface Statistics not working correctly

Started by Component0002, June 15, 2026, 11:17:31 PM

Hello,

I have upgraded my firewall from an Intel 4-port 1 Gbps NIC to a Mellanox ConnectX-3 Pro. I reinstalled OPNsense, enabled the kernel module with mlx4en_load=YES, edited the interface names in my config, and reimported it.

Everything works correctly apart from the Interface Statistics on the Dashboard. The counters there regularly go down even though I'm not restarting the firewall.

For example, currently it is at 1.73GB bytes out but if I check back in a few minutes it will be at 1.2GB bytes out. I have also had it regularly go to 18.45E which I believe is 18.45 Exabytes. This happens within minutes of the firewall having started so it is obviously not true as I only have a 900Mbps line...
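As a rough sanity check on the 18.45E reading, the sketch below compares what a 900 Mbps line can carry at full rate with 2**64 bytes. It assumes the counter is an unsigned 64-bit value, which the thread does not confirm at this point; the names are illustrative only.

```python
# Upper bound on what a 900 Mbps line can transfer, compared with 2**64 bytes.
LINE_RATE_BITS_PER_S = 900_000_000  # 900 Mbps, as stated in the post
COUNTER_LIMIT = 2**64               # assumption: unsigned 64-bit byte counter


def max_bytes(seconds: float, rate_bps: int = LINE_RATE_BITS_PER_S) -> float:
    """Most bytes that can cross the link in `seconds` at full line rate."""
    return rate_bps / 8 * seconds


ten_minutes = max_bytes(10 * 60)
seconds_to_limit = COUNTER_LIMIT / (LINE_RATE_BITS_PER_S / 8)
years_to_limit = seconds_to_limit / (365 * 24 * 3600)

print(f"10 minutes at line rate: {ten_minutes / 1e9:.1f} GB")
print(f"2**64 bytes in decimal exabytes: {COUNTER_LIMIT / 1e18:.2f} EB")
print(f"Years at line rate to count up to 2**64 bytes: {years_to_limit:.0f}")
```

Ten minutes at full line rate is tens of gigabytes, while counting up to 2**64 bytes would take thousands of years, so a reading of 18.45E minutes after boot cannot be real traffic.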

Furthermore, when this happens it breaks the live traffic graph and the units are all wrong. I have confirmed that this bug happens with 26.1.9 and also with 26.1.10 now.

This was never an issue with the Intel card so I'm wondering if it is an issue with the driver? The VLAN Interface Statistics display without any issues.

How do I resolve this issue? I have checked the logs but I can't see any errors here.

After the interface change the reporting database might have become corrupted. Just a guess.
Perhaps resetting would help? Reporting > Settings > Reporting Database Options.
There are options to reset / repair but beware, I imagine there could be data loss. I am not certain if the actions taken are destructive.
In the past when I changed interface hardware I had to do this. It didn't matter to me if resetting would reset to zero data but it fixed my reporting problem similar to yours.

Quote from: cookiemonster on June 16, 2026, 12:00:46 AM
After the interface change the reporting database might have become corrupted. Just a guess.
He wrote:
Quote from: Component0002 on June 15, 2026, 11:17:31 PM
I reinstalled OPNsense
So that can't be it ;)

A while back I read this:
- https://forum.opnsense.org/index.php?topic=51907.msg267084#msg267084
- https://forum.opnsense.org/index.php?topic=51907.msg267104#msg267104
I find it hard to believe, but if it's true then it might be the cause of this issue?!
Weird guy who likes everything Linux and *BSD on PC/Laptop/Tablet/Mobile and funny little ARM based boards :)

I have run the reset of the RRD data, but unfortunately the issue persists.

Surely I can't be the only person with a Mellanox ConnectX-3 Pro card. I wonder if other people have the same issue?

Although I have only just gotten this card, perhaps I need to dump it and look into swapping to an Intel card instead.

I've attached a screenshot of the issue too.

Run 'netstat -ib'.

Maybe this way we can at least rule out OPNsense and isolate the issue to FreeBSD.
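A minimal sketch of how the output of that command could be checked automatically. It only parses text that has already been captured from 'netstat -ib'. The function names, the placeholder MAC address and the 2**63 plausibility threshold are assumptions made for illustration, not part of netstat or OPNsense.

```python
# Parse the link-level rows of `netstat -ib` output and flag implausible counters.
COLUMNS = ["Ipkts", "Ierrs", "Idrop", "Ibytes", "Opkts", "Oerrs", "Obytes", "Coll"]


def link_counters(netstat_text):
    """Return {interface: {column: value}} for rows whose Network is <Link#N>."""
    result = {}
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 3 + len(COLUMNS) or not fields[2].startswith("<Link#"):
            continue
        # Take the counters from the right, because the Address column
        # may be empty or may contain spaces.
        values = fields[-len(COLUMNS):]
        if all(v.isdigit() for v in values):
            result[fields[0]] = dict(zip(COLUMNS, map(int, values)))
    return result


def implausible(counters, limit=2**63):
    """Names of counters at or above `limit` (hypothetical threshold)."""
    return [name for name, value in counters.items() if value >= limit]


# Sample in the shape of `netstat -ib` output; the MAC address is a placeholder.
SAMPLE = """\
Name    Mtu Network   Address            Ipkts Ierrs Idrop   Ibytes  Opkts Oerrs    Obytes Coll
mlxen0 1500 <Link#6>  00:00:00:00:00:00  24067     0 15755 17619261 375393     0 651359478    0
"""

for ifname, counters in link_counters(SAMPLE).items():
    print(ifname, counters, "implausible:", implausible(counters))
```

If the same row looks wrong here and on the Dashboard, the problem is more likely below OPNsense, in the OS or driver.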
N5105 | 8/250GB | 4xi226-V | Community

Quote from: nero355 on June 16, 2026, 01:33:48 PM
Quote from: cookiemonster on June 16, 2026, 12:00:46 AM
After the interface change the reporting database might have become corrupted. Just a guess.
He wrote:
Quote from: Component0002 on June 15, 2026, 11:17:31 PM
I reinstalled OPNsense
So that can't be it ;)

I beg to differ. The config backup contains the RRD data unless it's actively unselected. Therefore if it was saved (as per default) then the import would have contained this data.

Quote from: cookiemonster on June 17, 2026, 12:03:31 AM
I beg to differ.

The config backup contains the RRD data unless it's actively unselected.
Therefore if it was saved (as per default) then the import would have contained this data.
OK, but still a different situation then... :)

Quote from: OPNenthu on June 16, 2026, 02:56:47 PM
Run 'netstat -ib'.

Maybe this way we can at least rule out OPNsense and isolate the issue to FreeBSD.

Running this I get:
Name    Mtu   Network       Address      Ipkts    Ierrs  Idrop  Ibytes     Opkts    Oerrs  Obytes      Coll
mlxen0  1500  <Link#6>      MAC Address  24067    0      15755  17619261   375393   0      651359478   0
mlxen0  -     Local IPV6                 6753     -      -      565951     6768     -      586868      -
mlxen0  -     0.0.0.0/22    example.org  1810036  -      -      626700987  3165999  -      3460842453  -
mlxen0  -     IPV6 Address               67311    -      -      27675314   110451   -      41138776    -

The firewall has been up for 2 days. Does this look correct to you? (I have redacted the IPs)

Insights is reporting:
Bytes In    Bytes Out   Total
195.33 G    3.77 T      3.96 T

I don't know if it's correct or not but I was curious if there would be a discrepancy between the two sources.  If they agreed (and the data was obviously wrong) then I would have said the problem could be at the OS/driver level.

What you show here is nowhere near 1TB even, let alone 3.96T.  I had ChatGPT convert the numbers to human readable format:

Name     Network          RX Packets   RX Data     TX Packets   TX Data
-------  ---------------  -----------  ----------  -----------  ----------
mlxen0   <Link#6>             24,067    16.8 MiB      375,393    621.2 MiB
         Local IPv6            6,753   552.7 KiB        6,768    573.1 KiB
         0.0.0.0/22        1,810,036   597.7 MiB    3,165,999     3.22 GiB
         IPv6 Address         67,311    26.4 MiB      110,451     39.2 MiB
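The conversion in the table above can be reproduced with a few lines of Python. This is only a sketch: human_bytes is a hypothetical helper, it uses binary (1024-based) units like the table does, and it always prints one decimal place, so the GiB value comes out less precise than in the table.

```python
# Convert raw byte counters from `netstat -ib` into binary (IEC) units.
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def human_bytes(n: int) -> str:
    """Format a byte count with 1024-based units and one decimal place."""
    value = float(n)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


# Ibytes and Obytes of the <Link#6> row quoted earlier in the thread.
for raw in (17619261, 651359478):
    print(raw, "->", human_bytes(raw))
```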

The netstat numbers (I think) are cumulative since the last boot or when the network interface was initialized.

I'm not sure what the reporting period of your insight data is.  In your first post you said you were looking at the interface totals on the Dashboard, which I would guess more closely align to netstat, with some rounding.


Quote from: OPNenthu on June 17, 2026, 11:10:00 PM
I don't know if it's correct or not but I was curious if there would be a discrepancy between the two sources.  If they agreed (and the data was obviously wrong) then I would have said the problem could be at the OS/driver level.

I have run it again and yes, it is wrong in netstat too:

mlxen0  1500  <Link#6>      MAC Address  18446744073709394600  0  17108  18446744073657478076  18446744073709195408  0  18446744073486500945  0
mlxen0  -     Local IPv6                 6934                  -  -      582195                6955                  -  604056                -
mlxen0  -     0.0.0.0/22    example.org  1828945               -  -      630193193             3185468               -  3462912008            -
mlxen0  -     IPv6 Address               72090                 -  -      29279334              115332                -  41640965              -

This is when it is showing 18.4E in Interface Statistics. I guess this shows it is an OS issue?

Yeah, I would try a different NIC.  That's crazy.

Side note: the maximum value of a 64-bit unsigned int is 18446744073709551615 bytes (16 EiB - 1 byte) and is likely what netstat uses for its counter, at least according to ChatGPT.

Your result is only 157,015 bytes off from that.  It's probably overflowing the counter and wrapping back around to 0, which is likely why you see it increase and then decrease.
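To make the arithmetic concrete, here is a small sketch that measures how far each link-level counter from the second netstat run is from the 64-bit maximum, and what the same bits would mean as a signed value. The signed reading is only one possible interpretation and is not confirmed anywhere in this thread: it would point to the counters having dipped slightly below zero (for example through a subtraction somewhere in the statistics path), not to their having counted all the way up, which a 900 Mbps line could not do in two days.

```python
# Distance of each reported counter from the unsigned 64-bit maximum,
# plus the value the same bits would have as a signed 64-bit integer.
U64_MAX = 2**64 - 1


def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    return value - 2**64 if value >= 2**63 else value


# Values copied from the <Link#6> row of the second netstat run.
readings = {
    "Ipkts": 18446744073709394600,
    "Ibytes": 18446744073657478076,
    "Opkts": 18446744073709195408,
    "Obytes": 18446744073486500945,
}

for name, raw in readings.items():
    print(f"{name}: {U64_MAX - raw} below max, signed view {as_signed64(raw)}")
```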

I will look into another NIC but that means buying new hardware and cables as I'm using a custom Direct Attach Cable.

Is there anywhere to report this bug perhaps?

(Sorry, I was looking at the packet count rather than byte count when I subtracted, but still those are crazy numbers...)

I think FreeBSD would be the best place but I've never submitted a bug there.  Maybe someone else knows.

OPNsense also tracks upstream issues: https://github.com/opnsense/src/issues

I don't know how those get tracked and resolved by upstream, though.  I was always too shy to ask but maybe this is a good time :)

It sounds similar to, or a regression of, this older issue?

https://www.mail-archive.com/freebsd-net%40freebsd.org/msg52770.html

Does 'sysctl hw.mlxen1.stat.rx_bytes' show correctly?