Interfaces report in/out errors

Started by NW4FUN, July 05, 2022, 10:34:42 PM

Hey guys,

I implemented the solution proposed in this thread https://forum.opnsense.org/index.php?topic=28725.0 by disabling Spanning Tree (not sure that is the safest thing to do in any case) on my Meraki switches (2x MS125-24p and 1x MS120-8 running FW 14.33.1). However, the interfaces are still showing errors on ax0, both for LAN (the physical interface) and for its VLANs (LAN being the parent interface).

Any words of wisdom, anyone? My DEC3840 has given me headaches with 10G since day 1 and I'm running out of options here...

This is annoying and only happens on the ax0/1 interfaces; igb0/1/2/3/4 are looking good (with or without STP enabled).

Cheers,

NW4FUN

Did you disable hardware CRC offloading?
Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

What real world impact do these error counters have? It seems that e.g. Netmap (Zenarmor or Intrusion Detection IPS mode) with VLANs generates spurious errors...


Cheers,
Franco

Are you using SFP+ 10Gb modules?

Cisco Meraki doesn't have 10Gb SFP+ (it has 1Gb SFP mini-GBIC), so you might get some errors due to that.

Oh, my bad, you were talking about the Cisco Meraki Cloud Managed series switches (I just noticed there are 2 different Meraki switch series). Yes, those do have 10Gb SFP+ ports.

July 07, 2022, 10:22:37 AM #5 Last Edit: July 07, 2022, 10:46:44 AM by NW4FUN
Quote from: franco on July 06, 2022, 08:35:56 AM
What real world impact do these error counters have? It seems that e.g. Netmap (Zenarmor or Intrusion Detection IPS mode) with VLANs generates spurious errors...


Cheers,
Franco

Hi Franco,

I do not know what the "real world impact" of this is. However, errors are showing up not only on the VLANs but also on the physical interface (LAN). Bizarrely enough, I'm seeing these errors exclusively on the ax0/1 ports; igb0/1/2/3 are totally fine as far as error counters go.
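
One rough way to put a number on the "real world impact" is to compare how fast the error counter grows with how much traffic the interface actually carries. The sketch below is only illustrative: the function name and the sample values are made up, and it assumes the packet and error counters for ax0 are read twice (for example from `netstat -i`) a known number of seconds apart.

```python
def error_stats(before, after, interval_s):
    """Compare two (packets, errors) counter samples taken interval_s seconds apart.

    Returns (errors per second, errors per packet) for that interval.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    d_packets = after[0] - before[0]
    d_errors = after[1] - before[1]
    per_second = d_errors / interval_s
    # Avoid dividing by zero when the interface was idle between samples.
    per_packet = d_errors / d_packets if d_packets > 0 else 0.0
    return per_second, per_packet


# Hypothetical samples, not taken from this thread: (packets, errors)
first = (1_000_000, 100)
second = (2_000_000, 600)
per_second, per_packet = error_stats(first, second, interval_s=60)
print(f"{per_second:.2f} errors/s, {per_packet:.4%} of packets")
```

A tiny fraction of errors per packet with no visible throughput problems points more towards a counting quirk than towards real loss, but that is a judgment call rather than a hard rule.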

NW4FUN


Error counting keeps growing at a massive pace...
Any idea anyone?

Quote from: NW4FUN on July 09, 2022, 08:50:30 AM
Error counting keeps growing at a massive pace...
Any idea anyone?

OK, I searched Cisco's specification site, since they sell models with different port counts and different types of SFP ports (meaning the model with 48 ports has SFP+, whereas the otherwise identical model with 8 or 24 ports only has 1Gb SFP).

https://documentation.meraki.com/MS/MS_Overview_and_Specifications/MS125_Overview_and_Specifications <--- according to those specifications, Meraki MS120 series switches don't have 10Gb SFP+, so it might be your 8-port Meraki causing those errors (the last number in each model name is the number of ports the switch has).

Nothing to worry about; it should be fixed when you swap the 10Gb SFP+ cable connected to the MS120-8 for a 1Gb SFP cable. The errors could basically reflect packets being dropped because the port cannot receive them at the faster rate, so if everything works fine, you can just ignore them.

July 09, 2022, 09:32:02 AM #8 Last Edit: July 09, 2022, 09:50:12 AM by Vilhonator
And here are specifications for MS120 switches
https://documentation.meraki.com/MS/MS_Overview_and_Specifications/MS120_Overview_and_Specifications

It is easy to confuse the SFP types unless you have worked with them. SFP is 1Gb and SFP+ is 10Gb, and SFP+ is backwards compatible with SFP (to my knowledge; there are also SFP to Ethernet modules, see https://www.cisco.com/c/en/us/products/collateral/interfaces-modules/small-business-network-accessories/datasheet-c78-741408.html).

But I suspect the reason for the errors is basically that you have connected a 10Gb SFP+ to an SFP port, so not everything is getting through and your network gear has to wait for those packets or something (collisions are what you really want to avoid).

You can test whether it is something you should fix by storing a 500GB or 1TB file on a computer connected to the MS125-24 switch and sending it to a computer connected to the MS120-8 switch. If the file doesn't get corrupted and can be read, repeat the test with multiple files that together are huge and, if you can, add a few more clients to the test.
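
To check whether a transferred file arrived uncorrupted, one option is to compare a checksum of the file on both machines. This is a minimal sketch using Python's standard `hashlib`; the function name and the file path in the usage note are placeholders, not something from this thread.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that very large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read until fh.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Run `sha256_of_file("/path/to/testfile")` on the sending and on the receiving computer; if the two digests match, the file was not corrupted in transit.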

Basically the only thing that should be affected is that transfers take a bit longer than you would expect from 10Gb/s shared among the clients with 1Gb/s connections. To get 10Gb/s speeds, you need computers with 10Gb network interfaces connected to 10Gb ports.

You also need to consider the switching capacity of each switch. Obviously there will be some delay when a faster switch is pushing traffic to a slower one, which is why it is recommended to combine two of the same model, even if you don't need all of their ports.

It is worth checking, since despite some errors the connections could hold up quite well until the total bandwidth the switch has to handle exceeds its switching capacity by a long shot.

If you do test by sending files, remember that network speed is counted in bits per second, so a 1GB file is actually 8Gb, not 1Gb (divide bits per second by 8 to get bytes per second). So it's not a question of whether issues could occur, but more likely of how many files or clients the switch can handle without issues other than speeds slowing down.
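
Since one byte is eight bits, the best-case transfer time for a test file can be estimated like this. It is a rough sketch: the function name is made up, units are decimal (1 GB = 8 Gb), and protocol overhead and disk speed are ignored, so real transfers will be slower.

```python
def transfer_seconds(size_gigabytes, link_gbps):
    """Best-case seconds needed to move size_gigabytes over a link_gbps link."""
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    size_gigabits = size_gigabytes * 8  # 1 byte = 8 bits
    return size_gigabits / link_gbps


# A 500 GB test file over a 1 Gb/s link and over a 10 Gb/s link
for speed in (1, 10):
    print(f"{speed} Gb/s: {transfer_seconds(500, speed) / 60:.1f} minutes")
```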

Quote from: Vilhonator on July 09, 2022, 09:26:20 AM
Quote from: NW4FUN on July 09, 2022, 08:50:30 AM
Error counting keeps growing at a massive pace...
Any idea anyone?

OK, I searched Cisco's specification site, since they sell models with different port counts and different types of SFP ports (meaning the model with 48 ports has SFP+, whereas the otherwise identical model with 8 or 24 ports only has 1Gb SFP).

https://documentation.meraki.com/MS/MS_Overview_and_Specifications/MS125_Overview_and_Specifications <--- according to those specifications, Meraki MS120 series switches don't have 10Gb SFP+, so it might be your 8-port Meraki causing those errors (the last number in each model name is the number of ports the switch has).

Nothing to worry about; it should be fixed when you swap the 10Gb SFP+ cable connected to the MS120-8 for a 1Gb SFP cable. The errors could basically reflect packets being dropped because the port cannot receive them at the faster rate, so if everything works fine, you can just ignore them.

Hi Vilhonator,

Thanks for taking the time to look into this; however, you must have misread my topology...
All that matters is the core layer, which is an MS125-24p that the DEC3840 is hooked into.
The access layer is another MS125-24p, which is bound to the core via a 20G LACP link.
The small MS120-8 you're referring to is indeed a small service switch, connected at 1G to the access layer and used exclusively to serve 2 devices.
For clarity I'm adding a screenshot of my topology.

Also, I'm adding a screenshot of the interface errors on OPNsense, hopefully someone (Franco??) may help in shedding some light here.

Cheers,

NW4FUN


Hmm, it shouldn't really be much of an issue.

If the ports that show errors are connected to switches, and all devices connected to them get their internet connection through the firewall, then the errors could happen because the traffic going through the firewall is too heavy, with OPNsense reporting errors in so much detail that it gets a bit confusing. (https://shop.opnsense.com/product/dec3840-opnsense-rack-security-appliance/ <----- check the System Performance statistics of your model)

Whenever you face errors of this kind, you should check whether they need to be taken care of, since they indicate that the firewall or some other device connected to it just can't keep up.

If that is the case, then simple traffic shaping (https://docs.opnsense.org/manual/how-tos/shaper_share_evenly.html) should fix it.

It could also just be a bug in OPNsense and worth reporting (which I think you already did when you created this post).

Shaping a 10G/10G link on a FW that is supposed to be able to route 17Gbps NGFW to me seems at least odd...

@Franco - what's your take on these errors?

Quote from: NW4FUN on July 11, 2022, 06:53:29 PM
Shaping a 10G/10G link on a FW that is supposed to be able to route 17Gbps NGFW to me seems at least odd...

@Franco - what's your take on these errors?


That's not odd, it's simplicity at its finest. The reason your firewall might get overwhelmed is that you have only one firewall and multiple switches forwarding internet traffic in and out through it. If too many computers are sending and receiving packets to and from the internet at the same time, your firewall can get overwhelmed and errors appear (that's how the simplest form of DoS attack works: send and request too many packets until the firewall just gives up and drops all connections).

It's the same as connecting too many appliances to the same power strip: when the power draw exceeds the limit (usually 10A), the fuse kicks in and cuts the power. Firewalls and switches aren't any different, except that they don't drop connections until the maximum threshold is reached or they get massively overwhelmed.