[SOLVED] LAGG interface working fine but shown as "flapping"?

Started by Patrick M. Hausen, January 09, 2022, 02:56:24 PM

Hi all,

I run lagg interfaces on a couple of OPNsense firewalls. HW offloading on the physical ports is disabled, of course. The peers are either another OPNsense or some Cisco gear. For the Ciscos I restrict the lagghash to L2 and L3; all other parameters are left at their default settings.

The interfaces seem to work fine, no observable packet loss or anything. But the overview section in the UI shows the interfaces as "flapping". This is for both kinds of peers - Cisco and OPNsense.

Does anyone know what this is supposed to tell me?

The small alarm sign at the top in this particular case means the interface is not in use for layer 3 - strictly a parent for VLANs. So not related to my question here.

Thanks!
Patrick
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)


No log entries. Current state:
cisco#sh lacp 3 internal
Flags:  S - Device is requesting Slow LACPDUs
        F - Device is requesting Fast LACPDUs
        A - Device is in Active mode       P - Device is in Passive mode     

Channel group 3
                            LACP port     Admin     Oper    Port        Port
Port      Flags   State     Priority      Key       Key     Number      State
Gi0/15    SA      bndl      32768         0x3       0x3     0x110       0x3D 
Gi0/16    SA      bndl      32768         0x3       0x3     0x111       0x3D 
cisco#sh lacp 3 counters
             LACPDUs         Marker      Marker Response    LACPDUs
Port       Sent   Recv     Sent   Recv     Sent   Recv      Pkts Err
---------------------------------------------------------------------
Channel group: 3
Gi0/15      332904 304500   0      54       54     0         0     
Gi0/16      333079 304511   0      38       38     0         0     

cisco#sh lacp 3 neighbor
Flags:  S - Device is requesting Slow LACPDUs
        F - Device is requesting Fast LACPDUs
        A - Device is in Active mode       P - Device is in Passive mode     

Channel group 3 neighbors

Partner's information:

                  LACP port                        Admin  Oper   Port    Port
Port      Flags   Priority  Dev ID          Age    key    Key    Number  State
Gi0/15    SA      32768     3cec.ef00.5430  15s    0x0    0x12B  0x1     0x3D 
Gi0/16    SA      32768     3cec.ef00.5430  11s    0x0    0x12B  0x2     0x3D 
cisco#sh int port-channel 3
Port-channel3 is up, line protocol is up (connected)
  Hardware is EtherChannel, address is 00b6.70d6.3290 (bia 00b6.70d6.3290)
  Description: OPNsense
  MTU 1500 bytes, BW 2000000 Kbit/sec, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, link type is auto, media type is unknown
  input flow-control is off, output flow-control is unsupported
  Members in this channel: Gi0/15 Gi0/16
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:20, output 00:00:01, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 5055725
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 2033000 bits/sec, 255 packets/sec
  5 minute output rate 205000 bits/sec, 124 packets/sec
     2904153822 packets input, 2492649896 bytes, 0 no buffer
     Received 6668516 broadcasts (6612086 multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 6612086 multicast, 0 pause input
     0 input packets with dribble condition detected
     2458759583 packets output, 2147033041 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     0 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out
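The "Port State" value 0x3D in the Cisco output above is a bit field. As a rough illustration, here is a minimal Python sketch that decodes it, assuming the LACP state bit layout from IEEE 802.1AX and using the flag names that FreeBSD's ifconfig prints. The function name decode_lacp_state is made up for this example.

```python
# LACP actor/partner state bits (IEEE 802.1AX), lowest bit first.
# The names follow the ones FreeBSD's ifconfig prints for "state=".
LACP_STATE_BITS = (
    (0x01, "ACTIVITY"),      # active LACP (the "A" flag on the Cisco side)
    (0x02, "TIMEOUT"),       # set = short timeout (fast LACPDUs), clear = slow
    (0x04, "AGGREGATION"),
    (0x08, "SYNC"),
    (0x10, "COLLECTING"),
    (0x20, "DISTRIBUTING"),
    (0x40, "DEFAULTED"),
    (0x80, "EXPIRED"),
)


def decode_lacp_state(state):
    """Return the names of all bits that are set in an LACP state byte."""
    return [name for mask, name in LACP_STATE_BITS if state & mask]


print(decode_lacp_state(0x3D))
# ['ACTIVITY', 'AGGREGATION', 'SYNC', 'COLLECTING', 'DISTRIBUTING']
```

With TIMEOUT clear and neither DEFAULTED nor EXPIRED set, 0x3D matches the "SA" flags shown for both ports: slow LACPDUs, active mode, and a fully bundled link.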


Yep. I wonder where the UI pulls that information from.

Looks like this is the piece of code responsible for receiving that information:
https://github.com/opnsense/core/blob/fb041467bf17155b1736b42c8e65a769f984d0b1/src/etc/inc/interfaces.lib.inc#L352-L357

It seems to parse the output of "ifconfig -m -v [interface]" and assumes that there are only these two lines (active and flapping ports). Can you try that command on the CLI and see if there's something else? Could be a misinterpretation.
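To make the idea concrete, here is a minimal Python sketch of that kind of parsing. It is not the actual OPNsense code (which is PHP); the function name parse_lagg_statistics and the embedded sample text are assumptions for illustration only.

```python
import re


def parse_lagg_statistics(ifconfig_output):
    """Collect the counters listed under 'lagg statistics:' as a dict."""
    stats = {}
    in_stats = False
    for raw in ifconfig_output.splitlines():
        line = raw.strip()  # indentation differs between pastes, so ignore it
        if line == "lagg statistics:":
            in_stats = True
            continue
        if in_stats:
            match = re.fullmatch(r"(active ports|flapping):\s*(\d+)", line)
            if match is None:
                break  # the first other line ends the statistics section
            stats[match.group(1)] = int(match.group(2))
    return stats


# Hypothetical sample, shaped like verbose ifconfig output for a lagg.
SAMPLE = """\
laggproto lacp lagghash l2,l3
lagg options:
flags=10<LACP_STRICT>
flowid_shift: 16
lagg statistics:
active ports: 2
flapping: 4
lag id: [(8000,F4-90-EA-00-71-42,016B,0000,0000),
"""

print(parse_lagg_statistics(SAMPLE))
# {'active ports': 2, 'flapping': 4}
```

Note that "flapping" comes back as a number, not a yes/no state, so a UI that shows any non-zero value as "flapping" could flag an interface that is currently working fine.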

root@ffmgate1:~ # ifconfig -m -v  lagg0
lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=800028<VLAN_MTU,JUMBO_MTU>
        capabilities=f53fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,NETMAP,RXCSUM_IPV6,TXCSUM_IPV6>
        ether f4:90:ea:00:71:42
        inet6 fe80::f690:eaff:fe00:7142%lagg0 prefixlen 64 scopeid 0xb
        laggproto lacp lagghash l2,l3
        lagg options:
                flags=10<LACP_STRICT>
                flowid_shift: 16
        lagg statistics:
                active ports: 2
                flapping: 4
        lag id: [(8000,F4-90-EA-00-71-42,016B,0000,0000),
                 (7F9B,00-23-04-EE-BE-3F,8001,0000,0000)]
        laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
                [(8000,F4-90-EA-00-71-42,016B,8000,0002),
                 (7F9B,00-23-04-EE-BE-3F,8001,8000,0101)]
        laggport: igb2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
                [(8000,F4-90-EA-00-71-42,016B,8000,0003),
                 (7F9B,00-23-04-EE-BE-3F,8001,8000,4101)]
        groups: lagg
        media: Ethernet autoselect
        status: active
        supported media:
                media autoselect
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

I don't know if this will solve your issue, but a while back I had a similar issue with a DEC3860 using two 10G interfaces in a LAGG configuration connected to a Cisco switch.
You might try this:

Set net.link.lagg.default_use_flowid = 1 under System -> Settings -> Tunables.

Once I set that tunable the lagg interface never flapped again.

Hello,

I have the issue that the GUI, under LAGG Statistics, shows "flapping: 3".

Strangely, on the console it shows "flapping: 0":

ifconfig -m -v lagg0
lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=900028<VLAN_MTU,JUMBO_MTU,NETMAP>
        capabilities=f53fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,NETMAP,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:xx:xx:xx:xx:xx
        inet6 fe80::xxxx:xxxx:xxxx:xxxx%lagg0 prefixlen 64 scopeid 0xb
        inet 192.168.99.1 netmask 0xffffff00 broadcast 192.168.99.255
        laggproto lacp lagghash l2
        lagg options:
                flags=90<LACP_STRICT>
                flowid_shift: 16
        lagg statistics:
                active ports: 3
                flapping: 0


Which one is correct?
Cheers,
Crissi

Quote from: infinisourcekc on January 10, 2022, 03:54:30 AM
Set net.link.lagg.default_use_flowid = 1 under System -> Settings -> Tunables.
That did it, thanks! No need to set a tunable, though. You can do that directly in the LAGG interface configuration.

lagg statistics:
active ports: 2
flapping: 0


@pmhausen

And is the LAGG Statistics page in the GUI now displayed correctly as well?
Cheers,
Crissi

Not yet, but I have not rebooted the system since. I don't know when this information is refreshed.

That would be interesting. I have rebooted several times, and it seems the GUI status is not refreshing and picking up the correct value.
Cheers,
Crissi