BGP (FRR) drops all LAN routes when adding WAN Virtual IP (CARP) - HA Cluster

l.ansaloni · February 18, 2026, 04:28:09 PM

Hi everyone,

I am facing a critical issue with my OPNsense HA cluster where adding or removing a Virtual IP (Alias/CARP) on the WAN interface causes the entire BGP (FRR) routing table on the LAN side to be dropped/flushed, causing downtime for several minutes.

My Environment:

- Setup: 2x OPNsense instances in High Availability (Master/Slave).
- BGP Plugin: os-frr (BGP) enabled.
- Backend: A Kubernetes cluster using MetalLB in BGP mode.
- Logic:
  - MetalLB advertises internal private IPs (e.g., 192.168.9.x/32) via BGP to the OPNsense LAN/VLAN interfaces.
  - OPNsense learns these routes and knows exactly which K8s node to send traffic to.
  - I own a public /22 range. I manually assign specific public IPs from this range as Virtual IP Aliases on the OPNsense WAN.
  - I use Port Forward (NAT) to map the public WAN IP to the private BGP-learned IP.

The Problem:
Whenever I need to add or remove a Public IP from the WAN interface (following the standard CARP procedure: disable CARP on secondary -> add VIP -> add on primary -> re-enable CARP on secondary), the moment I Apply Changes on the primary unit:

- The BGP table is completely flushed.
- The sessions with the K8s neighbors (LAN side) seem to flap or restart.
- It takes 3 to 5 minutes for the routes to be relearned and for traffic to flow again.

Since the WAN VIPs and the LAN BGP sessions are on completely different interfaces, I wouldn't expect a change on the WAN to trigger a full re-initialization of the FRR routing table or LAN-side sessions.

Logs:
I have captured the logs during the event. It seems the FRR service is being stopped/restarted completely.
Note the "frr_carp: no frr deamons active" message (spelling as logged) and the transition from BGP_Stop to BGP_Start.

2026-02-18T15:52:09	Error	bgpd	[H4B4J-DCW2R][EC 33554455] 10.21.1.14 [Error] bgp_read_packet error: Connection reset by peer
...
2026-02-18T15:51:53	Error	bgpd	[H4B4J-DCW2R][EC 33554455] 10.21.1.11 [Error] bgp_read_packet error: Connection reset by peer
...
2026-02-18T15:49:53	Error	frr_carp	no frr deamons active.
2026-02-18T15:49:53	Error	bgpd	[J9K4Q-T8STY][EC 33554466] 10.21.1.16 [FSM] Failure handling event BGP_Start in state Idle, prior events BGP_Stop, (null), fd -1, last reset: No AFI/SAFI activated for peer
2026-02-18T15:49:53	Error	bgpd	[J9K4Q-T8STY][EC 33554466] 10.21.1.15 [FSM] Failure handling event BGP_Start in state Idle, prior events BGP_Stop, (null), fd -1, last reset: No AFI/SAFI activated for peer
...
2026-02-18T15:49:53	Error	bgpd	[J9K4Q-T8STY][EC 33554466] 10.20.1.13 [FSM] Failure handling event BGP_Start in state Idle, prior events BGP_Stop, (null), fd -1, last reset: Update source change
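
The excerpt above is listed newest-first, so the sequence of events is easier to follow once it is re-sorted. The following is a minimal Python sketch for that, assuming the exported log lines are tab-separated (time, level, process, message) as they appear here. The function name parse_frr_log and the two regular expressions are illustrative helpers, not part of FRR or OPNsense, and the sample contains only lines copied from the excerpt.

```python
import re

# Lines copied from the excerpt above; fields are assumed to be tab-separated.
SAMPLE = (
    "2026-02-18T15:52:09\tError\tbgpd\t[H4B4J-DCW2R][EC 33554455] 10.21.1.14 "
    "[Error] bgp_read_packet error: Connection reset by peer\n"
    "...\n"
    "2026-02-18T15:49:53\tError\tfrr_carp\tno frr deamons active.\n"
    "2026-02-18T15:49:53\tError\tbgpd\t[J9K4Q-T8STY][EC 33554466] 10.20.1.13 "
    "[FSM] Failure handling event BGP_Start in state Idle, prior events BGP_Stop, "
    "(null), fd -1, last reset: Update source change\n"
)

# Hypothetical patterns: the peer IPv4 address sits between "] " and " [",
# and the reset reason follows "last reset: " at the end of the message.
PEER_RE = re.compile(r"\]\s+(\d{1,3}(?:\.\d{1,3}){3})\s+\[")
RESET_RE = re.compile(r"last reset: (.+)$")


def parse_frr_log(text):
    """Return log events sorted oldest-first, with peer and reset reason extracted."""
    events = []
    for line in text.splitlines():
        parts = line.split("\t", 3)
        if len(parts) != 4:
            continue  # skip "..." placeholders and blank lines
        timestamp, level, process, message = parts
        peer = PEER_RE.search(message)
        reset = RESET_RE.search(message)
        events.append({
            "time": timestamp,
            "level": level,
            "process": process,
            "peer": peer.group(1) if peer else None,
            "last_reset": reset.group(1) if reset else None,
            "message": message,
        })
    # ISO 8601 timestamps sort correctly as strings; the sort is stable,
    # so lines sharing a timestamp keep their original relative order.
    events.sort(key=lambda event: event["time"])
    return events


if __name__ == "__main__":
    for event in parse_frr_log(SAMPLE):
        print(event["time"], event["process"], event["peer"], event["last_reset"])
```

Read oldest-first, the frr_carp message and the BGP_Stop/BGP_Start failures at 15:49:53 come before the "Connection reset by peer" errors, which is the ordering that matters when working out what triggered the flap.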

Configuration Details:

- AS OPNsense: 64512 / AS K8s: 64514.
- BGP Neighbors configured with: Next-Hop-Self, Multi-Hop (nodes are on a different VLAN), and BFD.
- The issue happens exactly when the "Interface/VIP" configuration is reloaded by the OS.

Questions:

- Is it expected behavior for FRR to restart or drop routes when any interface configuration (even an unrelated WAN Alias) is modified?
- The log line "frr_carp: no frr deamons active." suggests that the CARP hook script is either forcing a restart or finding the service dead. Is there a way to prevent this for WAN-only changes?
- Is there a way to make the FRR process "immune" to interface reloads that don't involve the BGP-facing interfaces?

I need to be able to manage my Public IP pool without taking down the internal routing for the whole cluster. Any advice is welcome!
