
Title: BGP (FRR) drops all LAN routes when adding WAN Virtual IP (CARP) - HA Cluster
Post by: l.ansaloni on February 18, 2026, 04:28:09 PM

Hi everyone,

I am facing a critical issue with my OPNsense HA cluster: adding or removing a Virtual IP (Alias/CARP) on the WAN interface flushes the entire BGP (FRR) routing table on the LAN side, which causes several minutes of downtime.

My Environment:

Setup: 2x OPNsense instances in High Availability (Master/Slave).
BGP Plugin: os-frr (BGP) enabled.
Backend: A Kubernetes cluster using MetalLB in BGP mode.
Logic:
- MetalLB advertises internal private IPs (e.g., 192.168.9.x/32) via BGP to the OPNsense LAN/VLAN interfaces.
- OPNsense learns these routes and knows exactly which K8s node to send traffic to.
- I own a public /22 range. I manually assign specific Public IPs from this range as Virtual IP Aliases on the OPNsense WAN.
- I use Port Forward (NAT) to map the Public WAN IP to the Private BGP-learned IP.

The Problem:
Whenever I need to add or remove a Public IP on the WAN interface, I follow the standard CARP procedure (https://docs.opnsense.org/manual/how-tos/carp.html#example-adding-a-virtual-ip-to-an-active-vhid-group): disable CARP on secondary -> add VIP -> add on primary -> re-enable CARP on secondary. The moment I Apply Changes on the primary unit:

The BGP table is completely flushed.
The sessions with the K8s neighbors (LAN side) seem to flap or restart.
It takes 3 to 5 minutes for the routes to be relearned and the traffic to flow again.
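
To put numbers on the outage, the session state could be polled while applying the change. This is only a minimal sketch: it assumes Python 3 and `vtysh` are available on the firewall, and that `show bgp summary json` returns per-address-family sections with a `peers` map whose entries carry a `state` field. The helper names `down_peers` and `poll_once` are made up for illustration.

```python
import json
import subprocess


def down_peers(summary):
    """Return the peers whose state is not Established.

    `summary` is the parsed JSON of `vtysh -c "show bgp summary json"`.
    Assumed layout: {"ipv4Unicast": {"peers": {"<ip>": {"state": ...}}}}.
    """
    down = set()
    for section in summary.values():
        if not isinstance(section, dict):
            continue  # ignore anything that is not an address-family section
        for peer, info in section.get("peers", {}).items():
            if info.get("state") != "Established":
                down.add(peer)
    return sorted(down)


def poll_once():
    """Run vtysh once and report the peers that are currently down."""
    result = subprocess.run(
        ["vtysh", "-c", "show bgp summary json"],
        capture_output=True, text=True, check=True,
    )
    return down_peers(json.loads(result.stdout))
```

Calling `poll_once()` every second or so during Apply Changes, and printing a timestamp next to the result, should show when each neighbor leaves and re-enters the Established state.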

Since the WAN VIPs and the LAN BGP sessions are on completely different interfaces, I wouldn't expect a change on the WAN to trigger a full re-initialization of the FRR routing table or LAN-side sessions.

Logs:
I have captured the logs during the event. It seems the FRR service is being stopped/restarted completely.
Notice the "frr_carp: no frr deamons active" message (spelling as logged) and the transition from BGP_Stop to BGP_Start. The log is shown newest entry first.


2026-02-18T15:52:09	Error	bgpd	[H4B4J-DCW2R][EC 33554455] 10.21.1.14 [Error] bgp_read_packet error: Connection reset by peer
...
2026-02-18T15:51:53	Error	bgpd	[H4B4J-DCW2R][EC 33554455] 10.21.1.11 [Error] bgp_read_packet error: Connection reset by peer
...
2026-02-18T15:49:53	Error	frr_carp	no frr deamons active.
2026-02-18T15:49:53	Error	bgpd	[J9K4Q-T8STY][EC 33554466] 10.21.1.16 [FSM] Failure handling event BGP_Start in state Idle, prior events BGP_Stop, (null), fd -1, last reset: No AFI/SAFI activated for peer
2026-02-18T15:49:53	Error	bgpd	[J9K4Q-T8STY][EC 33554466] 10.21.1.15 [FSM] Failure handling event BGP_Start in state Idle, prior events BGP_Stop, (null), fd -1, last reset: No AFI/SAFI activated for peer
...
2026-02-18T15:49:53	Error	bgpd	[J9K4Q-T8STY][EC 33554466] 10.20.1.13 [FSM] Failure handling event BGP_Start in state Idle, prior events BGP_Stop, (null), fd -1, last reset: Update source change
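
For anyone who wants to slice this excerpt, here is a small sketch that parses the tab-separated lines above, sorts them chronologically, and pulls out the "last reset" reason per neighbor. The helper names `parse_events` and `last_reset_reasons` are hypothetical, and the field layout (timestamp, severity, process, message) is assumed from the excerpt rather than from any documented log format.

```python
import re
from datetime import datetime

# Lines copied from the excerpt above; fields are tab-separated.
LOG = (
    "2026-02-18T15:52:09\tError\tbgpd\t[H4B4J-DCW2R][EC 33554455] 10.21.1.14 "
    "[Error] bgp_read_packet error: Connection reset by peer\n"
    "...\n"
    "2026-02-18T15:51:53\tError\tbgpd\t[H4B4J-DCW2R][EC 33554455] 10.21.1.11 "
    "[Error] bgp_read_packet error: Connection reset by peer\n"
    "2026-02-18T15:49:53\tError\tfrr_carp\tno frr deamons active.\n"
    "2026-02-18T15:49:53\tError\tbgpd\t[J9K4Q-T8STY][EC 33554466] 10.21.1.16 "
    "[FSM] Failure handling event BGP_Start in state Idle, prior events "
    "BGP_Stop, (null), fd -1, last reset: No AFI/SAFI activated for peer\n"
    "2026-02-18T15:49:53\tError\tbgpd\t[J9K4Q-T8STY][EC 33554466] 10.20.1.13 "
    "[FSM] Failure handling event BGP_Start in state Idle, prior events "
    "BGP_Stop, (null), fd -1, last reset: Update source change\n"
)

# Matches the neighbor address and the reason at the end of an FSM line.
RESET_RE = re.compile(r"\] (\d+\.\d+\.\d+\.\d+) \[FSM\].*last reset: (.+)$")


def parse_events(log):
    """Return (timestamp, process, message) tuples, oldest first."""
    events = []
    for line in log.splitlines():
        parts = line.split("\t", 3)
        if len(parts) != 4:
            continue  # skip "..." elision markers and blank lines
        timestamp, _severity, process, message = parts
        events.append((datetime.fromisoformat(timestamp), process, message))
    return sorted(events, key=lambda event: event[0])


def last_reset_reasons(events):
    """Map each neighbor address to the 'last reset' reason bgpd logged."""
    reasons = {}
    for _timestamp, process, message in events:
        match = RESET_RE.search(message)
        if process == "bgpd" and match:
            reasons[match.group(1)] = match.group(2)
    return reasons


events = parse_events(LOG)
span = events[-1][0] - events[0][0]
print(f"first to last logged event: {span.total_seconds():.0f} s")
for neighbor, reason in last_reset_reasons(events).items():
    print(neighbor, "->", reason)
```

On this excerpt the span from the restart at 15:49:53 to the last connection reset at 15:52:09 is 136 seconds, and 10.20.1.13 is the one neighbor whose reset reason is "Update source change" rather than "No AFI/SAFI activated for peer".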

Configuration Details:

AS OPNsense: 64512 / AS K8s: 64514.
BGP Neighbors configured with: Next-Hop-Self, Multi-Hop (nodes are on a different VLAN), and BFD.
The issue happens exactly when the "Interface/VIP" configuration is reloaded by the OS.

Questions:

Is it expected behavior for FRR to restart or drop routes when any interface configuration (even an unrelated WAN Alias) is modified?
The log line "frr_carp: no frr deamons active." suggests that the CARP hook script is either forcing a restart or finding the service dead. Is there a way to prevent this for WAN-only changes?
Is there a way to make the FRR process "immune" to interface reloads that don't involve the BGP-facing interfaces?

I need to be able to manage my Public IP pool without taking down the internal routing for the whole cluster. Any advice is welcome!
