OPNsense Forum

English Forums => 25.7 Series => Topic started by: davidfi01 on September 24, 2025, 03:36:45 PM

Title: [SOLVED]: HA issue: Dpinger's down state does not convert to CARP demotion
Post by: davidfi01 on September 24, 2025, 03:36:45 PM
I have implemented LAN CARP and can confirm, using "sysctl net.inet.carp.demotion=240", that the primary's demotion level is set to 240 and the master -> backup failover works. However, in my setup with two independent WAN gateways (one on the primary, the other on the secondary), dpinger's DOWN state on the primary is not converted into a CARP demotion on the primary. The current CARP demotion always remains at 0 when I simulate an ISP outage.
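The demotion counter can be read back on each node to check whether anything has raised it. A minimal sketch, assuming a FreeBSD-based shell on the OPNsense box; classify_demotion and show_carp_demotion are hypothetical helper names, not OPNsense tools:

```shell
#!/bin/sh
# Sketch only: report the CARP demotion state of this node.
# Both function names are made up for illustration.

classify_demotion() {
    # $1 = integer value of net.inet.carp.demotion
    if [ "$1" -gt 0 ]; then
        echo "demoted ($1)"
    else
        echo "not demoted ($1)"
    fi
}

show_carp_demotion() {
    # sysctl -n prints only the value, without the OID name.
    classify_demotion "$(sysctl -n net.inet.carp.demotion)"
}
```

Calling show_carp_demotion before and after a simulated outage shows whether the counter moved. According to carp(4) on FreeBSD, a value written to this OID is added to the current demotion factor rather than replacing it, so a manual test with 240 would be undone by writing -240.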

How can I get CARP to act on dpinger's report of a down state on the ISP side?

D

Title: Re: HA issue: Dpinger's down state does not convert to CARP demotion
Post by: viragomann on September 24, 2025, 05:56:11 PM
CARP and multi-WAN are totally different things.
CARP = failover from the primary OPNsense instance to the secondary
multi-WAN = failover from the primary gateway to the secondary

To combine both, you need to assign both internet connections and both gateways to both OPNsense boxes.

Title: Re: HA issue: Dpinger's down state does not convert to CARP demotion
Post by: davidfi01 on September 24, 2025, 06:37:02 PM
Yes, but it depends on the config. In my case the primary router is connected to the primary ISP. The secondary router is connected to the secondary ISP. This is a dual-WAN, not a multi-WAN config. In dual WAN, the primary router uses the primary ISP exclusively. If dpinger is active, then upon failure to connect to the primary ISP, dpinger flags the gateway as down, and LAN CARP should move traffic over to the secondary router using the secondary ISP.

It is still not working, although I am thinking I need to create a CARP VIP for the WAN interface so the primary runs as master, and then if its WAN link goes down, the secondary will be notified over the CARP link?

D

Title: Re: HA issue: Dpinger's down state does not convert to CARP demotion
Post by: Patrick M. Hausen on September 24, 2025, 06:40:46 PM
CARP cannot do that. It's a strictly link-local protocol.

Your primary router needs a second gateway: the secondary router on the dedicated HA interface. Thus, when the primary ISP fails, clients still use the primary router, which in turn sends the traffic to the secondary one.

Likewise, the secondary router needs the primary one as its second gateway.

Title: Re: HA issue: Dpinger's down state does not convert to CARP demotion
Post by: viragomann on September 24, 2025, 06:51:02 PM
Quote from: davidfi01 on September 24, 2025, 06:37:02 PM
"In my case the primary router is connected to the primary ISP. The secondary router is connected to the secondary ISP. This is a dual-WAN, not a multi-WAN config."
This is a single-WAN setup from the point of view of each OPNsense box.

Quote from: davidfi01 on September 24, 2025, 06:37:02 PM
"If dpinger is active, then upon failure to connect to the primary ISP, dpinger flags the gateway as down, and LAN CARP should move traffic over to the secondary router using the secondary ISP."
Why should it do this?

The gateway status has nothing to do with CARP at all.
If the gateway is down, OPNsense can switch over to another one. However, in this setup there isn't one.

Title: Re: HA issue: Dpinger's down state does not convert to CARP demotion
Post by: davidfi01 on September 24, 2025, 11:32:45 PM
It's a 2-node configuration. When dpinger is used to monitor the primary node's WAN connection by pinging a known public address, and it fails, dpinger issues a WAN interface down message. CARP is supposed to see that message and raise the demotion level from 0 to 240. CARP is not seeing the interface down message, although the interface down message appears in the system logs. Either something is not configured properly, or CARP is not receiving or acting upon the WAN interface down message.

D

Title: Re: HA issue: Dpinger's down state does not convert to CARP demotion
Post by: viragomann on September 24, 2025, 11:59:16 PM
Quote from: davidfi01 on September 24, 2025, 11:32:45 PM
"When dpinger is used to monitor the primary node's WAN connection by pinging a known public address, and it fails, dpinger issues a WAN interface down message."
No. It tells you that the gateway is down.
The interface might still be up at this time.

Quote from: davidfi01 on September 24, 2025, 11:32:45 PM
"CARP is not seeing the interface down message"
In a CARP setup, two boxes share a MAC and an IP in each network segment. Both interfaces need to be in the same layer 2 segment (connected with a switch / bridge). The backup node's interface (e.g. WAN) monitors the advertisement packets of the primary. If they are not arriving on its WAN, the secondary takes over the virtual MAC and IP.
But "gateway down" does nothing here.

Quote from: davidfi01 on September 24, 2025, 11:32:45 PM
"Either something is not configured properly"
Exactly. Connect each WAN to both OPNsense boxes and configure CARP (https://docs.opnsense.org/manual/how-tos/carp.html#configure-carp) properly.

Title: Re: HA issue: Dpinger's down state does not convert to CARP demotion
Post by: davidfi01 on September 26, 2025, 04:47:27 AM
SOLVED: After some thought and a bit of experimentation, the solution was quite simple. I created one additional CARP VIP for the primary WAN interface on the primary node. This connected the loss of the WAN gateway, through its interface, to the CARP demotion processing. As soon as the primary node's WAN gateway monitor sees the gateway is down, it triggers a change of the CARP demotion value from 0 to 240 via the new WAN interface CARP VIP, which cascades across all the CARP VIPs, puts the primary node into backup, and promotes the secondary node to master. With DNS and DHCP properly set up in HA, failover and failback are now completely seamless.

I physically removed the primary WAN cable, which triggered the failover and the transfer of DNS and DHCP processing, and verified the result with whatsmyip and by checking the VIP status on both the primary and the secondary. Reconnecting the WAN cable triggered the failback. I also shut down the primary node and powered it back up to simulate loss of the node. Failover and failback work seamlessly now.
There is no need for WAN grouping, or for the WAN interface IPs and gateways to be on the same switch or segment. There is NO load balancing. The primary node serves all traffic; the secondary node is a hot standby, only routing when the primary node is down or the primary node's gateway is down.
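
The VIP status check mentioned above can also be done from a shell on each node. A rough sketch, assuming ifconfig prints one "carp:" line per VHID in the form "carp: MASTER vhid 1 advbase 1 advskew 0" (the usual FreeBSD layout; verify it on your version). carp_states is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: summarise CARP state per interface from ifconfig output.
# Reads ifconfig-style text on stdin and prints "<ifname> vhid <id> <state>".

carp_states() {
    awk '
        # An unindented line starts a new interface block, e.g. "em0: flags=...".
        /^[^ \t]/ { sub(/:$/, "", $1); ifname = $1 }
        # Indented "carp:" lines carry the state and the VHID.
        $1 == "carp:" { print ifname, "vhid", $4, $2 }
    '
}
```

Running "ifconfig | carp_states" on both nodes before and after pulling the WAN cable should show the roles swapping between MASTER and BACKUP if failover behaves as described.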