Ethernet unplugging does not trigger failover

Started by pinpoint, September 10, 2025, 06:05:57 PM

I run OPNsense on two separate nodes, and each receives its own public IP from my ISP. I have enabled high availability and CARP. If I shut down the master node, the backup node immediately becomes the master and restores the internet connection within a few seconds. However, if I unplug the ethernet cable (instead of powering down the master node), the node is still listed as the master and the backup node remains backup, causing me to lose my internet connection. How can I fix this?

Quote from: pinpoint on September 10, 2025, 06:05:57 PM
However, if I unplug the ethernet cable
Do you only have one? If you have multiple, which one? Try the others as well.

What does your setup look like?
Which hardware is OPNsense running on?

September 12, 2025, 07:29:33 AM #2 Last Edit: September 12, 2025, 10:11:19 AM by pinpoint
I am running OPNsense on two Proxmox nodes; each has 1 WAN and 1 LAN port. WAN on both nodes is connected to a switch, which is connected to a fiber modem (bridge). I get two public IPs from my ISP, so each node has its own IP and both are online and have internet. It should be set up correctly. I also set up monitor IP 1.1.1.1 on WAN, on both nodes. When I unplug the cable, the WAN interface is still registered as up, while in the gateway configuration WAN_DHCP is registered as down.
I have configured CARP VIP LAN 192.168.1.1 on both nodes. HA works and I am able to sync configurations. Master VHID 1 (freq. 1/0), backup VHID 1 (freq. 1/100).
The node does not seem to know that the WAN is unplugged, because the interface is still up.

I'm starting to wonder if this has something to do with Proxmox bridging the NIC to vmbr0, and vmbr0 always reporting the link status as connected even when it is disconnected.

You've mixed CARP high availability with Multi-WAN here, but these are totally different things.
In Multi-WAN mode OPNsense monitors multiple upstream gateways (or other public IPs routed over them) and switches to the gateway with the next priority if the active one fails.
In CARP the secondary monitors the interfaces of the primary and takes over if one fails. But your WANs are not configured as CARP, as I understand it. So why should it switch to the other node?

Your setup only fails over if the LAN interface of the primary goes down, as you've noticed.

Multi-WAN is handled by a single node; CARP requires two. If you want to combine them, you need to configure both WANs on both nodes, primary and secondary, and both as CARP as well.
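
To illustrate why unplugging WAN changes nothing here, the following is a minimal Python sketch of the CARP election, not OPNsense code. It assumes the usual CARP rule that the advertisement interval is advbase seconds plus advskew/256 seconds and that the node advertising most frequently becomes master. The class and function names are hypothetical, and the only health input is the LAN interface, because that is the only interface carrying a CARP VIP in the setup described above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CarpNode:
    """Hypothetical model of one node for a single VHID."""
    name: str
    advbase: int          # base advertisement interval, in seconds
    advskew: int          # skew, in 1/256 second units
    lan_up: bool = True   # LAN is the only interface with a CARP VIP
    wan_up: bool = True   # tracked here, but never consulted by the election

    def interval(self) -> float:
        # Advertisement interval in seconds: advbase + advskew / 256.
        return self.advbase + self.advskew / 256


def elect_master(nodes: Iterable[CarpNode]) -> Optional[CarpNode]:
    """Pick the node with the shortest interval among nodes whose
    CARP-enabled interface (LAN) is still up."""
    candidates = [n for n in nodes if n.lan_up]
    return min(candidates, key=lambda n: n.interval(), default=None)


if __name__ == "__main__":
    node1 = CarpNode("node1", advbase=1, advskew=0)
    node2 = CarpNode("node2", advbase=1, advskew=100)

    # Normal operation: node1 advertises more often and stays master.
    print(elect_master([node1, node2]).name)

    # WAN unplugged on node1: WAN has no CARP VIP, so nothing changes.
    node1_wan_down = CarpNode("node1", 1, 0, lan_up=True, wan_up=False)
    print(elect_master([node1_wan_down, node2]).name)

    # LAN down on node1: now node2 takes over.
    node1_lan_down = CarpNode("node1", 1, 0, lan_up=False)
    print(elect_master([node1_lan_down, node2]).name)
```

With the values from this thread (freq. 1/0 and 1/100), node1 stays master in the first two cases and node2 only takes over in the third.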

September 12, 2025, 11:09:34 PM #4 Last Edit: September 12, 2025, 11:21:36 PM by pinpoint
As I understand it, HA CARP only detects when the LAN interface is down.
Gateway and IP monitoring is for Multi-WAN, but I can't set up gateway groups with only one WAN port on my node, so Multi-WAN is not an option for me.
My goal is to have a continuous internet connection. I have two different ISPs, one connected to each node of the cluster. If my main ISP is disconnected, I want my other node/secondary ISP to take over automatically.
Somebody mentioned dynamic routing. Is this the best way to achieve that with my 2 NIC setup?

Yes, it is the best and only way.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

September 13, 2025, 08:16:23 PM #6 Last Edit: September 13, 2025, 08:21:31 PM by pinpoint
Thank you, that was exactly what I was looking for. I have set up OSPF with BFD on both nodes, so the service is up and running and the nodes seem to communicate with each other, but I still lose internet when disconnecting WAN on the master node. There is no rerouting to node 2, and I am trying to figure out what may be wrong. First of all, is it sufficient for me to use OSPF with BFD, or do I need to set up BGP to achieve this? This is only for a small homelab, so I am not trying to make this more complex than necessary, and I understand BGP can be quite complex.

My OSPF days were back when we ran Catalyst 6500 with IOS, so my experience with the OPNsense implementation is limited.

Your OSPF processes need to announce a default route to the respective partner router. And that default route needs to be withdrawn when the node's own default gateway is removed because the uplink fails.

That's the general approach. Hope you can find what is needed in the UI.
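
A small Python model may make the intended behaviour concrete. It is only a sketch of the logic described above, not of the OPNsense or FRR implementation: the function names and the preference values (1 for the node's own WAN default, 110 for a default learned from the partner) are illustrative assumptions, and the addresses are the ones used in this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DefaultRoute:
    gateway: str
    source: str       # "wan" or "ospf"
    preference: int   # lower wins; values here are illustrative only


def advertises_default(own_wan_gateway_up: bool) -> bool:
    """A node announces a default route to its partner only while its
    own uplink works; otherwise the announcement is withdrawn."""
    return own_wan_gateway_up


def candidate_defaults(own_wan_gateway_up: bool,
                       own_wan_gateway: str,
                       partner_ip: str,
                       partner_wan_gateway_up: bool) -> List[DefaultRoute]:
    """Default routes a node would hold under the assumptions above."""
    routes = []
    if own_wan_gateway_up:
        routes.append(DefaultRoute(own_wan_gateway, "wan", 1))
    if advertises_default(partner_wan_gateway_up):
        routes.append(DefaultRoute(partner_ip, "ospf", 110))
    return routes


def active_default(routes: List[DefaultRoute]) -> Optional[DefaultRoute]:
    """The route with the lowest preference value is the one in use."""
    return min(routes, key=lambda r: r.preference, default=None)


if __name__ == "__main__":
    # node1 with a working uplink uses its ISP gateway.
    up = candidate_defaults(True, "151.130.80.1", "192.168.50.3", True)
    print(active_default(up))

    # node1 with a failed uplink falls back to the partner node.
    down = candidate_defaults(False, "151.130.80.1", "192.168.50.3", True)
    print(active_default(down))
```

The second case only works if the node actually notices that its uplink has failed and removes its own default route; as long as that route stays in the table, the partner's route never becomes active.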

HTH,
Patrick

Thanks!
I have tried different configurations, but for some reason I can't get it to work. The nodes do communicate with each other: the master state is "Full/DR" and the backup is "Full/Backup". However, OSPF does not react when the WAN gateway is down. When I unplug the cable, the gateway is marked down within just a few seconds. These are my configurations (neighbors, prefix lists and route maps are empty). I also turned off BFD until I get OSPF working. I have no gateway groups. My CARP VIP LAN IP is 192.168.50.1. The master router is 192.168.50.2, the backup is 192.168.50.3. I have also tried it with CARP failover/demote, but that did not work either. So my current configuration avoids CARP, and node1 is always master.


My ISP gateway ip










Could you check the routing tables on both nodes? "Advertise default gateway" should lead to each node having a second default route with lower priority and the respective other node as the gateway.

When the WAN gateway goes away that route should take over.

The traffic should then go over the HA link - this is where you run OSPF, right? You also need firewall rules on that link permitting the traffic.

I think so. Here are some more screenshots that might help. (Black theme is node1 192.168.50.2, white is node 2 192.168.50.3.)
Routing table:









Is there a default route in "netstat -rn" output when the WAN link is unplugged? And does it lead to the other node as the gateway?
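
If you want to script that check, here is a minimal Python sketch. It assumes the FreeBSD "netstat -rn" column layout (Destination, Gateway, Flags, Netif); the function names are hypothetical and the sample text only mimics that layout with addresses from this thread.

```python
from typing import List, Tuple


def default_routes(netstat_output: str) -> List[Tuple[str, str]]:
    """Return (gateway, netif) for every line whose destination is 'default'."""
    found = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Destination, Gateway, Flags, Netif [, Expire]
        if len(fields) >= 4 and fields[0] == "default":
            found.append((fields[1], fields[3]))
    return found


def fails_over_to(netstat_output: str, partner_ip: str) -> bool:
    """True if any default route points at the partner node."""
    return any(gw == partner_ip for gw, _ in default_routes(netstat_output))


if __name__ == "__main__":
    # Sample text in the "netstat -rn" layout, not live output.
    sample = """\
Internet:
Destination        Gateway            Flags         Netif Expire
default            151.130.80.1       UGS          vtnet0
192.168.50.0/24    link#2             U            vtnet1
"""
    print(default_routes(sample))
    print(fails_over_to(sample, "192.168.50.3"))
```

Run it against the output captured while the WAN cable is unplugged; if the default route still points at the ISP gateway, the node has not noticed the failure.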

And please do not use external image hosting sites (/me adding "8upload" to my blocklist ...) You can attach images to forum posts right on the forum.

Kind regards,
Patrick

Oh sorry. :)
It does not seem to lead to the backup node 192.168.50.3 when disconnected.

Routing tables

Internet:
Destination        Gateway            Flags         Netif Expire
default            151.130.80.1       UGS          vtnet0
8.8.8.8            151.130.80.1       UGHS         vtnet0
10.10.10.0/24      link#7             U               wg0
10.10.10.1         link#3             UHS             lo0
10.10.10.2         link#7             UHS             wg0
10.10.10.3         link#7             UHS             wg0
10.10.10.4         link#7             UHS             wg0
127.0.0.1          link#3             UH              lo0
192.168.50.0/24    link#2             U            vtnet1
192.168.50.1       link#3             UHS             lo0
192.168.50.2       link#3             UHS             lo0
151.130.80.0/20    link#1             U            vtnet0
151.130.84.90      link#3             UHS             lo0

Internet6:
Destination                       Gateway                       Flags         Netif Expire
::1                               link#3                        UHS             lo0
fe80::%lo0/64                     link#3                        U               lo0
fe80::1%lo0                       link#3                        UHS             lo0

default            151.130.80.1       UGS          vtnet0

That's a virtual interface inside of some hypervisor. You did not mention that you run OPNsense in a VM. That interface will not go down when you unplug the physical cable. You need a physical interface to notice up/down events - either by running OPNsense directly on the hardware or by using PCIe passthrough at least for WAN.