BGP with CARP LAN

Started by duimeziod, September 28, 2024, 05:19:29 AM

I have 2 OPNsense routers/firewalls that I want to run in a high availability configuration. They are connected to separate uplinks and receive BGP routes from their upstreams. How should I configure these so that clients on the LAN will not be interrupted if one machine's WAN or LAN is disconnected? These were originally set up as traditional HA with CARP on both WAN and LAN, and that worked fine, but I had to switch to BGP for the upstream.

I currently have them configured with CARP on the LAN interface. This works in normal operation, but if I set the master node into CARP Persistent Maintenance Mode (so I can perform upgrades), LAN clients can no longer reach the internet. Furthermore, if the master node's WAN is disconnected or if the BGP service is shut down, LAN clients also cannot reach the internet.

I believe what's happening is that in the first case, traffic destined for my LAN is still going to the master node, but not reaching the LAN from it because it is in maintenance mode. In the second case, I believe the CARP VIP is still bound to the master node so outbound traffic goes to the master node, but it has no routes to the internet.

Have you tried "enable carp failover" in the routing settings?

Quote from: bimbar on September 28, 2024, 11:32:38 AM
Have you tried "enable carp failover" in the routing settings?
I've looked at that setting and I don't think it does what I want. The docs say that it will shut down the BGP service when CARP is in backup, but that means failover will take a while as BGP starts up on the other node, resulting in a noticeable interruption. It's also unclear to me whether that setting will force CARP into backup mode if BGP is down.

I'd rather have an active-active setup for BGP and have it switch over with minimal interruption.

I think there must be some way to configure BGP so traffic can be routed to the other node if one of the WAN or LAN interfaces is down/in backup.

CARP is for clients internal to your AS so you need that on your LAN. BGP is for external links. So you did everything correctly so far.

Now the missing piece is that you need a link - preferably dedicated high bandwidth - between the two boxes and run a BGP peering on that. This is called iBGP (internal). The only difference between an iBGP and an eBGP (external) peering is that in iBGP both peers use the same AS number.

You can use the HA link for that, of course.

Now in case one of the external peerings goes down, but packets still arrive at the box now without a proper uplink, it will know to forward the traffic to the peer.
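
As a rough illustration of that fallback, here is a minimal Python sketch. It models only one step of BGP best-path selection (prefer an eBGP-learned route over an iBGP-learned one) and ignores weight, local preference, AS-path length and everything else a real router compares. The addresses and names are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    learned_via: str  # "ebgp" (upstream) or "ibgp" (the other firewall)


def best_route(rib, prefix):
    """Pick a route for prefix, preferring eBGP over iBGP.

    Simplified: real BGP compares several attributes before this step.
    """
    candidates = [r for r in rib if r.prefix == prefix]
    if not candidates:
        return None
    return min(candidates, key=lambda r: 0 if r.learned_via == "ebgp" else 1)


# Hypothetical addresses: 192.0.2.1 is the upstream, 10.255.0.2 is the peer
# firewall reached over the dedicated link.
rib = [
    Route("0.0.0.0/0", "192.0.2.1", "ebgp"),
    Route("0.0.0.0/0", "10.255.0.2", "ibgp"),
]

normal = best_route(rib, "0.0.0.0/0")

# The external peering goes down, so its routes are withdrawn.
rib_after_failure = [r for r in rib if r.learned_via != "ebgp"]
failover = best_route(rib_after_failure, "0.0.0.0/0")

print(normal.next_hop, failover.next_hop)
```

While the upstream session is up the box uses its own uplink; once it is gone, the only remaining route points at the peer, so traffic is handed over the inter-node link.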

HTH,
Patrick
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

September 30, 2024, 01:52:23 AM #4 Last Edit: September 30, 2024, 02:47:46 AM by duimeziod
Quote from: Patrick M. Hausen on September 29, 2024, 03:17:58 PM
Now the missing piece is that you need a link - preferably dedicated high bandwidth - between the two boxes and run a BGP peering on that. This is called iBGP (internal). The only difference between an iBGP and an eBGP (external) peering is that in iBGP both peers use the same AS number.

You can use the HA link for that, of course.

Now in case one of the external peerings goes down, but packets still arrive at the box now without a proper uplink, it will know to forward the traffic to the peer.
Ok, I've tried setting that up but it doesn't seem to be routing properly. Can you suggest specific settings/configuration for this?

What I currently have is, on both nodes, the other node added as a neighbor with "Next-Hop-Self" checked, a weight of 1, and prefix lists of "any" for in and out.

When I disconnected the WAN on the master node, I could still ping out, but could not ping in. I could ping the routed IPs assigned to the routers, but not any other IPs on those subnets that I normally can. However a client on the LAN was able to ping external IPs.

When I set the CARP master to maintenance mode, nothing was reachable in either direction.

I think the problem may be that, in the routing table, the best route for my routed IPs has a next hop of 0.0.0.0 rather than the other node, but I can't figure out how to change that.

Edit: Maybe I should try without setting weight. I think I did that earlier and it didn't work, but I think I had other misconfiguration at the time too.

Sorry, I only ever did that with Cisco gear. So I can only tell you the fundamentals.

Quote from: duimeziod on September 28, 2024, 08:09:31 PM
Quote from: bimbar on September 28, 2024, 11:32:38 AM
Have you tried "enable carp failover" in the routing settings?
I've looked at that setting and I don't think it does what I want. The docs say that it will shut down the BGP service when CARP is in backup, but that means failover will take a while as BGP starts up on the other node, resulting in a noticeable interruption. It's also unclear to me whether that setting will force CARP into backup mode if BGP is down.

I'd rather have an active-active setup for BGP and have it switch over with minimal interruption.

I think there must be some way to configure BGP so traffic can be routed to the other node if one of the WAN or LAN interfaces is down/in backup.

Yes, but that's the easiest way to do a switchover. Not sure how you would do that in another way.

October 03, 2024, 01:44:49 AM #7 Last Edit: October 03, 2024, 01:48:09 AM by duimeziod
I've since figured out that the routes with 0.0.0.0 as next hop are those being originated by the router. So I think the issue is ultimately that when the router is in CARP backup, it still thinks it can route those networks when it actually can't.

I was unable to figure out how to make it remove those routes/prefer the other route when in CARP backup. I think it may be a limitation of CARP since an IP on the network seems to still be assigned to the interface.

I've ended up going with the "enable carp failover" option even though it seems to be less than ideal. At least it works with a few seconds of downtime.

Unfortunately "enable carp failover" is still insufficient as it doesn't handle the case when the master's WAN is disconnected. In that situation, the master remains the CARP master even though it cannot route any traffic to the WAN.

I have continued to try to have the backup as an iBGP peer but it runs into the same problem as before where inbound traffic is being sent to the LAN interface rather than the master node.

The problem is also that the firewall state tables for the backup router don't necessarily match if BGP and CARP don't match, so the firewall drops return packets.
Not a problem for Cisco routers, because they don't have a state table.

Quote from: duimeziod on October 07, 2024, 07:35:14 AM
I have continued to try to have the backup as an iBGP peer but it runs into the same problem as before where inbound traffic is being sent to the LAN interface rather than the master node.
Why is this a problem? Neither source nor destination of the packets are tied to a particular OPNsense node, correct? So the packet just needs to get at the destination in LAN *somehow*?

Quote from: bimbar on October 07, 2024, 10:50:03 AM
The problem is also that the firewall state tables for the backup router don't necessarily match if BGP and CARP don't match, so the firewall drops return packets.
Not a problem for Cisco routers, because they don't have a state table.
If you want to turn OPNsense into a HA BGP system, possibly disable stateful filtering?

Quote from: Patrick M. Hausen on October 07, 2024, 10:56:44 AM
Quote from: bimbar on October 07, 2024, 10:50:03 AM
The problem is also that the firewall state tables for the backup router don't necessarily match if BGP and CARP don't match, so the firewall drops return packets.
Not a problem for Cisco routers, because they don't have a state table.
If you want to turn OPNsense into a HA BGP system, possibly disable stateful filtering?

Maybe. At that point I would use a real router and put OPNsense behind it.

Quote from: Patrick M. Hausen on October 07, 2024, 10:56:44 AM
Why is this a problem? Neither source nor destination of the packets are tied to a particular OPNsense node, correct? So the packet just needs to get at the destination in LAN *somehow*?
I'm not entirely sure.

It seems that my LAN clients can reach the internet, but I can't reach them from outside.

However, I also have WireGuard set up, which is tied to the CARP VIP. This becomes unreachable when the master's WAN is disconnected, presumably because the WireGuard packets are not being sent to the master, which has the WireGuard service running.

Quote from: Patrick M. Hausen on October 07, 2024, 10:56:44 AM
Quote from: bimbar on October 07, 2024, 10:50:03 AM
The problem is also that the firewall state tables for the backup router don't necessarily match if BGP and CARP don't match, so the firewall drops return packets.
Not a problem for Cisco routers, because they don't have a state table.
If you want to turn OPNsense into a HA BGP system, possibly disable stateful filtering?
Shouldn't pfSync sync the firewall state, so that this isn't a problem?

Quote from: bimbar on October 07, 2024, 11:00:23 AM
Maybe. At that point I would use a real router and put OPNsense behind it.
I'm thinking of doing that, but then I'd have to buy more hardware and I'd really rather not have to do that.

Yes, pfsync syncs the states, but it is not usually quick enough to catch the first return packet.
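
To make that race concrete, here is a toy Python model. It is not how pf is implemented; it only assumes that a reply is accepted when the node that receives it already holds a state for the flow, and that state sync happens some time after the state is created. Class, method names and addresses are invented for the sketch.

```python
class Firewall:
    """Toy stateful filter: replies pass only if a matching state exists."""

    def __init__(self, name):
        self.name = name
        self.states = set()

    def pass_outbound(self, src, dst):
        # An allowed outbound packet creates a state for the flow.
        self.states.add((src, dst))

    def accepts_reply(self, src, dst):
        # A reply from src to dst matches the state of the reversed flow.
        return (dst, src) in self.states

    def sync_to(self, other):
        # Stand-in for pfsync: copy this node's states to the other node.
        other.states |= self.states


master = Firewall("master")
backup = Firewall("backup")

client, server = "10.0.0.50", "203.0.113.10"  # hypothetical addresses

# The client sends out through the CARP master...
master.pass_outbound(client, server)

# ...but BGP delivers the reply to the backup before the states were synced.
reply_before_sync = backup.accepts_reply(server, client)

master.sync_to(backup)
reply_after_sync = backup.accepts_reply(server, client)

print(reply_before_sync, reply_after_sync)
```

The first reply is refused because the backup has not seen the state yet; later packets of the same flow pass once the sync has caught up.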

October 09, 2024, 06:01:14 AM #14 Last Edit: October 09, 2024, 06:04:37 AM by duimeziod
I ended up switching the backup node to BIRD instead of FRR and configuring it on the CLI.

The final solution was to run FRR on the master node with CARP failover enabled and the backup node as an iBGP neighbor. The backup node runs BIRD, configured with the master node as an iBGP neighbor over the pfSync interface. Direct routes are also imported. Routes received from that neighbor are given a preference value higher than that of direct routes. The iBGP session also uses BFD for faster failover.

The result is that when I disconnect the master's WAN, my ISP routes traffic to the backup node, which forwards it to the master node over the pfSync interface. When the master's LAN is disconnected (or CARP is disabled/demoted), FRR on the master shuts down, which removes the iBGP routes on the backup node, so all traffic goes through the backup node.

The failover time for these scenarios is still multiple seconds and it seems like established connections do get interrupted. But I'm not running anything that requires more uptime.
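
The route preference described in this post can be sketched in Python as follows. This is only a model of the decision on the backup node, not the actual BIRD configuration, which was not posted. BIRD picks the route with the highest preference value and its default for direct routes is 240; the value 250 for the iBGP routes is an assumption, since the post only says it is higher than the direct one.

```python
# BIRD prefers the route with the highest preference value.
DIRECT_PREFERENCE = 240  # BIRD's default for the direct protocol
IBGP_PREFERENCE = 250    # hypothetical; the post only says "higher than direct"


def backup_next_hop(master_lan_up):
    """Where the backup node sends traffic destined for the LAN prefix.

    With CARP failover enabled, FRR on the master runs only while the master
    holds the CARP VIP, so the iBGP route exists only while its LAN is up.
    """
    routes = [("direct", DIRECT_PREFERENCE, "LAN interface")]
    if master_lan_up:
        routes.append(("ibgp", IBGP_PREFERENCE, "master via pfSync link"))
    _, _, next_hop = max(routes, key=lambda r: r[1])
    return next_hop


# Master's WAN is down but its LAN is up: inbound traffic reaches the backup
# and is handed to the master.
wan_failure = backup_next_hop(master_lan_up=True)

# Master's LAN is down or CARP is demoted: FRR stops, the iBGP route is
# withdrawn, and the backup delivers directly.
lan_failure = backup_next_hop(master_lan_up=False)

print(wan_failure, "|", lan_failure)
```

Raising the preference of the iBGP routes is what keeps the backup from delivering to the LAN itself while the master still owns the CARP VIP and the firewall states.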