CARP on WAN behaving weirdly...

Started by ghosterius, November 24, 2024, 11:04:00 PM

Hi Everyone!

So, my setup is as follows:
Two OPNsense instances virtualized on Proxmox, with 1 vNIC and 2 physical host NICs assigned.

The vNIC is trunked and has multiple vLANs crossing it, no issues there, everything's working wonderfully (CARP and the lot work fine there).

Then there's 1 NIC dedicated to the WAN connection (this is the one that's acting a bit tricky... more in a sec) and 1 NIC dedicated to CARP between the 2 VMs.

CARP vIPs are configured for the internal LAN networks (multiple vLANs) and everything's sort of alright... but the WAN connection is just acting weirdly.

Whenever I enable CARP on the backup machine, all vIPs go into BACKUP mode, but a few seconds (or minutes) later, WAN switches to MASTER, while on the main firewall it's also MASTER!

I've checked the physical cables, the firewall status and the logs, but nothing shows up as being blocked at any point...
I've attached images of the configs.

In the log I see this:

2024-11-24T19:26:56 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member " (192.168.17.2) (2@igb0)" has resumed the state "MASTER" for vhid 2
2024-11-24T19:26:56 Notice kernel <6>carp: 2@igb0: BACKUP -> MASTER (master timed out)
2024-11-24T19:22:35 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : openvpn_refresh_crls(1))
2024-11-24T19:22:35 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : core_trust_crl(1))
2024-11-24T19:22:35 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (1)
2024-11-24T19:22:35 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : openvpn_refresh_crls(1))
2024-11-24T19:22:35 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member " (192.168.18.2) (10@vtnet0_vlan10)" has resumed the state "BACKUP" for vhid 10
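
For anyone digging through a longer log, here is a minimal Python sketch that pulls out only the kernel CARP transition lines like the one above. The regular expression is based solely on the line format quoted in this post, and the function name `carp_transitions` is made up for illustration. A "master timed out" reason suggests the node stopped seeing the other node's advertisements on that interface.

```python
import re

# Pattern for the kernel CARP transition lines quoted above, e.g.
# "carp: 2@igb0: BACKUP -> MASTER (master timed out)".
# Assumption: other transition lines follow the same layout.
CARP_LINE = re.compile(
    r"carp: (?P<vhid>\d+)@(?P<iface>\S+): "
    r"(?P<old>[A-Z]+) -> (?P<new>[A-Z]+)(?: \((?P<reason>[^)]*)\))?"
)


def carp_transitions(lines):
    """Yield (vhid, interface, old_state, new_state, reason) per matching line."""
    for line in lines:
        m = CARP_LINE.search(line)
        if m:
            yield (int(m["vhid"]), m["iface"], m["old"], m["new"], m["reason"])


sample = [
    "2024-11-24T19:26:56 Notice kernel <6>carp: 2@igb0: BACKUP -> MASTER (master timed out)",
    "2024-11-24T19:22:35 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (1)",
]

transitions = list(carp_transitions(sample))
for t in transitions:
    print(t)
```

Only the first sample line matches; the pluginctl line is ignored.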


An interesting aspect is that I don't even have OpenVPN configured, so I don't know wth openvpn wants with the lot but... OK...

I must admit I am lost... I don't know why this is happening and why it doesn't "see" that the other node has the WAN vIP up!

As a final point on the architecture, in front of the 2 FWs there's an ISP router which works absolutely fine and has been working for years without a problem with the other *Sense firewall software.

Did you ever figure this out?

I have an even simpler setup with two physical boxes and a dedicated (cross-over) cable between the sync interfaces. No VLANs. I'm new to OPNsense, used pfSense before, and I'm struggling with this...

I can see traffic over the sync interface, but both nodes still activate the CARP VIPs, which causes a conflict when both are set to MASTER. And the backup unit is the one that works, if anything works at all. It is so strange. I saw some videos explaining the setup, and even there they ran into weird issues.

CARP does not happen over the sync interface but on each CARP interface individually.

How exactly did you configure the CARP interface(s)?
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

November 29, 2024, 10:23:45 AM #3 Last Edit: November 29, 2024, 10:29:25 AM by firewallfun
Yes, that's true, I guess. I was thinking more of the communication behind it that activates it, so that only one VIP is active to avoid a conflict. And to be sure of that, the sync interface has to work properly.

I have the interfaces set up as shown in the attached images. Since I have public IPs on both the WAN (/29) transport network and the LAN (/24), I have censored part of the public IPs.

Needed another post to be allowed to upload last image I had :)

Why are you using unicast instead of multicast? Also I recommend using the fitting /29 prefix length for both fixed IP addresses but /32 for the CARP VIP.

November 29, 2024, 06:30:23 PM #6 Last Edit: November 29, 2024, 06:47:41 PM by firewallfun
Well, a lot of the guides I watched recommend it, and the documentation also seemed to favour the unicast method on a dedicated interface: no need to do discovery on each request. But who knows. I can remove it. It is a dedicated cross-over cable directly from port to port, so it shouldn't matter a lot. Is it safe to have unicast on the WAN interface?

I did a completely new install and fixed the lagg issue I had on LAN.

Btw, I have the WAN CARP VIP kind of working now. I can sync the rules over. And if I take one of the firewalls offline, the WAN CARP IP replies perfectly, with just a single missed ping. And everything works 100% while the backup OPNsense fw is rebooting. The only visible issue, and it is a big one, is that when I log in to the CARP WAN IP while both servers are up, I land on the backup unit. I only reach the master on the CARP WAN IP when the backup unit is down for a reboot.

When checking the VIP status with both FWs on, the master unit correctly shows MASTER on both my CARP WAN and CARP LAN IPs. Like it should.

But the bad part is that the backup unit shows MASTER on the CARP WAN IP as well. The CARP LAN IP behaves correctly.

So you can say that the CARP LAN IP works on both as it should. But the CARP WAN IP is active in both places at the same time, which I think creates some issues :( Before, I had both of them wrong - both WAN and LAN - so progress :) I have tried many reboots and tried activating/deactivating CARP, but it doesn't seem to change anything.

PS: I just removed the peer IP to use multicast (I think). Did it in all 4 places (WAN+LAN). Just leaving it empty like that is enough, I guess?

I can still sync everything, just as with the peer IP. So it is only this master/backup state on the VIP that is the issue. The pfsync interface is working just fine.

Note that if I turn on Persistent CARP maintenance mode on the backup unit, it sets the demotion level to 240, but it still shows status MASTER on the WAN CARP IP. If I click Temporary disable CARP, it shows BACKUP on both CARP IPs (WAN and LAN) for two seconds before WAN becomes MASTER again (and so MASTER on both OPNsense boxes).

I wonder if there are some outbound NAT settings I have to change to fix this?

The peering on the HA sync interface and CARP are in no way connected. CARP works completely isolated. You can have two FreeBSD machines with identically configured (e.g.) varnish proxies and set up CARP on the publicly accessible interface manually. No sweat.

CARP state is negotiated for each interface directly on that interface.

The HA sync synchronises firewall state and configuration but not CARP.

So, the sync interface aside:

- configure a static IP address on both nodes on all interfaces where you want CARP
- configure a CARP VIP on that interface with /32 netmask on the master and sync that configuration to the backup
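
The two steps above can be sanity-checked on paper before touching the GUI. Below is a small Python sketch, using only the standard `ipaddress` module, that encodes the advice from this post (static interface addresses in one subnet, VIP with a /32 mask). The function name `check_carp_plan` and the 203.0.113.x documentation addresses are placeholders for illustration, not values from this thread, and the checks reflect the recommendation given here rather than any official rule.

```python
import ipaddress


def check_carp_plan(node_addrs, vip):
    """Return a list of problems found in a CARP addressing plan.

    node_addrs: static interface addresses with prefix, one per node
    vip:        the CARP virtual IP with prefix
    """
    problems = []
    ifaces = [ipaddress.ip_interface(a) for a in node_addrs]
    vip_if = ipaddress.ip_interface(vip)

    # Both nodes need a static address in the same subnet.
    if len({i.network for i in ifaces}) != 1:
        problems.append("nodes are not in the same subnet")
    # Advice from this thread: /32 on the VIP itself.
    if vip_if.network.prefixlen != 32:
        problems.append("VIP does not use a /32 mask")
    # The VIP address should still lie inside the interface subnet.
    if any(vip_if.ip not in i.network for i in ifaces):
        problems.append("VIP lies outside the interface subnet")
    # The VIP must not reuse a node's own address.
    if any(vip_if.ip == i.ip for i in ifaces):
        problems.append("VIP collides with a node's static address")
    return problems


# Placeholder plan: two nodes in 203.0.113.8/29, VIP as /32.
good = check_carp_plan(["203.0.113.9/29", "203.0.113.10/29"], "203.0.113.11/32")
bad = check_carp_plan(["203.0.113.9/29", "203.0.113.10/29"], "203.0.113.11/29")
print(good)
print(bad)
```

An empty list means the plan matches the advice; the second call flags the /29 mask on the VIP.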

HTH,
Patrick

But how can master/backup work for only the LAN in this case? Surely the primary VIP (CARP) on the WAN isn't supposed to be active on both devices at the same time?

Because the WAN is in some way configured wrong. Simple as that ;)

Every interface is negotiated individually with CARP.
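
To make the "negotiated individually" point concrete, here is a toy Python model of the backup node. It is a deliberately simplified sketch, not the real protocol: the class `BackupVhid`, the `simulate` helper and the timeout formula (3 * advbase + advskew / 256 seconds, which is my understanding of FreeBSD's behaviour) are assumptions for illustration. It only shows that an interface where the master's advertisements never arrive ends up MASTER on the backup, regardless of what happens on the other interfaces.

```python
from dataclasses import dataclass


@dataclass
class BackupVhid:
    """Toy model of one CARP VHID on the backup node (not the real protocol)."""
    advbase: int = 1          # advertisement base interval in seconds
    advskew: int = 100        # skew; a higher value makes a node less preferred
    last_advert: float = 0.0  # time the last advertisement was heard
    state: str = "BACKUP"

    def master_down_timeout(self) -> float:
        # Assumed formula: 3 * advbase + advskew / 256 seconds.
        return 3 * self.advbase + self.advskew / 256

    def on_advert(self, now: float) -> None:
        # Simplification: any advertisement heard keeps this node in BACKUP.
        self.last_advert = now
        self.state = "BACKUP"

    def tick(self, now: float) -> None:
        # No advertisement within the timeout: take over as MASTER.
        if now - self.last_advert > self.master_down_timeout():
            self.state = "MASTER"


def simulate(adverts_arrive: dict, seconds: int = 10) -> dict:
    """Return the backup node's final state per interface."""
    vhids = {iface: BackupVhid() for iface in adverts_arrive}
    for now in range(1, seconds + 1):
        for iface, vhid in vhids.items():
            if adverts_arrive[iface]:
                vhid.on_advert(float(now))
            vhid.tick(float(now))
    return {iface: v.state for iface, v in vhids.items()}


# Advertisements reach the backup on LAN but not on WAN.
result = simulate({"LAN": True, "WAN": False})
print(result)
```

In this model LAN stays BACKUP while WAN flips to MASTER after the timeout, which is the same pattern as described in this thread.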

Yeah, I assume so too :) But what... It looks so simple. I can at least ping across the pfsync interface between both fw, in each direction, from the shell. So there isn't a sync issue there.

Could it have something to do with NAT, or where can I find some logs to help me?

I'm trying to follow the video here: https://www.youtube.com/watch?v=I5n3QXOlxmw&t=643s

Can you ping from WAN to WAN? Do both have a static IP address as the interface address?

pfsync/HA has absolutely no say in CARP.

November 29, 2024, 07:30:05 PM #12 Last Edit: November 29, 2024, 07:48:23 PM by firewallfun
That's a negative. From an SSH shell I cannot ping from either the master or the backup to the other one (if that counts as WAN-to-WAN). Each can only ping its own WAN interface and other IPs in the /29.

I have a public static transport network (a /29 net) - from my ISP that I use on the wan side.

My ISP provided me with 2 fibers and instructed me to use .89/29 on the master and .90/29 on the backup,
and to have .88/29 as the HA/CARP address, which is what I have on my CARP WAN IP.

So each unit has its own WAN connection directly from my ISP (they provide me with .85 as my GW) and I have configured each unit's WAN interface accordingly, with .85 as the GW. I can ping the GW from both of them.

I have plugged in a laptop with IP .86 (I unplugged my ISP's fiber from each fw and plugged in my laptop as if it were my ISP) and from it I can ping the WAN IP of each fw. So both WAN interfaces respond correctly, at least when directly connected. But I can't ping from the shell of one unit to the other. I can, however, ping the CARP WAN IP (.88) and the gw (.85) from both WANs (from the shell).

To give even more details you don't need, they also provide me with a /24 that I use on my LAN side (also public static IP addresses, as all my servers are web servers that are meant to be public/on a public IP, so I basically get a kind of transparent-ish fw with my setup).

I can ping my static public LAN IPs from outside of my WAN, so traffic is going WAN->LAN perfectly. I can ping the LAN interfaces .1, .2 and .3 on both fw from the shell. In both directions.

Ping from box to box on WAN might not work because by default there is no rule in place that allows that. I forgot that I have floating rules that unconditionally allow ICMP echo on all of my firewalls. No point in blocking "ping".

Your static setup on WAN looks good and if both boxes can ping the ISP gateway - great.

See my screenshot for how to configure the CARP VIP. Use a /32 netmask for that, but keep the /29 for the interface addresses on both boxes. Also use multicast for the CARP sync as shown in my screenshot.

I suspect the automatic firewall rules for CARP do not allow unicast sync by default, but I did not look that deep for now.

HTH,
Patrick

P.S. same configuration (/32 netmask, multicast) on LAN! Just plain works.

Under the automatic rules on the WAN interface, this is listed:
IPv4 CARP   *   *   224.0.0.18   *   *   *   *   CARP defaults
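
As a rough illustration of why that rule matters for the unicast question, here is a minimal Python sketch that checks whether a CARP advertisement sent to a given destination would match the automatic rule listed above. It only models the destination column of that one rule; the function name and the 203.0.113.10 peer address are placeholders, and whether some other rule would let unicast CARP through is not established in this thread.

```python
import ipaddress

# Destination of the automatic "CARP defaults" rule listed above.
CARP_MULTICAST = ipaddress.ip_address("224.0.0.18")


def matched_by_default_rule(advert_destination: str) -> bool:
    """True if an advertisement to this destination matches the rule above."""
    return ipaddress.ip_address(advert_destination) == CARP_MULTICAST


# Multicast mode: advertisements go to 224.0.0.18.
multicast_ok = matched_by_default_rule("224.0.0.18")
# Unicast mode: advertisements go to the configured peer (placeholder address).
unicast_ok = matched_by_default_rule("203.0.113.10")
print(multicast_ok, unicast_ok)
```

With a peer IP set, the advertisement destination is no longer 224.0.0.18, so this particular rule would not cover it.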