CARP on WAN behaving weirdly...

Started by ghosterius, November 24, 2024, 11:04:00 PM

Then leave your CARP VIP configuration at the defaults. How is it supposed to work if you change the addresses without adding rules?

Also, if you synchronise the VIP configuration from master to backup, the backup ends up with the identical peer IP address, which can only work as long as that address is multicast.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

November 29, 2024, 09:00:34 PM #16 Last Edit: November 29, 2024, 09:05:15 PM by firewallfun
I changed to multicast at the beginning of this thread a long time ago, so that's what I'm using today :)

I have deleted all my VIPs on both machines and rebooted both machines. Then I configured only the CARP WAN VIP on the master, with a vhid of 10 instead of 1 and a longer timeout. It shows master and green. I synced everything over for this WAN interface, and now I have the CARP VIP working on both of them. My primary says primary and my backup says backup.

However, 20 seconds later, both of them show master. And I get the error message below when I check the log:

2024-11-29T20:49:13   Notice   opnsense   /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "WAN CARP (.188) (10@ix1)" has resumed the state "MASTER" for vhid 10   
2024-11-29T20:49:13   Notice   kernel   <6>carp: 10@ix1: BACKUP -> MASTER (master timed out)

So now they are both active again.. so weird. I know the pfsync interface can communicate with each other, as ha-sync is performing as it should. I can also ping both directions.
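
For anyone tracing the same symptom, state changes like the kernel line quoted above can be pulled out of a log export with a few lines of Python. This is only a sketch: it assumes the kernel lines look exactly like the one shown here, and the function name `parse_carp_transitions` is made up for illustration.

```python
import re

# Matches kernel lines such as:
#   carp: 10@ix1: BACKUP -> MASTER (master timed out)
CARP_TRANSITION = re.compile(
    r"carp: (?P<vhid>\d+)@(?P<ifname>\w+): "
    r"(?P<old>[A-Z]+) -> (?P<new>[A-Z]+)(?: \((?P<reason>[^)]*)\))?"
)

def parse_carp_transitions(lines):
    """Return one dict per CARP state change found in the given log lines."""
    found = []
    for line in lines:
        match = CARP_TRANSITION.search(line)
        if match:
            found.append(match.groupdict())
    return found

sample = [
    "2024-11-29T20:49:13   Notice   kernel   "
    "<6>carp: 10@ix1: BACKUP -> MASTER (master timed out)",
]
for t in parse_carp_transitions(sample):
    print(t["vhid"], t["ifname"], t["old"], "->", t["new"], t["reason"])
```

A backup that logs "master timed out" has stopped hearing the master's advertisements on that interface, which is the thing to chase.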

I tried a floating WAN rule allowing all ping, but that still didn't help with pinging WAN->WAN. I also disabled firewalling totally (pfctl -d). I still can't ping WAN->WAN from the shell. Can there be a hidden setting somewhere ???

This happens while firewalling is totally off with the command above, so it can't be a firewall rule. So it must have something to do with the public IP.

pfsync is not related to CARP. (I'm repeating myself - the state of your HA interface is irrelevant)

How are the WAN interfaces connected to each other and to the ISP router? Any chance there is a switch involved that might not properly support multicast?
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

I just can't understand how they are not related :) Doesn't pfsync (I have called the pfsync interface pfsync, with IPs 192.168.60.2 and 192.168.60.3) communicate which CARP IP should be active?

However: in the data center, I only get two fiber cables straight from an ISP room, and I plug one directly into each of my 2 OPNsense boxes. So I only know what my ISP has told me. They have a Cisco HSRP/VRRP router (or similar) in an HA setup. So their "CARP-IP" is .85 (that is also my gateway, I'm told), but the individual routers then have .86 and .87.

Until now, I have used pfSense with redundant WAN on the same unit. That has worked great. I had to open port 1985 or something like that on WAN so that their two routers can see both lines at all times.

Quote from: firewallfun on November 29, 2024, 09:22:21 PM
I just can't understand how they are not related :) Doesn't pfsync (I have called the pfsync interface pfsync, with IPs 192.168.60.2 and 192.168.60.3) communicate which CARP IP should be active?

No. Two nodes speaking CARP exchange which one takes priority via the CARP protocol on the very interface where CARP is active.

pfsync only synchronises the firewall state, so that when the master crashes and the backup takes over (via CARP), connections are not interrupted by missing firewall state.

pfsync and CARP are orthogonal technologies. You can have - as I wrote as an example - two proxies (not OPNsense) with CARP and no pfsync at all because there is no pf or other firewall involved.

CARP manages a virtual IP between two nodes and that is all.
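
To make the point concrete, here is a toy Python model of the election on a single interface. It is a deliberately simplified sketch, not the real protocol: `Node`, `tick`, `run`, the `skew` field and the "three missed intervals" rule (`DEAD_AFTER`) are all assumptions made up for illustration, and the real timers and preemption rules differ in detail. The only thing it is meant to show is that the backup decides purely on what it hears on the CARP interface itself.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    skew: int               # lower skew wins (stand-in for advskew)
    state: str = "BACKUP"
    silent_ticks: int = 0   # intervals since a better master was heard

DEAD_AFTER = 3  # assumed number of silent intervals before taking over

def tick(nodes, segment_ok):
    """Advance one advertisement interval for all nodes on one interface."""
    # Only masters advertise, and only on the shared segment.
    adverts = [n for n in nodes if n.state == "MASTER"]
    for n in nodes:
        heard = [a for a in adverts if a is not n] if segment_ok else []
        better = [a for a in heard if a.skew < n.skew]
        if better:
            n.state, n.silent_ticks = "BACKUP", 0
        elif n.state == "BACKUP":
            n.silent_ticks += 1
            if n.silent_ticks >= DEAD_AFTER:
                n.state = "MASTER"   # the "master timed out" case

def run(segment_ok, ticks=10):
    nodes = [Node("fw1", skew=0), Node("fw2", skew=100)]
    for _ in range(ticks):
        tick(nodes, segment_ok)
    return {n.name: n.state for n in nodes}

print(run(segment_ok=True))   # shared segment: fw1 MASTER, fw2 BACKUP
print(run(segment_ok=False))  # no shared segment: both end up MASTER
```

With `segment_ok=False` the two nodes never hear each other and both claim MASTER, no matter how healthy any other link between them is.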

Quote from: firewallfun on November 29, 2024, 09:22:21 PM
However: in the data center, I only get two fiber cables straight from an ISP room, and I plug one directly into each of my 2 OPNsense boxes. So I only know what my ISP has told me. They have a Cisco HSRP/VRRP router (or similar) in an HA setup. So their "CARP-IP" is .85 (that is also my gateway, I'm told), but the individual routers then have .86 and .87.

Until now, I have used pfSense with redundant WAN on the same unit. That has worked great. I had to open port 1985 or something like that on WAN so that their two routers can see both lines at all times.

Ask your ISP if on the other side of these two links there is a switch that allows the two OPNsense boxes to communicate with each other or if you are supposed to provide your own.

You need a flat network with

- your ISP default gateway
- both your OPNsense boxes' WAN

so CARP can work.

Again: CARP is a local protocol that manages failover of IP addresses and nothing else. Two nodes in a cluster run CARP on each interface - separately and independently of all other interfaces.

pfsync manages an entirely different part of what makes up a HA firewall cluster.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

November 29, 2024, 09:55:04 PM #20 Last Edit: November 29, 2024, 09:59:18 PM by firewallfun
Ah, ok. I think I got it now. pfsync basically shouldn't have anything to do with this problem at all, it is not the problem here for sure :) Since synchronisation of data works. It is only the CARP/VIP that's the problem.

They have previously said I could use a single unmanaged switch and connect both lines. As long as both their routers can "see each other" through my switch, there is redundant internet and they will choose what line they send data over automatically. I suspect that can be the error as well, it might not be "flat" as it is configured now :)   

Here is what they said before:

"These are Layer 3 router ports on our end, so this setup will not create a loop.

Our routers will broadcast HSRP packets to each other via your switch and set up a virtual IP, preferably on Connection 52, with failover to Connection 53 if Connection 52 goes down. Outgoing traffic will then use Connection 52 as long as it is active, while incoming traffic will be distributed across both, depending on where the traffic originates (shortest route)."

Yes, you need a flat network connecting both their uplinks and both your WAN interfaces.

If you want to avoid that single point of failure, you need two switches of a kind that supports "stacking" i.e. acting as if it were a single one. There are various options from different vendors.

HTH,
Patrick
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

I use that on the LAN-side actually, that's what I have lagg against.

But I would prefer to avoid having two additional switches just for my two lines. Stacking switches cost like 3000 USD per unit for rack-mounted ones with dual PSU. But let's see what my ISP says over the weekend, maybe they have some way to provide me this in a more flat way so I can just have these two firewalls and save power, rack space and cabling. I suspect it is just a config change at their end.

Nope, if they have two redundant boxes with CARP or VRRP or HSRP, and you want to do the same, you need a flat intermediate network. Which can be one switch or two.

Alternatively you can of course use routing, e.g. BGP. But that's an entirely different setup.

Why not use your two stacking switches you already bought for LAN and create another VLAN over four ports (two on each) to connect your ISP systems and your WAN interfaces? That's the beauty of modern datacentre/enterprise gear: you do not need a physical box for each job.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

I know they already do BGP for me. They asked me if I wanted to set it up myself or if they should take care of it. I have a /24 that I bought, and they route it for me somehow. But maybe that's in a different context than the one you are talking about, not sure. It's Greek to me :)

I also found this in my email, when I asked them for some details (anonymized the IP using chat gpt):

"Link network: 203.0.120.184/29
We use 203.0.120.185 (HSRP), 203.0.120.186 (e01), and 203.0.120.187 (e02).
You should use 203.0.120.188 (HSRP/VRRP or equivalent), 203.0.120.189, and 203.0.120.190.

We route 198.51.101.0/24 behind 203.0.120.188 with tracking of interface line protocol, meaning the route will only be active internally and in BGP if the port to you is up. The network will therefore not be visible on the internet until you have connected the links."
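
A quick sanity check of that link network with Python's standard `ipaddress` module. The addresses are the anonymized ones from the email. The role labels are my reading of it, and which firewall gets .189 and which gets .190 is an assumption.

```python
import ipaddress

link = ipaddress.ip_network("203.0.120.184/29")

# Role labels are an interpretation of the ISP email;
# the .189/.190 assignment to firewall 1/2 is assumed.
addresses = {
    "ISP HSRP virtual IP": "203.0.120.185",
    "ISP router e01": "203.0.120.186",
    "ISP router e02": "203.0.120.187",
    "CARP VIP on WAN": "203.0.120.188",
    "firewall 1 WAN": "203.0.120.189",
    "firewall 2 WAN": "203.0.120.190",
}

usable = list(link.hosts())
print(f"{link}: {len(usable)} usable hosts, {usable[0]} to {usable[-1]}")

for role, addr in addresses.items():
    inside = ipaddress.ip_address(addr) in link
    print(f"{addr:<15} {role:<22} in link network: {inside}")

all_on_link = all(ipaddress.ip_address(a) in link for a in addresses.values())
print("all six on the same subnet:", all_on_link)
```

All six usable addresses of the /29 are taken, three on the ISP side and three on mine, and they all sit in one subnet.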

They are doing an equivalent of CARP so both your WAN interfaces and both their links must share a single network. Single switch to get it up and running, then consider redundant variants.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Ah, I see. Is it as simple as this if I choose to use the VLAN-method on my stacked LAN-switches? From Chat GPT  ;D

Ports for ISP Lines:

Connect one ISP line to port 24 on Switch 1.
Connect the second ISP line to port 24 on Switch 2.

Ports for Firewall WAN Interfaces:

Connect port 25 on Switch 1 to Firewall 1's WAN interface.
Connect port 25 on Switch 2 to Firewall 2's WAN interface.

VLAN Configuration:

Create a dedicated VLAN for your WAN traffic (e.g., VLAN 10).
Assign ports 24 and 25 on both switches to VLAN 10.
Configure these ports as untagged (access mode) for VLAN 10 since ISP lines typically do not tag traffic.

Stacked Switch Behavior:

Ensure the switches are correctly stacked and function as a single logical unit, so VLAN 10 spans both switches seamlessly.

In this case, I can basically run the setup and IPs I already have.

Yes, exactly like this.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

November 30, 2024, 07:56:03 PM #29 Last Edit: November 30, 2024, 08:08:52 PM by firewallfun
Ignore the sync-thing here below, you have explained to me that it doesn't involve the VIP, so it hasn't anything to do with the WAN-network as such. But here is what ISP said:

"CARP works such that you have two IP interfaces that continuously communicate with each other to check if the other side is present, so there must be a physical Layer 2 Ethernet network between the two boxes. I don't know what the sync link is and what it's used for, but if it's a regular Ethernet connection between the boxes and the boxes communicate Ethernet over it, it should work with that cable. "

They recommend two unmanaged switches to create that flat network, as you say, since they have routers on their side that sense which line is active etc. But they are not familiar with OPNsense and how it works. Isn't there any way for both boxes to see which one is active? Like if I create a direct connection on another port and bridge the two firewalls and the ports... maybe not possible, just wanted a last try to rescue me from the switch solution :) I think I will just go for two new switches on WAN.

If I were a programmer, I would think it was easy to constantly ping my CARP IP. If it is not active (no ping reply), then make the backup firewall primary and make the CARP IP active on it, until it hears from the master via the sync interface or a separate line, then deactivate it. Why is it so hard to do :) Why not communicate signals like this on a separate cable, or just use the sync interface, as it already carries constant internal traffic? I don't get it. I wouldn't even mind if it took a minute instead of seconds, in case it needed to be sure the master is really down.