OPNsense Forum

English Forums => High availability => Topic started by: petersen on March 18, 2024, 03:12:29 PM

Title: Problems when enabling "Synchronize States"
Post by: petersen on March 18, 2024, 03:12:29 PM
Hello,

We would like to use OPNsense with High Availability, but keep running into the following problem during setup.

We are using two identical hardware systems with OPNsense version 24.1.3_1.

The following sources were used as instructions:
- https://docs.opnsense.org/manual/how-tos/carp.html
- https://www.thomas-krenn.com/en/wiki/OPNsense_HA_Cluster_configuration (it's a German website)
- https://www.youtube.com/watch?v=I5n3QXOlxmw

Up to the step "Setup pfSync and HA sync (xmlrpc)" everything works without any problems.

The firewalls communicate with each other.
I can send a ping to 1.1.1.1 and get a response.
I can switch off one firewall and the other firewall takes over immediately.
Everything works as it should.

However, as soon as I check the "Synchronize States" checkbox under "System > High Availability > Settings", it no longer works.
Under "System > High Availability > Status" I get the message "The backup firewall is not accessible or not configured" after waiting a while.
The ping to 1.1.1.1 is lost if the master firewall is not available.

As soon as I remove the tick from the "Synchronize States" checkbox, it works again without any problems.
Firewall 2 takes over if Firewall 1 is not available and vice versa.


I have configured the corresponding interfaces on both firewalls.
I have created an "Allow all" rule on the sync interface on both firewalls, as well as a rule for the CARP protocol on the WAN and LAN interfaces.
I have created the corresponding VIPs on both firewalls.
I have created NAT on both firewalls.


Which settings am I overlooking?

Thank you for any help! If any further information is needed, I will try to provide it.
Title: Re: Problems when enabling "Synchronize States"
Post by: Patrick M. Hausen on March 18, 2024, 03:28:22 PM
Use tcpdump to trace the packets on the HA link between both systems. That should give you a hint about what exactly is failing.

Anything special about the HA link? Is it a dedicated interface? Is it just a patch cable or is a switch involved? Are you using the default multicast address for pfsync?
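
A minimal sketch of such a capture, in case it helps. It assumes pfsync traffic can be matched by its IP protocol number (249 on FreeBSD); the interface name igb2 is only a placeholder and not taken from this thread, so substitute your own sync interface.

```shell
#!/bin/sh
# Placeholder: set SYNC_IF to your dedicated sync interface (igb2 is hypothetical).
SYNC_IF="${SYNC_IF:-igb2}"

# pfsync travels as IP protocol 249, so filter on the protocol number.
pfsync_filter() {
    printf 'ip proto 249'
}

# Print the pfsync packets seen on the sync interface (run as root).
capture_pfsync() {
    tcpdump -n -i "$SYNC_IF" "$(pfsync_filter)"
}

# Uncomment to start capturing:
# capture_pfsync
```

If state synchronization is working, you would expect to see pfsync packets from both firewalls on this link. Packets from only one side, or none at all, would point to the interface, the peer address, or the firewall rules.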
Title: Re: Problems when enabling "Synchronize States"
Post by: petersen on March 21, 2024, 10:57:28 AM
Hi Patrick,

Thank you for your answer.

The HA link is on a dedicated interface with a cable directly between the two firewalls. I am using the address of the other firewall for pfsync.

Unfortunately I'm still learning how to use and interpret tcpdump, but maybe you can help?
I have attached a packet capture from the master firewall, starting at the moment I enable "Synchronize States".
Title: Re: Problems when enabling "Synchronize States"
Post by: mimugmail on March 21, 2024, 11:03:01 AM
A screenshot of the sync state section on both firewalls would be great. Also the log from when you apply the setting.
Title: Re: Problems when enabling "Synchronize States"
Post by: petersen on March 21, 2024, 11:23:42 AM
Hi mimugmail,

I have attached 3 files: two screenshots of the sync state section and one of the log when applying.

Or do you want a different log? If so, please specify which log you want to see.

Thank you for your help!
Title: Re: Problems when enabling "Synchronize States"
Post by: mimugmail on March 21, 2024, 11:31:04 AM
With the log level set to "Informational" instead of "Notice", is there nothing more on master and backup?
Also, a "dmesg -a" via CLI from both systems would be good.
Title: Re: Problems when enabling "Synchronize States"
Post by: petersen on March 21, 2024, 11:41:49 AM
There is nothing more. There is nothing in the informational log.

The result of the "dmesg -a" is in the attached screenshot.
Title: Re: Problems when enabling "Synchronize States"
Post by: mimugmail on March 21, 2024, 08:52:06 PM
Can you try it without lagg?
Title: Re: Problems when enabling "Synchronize States"
Post by: petersen on March 22, 2024, 11:05:43 AM
Hi mimugmail,

I have removed the lagg and now use a single direct cable connection between the two firewalls, but I still have the same problem :(
Title: Re: Problems when enabling "Synchronize States"
Post by: anomaly0617 on March 25, 2024, 01:26:18 AM
Might be a long shot, but under Firewall >> Rules >> [Interface], do you have a rule to allow "IPv4 CARP" traffic from the "[Interface] net" to any port, any destination, any gateway, on any schedule?

If so, what happens if you create that same rule but under Firewall >> Rules >> Floating and check it for all interfaces other than your WAN interface (because you really don't want CARP traffic from the internet)?

When I saw this, it turned out I wasn't looking from the right perspective at how the firewalls communicate their synchronization status. After I created the floating rule, it all of a sudden worked like a charm. I then deleted it, created individual rules for each interface (the copy/clone button is a godsend), and disabled them one by one until I figured out what was going on.

Hope this helps!
Title: Re: Problems when enabling "Synchronize States"
Post by: petersen on March 25, 2024, 11:22:02 AM
Hi anomaly0617,

I have double- and triple-checked the firewall rules. I allow all IPv4 CARP traffic on all interfaces. On the pfSync interface I have the rule that allows all traffic.

I have these rules on both firewalls, but I still have the same problem :(
Title: Re: Problems when enabling "Synchronize States"
Post by: petersen on April 05, 2024, 12:41:25 PM
Hi,

It looks like I've solved the problem.

Yesterday I tried to set up High Availability on another hardware machine to rule out a hardware problem. After booting OPNsense from a live stick, I was able to set up High Availability and it worked.

The difference: The live stick is running version 23.1.
The other system was running 23.1.3.1.

So I reinstalled OPNsense and set up High Availability, but the same problem occurred again, while it still worked on the system booted from the live stick. So the problem must be with the hard disk?
So I completely wiped the hard disk with an external tool and reinstalled OPNsense. Again the same problem...
But this time I had a working config from the live stick. So I imported it, restarted the system, and now it works?

Why? What am I missing?


At least I can now continue to test the functions and operation of High Availability.

Thanks to everyone who tried to help, and for the input!
Title: Re: Problems when enabling "Synchronize States"
Post by: itngo on November 05, 2024, 12:14:09 PM
Sorry to bring this up again,
we have the same issue here. With state sync enabled on master and slave, we get a "split-brain" after some days. With state sync disabled, the system runs smooth as butter.

We are using unicast VIPs, but this issue existed even before 24.10_7 with multicast.