High Availability pfsync and DNS issues

Started by erje, April 15, 2020, 06:36:56 PM

Hello community,

For several weeks now, I have been trying to set up a fully working HA configuration with two APU4d boards. I have reached a point where I no longer know what to look for.

This is what I am trying to set up:
<Image 1: schema>


What is working:

- The configuration is synced from the Master to the Backup node. This worked automatically with OPNsense 19.x, but since I upgraded to v20.1.4 it seems I have to force the sync manually. Or am I not patient enough?
- State sync is working. When I pull the Master LAN, the Backup LAN becomes Master. Same thing when I pull the WAN.


What is not working:

- When I pull the Master WAN, the internet connection is lost. Only when I also pull the Master LAN does the internet work again. I am guessing I am missing a firewall rule for PFSYNC?

<Image 2: Firewall rules PFSYNC>

- I don't get any DNS lookups unless I change the DNS server in [Services]-[DHCPv4]-[LAN]. As I understand it, I should enter the LAN VIP there? When I do, nothing gets resolved. When I enter Google DNS (8.8.8.8), it works.

<Image 3: DHCP settings>

While trying several configuration changes, I occasionally thought I had it working until it stopped working again. I think caches, existing connections or something else tricked me. Is there anything I should reset or flush after making (DNS) changes, other than requesting a new DHCP lease?
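
One way to take client-side caches out of the picture might be to send a DNS query straight to each server and compare the results. Below is a minimal sketch in Python using only the standard library. It is not an OPNsense tool: the function names `build_query`, `reply_ok` and `answers`, the test hostname and the 2 second timeout are all my own assumptions for illustration.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary ID, used to match a reply to our query


def build_query(hostname: str, query_id: int = QUERY_ID) -> bytes:
    """Build a minimal DNS query for the A record of hostname."""
    # Header: ID, flags (recursion desired), 1 question, no other records
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE = 1 (A record), QCLASS = 1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def reply_ok(reply: bytes, query_id: int = QUERY_ID) -> bool:
    """True if the reply matches our ID, has RCODE 0 and at least one answer."""
    if len(reply) < 12:
        return False
    reply_id, flags, _, ancount, _, _ = struct.unpack("!HHHHHH", reply[:12])
    return reply_id == query_id and (flags & 0x000F) == 0 and ancount > 0


def answers(server: str, hostname: str = "opnsense.org",
            port: int = 53, timeout: float = 2.0) -> bool:
    """Send one UDP query directly to server and report whether it resolved."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(hostname), (server, port))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable hosts
            return False
    return reply_ok(reply)
```

Calling `answers("8.8.8.8")` and then `answers(...)` with the LAN VIP from a LAN client should show whether the VIP itself answers on port 53, independent of whatever the client has cached. A `False` for the VIP only says that no usable answer came back within the timeout; it does not say whether Unbound, a firewall rule or something else is the cause.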

I also noticed that the Unbound enable switch is not synchronized between Master and Backup. Is this correct behavior?

I am not 100% sure about the NAT outbound settings. I included a picture of my settings too.
<Image 4: Firewall NAT Outbound>

Any push in the right direction would be very much appreciated!

Thanks,
Robbert

Quote: "The configuration is synced from the Master to the Backup node. This worked automatically with OPNsense 19.x, but since I upgraded to v20.1.4 it seems I have to force the sync manually. Or am I not patient enough?"

Unfortunately, too many users ran OPNsense on low-end hardware, and their GUI became unresponsive during these automatic syncs.
Instead of making it optional or hardware-dependent, this enterprise-like feature was removed in version 20.x.

Everybody now has to take care of manual syncs himself to make sure the slave is synced  :(

I hope somebody will fork it or write a plug-in to bring this feature back.

@hbc, thanks for clearing that up, and sorry for the late response. I've had to park this project in the fridge for a while, though it doesn't look like I've missed a lot on this topic :-)

I'm not too sure whether the HA feature is a regularly used option in OPNsense. There is very little info I can find, and so far I've burned many hours on trial and error.