OPNsense Forum

Archive => 20.1 Legacy Series => Topic started by: Nico on March 09, 2020, 10:48:58 pm

Title: HA sync not automatically anymore
Post by: Nico on March 09, 2020, 10:48:58 pm
Hi,

I'm a little confused. I just set up a new pair of OPNsense firewalls in HA mode as usual, but they no longer seem to sync automatically. I have to sync the state manually; otherwise users, firewall rules, and CARP settings are not synced. Is this intended, or did I miss something?
I have set up dozens of these before, and the configuration this time doesn't really differ from previous setups.

Thanks!
Title: Re: HA sync not automatically anymore
Post by: mimugmail on March 10, 2020, 05:53:57 am
It is intended, to speed up the GUI. A blue notice should pop up on every change.
Title: Re: HA sync not automatically anymore
Post by: hbc on March 10, 2020, 10:06:46 am
Yes, the missing auto-sync is really a step backwards. I forget to sync so often. I pray to God that a failover never occurs in one of these unsynced moments and chaos ensues.

A blue pop-up? There should be a big, big permanent reminder until sync is really done.
Title: Re: HA sync not automatically anymore
Post by: Nico on March 10, 2020, 01:09:32 pm
I've never seen such a popup. I consider this a step backwards as well. Why not make it configurable? Actually, I never saw slow speeds before; syncing was always fast. And I have no idea why most things synced automatically before while others, like the alias lists, had to be synced manually. So at least the "all or nothing" approach makes more sense for maintaining a healthy state, but I'd like to see auto-sync come back.
Title: Re: HA sync not automatically anymore
Post by: mimugmail on March 10, 2020, 02:51:52 pm
No, the UI became super slow when the backup unit was unresponsive.
Title: Re: HA sync not automatically anymore
Post by: mfedv on March 11, 2020, 03:32:34 pm
Hi,

Here is a "me-too" to make this configurable (OK if it defaults to manual sync).

I can imagine that working around the slow GUI issue while still supporting auto-sync would be a rather complex task. BTW, I haven't yet found a way to simply check whether everything has been synced.
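
A minimal sketch of how such an "is everything synced?" check might look, assuming the configuration of both units has been exported as XML and is available locally. The section names used in the example (`filter`, `virtualip`) and the helper names are assumptions for illustration, not an OPNsense API. Only selected top-level sections are compared, because master and slave legitimately differ elsewhere (interfaces, hostnames).

```python
import hashlib
import xml.etree.ElementTree as ET


def _normalize(node):
    """Reduce an element to a whitespace-insensitive, comparable structure."""
    return (
        node.tag,
        sorted(node.attrib.items()),
        (node.text or "").strip(),
        [_normalize(child) for child in node],
    )


def section_digest(root, tag):
    """Return a SHA-256 digest of one top-level section, or None if it is absent."""
    node = root.find(tag)
    if node is None:
        return None
    return hashlib.sha256(repr(_normalize(node)).encode("utf-8")).hexdigest()


def unsynced_sections(primary_xml, backup_xml, tags):
    """List the section tags whose content differs between the two exports."""
    primary = ET.fromstring(primary_xml)
    backup = ET.fromstring(backup_xml)
    return [
        tag
        for tag in tags
        if section_digest(primary, tag) != section_digest(backup, tag)
    ]
```

An empty result would mean the compared sections match; any tag returned points at a section that still needs a manual sync.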

Still, I prefer auto-sync. So far I have not encountered the slow GUI problem, and there are some problems with manual-only sync.

The nudging for a manual sync is not yet complete; e.g. scrub rules don't have it.

It is easy to forget the manual sync if you work on a mix of HA and non-HA installations, and I think that apart from initial setup the workflow for normal filter configuration should not be different between HA and non-HA installations.

One (admittedly rare) corner-case caught by auto-sync:
 - change some filter rules, but do not hit "Apply"
 - reboot master fw
=> changes are applied on master fw after reboot, but not replicated to slave fw, and no more nudging in the GUI

With auto-sync in 19.7.x, the changes were replicated to the slave fw after reboot of the master fw.

Regards
Matthias
Title: Re: HA sync not automatically anymore
Post by: mimugmail on March 11, 2020, 04:39:36 pm

Quote from mfedv: "Still I prefer auto-sync. So far I have not encountered the slow GUI problem, and there are some problems with manual-only sync."


Change the IP of the slave (sync) and add a firewall rule... not slow on your side?
Which problems did you encounter with manual sync besides forgetting to sync?
Title: Re: HA sync not automatically anymore
Post by: mfedv on March 11, 2020, 05:35:18 pm
Note that I do not deny the existence of the "slow GUI" problem; I only said I have not encountered it so far.

The case I described does not happen regularly :-) But your example of changing the slave fw address without first temporarily disabling XMLRPC sync is just asking for trouble.

People may just have different concepts of what a "cluster" is and how it works or how it should work.
E.g. Pacemaker (HA clustering for server applications) only does cluster-wide config commits.
That was also how things worked with OPNsense until 20.x.

Also, this change was not communicated very well. I spent quite some time debugging a new installation that just wouldn't sync unless manually forced. The changelog only says "removed legacy xmlrpc push" (in a release candidate version changelog), and only now do I know that what I took for granted was actually just legacy functionality.

Life will continue without auto-sync, but I would still prefer to have it.
Title: Re: HA sync not automatically anymore
Post by: mimugmail on March 11, 2020, 07:24:32 pm
Every time you add an alias, firewall rule, or VIP, a blue info bar appears notifying you that your config is not yet synced, doesn't it?

I am neither for nor against the removal, but when the core devs decide to do so, chances are high that there is a good reason for it :)
I think the reasons are described in a core issue on GitHub.
Title: Re: HA sync not automatically anymore
Post by: hbc on March 12, 2020, 07:43:34 pm
Make it an option.

This manual sync is really a showstopper. Not everybody runs OPNsense on weak consumer mini PCs. Companies like ours run it on multi-core 19" servers for thousands of users and have never had slow GUI issues.

But the problem that would occur on failover with an unsynced ruleset would be much greater than a slow GUI.
Title: Re: HA sync not automatically anymore
Post by: mimugmail on March 12, 2020, 09:44:43 pm
Here is some old discussion (but please don't write in this ticket):
https://github.com/opnsense/core/issues/3635
Title: Re: HA sync not automatically anymore
Post by: hbc on March 12, 2020, 10:07:42 pm
Is the cron job implemented? Then I could set it to 5 minutes... But an option for auto-sync would still be a better, more enterprise-like solution.
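
A rough sketch of what a periodic sync job could look like, assuming a command that triggers the HA sync is available on the master. `SYNC_COMMAND` is a hypothetical placeholder, not a real OPNsense path, and `run_sync` is an illustrative helper. The timeout is there so that an unreachable backup unit cannot block the job indefinitely.

```python
import subprocess

# Hypothetical placeholder: substitute whatever command triggers the HA sync
# on your installation. This path is NOT a real OPNsense command.
SYNC_COMMAND = ["/path/to/your/ha-sync-command"]


def run_sync(command, timeout_seconds=60):
    """Run the sync command once.

    Returns True on exit code 0, and False on a non-zero exit code,
    a timeout, or a command that cannot be started.
    """
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0
```

Scheduled every five minutes, a script calling `run_sync(SYNC_COMMAND)` would bound the unsynced window to roughly that interval, while a `False` result could be logged or alerted on.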
Title: Re: HA sync not automatically anymore
Post by: AdSchellevis on March 13, 2020, 09:19:44 am
No, it's not, but all required components are nicely laid out in the docs: https://docs.opnsense.org/

Enterprises can also opt for commercial support to reach their goals by the way.....

Automatic sync won't come back, for obvious reasons: it was never complete (and theoretically can't be, given the way it's implemented) and always left you in an uncertain state.
Some services restarted, some did not, there was no way to tell which ones were OK, and there were quite a few locking issues, as described in the ticket (yes, also at large enterprises; it really doesn't matter how big the box is if you can't reach it).

Most large enterprises have procedures for updating equipment including rollback plans, which is where our new approach nicely fits in by the way.

The ways forward are pretty simple: one can contribute and maintain a plugin that keeps track of config changes and takes action when needed, or contribute a pull request to core (as long as it addresses all the concerns mentioned earlier). The first is usually the easiest to start with.
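
As an illustration only, the "keep track of config changes and take action when needed" idea could be sketched as below. This is not an OPNsense plugin and uses no OPNsense API: the state file, the function names, and the `on_change` callback are all hypothetical, and the config file path is passed in by the caller.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def check_for_change(config_path, state_path, on_change):
    """Call on_change(digest) if the config differs from the last recorded digest.

    The new digest is stored only after on_change returns without raising,
    so a failed action is retried on the next check. Returns True when a
    change was detected and handled, False otherwise.
    """
    current = file_digest(config_path)
    state_file = Path(state_path)
    previous = None
    if state_file.exists():
        previous = json.loads(state_file.read_text()).get("digest")
    if current == previous:
        return False
    on_change(current)
    state_file.write_text(json.dumps({"digest": current}))
    return True
```

The callback is where a notification or a sync trigger would go; keeping it separate from the change detection makes it easy to start with a simple reminder and add more later.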

Just keep in mind that the world changes every day; the current sync mechanism might be replaced with something new in the future as well.

Best regards,

Ad