Hi,
while updating and testing our test cluster from version 19.1.6 to 19.1.8, and afterwards our production cluster from 19.1.4 to 19.1.8, we noticed a small problem with switching the CARP roles. After finishing the update of both nodes in each cluster we wanted to verify that the role switching behaves as before (when the MASTER is put into maintenance mode, it becomes BACKUP). Node 1 (MASTER) went into maintenance mode but unfortunately stayed MASTER for the cluster. Deactivating CARP on this node and activating it again didn't help. The only thing that restored normal switching behavior when one of the nodes is put into maintenance mode was changing the skew of the advertising frequency of one VIP on the MASTER node from 0 to 1 and then back from 1 to 0.
Maybe this workaround helps somebody with the same problem.
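For the record, the same skew toggle can also be done from the shell with FreeBSD's ifconfig instead of the GUI (a sketch only; the interface name and VHID below are placeholders for your own setup, visible via "ifconfig | grep carp"):

```shell
# Toggle the advertising skew of one VIP on the MASTER node.
# "igb0" and "vhid 1" are placeholders - substitute your own interface and VHID.

ifconfig igb0 vhid 1 advskew 1    # raise the skew so the peer can preempt
sleep 5                           # give the peer a moment to take over
ifconfig igb0 vhid 1 advskew 0    # restore the original skew
```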
Greetz,
bitmusician
There was indeed a change in 19.1.8 - was it a one-time thing or is it reproducible?
As I wrote, I first noticed the problem on our test cluster (which is not a copy of the production system) and then on our production cluster too. It should be reproducible on any cluster after updating to 19.1.8.
I tested this successfully in dev, maybe you have configured some tunables manually?
Check here the details:
https://github.com/opnsense/core/issues/3163
We didn't make any changes in the tunables and we also did not have packet loss.
Since I did the workaround we don't have the problem anymore.
The likely candidate is actually https://github.com/opnsense/core/commit/c5d6b6cacf but it would indicate you are relying on policy routing for a CARP setup which really shouldn't have it (best to use a dedicated CARP link).
# opnsense-patch c5d6b6cacf
Cheers,
Franco
I had exactly the same symptoms, including that disabling CARP and reenabling did not help. Pfsync bulk was successful and skew got to 0 but it did not become master again.
Instead of the workaround I just rebooted it and it became master again.
Hi,
I can confirm the issue exists. We're experiencing this behaviour on both of our production clusters since upgrading to 19.1.8. I tried setting our secondary to "persistent CARP maintenance mode", which usually makes the primary node master again, but this also failed. I'll reboot the secondary after work to make it slave again.
Cheers,
Wayne
Quote from: Wayne Train on May 27, 2019, 10:50:53 AM
Hi,
I can confirm the issue exists. We're experiencing this behaviour on both of our production clusters since upgrading to 19.1.8. I tried setting our secondary to "persistent CARP maintenance mode", which usually makes the primary node master again, but this also failed. I'll reboot the secondary after work to make it slave again.
Cheers,
Wayne
How many carp IPs do you have and which type?
My test cluster has 2 VIPs, both static (LAN, WAN); I can successfully switch back and forth.
My setup has 3 VIPs (WAN, LAN, DMZ) plus an alias IP on WAN, and has problems switching between the firewalls.
Please run "sysctl -a | grep carp" on machines 1 and 2: before entering mnt mode, while in it, and after leaving it.
On my side it looks good:
root@OPNsense1:~ # sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
root@OPNsense1:~ # sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 240
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
root@OPNsense1:~ # sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
root@OPNsense2:~ # sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
root@OPNsense2:~ # sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
root@OPNsense2:~ # sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
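When comparing dumps like the ones above, the value that matters is net.inet.carp.demotion: it should go 0 -> 240 -> 0 on the node entering and leaving maintenance. A small script can pull that out of pasted output (the parser and helper below are my own sketch, not an OPNsense tool; they only assume the "key: value" format shown above):

```python
# Sketch: parse "sysctl -a | grep carp" output and check the demotion value.

def parse_carp_sysctl(output: str) -> dict:
    """Turn 'key: value' lines into a dict of ints."""
    values = {}
    for line in output.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        try:
            values[key.strip()] = int(value.strip())
        except ValueError:
            continue  # skip non-numeric values
    return values

def is_demoted(values: dict) -> bool:
    # Entering persistent maintenance should raise the demotion by 240.
    return values.get("net.inet.carp.demotion", 0) >= 240

sample = """\
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.demotion: 240
net.inet.carp.preempt: 1
"""
values = parse_carp_sysctl(sample)
print(is_demoted(values))  # True: a demotion of 240 means this node yields mastership
```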
Hi,
we use 11 CARP VIPs, one for each VLAN.
Cheers,
Wayne
Quote from: Wayne Train on May 27, 2019, 04:56:18 PM
Hi,
we use 11 CARP VIPs, one for each VLAN.
Cheers,
Wayne
sysctl output like above from you too, please. Can't track this down without debugging ...
coming from
Before maintenance:
#1
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
#2
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
#1 in persistent maintenance
#1
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 240
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
#2
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
#1 left persistent maintenance
#1
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
#2
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
https://forum.opnsense.org/index.php?topic=12943.0
@mimugmail as I interpret you're looking at the *primary*
net.inet.carp.demotion: 240
so it should be the same on my side as on yours. Did you check whether Master/Backup was correctly displayed in the web UI?
- currently it looks like the secondary is always master, not backup as in 19.1.7
- tried to edit the Virtual IP setting and re-save it
- HAVE TO CONFIRM THAT: both become master on this IP regardless of the advertising frequency settings
So, you left mnt mode, both were master, and you had the sysctl output from above? Strange. Then go to the CLI, do a "clog -f /var/log/system.log" and post the new lines when leaving mnt mode.
Quote from: katamadone [CH] on July 04, 2019, 10:42:35 AM
- currently it looks like the secondary is always master, not backup as in 19.1.7
- tried to edit the Virtual IP setting and re-save it
- HAVE TO CONFIRM THAT: both become master on this IP regardless of the advertising frequency settings
We are facing the same issue right now. In the GUI it looks like both nodes are master after setting the first node into persistent CARP maintenance mode, but when we reboot this node the other "master" (the actual backup node) doesn't answer the requests to the VIPs.
Is there already a solution?
You can help too:
https://forum.opnsense.org/index.php?topic=12832.msg61681#msg61681
Quote from: mimugmail on July 06, 2019, 09:54:02 AM
So, you left mnt mode, both were master, and you had the sysctl output from above? Strange. Then go to the CLI, do a "clog -f /var/log/system.log" and post the new lines when leaving mnt mode.
node01 (normally the MASTER):
when switching into maintenance mode:
Jul 10 12:53:30 node01 kernel: carp: demoted by 240 to 240 (sysctl)
when switching out of maintenance mode:
Jul 10 12:54:42 node01 kernel: carp: demoted by -240 to 0 (sysctl)
-----------------------------------------------------
node02 (normally the BACKUP):
when switching into maintenance mode:
Jul 10 12:53:30 node02 kernel: carp: 31@igb0: BACKUP -> MASTER (preempting a slower master)
Jul 10 12:53:31 node02 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "xxx.xxx.xxx.xxx - VIP WAN (31@igb0)" has resumed the state "MASTER" for vhid 31
Jul 10 12:53:31 node02 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Starting OpenVPN server instance on xxx.xxx.xxx.xxx - VIP WAN because of transition to CARP master.
Jul 10 12:53:31 node02 kernel: ovpns1: link state changed to DOWN
Jul 10 12:53:35 node02 kernel: ovpns1: link state changed to UP
Jul 10 12:53:35 node02 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: OpenVPN server 1 instance started on PID 34541.
Jul 10 12:53:36 node02 opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 'ovpns1'
Jul 10 12:53:36 node02 opnsense: /usr/local/etc/rc.newwanip: Interface '' is disabled or empty, nothing to do.
when switching out of maintenance mode:
Jul 10 12:54:41 node02 kernel: carp: 31@igb0: MASTER -> BACKUP (more frequent advertisement received)
Jul 10 12:54:41 node02 kernel: ifa_maintain_loopback_route: deletion failed for interface igb0: 3
Jul 10 12:54:42 node02 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "xxx.xxx.xxx.xxx - VIP WAN (31@igb0)" has resumed the state "BACKUP" for vhid 31
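Sifting the state transitions out of a longer system.log by eye gets tedious; a few lines of scripting can extract them (the regex and function below are my own sketch matching the kernel log format shown above, not an OPNsense utility):

```python
import re

# Sketch: extract CARP state transitions from system.log lines like
# "... kernel: carp: 31@igb0: BACKUP -> MASTER (preempting a slower master)".

TRANSITION = re.compile(
    r"carp: (?P<vhid>\d+)@(?P<ifname>\S+): "
    r"(?P<old>\w+) -> (?P<new>\w+) \((?P<reason>[^)]*)\)"
)

def carp_transitions(log_text: str) -> list:
    """Return one dict per transition: vhid, ifname, old, new, reason."""
    return [m.groupdict() for m in TRANSITION.finditer(log_text)]

log = ("Jul 10 12:53:30 node02 kernel: carp: 31@igb0: "
       "BACKUP -> MASTER (preempting a slower master)")
for t in carp_transitions(log):
    print(t["vhid"], t["ifname"], t["old"], "->", t["new"], "|", t["reason"])
```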
I had to revert, couldn't leave that in production. Sorry.
I'll have to start over soon and try again.
Quote from: katamadone [CH] on July 11, 2019, 03:48:15 PM
I had to revert, couldn't leave that in production. Sorry.
I'll have to start over soon and try again.
Did you have the problem already in 19.1.8 or only after updating to 19.1.10?
Quote from: bitmusician on July 10, 2019, 03:46:15 PM
Quote from: mimugmail on July 06, 2019, 09:54:02 AM
So, you left mnt mode, both were master, and you had the sysctl output from above? Strange. Then go to the CLI, do a "clog -f /var/log/system.log" and post the new lines when leaving mnt mode.
node01 (normally the MASTER):
when switching into maintenance mode:
Jul 10 12:53:30 node01 kernel: carp: demoted by 240 to 240 (sysctl)
when switching out of maintenance mode:
Jul 10 12:54:42 node01 kernel: carp: demoted by -240 to 0 (sysctl)
-----------------------------------------------------
node02 (normally the BACKUP):
when switching into maintenance mode:
Jul 10 12:53:30 node02 kernel: carp: 31@igb0: BACKUP -> MASTER (preempting a slower master)
Jul 10 12:53:31 node02 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "xxx.xxx.xxx.xxx - VIP WAN (31@igb0)" has resumed the state "MASTER" for vhid 31
Jul 10 12:53:31 node02 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Starting OpenVPN server instance on xxx.xxx.xxx.xxx - VIP WAN because of transition to CARP master.
Jul 10 12:53:31 node02 kernel: ovpns1: link state changed to DOWN
Jul 10 12:53:35 node02 kernel: ovpns1: link state changed to UP
Jul 10 12:53:35 node02 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: OpenVPN server 1 instance started on PID 34541.
Jul 10 12:53:36 node02 opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 'ovpns1'
Jul 10 12:53:36 node02 opnsense: /usr/local/etc/rc.newwanip: Interface '' is disabled or empty, nothing to do.
when switching out of maintenance mode:
Jul 10 12:54:41 node02 kernel: carp: 31@igb0: MASTER -> BACKUP (more frequent advertisement received)
Jul 10 12:54:41 node02 kernel: ifa_maintain_loopback_route: deletion failed for interface igb0: 3
Jul 10 12:54:42 node02 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "xxx.xxx.xxx.xxx - VIP WAN (31@igb0)" has resumed the state "BACKUP" for vhid 31
I don't get it ... from reading the logs after switching off mnt mode second machine should be backup???
Quote from: mimugmail on July 12, 2019, 07:22:23 AM
I don't get it ... from reading the logs after switching off mnt mode second machine should be backup???
Yes, when switching it off it's backup again. But when I turn on maintenance mode on the first node, in the GUI both are shown as master and nobody answers the requests to the VIP.
Quote from: bitmusician on July 12, 2019, 08:13:41 AM
Quote from: mimugmail on July 12, 2019, 07:22:23 AM
I don't get it ... from reading the logs after switching off mnt mode second machine should be backup???
Yes, when switching it off it's backup again. But when I turn on maintenance mode on the first node, in the GUI both are shown as master and nobody answers the requests to the VIP.
And when you revert to 19.1.7 it's working again?
Quote from: bitmusician on July 12, 2019, 07:10:27 AM
Quote from: katamadone [CH] on July 11, 2019, 03:48:15 PM
I had to revert, couldn't leave that in production. Sorry.
I'll have to start over soon and try again.
Did you have the problem already in 19.1.8 or only after updating to 19.1.10?
had the problem already from 19.1.7 -> 19.1.8
I'm inclined to try it again. I'm not so happy that I'm still on OPNsense 19.1.7.
@mimugmail do you have some tips on what I should check?
#1
******************************************************
NOMAINTENANCE PRIMARY
******************************************************
sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
******************************************************
NOMAINTENANCE SECONDARY AFTER BOOT
******************************************************
sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
#2
******************************************************
MAINTENANCE PRIMARY
******************************************************
<6>carp: 1@vmx2_vlan605: MASTER -> INIT (hardware interface up)
<6>carp: 1@vmx2_vlan605: INIT -> BACKUP (initialization complete)
<6>carp: 2@vmx2_vlan621: MASTER -> INIT (hardware interface up)
<6>carp: 2@vmx2_vlan621: INIT -> BACKUP (initialization complete)
<6>carp: 3@vmx2_vlan622: MASTER -> INIT (hardware interface up)
<6>carp: 3@vmx2_vlan622: INIT -> BACKUP (initialization complete)
<6>carp: 4@vmx2_vlan623: MASTER -> INIT (hardware interface up)
<6>carp: 4@vmx2_vlan623: INIT -> BACKUP (initialization complete)
<6>carp: 5@vmx2_vlan624: MASTER -> INIT (hardware interface up)
<6>carp: 5@vmx2_vlan624: INIT -> BACKUP (initialization complete)
<6>carp: 6@vmx2_vlan625: MASTER -> INIT (hardware interface up)
<6>carp: 6@vmx2_vlan625: INIT -> BACKUP (initialization complete)
<6>carp: 7@vmx2_vlan626: MASTER -> INIT (hardware interface up)
<6>carp: 7@vmx2_vlan626: INIT -> BACKUP (initialization complete)
<6>carp: 8@vmx2_vlan627: MASTER -> INIT (hardware interface up)
<6>carp: 8@vmx2_vlan627: INIT -> BACKUP (initialization complete)
<6>carp: 9@vmx2_vlan628: MASTER -> INIT (hardware interface up)
<6>carp: 9@vmx2_vlan628: INIT -> BACKUP (initialization complete)
<6>carp: 11@vmx2_vlan630: MASTER -> INIT (hardware interface up)
<6>carp: 11@vmx2_vlan630: INIT -> BACKUP (initialization complete)
<6>carp: 13@vmx2_vlan606: MASTER -> INIT (hardware interface up)
<6>carp: 13@vmx2_vlan606: INIT -> BACKUP (initialization complete)
<6>carp: 14@vmx2_vlan611: MASTER -> INIT (hardware interface up)
<6>carp: 14@vmx2_vlan611: INIT -> BACKUP (initialization complete)
<6>carp: 15@vmx2_vlan602: MASTER -> INIT (hardware interface up)
<6>carp: 15@vmx2_vlan602: INIT -> BACKUP (initialization complete)
<6>carp: 16@vmx2_vlan107: MASTER -> INIT (hardware interface up)
<6>carp: 16@vmx2_vlan107: INIT -> BACKUP (initialization complete)
<6>carp: 17@vmx2_vlan682: MASTER -> INIT (hardware interface up)
<6>carp: 17@vmx2_vlan682: INIT -> BACKUP (initialization complete)
<6>carp: 20@vmx2_vlan607: MASTER -> INIT (hardware interface up)
<6>carp: 20@vmx2_vlan607: INIT -> BACKUP (initialization complete)
<6>carp: 21@vmx0: MASTER -> INIT (hardware interface up)
<6>carp: 21@vmx0: INIT -> BACKUP (initialization complete)
<6>carp: 23@vmx1: MASTER -> INIT (hardware interface up)
<6>carp: 23@vmx1: INIT -> BACKUP (initialization complete)
<6>carp: 25@vmx2_vlan631: MASTER -> INIT (hardware interface up)
<6>carp: 25@vmx2_vlan631: INIT -> BACKUP (initialization complete)
<6>carp: 26@vmx2_vlan632: MASTER -> INIT (hardware interface up)
<6>carp: 26@vmx2_vlan632: INIT -> BACKUP (initialization complete)
<6>carp: 27@vmx2_vlan700: MASTER -> INIT (hardware interface up)
<6>carp: 27@vmx2_vlan700: INIT -> BACKUP (initialization complete)
<6>carp: 28@vmx2_vlan701: MASTER -> INIT (hardware interface up)
<6>carp: 28@vmx2_vlan701: INIT -> BACKUP (initialization complete)
<6>carp: 29@vmx2_vlan702: MASTER -> INIT (hardware interface up)
<6>carp: 29@vmx2_vlan702: INIT -> BACKUP (initialization complete)
<6>carp: 30@vmx2_vlan703: MASTER -> INIT (hardware interface up)
<6>carp: 30@vmx2_vlan703: INIT -> BACKUP (initialization complete)
<6>carp: 31@vmx2_vlan704: MASTER -> INIT (hardware interface up)
<6>carp: 31@vmx2_vlan704: INIT -> BACKUP (initialization complete)
<6>carp: 32@vmx2_vlan705: MASTER -> INIT (hardware interface up)
<6>carp: 32@vmx2_vlan705: INIT -> BACKUP (initialization complete)
<6>carp: 33@vmx2_vlan703: MASTER -> INIT (hardware interface up)
<6>carp: 33@vmx2_vlan703: INIT -> BACKUP (initialization complete)
<6>carp: 34@vmx2_vlan704: MASTER -> INIT (hardware interface up)
<6>carp: 34@vmx2_vlan704: INIT -> BACKUP (initialization complete)
<6>carp: 35@vmx2_vlan705: MASTER -> INIT (hardware interface up)
<6>carp: 35@vmx2_vlan705: INIT -> BACKUP (initialization complete)
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
******************************************************
NOMAINTENANCE Secondary
******************************************************
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
ENTERING & LEAVING Maintenance works
BUT I'm very confused about the demotion value on the master --> shouldn't it be 240?? (Last time it was)
***BOOTED***
******************************************************
MAINTENANCE PRIMARY (after boot)
******************************************************
<6>carp: 30@vmx2_vlan703: INIT -> BACKUP (initialization complete)
<6>carp: 31@vmx2_vlan704: INIT -> BACKUP (initialization complete)
<6>carp: 32@vmx2_vlan705: INIT -> BACKUP (initialization complete)
<6>carp: 33@vmx2_vlan703: INIT -> BACKUP (initialization complete)
<6>carp: 34@vmx2_vlan704: INIT -> BACKUP (initialization complete)
<6>carp: 35@vmx2_vlan705: INIT -> BACKUP (initialization complete)
<6>carp: demoted by 240 to 240 (pfsync bulk start)
<6>carp: demoted by -240 to 0 (pfsync bulk done)
<118>>>> Invoking start script 'carp'
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
------> LEAVE Persistent
Primary stays backup on all VLANs
#3
******************************************************
NOMAINTENANCE PRIMARY
******************************************************
<118>>>> Invoking start script 'carp'
<6>carp: demoted by 0 to 0 (sysctl)
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
******************************************************
NOMAINTENANCE Secondary
******************************************************
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
Going back to the VMware snapshot with 19.1.7, the firewall switches back properly to MASTER.
Things to mention:
- 29 CARP entries on separate VLANs
- both firewalls are on vSphere (ESXi) 6.5
- there's a separate interface for pfsync
- and CARP traffic is allowed via a floating rule (first match)
some other logs:
https://pastebin.com/rSCZGWy2
I'm confused by the (master timed out) messages, but everything seems to be working as expected at this moment.
Are you sure you pushed the button "Enter persistent maintenance mode" (that would generate a different message, about slower advertisements)?
To me it seems you pushed "Disable CARP" (master timeout).
nope, very sure - tried it multiple times
System log of the master when you enter mnt mode, please.
I found an HA cluster running 19.1.7 live; I compared sysctl before, during and after mnt mode .. on master and slave, for 19.1.7, 19.1.10 and 19.7.2. It worked as expected on all versions, no stuck states, nothing.
I can upload all files if needed.
damn .. that's so strange - currently I have no idea
Is that HA cluster virtualized on ESXi?
No, Hardware, old school :8
I've got the feeling that something did change with that whole
- ESXi
- Portgroup
- Net.ReversePathFwdCheckPromisc
- Promiscuous Mode
- LAG / no LAG
- LoadBalancing Hash based / Originating interface
But what I don't understand is what changed in that case with 19.1.8
In 19.1.7, maintenance mode was implemented as a kind of CARP disable, and with this you had packet loss three times within the migration time. In 19.1.8 this was changed to raise the demotion on the master to force a switch, which works without packet loss.
If your demotion values are not even from the beginning it won't work, so 99% of the time it's a config error.
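As I understand it, the demotion mechanism can also be exercised by hand: on FreeBSD, writing to the net.inet.carp.demotion sysctl applies a relative adjustment, which matches the "demoted by 240 to 240 (sysctl)" kernel log lines earlier in this thread (a sketch only; run on the master at your own risk):

```shell
# Raise the demotion on the master by 240 so the backup preempts
# (this is roughly what entering maintenance mode does in 19.1.8):
sysctl net.inet.carp.demotion=240     # kernel logs: "demoted by 240 to 240 (sysctl)"

# Lower it again by the same amount to leave maintenance:
sysctl net.inet.carp.demotion=-240    # kernel logs: "demoted by -240 to 0 (sysctl)"
```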
https://github.com/opnsense/core/issues/3671#issuecomment-525004560
All these fixes up to 19.7.4_1 did the trick. Testing a little more today, but now it's looking good!
And even the necessary reboot is fixed now :)