OPNsense Forum

Archive => 19.1 Legacy Series => Topic started by: bitmusician on May 23, 2019, 11:17:10 am

Title: CARP role doesn't switch properly after updating to 19.1.8
Post by: bitmusician on May 23, 2019, 11:17:10 am
Hi,
When we updated our test cluster from version 19.1.6 to 19.1.8, and after that our production cluster from 19.1.4 to 19.1.8, we noticed a small problem with switching the CARP roles. After updating both nodes in each cluster, we wanted to check whether role switching still behaves as before (when the MASTER is put into maintenance mode, it becomes BACKUP). So node 1 (MASTER) went into maintenance mode but unfortunately stayed MASTER for this cluster. Disabling CARP on this node and enabling it again didn't help. The only thing that restored normal switching when one of the nodes is put into maintenance mode was changing the skew of the advertising frequency of one VIP on the MASTER node from 0 to 1 and then back from 1 to 0.

Maybe this workaround helps somebody with the same problem.

Greeetz,
bitmusician
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on May 23, 2019, 11:58:34 am
There was indeed a change in 19.1.8. Was it a one-time occurrence or is it reproducible?
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: bitmusician on May 23, 2019, 12:14:24 pm
As I wrote, I first noticed the problem on our test cluster (which is not a copy of the production system) and then on our production cluster too. It should be reproducible on any cluster after updating to 19.1.8.
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on May 23, 2019, 01:02:18 pm
I tested this successfully in dev. Maybe you have configured some tunables manually?
See the details here:
https://github.com/opnsense/core/issues/3163
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: bitmusician on May 23, 2019, 01:25:58 pm
We didn't make any changes in the tunables and we also did not have packet loss.
Since I did the workaround we don't have the problem anymore.
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: franco on May 23, 2019, 01:33:37 pm
The likely candidate is actually https://github.com/opnsense/core/commit/c5d6b6cacf but it would indicate you are relying on policy routing for a CARP setup which really shouldn't have it (best to use a dedicated CARP link).

# opnsense-patch c5d6b6cacf


Cheers,
Franco
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: ruffy91 on May 23, 2019, 08:36:31 pm
I had exactly the same symptoms, including that disabling and re-enabling CARP did not help. The pfsync bulk update was successful and the skew went to 0, but the node did not become master again.
Instead of the workaround I just rebooted it and it became master again.
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: Wayne Train on May 27, 2019, 10:50:53 am
Hi,

I can confirm the issue exists. We're experiencing this behaviour on both of our production clusters since upgrading to 19.1.8. I tried setting our secondary to "persistent CARP maintenance mode", which usually makes the primary node master again, but this also failed. I'll reboot the secondary after work to make it slave again.

Cheers,
Wayne
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on May 27, 2019, 02:49:57 pm
Hi,

I can confirm the issue exists. We're experiencing this behaviour on both of our production clusters since upgrading to 19.1.8. I tried setting our secondary to "persistent CARP maintenance mode", which usually makes the primary node master again, but this also failed. I'll reboot the secondary after work to make it slave again.

Cheers,
Wayne

How many CARP IPs do you have, and of which type?
My test cluster has 2 VIPs, both static (LAN, WAN), and I can successfully switch back and forth.
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: ruffy91 on May 27, 2019, 04:32:10 pm
My setup has 3 VIPs (WAN, LAN, DMZ) plus an alias IP on WAN, and it has problems switching between the firewalls.
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on May 27, 2019, 04:50:12 pm
Please run "sysctl -a | grep carp" on machines 1 and 2: before entering mnt mode, after entering it, and again after leaving it.
On my side it looks good:

root@OPNsense1:~ # sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
root@OPNsense1:~ # sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 240
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
root@OPNsense1:~ # sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240

root@OPNsense2:~ # sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
root@OPNsense2:~ # sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
root@OPNsense2:~ # sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
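
Comparing these dumps by eye is error-prone, since only net.inet.carp.demotion changes between them. Below is a minimal Python sketch, assuming the `sysctl -a | grep carp` output was saved as text for each phase. The function names and the snapshot labels are made up for illustration, and the sample strings are shortened excerpts of the OPNsense1 output above.

```python
import re

# Matches lines such as "net.inet.carp.demotion: 240"; shell prompts are ignored.
SYSCTL_RE = re.compile(r"^(net\.[\w.]+):\s*(-?\d+)\s*$")


def parse_carp_sysctls(text):
    """Turn 'name: value' lines from `sysctl -a | grep carp` into a dict."""
    values = {}
    for line in text.splitlines():
        match = SYSCTL_RE.match(line.strip())
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def demotion_timeline(snapshots):
    """Map each snapshot label to its net.inet.carp.demotion value."""
    return {
        label: parse_carp_sysctls(text).get("net.inet.carp.demotion")
        for label, text in snapshots.items()
    }


# Shortened excerpts of the three OPNsense1 dumps (before, in, after mnt mode).
snapshots = {
    "before": "net.inet.carp.demotion: 0\nnet.inet.carp.preempt: 1\n",
    "maintenance": (
        "root@OPNsense1:~ # sysctl -a | grep carp\n"
        "net.inet.carp.demotion: 240\n"
    ),
    "after": "net.inet.carp.demotion: 0\n",
}

if __name__ == "__main__":
    print(demotion_timeline(snapshots))
```

On a healthy primary the timeline should read 0, then 240 while in maintenance mode, then 0 again; on the secondary it should stay at 0 throughout, as in the OPNsense2 dumps.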
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: Wayne Train on May 27, 2019, 04:56:18 pm
Hi,
we use 11 CARP VIPs, one for each VLAN.
Cheers,
Wayne
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on May 27, 2019, 04:57:53 pm
Hi,
we use 11 CARP VIPs, one for each VLAN.
Cheers,
Wayne

Please post the sysctl output from your side too, as above. I can't track this down without debugging ...
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: katamadone [CH] on July 03, 2019, 09:35:39 pm
Coming from the thread linked at the bottom of this post.

Before maintenance:
#1
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
#2
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240

#1 in persistent maintenance
#1
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 240
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240

#2
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240

#1 left persistent maintenance
#1
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
#2
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240


(https://forum.opnsense.org/index.php?topic=12943.0)
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: katamadone [CH] on July 03, 2019, 09:38:05 pm
@mimugmail as I understand it, you're looking at this value on the *primary*:
net.inet.carp.demotion: 240
So it should be the same on my side as on yours. Did you check whether Master / Backup was displayed correctly in the web UI?
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: katamadone [CH] on July 04, 2019, 10:42:35 am
- Currently it looks like the secondary is always master, not backup as in 19.1.7
- Tried to edit the Virtual IP settings and re-save them
- HAVE TO CONFIRM THAT: both become master on this IP regardless of the advertising Frequency settings
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on July 06, 2019, 09:54:02 am
So, you left mnt mode and both were master and had the sysctl from above? Strange. Then go to CLI, do a clog -f /var/log/system.log and post the new lines when leaving mnt mode
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: bitmusician on July 10, 2019, 10:19:03 am
- Currently it looks like the secondary is always master, not backup as in 19.1.7
- Tried to edit the Virtual IP settings and re-save them
- HAVE TO CONFIRM THAT: both become master on this IP regardless of the advertising Frequency settings

We are facing the same issue right now. In the GUI it looks like both nodes are Master after putting the first node into persistent CARP maintenance mode, but when we reboot this node the other "Master" (the actual backup node) doesn't answer requests to the VIPs.

Is there already a solution?
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on July 10, 2019, 10:48:50 am
You can help too:

https://forum.opnsense.org/index.php?topic=12832.msg61681#msg61681
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: bitmusician on July 10, 2019, 03:46:15 pm
So, you left mnt mode and both were master and had the sysctl from above? Strange. Then go to CLI, do a clog -f /var/log/system.log and post the new lines when leaving mnt mode


node01 (normally the MASTER):
when switching into maintenance mode:


Jul 10 12:53:30 node01 kernel: carp: demoted by 240 to 240 (sysctl)

when switching out of maintenance mode:

Jul 10 12:54:42 node01 kernel: carp: demoted by -240 to 0 (sysctl)

-----------------------------------------------------

node02 (normally the BACKUP):
when switching into maintenance mode:


Jul 10 12:53:30 node02 kernel: carp: 31@igb0: BACKUP -> MASTER (preempting a slower master)
Jul 10 12:53:31 node02 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "xxx.xxx.xxx.xxx - VIP WAN (31@igb0)" has resumed the state "MASTER" for vhid 31
Jul 10 12:53:31 node02 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Starting OpenVPN server instance on xxx.xxx.xxx.xxx - VIP WAN because of transition to CARP master.
Jul 10 12:53:31 node02 kernel: ovpns1: link state changed to DOWN
Jul 10 12:53:35 node02 kernel: ovpns1: link state changed to UP
Jul 10 12:53:35 node02 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: OpenVPN server 1 instance started on PID 34541.
Jul 10 12:53:36 node02 opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 'ovpns1'
Jul 10 12:53:36 node02 opnsense: /usr/local/etc/rc.newwanip: Interface '' is disabled or empty, nothing to do.

when switching out of maintenance mode:


Jul 10 12:54:41 node02 kernel: carp: 31@igb0: MASTER -> BACKUP (more frequent advertisement received)
Jul 10 12:54:41 node02 kernel: ifa_maintain_loopback_route: deletion failed for interface igb0: 3
Jul 10 12:54:42 node02 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "xxx.xxx.xxx.xxx - VIP WAN (31@igb0)" has resumed the state "BACKUP" for vhid 31
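
To check whether the demotion counter really returns to 0 after leaving maintenance mode, the "demoted by" kernel lines can be pulled out of the system log. This is a minimal Python sketch, assuming the log text is already available as a string (for example copied from the clog output). The regex is based only on the line format shown in this thread, and the function names are hypothetical.

```python
import re

# Format as seen above: "carp: demoted by 240 to 240 (sysctl)"
DEMOTION_RE = re.compile(r"carp: demoted by (-?\d+) to (-?\d+) \(([^)]*)\)")


def demotion_events(log_text):
    """Return (delta, new_total, reason) for every demotion line in the log."""
    events = []
    for line in log_text.splitlines():
        match = DEMOTION_RE.search(line)
        if match:
            events.append((int(match.group(1)), int(match.group(2)), match.group(3)))
    return events


def returns_to_zero(events):
    """True if at least one event was seen and the last one left the counter at 0."""
    return bool(events) and events[-1][1] == 0


# The two node01 lines quoted above.
NODE01_LOG = """\
Jul 10 12:53:30 node01 kernel: carp: demoted by 240 to 240 (sysctl)
Jul 10 12:54:42 node01 kernel: carp: demoted by -240 to 0 (sysctl)
"""

if __name__ == "__main__":
    events = demotion_events(NODE01_LOG)
    print(events)
    print(returns_to_zero(events))
```

For the node01 lines this reports a +240 step followed by a -240 step, which matches entering and leaving maintenance mode; a counter that ends above 0 would keep the node demoted.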
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: katamadone [CH] on July 11, 2019, 03:48:15 pm
I had to revert, couldn't leave that in production. Sorry.
I'll have to start over soon and try again.
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: bitmusician on July 12, 2019, 07:10:27 am
I had to revert, couldn't leave that in production. Sorry.
I'll have to start over soon and try again.

Did you have the problem already in 19.1.8 or only after updating to 19.1.10?
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on July 12, 2019, 07:22:23 am
I don't get it ... reading the logs, after switching off mnt mode the second machine should be backup???
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: bitmusician on July 12, 2019, 08:13:41 am
I don't get it ... from reading the logs after switching off mnt mode second machine should be backup???

Yes, when switching it off it's Backup again. But when I turn on maintenance mode on the first node, both are shown as Master in the GUI and neither answers requests to the VIP.
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on July 12, 2019, 09:40:34 am
I don't get it ... from reading the logs after switching off mnt mode second machine should be backup???

Yes, when switching it off it's Backup again. But when I turn on maintenance mode on the first node, both are shown as Master in the GUI and neither answers requests to the VIP.

And when you revert to 19.1.7 it's working again?
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: katamadone [CH] on July 17, 2019, 04:43:17 pm
I had to revert, couldn't leave that in production. Sorry.
I'll have to start over soon and try again.

Did you have the problem already in 19.1.8 or only after updating to 19.1.10?

I already had the problem when going from 19.1.7 to 19.1.8.
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: katamadone [CH] on July 25, 2019, 05:28:11 pm
I'm inclined to try it again. I'm not so happy about staying on OPNsense 19.1.7.

@mimugmail do you have any tips on what I should check?
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: katamadone [CH] on August 05, 2019, 07:20:56 am
#1
******************************************************
  NOMAINTENANCE   PRIMARY
******************************************************
sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
******************************************************
  NOMAINTENANCE   SECONDARY AFTER BOOT
******************************************************
sysctl -a | grep carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
#2
******************************************************
  MAINTENANCE   PRIMARY
******************************************************
<6>carp: 1@vmx2_vlan605: MASTER -> INIT (hardware interface up)
<6>carp: 1@vmx2_vlan605: INIT -> BACKUP (initialization complete)
<6>carp: 2@vmx2_vlan621: MASTER -> INIT (hardware interface up)
<6>carp: 2@vmx2_vlan621: INIT -> BACKUP (initialization complete)
<6>carp: 3@vmx2_vlan622: MASTER -> INIT (hardware interface up)
<6>carp: 3@vmx2_vlan622: INIT -> BACKUP (initialization complete)
<6>carp: 4@vmx2_vlan623: MASTER -> INIT (hardware interface up)
<6>carp: 4@vmx2_vlan623: INIT -> BACKUP (initialization complete)
<6>carp: 5@vmx2_vlan624: MASTER -> INIT (hardware interface up)
<6>carp: 5@vmx2_vlan624: INIT -> BACKUP (initialization complete)
<6>carp: 6@vmx2_vlan625: MASTER -> INIT (hardware interface up)
<6>carp: 6@vmx2_vlan625: INIT -> BACKUP (initialization complete)
<6>carp: 7@vmx2_vlan626: MASTER -> INIT (hardware interface up)
<6>carp: 7@vmx2_vlan626: INIT -> BACKUP (initialization complete)
<6>carp: 8@vmx2_vlan627: MASTER -> INIT (hardware interface up)
<6>carp: 8@vmx2_vlan627: INIT -> BACKUP (initialization complete)
<6>carp: 9@vmx2_vlan628: MASTER -> INIT (hardware interface up)
<6>carp: 9@vmx2_vlan628: INIT -> BACKUP (initialization complete)
<6>carp: 11@vmx2_vlan630: MASTER -> INIT (hardware interface up)
<6>carp: 11@vmx2_vlan630: INIT -> BACKUP (initialization complete)
<6>carp: 13@vmx2_vlan606: MASTER -> INIT (hardware interface up)
<6>carp: 13@vmx2_vlan606: INIT -> BACKUP (initialization complete)
<6>carp: 14@vmx2_vlan611: MASTER -> INIT (hardware interface up)
<6>carp: 14@vmx2_vlan611: INIT -> BACKUP (initialization complete)
<6>carp: 15@vmx2_vlan602: MASTER -> INIT (hardware interface up)
<6>carp: 15@vmx2_vlan602: INIT -> BACKUP (initialization complete)
<6>carp: 16@vmx2_vlan107: MASTER -> INIT (hardware interface up)
<6>carp: 16@vmx2_vlan107: INIT -> BACKUP (initialization complete)
<6>carp: 17@vmx2_vlan682: MASTER -> INIT (hardware interface up)
<6>carp: 17@vmx2_vlan682: INIT -> BACKUP (initialization complete)
<6>carp: 20@vmx2_vlan607: MASTER -> INIT (hardware interface up)
<6>carp: 20@vmx2_vlan607: INIT -> BACKUP (initialization complete)
<6>carp: 21@vmx0: MASTER -> INIT (hardware interface up)
<6>carp: 21@vmx0: INIT -> BACKUP (initialization complete)
<6>carp: 23@vmx1: MASTER -> INIT (hardware interface up)
<6>carp: 23@vmx1: INIT -> BACKUP (initialization complete)
<6>carp: 25@vmx2_vlan631: MASTER -> INIT (hardware interface up)
<6>carp: 25@vmx2_vlan631: INIT -> BACKUP (initialization complete)
<6>carp: 26@vmx2_vlan632: MASTER -> INIT (hardware interface up)
<6>carp: 26@vmx2_vlan632: INIT -> BACKUP (initialization complete)
<6>carp: 27@vmx2_vlan700: MASTER -> INIT (hardware interface up)
<6>carp: 27@vmx2_vlan700: INIT -> BACKUP (initialization complete)
<6>carp: 28@vmx2_vlan701: MASTER -> INIT (hardware interface up)
<6>carp: 28@vmx2_vlan701: INIT -> BACKUP (initialization complete)
<6>carp: 29@vmx2_vlan702: MASTER -> INIT (hardware interface up)
<6>carp: 29@vmx2_vlan702: INIT -> BACKUP (initialization complete)
<6>carp: 30@vmx2_vlan703: MASTER -> INIT (hardware interface up)
<6>carp: 30@vmx2_vlan703: INIT -> BACKUP (initialization complete)
<6>carp: 31@vmx2_vlan704: MASTER -> INIT (hardware interface up)
<6>carp: 31@vmx2_vlan704: INIT -> BACKUP (initialization complete)
<6>carp: 32@vmx2_vlan705: MASTER -> INIT (hardware interface up)
<6>carp: 32@vmx2_vlan705: INIT -> BACKUP (initialization complete)
<6>carp: 33@vmx2_vlan703: MASTER -> INIT (hardware interface up)
<6>carp: 33@vmx2_vlan703: INIT -> BACKUP (initialization complete)
<6>carp: 34@vmx2_vlan704: MASTER -> INIT (hardware interface up)
<6>carp: 34@vmx2_vlan704: INIT -> BACKUP (initialization complete)
<6>carp: 35@vmx2_vlan705: MASTER -> INIT (hardware interface up)
<6>carp: 35@vmx2_vlan705: INIT -> BACKUP (initialization complete)
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240
******************************************************
  NOMAINTENANCE   Secondary
******************************************************
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240

ENTERING & LEAVING maintenance works
BUT I'm very confused about the demotion on the master --> shouldn't it be 240?? (Last time it was)

***BOOTED***
******************************************************
  MAINTENANCE   PRIMARY  (after boot)
******************************************************
<6>carp: 30@vmx2_vlan703: INIT -> BACKUP (initialization complete)
<6>carp: 31@vmx2_vlan704: INIT -> BACKUP (initialization complete)
<6>carp: 32@vmx2_vlan705: INIT -> BACKUP (initialization complete)
<6>carp: 33@vmx2_vlan703: INIT -> BACKUP (initialization complete)
<6>carp: 34@vmx2_vlan704: INIT -> BACKUP (initialization complete)
<6>carp: 35@vmx2_vlan705: INIT -> BACKUP (initialization complete)
<6>carp: demoted by 240 to 240 (pfsync bulk start)
<6>carp: demoted by -240 to 0 (pfsync bulk done)
<118>>>> Invoking start script 'carp'
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240

------> LEAVE persistent maintenance
Primary stays backup on all VLANs

#3

******************************************************
  NOMAINTENANCE   PRIMARY
******************************************************
<118>>>> Invoking start script 'carp'
<6>carp: demoted by 0 to 0 (sysctl)
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240

******************************************************
  NOMAINTENANCE   Secondary
******************************************************
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
net.pfsync.carp_demotion_factor: 240



After reverting to the VMware snapshot with 19.1.7, the firewall switches back to MASTER properly.

Things to mention:
- 29 CARP entries on separate VLANs
- Both firewalls are on vSphere (ESXi) 6.5
- There's a separate interface for pfsync
- CARP traffic is allowed via a floating rule (first match)
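
With 29 vhids it is hard to see at a glance which ones ended up in which state. The following is a minimal Python sketch, assuming the kernel messages are available as text. It keeps the last transition seen per vhid and lists those that are not MASTER. The regex follows the line format of the dumps above, the names are hypothetical, and the sample is a four-line excerpt.

```python
import re

# Format as seen above: "carp: 1@vmx2_vlan605: MASTER -> INIT (hardware interface up)"
TRANSITION_RE = re.compile(r"carp: (\d+)@(\S+): (\w+) -> (\w+) \(([^)]*)\)")


def final_states(log_text):
    """Map 'vhid@interface' to the last state it transitioned into."""
    states = {}
    for line in log_text.splitlines():
        match = TRANSITION_RE.search(line)
        if match:
            vhid, ifname, _old, new, _reason = match.groups()
            states[f"{vhid}@{ifname}"] = new
    return states


def not_master(states):
    """Return the vhids whose last known state is anything other than MASTER."""
    return sorted(key for key, state in states.items() if state != "MASTER")


# Excerpt of the primary's messages after entering maintenance mode.
PRIMARY_LOG = """\
<6>carp: 1@vmx2_vlan605: MASTER -> INIT (hardware interface up)
<6>carp: 1@vmx2_vlan605: INIT -> BACKUP (initialization complete)
<6>carp: 21@vmx0: MASTER -> INIT (hardware interface up)
<6>carp: 21@vmx0: INIT -> BACKUP (initialization complete)
"""

if __name__ == "__main__":
    states = final_states(PRIMARY_LOG)
    print(states)
    print(not_master(states))
```

Run against the primary's log after leaving persistent maintenance mode, an empty not_master list would mean every vhid came back as MASTER; in the failing case described above, all of them would be listed.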




Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: katamadone [CH] on August 05, 2019, 07:26:43 am
some other logs:
https://pastebin.com/rSCZGWy2

I'm confused by the (master timed out) messages, but everything seems to be working as expected at the moment.
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on August 05, 2019, 07:38:31 am
Are you sure you pushed the "Enter persistent maintenance mode" button (it would generate a different message, something like slower advertisements)?
To me it looks like you pushed "Disable CARP" (master timeout).
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: katamadone [CH] on August 05, 2019, 08:10:27 am
nope, very sure - tried it multiple times
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on August 05, 2019, 09:15:40 am
Please post the system log of the master from the moment you enter mnt mode.
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on August 06, 2019, 03:24:37 pm
I found an HA cluster running 19.1.7 live and compared the sysctl output before, during and after mnt mode, on master and slave, for 19.1.7, 19.1.10 and 19.7.2. It worked as expected on all versions; nothing got stuck.

I can upload all files if needed.
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: katamadone [CH] on August 09, 2019, 07:37:37 am
Damn, that's so strange. I currently have no idea.
Is that HA cluster virtualized on ESXi?
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on August 09, 2019, 08:41:11 am
No, Hardware, old school :8
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: katamadone [CH] on August 09, 2019, 09:02:05 am
I've got the feeling that something changed in the interplay of:
- ESXi
- Portgroup
- Net.ReversePathFwdCheckPromisc
- Promiscuous Mode
- LAG / no LAG
- Load balancing: hash based / originating interface

But what I don't understand is what changed in that case with 19.1.8
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on August 09, 2019, 09:26:54 am
In 19.1.7, maintenance mode was implemented as a form of disabling CARP, which cost you about 3 lost packets during the switchover. In 19.1.8 this was changed to raise the demotion on the master to force a switch, which works without packet loss.

If your demotions are not equal from the beginning it won't work, so 99% of the time it's a config error.
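
Why unequal demotions break the switch can be illustrated with a simplified model of the election. This Python sketch assumes, based on a reading of FreeBSD's carp(4), that the demotion counter is added to each vhid's advskew, that the sum is capped at 240, and that the node advertising most frequently (interval of advbase + skew/256 seconds) becomes master. The node names, the advskew of 100 on the backup and the function names are invented for illustration; this is not the kernel's actual code.

```python
CARP_MAXSKEW = 240  # assumed cap on the advertised skew (see lead-in)


def effective_skew(advskew, demotion):
    """Skew as advertised: configured advskew plus demotion, capped."""
    return min(advskew + demotion, CARP_MAXSKEW)


def advert_interval(advbase, advskew, demotion=0):
    """Seconds between advertisements in this simplified model."""
    return advbase + effective_skew(advskew, demotion) / 256


def expected_master(nodes):
    """nodes maps name -> (advbase, advskew, demotion); shortest interval wins."""
    return min(nodes, key=lambda name: advert_interval(*nodes[name]))


# Hypothetical two-node cluster: primary with advskew 0, backup with advskew 100.
normal = {"node01": (1, 0, 0), "node02": (1, 100, 0)}
maintenance = {"node01": (1, 0, 240), "node02": (1, 100, 0)}
# Backup already demoted before the primary enters maintenance mode.
uneven = {"node01": (1, 0, 240), "node02": (1, 100, 240)}

if __name__ == "__main__":
    print(expected_master(normal))
    print(expected_master(maintenance))
    print(advert_interval(*uneven["node01"]), advert_interval(*uneven["node02"]))
```

In the model, raising the primary's demotion to 240 hands the master role to node02, but if node02 already carries a demotion of 240 both nodes advertise at the same capped interval and nothing forces a switch.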
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: mimugmail on August 27, 2019, 10:43:28 am
https://github.com/opnsense/core/issues/3671#issuecomment-525004560
Title: Re: CARP role doesn't switch properly after updating to 19.1.8
Post by: katamadone [CH] on September 16, 2019, 06:44:11 am
All the fixes up to 19.7.4_1 did the trick. I'll test a little more today, but it's looking good now!
And even the previously necessary reboot is no longer needed :)