OPNsense 21.1.7_1-amd64
I have two pairs of firewalls and they're both behaving in the same way and I'm not sure why. The primary has CARP interfaces with base:skew values of 1:0. Secondary is 1:100. Occasionally the primary shows CARP status of BACKUP and secondary shows MASTER. I see this in the log on the primary while in BACKUP status (newest on top):
2021-07-06T15:10:26 opnsense[54731] /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "192.168.1.254 - LAN CARP (1@em1)" has resumed the state "BACKUP" for vhid 1
2021-07-06T15:10:26 kernel em1: deletion failed: 3
2021-07-06T15:10:26 kernel carp: 1@em1: MASTER -> BACKUP (more frequent advertisement received)
2021-07-06T15:09:51 opnsense[23015] /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "10.2.0.1 - LDC01-TEST CARP (2@em0_vlan2)" has resumed the state "BACKUP" for vhid 2
2021-07-06T15:09:50 opnsense[76639] /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "10.100.2.1 - LDC01-TDC01-L2 (3@em0_vlan910)" has resumed the state "BACKUP" for vhid 3
2021-07-06T15:09:50 kernel em0_vlan2: deletion failed: 3
2021-07-06T15:09:50 kernel carp: 2@em0_vlan2: MASTER -> BACKUP (more frequent advertisement received)
2021-07-06T15:09:50 kernel em0_vlan910: deletion failed: 3
2021-07-06T15:09:50 kernel carp: 3@em0_vlan910: MASTER -> BACKUP (more frequent advertisement received)
Following the page at https://docs.opnsense.org/development/backend/carp.html, I see the following on the primary (while in BACKUP status):
root@LDC01A:~ # sysctl net.inet.carp.demotion
net.inet.carp.demotion: 1048576
root@LDC01A:~ # sysctl net.inet.carp.ifdown_demotion_factor
net.inet.carp.ifdown_demotion_factor: 240
root@LDC01A:~ # sysctl net.inet.carp.senderr_demotion_factor
net.inet.carp.senderr_demotion_factor: 240
root@LDC01A:~ # sysctl net.inet.carp_demotion_factor
sysctl: unknown oid 'net.inet.carp_demotion_factor'
root@LDC01A:~ # sysctl net.pfsync.carp_demotion_factor
net.pfsync.carp_demotion_factor: 240
I'm not sure what to make of it, so I do this:
root@LDC01A:~ # configctl interface update carp service_status
OK
Within seconds, the primary firewall changes its CARP status to MASTER and the secondary to BACKUP. I see this in the log on the primary:
2021-07-07T10:44:02 opnsense[99061] /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "10.2.0.1 - LDC01-TEST CARP (2@em0_vlan2)" has resumed the state "MASTER" for vhid 2
2021-07-07T10:44:01 opnsense[54887] /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "192.168.1.254 - LAN CARP (1@em1)" has resumed the state "MASTER" for vhid 1
2021-07-07T10:44:00 opnsense[21941] /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "10.100.2.1 - LDC01-TDC01-L2 (3@em0_vlan910)" has resumed the state "MASTER" for vhid 3
2021-07-07T10:43:59 kernel carp: 2@em0_vlan2: BACKUP -> MASTER (preempting a slower master)
2021-07-07T10:43:59 kernel carp: 1@em1: BACKUP -> MASTER (preempting a slower master)
2021-07-07T10:43:59 kernel carp: 3@em0_vlan910: BACKUP -> MASTER (preempting a slower master)
2021-07-07T10:43:59 kernel carp: demoted by -1048576 to 0 (sysctl)
2021-07-07T10:43:59 carp[40586] carp promoted by 1048576 due to service recovery
- Why did CARP status swap in the first place?
- Why does it not swap back until I manually run that command in the shell?
Your CARP demotion value (net.inet.carp.demotion) was over a million; the reason for that would be of interest.
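For anyone checking this on their own pair, here is a minimal POSIX sh sketch that classifies a demotion counter value. The function name carp_demotion_state is hypothetical and not part of OPNsense; the only real interface assumed is the net.inet.carp.demotion sysctl already shown above. It treats any value above 0 as "demoted", since the kernel adds the demotion value to the advertised skew, and it only accepts non-negative integers.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with OPNsense): classify a CARP demotion
# counter as read from net.inet.carp.demotion.
# 0 means no demotion is in effect; a value above 0 is added to the
# advertised skew, so this node looks less preferred to its peer.
carp_demotion_state() {
    value="$1"
    # Accept only non-negative integers; anything else is reported as unknown.
    case "$value" in
        ''|*[!0-9]*) echo "unknown"; return 2 ;;
    esac
    if [ "$value" -gt 0 ]; then
        echo "demoted by $value"
        return 1
    fi
    echo "not demoted"
}

# On the firewall itself this could be called as follows
# (sysctl -n prints only the value):
# carp_demotion_state "$(sysctl -n net.inet.carp.demotion)"
```

With the value from the first post (1048576) this reports a demotion. It does not say which service caused it, so the system log still has to be checked for that.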
I don't see that setting in the CARP config page (attached). I have FRR installed but disabled, so maybe it came from there?
Do you have OSPF CARP support enabled in FRR?
Cheers,
Franco
Quote from: franco on July 07, 2021, 08:43:55 PM
Do you have OSPF CARP support enabled in FRR?
Yes. Does that have any effect with FRR disabled? Maybe FRR demoted CARP before I disabled FRR?
Please disable it and first get your CARP setup stable before you add dynamic routing.
Is it necessary to disable all FRR options when FRR is disabled? Seems odd to have to wipe the whole FRR config rather than just disable FRR.