17.1 Legacy Series / CARP Randomly goes to Master on slave
« on: July 11, 2017, 02:03:30 pm »
Hi everyone, I'm experiencing some strange behaviour.
My setup is two OPNsense firewalls, both updated to 17.1.9, with 16 CARP VIPs.
A screenshot of the ifconfig output is attached.
State transfer between the two firewalls works correctly (pfsync shows the same state table size on both), XMLRPC sync from the master to the slave works, and everything runs fine. If I stop or unplug the master firewall, the slave becomes master and keeps the open connections alive (1-5 packets lost during the switchover).
During the day the slave firewall only rarely flaps from BACKUP to MASTER, but at night all 16 VIPs on the slave switch to MASTER (about once per hour; I set up a cron job to reboot the slave every hour...), with dmesg saying MASTER TIMED OUT. Sometimes only the LAN VIP switches from BACKUP to MASTER, again saying master timed out.
The two firewalls are Xen virtual machines in the same XenServer cluster; the switch on the LAN interface is a Dell PowerConnect 5524.
What can I do to debug this? Is there no automatic recovery from this behaviour?
In case it helps, this morning the master had these entries in dmesg:
Code:
carp: demoted by 240 to 240 (send error 55 on xn0)
carp: demoted by 240 to 480 (send error 55 on xn0)
carp: demoted by 240 to 720 (send error 55 on xn0)
carp: demoted by 240 to 960 (send error 55 on xn0)
carp: demoted by 240 to 1200 (send error 55 on xn0)
carp: demoted by 240 to 1440 (send error 55 on xn0)
carp: demoted by 240 to 1680 (send error 55 on xn0)
carp: demoted by 240 to 1920 (send error 55 on xn0)
carp: demoted by 240 to 2160 (send error 55 on xn0)
carp: demoted by 240 to 2400 (send error 55 on xn0)
carp: demoted by 240 to 2640 (send error 55 on xn0)
carp: demoted by 240 to 2880 (send error 55 on xn0)
carp: demoted by 240 to 3120 (send error 55 on xn0)
carp: demoted by 240 to 3360 (send error 55 on xn0)
carp: demoted by 240 to 3600 (send error 55 on xn0)
carp: demoted by 240 to 3840 (send error 55 on xn0)
carp: demoted by 240 to 4080 (send error 55 on xn0)
carp: demoted by 240 to 4320 (send error 55 on xn0)
carp: demoted by 240 to 4560 (send error 55 on xn0)
carp: demoted by 240 to 4800 (send error 55 on xn0)
carp: demoted by 240 to 5040 (send error 55 on xn0)
carp: demoted by 240 to 5280 (send error 55 on xn0)
carp: demoted by 240 to 5520 (send error 55 on xn0)
carp: demoted by 240 to 5760 (send error 55 on xn0)
carp: demoted by 240 to 6000 (send error 55 on xn0)
carp: demoted by 240 to 6240 (send error 55 on xn0)
carp: demoted by 240 to 6480 (send error 55 on xn0)
carp: demoted by 240 to 6720 (send error 55 on xn0)
carp: demoted by 240 to 6960 (send error 55 on xn0)
carp: demoted by 240 to 7200 (send error 55 on xn0)
carp: demoted by 240 to 7440 (send error 55 on xn0)
carp: demoted by 240 to 7680 (send error 55 on xn0)
carp: demoted by 240 to 7920 (send error 55 on xn0)
carp: demoted by 240 to 8160 (send error 55 on xn0)
carp: demoted by 240 to 8400 (send error 55 on xn0)
carp: demoted by 240 to 8640 (send error 55 on xn0)
carp: demoted by 240 to 8880 (send error 55 on xn0)
carp: demoted by 240 to 9120 (send error 55 on xn0)
carp: demoted by 240 to 9360 (send error 55 on xn0)
carp: demoted by 240 to 9600 (send error 55 on xn0)
carp: demoted by 240 to 9840 (send error 55 on xn0)
carp: demoted by 240 to 10080 (send error 55 on xn0)
carp: demoted by 240 to 10320 (send error 55 on xn0)
carp: demoted by 240 to 10560 (send error 55 on xn0)
carp: demoted by 240 to 10800 (send error 55 on xn0)
carp: demoted by 240 to 11040 (send error 55 on xn0)
carp: demoted by 240 to 11280 (send error 55 on xn0)
carp: demoted by 240 to 11520 (send error 55 on xn0)
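To make the log above easier to read, here is a rough Python sketch that tallies the demotion lines and decodes the error number. It is only an illustration: the function names (`summarize_demotions`, `errno_name`) are made up for this post, and the sample list simply regenerates the 48 lines shown above. On FreeBSD, errno 55 is ENOBUFS ("No buffer space available"), which suggests the CARP advertisements could not be queued for sending on xn0.

```python
import re

# FreeBSD errno 55 is ENOBUFS; only the value seen in the log is listed here.
FREEBSD_ERRNO = {55: "ENOBUFS (No buffer space available)"}

# Matches lines like "carp: demoted by 240 to 480 (send error 55 on xn0)".
DEMOTE_RE = re.compile(r"carp: demoted by (-?\d+) to (-?\d+) \((.+)\)")


def summarize_demotions(lines):
    """Return (event_count, final_counter, reasons) for carp demotion lines."""
    count = 0
    final = 0
    reasons = {}
    for line in lines:
        m = DEMOTE_RE.search(line)
        if not m:
            continue  # not a demotion message
        count += 1
        final = int(m.group(2))  # counter value reported after this event
        reason = m.group(3)
        reasons[reason] = reasons.get(reason, 0) + 1
    return count, final, reasons


def errno_name(reason):
    """Return the errno name for 'send error N on IF', or None if unknown."""
    m = re.search(r"send error (\d+)", reason)
    if not m:
        return None
    return FREEBSD_ERRNO.get(int(m.group(1)))


# Regenerates the 48 log lines from the master (240, 480, ..., 11520).
sample = [
    f"carp: demoted by 240 to {240 * i} (send error 55 on xn0)"
    for i in range(1, 49)
]

count, final, reasons = summarize_demotions(sample)
print(count, final)  # 48 events, counter at 11520
for reason, n in reasons.items():
    print(n, reason, "->", errno_name(reason))
```

So the master demoted itself 48 times in steps of 240 without the counter ever going back down in this excerpt, which may be why the slave ends up preferring itself as MASTER.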
And I think CARP is working, because when I restart the slave firewall it logs:
Code:
xn0: performing interface reset due to feature change
xn0: backend features: feature-sg feature-gso-tcp4
xn0: performing interface reset due to feature change
xn0: backend features: feature-sg feature-gso-tcp4
xn0: 2 link states coalesced
xn0: link state changed to UP
xn1: performing interface reset due to feature change
xn1: backend features: feature-sg feature-gso-tcp4
xn1: performing interface reset due to feature change
xn1: backend features: feature-sg feature-gso-tcp4
xn1: 2 link states coalesced
xn1: link state changed to UP
xn2: performing interface reset due to feature change
xn2: backend features: feature-sg feature-gso-tcp4
xn2: performing interface reset due to feature change
xn2: backend features: feature-sg feature-gso-tcp4
xn2: 2 link states coalesced
xn2: link state changed to UP
xn2: promiscuous mode enabled
carp: 1@xn2: INIT -> BACKUP (initialization complete)
xn0: promiscuous mode enabled
carp: 2@xn0: INIT -> BACKUP (initialization complete)
carp: 3@xn2: INIT -> BACKUP (initialization complete)
carp: 4@xn0: INIT -> BACKUP (initialization complete)
carp: 5@xn2: INIT -> BACKUP (initialization complete)
carp: 6@xn0: INIT -> BACKUP (initialization complete)
carp: 7@xn2: INIT -> BACKUP (initialization complete)
carp: 8@xn0: INIT -> BACKUP (initialization complete)
carp: 9@xn2: INIT -> BACKUP (initialization complete)
carp: 10@xn0: INIT -> BACKUP (initialization complete)
carp: 11@xn2: INIT -> BACKUP (initialization complete)
carp: 12@xn0: INIT -> BACKUP (initialization complete)
carp: 13@xn2: INIT -> BACKUP (initialization complete)
carp: 14@xn0: INIT -> BACKUP (initialization complete)
carp: 15@xn2: INIT -> BACKUP (initialization complete)
carp: 16@xn0: INIT -> BACKUP (initialization complete)
carp: demoted by 240 to 240 (pfsync bulk start)
pflog0: promiscuous mode enabled
carp: demoted by -240 to 0 (pfsync bulk done)
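In case it is useful for debugging, here is a small Python sketch for grouping CARP state transitions from dmesg by interface and new state. The function name `group_transitions` is made up for this post, and the sample input is just a few of the lines shown above; the regular expression assumes the `VHID@interface: OLD -> NEW (reason)` format seen in this log.

```python
import re
from collections import defaultdict

# Matches lines like "carp: 1@xn2: INIT -> BACKUP (initialization complete)".
TRANSITION_RE = re.compile(r"carp: (\d+)@([^:\s]+): (\w+) -> (\w+) \((.+)\)")


def group_transitions(lines):
    """Map (interface, new_state) to the sorted VHIDs that entered that state."""
    groups = defaultdict(list)
    for line in lines:
        m = TRANSITION_RE.search(line)
        if m:
            vhid, iface, _old_state, new_state, _reason = m.groups()
            groups[(iface, new_state)].append(int(vhid))
    return {key: sorted(vhids) for key, vhids in groups.items()}


# A few lines copied from the slave's boot log above.
boot_log = [
    "xn2: promiscuous mode enabled",
    "carp: 1@xn2: INIT -> BACKUP (initialization complete)",
    "carp: 2@xn0: INIT -> BACKUP (initialization complete)",
    "carp: 3@xn2: INIT -> BACKUP (initialization complete)",
    "carp: 16@xn0: INIT -> BACKUP (initialization complete)",
    "carp: demoted by 240 to 240 (pfsync bulk start)",
]

for (iface, state), vhids in group_transitions(boot_log).items():
    print(iface, state, vhids)
```

Run against the full dmesg output during the night, the same grouping should show whether the BACKUP -> MASTER flaps hit every VHID on an interface at once or only some of them.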
Thank you!