OPNsense Forum

Archive => 17.1 Legacy Series => Topic started by: nutellinoit on July 11, 2017, 02:03:30 pm

Title: CARP Randomly goes to Master on slave
Post by: nutellinoit on July 11, 2017, 02:03:30 pm
Hi everyone, I'm experiencing some strange behaviour.

My setup is two OPNsense firewalls, both updated to 17.1.9, with 16 VIPs.

Attached is a screenshot of the ifconfig output.

State transfer between the two firewalls works correctly (pfsync shows the same state table size on both), XMLRPC sync runs from the master to the slave, and everything works fine. If I stop or unplug the master firewall, the slave becomes master and keeps the open connections alive (1-5 packets lost during the switchover).

During the day there are rare cases of the slave firewall flapping BACKUP -> MASTER, but at night all 16 VIPs switch to MASTER (about once per hour; as a workaround I added a crontab entry to reboot the slave every hour...), with dmesg saying MASTER TIMED OUT. Sometimes only the LAN VIP switches from BACKUP -> MASTER, also reporting that the master timed out.

The two firewalls are Xen virtual machines in the same XenServer cluster; the switch on the LAN interface is a Dell PowerConnect 5524.

What can I do to debug this? Is there no automatic recovery from this strange behaviour?

In case it helps, this morning the master had these dmesg log entries:

Code:
carp: demoted by 240 to 240 (send error 55 on xn0)
carp: demoted by 240 to 480 (send error 55 on xn0)
carp: demoted by 240 to 720 (send error 55 on xn0)
carp: demoted by 240 to 960 (send error 55 on xn0)
carp: demoted by 240 to 1200 (send error 55 on xn0)
carp: demoted by 240 to 1440 (send error 55 on xn0)
carp: demoted by 240 to 1680 (send error 55 on xn0)
carp: demoted by 240 to 1920 (send error 55 on xn0)
carp: demoted by 240 to 2160 (send error 55 on xn0)
carp: demoted by 240 to 2400 (send error 55 on xn0)
carp: demoted by 240 to 2640 (send error 55 on xn0)
carp: demoted by 240 to 2880 (send error 55 on xn0)
carp: demoted by 240 to 3120 (send error 55 on xn0)
carp: demoted by 240 to 3360 (send error 55 on xn0)
carp: demoted by 240 to 3600 (send error 55 on xn0)
carp: demoted by 240 to 3840 (send error 55 on xn0)
carp: demoted by 240 to 4080 (send error 55 on xn0)
carp: demoted by 240 to 4320 (send error 55 on xn0)
carp: demoted by 240 to 4560 (send error 55 on xn0)
carp: demoted by 240 to 4800 (send error 55 on xn0)
carp: demoted by 240 to 5040 (send error 55 on xn0)
carp: demoted by 240 to 5280 (send error 55 on xn0)
carp: demoted by 240 to 5520 (send error 55 on xn0)
carp: demoted by 240 to 5760 (send error 55 on xn0)
carp: demoted by 240 to 6000 (send error 55 on xn0)
carp: demoted by 240 to 6240 (send error 55 on xn0)
carp: demoted by 240 to 6480 (send error 55 on xn0)
carp: demoted by 240 to 6720 (send error 55 on xn0)
carp: demoted by 240 to 6960 (send error 55 on xn0)
carp: demoted by 240 to 7200 (send error 55 on xn0)
carp: demoted by 240 to 7440 (send error 55 on xn0)
carp: demoted by 240 to 7680 (send error 55 on xn0)
carp: demoted by 240 to 7920 (send error 55 on xn0)
carp: demoted by 240 to 8160 (send error 55 on xn0)
carp: demoted by 240 to 8400 (send error 55 on xn0)
carp: demoted by 240 to 8640 (send error 55 on xn0)
carp: demoted by 240 to 8880 (send error 55 on xn0)
carp: demoted by 240 to 9120 (send error 55 on xn0)
carp: demoted by 240 to 9360 (send error 55 on xn0)
carp: demoted by 240 to 9600 (send error 55 on xn0)
carp: demoted by 240 to 9840 (send error 55 on xn0)
carp: demoted by 240 to 10080 (send error 55 on xn0)
carp: demoted by 240 to 10320 (send error 55 on xn0)
carp: demoted by 240 to 10560 (send error 55 on xn0)
carp: demoted by 240 to 10800 (send error 55 on xn0)
carp: demoted by 240 to 11040 (send error 55 on xn0)
carp: demoted by 240 to 11280 (send error 55 on xn0)
carp: demoted by 240 to 11520 (send error 55 on xn0)
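
A note on reading this log: on FreeBSD, errno 55 is ENOBUFS ("No buffer space available"), so these 48 lines suggest the master repeatedly failed to queue its CARP advertisements on xn0, raising the demotion counter by 240 each time. If advertisements are not going out, that would fit the backup reporting that the master timed out, though this is only a likely reading and not a confirmed diagnosis. A minimal Python sketch for summarizing such lines from saved dmesg output follows; the names summarize_demotions, FREEBSD_ERRNO, and sample are made up for illustration, and the sample is just the first three lines of the log above.

```python
import re

# FreeBSD errno value seen in the log above (from FreeBSD's sys/errno.h).
# Hardcoded on purpose: Python's errno module reflects the local OS, not FreeBSD.
FREEBSD_ERRNO = {55: "ENOBUFS (No buffer space available)"}

# Matches lines such as: carp: demoted by 240 to 480 (send error 55 on xn0)
DEMOTE_RE = re.compile(
    r"carp: demoted by (-?\d+) to (-?\d+) \(send error (\d+) on (\w+)\)"
)


def summarize_demotions(dmesg_text):
    """Return {(interface, errno): {"count": n, "last_total": t}}."""
    summary = {}
    for match in DEMOTE_RE.finditer(dmesg_text):
        _step, total, err, iface = match.groups()
        entry = summary.setdefault(
            (iface, int(err)), {"count": 0, "last_total": 0}
        )
        entry["count"] += 1
        entry["last_total"] = int(total)
    return summary


# Hypothetical input: a shortened copy of the log above.
sample = """\
carp: demoted by 240 to 240 (send error 55 on xn0)
carp: demoted by 240 to 480 (send error 55 on xn0)
carp: demoted by 240 to 720 (send error 55 on xn0)
"""

if __name__ == "__main__":
    for (iface, err), info in summarize_demotions(sample).items():
        name = FREEBSD_ERRNO.get(err, "unknown")
        print(f"{iface}: {info['count']} send errors, errno {err} = {name}, "
              f"demotion now {info['last_total']}")
```

Running the same function over a full `dmesg` dump from each node would show which interface and which error dominate, which may help decide whether to look at the Xen network backend or the switch first.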


And I think CARP is working, because when I restart the slave firewall it logs:

Code:
xn0: performing interface reset due to feature change
xn0: backend features: feature-sg feature-gso-tcp4
xn0: performing interface reset due to feature change
xn0: backend features: feature-sg feature-gso-tcp4
xn0: 2 link states coalesced
xn0: link state changed to UP
xn1: performing interface reset due to feature change
xn1: backend features: feature-sg feature-gso-tcp4
xn1: performing interface reset due to feature change
xn1: backend features: feature-sg feature-gso-tcp4
xn1: 2 link states coalesced
xn1: link state changed to UP
xn2: performing interface reset due to feature change
xn2: backend features: feature-sg feature-gso-tcp4
xn2: performing interface reset due to feature change
xn2: backend features: feature-sg feature-gso-tcp4
xn2: 2 link states coalesced
xn2: link state changed to UP
xn2: promiscuous mode enabled
carp: 1@xn2: INIT -> BACKUP (initialization complete)
xn0: promiscuous mode enabled
carp: 2@xn0: INIT -> BACKUP (initialization complete)
carp: 3@xn2: INIT -> BACKUP (initialization complete)
carp: 4@xn0: INIT -> BACKUP (initialization complete)
carp: 5@xn2: INIT -> BACKUP (initialization complete)
carp: 6@xn0: INIT -> BACKUP (initialization complete)
carp: 7@xn2: INIT -> BACKUP (initialization complete)
carp: 8@xn0: INIT -> BACKUP (initialization complete)
carp: 9@xn2: INIT -> BACKUP (initialization complete)
carp: 10@xn0: INIT -> BACKUP (initialization complete)
carp: 11@xn2: INIT -> BACKUP (initialization complete)
carp: 12@xn0: INIT -> BACKUP (initialization complete)
carp: 13@xn2: INIT -> BACKUP (initialization complete)
carp: 14@xn0: INIT -> BACKUP (initialization complete)
carp: 15@xn2: INIT -> BACKUP (initialization complete)
carp: 16@xn0: INIT -> BACKUP (initialization complete)
carp: demoted by 240 to 240 (pfsync bulk start)
pflog0: promiscuous mode enabled
carp: demoted by -240 to 0 (pfsync bulk done)
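
To see which VHIDs flap and on which interface, the state-change lines in this format can be extracted from dmesg. The following is a rough Python sketch under stated assumptions: parse_transitions and promotions_by_interface are hypothetical helper names, and the last sample line is not from my logs but written by hand to resemble the "master timed out" message mentioned earlier.

```python
import re
from collections import defaultdict

# Matches lines such as: carp: 2@xn0: INIT -> BACKUP (initialization complete)
TRANSITION_RE = re.compile(r"carp: (\d+)@(\w+): (\w+) -> (\w+) \(([^)]*)\)")


def parse_transitions(dmesg_text):
    """Return a list of (vhid, iface, old_state, new_state, reason) tuples."""
    return [
        (int(vhid), iface, old, new, reason)
        for vhid, iface, old, new, reason in TRANSITION_RE.findall(dmesg_text)
    ]


def promotions_by_interface(transitions):
    """Group the VHIDs that went BACKUP -> MASTER by interface."""
    promoted = defaultdict(list)
    for vhid, iface, old, new, _reason in transitions:
        if old == "BACKUP" and new == "MASTER":
            promoted[iface].append(vhid)
    return dict(promoted)


# Hypothetical input: two lines copied from the boot log above, plus one
# hand-written line illustrating a takeover (assumed format).
sample = """\
carp: 1@xn2: INIT -> BACKUP (initialization complete)
carp: 2@xn0: INIT -> BACKUP (initialization complete)
carp: 2@xn0: BACKUP -> MASTER (master timed out)
"""

if __name__ == "__main__":
    for iface, vhids in promotions_by_interface(parse_transitions(sample)).items():
        print(f"{iface}: VHIDs promoted to MASTER: {vhids}")
```

If only the VHIDs on one interface show up here, that would point at that interface's path (virtual NIC or switch port) instead of at CARP as a whole.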



Thank you!
Title: Re: CARP Randomly goes to Master on slave
Post by: Wayne Train on July 12, 2017, 10:28:30 am
Hi,
I also have some CARP trouble with my machines, but they are physical and not virtualized. Maybe read through my posts on the forum to see if we're experiencing the same issue. Furthermore, since you mentioned that you use Xen, maybe the issue you're seeing is related to this:

https://doc.pfsense.org/index.php/CARP_Configuration_Troubleshooting

There are some special settings that need to be in place when using CARP in a virtual environment. I also had problems at first when trying to simulate this with VirtualBox, and this article helped me a lot.

Since you also have multiple VLANs on one interface: what happens if you shut down one VLAN on the primary? Does your setup fail over completely to the backup node, or do you get a split brain? If I do this on the LAN side, WAN moves over to the backup node, but some VLANs remain active on the original master's LAN side. Do you see the same issue?

Best regards,
Wayne
Title: Re: CARP Randomly goes to Master on slave
Post by: nutellinoit on July 13, 2017, 09:20:56 am
If I shut down the master firewall, the backup becomes master correctly on all interfaces. If I start the master firewall up again, all of its interfaces stay in BACKUP (with the Disable preempt option enabled) while the slave remains master. I will check the Dell switch for any strange configuration. I have more problems on the LAN interfaces than on WAN.
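
Since preemption and demotion both influence which node ends up as MASTER, comparing the CARP sysctl values on the two nodes may be a useful check. Below is a small Python sketch that parses `sysctl` output. The sysctl names are the ones documented in FreeBSD's carp(4); that OPNsense 17.1 exposes them unchanged, and that the Disable preempt option maps to net.inet.carp.preempt, are assumptions on my part. The function names and the sample values are illustrative only.

```python
import subprocess

# Sysctl names as documented in FreeBSD's carp(4) man page.
CARP_SYSCTLS = [
    "net.inet.carp.preempt",
    "net.inet.carp.demotion",
    "net.inet.carp.senderr_demotion_factor",
]


def parse_sysctl_output(text):
    """Parse 'name: value' lines as printed by sysctl into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.lstrip("-").isdigit():
            values[name.strip()] = int(value)
    return values


def read_carp_sysctls():
    """Query the local system; only meaningful on a FreeBSD-based host."""
    out = subprocess.run(
        ["sysctl", *CARP_SYSCTLS], capture_output=True, text=True, check=True
    ).stdout
    return parse_sysctl_output(out)


# Hand-typed example for illustration, not captured from a real system.
sample = """\
net.inet.carp.preempt: 0
net.inet.carp.demotion: 11520
net.inet.carp.senderr_demotion_factor: 240
"""

if __name__ == "__main__":
    for name, value in parse_sysctl_output(sample).items():
        print(f"{name} = {value}")
```

On the firewall itself, read_carp_sysctls() would return the live values; a demotion value well above 0 on the master during normal operation would be worth investigating.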

Thank you, Wayne