CARP replay protection counter on *sense distros

Started by greg124816, February 14, 2019, 10:49:28 PM

February 14, 2019, 10:49:28 PM Last Edit: February 14, 2019, 10:57:08 PM by greg124816
I was migrating an old pair of redundant firewalls running pfsense to new hardware running opnsense.

What I discovered is that the CARP implementations are NOT compatible: both firewalls became MASTER when I tried to swap in the new B firewall while leaving the old A firewall running.

After much confusion, the reason appears to be that opnsense has a properly incrementing replay protection counter in each CARP advertisement, while pfsense's counter is static and unchanging. Since the counters never match, each firewall ignores the other's packets, so both stay MASTER and continue to send their advertisements. There are 2 counter fields; I'm not sure which one tcpdump is showing, but whichever one it is, it definitely does not match between opnsense and pfsense.
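
To make the "2 counter fields" concrete, here is a minimal Python sketch that unpacks a 36-byte CARP advertisement. The field layout is an assumption taken from my reading of FreeBSD's struct carp_header (version/type, vhid, advskew, authlen, pad, advbase, checksum, two 32-bit counter words, 20-byte HMAC); the function name parse_carp_header is made up for illustration. The counter values in the traces below are larger than 2^32, which suggests tcpdump prints both words joined as one 64-bit number, so the sketch does the same.

```python
import struct

# Assumed on-wire layout (network byte order, no padding), 36 bytes total:
#   B  version (high nibble) / type (low nibble)
#   B  vhid
#   B  advskew
#   B  authlen   (size of counter + HMAC in 32-bit words; 7 in the traces)
#   B  pad
#   B  advbase
#   H  checksum
#   2I counter[0], counter[1]
#   20s SHA1 HMAC
CARP_HEADER = struct.Struct("!BBBBBBH2I20s")


def parse_carp_header(payload: bytes) -> dict:
    """Decode the fixed CARP header from the start of an IP payload."""
    if len(payload) < CARP_HEADER.size:
        raise ValueError("payload shorter than a CARP header")
    (ver_type, vhid, advskew, authlen, _pad, advbase,
     cksum, counter_hi, counter_lo, md) = CARP_HEADER.unpack(
        payload[:CARP_HEADER.size])
    return {
        "version": ver_type >> 4,
        "type": ver_type & 0x0F,
        "vhid": vhid,
        "advskew": advskew,
        "authlen": authlen,
        "advbase": advbase,
        "cksum": cksum,
        # Join the two 32-bit words into the single 64-bit value
        # that appears as counter=... in the traces.
        "counter": (counter_hi << 32) | counter_lo,
        "md": md,
    }
```

Feeding it the raw IP payload of one advertisement from each firewall would show whether both 32-bit words are static on pfsense, or only one of them.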

I verified the same incrementing replay protection counter behavior on several uCARP https://www.pureftpd.org/project/ucarp systems we have running.

Here are some example tcpdump traces, all captured from a single pfsense host looking out two different interfaces.

Looking at an opnsense 19.1 host (the initial post had the wrong trace for the opnsense host; it's corrected now):
[2.3.4-RELEASE][root@localhost]/root: tcpdump -T carp -ni em1 vrrp and host 192.168.100.2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1, link-type EN10MB (Ethernet), capture size 65535 bytes
13:05:27.216731 IP 192.168.100.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=6491304834018196506
13:05:28.218369 IP 192.168.100.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=6491304834018196507
13:05:29.220678 IP 192.168.100.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=6491304834018196508
13:05:30.222307 IP 192.168.100.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=6491304834018196509


Looking at a Linux CentOS host running ucarp 1.5.1:
[2.3.4-RELEASE][root@localhost]/root: tcpdump -T carp -ni em1 vrrp and host 192.168.100.7
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1, link-type EN10MB (Ethernet), capture size 65535 bytes
12:57:37.999181 IP 192.168.100.7 > 224.0.0.18: CARPv2-advertise 36: vhid=5 advbase=1 advskew=120 authlen=7 counter=5446066235882559562
12:57:39.112282 IP 192.168.100.7 > 224.0.0.18: CARPv2-advertise 36: vhid=5 advbase=1 advskew=120 authlen=7 counter=5446066235882559563
12:57:40.999080 IP 192.168.100.7 > 224.0.0.18: CARPv2-advertise 36: vhid=5 advbase=1 advskew=120 authlen=7 counter=5446066235882559564
12:57:42.115312 IP 192.168.100.7 > 224.0.0.18: CARPv2-advertise 36: vhid=5 advbase=1 advskew=120 authlen=7 counter=5446066235882559565



Looking at its own CARP adverts (pfsense 2.3.4):
[2.3.4-RELEASE][root@localhost]/root: tcpdump -T carp -ni em0 vrrp and host 192.168.101.2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em0, link-type EN10MB (Ethernet), capture size 65535 bytes
12:59:24.369135 IP 192.168.101.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=16106432045254150054
12:59:25.370135 IP 192.168.101.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=16106432045254150054
12:59:26.371129 IP 192.168.101.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=16106432045254150054
12:59:27.372131 IP 192.168.101.2 > 224.0.0.18: CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 authlen=7 counter=16106432045254150054
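
For anyone who wants to check their own hosts, here is a small Python sketch that reads tcpdump lines in the format shown above and reports whether the counter is incrementing or static. The function name classify_counters and the category labels are my own; it only looks at the counter= field and assumes all lines come from the same sender.

```python
import re

# Matches the counter=<number> field printed by "tcpdump -T carp".
COUNTER_RE = re.compile(r"counter=(\d+)")


def classify_counters(lines):
    """Classify a sequence of tcpdump CARP lines from one sender.

    Returns "static" if every counter is identical, "incrementing" if
    each counter is larger than the previous one, "irregular" for
    anything else, and "unknown" if fewer than two counters were found.
    """
    counters = []
    for line in lines:
        match = COUNTER_RE.search(line)
        if match:
            counters.append(int(match.group(1)))
    if len(counters) < 2:
        return "unknown"
    deltas = [b - a for a, b in zip(counters, counters[1:])]
    if all(d == 0 for d in deltas):
        return "static"
    if all(d > 0 for d in deltas):
        return "incrementing"
    return "irregular"
```

Saving a capture to a file and passing its lines to classify_counters gives a quick yes/no on whether a given host increments its counter.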


I can see from a 2004 presentation on OpenBSD CARP that the replay detection counters were "not implemented yet" at that time.
See page 10 in this pdf: https://cyber-defense.sans.org/resources/papers/gsec/carp-free-fail-over-protocol-106433

The FreeBSD initial commit of CARP code seems to have been in 2005 and to have included the incrementing counter (sc_counter++) here:
https://github.com/freebsd/freebsd/blob/e1d22638d0a8257ed01b7f95d1b6d5cef74ebd07/sys/netinet/ip_carp.c#L747

The above code with sc_counter++ is what is present in the opnsense github source.

Finally, this pfsense code on github is supposed to be from FreeBSD with their changes:
https://github.com/pfsense/FreeBSD-src/blob/RELENG_2_2/sys/netinet/ip_carp.c#L715





Does anyone have any comments on the history of this?
Am I somehow confused or totally out of the loop, and is this a known incompatibility?
Did opnsense switch to the FreeBSD version of ip_carp.c at some point, perhaps when the fork occurred?

I haven't looked at tcpdump's CARP parsing code yet, but it's possible it is parsing one of the two counters and pfsense is using the other one. Either way, it does not seem to be compatible with opnsense, FreeBSD, or ucarp.

I have a couple more pairs to migrate. I don't see where even the latest version of pfsense has the incrementing counter (based on the code shown on github), so I'm not sure whether upgrading to the latest pfsense would get me an incrementing counter so I could migrate more easily. I guess I'll have to bring one up in a VM and check.

If I'm right, this CARP incompatibility is important to know if you are planning to "smoothly" migrate a redundant pair of firewalls from pfsense to opnsense: they will not cooperate on CARP, so it has to be a hard cut from one to the other rather than a graceful handover.