[SOLVED] High-Availability + CARP IP + No Traffic

Started by romain, July 07, 2015, 06:11:27 PM

July 07, 2015, 06:11:27 PM Last Edit: August 07, 2015, 10:05:26 AM by franco
Hello,

I am continuing to test OPNsense in depth, but I have run into a problem.

I have two identical boxes, each with 4x 1GbE Intel ports and 2x 10GbE Emulex ports. I have a lagg configured in failover mode across the two 10GbE ports, with 5 tagged VLANs going through this lagg. Everything works fine.

I tried to configure HA between the two boxes. I added CARP VIPs on every VLAN, but they behave strangely.

The gateway of each VLAN subnet is its CARP VIP. Everything seems to be fine on the OPNsense side: the master holds the CARP IPs and the backup waits for the master to fail (when I reboot the master, the backup takes over the VIPs correctly). However, if I try to ping or route through a CARP IP, nothing works unless the client is a FreeBSD machine too; in that case it works. If I take a Windows machine plugged into the same switch with the same tag configuration, it does not work at all.

Looking deeper, I can see that both firewalls can ping and reach the Windows machine from their own IPs. But if I do a ping -S VIP_ADDRESS IP_WINDOWS, it does not work.

In the other direction, if I try to ping the VIP of the subnet, I get a timeout. But if I look at the ARP table, I can see the right MAC address as defined by the CARP protocol (00:00:...:01).

I tried disabling the firewall to see whether my issue was related to missing rules, but no: it does not work any better.

I'm pretty sure my CARP setup is okay because the WAN side works well with an OpenVPN server.

Does anyone have an idea of what is going on and what I'm doing wrong?

Please let me know if you need any more information.

Romain

July 07, 2015, 09:02:40 PM #1 Last Edit: July 07, 2015, 09:05:15 PM by romain
I am continuing to debug. I found two very strange things:

The CARP announcement packets have public IPs inside. Shouldn't they only contain IPs from the same subnet (my two firewalls are 172.28.11.101 and 172.28.11.102)?

172.28.11.101 > vrrp.mcast.net: vrrp 172.28.11.101 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 3, prio 0, authtype none, intvl 1s, length 36, addrs(7): p4FE15735.dip0.t-ipconnect.de,251.222.243.34,66.146.73.124.broad.dynamic.hf.ah.cndata.com,127.76.101.79,251.40.1.5,36.138.207.21,sto95-4-88-178-136-1.fbx.proxad.net

I also see a great many "bad cksum 0" messages on different types of packets (CARP announcements or ICMP):

21:00:08.219352 IP (tos 0x10, ttl 255, id 46264, offset 0, flags [DF], proto VRRP (112), length 56, bad cksum 0 (->2ef9)!)


root@OPNSENSE:~ # tcpdump -i lagg0_vlan8 -vvv "carp"
tcpdump: listening on lagg0_vlan8, link-type EN10MB (Ethernet), capture size 65535 bytes
21:00:08.219352 IP (tos 0x10, ttl 255, id 46264, offset 0, flags [DF], proto VRRP (112), length 56, bad cksum 0 (->2ef9)!)
    172.28.11.101 > vrrp.mcast.net: vrrp 172.28.11.101 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 3, prio 0, authtype none, intvl 1s, length 36, addrs(7): p4FE15735.dip0.t-ipconnect.de,251.222.243.28,15.sub-97-205-206.myvzw.com,adsl-75-24-212-125.dsl.pltn13.sbcglobal.net,dynamic.sdtv.net.tw,219.164.243.201,softbank126252245163.bbtec.net
21:00:09.220242 IP (tos 0x10, ttl 255, id 30484, offset 0, flags [DF], proto VRRP (112), length 56, bad cksum 0 (->6c9d)!)
    172.28.11.101 > vrrp.mcast.net: vrrp 172.28.11.101 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 3, prio 0, authtype none, intvl 1s, length 36, addrs(7): p4FE15735.dip0.t-ipconnect.de,251.222.243.29,143.126.25.101,149.104.16.164,c-68-36-70-172.hsd1.mi.comcast.net,slip139-92-30-202.fra.de.prserv.net,142.41.200.122
21:00:10.221352 IP (tos 0x10, ttl 255, id 15770, offset 0, flags [DF], proto VRRP (112), length 56, bad cksum 0 (->a617)!)
    172.28.11.101 > vrrp.mcast.net: vrrp 172.28.11.101 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 3, prio 0, authtype none, intvl 1s, length 36, addrs(7): p4FE15735.dip0.t-ipconnect.de,251.222.243.30,147.16.210.213,58.204.203.159,44.59.163.33,42.213.235.216,168.192.80.249
21:00:11.222240 IP (tos 0x10, ttl 255, id 14066, offset 0, flags [DF], proto VRRP (112), length 56, bad cksum 0 (->acbf)!)
    172.28.11.101 > vrrp.mcast.net: vrrp 172.28.11.101 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 3, prio 0, authtype none, intvl 1s, length 36, addrs(7): p4FE15735.dip0.t-ipconnect.de,251.222.243.31,ip-109-90-25-189.hsi11.unitymediagroup.de,51.192.163.97,55.187.192.51,118.201.211.21,64.16.244.104
21:00:12.223336 IP (tos 0x10, ttl 255, id 8975, offset 0, flags [DF], proto VRRP (112), length 56, bad cksum 0 (->c0a2)!)
    172.28.11.101 > vrrp.mcast.net: vrrp 172.28.11.101 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 3, prio 0, authtype none, intvl 1s, length 36, addrs(7): p4FE15735.dip0.t-ipconnect.de,251.222.243.32,30.28.37.249,202.5.199.134,169.62.123.150,236.221.81.16,133.206.52.220
21:00:13.224235 IP (tos 0x10, ttl 255, id 4142, offset 0, flags [DF], proto VRRP (112), length 56, bad cksum 0 (->d383)!)
    172.28.11.101 > vrrp.mcast.net: vrrp 172.28.11.101 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 3, prio 0, authtype none, intvl 1s, length 36, addrs(7): p4FE15735.dip0.t-ipconnect.de,251.222.243.33,163.0.108.130,softbank219053055176.bbtec.net,199.188.240.51,c-67-182-72-116.hsd1.ca.comcast.net,233.200.43.96
21:00:14.225353 IP (tos 0x10, ttl 255, id 32762, offset 0, flags [DF], proto VRRP (112), length 56, bad cksum 0 (->63b7)!)
^C
    172.28.11.101 > vrrp.mcast.net: vrrp 172.28.11.101 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 3, prio 0, authtype none, intvl 1s, length 36, addrs(7): p4FE15735.dip0.t-ipconnect.de,251.222.243.34,66.146.73.124.broad.dynamic.hf.ah.cndata.com,127.76.101.79,251.40.1.5,36.138.207.21,sto95-4-88-178-136-1.fbx.proxad.net


Any ideas?
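A note on reading this capture: CARP shares IP protocol number 112 with VRRP, and tcpdump decodes it as VRRPv2 unless told otherwise, so the addrs(7) list is likely CARP's own payload printed as if it were addresses rather than real IPs. Also, a "bad cksum 0" seen on the sending host can simply mean the checksum is filled in later by the NIC (checksum offload), so it is worth capturing on the peer firewall as well. A minimal sketch for comparing the two sides, assuming POSIX sh; count_bad_cksum is a hypothetical helper name, and -T carp assumes a tcpdump build that knows the carp type:

```shell
#!/bin/sh
# Count capture lines that tcpdump flagged with a bad IP checksum.
# Reads `tcpdump -vvv` text on stdin and prints a single number.
count_bad_cksum() {
    # grep -c exits non-zero when the count is 0; keep the function's status clean.
    grep -c 'bad cksum' || true
}

# Example (run on each firewall; interface name taken from the capture above):
#   tcpdump -i lagg0_vlan8 -c 20 -vvv -T carp carp | count_bad_cksum
```

A nonzero count on the sender together with zero on the peer would point towards offload rather than packets being corrupted on the wire.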

I read that it can be caused by TSO and LRO, which are active on my network card. I can disable them with the ifconfig command. However, TSO6 always stays enabled, and when I reboot the firewall the options come back.

As I'm using lagg, the options are set at firewall startup. How can I be sure that these options are disabled permanently?

I use the oce.ko driver delivered directly by Emulex for FreeBSD 10.1.

You can disable LRO and/or TSO in the GUI under System -> Settings -> Networking.

LRO is known to cause issues with a lot of hardware, so you had better disable it.
TSO usually works well, but if not, disable it as well.



Thanks. I already did that, but I'm not sure it took effect. How can I be sure?

I still have the options activated on my network card.

I also changed the value in the sysctl.
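One way to answer "how can I be sure?" is to read the options= line that ifconfig prints for the physical interface and list the offload flags that are still present. A minimal sketch, assuming POSIX sh and the ifconfig output format shown in this thread; enabled_offloads is a hypothetical helper name, not something shipped with OPNsense:

```shell
#!/bin/sh
# Print the offload-related capability flags that are still enabled,
# one per line. Reads `ifconfig <iface>` output on stdin.
enabled_offloads() {
    # Keep only the interface "options=<hex><A,B,C>" line; the
    # "nd6 options=" line does not match because of the "nd6 " prefix.
    sed -n 's/^[[:space:]]*options=[0-9a-fx]*<\(.*\)>.*$/\1/p' |
        tr ',' '\n' |
        grep -E '^(RXCSUM|TXCSUM|TSO4|TSO6|LRO|VLAN_HWFILTER|VLAN_HWTSO)$'
}

# Example: ifconfig oce0 | enabled_offloads
# No output (and a non-zero exit status) means none of these flags is set.
```

Running it for oce0, oce1 and the lagg interface right after a reboot shows whether the GUI setting or a boot-time command actually stuck.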

Hello,

I would like to set the options of my network card permanently. How can I do that? If I run ifconfig oce0 -vlanhwfilter it works for now, but after a reboot these changes are gone. As I'm using lagg on this interface, I need to force the option at boot.

Any idea how to do it cleanly?

Thank you!

We can add a knob for vlanhwfilter in the GUI. For now, you'll have to put the custom command into e.g. /usr/local/etc/rc before the rc.bootup invocation. Please note this will get wiped on firmware updates as well. Ticket here:

https://github.com/opnsense/core/issues/252

Basically it's not only for vlanhwfilter, it's for every option. It would be great to have a place where we can choose which options to activate or not. In my case something like:

ifconfig oce0 -lro -tso -tso4 -tso6 -rxcsum -txcsum
ifconfig oce1 -lro -tso -tso4 -tso6 -rxcsum -txcsum

That would configure the interfaces the way I want.

Thank you, Franco, for the tip, but I can't make it work. Here is what I did:

/usr/local/etc/rc

#MODIF ROMAIN
echo -n "Modification ifconfig oce0..."
ifconfig oce0 -lro -tso -tso4 -tso6 -rxcsum -txcsum > /root/oce0.txt 2>&1
echo -n "Modification ifconfig oce1..."
ifconfig oce1 -lro -tso -tso4 -tso6 -rxcsum -txcsum > /root/oce1.txt 2>&1

# let the PHP-based configuration subsystem set up the system now
echo -n "Launching the init system..."
rm -f /root/lighttpd*
touch /var/run/booting
/usr/local/etc/rc.bootup
rm /var/run/booting


The files oce1.txt and oce0.txt are created. But if I run ifconfig right after boot, the removed options are still there:


root@TEST:~ # ifconfig oce1
oce1: flags=8043<UP,BROADCAST,RUNNING,MULTICAST> metric 0 mtu 1500
        options=507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO>
        ether 00:90:fa:9d:29:d8
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active
root@TEST:~ # ifconfig oce0
oce0: flags=8043<UP,BROADCAST,RUNNING,MULTICAST> metric 0 mtu 1500
        options=507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO>
        ether 00:90:fa:9d:29:d8
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active


Meh, ok. I'll take a closer look soon, thanks for testing.

Thank you. Please let me know; this is quite blocking for my configuration at the moment.

Let me know if you want me to test anything.

Sorry it was here.

Do you need me to test a fix? I would like to be able to manage the options set on my network card before the lagg is created.

Hello Franco,

Did you have time to look at my bug?

Thank you for your work anyway.

15.7.4 has a new option under "System: Settings: Networking", see attached screenshot. Could you try this and see if it helps your case?

Same problem. I checked the option and rebooted.

After the reboot, the options are still there:


ifconfig oce0
oce0: flags=8043<UP,BROADCAST,RUNNING,MULTICAST> metric 0 mtu 1500
        options=507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO>
        ether 00:90:fa:9d:29:d8
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active


I'm trying to deactivate RXCSUM, TXCSUM and TSO too, but I can't find a way to do it properly.