OPNsense Forum

Archive => 21.1 Legacy Series => Topic started by: sesquipedality on July 22, 2021, 12:50:43 pm

Title: Automatic CARP demotion on link failure
Post by: sesquipedality on July 22, 2021, 12:50:43 pm
Hello, I have two routers connected to different uplinks in a redundant CARP setup, one configured as primary CARP, the other as backup.  I would like the primary machine to demote itself when its uplink fails and to re-establish itself as primary when the link comes back up.  Is there a way to do this with OPNsense? Thanks.
Title: Re: Automatic CARP demotion on link failure
Post by: mimugmail on July 22, 2021, 03:00:12 pm
If both machines are not connected on all ports to all interfaces it's not a supported CARP design, sorry.
Title: Re: Automatic CARP demotion on link failure
Post by: sesquipedality on July 22, 2021, 05:11:52 pm
That seems like a shame.  I'm currently considering firing up two OPNsense instances (in VMs) to act as pure WAN interfaces (WAN on one side, unfirewalled VPN on the other), so that my redundant firewall routers (also VMs) can talk to both interfaces via the unfirewalled VPN.  That seems like a very complex way to accomplish something that I think ought to be a lot simpler.
Title: Re: Automatic CARP demotion on link failure
Post by: sesquipedality on July 23, 2021, 07:22:53 pm
OK, so it turns out that this is in fact totally possible using monit.

In my design I have used a primary and backup router, but there may be appropriate tweaks for other setups.  CARP setup is the same for both routers, save that the skew on the backup machine is set to 50 (any number between 0 and the number used for demotion will do).  I have set monit to run every 5 seconds.  This does not seem to have a noticeable performance impact on my router.
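
The reasoning behind the skew choice can be checked with a little arithmetic.  FreeBSD adds the demotion factor to the advskew each node advertises, and the node advertising the lower skew wins the election.  The following is only a sketch, assuming the primary is configured with a skew of 0, both nodes use the same advbase, and preemption is enabled; the helper name effective_skew is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: advertised skew = configured skew + demotion factor.
effective_skew() {
    echo $(( $1 + $2 ))
}

PRIMARY_SKEW=0      # assumption: primary configured with skew 0
BACKUP_SKEW=50      # value used in this post
DEMOTION_VALUE=100  # amount added by set-carp-demotion on link failure

healthy=$(effective_skew "$PRIMARY_SKEW" 0)
demoted=$(effective_skew "$PRIMARY_SKEW" "$DEMOTION_VALUE")

echo "primary healthy: $healthy, demoted: $demoted, backup: $BACKUP_SKEW"
```

With these numbers the healthy primary (0) beats the backup (50), while the demoted primary (100) loses to it, which is why the backup skew has to sit between 0 and the demotion value.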

The whole process is managed by a couple of shell scripts.  You will need the "bash" package installed to use them.  (Sorry, I do not do csh.)  Doubtless these could be much tidier and more robust than they are, but perhaps someone can use them as a starting point for their own more sensible scripts.  The "uplink-status" script monitors a named gateway, and dpinger monitoring will have to be configured for that gateway.  It ignores packet loss and other statuses, and stores the last (fully up or fully down) gateway status in /usr/local/var/run.  It will emit a non-zero return code when the link changes from its previous status to the opposite one.

Code:
#!/usr/local/bin/bash
#
# uplink-status - check whether we have internet connectivity
#

UPLINK_NAME="MY_GW"
UPLINK_STATE_FILE=/usr/local/var/run/uplink-status

LINK_UP=2
LINK_DOWN=1

# Cleanup
quit() {
    EC=$1

    exit $EC
}

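# Exit codes 1 and 2 are reserved for link changes, so report errors
# on stderr and exit 0 to keep monit from acting on them.
checkrc() {
    PRC=$1; if [ "$PRC" -ne 0 ]; then
       echo "$0: Exiting due to error ($PRC)" 1>&2
       exit 0
    fi
}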

GATEWAY_STATUS_JSON=$(pluginctl -r return_gateways_status)
UPLINK_PRESENT=$(echo "$GATEWAY_STATUS_JSON" | grep -c "$UPLINK_NAME")

if [ "$UPLINK_PRESENT" -eq 0 ]; then
    CUR_STATUS="down"

else

    # Check dpinger status of the uplink
    CUR_STATUS=$(echo "$GATEWAY_STATUS_JSON" | python3 -c "import sys, json; print(json.load(sys.stdin)['dpinger']['${UPLINK_NAME}']['status'])")
    checkrc $?
fi

if [ ! -f "$UPLINK_STATE_FILE" ]; then
   echo "up" > "$UPLINK_STATE_FILE"
   checkrc $?
fi

LAST_STATUS=$(cat "$UPLINK_STATE_FILE")
checkrc $?

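# dpinger status "none" is treated as up; anything other than "none" or
# "down" (packet loss, for example) leaves the stored state unchanged.
if [ "$LAST_STATUS" = "up" ] && [ "$CUR_STATUS" = "down" ]; then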
    echo "down" > $UPLINK_STATE_FILE
    echo "down"
    quit $LINK_DOWN
elif [ "$LAST_STATUS" = "down" ] && [ "$CUR_STATUS" = "none" ]; then
    echo "up" > $UPLINK_STATE_FILE
    echo "up"
    quit $LINK_UP
else
    if [ "$CUR_STATUS" = "none" ]; then CUR_STATUS="up"; fi
    echo "no change ($CUR_STATUS)"
    quit 0
fi
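
The state comparison at the end of the script can be tried out without pluginctl or a real gateway.  This is a minimal sketch that copies only the final if/elif/else into a function; the name uplink_transition is hypothetical, and the string "loss" merely stands in for any status other than "none" or "down".

```shell
#!/bin/sh
# Hypothetical helper mirroring the final if/elif/else of uplink-status.
# Prints the exit code the script would return for a given pair of states.
LINK_UP=2
LINK_DOWN=1

uplink_transition() {
    last=$1   # contents of the state file: "up" or "down"
    cur=$2    # dpinger status: "none" (healthy), "down", or something else
    if [ "$last" = "up" ] && [ "$cur" = "down" ]; then
        echo "$LINK_DOWN"
    elif [ "$last" = "down" ] && [ "$cur" = "none" ]; then
        echo "$LINK_UP"
    else
        echo 0
    fi
}

uplink_transition up down     # link just failed
uplink_transition down down   # still down, nothing to report
uplink_transition down none   # link just recovered
uplink_transition up loss     # assumed status string; treated as no change
```

Only the two real transitions produce a non-zero code, so monit is triggered once per change rather than on every 5 second poll.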

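When monit receives a non-zero return code from the above script, it calls "set-carp-demotion", which raises the CARP demotion factor (lowering this machine's priority) if the link has gone down, and lowers it again if the link has come back up.  The script reads the return code from the MONIT_PROGRAM_STATUS environment variable.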

Code:
#!/usr/local/bin/bash
#
# set-carp-demotion - demote/promote a link dependent on interface status
# Called by monit on link status change
#

DEMOTION_VALUE=100
LINK_STATUS=${MONIT_PROGRAM_STATUS}

LINK_UP=2
LINK_DOWN=1

# Cleanup
quit() {
    EC=$1

    if [ $EC -ne 0 ]; then echo "$0: Exiting due to error ($EC)"; fi
    exit $EC
}

checkrc() {
   PRC=$1; if [ $PRC -ne 0 ]; then quit $PRC; fi
}

CUR_DEMOTION=`sysctl -n net.inet.carp.demotion`
checkrc $?

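if [ "$LINK_STATUS" = "$LINK_UP" ]; then
    # Writing to this sysctl adds the value to the current demotion factor,
    # so a negative value undoes the earlier demotion.
    if [[ "$CUR_DEMOTION" -gt "0" ]]; then
        sysctl -q net.inet.carp.demotion=-$DEMOTION_VALUE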
        checkrc $?
    fi

elif [ "$LINK_STATUS" = "$LINK_DOWN" ]; then
    if [[ "$CUR_DEMOTION" -lt "$DEMOTION_VALUE" ]]; then
        sysctl -q net.inet.carp.demotion=$DEMOTION_VALUE
        checkrc $?
    fi

else
    echo "Invalid link status '$LINK_STATUS'."
    quit 1
fi

quit 0
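
The two guards in the script matter because a value written to net.inet.carp.demotion is added to the current factor rather than replacing it.  The sketch below simulates that with a plain shell variable instead of the real sysctl; demotion_adjustment is a hypothetical name, and unlike the script it prints 0 for an unrecognised status instead of exiting with an error.

```shell
#!/bin/sh
# Hypothetical helper: print the value set-carp-demotion would write to
# net.inet.carp.demotion. The kernel adds the written value to the current
# factor, so 0 here means "write nothing".
DEMOTION_VALUE=100
LINK_UP=2
LINK_DOWN=1

demotion_adjustment() {
    cur=$1      # current demotion factor
    status=$2   # exit code reported by uplink-status
    if [ "$status" = "$LINK_UP" ] && [ "$cur" -gt 0 ]; then
        echo "-$DEMOTION_VALUE"
    elif [ "$status" = "$LINK_DOWN" ] && [ "$cur" -lt "$DEMOTION_VALUE" ]; then
        echo "$DEMOTION_VALUE"
    else
        echo 0
    fi
}

# Simulate the sysctl with a variable: two "down" reports, then two "up".
cur=0
for status in "$LINK_DOWN" "$LINK_DOWN" "$LINK_UP" "$LINK_UP"; do
    cur=$(( cur + $(demotion_adjustment "$cur" "$status") ))
    echo "status=$status demotion=$cur"
done
```

Repeated reports of the same state do not stack: the simulated factor goes to 100 once, stays there, and returns to 0 once.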

While this solution is quite specific to my own situation, I hope that by posting it here, it may help someone else who is trying to have finer-grained control over their CARP interface.