Dear all,

We have an HA pair of OPNsense firewalls. On both machines the LAN is a trunk interface carrying around 10 VLANs, while WAN and admin use separate NICs. Both nodes have matching CARP interfaces for the VLANs, WAN, and admin.
We sync our config from node1 to node2; node1 is normally the master. Sync between the pair runs over a dedicated hardware interface with a direct cable connection.

This setup has grown over the last 1 1/2 years and worked like a charm through firmware updates, reboots, and switches between master and backup mode. Everything was good until this morning:
node1 had been in state "BACKUP" for a few days. We had seen this happen before, and in the past a reboot of the node always brought everything back to normal. So this morning we checked for firmware updates; it showed one minor update with no reboot required, which we installed on node1. We then rebooted node1 to get rid of the "BACKUP" state. The system took quite a long time to come up and afterwards stayed in "BACKUP" mode, with the GUI reporting "system is booting, not all services started". This lasted for around another 15 minutes, then node1 became "MASTER" for ~7 of its 15 CARP interfaces. After a few minutes we noticed connectivity problems in some of the VLANs: some services were unavailable, and the internet connection was slow or broken. We decided to take the "safe" way and shut down and powered off node1 completely. Node2 has been master again since then, and everything is "fine" from a connectivity point of view. But of course it can't stay like this.

So much for the background, now to our question :) What's the safest way to bring node1 back online so that we can check its log files, status, and so on? I suspect that recent changes to our config are the reason the pair behaves like this, as it has never done so before. Could we "force" the backup node2 to stay master even when node1 comes back online? There is the button "enter persistent CARP Maintenance mode" on the backup node2. I don't want to simply try it: I have never used it before and, if I understand it correctly, it is normally meant to be used on the regular master node before a system update/reboot. Any suggestions?

thx a lot & best
Silke

Messages - isg-ek

High availability / mixed master/backup problem, force one node to stay master?