
Author Topic: High availability sync appears to have stopped working but CARP still fine  (Read 725 times)

sesquipedality

  • Newbie
  • *
  • Posts: 43
  • Karma: 4
    • View Profile
High availability sync appears to have stopped working but CARP still fine
« on: September 23, 2022, 05:37:58 pm »
As of relatively recently my HA setting has stopped working.  When I try to access "High Availability" -> "Status" I get an error message:

    The backup firewall is not accessible or not configured.

CARP is still working fine, which is why I hadn't noticed and can't say when this began.  There are no firewall rules preventing traffic on the direct ethernet link between the two firewalls.  Can anyone suggest how I might investigate / fix this?

Thanks
« Last Edit: October 21, 2022, 11:21:50 am by sesquipedality »
Logged

sesquipedality

  • Newbie
  • *
  • Posts: 43
  • Karma: 4
    • View Profile
Re: Sync appears to have stopped working
« Reply #1 on: October 21, 2022, 11:21:19 am »
Still looking for some help on this.  Even being pointed at where I might find some useful diagnostic logs as to why the link is not operative would be a help.
Logged

pmhausen

  • Hero Member
  • *****
  • Posts: 2544
  • Karma: 227
    • View Profile
Re: High availability sync appears to have stopped working but CARP still fine
« Reply #2 on: October 21, 2022, 12:27:49 pm »
Quote from: sesquipedality on September 23, 2022, 05:37:58 pm
There are no firewall rules preventing traffic on the direct ethernet link between the two firewalls.
But is there a firewall rule allowing all traffic on the direct ethernet link?

OPNsense, like any reasonable firewall, is "default deny".
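One way to check this from the console is to list the loaded pf rules that name the sync interface. This is only a sketch: `rules_on_iface` is a hypothetical helper, and the device name you pass in (for example `igb2`) is a placeholder for whatever your HA link actually uses. It needs to run as root in `sh`.

```shell
#!/bin/sh
# Hypothetical helper: print the loaded pf filter rules that mention one interface.
# $1 is the device name of the HA/sync interface (placeholder, e.g. igb2).
rules_on_iface() {
    # "pfctl -s rules" lists the active filter rules; -w keeps igb2 from
    # also matching igb20.
    pfctl -s rules | grep -w "on $1"
}
```

If `rules_on_iface igb2` prints no `pass` lines, the default deny is the likely reason the peer cannot be reached.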
Logged
Supermicro A2SDi-4C-HLN4F mainboard and SC101F chassis
16 GB ECC memory
Crucial MX300 275 GB SATA 2.5" plus
Crucial MX300 275 GB SATA M.2 (ZFS mirror)
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

sesquipedality

  • Newbie
  • *
  • Posts: 43
  • Karma: 4
    • View Profile
Re: High availability sync appears to have stopped working but CARP still fine
« Reply #3 on: October 21, 2022, 04:28:22 pm »
Thanks for the suggestion.  Yes, there is.  This is a previously working config that appears to have stopped working at some point.  I did have to reinstall the primary server at one point and did so using the USB stick config transfer method.  No passwords have changed.  The problem is that the diagnostic message I'm getting is so non-specific as to leave me lost as to how to even investigate what's not working.
Logged

pmhausen

  • Hero Member
  • *****
  • Posts: 2544
  • Karma: 227
    • View Profile
Re: High availability sync appears to have stopped working but CARP still fine
« Reply #4 on: October 21, 2022, 04:48:35 pm »
Are the IP address and credentials of the secondary, as configured on the master, definitely OK?
Is the web UI on the secondary enabled on the dedicated HA network?
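Both questions can be partly answered from the primary's console. The sketch below assumes the base-system `ping` and `nc` are available; `check_peer` is a hypothetical helper and all three arguments are placeholders for your own addresses and web UI port. It does not verify the credentials, only that the peer answers.

```shell
#!/bin/sh
# Hypothetical helper: test whether the HA peer answers on the sync link.
check_peer() {
    peer=$1   # backup firewall's address on the HA link
    src=$2    # this firewall's own address on the HA link
    port=$3   # port the secondary's web UI listens on (e.g. 443)

    # -S sets the source address to this firewall's HA-link address.
    if ping -c 3 -S "$src" "$peer" >/dev/null 2>&1; then
        echo "icmp: ok"
    else
        echo "icmp: FAILED"
    fi

    # -z only tests that the TCP port accepts a connection; -w 3 is a timeout.
    if nc -z -w 3 -s "$src" "$peer" "$port" >/dev/null 2>&1; then
        echo "webui port $port: ok"
    else
        echo "webui port $port: FAILED"
    fi
}
```

A failed ICMP line points at routing or rules on the link; an ICMP success with a failed port line suggests the web UI is not listening on that interface.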
Logged

b1t_r0t

  • Newbie
  • *
  • Posts: 4
  • Karma: 2
    • View Profile
Re: High availability sync appears to have stopped working but CARP still fine
« Reply #5 on: October 23, 2022, 01:10:38 pm »
I am seeing this same issue after upgrading from 22.1 to 22.7.6. It actually looks like everything is still working and the failover works; it's just something with the sync.

This seems to be related to this issue here:
https://forum.opnsense.org/index.php?topic=29521.0

I was able to reproduce the issue by rolling back to snapshots I had; it happens every time I upgrade to 22.7.6.
« Last Edit: October 23, 2022, 01:39:20 pm by b1t_r0t »
Logged

b1t_r0t

  • Newbie
  • *
  • Posts: 4
  • Karma: 2
    • View Profile
Re: High availability sync appears to have stopped working but CARP still fine
« Reply #6 on: October 29, 2022, 10:02:30 am »
I rolled back again (22.1.10) and then upgraded again, and everything was still broken, but the error changed from the parsing error mentioned in the other post to "host down".

I disabled and re-enabled the interfaces on both the OPNsense and VMware sides, and everything is now working on 22.7.6.

Quote from: sesquipedality on October 21, 2022, 04:28:22 pm
Thanks for the suggestion.  Yes, there is.  This is a previously working config that appears to have stopped working at some point.  I did have to reinstall the primary server at one point and did so using the USB stick config transfer method.  No passwords have changed.  The problem is that the diagnostic message I'm getting is so non-specific as to leave me lost as to how to even investigate what's not working.

Log in to the console and run this:
# /usr/local/etc/rc.filter_synchronize

What's the output?
« Last Edit: October 29, 2022, 10:44:03 am by b1t_r0t »
Logged

sesquipedality

  • Newbie
  • *
  • Posts: 43
  • Karma: 4
    • View Profile
Re: High availability sync appears to have stopped working but CARP still fine
« Reply #7 on: December 03, 2022, 01:53:52 pm »
Sorry for the delayed reply - got busy with other stuff and this got put on the back burner.

The output is:

Code: [Select]
root@<host>:~ # /usr/local/etc/rc.filter_synchronize
send >>>
Host: 192.168.66.4
User-Agent: XML_RPC
Content-Type: text/xml
Content-Length: 117
Authorization: Basic cm9vdDpQaWJqSXBzSUxwVEFmNHlZOTZ4Uw==
<?xml version="1.0"?>
<methodCall>
<methodName>opnsense.firmware_version</methodName>
<params>
</params></methodCall>received >>>
error >>>
fetch error. remote host down?
root@fenchurch:~ # send >>>
Missing name for redirect.
<methodName>opnsense.firmware_version</methodName>
<params>
</params></methodCall>received >>>
error >>>
fetch error. remote host down?
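The request in that output can also be replayed by hand, which separates "the peer is unreachable" from "the sync script is broken". The following is a sketch under assumptions: that `curl` is available, that the peer's web UI speaks https, and that the XML-RPC endpoint is `/xmlrpc.php` (the output above does not show the path). `xmlrpc_body` and `xmlrpc_call` are hypothetical helper names.

```shell
#!/bin/sh
# Rebuild the XML-RPC body shown in the output above. There is no trailing
# newline, which gives 117 bytes for opnsense.firmware_version, matching
# the Content-Length header in that output.
xmlrpc_body() {
    printf '<?xml version="1.0"?>\n<methodCall>\n<methodName>%s</methodName>\n<params>\n</params></methodCall>' "$1"
}

# Send the call by hand. The https scheme, the /xmlrpc.php path and -k
# (skip certificate verification) are assumptions, not taken from the thread.
# Passing only the user name to -u makes curl prompt for the password.
xmlrpc_call() {
    peer=$1     # peer address, e.g. the Host value from the output above
    user=$2     # sync user configured for the peer
    method=$3   # XML-RPC method name
    curl -sS -k -u "$user" -H 'Content-Type: text/xml' \
         --data-binary "$(xmlrpc_body "$method")" "https://$peer/xmlrpc.php"
}
```

For example, `xmlrpc_call 192.168.66.4 root opnsense.firmware_version` sends the same method call as the failing run. A connection error from curl here agrees with "remote host down?", whereas an HTTP response of any kind means the link itself is fine.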

This did enable me to discover that I couldn't traceroute or ping the backup interface from the main interface.  I went through all my firewall rules to try to work out what was wrong, and the only difference I could find was that, for entirely inexplicable reasons, some automatic outbound NAT rules were being generated for the backbone on the primary router (perhaps because the primary router is configured to route via the backbone if the primary internet connection goes down).  Anyway, these rules still appeared after outbound NAT was manually disabled for the interface, and I checked that the problem still existed with the outbound rules disabled.
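For anyone retracing these steps, the routing and NAT state for the backbone can be dumped in one go from the console. This is a rough sketch rather than a fix: `inspect_path` is a hypothetical helper, the peer address is whatever your backup firewall uses on the backbone, and it must run as root in `sh`.

```shell
#!/bin/sh
# Hypothetical helper: show how this firewall routes to the HA peer and
# which NAT rules are currently loaded.
inspect_path() {
    peer=$1   # backup firewall's address on the backbone

    echo "== route to $peer =="
    route -n get "$peer"          # which interface and gateway the kernel picks

    echo "== NAT rules =="
    pfctl -s nat                  # loaded NAT rules, including automatic ones

    echo "== traceroute =="
    traceroute -n -m 3 "$peer"    # a direct link should answer on the first hop
}
```

If the route output names an interface other than the backbone, or a NAT rule rewrites traffic leaving the backbone, that would be consistent with sync failing while CARP keeps working.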

In any event, having been through all that and having disabled and re-enabled the gateways, I am now at a point where ping, ssh and http over the backbone are working again, and so sync is back up and running.  Subsequent runs are not producing an error, and my sync menu is now back.  I do not know, and probably never will know, why traceroute over the backbone worked on the secondary but not on the primary router.  Thanks for your help with this.  I do wish routers were a little less "black box" sometimes.
Logged

b1t_r0t

  • Newbie
  • *
  • Posts: 4
  • Karma: 2
    • View Profile
Re: High availability sync appears to have stopped working but CARP still fine
« Reply #8 on: December 31, 2022, 04:58:28 am »
Glad it worked out. *High Five*  :)
Logged
