[Worky™️ now] 24.7 HA seemingly Noworky™️ ... (not sure why tho)

Started by Wolfspyre, July 10, 2024, 01:04:18 AM

Howdy all!

I've updated my firewall pair to 24.7 by doing a config-import upgrade with the vga iso beta image.

Looks like there's gonna be some really cool stuff ahead!

I'm seeing an odd problem, however, with my hosts, in that the backup is no longer seen by the primary.

CARP still seems to be working.

With sync compatibility set to either 24.1 or 24.7, the behavior doesn't change.
I have the heartbeat crossover interface specified as the synchronization interface.

On the primary node, I have the synchronize peer IP set to the IP of the standby, and the
(synchronize config: remote system username / remote system password) values set.
These are empty on the standby.

I tried creating an API key for the sync user and adding it to the config on the standby, and using either of the components as the remote system password... but it didn't change the behavior at all

I can reach the standby's heartbeat IP from the primary over the heartbeat interface via ICMP and on TCP ports 80/443.
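For anyone retracing the reachability check, here is a rough sketch of the TCP part in Python. The helper name tcp_port_open is made up for illustration, and 10.18.100.2 is simply the standby's heartbeat IP from this post. A successful connect only shows the port accepts connections; it says nothing about whether the XML-RPC call behind it succeeds.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection handles name resolution and applies the timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # refused, unreachable, timed out, ...
        return False


# Example (run on the primary, checking the standby's heartbeat IP):
#   for port in (80, 443):
#       print(port, tcp_port_open("10.18.100.2", port))
```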


primarynode# /usr/local/etc/rc.filter_synchronize
send >>>
Host: 10.18.100.2
User-Agent: XML_RPC
Content-Type: text/xml
Content-Length: 117
Authorization: Basic da-REDACTED-Q=
<?xml version="1.0"?>
<methodCall>
<methodName>opnsense.firmware_version</methodName>
<params>
</params></methodCall>received >>>
error >>>
fetch error. remote host down?
.
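To see the raw reply outside of rc.filter_synchronize, the same call could be replayed by hand. This is a minimal sketch, not how the sync script itself works: it assumes the XML-RPC endpoint is /xmlrpc.php on the standby (not shown in the output above), "syncuser" and "password" are placeholders, and build_request / replay are hypothetical helper names. The body is copied from the trace, so its length matches the Content-Length: 117 header.

```python
import base64
import urllib.error
import urllib.request


def build_request(user: str, password: str):
    """Build headers and body mirroring the trace printed by the sync script."""
    body = (
        '<?xml version="1.0"?>\n'
        "<methodCall>\n"
        "<methodName>opnsense.firmware_version</methodName>\n"
        "<params>\n"
        "</params></methodCall>"
    )
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "User-Agent": "XML_RPC",
        "Content-Type": "text/xml",
        "Authorization": f"Basic {token}",
    }
    return headers, body


def replay(url: str, headers: dict, body: str, timeout: float = 10.0):
    """POST the request and return (status, raw text), including error replies."""
    req = urllib.request.Request(
        url, data=body.encode(), headers=headers, method="POST"
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode(errors="replace")
    except urllib.error.HTTPError as err:
        # 401/403/500 replies still carry a body worth reading
        return err.code, err.read().decode(errors="replace")


# Example (URL path is an assumption):
#   status, text = replay("http://10.18.100.2/xmlrpc.php",
#                         *build_request("syncuser", "password"))
#   print(status, text[:500])
```

Over https with a self-signed certificate, urlopen will refuse the connection unless the certificate is trusted by the host running the script, so plain http is the simpler test while SSL is disabled on the web interface.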


[root@primary /home/syncuser]# ssh 10.18.100.2 hostname
standby.mydomain.com
[root@primary /home/syncuser]#


I thought the problem might be SSL related, so I minted a trusted ssl cert which had the IPs of the hosts in the cert
but got the same issue....

so I disabled SSL on the web interface....

same issue.

I tried removing the synchronize IP so they try with multicast (with both 24.1 and 24.7 compatibility)
... but didn't notice a change in behavior


What diagnostics might I try to help identify the issue?

is this a known issue?




Hm.
well... maybe?
so I ran
opnsense-patch 07b96bc
on both nodes, validated by:


[root@primary /var/log]# grep -c allow_url_fopen  /usr/local/opnsense/service/templates/OPNsense/WebGui/php.ini
0
[root@standby /var/log]#  grep -c allow_url_fopen  /usr/local/opnsense/service/templates/OPNsense/WebGui/php.ini
0


and restarted allthethings™️, which exhibited no change in behavior.

with the HA mode configured to 24.7

in https://both-my-firewalls/system_advanced_admin.php

I adjusted the deployment type to 'development'
I disabled http compression,
I enabled access logging,

and kicked off

/usr/local/etc/rc.filter_synchronize

again on the primary node:


[root@primary /usr/local]# /usr/local/etc/rc.filter_synchronize

Deprecated: Creation of dynamic property SimpleXMLRPC_Client::$url is deprecated in /usr/local/etc/inc/XMLRPC_Client.inc on line 92
send >>>
Host: 10.18.100.2
User-Agent: XML_RPC
Content-Type: text/xml
Content-Length: 117
Authorization: Basic da-REDACTED-Q=
<?xml version="1.0"?>
<methodCall>
<methodName>opnsense.firmware_version</methodName>
<params>
</params></methodCall>received >>>

Deprecated: Creation of dynamic property IXR_Message::$currentTag is deprecated in /usr/local/opnsense/contrib/IXR/IXR_Library.php on line 239

Warning: Cannot modify header information - headers already sent by (output started at /usr/local/opnsense/contrib/IXR/IXR_Library.php:239) in /usr/local/opnsense/contrib/IXR/IXR_Library.php on line 464

Warning: Cannot modify header information - headers already sent by (output started at /usr/local/opnsense/contrib/IXR/IXR_Library.php:239) in /usr/local/opnsense/contrib/IXR/IXR_Library.php on line 465

Warning: Cannot modify header information - headers already sent by (output started at /usr/local/opnsense/contrib/IXR/IXR_Library.php:239) in /usr/local/opnsense/contrib/IXR/IXR_Library.php on line 466

Warning: Cannot modify header information - headers already sent by (output started at /usr/local/opnsense/contrib/IXR/IXR_Library.php:239) in /usr/local/opnsense/contrib/IXR/IXR_Library.php on line 467
<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value>
      <struct>
  <member><name>base</name><value><struct>
  <member><name>version</name><value><string>24.7.b</string></value></member>
</struct></value></member>
  <member><name>firmware</name><value><struct>
  <member><name>version</name><value><string>24.7.b_114</string></value></member>
</struct></value></member>
  <member><name>kernel</name><value><struct>
  <member><name>version</name><value><string>24.7.b</string></value></member>
</struct></value></member>
</struct>
      </value>
    </param>
  </params>
</methodResponse>
error >>>
parse error. not well formed[root@primary /usr/local]#
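One possible reading of the output above (an assumption, nothing in this thread confirms it): with the deployment type set to 'development', PHP notices get printed ahead of the XML, and any text in front of the `<?xml` declaration makes the reply not well formed. The Python sketch below shows that effect on a shortened sample built from the output above, and splits the leading text off so the XML part can still be inspected. The helper names is_well_formed and split_rpc_body are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(body: str) -> bool:
    """Return True if the whole body parses as XML."""
    try:
        ET.fromstring(body)
        return True
    except ET.ParseError:
        return False


def split_rpc_body(body: str):
    """Return (root, junk): the parsed XML and any text printed before it."""
    start = body.find("<?xml")
    if start == -1:
        raise ValueError("no XML declaration found in response body")
    junk = body[:start].strip()
    root = ET.fromstring(body[start:])
    return root, junk


# Shortened sample: one notice followed by a trimmed methodResponse
sample = (
    "\nDeprecated: Creation of dynamic property IXR_Message::$currentTag is "
    "deprecated in /usr/local/opnsense/contrib/IXR/IXR_Library.php on line 239\n"
    '<?xml version="1.0"?>\n'
    "<methodResponse><params><param><value><struct>"
    "<member><name>firmware</name><value><struct>"
    "<member><name>version</name><value><string>24.7.b_114</string></value></member>"
    "</struct></value></member>"
    "</struct></value></param></params></methodResponse>"
)

print(is_well_formed(sample))           # the notice breaks parsing
root, junk = split_rpc_body(sample)
print(junk.splitlines()[0][:40])        # what was printed ahead of the XML
print(root.find(".//string").text)      # firmware version from the XML part
```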

I then set the deployment type of the standby node to 'production', leaving the primary node in 'development'

and switched the HA mode to 24.1, then kicked off
/usr/local/etc/rc.filter_synchronize on the primary node again, which seemed to work!!!
or at least... it returned:


[root@primary /usr/local]# /usr/local/etc/rc.filter_synchronize

Deprecated: Creation of dynamic property SimpleXMLRPC_Client::$url is deprecated in /usr/local/etc/inc/XMLRPC_Client.inc on line 92

Deprecated: Creation of dynamic property IXR_Message::$currentTag is deprecated in /usr/local/opnsense/contrib/IXR/IXR_Library.php on line 239

Deprecated: Creation of dynamic property SimpleXMLRPC_Client::$url is deprecated in /usr/local/etc/inc/XMLRPC_Client.inc on line 92

Deprecated: Creation of dynamic property IXR_Message::$currentTag is deprecated in /usr/local/opnsense/contrib/IXR/IXR_Library.php on line 239

Deprecated: Creation of dynamic property SimpleXMLRPC_Client::$url is deprecated in /usr/local/etc/inc/XMLRPC_Client.inc on line 92

Deprecated: Creation of dynamic property IXR_Message::$currentTag is deprecated in /usr/local/opnsense/contrib/IXR/IXR_Library.php on line 239
[root@primary /usr/local]#



I checked the https://primary.node/status_habackup.php page, and it showed the standby information...

I then switched the HA mode to 24.7...  and kicked off another synchronization and it still seemed to work

I re-enabled http compression.... and kicked off another synchronization....
it still works....

there was a crashreporter event I sent along from both the active/standby nodes, with a mention of this topic, in case it sheds any light...

HOPEFULLY this helps?

I dunno what the "fix" was...
perhaps there was some odd config value that switching to development mode sidestepped, and was remediated in synchronization... but I really dunno...

if there is anything log-wise or diagnostic-wise that would be helpful for me to provide, I'm happy to.
bizarre.