Red Square in /ui/core/hasync_status on OpenVPN instances but sync seems fine

Started by Zugschlus, November 06, 2025, 12:03:17 PM

Hi,
I have a cluster of two OPNsense machines running 25.1.10 (I know, I'll update later). I have two OpenVPN instances configured. The OpenVPN instances seem to sync fine, and so do the associated certificates. But in /ui/core/hasync_status, the two OpenVPN instances show a red square where all other services have a green arrow:
[Attachment not available.]
That doesn't look nice. What is going on here and how can I make those two pieces of red vanish?
Greetings
Marc Haber
Marc 'Zugschlus' Haber - St. Ilgen, Germany
Freelance IT Insultant, Debian Developer, Railroad Addict

Did you explicitly specify the bind address for the instance as the CARP address on WAN? In that case the service cannot start on the standby until a failover happens. That's what the UI is telling you. Not "broken", just "stopped".

If you leave the bind address empty, everything should be green.

The HA implementation is pretty straightforward and generally does not do things like reconfiguring services on failover. The upside is that it is really robust and easy to understand and debug.

Services should generally listen on INADDR_ANY (0.0.0.0) for robust binding to a socket and leave it to firewall rules to control accessibility on the various interfaces.

If that bothers you, I suggest binding OpenVPN to 127.0.0.1 and using NAT port forwarding from the WAN CARP address to that one.
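To illustrate why a bind to a specific address can fail on the standby while a wildcard bind does not, here is a minimal Python sketch. It is not OPNsense or OpenVPN code; the helper name `can_bind` is made up for this example, and 192.0.2.1 (a documentation address) merely stands in for a CARP address that is not currently configured on the host.

```python
import socket


def can_bind(address: str, port: int = 0) -> bool:
    """Return True if a UDP socket can be bound to `address` on this host.

    Port 0 lets the operating system pick any free port, so the result
    depends only on whether the address is usable locally.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((address, port))
        except OSError:
            # Typically EADDRNOTAVAIL: the address is not assigned to
            # any interface on this host right now.
            return False
        return True


if __name__ == "__main__":
    # 0.0.0.0 is INADDR_ANY; 192.0.2.1 is a hypothetical stand-in for a
    # virtual IP that only the active node holds.
    for addr in ("0.0.0.0", "192.0.2.1"):
        state = "bindable" if can_bind(addr) else "not bindable"
        print(f"{addr}: {state}")
```

On a host that does not own the second address, the wildcard bind should succeed and the specific bind should fail, which corresponds to the "stopped" state described above.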
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Quote from: Patrick M. Hausen on November 06, 2025, 12:31:24 PM
Did you explicitly specify the bind address for the instance as the CARP address on WAN?

I first thought "of course, Idiot Me", but I didn't.

[Attachment not available.]

Any other ideas?

By the way, your additional input that I didn't quote was wildly helpful for me to understand OPNsense's philosophy. Appreciated.

Greetings
Marc

And if you click on the obvious "start" button, nothing changes?

Then it's time to check the logs on the standby, I guess, for why the services fail to start.

Quote from: Patrick M. Hausen on November 06, 2025, 01:36:38 PM
And if you click on the obvious "start" button, nothing changes?

Spinner for a while and then out.

Quote from: Patrick M. Hausen on November 06, 2025, 01:36:38 PM
Then it's time to check the logs on the standby, I guess, for why the services fail to start.

It looks like the person who set up the NAT made the mistake of setting the Source to "Any" instead of the rfc1918 alias. So the standby's own outbound traffic got NATed to the Virtual IP, which is not active on the standby, and that of course didn't work. This is such a common mistake that it's even mentioned on docs.opnsense.org. I fixed that and things are fine now.
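For readers who want to see why the Source setting matters, here is a rough Python sketch of the matching logic. It only models the idea and is not how pf or OPNsense evaluates rules. The function name `matches_source` and the example addresses are assumptions for illustration; 203.0.113.3 stands in for the standby's own WAN address.

```python
import ipaddress

# The three private ranges defined in RFC 1918.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def matches_source(src: str, rule_source: str) -> bool:
    """Return True if a packet from `src` matches the rule's Source field.

    `rule_source` is either "any" or "rfc1918" in this simplified model.
    """
    if rule_source == "any":
        return True
    if rule_source == "rfc1918":
        ip = ipaddress.ip_address(src)
        return any(ip in net for net in RFC1918)
    raise ValueError(f"unsupported rule source: {rule_source!r}")


if __name__ == "__main__":
    lan_client = "192.168.1.10"   # hypothetical LAN host
    standby_wan = "203.0.113.3"   # hypothetical WAN address of the standby
    for source in ("any", "rfc1918"):
        for src in (lan_client, standby_wan):
            print(f"Source={source}: {src} NATed -> {matches_source(src, source)}")
```

With Source set to "any", traffic originating from the firewall's own WAN address also matches and gets translated to the Virtual IP; with the rfc1918 alias, only traffic from private addresses matches.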

Greetings
Marc

I configured another OpenVPN instance and had the same issue again. It solved itself after a while though.

It looks like the CRL for the CA of that new OpenVPN instance didn't get synced to the backup. At least, OpenVPN on the backup complained about a /var/etc/openvpn/server-SOMETHING file that wasn't there on the backup but was on the master, despite my having hit "Synchronize and reconfigure all" multiple times. Ten minutes later, after a coffee, the file was suddenly there.

Is it possible that hitting the "Synchronize and reconfigure all" button accidentally on the backup box helped?
Marc 'Zugschlus' Haber - St. Ilgen, Germany
Freelance IT Insultant, Debian Developer, Railroad Addict

Most certainly not - I hope you did not configure a synchronise config IP address on the backup node?

So the only thing I can picture is wrong timing - sync and restart OpenVPN first, then certificates, so OpenVPN cannot start on the backup. Syncing twice should fix that. But ten minutes ... no idea.

Quote from: Patrick M. Hausen on November 07, 2025, 01:27:39 PM
Most certainly not - I hope you did not configure a synchronise config IP address on the backup node?

I don't quite understand that. What do you mean?

Quote from: Patrick M. Hausen on November 07, 2025, 01:27:39 PM
So the only thing I can picture is wrong timing - sync and restart OpenVPN first, then certificates, so OpenVPN cannot start on the backup. Syncing twice should fix that. But ten minutes ... no idea.

I did it more than twice. And there is nothing explicitly syncing certificates and CAs in the config:

[Attachment not available.]

Greetings
Marc

1. System > HA > Settings --> the config sync IP address must be empty on the backup node
2. System > HA > Settings --> enable syncing of certificates

Quote from: Patrick M. Hausen on November 07, 2025, 02:36:35 PM
1. System > HA > Settings --> the config sync IP address must be empty on the backup node
2. System > HA > Settings --> enable syncing of certificates

The sync IP address was in fact not empty on the backup node. I didn't build that setup. I have removed the IP address.

And I kind of expected that for everything listed under "Services" in /ui/core/hasync there would be a corresponding entry in /ui/core/hasync_status. That is not the case.

I have learned something from you. Thank you. Have a nice weekend.

Greetings
Marc