Topic: <Solved>HA failover user connection interrupted (Read 3103 times)
i.schmidt
Newbie
Posts: 15
Karma: 0
<Solved>HA failover user connection interrupted
«
on:
September 04, 2023, 01:28:43 pm »
Hi all
We use two OPNsense firewalls in an HA setup with CARP, state table sync and config sync (manual).
OPNsense version 23.4.2
When I want to update, I activate "Persistent CARP Maintenance Mode" to switch to the secondary device. This works almost flawlessly except for two things:
1. User connections between devices seem to get interrupted
We use devices for recording working time. They are connected to a server via a TCP/IP connection. The server actively monitors this connection to prevent manipulation. Even a slight disruption causes the server to regard the device as offline.
When I switch to the secondary firewall, ALL of these devices can still be pinged and stay connected to the network, but they lose their server connection. So I guess the secondary firewall does not know the state of these connections.
How can I test and analyse this?
2. How to: Hot failover of WAN connection
How can I implement an automatic handover of PPPoE connections to the secondary firewall? Assigning a CARP IP to these connections does not seem to work. I could not get it working, and frankly I found the information about WAN failover a little confusing and unclear. Maybe someone can help me out?
«
Last Edit: September 12, 2023, 11:36:25 am by i.schmidt
»
Monviech (Cedrik)
Global Moderator
Hero Member
Posts: 1601
Karma: 176
Re: HA failover user connection interrupted
«
Reply #1 on:
September 04, 2023, 02:46:56 pm »
Did you make sure that the pfsync protocol isn't blocked by the firewall's default deny rule?
On the interface that sends and receives the pfsync packets, you have to create a firewall rule that allows protocol pfsync.
https://docs.opnsense.org/manual/how-tos/carp.html#terminology
Also, it's best to leave "Synchronize Peer IP" in System: High Availability: Settings: General settings empty on both firewalls. The default, directed multicast to 224.0.0.240, works best in my opinion.
You can troubleshoot it by opening an SSH shell on both firewalls and running tcpdump on your pfsync interface. You can see the states being exchanged in clear text. You can also look at the state table under Firewall: Diagnostics: States.
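For a quick comparison of the two state tables from the shell, something like the following might help. It is only a sketch: the helper name count_states_by_proto is made up for illustration, and it assumes the usual `pfctl -ss` line layout, where the second column is the protocol. The tcpdump filter in the comment relies on pfsync being IP protocol 249, and the interface name is a placeholder.

```shell
# Summarise a state table dump by protocol.
# Reads `pfctl -ss` output on stdin; the second column is the protocol.
count_states_by_proto() {
  awk 'NF >= 2 { count[$2]++ } END { for (p in count) print p, count[p] }' | sort
}

# Usage on each firewall (as root), then compare the two results:
#   pfctl -ss | count_states_by_proto
# Watching pfsync traffic on the physical sync interface (placeholder name):
#   tcpdump -ni <sync-interface> ip proto 249
```

If sync works, the per-protocol counts on both nodes should be in the same ballpark, although they will rarely match exactly because states are created and expire all the time.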
«
Last Edit: September 04, 2023, 02:53:14 pm by Monviech
»
i.schmidt
Newbie
Posts: 15
Karma: 0
Re: HA failover user connection interrupted
«
Reply #2 on:
September 04, 2023, 06:01:13 pm »
Thanks!
pfsync runs on a dedicated interface, which has an "allow everything" rule.
I will check the 2 other points tomorrow.
i.schmidt
Newbie
Posts: 15
Karma: 0
Re: HA failover user connection interrupted
«
Reply #3 on:
September 07, 2023, 08:46:43 am »
Soooooo, thanks very much for the suggestions. Yesterday I got sidetracked @work, but now I found something that might be suspicious.
Config summary:
We use a hardware interface called vtnet0 for pfsync. On this interface, there is an "allow everything everywhere" rule.
vtnet0 is connected by a direct cable to the equivalent interface on the secondary device, no switch involved.
The IP on that interface is 10.0.0.1, and on the secondary it is 10.0.0.2.
On the primary, the synchronize peer is 10.0.0.2, and on the secondary, the sync peer is 10.0.0.1.
I did a packet capture on pfsync0 via
Code:
tcpdump -i pfsync0
I immediately noticed that a whole lot more packets are captured on the primary device than on the secondary.
This doesn't make sense, because packets going out on the primary should also be captured coming in on the secondary and vice versa. The packet count should therefore be equal on both devices, right?
So I captured to a pcap file to analyse it better, but WTH?
"The file "pfsync.pcap" contains record data that wireshark doesn't support. (pcap: network type 246 unknown or unsupported)"
So I'm a bit stuck on detailed analysis.
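For reference, "network type 246" is the pcap link-layer type registered for pfsync (LINKTYPE_PFSYNC), which is what a capture on pfsync0 produces. A small sketch to check which link type a capture file carries: the function name pcap_linktype is made up, and it assumes a classic pcap file (as written by `tcpdump -w`) read on a machine with the same byte order as the one that wrote it.

```shell
# Print the link-layer type stored in a classic pcap file header.
# Bytes 20-23 of the 24-byte global header hold the link type.
# Assumes the file was written in the byte order of the machine reading it.
pcap_linktype() {
  od -An -tu4 -j20 -N4 "$1" | tr -d ' \t\n'
}

# Usage:
#   pcap_linktype pfsync.pcap
```

A file of that type can still be read on the firewall itself with `tcpdump -nr pfsync.pcap`. Capturing on the physical sync interface instead (vtnet0 here) should give an ordinary Ethernet capture that Wireshark opens, though I am not sure how much of the pfsync payload it decodes.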
Patrick M. Hausen
Hero Member
Posts: 6810
Karma: 572
Re: HA failover user connection interrupted
«
Reply #4 on:
September 07, 2023, 09:05:13 am »
vtnet looks suspiciously like this is a virtualised setup? Maybe your vSwitch configuration is to blame?
i.schmidt
Newbie
Posts: 15
Karma: 0
Re: HA failover user connection interrupted
«
Reply #5 on:
September 07, 2023, 09:56:26 am »
Yep, OPNsense runs in a VM on that machine. Proxmox is the host OS. This makes it a heck of a lot easier to back up, restore, update and handle.
It's the only VM on that host, because this machine is dedicated to the firewall for obvious reasons.
Sadly, I can't pass this interface through directly into the VM, because it is one of the two onboard network ports. I would lose access to the host if I did this.
The onboard controller is a Broadcom NetXtreme BCM5720 2-port Gigabit Ethernet.
The other network ports for WAN and VLAN are dedicated hardware and passed through.
For completeness: there are no firewall rules applied to the VM on the host system (Proxmox supports firewalling, but the service is deactivated).
Is it likely that the virtualization layer is the culprit though?
I'm currently thinking about how I can test this... live... without making a mess during office hours.
LOL, I could plug in a USB network adapter, pass that through and see what happens.
i.schmidt
Newbie
Posts: 15
Karma: 0
Re: HA failover user connection interrupted
«
Reply #6 on:
September 07, 2023, 01:31:10 pm »
I think I found the issue.
When we first set up the HA pair, OPNsense was on version 21.x something, or even version 20.
We updated as versions rolled along, and at some point I fiddled too much with the secondary device and had to set it up from scratch.
Somewhere between major updates, the naming scheme for devices seems to have changed. So our primary device has interfaces with the old naming scheme, while the secondary device has these network devices set up with the new names.
pfsync is clearly unable to assign the synced connections to the appropriate interfaces.
So, how can I change the names of the interfaces? Do I have to delete every interface, create a new one and reassign it? Will I be able to keep all the firewall rules?
That's really nasty :'(
newsense
Hero Member
Posts: 1037
Karma: 77
Re: HA failover user connection interrupted
«
Reply #7 on:
September 07, 2023, 01:42:10 pm »
Make a copy of /conf/config.xml, edit it, and import the edited file.
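Before editing, it may help to see where the two nodes actually differ. Here is a rough sketch that lists each logical interface and its device from a config.xml copy. The function name list_assignments and the file names in the usage comment are made up, and it assumes the file is pretty-printed with one element per line and that each entry under <interfaces> carries its device in an <if> element.

```shell
# List "logical-name device" pairs from the <interfaces> section of a
# config.xml copy. Assumes one XML element per line.
list_assignments() {
  awk '
    /<interfaces>/   { inside = 1; depth = 0; next }
    /<\/interfaces>/ { inside = 0 }
    !inside          { next }
    # a line holding only an opening tag, e.g. "<lan>"
    /^[ \t]*<[A-Za-z0-9_]+>[ \t]*$/ {
      depth++
      if (depth == 1) { name = $0; gsub(/[ \t<>]/, "", name) }
      next
    }
    # a line holding only a closing tag, e.g. "</lan>"
    /^[ \t]*<\/[A-Za-z0-9_]+>[ \t]*$/ { depth--; next }
    # the device line directly below the logical interface
    depth == 1 && /<if>.*<\/if>/ {
      dev = $0
      sub(/.*<if>/, "", dev); sub(/<\/if>.*/, "", dev)
      print name, dev
    }
  ' "$1"
}

# Usage (hypothetical file names, one copy per node):
#   list_assignments primary-config.xml   > primary.txt
#   list_assignments secondary-config.xml > secondary.txt
#   diff primary.txt secondary.txt
```

Any line that shows up in the diff is an interface whose name or device differs between the nodes. Work on copies only and keep a backup of the original file before importing anything.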
i.schmidt
Newbie
Posts: 15
Karma: 0
Re: HA failover user connection interrupted
«
Reply #8 on:
September 12, 2023, 11:36:04 am »
I have successfully reconfigured all the interfaces via the config file.
It looks like the states are now synced properly.
Thanks!