[SOLVED] No upload speed to backup router in HA setup due to Firewall

Started by Underpay6703, May 25, 2024, 04:21:25 AM

TLDR: The problem was not related to HA, but to asymmetric routing between servers.

Hello! I have a pretty unique HA firewall problem, and I'm hoping someone can help me out with it.

The issue summarized
Iperf tests revealed that my upload speed to the backup router is 0.00 bits per second. Download speed is unaffected.
Turning off the firewall gives me the expected speeds, so it must be a firewall issue, but I can't figure out the cause.
I suspect state violations, but I am not experienced enough to know for sure.

The setup
I drew a basic overview of my network in the attachment.
I have two ISPs and therefore two IPs, which is not sufficient for a standard CARP setup, but is sufficient for a "poor man's CARP".
In addition to the hardware redundancy, I created a "gateway" between the routers for the Gateway group configuration, so that I can reach the ISP of the backup router and fall back to it.
Due to circumstances, I cannot place a dumb switch between the ISP connections to allow both routers to get each other's ISP address, which is why I opted for this setup.

Setup disclaimer
I am 100% certain that my CARP and Gateway configurations are valid. All hardware has been switched out and works as intended.
If requested, I will provide screenshots of the configuration.

The odd behavior
When using the ISP of my backup router, I noticed that my upload speed went down the drain: 0.01 bits per second on online speedtests.
Checking locally with iperf, the results are similar. Connecting from my laptop (VLAN100) to any IP of the backup router (except its VLAN100 IP) gave the following result.


[  5] local 10.10.100.150 port 44652 connected to 10.10.10.254 port 40744
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   128 KBytes  1.05 Mbits/sec    2   1.41 KBytes       
[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes       
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes       
[  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes       
[  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes       
[  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes       
[  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   128 KBytes   105 Kbits/sec    5             sender
[  5]   0.00-10.00  sec  0.00 Bytes  0.00 bits/sec                  receiver
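
As a sanity check on the numbers above, the short Python sketch below reproduces the reported bitrates from the transfer sizes. It assumes iperf3's usual unit convention (1 KByte = 1024 bytes for transfer sizes, 1 Kbit = 1000 bits for bitrates); the function name is only illustrative. It shows that the entire 128 KBytes went out in the first second, and that the 105 Kbits/sec in the sender summary is just that single burst averaged over 10 seconds.

```python
# Assumption: iperf3 reports transfer sizes in binary units (1 KByte = 1024 bytes)
# and bitrates in decimal units (1 Kbit = 1000 bits).
KBYTE = 1024


def bitrate_bps(transfer_kbytes: float, seconds: float) -> float:
    """Average bitrate in bits per second for a transfer of the given size."""
    return transfer_kbytes * KBYTE * 8 / seconds


first_second = bitrate_bps(128, 1.0)   # the 0.00-1.00 sec interval
whole_test = bitrate_bps(128, 10.0)    # the sender summary line

print(f"{first_second / 1e6:.2f} Mbits/sec")  # prints 1.05 Mbits/sec
print(f"{whole_test / 1e3:.0f} Kbits/sec")    # prints 105 Kbits/sec
```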


From what I understand, iperf defaults to uploading data from the client to the server, thus testing upload speed.
If I use the -R (reverse) argument to test download speed, I get the expected speed.
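
For anyone who wants to repeat the comparison, here is a minimal Python sketch that builds the two iperf3 invocations (default direction and -R) and pulls the sender and receiver bitrates out of iperf3's JSON output (-J). The helper names are made up for illustration, and the JSON field names (end.sum_sent and end.sum_received, each with bits_per_second) are what I understand iperf3 to emit for TCP tests, so treat them as an assumption and check them against your version.

```python
import json
import subprocess


def iperf3_cmd(server, reverse=False, seconds=10):
    """Build an iperf3 client command; reverse=True adds -R (server sends)."""
    cmd = ["iperf3", "-c", server, "-t", str(seconds), "-J"]
    if reverse:
        cmd.append("-R")
    return cmd


def summarize(json_text):
    """Extract end-of-test bitrates (assumed JSON layout for a TCP test)."""
    end = json.loads(json_text)["end"]
    return {
        "sent_bps": end["sum_sent"]["bits_per_second"],
        "received_bps": end["sum_received"]["bits_per_second"],
    }


def run_test(server, reverse=False):
    """Run iperf3 and summarize the result (needs iperf3 installed locally)."""
    result = subprocess.run(
        iperf3_cmd(server, reverse), capture_output=True, text=True, check=True
    )
    return summarize(result.stdout)
```

Calling run_test("10.10.10.254") and then run_test("10.10.10.254", reverse=True) would give the upload and download figures side by side. In the failing case above, the upload run should show a small sent_bps and a received_bps of zero.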

The cause: The firewall... somehow
I got the hint from an old forum post https://forum.opnsense.org/index.php?topic=35157.0
Sadly, the user never replied back with a solution. But indeed, when I turned off the Master router's firewall, the upload speed went back to normal.

It also explains why the speed is normal when performing iperf tests on the backup router's VLAN100 address, because the traffic does not pass through the master router's firewall. 
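
One plausible way to picture the routing explanation from the TLDR is the Python sketch below. It is a rough model, not a dump of my actual routing tables: it assumes /24 networks for both VLANs, assumes the backup router has an interface in both VLAN10 and VLAN100, and uses 10.10.100.254 as a purely hypothetical VLAN100 address for the backup router. Under those assumptions, the laptop reaches 10.10.10.254 through the master router, but the backup router can answer the laptop directly on VLAN100, so the master's firewall only sees one direction of the connection.

```python
import ipaddress

# Assumed addressing: one /24 per VLAN, inferred from the iperf output.
VLAN10 = ipaddress.ip_network("10.10.10.0/24")
VLAN100 = ipaddress.ip_network("10.10.100.0/24")


def next_hop(attached_networks, destination):
    """'direct' if the sender has an interface in the destination's network,
    otherwise the packet goes to the default gateway (the master router)."""
    dst = ipaddress.ip_address(destination)
    if any(dst in net for net in attached_networks):
        return "direct"
    return "via master"


laptop_nets = [VLAN100]          # laptop only lives in VLAN100
backup_nets = [VLAN10, VLAN100]  # assumption: backup router is in both VLANs


def is_asymmetric(server_ip, client_ip="10.10.100.150"):
    """True if the request and the reply take different paths."""
    forward = next_hop(laptop_nets, server_ip)
    reverse = next_hop(backup_nets, client_ip)
    return forward != reverse


print(is_asymmetric("10.10.10.254"))   # VLAN10 address of the backup router
print(is_asymmetric("10.10.100.254"))  # hypothetical VLAN100 address
```

In this model the first call reports an asymmetric path and the second a symmetric one, which would match the observation that tests against the backup router's VLAN100 address are unaffected.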

A new hint?
I have not created any firewall rules that prevent communication between VLANs 100 and 10, so I suspected some sort of state violation that causes the packets to be dropped.
In the live view, I found one such state violation entry, but it only appeared a couple of seconds after the iperf test had concluded.
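
To illustrate why a stateful firewall that sees only one direction of a TCP connection could produce a short burst followed by nothing, here is a deliberately simplified toy model in Python. It is not how pf actually tracks state; it only assumes that the firewall lets data through up to the highest acknowledgement it has seen plus a fixed window. The segment size and window are arbitrary illustrative values.

```python
def simulate(acks_visible, segments=10, seg_size=1000, window=3000):
    """Return how many bytes the toy firewall lets through.

    acks_visible: whether the receiver's ACKs pass through this firewall.
    """
    highest_ack = 0  # highest ACK the firewall has seen from the receiver
    next_seq = 0     # sequence number of the next segment the sender emits
    passed = 0
    for _ in range(segments):
        # Toy rule: drop data beyond what the firewall believes is allowed.
        if next_seq + seg_size > highest_ack + window:
            break
        passed += seg_size
        next_seq += seg_size
        if acks_visible:
            # The ACK comes back through the firewall, so its view advances.
            highest_ack = next_seq
    return passed


print(simulate(acks_visible=True))   # prints 10000: all segments pass
print(simulate(acks_visible=False))  # prints 3000: stalls after one window
```

When the ACKs take another path, the firewall's view of the connection never advances and the transfer stalls after the first window, which is at least qualitatively similar to the single 128 KByte burst in the iperf output.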