HA cluster, IPv6 CARP and router advertisements - best practice?

bimbar:
I have another interesting link for that: https://datatracker.ietf.org/doc/html/rfc7157 "IPv6 Multihoming without Network Address Translation"

It talks about many of the same problems, namely gateway selection and source address selection. Sadly, not a lot of solutions.

bimbar:

--- Quote from: pmhausen on October 14, 2021, 08:24:57 pm ---
Hi all,

I have a pair of OPNsense firewalls and we are dual-stack throughout the entire data center. For IPv6 everything is routed, no NAT taking place. The DMZ depicted in the network overview has got a single "permit anything out" rule. From outside to the DMZ certain selected services to certain hosts are permitted, but as I said no NAT, port forwarding, just firewall rules.


--- Code: ---
                     +--------------------------------------------------------------+
                     |                                                              |                       
                     |                            Uplink                            |                       
                     |                                                              |                       
                     +--------^--------------------------------------------^--------+                       
                              |                                            |                               
                              |                                            |                               
                              |                                            |                               
                              |                                            |                               
                     +-----------------+                          +-----------------+                       
                     |                 |       HA-Interface       |                 |                       
                     |   OPNsense 1    |--------------------------|   OPNsense 2    |                       
                     |                 |                          |                 |                       
                     +-----------------+                          +-----------------+                       
                              |                                            |                               
                              |                    CARP                    |                               
 2a00:b580:a000:4000::252/64  +-------> 2a00:b580:a000:4000::254/64 <------+   2a00:b580:a000:4000::253/64 
 fe80::f690:eaff:fe00:6501/64 |                      |                     |   fe80::f690:eaff:fe00:6507/64
                              |                      |                     |                               
                              |                      |                     |                               
            #-----------------v----------------------v---------------------v-------------------#           
                                        DMZ 2a00:b580:a000:4000::/64                                       
--- End code ---

We use SLAAC for host configuration in the DMZ and I configured radvd as pictured in the screenshot. What I would have expected as a result is that the CARP address is announced as the default router.

What happens instead is that the link-local address of the interface is announced. OK, this makes perfect sense in a single unit setup. But in our case both the active and the backup node announce their respective link-local addresses.

A client that ends up with two default routes may switch gateways in the middle of a long-lived connection. This leads to intermittent drops of TCP connections, and possibly to other problems that we have not yet clearly identified.

Questions:

* Why isn't the global unicast CARP address announced instead of the link-local ones?
* Even with link-local, shouldn't pfSync take care of keeping the state tables in sync so it should not happen that a packet hits the "default deny" rule?
* When I manually disable radvd on the backup, things work reliably - shouldn't the HA mechanism take care of toggling the service on/off depending on the role of the node?
* Related but different topic: what happens when I enable dhcpd in a HA setup? Shouldn't the HA mechanism disable the backup?
* What's considered the best practice in this scenario?
DHCPv6 isn't of any use here, because it doesn't send a default gateway to the client systems. This is only sent via RA. I could configure all hosts statically in the DMZ, but once we get to the LAN, which at the moment uses SLAAC, too - because "what else" - that is out of the question. Too many devices coming and going.

Workaround: exempt "DHCPv6" from HA sync and disable RA on the backup node. But that means in case of a failover a manual intervention is necessary to get IPv6 working again.

So ... is there a solution?

Kind regards,
Patrick

--- End quote ---

I just found out: https://github.com/opnsense/core/pull/5185 should be exactly what you need, combined with a link-local CARP address. As far as I know, you cannot use a GUA as a next hop.
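
To illustrate the next-hop point: RFC 4861 requires router advertisements to be sent from a link-local source address, so a CARP VIP meant to act as the announced router has to sit in fe80::/10. Below is a minimal sketch using Python's standard ipaddress module. The first two addresses come from the diagram in the quoted post; fe80::254 is a hypothetical link-local CARP VIP used only for illustration, and the function name is my own.

```python
import ipaddress


def usable_as_ra_source(addr: str) -> bool:
    """Return True if addr (optionally with a /prefix) is an IPv6 link-local address."""
    # ip_address() does not accept a prefix length, so strip it first.
    ip = ipaddress.ip_address(addr.split("/")[0])
    # IPv4 169.254.0.0/16 also counts as "link-local", hence the version check.
    return ip.version == 6 and ip.is_link_local


candidates = [
    "2a00:b580:a000:4000::254/64",   # global unicast CARP VIP from the diagram
    "fe80::f690:eaff:fe00:6501/64",  # interface link-local of OPNsense 1
    "fe80::254/64",                  # hypothetical link-local CARP VIP (assumption)
]

for c in candidates:
    print(c, "->", "ok as RA source" if usable_as_ra_source(c) else "not link-local")
```

This only checks the address class; whether radvd actually announces from that address depends on the firewall configuration.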

Patrick M. Hausen:
Awesome! Thanks!

tomstephens89:
Same issue here. Thanks to @pmhausen for confirming the problem.

https://forum.opnsense.org/index.php?topic=25243.msg121205#msg121205

Until the ability to specify the source address of radvd is implemented, the only way this works is to keep the RA daemon stopped on the backup until it needs to become master. The RA daemon can then be started on the new master, and you must ensure it is stopped on the new backup.
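
The rule I am describing boils down to a small decision table. Here is a rough sketch of it in Python, purely as an illustration: the function name and the returned action strings are made up, and the state names "MASTER" and "BACKUP" are assumed to match what the CARP status reports on the node. How the daemon is actually started or stopped on OPNsense is deliberately left out.

```python
def radvd_action(carp_state: str, radvd_running: bool) -> str:
    """Decide what to do with the RA daemon for a given CARP state.

    Returns "start", "stop", or "none". Hypothetical helper, not an OPNsense API.
    """
    should_run = carp_state.strip().upper() == "MASTER"
    if should_run and not radvd_running:
        return "start"   # node just became master: begin sending RAs
    if not should_run and radvd_running:
        return "stop"    # backup (or any non-master state) must stay silent
    return "none"        # already in the desired state


# Example: a failover, seen from both nodes.
print(radvd_action("MASTER", False))  # new master
print(radvd_action("BACKUP", True))   # new backup
```

Treating every state other than MASTER as "must not advertise" is the conservative choice here, since a node that advertises while it is not master is exactly what causes the two-default-router problem.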

tomstephens89:
What was the outcome of this? I see there was a lot of activity on GitHub?
