I'm after a bit of Multi-WAN advice.
I have 2 WAN connections, Primary is 1Gb PPPoE with 1508 MTU (1500 pppoe), and Secondary is 1Gb DOCSIS cable.
My primary (ZEN) has both IPv4 and IPv6, but my secondary (VIRGIN) has only IPv4.
The Gateway Group is ZEN (Tier 1) and VIRGIN (Tier 2) with a trigger level of "Member Down".
When ZEN fails, the gateway goes down (good) and apart from a connection glitch from clients (Failover States is checked) they continue on VIRGIN, so all good.
The failover works perfectly for IPv4.
However, what I'm struggling to get my head around is IPv6.
At the minute, I have not enabled IPv6 via RA to my LAN, because I want to try and understand something.
If I start squirting out IPv6 routes to my LAN using OPNsense Router Advertisements, it seems to me that the advertisements will continue even if ZEN is down.
That means clients may still try to use IPv6 through a gateway that is currently down, potentially causing a black hole.
I am wondering if anyone else out there has come across this, and how they managed to deal with it?
Normally I would do some real testing myself, but my ISP still hasn't enabled IPv6 properly, so I can only ask theoretical questions.
It may not even be an issue, and fallback to IPv4 might happen automatically, but I'm not convinced.
I have toyed with the idea of somehow programmatically changing the RA using Monit if ZEN goes down, to set lifetime=0, or perhaps automating a firewall rule to block IPv6.
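To make the "lifetime=0" idea concrete: in a Router Advertisement (RFC 4861) the Router Lifetime is a 16-bit field in seconds, and a value of 0 tells hosts not to use the sender as a default router. Below is a minimal Python sketch of just the fixed 16-byte RA header. The names `build_ra_header` and `router_lifetime_of` are hypothetical, the checksum is left at zero (a real sender has to compute it over the IPv6 pseudo-header), and this is not how OPNsense builds its advertisements.

```python
import struct

ICMPV6_ROUTER_ADVERT = 134  # ICMPv6 type for Router Advertisement (RFC 4861)


def build_ra_header(router_lifetime: int, hop_limit: int = 64) -> bytes:
    """Pack the 16-byte fixed RA header (no options, checksum left as 0)."""
    if not 0 <= router_lifetime <= 0xFFFF:
        raise ValueError("router lifetime is a 16-bit field (seconds)")
    return struct.pack(
        "!BBHBBHII",
        ICMPV6_ROUTER_ADVERT,  # type
        0,                     # code
        0,                     # checksum placeholder (NOT a valid checksum)
        hop_limit,             # Cur Hop Limit
        0,                     # M/O flags cleared
        router_lifetime,       # 0 = "do not use me as a default router"
        0,                     # Reachable Time (unspecified)
        0,                     # Retrans Timer (unspecified)
    )


def router_lifetime_of(header: bytes) -> int:
    """Read the Router Lifetime field back out of a packed header."""
    return struct.unpack("!H", header[6:8])[0]
```

Note that a zero Router Lifetime only withdraws the default route; clients keep their addresses until the prefix lifetimes run out, so how quickly a given client gives up on IPv6 would still need testing.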
Basically, I'm trying to make the failover as efficient and quick as possible without IPv6 black holing.
Has anyone any experience with this scenario? Thank you for taking the time to read/respond.
You are absolutely right. See https://blog.ipspace.net/2023/01/dc-ipv6-small-site-multihoming/ .
So it sounds like I need to use NAT66 rather than NPTv6. (IPv6 doesn't want you to use NAT66, right... well, tough, it's needed.)
I can squirt ULAs to my LAN (fd76:xx:Xx) and use a hybrid outbound NAT rule to do NAT66.
This way, when the gateway goes down, "Failover States" should activate and kill all existing states.
The issue I then have, is new connections trying to use IPv6 before it fails over to IPv4.
I wonder if I could automate "When primary goes down, block IPv6 on the LAN interface".
By the way, this issue isn't inherent to IPv6. If you had a real (public) IPv4 range, it would also not work without an additional mechanism such as BGP or NAT.
Per se, the issue is who holds the current identity. With NAT you fake the identity on the router for all devices behind it; that's why it's an essential part of seamless high availability in both IP protocols.
I get that. In many ways, NAT makes the multi-homed issues a lot easier to deal with.
I don't believe any residential providers in the UK offer BGP, so we make do with what we have.
I'm going to try ULA on the LAN, NAT66 on WAN1.
When it fails, either Monit or a syshook script will use pf to alter an alias on a firewall rule that rejects IPv6 on the LAN interface.
The alias will be empty when WAN1 is up, and ::/0 when it's down. That should provide rapid fallback to IPv4.
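A rough sketch of that toggle in Python, assuming the alias is backed by a pf table and that whatever calls it (Monit, a syshook script) already knows whether WAN1 is up. The table name `block_v6_when_zen_down` and both function names are hypothetical. It relies on pfctl's table commands (`-T flush` and `-T replace`); whether pf accepts a `::/0` entry in a table, and whether OPNsense overwrites the table on its next alias reload, are untested assumptions.

```python
import subprocess

TABLE = "block_v6_when_zen_down"  # hypothetical alias/table name


def pfctl_args(wan1_up: bool, table: str = TABLE) -> list[str]:
    """Build the pfctl command for the current WAN1 state."""
    if wan1_up:
        # WAN1 is healthy: empty the table so the reject rule matches nothing.
        return ["pfctl", "-t", table, "-T", "flush"]
    # WAN1 is down: make the table cover all of IPv6 (value taken from the
    # post above; acceptance of a /0 table entry is an unverified assumption).
    return ["pfctl", "-t", table, "-T", "replace", "::/0"]


def apply_state(wan1_up: bool) -> None:
    """Run pfctl (needs root on the firewall); raises if pfctl fails."""
    subprocess.run(pfctl_args(wan1_up), check=True)
```

Keeping the command construction separate from the `subprocess` call means the logic can be checked on any machine before it is let loose on the firewall.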
We'll see how it goes....
You should be able to do NPTv6 on the secondary interface. You will probably have to decide which uplink is the primary and use the GUA addresses from that one, then NPTv6 on the other uplink.
The whole point is that the secondary does not have IPv6... at all.
I'm trying to achieve IPv4 and IPv6 on the primary, but if the primary fails, terminate states rapidly and use only IPv4 on the secondary. No black hole.
I also have IPv6 on my primary WAN, and a 5G failover connection that only has IPv4.
I did not configure anything special. When the failover happens, the clients happy eyeball towards IPv4 quickly and somehow I don't notice much.
https://en.wikipedia.org/wiki/Happy_Eyeballs
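For anyone unfamiliar: part of Happy Eyeballs (RFC 8305) is that the client interleaves its IPv6 and IPv4 candidate addresses and starts the next attempt after a short delay instead of waiting for a full timeout. A simplified Python sketch of just the interleaving step follows. `interleave_by_family` is a hypothetical helper, the addresses are documentation examples, and real implementations also deal with DNS timing and racing the connections.

```python
import ipaddress
from itertools import zip_longest


def interleave_by_family(addresses: list[str]) -> list[str]:
    """Alternate IPv6 and IPv4 candidates, IPv6 first, keeping order within each family."""
    v6 = [a for a in addresses if ipaddress.ip_address(a).version == 6]
    v4 = [a for a in addresses if ipaddress.ip_address(a).version == 4]
    ordered = []
    for six, four in zip_longest(v6, v4):
        if six is not None:
            ordered.append(six)
        if four is not None:
            ordered.append(four)
    return ordered


candidates = ["2001:db8::1", "2001:db8::2", "192.0.2.1", "192.0.2.2"]
print(interleave_by_family(candidates))
# ['2001:db8::1', '192.0.2.1', '2001:db8::2', '192.0.2.2']
```

With this ordering a dead IPv6 path costs one short delay before an IPv4 attempt starts, which fits the "don't notice much" experience. Python's asyncio exposes the same idea through the `happy_eyeballs_delay` argument of `loop.create_connection()`.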
Quote from: Monviech (Cedrik) on Today at 02:18:25 PM
I also have IPv6 on my primary WAN, and a 5G failover connection that only has IPv4.
I did not configure anything special, when the failover happens the clients happy eyeball towards IPv4 quickly and I don't notice much somehow.
https://en.wikipedia.org/wiki/Happy_Eyeballs
That is what I want to achieve.....
Perhaps I'll squirt out GUAs now, and test the failover and see how long it takes....
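One way to put a number on "how long it takes" from a LAN client is to time a fresh TCP connection to a dual-stack host while the primary is down. A minimal Python sketch, with `time_connect` being a hypothetical helper name: `socket.create_connection` tries the resolved addresses one after another (no Happy Eyeballs) and applies the timeout to each attempt, so this measures a pessimistic case compared with a browser.

```python
import socket
import time


def time_connect(host: str, port: int, timeout: float = 5.0):
    """Return (address_family, seconds) for a TCP connect, or (None, seconds) on failure."""
    start = time.monotonic()
    try:
        # Addresses from getaddrinfo are tried sequentially; the timeout
        # applies to each individual attempt.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            return sock.family, time.monotonic() - start
    except OSError:
        return None, time.monotonic() - start
```

Calling it against a dual-stack hostname of your choice on port 443, once with ZEN up and once with it down, would show which address family won and how long the client waited.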