I have been fighting with the IPv6 configuration on my OPNsense box. It would work whenever I reconfigured the interface, but about 30 minutes later IPv6 would break. I am writing up my observations so that someone in a similar situation has a useful search result when trying to figure out why their IPv6 breaks once their router advertisements expire.
The C3000Z Configuration:
I get a /29 IPv4 subnet from my provider and a /56 IPv6 prefix delegation. My modem runs Ethernet to my fiber provider's ONU on the WAN side, handles authentication, and presents the /29 on the LAN side. Out of the /56 delegation I define one /64 to live on the same segment as my public IPv4 /29. The LAN network on my modem is the same network segment that is on the WAN side of my OPNsense box.
There are some limitations on what I can control on the C3000Z, so I'm kind of at the mercy of how I can get it to function. Given the choice between DHCPv6 and stateless, I configured it in stateless mode with a 30-minute RA expiration. I selected a static IP in the /64 network and created a static route entry pointing a /60 prefix to it.
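For anyone who wants to sanity-check a similar layout, here is a minimal sketch of the prefix carving using Python's ipaddress module. All of the prefixes and the WAN address are made up from the 2001:db8::/32 documentation range, and which /64 and /60 get picked is an assumption; only the sizes (/56, /64, /60) come from my setup.

```python
import ipaddress

# Stand-in for the provider's /56 delegation (documentation prefix, assumption).
delegation = ipaddress.ip_network("2001:db8:0:ff00::/56")

# The /64 that shares a segment with the public IPv4 /29 (choice of /64 is hypothetical).
modem_lan = ipaddress.ip_network("2001:db8:0:ff00::/64")

# The /60 that the modem routes to the OPNsense WAN address (hypothetical choice).
routed = ipaddress.ip_network("2001:db8:0:ff10::/60")

# Static address on the OPNsense WAN interface, used as the next hop (hypothetical).
opnsense_wan = ipaddress.ip_address("2001:db8:0:ff00::2")

# Both prefixes must come out of the delegation and must not overlap each other.
assert modem_lan.subnet_of(delegation)
assert routed.subnet_of(delegation)
assert not routed.overlaps(modem_lan)

# The next hop has to be on-link for the modem, i.e. inside the shared /64.
assert opnsense_wan in modem_lan

# The /60 yields sixteen /64 networks that can go on OPNsense interfaces.
lan_candidates = list(routed.subnets(new_prefix=64))
print(len(lan_candidates))   # 16
print(lan_candidates[0])     # 2001:db8:0:ff10::/64
```

The point of the overlap check is that the routed /60 should not contain the /64 that already lives on the modem's LAN segment.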
The OPNsense Configuration:
On my OPNsense box I set up IPv6 on my WAN interface in static IP mode, using the address I defined as the next hop target for the /60 subnet on the C3000Z. I also took one of the /64 networks from the /60 and configured it with DHCPv6 on the LAN interface.
Notably, when I configured IPv6 on the WAN interface I was having trouble getting the modem to tell me what its IPv6 address was on that interface. So I just saved and activated the configuration with the "IPv6 Upstream Gateway" set to "Auto-detect", and my IPv6 started working.
It all seemed to work, but there was a catch
I thought at this point that I had it all working. My computers were getting IPv6 addresses and everything was working. However, the next morning I wasn't able to connect to IPv6 hosts anymore. In my continued testing, every time I touched the configuration and applied it to the interface IPv6 would start working, but I started to notice that it stopped about 30 minutes later. I figured this was because it was getting a router advertisement and then forgetting it. I couldn't work out why it wasn't just continuing to respect the RAs it was getting. When it was working I saw a default route in the output in System -> Routes -> Status, but the default would eventually disappear from the list of IPv6 routes.
As I was trying to diagnose the trouble, and lamenting that I couldn't set a static IP on the C3000Z to use as my gateway, I looked at the routing table again and noticed that the default route was just pointing to the fe80::/64 local network autoconfig address that is based on my modem's MAC address.
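If you want to predict what that fe80:: address should look like, here is a small sketch of the usual MAC-based (modified EUI-64) derivation. It assumes the modem builds its link-local address this way, which not every device does, and the MAC address in the example is made up.

```python
import ipaddress


def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Derive the modified EUI-64 link-local address for a 48-bit MAC."""
    octets = bytearray(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit of the first octet.
    octets[0] ^= 0x02
    # Insert ff:fe in the middle to stretch 48 bits into a 64-bit interface ID.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    # fe80::/64 prefix (fe80 followed by 48 zero bits) plus the interface ID.
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + interface_id)


# Made-up MAC address, not my modem's.
print(link_local_from_mac("00:11:22:33:44:55"))  # fe80::211:22ff:fe33:4455
```

Comparing the result against the next hop shown in the routing table is a quick way to confirm the route really points at the modem.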
The thing I had been worried about was the possibility that the upstream address in the router advertisement could change and that the gateway address could change as a result, but I stopped worrying when I saw that it was the local address based on the MAC. Once I realized this I created an upstream gateway and set the modem's fe80:: autoconfig address as the next hop, and my IPv6 has continued to work even past the normal RA expiration time at which it was breaking before.
What I suspect was the problem
I think it is just a little bit of luck that made it work in the first place. My modem gives me precious little information and I struggled to identify if it was even binding any kind of an address in the /64 IPv6 subnet I had defined. I suspect that when OPNsense was configuring my WAN interface it respected a router advertisement it received while configuring the interface, but when it eventually expired no additional advertisements were accepted because the interface was in static IPv6 configuration mode. By finding a reasonable next hop address for my modem and configuring that as the IPv6 Upstream Gateway on my WAN interface I believe that it is just asserting a correct gateway route instead of temporarily getting a next hop and forgetting once it expires.
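To make that suspicion concrete, here is a toy model of the difference between a default route learned from an RA and one configured statically. This is only an illustration of my guess, not how OPNsense or the FreeBSD kernel actually tracks routers; the class, the field names, and the next hop address are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DefaultRoute:
    next_hop: str
    learned_at: float          # seconds since some starting point
    lifetime: Optional[float]  # seconds; None means statically configured

    def is_valid(self, now: float) -> bool:
        # A static route never times out; a learned one dies unless refreshed.
        return self.lifetime is None or now < self.learned_at + self.lifetime


RA_LIFETIME = 30 * 60  # the 30-minute RA expiration set on the modem

# Route picked up from a single RA while the interface was being configured.
learned = DefaultRoute("fe80::211:22ff:fe33:4455", learned_at=0.0, lifetime=RA_LIFETIME)
# Route asserted by an explicitly configured upstream gateway.
static = DefaultRoute("fe80::211:22ff:fe33:4455", learned_at=0.0, lifetime=None)

for minutes in (10, 29, 31):
    now = minutes * 60
    print(minutes, learned.is_valid(now), static.is_valid(now))
```

In this model the learned route is gone at the 31-minute mark because nothing ever refreshes it, while the static one stays, which matches what I saw.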
Isn't it always great when you learn some fundamentals while actually just fixing an issue? :)
Quote from: skruger on January 25, 2024, 11:56:14 PM
I looked at the routing table again and noticed that the default route was just pointing to the fe80::/64 local network autoconfig address that is based on my modem's mac address.
Router Advertisements never explicitly contain a gateway address. Instead, the RA's source address is considered to be the gateway address. And the RA's source address is the link-local address of the interface that sent it. So gateways advertised via RAs are always link-local.
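As a rough sketch of that rule: the next hop is taken from the RA's source address, and that source has to be link-local. The function name below is hypothetical, and the addresses are examples only.

```python
import ipaddress


def gateway_from_ra_source(source: str) -> ipaddress.IPv6Address:
    """Return the default gateway implied by an RA's source address."""
    addr = ipaddress.IPv6Address(source)
    # RAs are sent from a link-local source, so anything else is rejected here.
    if not addr.is_link_local:
        raise ValueError(f"{addr} is not link-local, cannot be an RA source")
    return addr


print(gateway_from_ra_source("fe80::211:22ff:fe33:4455"))  # fe80::211:22ff:fe33:4455

try:
    gateway_from_ra_source("2001:db8:0:ff00::1")  # global address, rejected
except ValueError as exc:
    print(exc)
```

When entering such a gateway by hand, keep in mind that a link-local address is only meaningful together with the interface it lives on.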
Quote from: skruger on January 25, 2024, 11:56:14 PM
Once I realized this I created an upstream gateway and set my router's fe80:: autoconfig address as the next hop and my IPv6 has continued to work even after the normal RA expiration time when it was breaking before.
That's exactly how you're supposed to configure it. Dynamic gateways on static interfaces aren't really supported in OPNsense. You'd have to set the interface to SLAAC mode if you wanted to learn the gateway dynamically. But in your case, I'd go with a fully static setup, too.
May I ask why you're using that Zyxel box in the first place? You keep mentioning "modem", but it seems you're only using it as an additional router. Why not connect OPNsense directly to the ONU?
Cheers
Maurice
I'm using the Zyxel box because it makes my life easier if I have to contact CenturyLink support. Having their box terminate the final handoff of the fiber connection means that the hardware they supplied and provisioned is the one doing the PPPoE authentication. As a matter of convenience and ease of support, I want that device owning the gateway for my public subnet so that everything else can simply be a static IP configuration in IPv4 or IPv6.
I have found that trying to replace the service provider's modem, instead of letting it be the PPPoE terminator and public address space gateway, just leads to headaches. Many years ago I tried to help someone whose pfSense box was trying to own the PPPoE session instead of letting the provider's equipment do it, and there was nothing but suffering and finger pointing about whose equipment was doing it wrong.