OPNsense Forum

Archive => 19.1 Legacy Series => Topic started by: cpw on May 23, 2019, 06:39:10 am

Title: Dual WAN IPV6 DHCP is broken
Post by: cpw on May 23, 2019, 06:39:10 am
Hi
So, documenting this for others to find afterwards..

If you have a multi-WAN setup where both WAN providers hand out IPv6 configuration via DHCP, exactly one interface will receive an IPv6 address; the others will not.

The problem lies in the dhcp6c code - specifically, it binds to the wildcard address. Each WAN interface requesting DHCPv6 gets its own dhcp6c process with its own configuration.

Code:
# sockstat -l | fgrep :546
root     dhcp6c     87111 6  udp6   *:546                 *:*
root     dhcp6c     80193 9  udp6   *:546                 *:*

Code:
# ps aux | fgrep dhcp6c
root    80193   0.0  0.0 1057796  2788  -  Is   17:13      0:00.07 /usr/local/sbin/dhcp6c -Dn -c /var/etc/dhcp6c_opt2.conf -p /var/run/dhcp6c_pppoe0.pid pppoe0
root    87111   0.0  0.0 1057796  2784  -  Is   22:33      0:00.04 /usr/local/sbin/dhcp6c -Dn -c /var/etc/dhcp6c_opt1.conf -p /var/run/dhcp6c_re1.pid re1

What happens is that one of these processes (lower PID?) always receives all of the configuration replies, even when it was the other instance that requested them:

Code:
May 22 23:30:36 dhcp6c[87111]: Sending Solicit
May 22 23:30:36 dhcp6c[80193]: unexpected interface (2)

Investigating the code, the problem seems to lie around here: https://github.com/opnsense/dhcp6c/blob/101d0ee06a58c6ea099e736ebb5ea237f9e84e15/dhcp6c.c#L259

Specifically, I think getaddrinfo(NULL, ...) is going to bind to the wildcard address. Unfortunately, because the whole point of the bind is to receive configuration like "what's my address", it seems we can't easily bind to a specific address. On Linux, they use setsockopt(SO_BINDTODEVICE) to force binding to a specific device. Sadly, that option isn't available on the BSD stack.
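
A minimal sketch of the wildcard behaviour described above, written in Python rather than the C that dhcp6c uses. It assumes the lookup is a passive (AI_PASSIVE) UDP/IPv6 one, which is an assumption about the linked code and not something confirmed in this thread. The function name is hypothetical. With no host given, the resolved address should be the IPv6 wildcard, which is why every dhcp6c instance shows up as *:546 in sockstat.

```python
import socket


def passive_udp6_sockaddr(port):
    """Resolve where a passive UDP/IPv6 socket would bind when no host is given.

    Mirrors a getaddrinfo(NULL, port, ...) call. AI_PASSIVE is an assumption
    about how dhcp6c sets up its hints.
    """
    infos = socket.getaddrinfo(
        None,
        port,
        family=socket.AF_INET6,
        type=socket.SOCK_DGRAM,
        flags=socket.AI_PASSIVE,
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET6 the sockaddr is (host, port, flowinfo, scope_id).
    return infos[0][4]


if __name__ == "__main__":
    host, port = passive_udp6_sockaddr(546)[:2]
    # No socket is opened here, so this does not need root.
    print(host, port)
```

Because the address carries no interface information, nothing in the bind itself ties a given dhcp6c instance to its WAN interface.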

I would file an issue on GitHub, but I'm not sure what use it would be unless someone has an idea how to fix this (I can't see anything; it seems like a bit of an impasse to me).

I hope this helps someone stop scratching their head after many hours of frustration.
Title: Re: Dual WAN IPV6 DHCP is broken
Post by: franco on May 23, 2019, 10:23:39 am
It's not broken. It never worked in the first place. Sorry.


Cheers,
Franco
Title: Re: Dual WAN IPV6 DHCP is broken
Post by: cpw on May 23, 2019, 04:49:23 pm
Well, that's one solution to "How do I do multihome with IPv6". Perhaps we should add it to the list in RFC 7157 ;)

In seriousness, I've been digging and investigating. I wondered if there was a hint in the dhclient code - but nope, it looks like it relies on SO_BINDTODEVICE on Linux and is generally considered broken elsewhere. Except, it's not really.

One of the tricks they use in the dhclient code is that they rebind once they have an address. Perhaps that's the solution here, too? Once an interface gets an address, its dhcp6c client reopens the socket, bound to that address (this seems to be what dhclient does). I'm not 100% familiar with everything DHCP, but I think it should work, maybe? Is there a reason to keep listening on the wildcard address after you have an assigned address?

The only other solution I can think of is some sort of multiplexing - either rewrite the dhcp6c code to support managing multiple interfaces simultaneously (that looks like a complicated mess), or else write a proxy dispatcher that somehow can route to the right dhcp6c instance (that looks almost as complicated. Hmmm).

One thing I saw mentioned is that BSD has a concept of a per-process "routing table". Perhaps that might be another solution as well - I'm not familiar with BSD-specific network programming, so I don't have much insight other than what I can google up.

Of course, all of these solutions are probably a significant development effort, so I completely understand why it's unlikely to be fixed anytime soon.

One super janky solution I thought of was to just kill the "younger" process off occasionally. I'm presuming DHCPv6 is pretty fault tolerant. So perhaps every hour or so, just kill the one with the lower PID, which seems to be the one getting the traffic. Does OPNsense restart a killed dhcp6c process?
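
A rough sketch of the first half of that janky idea, assuming the sockstat -l column layout shown in the first post (user, command, PID, fd, protocol, local address, foreign address). The function name and the SAMPLE constant are hypothetical. It only picks a candidate PID and deliberately does not kill anything, since whether OPNsense restarts a killed dhcp6c is still an open question here.

```python
def dhcp6c_listener_pids(sockstat_output, port=546):
    """Return PIDs of dhcp6c processes listening on the wildcard address for `port`."""
    pids = []
    for line in sockstat_output.splitlines():
        fields = line.split()
        # Expected columns: USER COMMAND PID FD PROTO LOCAL FOREIGN
        if len(fields) < 7:
            continue
        command, pid, local = fields[1], fields[2], fields[5]
        if command == "dhcp6c" and local == f"*:{port}" and pid.isdigit():
            pids.append(int(pid))
    return pids


# Output copied from the sockstat listing earlier in the thread.
SAMPLE = """\
root     dhcp6c     87111 6  udp6   *:546                 *:*
root     dhcp6c     80193 9  udp6   *:546                 *:*
"""

if __name__ == "__main__":
    pids = dhcp6c_listener_pids(SAMPLE)
    # The lower PID is the instance that seemed to receive all the replies.
    print("listeners:", pids, "candidate:", min(pids))
```

On a live box the input would come from running sockstat -l, and the candidate would only be worth signalling once it is clear that something restarts it.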
Title: Re: Dual WAN IPV6 DHCP is broken
Post by: franco on May 23, 2019, 07:24:40 pm
Hi,

I didn't mean it as a solution. It's the current status quo. I know that Martin has voiced his interest in this particular subject and maybe the time has come to pull it off. The hard part is to get the people with the right kind of setup for this. Unfortunately, I'm only on a single line IPv6 and that's already a huge achievement in Germany. But that's another story entirely...

Quote
The only other solution I can think of is some sort of multiplexing - either rewrite the dhcp6c code to support managing multiple interfaces simultaneously (that looks like a complicated mess), or else write a proxy dispatcher that somehow can route to the right dhcp6c instance (that looks almost as complicated. Hmmm).

I would think that is more or less the right direction. Since the DUID is somehow bound to a dhcp6c process, the question is: can we get away with one DUID for both connections, does it need to be split into two processes, or shall we push the DUID into the configuration for a particular upstream interface and let one process handle both links?

We have the dhcp6c code ready to work with due to some other changes that Martin and his team did a while back, so in general there is no problem digging into dhcp6c internals to make that happen.

https://github.com/opnsense/dhcp6c

What's your skillset in this regard? I'll try to get Martin to provide feedback and maybe we have all the ingredients here to make it a reality. :)


Cheers,
Franco
Title: Re: Dual WAN IPV6 DHCP is broken
Post by: cpw on May 24, 2019, 05:53:09 am
Understood, I was being silly in my initial response. I'm happy to lend what knowledge I have. I have apparently got IPV6 on both of my internet connections (they're both from the same upstream provider - Teksavvy, via Cable and DSL).

As to the code - I did review that GitHub repository. I won't pretend to be a serious expert in networking, IPv6 (OPNsense was my attempt to finally jump on the IPv6 bandwagon!) or C, but I'm happy to deploy tests and patches.

I am interested in whether I can make a janky solution "sorta" work in the meantime, by killing each dhcp6c client process in turn to allow the other to work for a bit. This assumes that dhcp6c automatically restarts and that it's "first on the UDP port wins", not last.
Title: Re: Dual WAN IPV6 DHCP is broken
Post by: cpw on June 19, 2019, 10:43:41 pm
OK. Hardware has been upgraded, so now I want to revisit this. I've been doing some research, and I believe the proper solution is something based around setfib. It seems that FIBs are specifically designed to help with this kind of issue. In this case, we would need to somehow run each dhcp6c client with its own FIB, one per WAN interface. The hope is that this would give each dhcp6c the isolation it needs to not see the other guy on the other interface.
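
A sketch of what that could look like, assuming FreeBSD's setfib(1) is used as a wrapper (setfib fib utility [arguments]) and that the kernel has been configured with more than one FIB via the net.fibs tunable. The FIB numbers and the function name are hypothetical; the dhcp6c path and flags are copied from the ps output in the first post. It only builds the command lines and does not run them, and whether separate FIBs actually isolate the incoming DHCPv6 replies is untested.

```python
def dhcp6c_under_fib(fib, conf, pidfile, iface):
    """Build an argv list that would start dhcp6c with `fib` as its default FIB."""
    return [
        "setfib", str(fib),
        "/usr/local/sbin/dhcp6c", "-Dn",
        "-c", conf,
        "-p", pidfile,
        iface,
    ]


if __name__ == "__main__":
    # Hypothetical mapping: FIB 1 for re1, FIB 2 for pppoe0.
    wans = [
        (1, "/var/etc/dhcp6c_opt1.conf", "/var/run/dhcp6c_re1.pid", "re1"),
        (2, "/var/etc/dhcp6c_opt2.conf", "/var/run/dhcp6c_pppoe0.pid", "pppoe0"),
    ]
    for fib, conf, pidfile, iface in wans:
        print(" ".join(dhcp6c_under_fib(fib, conf, pidfile, iface)))
```

Each WAN interface would presumably also need to be assigned to the matching FIB (ifconfig has a fib parameter for this), which is outside what this sketch covers.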
Title: Re: Dual WAN IPV6 DHCP is broken
Post by: marjohn56 on June 21, 2019, 10:14:19 am
First thing to note is that you cannot have multiple dhcp6c instances. dhcp6c will do multiple WAN interfaces, but the trick is in setting up the dhcp6c config correctly. Currently, doing it manually is the only option, but it can be done. When I get some time (real work has taken all of my time for the last nine months, sigh!) I'll start looking at adding the extra stuff needed to the GUI and handling code.
Title: Re: Dual WAN IPV6 DHCP is broken
Post by: cpw on June 21, 2019, 09:42:56 pm
Could it be a matter of just merging the two configs that are generated (one for each interface)?
Title: Re: Dual WAN IPV6 DHCP is broken
Post by: marjohn56 on June 22, 2019, 09:50:13 am
Sort of.. but not quite. ;)


I'll hopefully have some time soon, and I also now have the hardware to emulate the dual DHCPv6 WAN setup.