No IPv6 address assigned on WAN interface / Router Advertisement ignored

Started by glasi, February 13, 2021, 10:02:23 PM

Quote from: marjohn56 on February 22, 2021, 01:57:32 AM
dhcp6c doesn't use RA messages. dhcp6c sends out a multicast solicit and waits for a dhcpv6 unicast advertise response from the server, it then sends a request to the server.  RS/RA is SLAAC, dhcp6 is a different beast altogether. It sometimes fails if the ISP does not have the 'other configuration' flag set in the original chit chat. dhcp6c is actually started at roughly the same time as rtsold, I'm talking milliseconds. Unlike pfsense the dhcp6c daemon used by opnsense is effectively resident and we rarely shut it down completely, having modified it heavily to allow us to use sighup to add/remove interfaces or config parameters.
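For anyone wanting to check the 'other configuration' flag in a captured RA: it sits in the flags byte of the ICMPv6 Router Advertisement header (RFC 4861), next to the 'managed' flag. A minimal sketch, assuming you already have the raw ICMPv6 payload bytes from a capture; the function name parse_ra_flags is just for illustration and is not part of dhcp6c or OPNsense.

```python
import struct

ICMPV6_ROUTER_ADVERT = 134  # ICMPv6 type for Router Advertisement (RFC 4861)


def parse_ra_flags(icmp6_payload: bytes) -> dict:
    """Extract the M and O flags from a raw ICMPv6 Router Advertisement."""
    if len(icmp6_payload) < 16:
        raise ValueError("too short for a Router Advertisement header")
    # First 8 bytes: type, code, checksum, cur hop limit, flags, router lifetime
    msg_type, _code, _cksum, _hop_limit, flags, lifetime = struct.unpack(
        "!BBHBBH", icmp6_payload[:8]
    )
    if msg_type != ICMPV6_ROUTER_ADVERT:
        raise ValueError(f"not a Router Advertisement (type {msg_type})")
    return {
        "managed": bool(flags & 0x80),       # M flag: addresses via DHCPv6
        "other_config": bool(flags & 0x40),  # O flag: other info via DHCPv6
        "router_lifetime": lifetime,         # seconds
    }


# Hand-built sample header (not real ISP data): O flag only, lifetime 1800 s
sample = struct.pack("!BBHBBHII", 134, 0, 0, 64, 0x40, 1800, 0, 0)
print(parse_ra_flags(sample))
```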

Thank you for the detailed explanations.

Quote from: schnipp on February 23, 2021, 06:30:43 PM
For me it looks like there is a sequence of actions without proper synchronization during the dial-in procedure. But, currently it's only a guess. I can try to investigate a little bit more and in case there is an issue raise an appropriate github ticket.

As mentioned, OPNsense is doing a lot of stuff during startup. According to my observations it seems that the accept_rtadv flag is set quite late for the pppoe interface. Maybe that is a reason why OPNsense misses out on the ISP's RA messages?

I look forward to your analysis.

The acceptrtadv flag is set before rtsold sends out its first solicit. It's in the same function in interfaces.inc, so that's not what's causing your issue. Some ISPs send out RAs before they get an RS, so don't get fooled by that.
OPNsense 24.7 - Qotom Q355G4 - ISP - Squirrel 1Gbps.

Team Rebellion Member

Had the same problem, can confirm the patch works. Thanks a lot!

As already mentioned, the following two scenarios don't have any causal dependency, thus we need to handle them separately.


  • unsolicited router advertisements (immediately) sent by the ISP after the CPE link comes up (pppoe dial-in)
  • solicited router advertisements sent by the ISP after requested by router solicit messages

Regarding the first scenario, router advertisement messages received by opnsense are ignored for a couple of minutes after booting the machine. This is caused by opnsense itself, because the pppoe dial-in is performed before the network is fully set up. Internal tests showed that the "ACCEPT_RTADV" flag is added to the pppoe interface around 2 minutes after the dial-in to the ISP has been performed. The better way is to finish the network setup before the link to the ISP is established. This prevents losing unsolicited router advertisement messages.
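The timing problem in the first scenario can be put into a few lines. This is only a toy model: the times are assumptions (an unsolicited RA about 1 s after dial-in, the flag about 120 s after dial-in as in the tests above), and accepted_ras is a made-up helper, not anything from the OPNsense code.

```python
def accepted_ras(ra_arrival_times, flag_set_time):
    """Return the RA arrival times that fall on or after the moment
    accept_rtadv is set; anything earlier is silently dropped."""
    return [t for t in ra_arrival_times if t >= flag_set_time]


# Times are seconds after the pppoe dial-in succeeded (assumed values).
unsolicited = [1]

# Flag set ~2 minutes after dial-in: the unsolicited RA is lost.
print(accepted_ras(unsolicited, flag_set_time=120))

# Flag set before the link comes up: the same RA is accepted.
print(accepted_ras(unsolicited, flag_set_time=0))
```

If the ISP does not repeat the RA for a while, the first case leaves the interface without a prefix until a solicited RA arrives.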

The second scenario has something to do with the rtsol daemon (rtsold) and its behavior in sending router solicit messages to the correct interface. So far I haven't looked into that. But again, both scenarios are independent of each other while serving the same goal: both contribute to obtaining an IPv6 address prefix.

I think improving this situation is needed, because at least in Germany a lot of DSL connections still use pppoe encapsulation and some of them also suffer from 24h disconnects initiated by their ISP. Eventually we need to create appropriate github tickets, but first I'd like to discuss the basics in this forum.

Release: OPNsense 21.1.1-amd64
OPNsense 24.7.11_2-amd64

Scenario 1 is likely caused by deferring the IPv6 setup until the IPv4 connectivity is ready because "Use IPv4 connectivity" is checked.

https://github.com/opnsense/core/blob/70f856bf2fa9d2b1e5fc11e0b7e5bbfae04be92e/src/etc/inc/interfaces.inc#L2455

I don't recall that this ever was different. WAN comes up during boot but is not allowed to go through because it messes with the boot sequence. The WAN events are handled after boot is complete, but for PPPoE that is "too late", at least the ISP sees it that way.

https://github.com/opnsense/core/blob/70f856bf2fa9d2b1e5fc11e0b7e5bbfae04be92e/src/etc/rc.newwanip#L44
https://github.com/opnsense/core/blob/70f856bf2fa9d2b1e5fc11e0b7e5bbfae04be92e/src/etc/rc.newwanipv6#L43
https://github.com/opnsense/core/blob/master/src/etc/rc.syshook.d/start/10-newwanip

We can't assume the ISP will send an RA as soon as the IPv4 connectivity comes up, or can we? How should we and/or the ISP know we are ready for it?


Cheers,
Franco


Quote from: franco on February 25, 2021, 07:28:22 PM
Scenario 1 is likely caused by deferring the IPv6 setup until the IPv4 connectivity is ready because "Use IPv4 connectivity" is checked.

Thanks for the links to the code snippets. In my eyes we should first discuss what this should look like from a functional perspective.

Quote from: franco on February 25, 2021, 07:28:22 PM
I don't recall that this ever was different. WAN comes up during boot but is not allowed to go through because it messes with the boot sequence. The WAN events are handled after boot is complete, but for PPPoE that is "too late", at least the ISP sees it that way.

The description of "Use IPv4 connectivity" and its explanation in the docs is one thing which confuses me. IPv6 requests are not sent over the IPv4 connectivity layer but encapsulated in ppp. It may sound a little bit pedantic but the description can lead to misunderstanding. As far as I understood by activating this function IPv6 packets are sent over the ppp connectivity link and otherwise directly over the parent interface (e.g. the corresponding ethernet interface). Did I understand correctly?

I don't understand what you mean by "messes with the boot sequence". The WAN link should only come up after the interface itself is successfully prepared. Otherwise it's clear that we'll lose control messages for establishing the ip link.

Quote from: franco on February 25, 2021, 07:28:22 PM
We can't assume the ISP will send an RA as soon as the IPv4 connectivity comes up, or can we? How should we and/or the ISP know we are ready for it?

I believe this question cannot be answered with either 'yes' or 'no' for all situations. I know some fiber modems are capable of ethernet link monitoring on the customer's side, performed by the ISP. Whether it is used to control the network connection is unknown. I have observed that ISPs like Deutsche Telekom and 1&1 are doing something similar on the pppoe layer. They immediately send RA messages after ppp dial-in succeeds.

So, we should assume they (ISPs) can. In case an ISP sends unsolicited RA messages we are happy, otherwise it doesn't matter and we do a fallback to router solicit messages. We can only win  :).
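The "we can only win" logic could look roughly like the sketch below. It is an illustration only, not how rtsold is implemented: wait_for_ra and send_rs are hypothetical callbacks, and the initial listen-only window is my own assumption. The retry count and interval are the RFC 4861 defaults (MAX_RTR_SOLICITATIONS and RTR_SOLICITATION_INTERVAL).

```python
MAX_RTR_SOLICITATIONS = 3        # RFC 4861 default
RTR_SOLICITATION_INTERVAL = 4.0  # seconds, RFC 4861 default


def obtain_ra(wait_for_ra, send_rs):
    """Accept an unsolicited RA if one shows up, otherwise solicit.

    wait_for_ra(timeout) -> RA object or None  (hypothetical callback)
    send_rs()            -> None               (hypothetical callback)
    Returns (ra, number_of_solicits_sent).
    """
    # Listen first: an ISP that sends RAs right after dial-in wins here.
    ra = wait_for_ra(RTR_SOLICITATION_INTERVAL)
    if ra is not None:
        return ra, 0
    # Fallback: ask explicitly, a limited number of times.
    for attempt in range(1, MAX_RTR_SOLICITATIONS + 1):
        send_rs()
        ra = wait_for_ra(RTR_SOLICITATION_INTERVAL)
        if ra is not None:
            return ra, attempt
    return None, MAX_RTR_SOLICITATIONS


# Fake ISP that stays silent until the second solicit.
replies = iter([None, None, "RA"])
sent = []
ra, solicits = obtain_ra(lambda timeout: next(replies), lambda: sent.append("RS"))
print(ra, solicits)
```

Either path ends with the same result; the unsolicited RA just gets there without a single RS being sent.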


Quote from: schnipp on February 26, 2021, 05:22:13 PM
The description of "Use IPv4 connectivity" and its explanation in the docs is one thing which confuses me. IPv6 requests are not sent over the IPv4 connectivity layer but encapsulated in ppp. It may sound a little bit pedantic but the description can lead to misunderstanding. As far as I understood by activating this function IPv6 packets are sent over the ppp connectivity link and otherwise directly over the parent interface (e.g. the corresponding ethernet interface). Did I understand correctly?

Correct. It's one of those things that is neither wrong nor right and thus may lead to false interpretation.

Quote from: schnipp on February 26, 2021, 05:22:13 PM
I don't understand what you mean by "messes with the boot sequence". The WAN link should only come up after the interface itself is successfully prepared. Otherwise it's clear that we'll lose control messages for establishing the ip link.

When we talk about DHCP and the boot sequence, obtaining an address and configuring the services (routing, firewall) for it takes an asynchronous event, which messes with the boot sequence because it cannot tolerate undefined concurrent behaviour. For this reason pfSense a long time ago prohibited many async paths during the boot sequence, using a file designating that the boot sequence is in progress.
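To illustrate the idea of such a boot marker (not the actual pfSense or OPNsense implementation, which is PHP and shell): while a marker file exists, asynchronous WAN events are queued instead of handled, and they are replayed once boot finishes. The class name BootGate and the marker path are made up for this sketch.

```python
import os
import tempfile


class BootGate:
    """Defer asynchronous events while a 'booting' marker file exists."""

    def __init__(self, marker_path):
        self.marker_path = marker_path
        self.deferred = []

    def booting(self):
        return os.path.exists(self.marker_path)

    def handle(self, event, handler):
        """Run handler now, or queue it if boot is still in progress."""
        if self.booting():
            self.deferred.append((event, handler))
            return False
        handler(event)
        return True

    def finish_boot(self):
        """Remove the marker and replay everything that was deferred."""
        if os.path.exists(self.marker_path):
            os.remove(self.marker_path)
        pending, self.deferred = self.deferred, []
        for event, handler in pending:
            handler(event)


with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "booting")  # hypothetical marker file
    open(marker, "w").close()
    gate = BootGate(marker)
    handled = []
    gate.handle("pppoe0 link up", handled.append)
    print(handled)   # still empty: the event was deferred
    gate.finish_boot()
    print(handled)   # the deferred event has now been replayed
```

The downside discussed in this thread is visible here too: whatever the ISP sent between "link up" and finish_boot() is not seen until the replay, which for PPPoE can be too late.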

Quote from: schnipp on February 26, 2021, 05:22:13 PM
I believe this question cannot be answered with either 'yes' or 'no' for all situations. I know some fiber modems are capable of ethernet link monitoring on the customer's side, performed by the ISP. Whether it is used to control the network connection is unknown. I have observed that ISPs like Deutsche Telekom and 1&1 are doing something similar on the pppoe layer. They immediately send RA messages after ppp dial-in succeeds.

So, we should assume they (ISPs) can. In case an ISP sends unsolicited RA messages we are happy, otherwise it doesn't matter and we do a fallback to router solicit messages. We can only win  :).

I revised the patch a bit so that basically we get ready for dealing with SLAAC as soon as PPPoE is "dialling" and created its interface. If that is not soon enough we are still screwed since we cannot push the accept_rtadv flag to a nonexistent interface, but I expect some sort of grace period for the ISP RA so testing the theory is the best approach.

https://github.com/opnsense/core/commit/a0248c7e
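The constraint mentioned above (no flag on a nonexistent interface) boils down to "wait until the interface shows up, then act". A rough sketch of that idea follows; it is not what the commit does internally, and wait_for_interface with its parameters is hypothetical. On a real system the interface list could come from something like `[name for _, name in socket.if_nameindex()]`.

```python
import time


def wait_for_interface(name, list_interfaces, timeout=10.0, poll=0.1,
                       sleep=time.sleep):
    """Poll until `name` appears in list_interfaces() or timeout expires.

    list_interfaces is a callable returning the current interface names.
    Returns True if the interface appeared, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if name in list_interfaces():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll)


# Simulated system: pppoe0 is created on the third look.
snapshots = iter([["em0"], ["em0"], ["em0", "pppoe0"]])
appeared = wait_for_interface("pppoe0", lambda: next(snapshots),
                              sleep=lambda seconds: None)
print(appeared)
```

Only after this returns True would it make sense to set accept_rtadv; any RA that arrived before that point is still lost, which is the remaining risk described above.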


Cheers,
Franco

FYI - From a reboot to complete configuration on dual wan system, one dhcp/dhcp6 one PPPoE/dhcp6, One LAN and three VLANs all tracking respective WANs ( 3 on dhcp WAN, one on PPPoE WAN ) takes my test system 21 seconds.

Quote from: franco on February 26, 2021, 08:26:06 PM

I revised the patch a bit so that basically we get ready for dealing with SLAAC as soon as PPPoE is "dialling" and created its interface. If that is not soon enough we are still screwed since we cannot push the accept_rtadv flag to a nonexistent interface, but I expect some sort of grace period for the ISP RA so testing the theory is the best approach.

https://github.com/opnsense/core/commit/a0248c7e


I have updated to the latest version of opnsense (21.1.2-amd64) and applied both patches (943db279 and a0248c7e). The first test looks fine. Now, the IPv6 address is immediately assigned to the pppoe interface after dial-up has succeeded. The respective IPv6 DynDNS entries are also updated in time. I'll do some further tests and report here in case there is something new.

@Franco: Thanks for the patch  :).

Quote from: marjohn56 on February 27, 2021, 11:52:37 AM
FYI - From a reboot to complete configuration on dual wan system, one dhcp/dhcp6 one PPPoE/dhcp6, One LAN and three VLANs all tracking respective WANs ( 3 on dhcp WAN, one on PPPoE WAN ) takes my test system 21 seconds.

Thank you for the information. The startup time mainly depends on the specific system configuration and the modules used, so it's not really comparable. Startup of my opnsense takes several minutes. I have seen that much of the time is consumed by unbound (updating its blacklists?) and squid (updating its blacklists?). This needs some further investigation and can be discussed in separate threads if needed.

@schnipp, thanks a lot! Landed as an official commit via https://github.com/opnsense/core/commit/f1afe998ad

We will do a bit of regression testing but it can go into a later stable update, maybe 21.1.4 or 21.1.5.


Cheers,
Franco

Hi Franco,

I wanna test the new behavior, but not sure which patch(es) to apply on my OPNsense 21.1.2? All of them (943db279 and a0248c7e)?

Regards
Torsten