After three years of pretty much flawless operation I encountered an issue I'm not able to solve myself.
Since my upgrade to 22.7 I keep losing IPv6, intermittently.
The service is DS-Lite. After a release/reload of the interface or a manual restart of dhcpd6, IPv6 works fine, but it stops again after a while.
start:
Mon Aug 15 12:47:48 PM CEST 2022 64 bytes from 2001:4860:4860::8888: icmp_seq=1 ttl=120 time=9.35 ms
stop:
Mon Aug 15 01:44:10 PM CEST 2022 64 bytes from 2001:4860:4860::8888: icmp_seq=3280 ttl=120 time=7.74 ms
These are the events I see in the log at the given time:
2022-08-15T13:44:17 1 Error opnsense 11210 /usr/local/etc/rc.resolv_conf_generate: The command '/sbin/route add -host -'inet6' '2001:4860:4860::8888' 'xxxx' returned exit code '71', the output was 'route: xxxx: Name does not resolve'
2022-08-15T13:44:16 1 Error opnsense 99065 /usr/local/etc/rc.resolv_conf_generate: The command '/sbin/route add -host -'inet6' '2001:4860:4860::8888' 'xxxx' returned exit code '71', the output was 'route: xxxx: Name does not resolve'
2022-08-15T13:44:12 1 Notice dhcp6c 8274 dhcp6c RELEASE on igb0 - running dns reload
2022-08-15T13:44:11 1 Notice dhcp6c 94544 dhcp6c RELEASE on igb0 - running dns reload
2022-08-15T13:44:11 1 Notice dhcp6c 89645 RTSOLD script - Sending SIGHUP to dhcp6c
2022-08-15T13:44:10 1 Notice opnsense 64287 plugins_configure hosts (execute task : unbound_hosts_generate())
2022-08-15T13:44:10 1 Notice opnsense 64287 plugins_configure hosts (execute task : dnsmasq_hosts_generate())
2022-08-15T13:44:10 1 Notice opnsense 64287 plugins_configure hosts ()
2022-08-15T13:44:10 1 Error opnsense 64287 /usr/local/etc/rc.newwanip: The command '/sbin/route add -host -'inet6' '2001:4860:4860::8888' 'xxxx' returned exit code '71', the output was 'route: xxxx: Name does not resolve'
2022-08-15T13:44:10 1 Error opnsense 64287 /usr/local/etc/rc.newwanip: On (IP address: 100.xxx.xxx.xxx) (interface: WAN[wan]) (real interface: igb0).
2022-08-15T13:44:10 1 Error opnsense 64287 /usr/local/etc/rc.newwanip: IPv4 renewal is starting on 'igb0'
Then, after 30 minutes without IPv6, the following happens. Once rc.newwanipv6 is done, IPv6 works again and the gateway is online:
2022-08-15T14:14:15 1 Error opnsense 52540 /usr/local/etc/rc.newwanipv6: On (IP address: xxxx) (interface: WAN[wan]) (real interface: igb0).
2022-08-15T14:14:15 1 Error opnsense 52540 /usr/local/etc/rc.newwanipv6: IPv6 renewal is starting on 'igb0'
2022-08-15T14:14:14 1 Notice dhcp6c 34434 dhcp6c REQUEST on igb0 - running newipv6
2022-08-15T14:14:12 1 Notice dhcp6c 16796 RTSOLD script - Sending SIGHUP to dhcp6c
2022-08-15T14:14:10 1 Notice dhcp6c 11598 RTSOLD script - Sending SIGHUP to dhcp6c
2022-08-15T14:14:10 1 Notice opnsense 89525 plugins_configure hosts (execute task : unbound_hosts_generate())
2022-08-15T14:14:10 1 Notice opnsense 89525 plugins_configure hosts (execute task : dnsmasq_hosts_generate())
2022-08-15T14:14:10 1 Notice opnsense 89525 plugins_configure hosts ()
2022-08-15T14:14:10 1 Error opnsense 89525 /usr/local/etc/rc.newwanip: The command '/sbin/route add -host -'inet6' '2001:4860:4860::8888' 'xxxx' returned exit code '71', the output was 'route: xxxx: Name does not resolve'
2022-08-15T14:14:10 1 Error opnsense 89525 /usr/local/etc/rc.newwanip: On (IP address: 100.115.53.238) (interface: WAN[wan]) (real interface: igb0).
2022-08-15T14:14:10 1 Error opnsense 89525 /usr/local/etc/rc.newwanip: IPv4 renewal is starting on 'igb0'
2022-08-15T14:14:09 1 Notice dhclient 84371 Creating resolv.conf
It's a multi-WAN setup with dpinger for gateway monitoring.
This behavior is really strange. Any pointers or assistance would be most appreciated.
Hi,
So what's 'xxxx' here? It is supposed to be a gateway but looks like an FQDN.
For dslite, do you use any custom patching?
Cheers,
Franco
Quote from: franco on August 15, 2022, 03:22:39 PM
So what's 'xxxx' here? It is supposed to be a gateway but looks like an FQDN.
For dslite, do you use any custom patching?
I sanitized the log output. It's the public address of my IPv6 gateway, which I string replaced with xxxx.
No custom patching, one interface to my ISP's NT, IPv4 DHCP, IPv6 DHCPv6 and "Use IPv4 connectivity".
Would you mind sending this to me privately for inspection? Just that "The command '/sbin/route add -host [...]" line. Maybe there is a stray character in there or something else I'm missing, but it's impossible to decide on the next step without it.
Cheers,
Franco
Quote from: franco on August 15, 2022, 03:36:29 PM
Would you mind sending this to me privately for inspection? Just that "The command '/sbin/route add -host [...]" line. Maybe there is a stray character in there or something else I'm missing, but it's impossible to decide on the next step without it.
Thanks for the initiative. I sent you a PM with the details.
Hi, thanks :)
The interface name after the scope sign "%" is missing in this particular case.
I'm assuming this is the following code failing:
https://github.com/opnsense/core/blob/068ef7106dbc593542b710e8cab4bb5169eb96aa/src/etc/inc/system.inc#L369
But at first glance I don't see why $intf shouldn't be populated.
Are you able to edit the file yourself and try to find out? The file is /usr/local/etc/inc/system.inc
I think running the /usr/local/etc/rc.resolv_conf_generate script should trigger the behaviour.
Cheers,
Franco
I tried to review this a bit, but quickly reached my skill ceiling. I can edit the files and test, but I definitely need help with troubleshooting the script.
Quote from: CrackalackingZ on August 15, 2022, 03:33:44 PM
No custom patching, one interface to my ISP's NT, IPv4 DHCP, IPv6 DHCPv6 and "Use IPv4 connectivity".
Just a wild guess: "Use IPv4 connectivity" is only meant for PPP, which you don't seem to use. Any improvement if you disable this?
Cheers
Maurice
Quote from: Maurice on August 15, 2022, 06:03:46 PM
Quote from: CrackalackingZ on August 15, 2022, 03:33:44 PM
No custom patching, one interface to my ISP's NT, IPv4 DHCP, IPv6 DHCPv6 and "Use IPv4 connectivity".
Just a wild guess: "Use IPv4 connectivity" is only meant for PPP, which you don't seem to use. Any improvement if you disable this?
Cheers
Maurice
Cheers, I disabled it. Gateway came up, but I doubt this will stick.
I looked at the log after the change and once again saw:
2022-08-15T18:20:15 1 Error opnsense 90129 /usr/local/etc/rc.newwanipv6: The command '/sbin/route add -host -'inet6' '2001:4860:4860::8888' '<my-ipv6-gw>%'' returned exit code '71', the output was 'route: <my-ipv6-gw>%: Name does not resolve'
It seems that one of the provisioning steps either needlessly leaves the % or fails to add the interface string, because rc.newwanipv6 adds this to the log just a second later:
2022-08-15T18:20:16 1 Error opnsense 90129 /usr/local/etc/rc.newwanipv6: Adding static route for monitor 2001:4860:4860::8888 via <my-ipv6-gw>%igb0
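To make the difference between those two log lines concrete, here is a minimal shell sketch that classifies a gateway string by what follows the "%". The function name check_scoped_gw and the fe80::1 address are placeholders made up for illustration; this is not the code OPNsense itself uses.

```shell
#!/bin/sh
# Hypothetical helper (not part of OPNsense): classify a gateway string
# by its IPv6 scope suffix, i.e. whatever follows the '%' sign.
check_scoped_gw() {
    case "$1" in
        *%)  echo "dangling-scope" ;;  # ends in '%', interface name missing
        *%*) echo "scoped" ;;          # '%' followed by an interface name
        *)   echo "unscoped" ;;        # no scope sign at all
    esac
}

# fe80::1 is a placeholder address, not the real gateway.
check_scoped_gw 'fe80::1%'       # prints: dangling-scope
check_scoped_gw 'fe80::1%igb0'   # prints: scoped
check_scoped_gw '2001:db8::1'    # prints: unscoped
```

The first case corresponds to the failing "route add" line, the second to the "Adding static route for monitor" line.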
Most likely there is an overlap here between gateway monitoring and DNS routes...
Here is a test script to see if the DNS routes are the issue with the missing interface:
# cat test.php
<?php
include 'config.inc';
include 'util.inc';
include 'interfaces.inc';
include 'system.inc';
var_dump(get_nameservers(null, true));
# php -f test.php
[output to send via PM]
Cheers,
Franco
Cheers, Franco. Output sent via PM.
Found it :>
https://github.com/opnsense/core/commit/c9bdc3d1624
# opnsense-patch c9bdc3d1624
So the bug itself was added in https://github.com/opnsense/core/commit/a6340f80321 back in 2019 but stayed dormant until we started using it for get_nameservers for consistent output. Previously, different host routes were added in a non-deterministic fashion; now this is easier to reproduce. I suppose that, since you have DNS servers with a gateway and the gateway monitoring uses the same DNS servers, the order flipped from "gateway host route wins" to "DNS server host route wins", and the latter had been broken for quite a while.
Cheers,
Franco
Nice! Thanks for the effort, Franco :D
Maurice's suggestion seems to have helped too. I left a ping running after disabling "Use IPv4 connectivity" for the problematic WAN interface.
ping 2001:4860:4860::8888 | while read -r l; do echo "$(date) $l"; done
Mon Aug 15 06:26:37 PM CEST 2022 64 bytes from 2001:4860:4860::8888: icmp_seq=1 ttl=120 time=9.63 ms
I haven't lost IPv6 since; it's still going strong ...
Mon Aug 15 10:29:05 PM CEST 2022 64 bytes from 2001:4860:4860::8888: icmp_seq=14532 ttl=120 time=8.36 ms
EDIT: next day, still stable
Tue Aug 16 08:49:13 AM CEST 2022 64 bytes from 2001:4860:4860::8888: icmp_seq=51692 ttl=120 time=7.28 ms
Weird, I have probably had "Use IPv4 connectivity" enabled since I set up this install a couple of years ago.
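For anyone repeating this kind of long-running ping test: rather than eyeballing the output, the saved lines can be scanned for jumps in icmp_seq. This is only a rough sketch, assuming the timestamped ping output was saved to a file; find_seq_gaps and ping6.log are made-up names.

```shell
#!/bin/sh
# Hypothetical helper: read timestamped ping lines on stdin and report
# every jump in icmp_seq, i.e. a stretch where replies went missing.
find_seq_gaps() {
    awk '
        match($0, /icmp_seq=[0-9]+/) {
            # "icmp_seq=" is 9 characters long; the rest is the number
            seq = substr($0, RSTART + 9, RLENGTH - 9) + 0
            if (prev != "" && seq > prev + 1)
                printf "gap: %d replies missing after icmp_seq=%d\n", seq - prev - 1, prev
            prev = seq
        }
    '
}

# Usage, assuming the output of the ping loop was saved to ping6.log:
#   find_seq_gaps < ping6.log
```

No output means every reply arrived in sequence. A ping that never recovers will not show up as a gap, so the last timestamp in the file is still worth a look.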