OPNsense Forum

Archive => 22.1 Legacy Series => Topic started by: MenschAergereDichNicht on January 17, 2022, 04:49:34 PM

Title: [SOLVED] Multi-WAN
Post by: MenschAergereDichNicht on January 17, 2022, 04:49:34 PM
Hi,

I am not sure whether my problem is specific to the release candidate or whether this is expected behaviour, as this is the first time I have tried a multi-WAN configuration (using RC1, obviously).

That said i'll try to describe the situation.

I have a multi-WAN setup with one WAN interface connected via fibre. This WAN interface uses a static IPv4 address and DHCP for IPv6.
I have a second WAN interface that is intended to work with a nano-router, which connects over Wi-Fi to the LTE connection of my mobile phone. This interface gets a dynamic IPv4 address only.

If everything is connected it seems to work fine. But if I power off the nano-router and the WAN2 interface loses its Ethernet link to it, the WAN2_GW becomes "defunct" and all references to WAN2_GW are removed, e.g. from the "System-Settings-General" DNS server settings, where I had specified the gateway for each entry.
I guess that the gateway is re-activated when I power the nano-router back on. But what about the DNS gateway settings? I exported (backed up) the configuration and looked into the .xml. It seems that the entries are completely wiped out.

Do I have to make those changes (assign WAN2_GW to the DNS server entries) every time I activate the nano-router?
Title: Re: Multi-WAN
Post by: MenschAergereDichNicht on January 17, 2022, 05:31:06 PM
Update:

I tried to reproduce the behaviour, but I am currently not able to do so. Because I tried lots of things, I am not sure about the exact workflow where the problem occurred.
If it happens again I will report back.
Title: Re: Multi-WAN
Post by: franco on January 17, 2022, 05:44:49 PM
I think this might be related to https://forum.opnsense.org/index.php?topic=26341.0

Can you add a System: Settings: Tunables entry "net.route.multipath" with value "0"? Best to reboot afterwards to avoid the situation where you already have two default routes stuck in the system.


Cheers,
Franco
Title: Re: Multi-WAN
Post by: MenschAergereDichNicht on January 17, 2022, 06:43:15 PM
Thanks for the feedback. I will try the tunable when I can reboot the router without getting my wife mad at me.
Title: Re: Multi-WAN
Post by: franco on January 17, 2022, 07:49:28 PM
> without getting my wife mad at me

If you manage to find a way please do tell.  :)


Cheers,
Franco
Title: Re: Multi-WAN
Post by: chemlud on January 17, 2022, 08:03:58 PM
Quote from: franco on January 17, 2022, 07:49:28 PM
> > without getting my wife mad at me
>
> If you manage to find a way please do tell.  :)
>
> Cheers,
> Franco

Reboot time 03:00 a.m. is frequently a good choice in this situation...
Title: Re: Multi-WAN
Post by: MenschAergereDichNicht on January 17, 2022, 08:06:07 PM
Well... Actually i was able to reboot a little bit earlier than that :-)

Ok. I did the following:


After that the load went up and Unbound was at the top of the CPU usage.
So it does not seem to help.

The Unbound log file is still empty afterwards but the general system log contains the following entries:

2022-01-17T19:48:46 Error opnsense /usr/local/etc/rc.newwanipv6: Choose to bind WAN2_DHCP on 0.0.0.0 since we could not find a proper match.
2022-01-17T19:48:46 Error opnsense /usr/local/etc/rc.newwanipv6: Adding static route for monitor 1.1.1.1 via 192.168.69.1
2022-01-17T19:48:46 Error opnsense /usr/local/etc/rc.newwanipv6: Removing static route for monitor 1.1.1.1 via 192.168.69.1
2022-01-17T19:48:46 Error opnsense /usr/local/etc/rc.newwanipv6: Adding static route for monitor 2606:4700:4700::1111 via fe80::eadf:70ff:fe7a:23da%igb3
2022-01-17T19:48:46 Error opnsense /usr/local/etc/rc.newwanipv6: Removing static route for monitor 2606:4700:4700::1111 via fe80::eadf:70ff:fe7a:23da%igb3
2022-01-17T19:48:46 Error opnsense /usr/local/etc/rc.newwanipv6: ROUTING: keeping current default gateway 'fe80::eadf:70ff:fe7a:23da%igb3'
2022-01-17T19:48:46 Error opnsense /usr/local/etc/rc.newwanipv6: ROUTING: setting IPv6 default route to fe80::eadf:70ff:fe7a:23da
2022-01-17T19:48:46 Error opnsense /usr/local/etc/rc.newwanipv6: ROUTING: IPv6 default gateway set to wan
2022-01-17T19:48:46 Error opnsense /usr/local/etc/rc.newwanipv6: ROUTING: keeping current default gateway '192.168.69.1'
2022-01-17T19:48:46 Error opnsense /usr/local/etc/rc.newwanipv6: ROUTING: setting IPv4 default route to 192.168.69.1
2022-01-17T19:48:46 Error opnsense /usr/local/etc/rc.newwanipv6: ROUTING: IPv4 default gateway set to wan
2022-01-17T19:48:46 Error opnsense /usr/local/etc/rc.newwanipv6: ROUTING: entering configure using 'wan'
2022-01-17T19:48:46 Error opnsense /usr/local/etc/rc.newwanipv6: The command '/sbin/route add -host -'inet6' '2606:4700:4700::1111' 'fe80::eadf:70ff:fe7a:23da%'' returned exit code '71', the output was 'route: fe80::eadf:70ff:fe7a:23da%: Name does not resolve'
2022-01-17T19:47:59 Error opnsense /usr/local/etc/rc.newwanipv6: The command '/sbin/route add -host -'inet6' '2a05:fc84::42' 'fe80::eadf:70ff:fe7a:23da%'' returned exit code '71', the output was 'route: fe80::eadf:70ff:fe7a:23da%: Name does not resolve'
2022-01-17T19:47:59 Error opnsense /usr/local/etc/rc.newwanipv6: On (IP address: <IPv6-Address>) (interface: WAN[wan]) (real interface: igb3).
2022-01-17T19:47:59 Error opnsense /usr/local/etc/rc.newwanipv6: IPv6 renewal is starting on 'igb3'
2022-01-17T19:47:56 Error opnsense /usr/local/etc/rc.linkup: Warning! dhcpd_radvd_configure(auto) found no suitable IPv6 address on igb1_vlan13
2022-01-17T19:47:55 Error opnsense /usr/local/etc/rc.linkup: ROUTING: skipping IPv6 default route
2022-01-17T19:47:55 Error opnsense /usr/local/etc/rc.linkup: ROUTING: IPv6 default gateway set to wan
2022-01-17T19:47:55 Error opnsense /usr/local/etc/rc.linkup: ROUTING: creating /tmp/igb3_defaultgw using '192.168.69.1'
2022-01-17T19:47:55 Error opnsense /usr/local/etc/rc.linkup: ROUTING: creating /tmp/igb3_defaultgw using '192.168.69.1'
2022-01-17T19:47:55 Error opnsense /usr/local/etc/rc.linkup: ROUTING: removing /tmp/igb3_defaultgw
2022-01-17T19:47:55 Error opnsense /usr/local/etc/rc.linkup: ROUTING: setting IPv4 default route to 192.168.69.1
2022-01-17T19:47:55 Error opnsense /usr/local/etc/rc.linkup: ROUTING: IPv4 default gateway set to wan
2022-01-17T19:47:55 Error opnsense /usr/local/etc/rc.linkup: ROUTING: entering configure using 'wan'
2022-01-17T19:47:55 Error opnsense /usr/local/etc/rc.linkup: Accept router advertisements on interface igb3
2022-01-17T19:47:54 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for dynamic wan(igb3)
2022-01-17T19:47:51 Error dhcp6c transmit failed: Network is down
2022-01-17T19:47:49 Error dhcp6c transmit failed: Network is down
2022-01-17T19:47:49 Error dhcp6c transmit failed: Network is down
2022-01-17T19:47:48 Error opnsense /usr/local/etc/rc.linkup: Clearing states for stale wan route on igb3
2022-01-17T19:47:48 Error dhcp6c transmit failed: Network is down
2022-01-17T19:47:48 Error dhcp6c transmit failed: Network is down
2022-01-17T19:47:48 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for dynamic wan(igb3)
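Editor's note on the two `route add` failures in the log above: the gateway is passed as `fe80::eadf:70ff:fe7a:23da%`, that is, a link-local address with an empty interface scope after the `%`, whereas other lines show it as `...%igb3`. A link-local next hop is only meaningful together with an interface, which would explain the "Name does not resolve" error. The following is a minimal sketch of that check, not OPNsense code; the function name `split_scoped_gateway` is made up for illustration.

```python
import ipaddress
from typing import Tuple


def split_scoped_gateway(gateway: str) -> Tuple[str, str]:
    """Split 'addr%zone' and reject link-local gateways that lack a zone.

    Hypothetical helper for illustration only.
    """
    addr, _, zone = gateway.partition("%")
    ip = ipaddress.IPv6Address(addr)  # raises ValueError on malformed input
    if ip.is_link_local and not zone:
        raise ValueError(
            "link-local gateway %r needs an interface scope" % gateway
        )
    return addr, zone


# Scoped form as seen in the "Adding static route" log lines: accepted.
print(split_scoped_gateway("fe80::eadf:70ff:fe7a:23da%igb3"))

# Form seen in the failing 'route add' lines: rejected.
try:
    split_scoped_gateway("fe80::eadf:70ff:fe7a:23da%")
except ValueError as err:
    print("rejected:", err)
```

The empty scope appears right after the detach/attach sequence, so one possible reading is that the interface name was not yet known when the monitor route was added; the log alone does not confirm this.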


Title: Re: Multi-WAN
Post by: franco on January 17, 2022, 08:07:09 PM
Do you have Unbound set to listen on specific interface? Do you have RSS turned on manually?

The tunable was mainly for the concern about multi-WAN not working, since FreeBSD 13 by default now lets you create multiple default routes in the same routing table, but obviously will only use the latest one.
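Editor's note: one way to see whether more than one default route is present is to look at the IPv4 routing table (for example the output of `netstat -rn -f inet`) and count the `default` entries. The sketch below assumes the usual FreeBSD column layout (Destination, Gateway, Flags, Netif). The sample text is hypothetical: 192.168.69.1 on igb3 is taken from the log earlier in the thread, while the second gateway and interface are invented for illustration.

```python
def default_routes(netstat_output):
    """Return (gateway, interface) pairs for every 'default' entry."""
    routes = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: Destination Gateway Flags Netif [Expire]
        if len(fields) >= 4 and fields[0] == "default":
            routes.append((fields[1], fields[3]))
    return routes


# Hypothetical sample, not captured from the poster's system.
SAMPLE = """\
Destination        Gateway            Flags     Netif Expire
default            192.168.69.1       UGS        igb3
default            192.168.8.1        UGS        igb2
127.0.0.1          link#5             UH          lo0
"""

found = default_routes(SAMPLE)
if len(found) > 1:
    print("multiple IPv4 default routes:", found)
```

With "net.route.multipath" set to "0" and a reboot, as suggested above, the expectation would be a single `default` line per address family.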


Cheers,
Franco
Title: Re: Multi-WAN
Post by: franco on January 17, 2022, 08:09:25 PM
> Choose to bind WAN2_DHCP on 0.0.0.0 since we could not find a proper match.

FWIW, this looks like the router or modem doesn't want to give you a lease. Does it want a different MAC address?


Cheers,
Franco
Title: Re: Multi-WAN
Post by: MenschAergereDichNicht on January 17, 2022, 08:21:01 PM
Unbound is set to listen on all interfaces (default).

Regarding RSS, I guess the answer is no. The only other tunables I changed are "vm.pmap.pti" and "hw.ibrs_disable", because the APU does not have any spare resources and I don't think they matter that much on a router.

Regarding the lease: WAN2 is not available all the time. It is a nano-router that I would only power on if the WAN interface goes down. Therefore it is "normal" that there is currently no IP address available. But it should be configured automatically when I plug in the power cable.

For Multi-WAN I created a gateway group that consists of WAN_GW (Tier 1) and WAN2_DHCP (Tier 2). I created only an IPv4 Multi-WAN setup because the nano-router (or my Android hotspot) does not hand out IPv6.

In the above situation the Tier 2 gateway is deactivated because the nano-router has no power.
Title: Re: Multi-WAN
Post by: MenschAergereDichNicht on January 17, 2022, 08:42:45 PM
I also have the following entry in the General system log:

2022-01-18T00:04:55 Error opnsense /usr/local/etc/rc.linkup: The command '/usr/local/opnsense/scripts/dns/unbound_dhcpd.py --domain 'localdomain'' returned exit code '1', the output was 'Unable to lock on the pidfile. Traceback (most recent call last): File "/usr/local/opnsense/site-python/daemonize.py", line 91, in start fcntl.flock(lockfile, fcntl.LOCK_EX | fcntl.LOCK_NB) BlockingIOError: [Errno 35] Resource temporarily unavailable During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/opnsense/scripts/dns/unbound_dhcpd.py", line 237, in <module> daemon.start() File "/usr/local/opnsense/site-python/daemonize.py", line 96, in start pidfile.write(old_pid) UnboundLocalError: local variable 'old_pid' referenced before assignment'


Maybe a concurrency problem?
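Editor's note: the traceback shows two separate things. First, `fcntl.flock(..., LOCK_EX | LOCK_NB)` raised `BlockingIOError`, which is what a non-blocking lock attempt does when the lock is already held, so another instance of the script was probably still running. Second, the error handler then used `old_pid` before it had been assigned. The sketch below shows the general pattern of a non-blocking pidfile lock that handles the "already locked" case without touching unassigned variables. It is not the OPNsense `daemonize.py` code; `try_lock_pidfile` is a made-up name.

```python
import fcntl
import os


def try_lock_pidfile(path):
    """Return an open, locked file object, or None if the lock is held.

    Hypothetical helper for illustration only.
    """
    lockfile = open(path, "a+")
    try:
        fcntl.flock(lockfile, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else holds the lock: report it and leave the file alone.
        lockfile.close()
        return None
    # We own the lock now, so it is safe to record our own PID.
    lockfile.seek(0)
    lockfile.truncate()
    lockfile.write(str(os.getpid()))
    lockfile.flush()
    return lockfile
```

A caller would keep the returned file object open for as long as the daemon runs, since closing it releases the lock, and would simply exit when `None` comes back.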
Title: Re: Multi-WAN
Post by: MenschAergereDichNicht on January 17, 2022, 11:39:16 PM
After an attempt to start the bufferbloat browser test, the WAN connection loses packets and I see the following entries in the General log:


...
2022-01-17T23:31:15 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for dynamic wan(igb3)
2022-01-17T23:31:10 Error dhcp6c transmit failed: Network is down
2022-01-17T23:31:09 Error dhcp6c transmit failed: Network is down
2022-01-17T23:31:09 Error dhcp6c transmit failed: Network is down
2022-01-17T23:31:08 Error opnsense /usr/local/etc/rc.linkup: Clearing states for stale wan route on igb3
2022-01-17T23:31:08 Error dhcp6c transmit failed: Network is down
2022-01-17T23:31:07 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for dynamic wan(igb3)


It looks like somehow a detach and attach is triggered by inducing some load.
Regular traffic (browsing the forum, youtube,...) can work perfectly fine for some time.
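Editor's note: to quantify how long each link drop lasts, the "DEVD: Ethernet detached/attached" lines can be paired per interface. The sketch below is a rough illustration under the assumption that the log format matches the lines quoted above (timestamp first, interface name in parentheses at the end). The names `EVENT` and `flap_durations` are made up for this example.

```python
import re
from datetime import datetime

# Matches the rc.linkup DEVD lines as quoted in this thread.
EVENT = re.compile(
    r"^(?P<ts>\S+) .*DEVD: Ethernet (?P<kind>attached|detached) "
    r"event for .*\((?P<ifname>\w+)\)$"
)


def flap_durations(log_text):
    """Pair each detach with the next attach on the same interface.

    Returns a list of (interface, seconds_down) tuples.
    """
    events = []
    for line in log_text.splitlines():
        m = EVENT.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S")
            events.append((ts, m.group("kind"), m.group("ifname")))
    events.sort(key=lambda e: e[0])  # the log view lists newest entries first

    down_since = {}
    flaps = []
    for ts, kind, ifname in events:
        if kind == "detached":
            down_since[ifname] = ts
        elif ifname in down_since:
            delta = ts - down_since.pop(ifname)
            flaps.append((ifname, delta.total_seconds()))
    return flaps


# The two DEVD lines from the excerpt above.
LOG = """\
2022-01-17T23:31:15 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for dynamic wan(igb3)
2022-01-17T23:31:07 Error opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for dynamic wan(igb3)
"""
print(flap_durations(LOG))
```

For the excerpt above this reports one drop of 8 seconds on igb3, which matches the gap between the two timestamps.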

Title: Re: Multi-WAN
Post by: MenschAergereDichNicht on January 18, 2022, 09:06:22 AM
I found the following inside the dmesg log. Maybe someone knows if this is important/critical or not:

igb0: link state changed to UP
debugnet_any_ifnet_update: Bad dn_init result from igb0 (ifp 0xfffff8000367a800), ignoring.
igb1: link state changed to UP
debugnet_any_ifnet_update: Bad dn_init result from igb1 (ifp 0xfffff800034ba000), ignoring.
igb3: link state changed to UP
debugnet_any_ifnet_update: Bad dn_init result from igb3 (ifp 0xfffff800038ef800), ignoring.

arpresolve: can't allocate llinfo for 192.168.69.1 on igb3
Title: Re: Multi-WAN
Post by: MenschAergereDichNicht on January 18, 2022, 09:44:54 AM
What helps is removing all DNS blocklist entries. Afterwards I still get those strange "rc.linkup: DEVD: " detach messages when running the bufferbloat test, but the system recovers faster and regular WAN access is disrupted much less.
Title: Re: Multi-WAN
Post by: MenschAergereDichNicht on January 19, 2022, 02:40:59 PM
Finally I found something that seems to really improve my stability problems.

Before installing RC1 I also updated the BIOS to the latest version (4.15.0.2 at this time).

Because the "dn_init" messages above made me suspect a hardware/BIOS-related problem, I tried an older BIOS version.

Since installing 4.15.0.1 the WAN connection has been stable.
There are still those strange "DEVD: Ethernet detached event" messages in the log, but they do not interrupt the system (at least not to the extent they did before the BIOS downgrade).

Some things that are still "strange" are:


Takeaway:
For an APU4D4, BIOS version 4.15.0.1 is better than version 4.15.0.2.
And a classic (epic) failure of changing too many things at once.

Todo: Try an older BIOS version and check if it works even better.
Title: Re: Multi-WAN
Post by: MenschAergereDichNicht on January 19, 2022, 03:07:39 PM
I am back to BIOS version 4.14.0.6 for now.
Title: Re: Multi-WAN
Post by: MenschAergereDichNicht on January 19, 2022, 04:53:12 PM
I found the solution to the remaining problems.

Currently I have two APU4D4 boxes connected to a Fritzbox that is my uplink to the fibre provider.
One is the production system running version 21.7 and the other runs RC1.

The Fritzbox is configured to use 1 Gbit/s on two ports and 100 Mbit/s on the other two.

One gigabit port of the Fritzbox is the upstream port to the fibre media converter. The other is connected to the production APU.
The RC1 box was connected to one of the 100 Mbit/s ports.

This configuration produced the described problems. The moment I disconnected the production system from the Fritzbox and plugged the cable of the RC1 box into the gigabit port, everything suddenly worked.

I currently have no idea of the underlying reason. The production system has a different IPv4 address space and should not cause any problems.

Both systems are configured to use DHCPv6 and to request only an IPv6 prefix. Is it possible that this can cause address collisions? I could get a /56 prefix but set "Prefix delegation size" to 60. My assumption was that each system would be given a different prefix.
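Editor's note on the prefix arithmetic: a /56 contains 2^(60-56) = 16 non-overlapping /60 prefixes, so there is room for both boxes to receive distinct delegations. Whether the upstream router actually hands out different ones to each requester depends on its DHCPv6 prefix delegation behaviour, which the thread does not show. A small sketch using the documentation prefix 2001:db8::/56 as a stand-in for the real provider prefix (the function name is made up):

```python
import ipaddress


def split_delegation(prefix, new_length):
    """List the sub-prefixes of the given length inside a delegated prefix."""
    network = ipaddress.ip_network(prefix)
    return list(network.subnets(new_prefix=new_length))


# 2001:db8::/56 is a documentation prefix, not the poster's real one.
slices = split_delegation("2001:db8::/56", 60)
print(len(slices))            # 16
print(slices[0], slices[1])   # 2001:db8::/60 2001:db8:0:10::/60
```

Comparing the delegated prefixes shown on each box's WAN interface against such a list would show directly whether the two systems were given the same /60.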
Title: Re: Multi-WAN [Solved]
Post by: MenschAergereDichNicht on January 19, 2022, 05:07:42 PM
The final solution was to set the third port of the Fritzbox to 1 Gbit/s. It is a little strange that 100 Mbit/s does not work (either a Fritzbox or an APU problem), but I can live with that.
Title: Re: [SOLVED] Multi-WAN
Post by: franco on January 20, 2022, 04:17:28 PM
Thanks for following up on these. Like with software a number of changes at once make it difficult to see what changed what.

BIOS issue is certainly not nice but good it can be fixed with a downgrade. :)


Cheers,
Franco