OPNsense Forum

Archive => 23.1 Legacy Series => Topic started by: axsdenied on February 15, 2023, 04:42:28 PM

Title: Multi-WAN no graceful recovery
Post by: axsdenied on February 15, 2023, 04:42:28 PM
Before you beat me up about this topic having been brought up before in various forms: you're right.  However, in my searching I found ZERO resolutions other than "rebooting", which is not what I'd want to do with an enterprise-class option like OPNsense.

Internet Provider Context:
PROBLEM: After an outage or issue with the Cox Cable connection (WAN), OPNsense fails over to T-mobile (WAN2) pretty gracefully.
After Cox corrects their issue, I'm unable to get a valid IP from the Cox Cable modem unless I reboot the OPNsense router.

ATTEMPTED STEPS TO RESOLVE: If I go to INTERFACES: OVERVIEW, and select the WAN interface (wan, igb0) I do see both "Reload" and "Release" as options for DHCP.

If I attempt either, usually one of 2 things happens:
I can try these options many times and the result is always the same.

The ONLY thing that allows me to receive an IP again is rebooting OPNsense.  Since it cannot obtain a refreshed IP without a reboot, this tells me it is clearly an OS/Driver/OPNsense issue.

Anyone know why this could be?
Title: Re: Multi-WAN no graceful recovery
Post by: axsdenied on February 15, 2023, 08:32:00 PM
Updating with relevant log files:

2023-02-15T08:31:29-07:00   Error   opnsense   /usr/local/etc/rc.configure_interface: The command '/sbin/dhclient -c '/var/etc/dhclient_wan.conf' -p '/var/run/dhclient.igb0.pid' 'igb0'' returned exit code '15', the output was 'DHCPREQUEST on igb0 to 255.255.255.255 port 67 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 19 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 38 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 4 DHCPOFFER from 10.82.240.1 DHCPREQUEST on igb0 to 255.255.255.255 port 67 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 17 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 17 DHCPOFFER from 10.82.240.1 DHCPREQUEST on igb0 to 255.255.255.255 port 67'   
2023-02-15T08:31:29-07:00   Error   dhclient   connection closed   
2023-02-15T08:29:12-07:00   Error   dhclient   short write: wanted 20 got 0 bytes   
2023-02-15T08:29:12-07:00   Error   dhclient   My address (x.x.x.x) was deleted, dhclient exiting   
2023-02-15T06:53:54-07:00   Error   dhclient   connection closed   
2023-02-15T06:40:23-07:00   Error   dhclient   connection closed   
2023-02-15T06:39:58-07:00   Error   dhclient   short write: wanted 20 got 0 bytes   
2023-02-15T06:39:58-07:00   Error   dhclient   My address (x.xx.xx.xx) was deleted, dhclient exiting
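
To make the dhclient output in the first log line easier to read, here is a minimal Python sketch that tallies the DHCP message types it contains. The string is copied from the log above; the names DHCLIENT_OUTPUT and tally_dhcp_messages are made up for illustration and are not part of OPNsense.

```python
import re
from collections import Counter

# dhclient output copied verbatim from the rc.configure_interface error line.
DHCLIENT_OUTPUT = (
    "DHCPREQUEST on igb0 to 255.255.255.255 port 67 "
    "DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 19 "
    "DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 38 "
    "DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 4 "
    "DHCPOFFER from 10.82.240.1 "
    "DHCPREQUEST on igb0 to 255.255.255.255 port 67 "
    "DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 17 "
    "DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 17 "
    "DHCPOFFER from 10.82.240.1 "
    "DHCPREQUEST on igb0 to 255.255.255.255 port 67"
)


def tally_dhcp_messages(output: str) -> Counter:
    """Count each DHCP message keyword (DHCPDISCOVER, DHCPOFFER, ...) in the text."""
    return Counter(re.findall(r"\bDHCP[A-Z]+\b", output))


counts = tally_dhcp_messages(DHCLIENT_OUTPUT)
for name, n in sorted(counts.items()):
    print(f"{name}: {n}")

# A Counter returns 0 for keys it has never seen, so this is safe.
print("DHCPACK seen:", counts["DHCPACK"] > 0)
```

Within this one excerpt, the tally shows DHCPOFFER replies arriving from 10.82.240.1 but no DHCPACK after the DHCPREQUESTs, which may help narrow down where the exchange stalls.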
Title: Re: Multi-WAN no graceful recovery
Post by: skavoovie on February 17, 2023, 03:04:53 AM
You indicated that you have enabled blocking private/bogon networks on your WAN interface. From the log output, it looks like your WAN interface is the igb0 interface, and it looks like the source IP of the DHCP server response is 10.82.240.1. That is an RFC1918 IP -- "private networks" -- which you've configured to be blocked.

Have you tried disabling the block of private / bogon networks to see if that resolves your issue? Looks to me like the DHCPOFFER response is being blocked by your OPNsense configuration because the source IP is an RFC1918 IP address.
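
For anyone who wants to check whether an address falls in RFC1918 space, here is a minimal Python sketch using the standard ipaddress module. The helper name is_rfc1918 is hypothetical; the three networks are the ranges defined by RFC1918, and 10.82.240.1 is the DHCP server address from the log.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(addr: str) -> bool:
    """Return True if addr lies inside one of the RFC1918 private ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918_NETWORKS)


print(is_rfc1918("10.82.240.1"))  # DHCP server address from the log
print(is_rfc1918("8.8.8.8"))      # a public address, for comparison
```

This only classifies the address. Whether the firewall actually drops the DHCP reply still has to be confirmed on the box itself.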

Title: Re: Multi-WAN no graceful recovery
Post by: axsdenied on February 17, 2023, 08:40:41 PM
If that was the source of the block, would that not ALSO apply after a reboot?  Rebooting corrects the issue.  And FYI that is a default and recommended block on most internet facing WAN interfaces.
Title: Re: Multi-WAN no graceful recovery
Post by: axsdenied on February 22, 2023, 08:15:38 PM
Additional info:

This condition occurs after OPNsense marks a gateway as "Defunct". If the gateway recovers before that point, it fails back to the Tier 1 Gateway.

After the Gateway has been defined as defunct, I can't bring it back up unless I reboot.
Title: Re: Multi-WAN no graceful recovery
Post by: skavoovie on May 05, 2023, 07:39:24 PM
Quote from: axsdenied on February 17, 2023, 08:40:41 PM
If that was the source of the block, would that not ALSO apply after a reboot?  Rebooting corrects the issue.  And FYI that is a default and recommended block on most internet facing WAN interfaces.

True, however generally for a WAN interface, where that option is used, the expectation is that NO communication w/ an RFC1918 / non-routable IP address would be necessary / desired.

Just a guess, but I could envision a reboot working due to something like the order of operations differing at boot time. For example, perhaps at boot time, the DHCP lease request is sent before that WAN rule is loaded. It would probably take a code review to determine the order of operations, and how it might differ during the boot cycle vs. when an interface is downed/upped to know for sure.


As a test / possible fix, I suggest you disable the option that blocks RFC1918 IPs on WAN, and instead add a custom WAN rule that blocks all RFC1918-sourced traffic to your WAN interface, but explicitly allows traffic to/from the RFC1918 IP of your ISP's DHCP server (the one in your log), with source or dest ports of UDP 67 and 68 (for IPv4).
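
The intended rule ordering can be sketched as a small Python model. This is not OPNsense configuration or any OPNsense API: Packet, Rule, evaluate and WAN_RULES are hypothetical names, and the model assumes first-match evaluation and a default of "block" for unmatched inbound traffic, both of which should be checked against the actual ruleset. The DHCP server address is the one from the log.

```python
import ipaddress
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence, Tuple

RFC1918 = tuple(
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
)
DHCP_SERVER = ipaddress.ip_network("10.82.240.1/32")  # from the log above


@dataclass(frozen=True)
class Packet:
    src: str
    proto: str
    sport: int
    dport: int


@dataclass(frozen=True)
class Rule:
    action: str                                # "pass" or "block"
    sources: Tuple[ipaddress.IPv4Network, ...]
    proto: Optional[str] = None                # None matches any protocol
    ports: Optional[FrozenSet[int]] = None     # None matches any port

    def matches(self, pkt: Packet) -> bool:
        ip = ipaddress.ip_address(pkt.src)
        if not any(ip in net for net in self.sources):
            return False
        if self.proto is not None and pkt.proto != self.proto:
            return False
        # Match when either the source or the destination port is listed.
        if self.ports is not None and not ({pkt.sport, pkt.dport} & self.ports):
            return False
        return True


def evaluate(rules: Sequence[Rule], pkt: Packet, default: str = "block") -> str:
    """Return the action of the first matching rule (assumed first-match model)."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return default


# Order matters: the narrow DHCP allow must come before the broad RFC1918 block.
WAN_RULES = [
    Rule("pass", (DHCP_SERVER,), proto="udp", ports=frozenset({67, 68})),
    Rule("block", RFC1918),
]

print(evaluate(WAN_RULES, Packet("10.82.240.1", "udp", 67, 68)))     # DHCP reply
print(evaluate(WAN_RULES, Packet("10.82.240.1", "tcp", 443, 50000)))  # other traffic
print(evaluate(WAN_RULES, Packet("192.168.100.1", "udp", 67, 68)))   # other private host
```

In this model only UDP 67/68 traffic from the ISP's DHCP server gets through; everything else sourced from RFC1918 space is still blocked.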

If that does not fix the issue, then something else is at play. In that case I would look at the advanced DHCP request settings, as some ISPs can be very finicky about the request parameters. Start by searching your preferred search engine for other customers of the same ISP w/ non-standard edge routers / firewalls.

Ultimately though, since you know that you need to be able to receive traffic from an RFC1918 IP in order to receive your DHCP lease from your ISP, it seems logical that you would not be able to leverage the option for blocking all RFC1918 IP space into your WAN interface, and leaving it enabled should be expected to lead to inconsistent results / problems. GL!