23.1.7_1 broke my Firewall (Fixed)

Started by My_Network, May 05, 2023, 02:49:34 PM

May 12, 2023, 11:11:13 PM #30 Last Edit: May 12, 2023, 11:34:02 PM by My_Network
Hi Franco,

I was finally able to get something out of opnsense-log | grep refusing. I don't know if this gives you any clues about what my issue could be. An interesting fact: right after the upgrade to 23.1.7_3, the issue arises only when I reload the "routing" service, not any of the other services!

<11>1 2023-05-12T16:43:28-04:00 OPNsense.localdomain opnsense 98415 - [meta sequenceId="92"] /usr/local/etc/rc.routing_configure: ROUTING: refusing to set inet gateway on addressless lan

Another interesting general log entry:

<13>1 2023-05-12T16:31:59-04:00 OPNsense.localdomain opnsense 74858 - [meta sequenceId="124"] /usr/local/etc/rc.linkup: Chose to bind CISCO_WAN_INT on 192.168.15.1 since we could not find a proper match.

It looks like 192.168.12.0/24 is not being considered at all!

+

If we look at the log from one of Struppi's answers, we see the same error, but with the WAN interface:
- opnsense-badcase.log: this is the full log after the update and reboot (23.1.7_3)

<11>1 2023-05-10T12:33:04+02:00 OPNsense.dimo.nil opnsense 8359 - [meta sequenceId="32"] /usr/local/etc/rc.routing_configure: ROUTING: refusing to set inet gateway on addressless wan
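
For anyone collecting these lines from several boxes, here is a minimal Python sketch (not part of OPNsense) that pulls the address family and interface name out of a "refusing" log line. The regular expression is an assumption based only on the message text quoted in this thread, and parse_refusing is a made-up helper name. The optional "(device)" suffix is the form that shows up later in this thread once the diagnostic patch is applied.

```python
import re

# Assumed message format, taken from the log lines quoted in this thread.
# The "(device)" part is optional, e.g. "wan" or "wan(igb1)".
REFUSING_RE = re.compile(
    r"ROUTING: refusing to set (?P<family>inet6?) gateway on addressless "
    r"(?P<interface>\w+)(?:\((?P<device>[^)]+)\))?"
)


def parse_refusing(line):
    """Return (family, interface, device) for a 'refusing' line, else None."""
    match = REFUSING_RE.search(line)
    if match is None:
        return None
    return match.group("family"), match.group("interface"), match.group("device")
```

Feeding it the output of opnsense-log line by line gives a quick overview of which interface and address family are affected on each machine.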


Thank you,

Nic

Look in /conf/config.xml for <gateway_item/> with the same name... I think you have more than one which causes this issue. It looks to be the same as struppie's issue now.


Cheers,
Franco

Hi Franco,

After further investigation, it seems that I don't have the same issue. There are no duplicate gateways. The issue seems to be with /usr/local/etc/rc.routing_configure. In 23.1.7_3 this script changed, and it seems to interfere somehow with "far gateways" that don't have a physical interface.

I'm out of ideas here.

Nic

Hi Guys,

I'll weigh in here, since I seem to be having a similar problem as Nic with the exception that my IPv4 outbound continues to work but it affects my outbound VPN and IPv6 outbound connectivity.

I've been ripping my hair out a bit with this issue and in the end have set myself up with a virtual version of my usually physical firewall, so I could perform quick changes and roll back without impacting my main prod too much. It definitely appears to be related to far gateways that don't have physical interfaces, since that's exactly what I experience after upgrading.

The specific GW causing the trouble seems to be my WireGuard VPN, which uses a far GW and doesn't have its own physical interface. As a side effect it also seems to affect my IPv6 GW, which is created by my PPPoE connection over IPv4, although this is not a far GW.

The problem first appeared for me after upgrading from 23.1 to 23.1.7_3, from which I had to roll back to get things working properly again.

When I run: opnsense-log | grep refusing

I see in the log last entry is:
2023-05-14T16:50:30+01:00 myopnsensename.local opnsense 47013 - [meta sequenceId="4"] /usr/local/etc/rc.routing_configure: ROUTING: refusing to set inet6 gateway on addressless wan

Might be worth knowing, my ISP doesn't offer DHCPv6 or PPPoEv6, so I've always had it set to a static IPv6 within my allocated /64 which has always worked correctly before and still does when I roll back.

I have checked my config.xml and as far as I can tell have the exact same number of GW (3) as are supposed to be starting at boot.

Now that I'm running virtual, testing should have minimal impact, and my IPv4 outbound continues to work. I just lose my IPv6 and VPN outbound connectivity, so I'm happy to perform testing to assist if needed.

Thanks so much to everybody for all the hard work on OPNsense, it's really a great piece of work.

Gareth

Quote:
I see in the log last entry is:
2023-05-14T16:50:30+01:00 myopnsensename.local opnsense 47013 - [meta sequenceId="4"] /usr/local/etc/rc.routing_configure: ROUTING: refusing to set inet6 gateway on addressless wan

Might be worth knowing, my ISP doesn't offer DHCPv6 or PPPoEv6, so I've always had it set to a static IPv6 within my allocated /64 which has always worked correctly before and still does when I roll back.

The code itself checks ifconfig for *any* address set on the device, which should already be true with a link-local address alone. But again, the problem here seems to be that the gateway section in config.xml might have a duplicated gateway on different interfaces, which breaks the setup now when it didn't before.


Cheers,
Franco

Hi Franco,

Thanks for coming back on this.

Just to be doubly sure I wasn't going crazy, I've looked at my config.xml again and double checked. There are no duplicate gateways/names on different interfaces, and the number and spec of the gateways match exactly, unlike the example shown by Struppie, where there were clearly two entries in the gateways section of the config file and only a single gateway visible in the GUI.

I could only find one section in the config.xml that contained gateway info, shown below (some IPs altered for privacy):

  <gateways>
    <gateway_item>
      <interface>wan</interface>
      <gateway>dynamic</gateway>
      <name>WAN_PPPOE</name>
      <priority>254</priority>
      <weight>1</weight>
      <ipprotocol>inet</ipprotocol>
      <interval/>
      <descr>Interface WAN_PPPOE Gateway</descr>
      <monitor>8.8.8.8</monitor>
      <defaultgw>1</defaultgw>
    </gateway_item>
    <gateway_item>
      <interface>wan</interface>
      <gateway>face:face:face:face::254</gateway>
      <name>WAN_GW</name>
      <priority>254</priority>
      <weight>1</weight>
      <ipprotocol>inet6</ipprotocol>
      <interval/>
      <descr>Interface WAN_GW Gateway</descr>
      <monitor>2001:4860:4860::8844</monitor>
    </gateway_item>
    <gateway_item>
      <interface>opt8</interface>
      <gateway>dynamic</gateway>
      <name>WAN_PIAGW_IPv4</name>
      <priority>255</priority>
      <weight>1</weight>
      <ipprotocol>inet</ipprotocol>
      <interval/>
      <descr/>
      <fargw>1</fargw>
    </gateway_item>
  </gateways>

The above section also matches exactly the gateways that appear in the GUI.
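
A quick way to double check for duplicate names without eyeballing the file is a small Python sketch like the one below. It is only an illustration, not an OPNsense tool: the function name duplicate_gateway_names is made up, and it assumes every gateway lives in a <gateway_item> element with a <name> child, as in the section above.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_gateway_names(root):
    """Return a sorted list of gateway names that occur more than once."""
    # Collect the <name> text of every <gateway_item> anywhere under root.
    names = [
        (item.findtext("name") or "").strip()
        for item in root.iter("gateway_item")
    ]
    counts = Counter(name for name in names if name)
    return sorted(name for name, count in counts.items() if count > 1)
```

On the firewall it could be called as duplicate_gateway_names(ET.parse("/conf/config.xml").getroot()). An empty list means no two gateway items share a name.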

It's entirely possible I'm missing something obvious, but could only find one section that contained the above info.

Thanks for your help.

Gareth

Good evening Gazd25,

Looks like we have the exact same issue! There are no duplicate gateways in my config file; I've looked and looked again. The issue really is about routing to "far gateways", which no longer works in 23.1.7_3 but works fine in 23.1.6. Do you have static routes that point to that specific gateway? In my environment I have 5 static routes that point to VLANs on my downstream Cisco router through its WAN interface. I'm hoping to get this sorted out and would be more than happy to share more details if needed.

Regards  :o

Nic

Hi Nic,

Just to make matters more confusing, I'm afraid that in my case I'm not using any static routes.

The far gateway in use in my case is dynamically generated by the FingerlessGlov3s script to connect to my PIA VPN, found here:

https://github.com/FingerlessGlov3s/OPNsensePIAWireguard

This script does however pull routes, which I can see in the routing table. It doesn't do this immediately at boot though, instead using a cron job to add or refresh them every 5 minutes, so you will often see dpinger as red for a couple of minutes right after booting while it waits for the script to kick in. I believe this to be normal behaviour when using this script.

It does use/enable a dynamic far gateway though, because this is what's needed to route the traffic to PIA over WireGuard once the tunnel is established, so it's likely somewhat similar.

It's interesting to note though that after booting and before the script runs, it has already knocked out the IPv6 routing on my main internet connection and is also showing the error about refusing to add a default gateway, which was in Struppie's example.

That does give me something to compare later to see if it also fails to add the normal route after upgrading to 23.1.7_3, but off to work now so will have to figure that out later.

Thanks

Gareth

Let's look at this differently... I'm assuming the bad change is https://github.com/opnsense/core/commit/a8e9862b410073 and it may work again if it's reverted?

# opnsense-patch a8e9862b410073

The commit does two things: it adds IP address family specific reload functionality, but that should not matter when e.g. rc.routing_configure is called, which is what WireGuard is doing.

The other thing is it tries to verify that the gateway selected for default gateway use does have a matching interface with at least one address in it (the equivalent of calling ifconfig to see if that has an address). The latter one is easy to try... I do think that at least one address must be present anyway, but perhaps if it's a tunnel device the address might not show up correctly?

Looking forward to verification that the patch is the issue...


Cheers,
Franco

There are two patches to help with diagnosis:

https://github.com/opnsense/core/commit/8beb293c5
https://github.com/opnsense/core/commit/48855143b

This is on a clean 23.1.7, opnsense-revert used to make sure:

# opnsense-revert opnsense && opnsense-patch 48855143b 8beb293c5
# /usr/local/etc/rc.routing_configure
# opnsense-log | grep refusing

In the last log line there is a hint of the interface and device being used, e.g.:

> ROUTING: refusing to set inet gateway on addressless wan(igb1)

For the device in parentheses, run:

# pluginctl -D igb1

Depending on this output the log line is generated and the route refused. If data is there we might be looking at a timing issue, if not then it's something more fundamental.


Cheers,
Franco

Hi Franco,

So first things first: I've spent a few minutes running tests again, and it does in fact look as if the WireGuard VPN is coming up after I upgrade to 23.1.7_3. I have double checked and it does appear to be working as expected.

The failure is on the IPv6 gateway which refuses to come online and therefore stops all IPv6 traffic. I guess this would be consistent with the error in the log around refusing to apply an inet6 GW.

I tried to apply the patch you mentioned in the above post and got the below error output:

root@OPNSense:~ # opnsense-patch a8e9862b410073
Fetched a8e9862b410073 via https://github.com/opnsense/core
1 out of 2 hunks failed while patching etc/rc.syshook.d/monitor/10-dpinger
root@OPNSense:~ # opnsense-patch a8e9862b410073
Found local copy of a8e9862b410073, skipping fetch.
1 out of 2 hunks failed while patching etc/rc.syshook.d/monitor/10-dpinger

I tried it a couple of times just to be sure, but the half-applied patch doesn't resolve the issue or have any visible effect.

I'll move on to the next set of tests and report back.

Thanks

Gareth

May 15, 2023, 06:51:52 PM #41 Last Edit: May 15, 2023, 07:50:40 PM by gazd25
Hi Franco,

Ok, next set of tests as requested above (slightly edited for privacy):

root@OPNSense:~ # opnsense-revert opnsense && opnsense-patch 48855143b 8beb293c5
Updating OPNsense repository catalogue...
OPNsense repository is up to date.
All repositories are up to date.
The following packages will be fetched:

New packages to be FETCHED:
        opnsense: 23.1.7_3 (4 MiB: 100.00% of the 4 MiB to download)

Number of packages to be fetched: 1

The process will require 4 MiB more space.
4 MiB to be downloaded.
Fetching opnsense-23.1.7_3.pkg: 100%    4 MiB   4.4MB/s    00:01
opnsense-23.1.7_3: already unlocked
Updating OPNsense repository catalogue...
OPNsense repository is up to date.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The following 1 package(s) will be affected (of 0 checked):

Installed packages to be REINSTALLED:
        opnsense-23.1.7_3

Number of packages to be reinstalled: 1
[1/1] Reinstalling opnsense-23.1.7_3...
[1/1] Extracting opnsense-23.1.7_3: 100%
Stopping configd...done
Resetting root shell
Updating /etc/shells
Unhooking from /etc/rc
Unhooking from /etc/rc.shutdown
Updating /etc/shells
Registering root shell
Hooking into /etc/rc
Hooking into /etc/rc.shutdown
Starting configd.
>>> Invoking update script 'refresh'
Writing firmware setting...done.
Writing trust files...done.
Configuring login behaviour...done.
Configuring system logging...done.
=====
Message from opnsense-23.1.7_3:

--
I'm no chicken
Fetched 48855143b via https://github.com/opnsense/core
Fetched 8beb293c5 via https://github.com/opnsense/core
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|From 48855143b0c5e2d3f70a29a841e80a45210d74e2 Mon Sep 17 00:00:00 2001
|From: Franco Fichtner <franco@opnsense.org>
|Date: Wed, 10 May 2023 14:37:38 +0200
|Subject: [PATCH] system: add 'if' to message in case of mismatch
|
|PR: https://forum.opnsense.org/index.php?topic=33864.0
|---
| src/etc/inc/system.inc | 2 +-
| 1 file changed, 1 insertion(+), 1 deletion(-)
|
|diff --git a/src/etc/inc/system.inc b/src/etc/inc/system.inc
|index 722900df88..7666a0e740 100644
|--- a/src/etc/inc/system.inc
|+++ b/src/etc/inc/system.inc
--------------------------
Patching file etc/inc/system.inc using Plan A...
Hunk #1 succeeded at 619 (offset -2 lines).
done
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|From 8beb293c53e3d14c5256cd648b3a834667595c2d Mon Sep 17 00:00:00 2001
|From: Franco Fichtner <franco@opnsense.org>
|Date: Mon, 15 May 2023 10:11:38 +0200
|Subject: [PATCH] pluginctl: add an ifconfig mode for easier debugging and
| later use
|
|PR: https://forum.opnsense.org/index.php?topic=33864.0
|---
| src/sbin/pluginctl | 7 +++++--
| 1 file changed, 5 insertions(+), 2 deletions(-)
|
|diff --git a/src/sbin/pluginctl b/src/sbin/pluginctl
|index afa7e674ce..eb531b8e97 100755
|--- a/src/sbin/pluginctl
|+++ b/src/sbin/pluginctl
--------------------------
Patching file sbin/pluginctl using Plan A...
Hunk #1 succeeded at 63.
Hunk #2 succeeded at 78.
done
All patches have been applied successfully.  Have a nice day.
root@OPNSense:~ # /usr/local/etc/rc.routing_configure
Setting up routes...done.
Setting up gateway monitors...done.
Configuring firewall.......done.
root@OPNSense:~ # opnsense-log | grep refusing
<11>1 2023-05-15T18:36:30+01:00 OPNSense.domain.local opnsense 301 - [meta sequenceId="12"] /usr/local/etc/rc.bootup: ROUTING: refusing to set inet6 gateway on addressless wan
<11>1 2023-05-15T18:36:44+01:00 OPNSense.domain.local opnsense 4898 - [meta sequenceId="43"] /usr/local/etc/rc.routing_configure: ROUTING: refusing to set inet6 gateway on addressless wan
root@OPNSense:~ # pluginctl -D igb2
{
    "igb2": {
        "flags": [
            "up",
            "broadcast",
            "running",
            "simplex",
            "multicast"
        ],
        "capabilities": [
            "rxcsum",
            "txcsum",
            "vlan_mtu",
            "vlan_hwtagging",
            "jumbo_mtu",
            "vlan_hwcsum",
            "tso4",
            "tso6",
            "lro",
            "wol_ucast",
            "wol_mcast",
            "wol_magic",
            "vlan_hwfilter",
            "vlan_hwtso",
            "netmap",
            "rxcsum_ipv6",
            "txcsum_ipv6",
            "nomap"
        ],
        "options": [
            "vlan_mtu",
            "jumbo_mtu",
            "nomap"
        ],
        "macaddr": "a0:36:9f:7d:55:7b",
        "ipv4": [],
        "ipv6": [],
        "supported_media": [
            "autoselect",
            "1000baseT",
            "1000baseT full-duplex",
            "100baseTX full-duplex",
            "100baseTX",
            "10baseT/UTP full-duplex",
            "10baseT/UTP"
        ],
        "mtu": "1500",
        "media": "1000baseT <full-duplex>",
        "media_raw": "Ethernet autoselect (1000baseT <full-duplex>)",
        "status": "active"
    }
}


Hopefully this offers you something useful. Apologies for the earlier mistake, which I've now corrected in this post. There doesn't seem to be any address info attached to the igb2 interface above.
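
The same check can be scripted. Below is a minimal Python sketch, assuming the JSON layout shown in the output above (one top-level key per device, each with "ipv4" and "ipv6" lists). addressless_families is a hypothetical helper name, and this is not how OPNsense itself performs the check; it only reports which of those two lists are empty.

```python
import json


def addressless_families(pluginctl_json, device):
    """Return the families ('ipv4', 'ipv6') that have no addresses on device."""
    data = json.loads(pluginctl_json)
    # A device missing from the output is treated as having no addresses.
    info = data.get(device, {})
    return [family for family in ("ipv4", "ipv6") if not info.get(family)]
```

With the igb2 output above saved to a file, addressless_families(open("igb2.json").read(), "igb2") would list both families, which matches the "addressless" wording in the log.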

It did lead me to consider one thing which I thought I'd mention: in the grep for refusing, it shows the interface as wan but without the device hint. I know the wan interface is tied to igb2 on my virtual firewall (VMware Passthrough, Intel Card) as it is on my physical one.

But ultimately it's a PPPoE interface which is then linked to igb2, so I wonder if it's not picking it up because of this?

In any case, I hope the output is of some use. I'm going to roll back again for now; let me know if there is anything further you would like me to do.

Thanks

Gareth

May 16, 2023, 12:45:23 AM #42 Last Edit: May 16, 2023, 03:30:11 AM by My_Network
Hi Franco,

I think I found what is causing our issues with "far gateways". By removing the following from src/etc/inc/filter.inc, I think you broke the ability to reach or find any other static route that points to where that gateway is. That would also explain the issue that Gazd25 is having.

            }
            $default_gw = $fw->getGateways()->getDefaultGW($down_gateways, $ipprotocol);
            if ($default_gw !== null && !empty($default_gw['gateway'])) {
                system_default_route($default_gw['gateway'], $default_gw['if'], isset($default_gw['fargw']));
            }

And then this bit of code also prevents far gateways that are "down" from being candidates for the default gateway.

    foreach (['inet', 'inet6'] as $ipproto) {
        /* determine default gateway without considering monitor status */
        $gateway = $gateways->getDefaultGW([], $ipproto);
        $logproto = $ipproto == 'inet' ? 'IPv4' : 'IPv6';
        if ($gateway != null) {
            log_msg("ROUTING: {$logproto} default gateway set to {$gateway['interface']}", LOG_INFO);
            if ((empty($interface) || $interface == $gateway['interface']) && !empty($gateway['gateway'])) {
                log_msg("ROUTING: setting {$logproto} default route to {$gateway['gateway']}");
                system_default_route($gateway['gateway'], $gateway['interface'], isset($gateway['fargw']));
            } else {
                log_msg("ROUTING: skipping {$logproto} default route");


Thank you,

Nic




May 16, 2023, 07:32:53 AM #43 Last Edit: May 16, 2023, 07:36:58 AM by franco
@Gareth

> ROUTING: refusing to set inet6 gateway on addressless wan
> ROUTING: refusing to set inet6 gateway on addressless wan

It looks like the newly added log message didn't trigger on the reload here. I'm not sure why; otherwise it would have said "wan(xxx)" to point to the device it's been using.

Looking further and hearing about PPPoE, I think you don't have "Use IPv4 connectivity" set for your WAN IPv6 configuration? That's why it wants to use igb2 instead of pppoeX, and here we can see igb2 does indeed not have an address.

If, however, the option is set then at least we know where to look further.


Cheers,
Franco

PS: Which WAN IPv6 mode are you using?

@Nic

Far gateway behaviour was not changed at all. I don't see the immediate issue from your post.


Cheers,
Franco