[Solved] Strange gateway behaviour (question to devs)

Started by hbc, October 17, 2019, 09:35:31 AM

Hi!

I am running several networks, connected by various gateways. Some gateways reside in the same subnet, but each routes to different subnets.

For example, I have gateway A (192.168.1.254) in subnet 192.168.1.0/24, used as the next hop in three routes:

  • 10.10.1.0/24 --> 192.168.1.254
  • 10.10.2.0/24 --> 192.168.1.254
  • 10.10.3.0/24 --> 192.168.1.254

As soon as I add gateway B (192.168.1.10), also located in subnet 192.168.1.0/24, it takes over all routes of gateway A. I have not even added a route that uses gateway B; merely adding the gateway itself is enough to break my routing.

The strange thing is the routing table: the output of netstat -rn does not change. Gateway A (192.168.1.254) is still shown as the gateway, but a traceroute shows that gateway B is used.
As a next step, I added a Linux PC (192.168.1.100) as a second gateway and ran tcpdump on it. Again, 192.168.1.100 takes over all routes of gateway A, and tcpdump shows the routed traffic arriving at the Linux PC.

Is there a shadow routing table that gets overwritten when a gateway is added in the same subnet as an existing one? As far as I knew, netstat -rn was the only place to look for routes. Normally the entries in this table are the ones used, but now the routing table is out of sync with the routing that is actually in effect.

Is this a FreeBSD bug? If not, which commands are issued when a gateway is added, deleted, enabled or disabled?

Normally there should be no problem with several gateways in the same subnet and a routing table like this:


  • 10.10.1.0/24 --> 192.168.1.254
  • 10.10.2.0/24 --> 192.168.1.254
  • 10.10.3.0/24 --> 192.168.1.254
  • 10.20.3.0/24 --> 192.168.1.10
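
To make the expectation concrete, here is a minimal Python sketch of a longest-prefix-match lookup over the table above. It only models how a plain routing table is normally consulted; it is not a statement of what OPNsense or FreeBSD do internally. The function name next_hop and the sample destination addresses are made up for illustration.

```python
import ipaddress

# Static routes from the post: destination network -> next-hop gateway.
ROUTES = {
    "10.10.1.0/24": "192.168.1.254",  # gateway A
    "10.10.2.0/24": "192.168.1.254",  # gateway A
    "10.10.3.0/24": "192.168.1.254",  # gateway A
    "10.20.3.0/24": "192.168.1.10",   # gateway B
}


def next_hop(destination, routes=ROUTES):
    """Return the gateway of the most specific matching route, or None."""
    addr = ipaddress.ip_address(destination)
    best = None  # (network, gateway) of the longest matching prefix so far
    for cidr, gateway in routes.items():
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    return best[1] if best else None


if __name__ == "__main__":
    # Hypothetical destination hosts, one per gateway of interest.
    for dst in ("10.10.1.5", "10.10.3.20", "10.20.3.7"):
        print(dst, "->", next_hop(dst))
```

With this table, anything in 10.10.1.0/24 to 10.10.3.0/24 should go to gateway A, and only 10.20.3.0/24 to gateway B. The observed traceroute results do not match that.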
Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz (24 cores)
256 GB RAM, 300GB RAID1, 3x4 10G Chelsio T540-CO-SR

Rephrased my problem since no feedback yet. I need a solution, since I have to add this route and its gateway without overwriting existing routes.

October 23, 2019, 03:11:38 PM #2 Last Edit: October 23, 2019, 03:15:22 PM by bootstrap
Hm. Your networks should probably be /24, not /8 (since you specified three octets).
Otherwise one 10.0.0.0/8 should do. Typo?
If you're really adding all networks as /8, you're probably breaking your routing right there.
At least if all your routes on all your devices always use /8...


Right. It should be /24 in this example; I fixed it. The real networks use public IPs, and the masks and routing are correct and work ... as long as I do not add a second gateway.


October 24, 2019, 08:10:51 AM #5 Last Edit: October 24, 2019, 08:50:42 AM by hbc
Right. WAN is the only interface with explicitly set gateways. All other interfaces have auto-detect.

But gateway A is not the default/upstream gateway for WAN. It is an internal gateway on LAN, neither ticked as upstream nor explicitly assigned as upstream in any interface.

Sorry for repeating myself, but do you have an upstream gateway set on ANY interface, or is it auto-detect everywhere?
With an upstream gateway set you get some pf magic which can behave unexpectedly, that's why I'm asking :)

Yes, the WAN interface has an IPv4 and an IPv6 gateway set.

How else would OPNsense find my default gateway if it is not specified at least for WAN?

You can either have a gateway and mark it as upstream in System : Gateways : Single, or set the upstream gateway in Interface : WAN. Please test/use the first option.

October 25, 2019, 09:16:09 AM #9 Last Edit: October 25, 2019, 09:23:47 AM by hbc
I tried both settings:

  • WAN Gateway: auto-detect
  • Gateway:Single:WAN (uplink option checked)

  • WAN Gateway: WAN-Gateway
  • Gateway:Single:WAN (uplink option unchecked)

Same result: as soon as I add/activate gateway B, gateway A is overridden.

But I noticed something about the (active) state.

When the WAN gateway is marked as uplink, the text (active) is appended to the gateway name. This text was only added to my WAN gateway, so I thought it indicated the active default gateway. The other gateways, even when activated and showing a green online state, do not have this (active) text.

Now when I uncheck the uplink option in WAN, my LAN gateway A gets this (active) tag, and as soon as I activate/add gateway B, gateway B gets the (active) tag instead.

What does this (active) behind a gateway name mean? I think this is the key to my problem, since when no gateway is marked as uplink, this (active) tag indicates the used lan gateway (A or B).

There is only one (active) gateway per protocol (IPv4 and IPv6), so I have at most two gateways with the (active) tag. Even though the routing table is not altered, the (active) LAN gateway gets all routes on interface LAN_NET_2.

WAN gateways marked as upstream, GATEWAY_B disabled (routing as expected):

Name                  Interface  Protocol  Priority        Status
GW_WAN_IPv6 (active)  WAN        IPv6      255 (upstream)  Online
GW_WAN_IPv4 (active)  WAN        IPv4      255 (upstream)  Online
GATEWAY_B             LAN_NET_2  IPv4      255             Pending
GW_LAN_IPv6           LAN_NET_2  IPv6      255             Online
GATEWAY_A             LAN_NET_2  IPv4      255             Online



WAN gateways set in the WAN interface, no upstream option, GATEWAY_B disabled (routing as expected):

Name                  Interface  Protocol  Priority  Status
GW_WAN_IPv6 (active)  WAN        IPv6      255       Online
GW_WAN_IPv4           WAN        IPv4      255       Online
GATEWAY_B             LAN_NET_2  IPv4      255       Pending
GW_LAN_IPv6           LAN_NET_2  IPv6      255       Online
GATEWAY_A (active)    LAN_NET_2  IPv4      255       Online


WAN gateways set in the WAN interface, no upstream option, GATEWAY_B enabled (all routes on interface LAN_NET_2 are taken by GATEWAY_B):

Name                  Interface  Protocol  Priority  Status
GW_WAN_IPv6 (active)  WAN        IPv6      255       Online
GW_WAN_IPv4           WAN        IPv4      255       Online
GATEWAY_B (active)    LAN_NET_2  IPv4      255       Online
GW_LAN_IPv6           LAN_NET_2  IPv6      255       Online
GATEWAY_A             LAN_NET_2  IPv4      255       Online
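
The three listings can also be compared mechanically. Below is a rough Python sketch that extracts which gateway carries the (active) tag per protocol. It assumes each entry has the shape seen in the listings (name, optional (active), interface, protocol, priority, optional (upstream), status); the helper names and constants are made up for illustration and are not part of OPNsense.

```python
from typing import Dict, NamedTuple

FLAGS = ("(active)", "(upstream)")


class Gateway(NamedTuple):
    name: str
    interface: str
    protocol: str
    priority: int
    status: str
    active: bool
    upstream: bool


def parse_gateway_line(line: str) -> Gateway:
    """Parse one whitespace-separated gateway entry as pasted in the post."""
    tokens = line.split()
    # Drop the optional flags; exactly five positional fields should remain.
    name, interface, protocol, priority, status = [t for t in tokens if t not in FLAGS]
    return Gateway(name, interface, protocol, int(priority), status,
                   "(active)" in tokens, "(upstream)" in tokens)


def active_by_protocol(listing: str) -> Dict[str, str]:
    """Map each protocol to the name of the gateway tagged (active)."""
    result = {}
    for line in listing.strip().splitlines():
        gw = parse_gateway_line(line)
        if gw.active:
            result[gw.protocol] = gw.name
    return result


# Entries copied from the second and third listings (no upstream option).
B_DISABLED = """
GW_WAN_IPv6 (active) WAN IPv6 255 Online
GW_WAN_IPv4 WAN IPv4 255 Online
GATEWAY_B LAN_NET_2 IPv4 255 Pending
GW_LAN_IPv6 LAN_NET_2 IPv6 255 Online
GATEWAY_A (active) LAN_NET_2 IPv4 255 Online
"""

B_ENABLED = """
GW_WAN_IPv6 (active) WAN IPv6 255 Online
GW_WAN_IPv4 WAN IPv4 255 Online
GATEWAY_B (active) LAN_NET_2 IPv4 255 Online
GW_LAN_IPv6 LAN_NET_2 IPv6 255 Online
GATEWAY_A LAN_NET_2 IPv4 255 Online
"""

if __name__ == "__main__":
    print("B disabled:", active_by_protocol(B_DISABLED))
    print("B enabled: ", active_by_protocol(B_ENABLED))
```

The only difference between the two states is which IPv4 gateway holds the (active) tag: GATEWAY_A with B disabled, GATEWAY_B with B enabled. That matches the gateway that actually receives the LAN_NET_2 traffic.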


I think (active) means that it is the default route being used by OPNsense.

If wanA is active rather than wanB, connections will use wanA.

You can also decrease the link priority (lower is prioritized, I believe) to make a gateway "active".

I don't think so. Entries made in System:Gateways do not alter my routing table.

No matter which gateway is shown as (active), the default route is still the WAN gateway, which is correct.

Altering priorities does not change anything. The strangest thing is that the routing table is displayed correctly, but the actual routing is wrong.

My routing table shows gateway A as the one in use, but when I activate gateway B, all of A's routes are routed via B.

So I want to find out what adding a gateway in System:Gateways actually does. I thought it was just a place to define a gateway that can later be used in interfaces and routes.

But merely defining a gateway already changes something. I never thought that a PC would use routes other than those defined in the kernel routing table.

In my case, I have wanA and wanB. Whenever I change priorities, whichever one is "active" is the one used as the Internet connection. But my two WANs have different subnets and IPs, though.

Well, actually this concerns neither WAN nor the Internet connection. It is just unpredictable behaviour when I add a new internal gateway, and it prevents me from using OPNsense as a router.

On every other PC, L3 switch or router I can add additional gateways in one subnet without breaking the existing routing logic.