Multipath with FRR + OSPF + ECMP

Started by MarceloAlm, May 25, 2022, 03:14:08 PM

Hello,

I am building a network between two offices over two separate links, using two OPNsense routers connected via GRE tunnels. I created a route for each link, using the corresponding gateway, and distributed the routes with FRR + OSPF. This part works fine: I can see both routes with "netstat -r". The problem is that the system picks one of the routes as preferred and does not balance traffic across them with ECMP.

I only managed to balance the traffic with a firewall rule, but that defeats the purpose of using OSPF. I also did not intend to use OPNsense as a firewall on this network, only as a router, since all types of traffic are allowed on it.

Is there a setting I'm missing?

Check whether net.route.multipath is enabled; it should be. Documentation is still sparse, so you could also try enabling net.route.hash_outbound. Both are set with sysctl.

Edit: and BTW - when testing - multipath is per flow, not per packet!
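
To illustrate the per-flow point above, here is a minimal Python sketch. It assumes the router hashes a flow's addresses, protocol and ports and uses the result to pick one of the equal-cost next hops. The SHA-256 hash, the `pick_next_hop` function and the client addresses are made up for illustration; this is not the hash the FreeBSD kernel actually uses. The two next hops are the GRE peers from this thread.

```python
import hashlib

# Equal-cost next hops (the two GRE peers from this thread).
NEXT_HOPS = ["172.16.101.2", "172.16.102.2"]


def pick_next_hop(src, dst, proto=0, sport=0, dport=0):
    """Map a flow to one next hop.

    Illustrative only: SHA-256 stands in for whatever flow hash the
    kernel really computes. The point is that the choice depends only
    on the flow's identifying fields, never on the individual packet.
    """
    key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(NEXT_HOPS)
    return NEXT_HOPS[index]


if __name__ == "__main__":
    # The same flow, asked several times: the answer never changes.
    for _ in range(3):
        print(pick_next_hop("10.20.0.10", "10.70.70.1", 6, 40000, 443))

    # Many different (hypothetical) clients: the flows spread over both hops.
    counts = {hop: 0 for hop in NEXT_HOPS}
    for i in range(1, 101):
        hop = pick_next_hop(f"10.20.0.{i}", "10.70.70.1", 6, 40000 + i, 443)
        counts[hop] += 1
    print(counts)
```

A single flow always lands on the same next hop, so a test with one ping or one traceroute will never alternate. Only a mix of many flows can spread over both links.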
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

I checked the configuration and both are enabled:

# sysctl net.route
net.route.netisr_maxqlen: 256
net.route.ipv6_nexthop: 1
net.route.multipath: 1
net.route.hash_outbound: 1

I have been monitoring the traffic for a few hours, and it is not alternating between the ECMP routes.


Quote from: mimugmail on May 25, 2022, 08:39:10 PM
Routing table please :)

While running some tests, I found that there seems to be some kind of cache of the gateway used to reach each host, based on source:

root@rt-wan:/ # traceroute 10.70.70.1
traceroute to 10.70.70.1 (10.70.70.1), 64 hops max, 40 byte packets
1  172.16.102.2 (172.16.102.2)  19.743 ms  21.152 ms  19.547 ms
root@rt-wan:/ # traceroute 10.70.70.4
traceroute to 10.70.70.4 (10.70.70.4), 64 hops max, 40 byte packets
1  172.16.101.2 (172.16.101.2)  27.260 ms  31.595 ms  32.289 ms
root@rt-wan:/ # traceroute 10.70.70.4
traceroute to 10.70.70.4 (10.70.70.4), 64 hops max, 40 byte packets
1  172.16.101.2 (172.16.101.2)  21.710 ms  21.000 ms  26.908 ms
root@rt-wan:/ # traceroute 10.70.70.1
traceroute to 10.70.70.1 (10.70.70.1), 64 hops max, 40 byte packets
1  172.16.102.2 (172.16.102.2)  21.986 ms  20.134 ms  21.027 ms

Routing table, with some unnecessary data removed:

# netstat -r4
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            10.x.x.x           UGS         em0
rt-wan             link#1             UHS         lo0
10.20.0.161        link#1             UH          lo0
10.20.0.162        link#1             UH          lo0
10.70.0.0/16       172.16.102.2       UG1        gre1
10.70.0.0/16       172.16.101.2       UG1        gre0
10.70.0.151        10.20.0.50         UGHS        em0
10.70.0.152        10.20.0.90         UGHS        em0
localhost          link#4             UH          lo0
172.16.101.1       link#7             UHS         lo0
172.16.101.2       link#7             UH         gre0
172.16.102.1       link#8             UHS         lo0
172.16.102.2       link#8             UH         gre1
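
A quick way to confirm that a prefix really has more than one next hop installed is to group the `netstat -r4` lines by destination. The following is a minimal Python sketch, assuming the column layout shown above (Destination, Gateway, Flags, Netif); the `multipath_routes` helper and the `SAMPLE` text are illustrative, with the sample rows copied from the table in this post.

```python
from collections import defaultdict

# A few rows copied from the routing table above.
SAMPLE = """\
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            10.x.x.x           UGS         em0
10.70.0.0/16       172.16.102.2       UG1        gre1
10.70.0.0/16       172.16.101.2       UG1        gre0
10.70.0.151        10.20.0.50         UGHS        em0
172.16.101.2       link#7             UH         gre0
"""


def multipath_routes(netstat_text):
    """Return {destination: [(gateway, netif), ...]} for every
    destination that appears more than once in the table."""
    paths = defaultdict(list)
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip titles, blank lines and the column header.
        if len(fields) < 4 or fields[0] == "Destination":
            continue
        dest, gateway, _flags, netif = fields[:4]
        paths[dest].append((gateway, netif))
    return {dest: hops for dest, hops in paths.items() if len(hops) > 1}


if __name__ == "__main__":
    for dest, hops in multipath_routes(SAMPLE).items():
        print(dest, hops)
```

This only shows that both routes are present in the table. It says nothing about how the kernel distributes flows between them.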



No, my traffic is not balanced between the tunnels; only one is being used.

And you do have multiple streams from multiple source IP addresses?

Quote from: pmhausen on May 26, 2022, 12:29:46 AM
And you do have multiple streams from multiple source IP addresses?

Yes, about 100 computers are connected to each router.