My experience with opnsense

Started by ishan, October 09, 2023, 12:04:30 AM

Hey, I have been using RouterOS for a few years. Recently, the device was failing to keep up with my network, so I upgraded to an x64 machine (N305, 6x i226-V). I installed Proxmox on it and virtualized OPNsense. I was using OPNsense 23.7.5 (the latest stable release as of 2023/10/09).

I initially didn't like the UI, but I have grown to like it over the last week of testing. There are other issues: some that _may be_ addressed, and some that require a lot more work.

Issue #1:
Wireguard tooling is not optimized to handle IPv6 addresses

The OPNsense os-wireguard package tries to add an IPv6 route with the `route -4` flag and fails. I checked the source code for this package in the github.com/opnsense/plugins repository and the code there looks okay, so maybe this has already been fixed and the fix will be included in a future release.

> /usr/local/opnsense/scripts/Wireguard/wg-service-control.php: The command '/sbin/route -q -n add '-4' '<ipv6-address>' -iface 'wg3'' returned exit code '68', the output was 'route: bad address: <ipv6-address>
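
The error is consistent with the address-family flag being fixed rather than derived from the address being routed. A minimal Python sketch of deriving the flag from the address itself follows; `route_family_flag` is a hypothetical helper for illustration, not code from the plugin, and the addresses are documentation examples.

```python
import ipaddress

def route_family_flag(address: str) -> str:
    """Return '-4' or '-6' depending on the family of an address or CIDR prefix."""
    # strict=False accepts bare hosts as well as host/prefix forms like 10.0.0.1/24.
    network = ipaddress.ip_network(address, strict=False)
    return "-6" if network.version == 6 else "-4"

print(route_family_flag("10.0.0.1/24"))     # -4
print(route_family_flag("2001:db8::1/64"))  # -6
```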


Issue #2:
Multi WAN failover does not work

I have 2 WANs, WAN1 and WAN2. In Gateways -> Single, WAN1 has priority 1 and WAN2 has priority 2. There is a group which puts WAN1 in tier1 and WAN2 in tier2.

There is a firewall rule which sends traffic from all LAN networks that is _not_ destined for a private network over the WAN1_WAN2_FAILOVER gateway.

When WAN1 goes down, it does not add a default route pointing to WAN2! It does not do so until I adjust the priority of WAN2 to be higher than or equal to WAN1's. So multi-WAN in this case does not work at all. When this happens, I am left with no connectivity until the route is added on the router.
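
The behaviour expected from a tiered failover group can be sketched as "use a gateway from the lowest-numbered tier that is still up". This is a simplified Python illustration of that rule only, not OPNsense's implementation; the `Gateway` type and `pick_gateway` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Gateway:
    name: str
    tier: int   # lower number = preferred
    up: bool    # result of whatever health monitoring is in place

def pick_gateway(group: List[Gateway]) -> Optional[Gateway]:
    """Return an up gateway from the lowest-numbered tier, or None if all are down."""
    candidates = [gw for gw in group if gw.up]
    return min(candidates, key=lambda gw: gw.tier, default=None)

# WAN1 is down, so the tier-2 gateway should be selected.
group = [Gateway("WAN1", 1, up=False), Gateway("WAN2", 2, up=True)]
chosen = pick_gateway(group)
print(chosen.name if chosen else "no gateway available")  # WAN2
```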

Issue #3:
PPPoE stuck in stale state.

My ISP does not use customer VLANs. To simulate an outage, I added a random VLAN (on the bridged ONT) so the router could no longer communicate with the BNG. Immediately after adding the VLAN, the router started showing packet loss on this gateway and I verified WAN1 was unreachable. Even so, OPNsense took roughly 2 minutes to realize WAN1 was unavailable and mark it as such!

```
2023-10-09T02:12:24+05:30 router.home.arpa ppp[90523] [wan] IFACE: Up event
2023-10-09T02:12:24+05:30 router.home.arpa ppp[90523] [wan] IFACE: Rename interface ng0 to pppoe0
2023-10-09T02:15:36+05:30 router.home.arpa ppp[90523] [wan_link0] LCP: rec'd Terminate Request #84 (Opened)
2023-10-09T02:15:36+05:30 router.home.arpa ppp[90523] [wan_link0] LCP: state change Opened --> Stopping
2023-10-09T02:15:36+05:30 router.home.arpa ppp[90523] [wan_link0] Link: Leave bundle "wan"
```
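
For reference, the interval between the first and last events in this excerpt can be computed from the timestamps. This is only the gap between the logged `Up event` and the `Terminate Request`; the excerpt does not show when the VLAN was added, so it should not be read as an exact measurement of the detection delay.

```python
from datetime import datetime

# Timestamps copied from the log excerpt (ISO 8601 with a +05:30 offset).
up = datetime.fromisoformat("2023-10-09T02:12:24+05:30")
terminate = datetime.fromisoformat("2023-10-09T02:15:36+05:30")

gap = terminate - up
print(int(gap.total_seconds()))  # 192 seconds, i.e. 3 min 12 s
```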

After WAN1 was marked as unavailable/down, I ran into issue #2, where it would not add a default route over WAN2 because WAN2 had lower priority. For context, RouterOS would disable dead PPPoE tunnels in less than 2-3 seconds.

Issue #4:
Multi WAN failover has been implemented poorly.

In the previous routers I have used, multi-WAN failover was configured with recursive routes. It was much, much faster (less than a second to fail over and recover!).

In OPNsense, this is implemented very differently: a separate process monitors the WAN interfaces and swaps default routes. This step can fail, and when it does, the network is left with no outside connectivity. It is also much slower than using recursive routes. I don't see a way to do recursive routing in OPNsense, so I am guessing it's probably *BSD that does not support this.


Issue #5:
IPv6 assignment in non standard setups

If the ISP (WAN2 in my case, which is a 5G modem) assigns dynamic /64 blocks with SLAAC, you can set LAN to track interface WAN2 and clients will get IPv6 connectivity. The router will use NDP to sort this out.

If the ISP (WAN1, a fiber ISP) assigns dynamic /64 blocks with DHCPv6, it asks you to configure an IPv6 prefix ID, does not use NDP, and you cannot use the dynamic /64 prefix on anything more than a single local network.

I did not look too much into this one so it's possible I am missing the right way to configure it.


Issue #6:
Firewall rules and opening up ports on the router are just a mess.

I have a public static IP on WAN1, and WAN2 is behind CGNAT. I opened ports 51820 and 51377 for WireGuard and confirmed they were open. I wanted the WireGuard tunnels on the router to use these ports, and yet it was NAT-ing this traffic for no reason!

It was sending this traffic as WAN1:<random-port> -> WAN2:51377 (???? Why is it NAT-ing this connection to WAN2? There is no reason for it to do this) -> Remote Peer:51377.

On the remote peer, I could see this traffic was originating from <random-port> on the router, and I had to add another NO NAT rule to stop this from happening. Image: <see attached image>


Overall, there have been far too many paper cuts (beyond what I remembered to write in this post), and this is just not stable enough for me to continue using it.

It seems like most of your issues are simple misunderstandings, which is no shame when you're new to OPNsense. People here (mostly) don't bite, asking for help before throwing in the towel might have been a good idea.

#1 If you can provide steps how to reproduce this, open an issue on GitHub. The developers appreciate substantiated bug reports.

#2 A gateway group never changes the system's default route. The switch happens in the "route-to" option of the firewall rule. You can check this in Firewall: Diagnostics: Statistics: rules. If you want failover for services running on OPNsense itself, you can enable default gateway switching in System: Settings: General.

#3 #4 Not exactly sure how you would prefer this to work. Can you elaborate?

#5 A single /64 delegated via DHCPv6 PD won't support more than one LAN. That's how IPv6 works, not a limitation of OPNsense. If your ISP doesn't give you more than a /64, you'd have to use IPv6 NAT for additional LANs (ough).

#6 Needs more details. What did you create, firewall rules or port forward rules? What's the goal? An inbound rule on an interface behind CGNAT doesn't really make sense to me. Your ISP most likely won't allow inbound connections anyway.

Cheers
Maurice

October 09, 2023, 01:22:55 PM #2 Last Edit: October 09, 2023, 01:37:03 PM by Monviech
Quote from: Maurice on October 09, 2023, 11:57:27 AM
#5 A single /64 delegated via DHCPv6 PD won't support more than one LAN. That's how IPv6 works, not a limitation of OPNsense. If your ISP doesn't give you more than a /64, you'd have to use IPv6 NAT for additional LANs (ough).

Maybe they meant an NDP Proxy.
I wouldn't use an ISP who'd give me only one /64 prefix.
Hardware:
DEC740

> Maybe they meant an NDP Proxy.

Correct!


> I wouldn't use an ISP who'd give me only one /64 prefix.

You would if they were the only option.  ::)

Quote from: Maurice on October 09, 2023, 11:57:27 AM
It seems like most of your issues are simple misunderstandings, which is no shame when you're new to OPNsense. People here (mostly) don't bite, asking for help before throwing in the towel might have been a good idea.

#1 If you can provide steps how to reproduce this, open an issue on GitHub. The developers appreciate substantiated bug reports.

Looking at this script, https://github.com/opnsense/plugins/blob/master/net/wireguard/src/opnsense/scripts/Wireguard/wg-service-control.php, I believe this issue has been fixed. The fix has probably not made it into a release yet.

The issue was that it was trying to add an IPv6 route with `route -4 add ...`, which fails.


Quote from: Maurice on October 09, 2023, 11:57:27 AM
#2 A gateway group never changes the system's default route. The switch happens in the "route-to" option of the firewall rule. You can check this in Firewall: Diagnostics: Statistics: rules. If you want failover for services running on OPNsense itself, you can enable default gateway switching in System: Settings: General.

Got it! From the docs, I got the idea that this option was for cases where there are multiple gateways on a single WAN and you need a way to switch between them. Thanks for the correction.


Quote from: Maurice on October 09, 2023, 11:57:27 AM
#3 #4 Not exactly sure how you would prefer this to work. Can you elaborate?

#3: It should fail over much faster than it does. It takes forever to mark a PPPoE connection as down (10+ seconds).

#4: This is probably just a different way of doing things. I prefer recursive routes, but OPNsense and the NOS I am on now, VyOS, implement this differently: a separate service monitors a WAN connection and marks it active/failed. In MikroTik, I was doing this with recursive routes and it was much faster.


Quote from: Maurice on October 09, 2023, 11:57:27 AM
#5 A single /64 delegated via DHCPv6 PD won't support more than one LAN. That's how IPv6 works, not a limitation of OPNsense. If your ISP doesn't give you more than a /64, you'd have to use IPv6 NAT for additional LANs (ough).

If my ISP assigns a single dynamic /64 via SLAAC, I can use NDP and use this prefix from various networks in the LAN. If they assign a dynamic /64 via DHCPv6, I can't use it on more than one LAN interface. That was the problem here. It would be nice if it could use NDP in both situations.
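
The single-/64 constraint in the quoted reply comes down to prefix arithmetic: SLAAC expects a /64 per link, so a delegated /64 contains exactly one LAN-sized network, while a shorter delegation such as a /56 contains 256 of them. A small Python illustration, using documentation prefixes and a hypothetical helper name:

```python
import ipaddress

def lan_prefixes(delegated: str) -> list:
    """List the /64 networks that fit inside a delegated prefix."""
    return list(ipaddress.ip_network(delegated).subnets(new_prefix=64))

print(len(lan_prefixes("2001:db8:0:1::/64")))    # 1
print(len(lan_prefixes("2001:db8:0:100::/56")))  # 256
```

Sharing one /64 across several links therefore needs something other than ordinary routing of separate prefixes, which is where the NDP proxy idea from earlier in the thread comes in.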

Quote from: Maurice on October 09, 2023, 11:57:27 AM
#6 Needs more details. What did you create, firewall rules or port forward rules? What's the goal? An inbound rule on an interface behind CGNAT doesn't really make sense to me. Your ISP most likely won't allow inbound connections anyway.

The goal here was to open a port for WireGuard and have it communicate with the other peer over this port. Instead, it was using some random port, and the entries in conntrack were just really weird. (Please see the image above.)

The inbound rule was on the WAN interface group. One WAN has a static public IP and the other is behind CGNAT.
This is more complicated to explain, and we can honestly just skip past it, because I moved away from this setup and won't be able to replicate it or help in fixing it.


Regards

Ishan Jain

October 21, 2023, 01:09:25 PM #5 Last Edit: November 12, 2024, 02:35:42 PM by Monviech (Cedrik)
Well, there seems to be an ndproxy(4) kernel module available for FreeBSD 13:

https://man.freebsd.org/cgi/man.cgi?query=ndproxy&apropos=0&sektion=4&manpath=FreeBSD+13.2-RELEASE+and+Ports&arch=default&format=html

I think it could be manually loaded (with kldload) and then configured from the shell. But there is no GUI implementation for it. Neither pfSense nor OPNsense seems to support it in their GUI.

EDIT:
https://github.com/opnsense/plugins/pull/4348