Messages - anomaly0617

#1
Hi folks,

Hopefully a straightforward question here....

I have a location where there are dual WAN connections, one Fiber (WAN_FIBER), and one Coaxial (WAN_COAX).

The location has a few site-to-site Wireguard tunnels to other locations.

I want to:

  • Route the Wireguard traffic primarily over the Fiber line
  • Route everything else (internet for users, for instance) over the Coaxial line

I can do the second one primarily with Gateway Groups I have established. But the first one... I haven't found a way to bind Wireguard to a specific network interface like I could with OpenVPN and IPSec.

Am I missing something obvious?

Oh, just to cover the base... I have a firewall rule in the WAN_FIBER interface for incoming Wireguard traffic. The problem is the outgoing traffic. I'm trying to figure out how to define the interface the Wireguard traffic leaves out of, should this location be the initiating peer.

Thanks, in advance!
#2
Hi all,

In the past if I wanted to connect two buildings that had the overlapping internal subnet(s), I could use a 1:1 NAT mapping to deal with this problem. For instance:

Problem Scenario:

               | Building A     | Building B
Local Network  | 192.168.1.0/24 | 192.168.1.0/24 (Uh oh!)
Remote Network | 1.2.3.4/30     | 4.3.2.1/30

In order to make this VPN tunnel work, I need to do something like this:

               | Building A     | Building A (Masq.) | Building B (Masq.) | Building B
Local Network  | 192.168.1.0/24 | 172.16.1.0/24      | 172.16.2.0/24      | 192.168.1.0/24
Remote Network | 1.2.3.4/30     |                    |                    | 4.3.2.1/30

And now from Building A, if I ping 172.16.2.1, I get responses from the Building B firewall.

And from Building B, if I ping 172.16.1.1, I get responses from the Building A firewall.

The magic here was in the Phase 2 VPN tunnel: there was a "Manual SPD entries" field that let me specify the masquerade network. And then under Firewall >> NAT >> One-to-One, I'd create a custom mapping that converted, say, 172.16.2.26 into 192.168.1.26 in Building B, or 172.16.1.52 into 192.168.1.52 in Building A.
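The one-to-one mapping keeps the host octet and swaps only the /24 prefix. Here is a minimal shell sketch of that arithmetic. `map_one_to_one` is a hypothetical helper name used for illustration, not something OPNsense provides:

```shell
# Hypothetical helper: translate a masqueraded address to the real one by
# swapping the first three octets and keeping the host octet (1:1 NAT on a /24).
map_one_to_one() {
    addr=$1          # address as seen from the far side, e.g. 172.16.2.26
    masq_prefix=$2   # masquerade network without the host octet, e.g. 172.16.2
    real_prefix=$3   # real local network without the host octet, e.g. 192.168.1
    case "$addr" in
        "$masq_prefix".*) printf '%s.%s\n' "$real_prefix" "${addr##*.}" ;;
        *) printf '%s\n' "$addr" ;;   # not in the masquerade range: unchanged
    esac
}

map_one_to_one 172.16.2.26 172.16.2 192.168.1   # Building B example from the post
map_one_to_one 172.16.1.52 172.16.1 192.168.1   # Building A example from the post
```

Both calls print the real 192.168.1.x address with the same host octet, which is what the One-to-One NAT entry does for the whole /24 at once.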

With me so far?

I'm migrating all of my VPN tunnels over to the new IPSec VPN Connections mechanism. And I've got 100+ new successful tunnels under my belt, so I'm fairly confident at this point that I'm doing it correctly. But this is the first time I've run into a conflict of networks.

So my question is, how do I achieve this under the new Connections mechanism of IPSec?

Is it under VPN >> IPSec >> Virtual Tunnel Interfaces, or
Is it under VPN >> IPSec >> Security Policy Database >> Manual >> Add Manual SPD?

Are there examples somewhere to reference?

Thanks, in advance!
#3
High availability / ACME Client does not sync
March 25, 2024, 01:28:35 AM
Has anyone mentioned that the ACME client does not stay synchronized together with HA?

I see where some settings come over, but specifically certificates are not being copied, so if one server has the certificates and the other doesn't, when they flip-flop, suddenly a bunch of sites come up with nonexistent/expired certificates. This is happening using the HAProxy Reverse Proxy solution. HAProxy is sync'ing up, but ACME-Client isn't.
#4
Might be a long shot, but under Firewall >> Rules >> [Interface], do you have a rule to allow "IPv4 CARP" traffic from the "[Interface] net" to any port, any destination, any gateway, on any schedule?

If so, what happens if you create that same rule but under Firewall >> Rules >> Floating and check it for all interfaces other than your WAN interface (because you really don't want CARP traffic from the internet)?

When I saw this, it turned out I wasn't thinking from the right perspective about how the firewalls were communicating their synchronization status. After I added the floating rule and it all of a sudden worked like a charm, I deleted it, created individual rules for each interface (the copy/clone button is a godsend), and then disabled them one by one until I figured out what was going on.

Hope this helps!
#5
Update 3: Got it!

Here's what has been working since 17:31 yesterday (it's now 14:00 here).

Under Firewall >> NAT >> Outbound, create a rule:

  • Interface: WAN
  • TCP/IP Version: IPv4
  • Protocol: ESP
  • Source Address: This Firewall *This seems to be the REALLY important part!
  • Destination Address: any
  • Translation/Target: [Your CARP Virtual IP WAN Address you want to use for VPN]
  • Description: IPSec ESP Traffic Out
  • Save It.
Move this rule to the top of your manual rule stack.

Clone it. For this one, here are the parts that change:

  • Protocol: UDP
  • Destination Port: ISAKMP
  • Description: IPSec ISAKMP Traffic Out
  • Save It.
This rule defaults to the second from the top, so no need to move it.

Now Clone it (3rd rule).

  • Destination Port: NAT-T
  • Description: IPSec NAT-T Traffic Out
  • Save It.
This rule should now be the third down in the stack, so no need to move it.

Now:

  • Apply Changes
  • Sync your HA servers!
  • Now restart Your IPSec services on the HA Firewall.
  • Verify that all tunnels come back up.
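For readers who think in rulesets rather than GUI fields, the three rules above can be sketched roughly in pf syntax. This is an approximation only: the ruleset OPNsense actually generates may differ, `em0` and `203.0.113.146` are placeholder values, and `ipsec_outbound_nat_rules` is a hypothetical helper name.

```shell
# Print the three outbound NAT rules in approximate pf syntax.
# $1 = WAN interface name, $2 = CARP virtual IP (both placeholders).
# "(self)" stands in for the GUI's "This Firewall" source.
ipsec_outbound_nat_rules() {
    wan_if=$1
    carp_vip=$2
    printf 'nat on %s inet proto esp from (self) to any -> %s\n' "$wan_if" "$carp_vip"
    for port in 500 4500; do   # 500 = ISAKMP, 4500 = NAT-T
        printf 'nat on %s inet proto udp from (self) to any port %s -> %s\n' \
            "$wan_if" "$port" "$carp_vip"
    done
}

ipsec_outbound_nat_rules em0 203.0.113.146
```

The point of the sketch is the shape: one ESP rule plus two UDP clones that differ only in destination port, all sourced from the firewall itself and translated to the CARP address.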
#6
Update 2: Still not working properly. As of this morning we have this in the logs from my satellite office:

2024-03-08T08:40:13-05:00   Informational   charon   09[NET] <353> sending packet: from xxx.xxx.xxx.185[500] to xxx.xxx.xxx.157[47289] (36 bytes)   
2024-03-08T08:40:13-05:00   Informational   charon   09[IKE] <353> no IKE config found for xxx.xxx.xxx.xxx...xxx.xxx.xxx.157, sending NO_PROPOSAL_CHOSEN

157 and 158 are the actual WAN addresses for the individual firewalls. They should never appear. This should always say the traffic is coming from the CARP address, 146.
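Spotting these lines by eye gets tedious, so a small filter over the charon log can help. The sketch below assumes the log lines keep the `packet: from A[port] to B[port]` shape shown above; the function names are hypothetical.

```shell
# Print "source destination" for every charon line on stdin that contains
# "packet: from A[port] to B[port]" (both "sending" and "received" lines).
charon_endpoints() {
    sed -n 's/.* packet: from \([^[]*\)\[[0-9]*\] to \([^[]*\)\[[0-9]*\].*/\1 \2/p'
}

# Keep only the pairs in which either endpoint is one of the HA nodes' own
# (non-CARP) WAN addresses. $1 and $2 are those two addresses.
real_wan_hits() {
    charon_endpoints | awk -v a="$1" -v b="$2" \
        '$1 == a || $2 == a || $1 == b || $2 == b'
}

# Demo with the first log line quoted in this post (addresses redacted as posted).
printf '%s\n' \
    '2024-03-08T08:40:13-05:00 Informational charon 09[NET] <353> sending packet: from xxx.xxx.xxx.185[500] to xxx.xxx.xxx.157[47289] (36 bytes)' \
    | real_wan_hits xxx.xxx.xxx.157 xxx.xxx.xxx.158
```

Any output means a node's real WAN address is still showing up as an IKE endpoint. In practice the input would be the IPsec log file rather than a single line; its location depends on the installation.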

Any ideas? I'm fresh out of them.
#7
Update: We suspect we've found the cause of this and the resolution.

The fix is likely to put a Manual Outbound NAT rule in place that says "Interface=WAN, Source=IPSec Net, NAT Address=(CARP WAN IP that you want)". Be sure to position the rule sensibly: a lot of traffic is going to go through this rule, and if rules are processed sequentially from top to bottom, you don't want every packet going through 20 rules to find a match.

The cause seems to be that High Availability is cycling between the two OPNsense firewalls. When this happens AND there isn't a rule in place as mentioned above, the IP address of the firewall changes, which throws the firewall it connects to off in a major way. Once the rule above was in place AND we cycled the Strongswan service to reset all the tunnels, the problem (so far) has disappeared. Only time will tell if it remains gone.
#8
Edit: It occurred to us that we neglected to mention the version of OPNsense we're working with here. Every device is running at least OPNsense 24.1.2 or newer. /edit

Hi all,

Well, we thought we had this problem resolved (see my previous post if you're confused) but it turns out, maybe not.

We're testing the new Strongswan IPSec Connections before we roll them out to all of our partners.

We have a handful of test sites that are single ISP, single OPNsense firewall locations. The IPSec VPN tunnels between these sites seem to work beautifully. Generally no issues.

We colocate in a datacenter site that has multiple ISPs. They are backed by 60+ carriers, and they use some form of OSPF/BGP/RIP advertisements to switch us all dynamically across routes as necessary. There are some large Fortune 500 companies in the same datacenter. They do not go down. Ever. OK, maybe ALMOST never. But it hasn't happened in 4+ years of having servers there. And we've never experienced any issues with IPSec or OpenVPN tunnels thus far. So I doubt their routes have anything to do with this problem.

At this datacenter site, we have two OPNsense firewalls running on identical Dell PowerEdge R240 servers. They are in a high availability (HA) cluster.

We're having challenges with the satellite locations' IPSec VPN tunnels staying alive to the datacenter site with the High Availability Cluster. Every 4 hours the Phase 1 seems to rekey/renegotiate successfully, but the Phase 2 often seems "broken." Like, it appears that there is a Phase 2 connection being made, but the traffic is only one way. No "bytes in" at the High Availability location. The satellite location records "bytes out", but we don't see them reflected at the datacenter site.

So far, we've tried a bunch of the suggestions we've seen, such as:
On the Phase 1 side:

  • We've disabled MOBIKE in the Phase 1 for all sites that connect to this HA cluster
  • We've "dumbed it down" so that the only IP listed for each tunnel at that site on the local side is the primary CARP Virtual IP, let's call it ".146".
  • We've "dumbed down" the satellite sites so that the only IP they connect to is ".146"
  • We've switched to all IP addresses for local and remote IPs, so no name resolution is required.
  • We've played around with the DPD value on the tunnels going to that site. In general the DPD is set to 1, but Franco and I had a discussion ages ago about DPD and its negative effects on Voice over IP traffic, so we've generally been wary of DPD since then. I tried setting it to 0, and we've tried it set to 86400 (a day). It does not seem to make a difference.
  • We set a continuous ping on each firewall, pinging the other one to see if it would keep the tunnel alive. It didn't.
  • We made sure under Firewall -> Rules -> WAN we have three rules: one for ESP, one for ISAKMP, and one for NAT-T. Originally we had them limited by source and destination, but the most recent configuration has them set to Any/Any for the Source/Destination set. This is what got tunnels up last time (more on that in a minute). We'd like to lock this down. Just having it this way makes us twitchy.
  • We're using EC521 certificate keypairs, not pre-shared keys
On the Phase 2 side:

  • We've tried it with and without Policies checked. Thing is, we were never able to get a tunnel online with Policies unchecked, so we've been leaving it checked.
  • Start action is Trap + Start
  • Close action is Start
  • DPD action is Start
  • We've vacillated between ESP Proposals. Ideally we want to use AES256-SHA512-ECP521, but we've had to switch to AES256-SHA512-MODP2048 on multiple occasions. The Phase 1 is consistently AES256-SHA512-ECP521
  • Local Subnets only include /24 subnets that are at the datacenter location. No exceptions.
  • Remote Subnets only include /24 subnets that are at the given satellite location. No exceptions.

But then:

We dug into the logs. Remember how everything should be coming in and going out from ".146"? What we noticed was that traffic exiting the firewall is going out with its Non-CARP (Real WAN) IP address, ".158" in this case.

11[NET] <4496d3d2-82a6-4b82-bf9c-3d0b78a3096a|375> sending packet: from xxx.xxx.xxx.158[4500] to xxx.xxx.xxx.92[4500] (96 bytes)

And it would appear that the satellite firewalls are responding to that traffic in kind, because we have log traffic that looks like this:

12[NET] <4496d3d2-82a6-4b82-bf9c-3d0b78a3096a|375> received packet: from xxx.xxx.xxx.92[4500] to xxx.xxx.xxx.158[4500] (96 bytes)

So, this is just NAT, right? We should be able to redirect that NAT traffic using an outbound NAT rule, I would assume. Just like we tell the firewall to send traffic from a server inside the datacenter out using a different IP, say, ".152", we should be able to tell the firewall to take any traffic from strongswan and route it out the door using ".146".

But is the source IP on that the LAN IP address, or the WAN IP address? I could make arguments for it being both.

We thought we'd try it just by specifying IPSec as the interface, but that did not work.

And, that might be a red herring. We may be barking up the wrong tree on the fact that it's entering/exiting from the real interface instead of appearing to enter/exit from the Virtual IP/CARP interface.

Any advice on HA IPSec configurations would be welcome. We've got a lot of HA setups across the world, and more are coming as we go multi-ISP and multi-firewall for sites.

We're happy to send screen captures to someone privately, but I don't want to post them publicly. There are so many things we'd have to redact that I suspect it would be redundant to do so.

Thanks, in advance!
#9
Are you having a problem getting an IPSec tunnel (the new >23.1 style) to connect to a High Availability environment?

I've been hammering away at this for the last hour or so, and this is what solved it for me. It's this little section in the tutorial I totally skipped over because "of course that's still there" from when I ran IPSec tunnels under the old style IPSec before.

Quote: Firewall Rules Site A & Site B (part 1)
To allow IPsec Tunnel Connections, the following should be allowed on WAN for on sites (under Firewall ‣ Rules ‣ WAN):


  • Protocol ESP
  • UDP Traffic on Port 500 (ISAKMP)
  • UDP Traffic on Port 4500 (NAT-T)

In my case, ever since going to High Availability, I've had to explicitly specify which CARP Interface IP (or an Alias containing the CARP Interface IPs, one for each of my ISPs) my rules apply to.

This got me thinking "I'm only accepting IPSec VPN traffic on one IP of each block of IPs from the ISPs. I'll bet I have to put some custom rules in place to accept this."

So I created some new rules based on the above that look like this (this is the first one. You can clone it and modify for the other two):

Firewall - Rules - WAN:

  • Action: Pass
  • Disabled: No
  • Quick: Yes
  • Interface: WAN
  • Direction: In
  • TCP/IP Version: IPv4 *Your preference, but I don't use IPv6
  • Protocol: ESP
  • Source/Invert: No
  • Source: acl_remote_sites *An alias that includes my remote sites' IP addresses.
  • Destination/Invert: No
  • Destination: acl_wan_1st_ips *An alias that includes ISP1's 1st IP, ISP2's 1st IP, etc.
  • Destination Port Range: Greyed Out, but on others you'll put in ISAKMP or NAT-T
  • Log: No
  • Category: [blank]
  • Description: IPSec ESP
  • No XMLRPC Sync: No
  • Schedule: None
  • Gateway: GWG_Pri_ISP1_Sec_ISP2_Tert_ISP3 *A Gateway Group I created to decide the order for failover

Save, Rinse, Lather, Repeat for the other two rules. Put them at the top of your WAN rule stack under your block rules and maybe your Allow CARP Traffic rule. This way the rule is processed quickly.
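As a rough cross-check of what the three cloned rules amount to, here they are sketched in approximate pf syntax. This is an illustration under assumptions, not the ruleset OPNsense generates: `em0` is a placeholder interface, the gateway group handling is left out, and `ipsec_wan_pass_rules` is a hypothetical helper name. The alias names are the ones from the rule above.

```shell
# Print the three WAN pass rules in approximate pf syntax.
# $1 = WAN interface, $2 = source alias, $3 = destination alias.
ipsec_wan_pass_rules() {
    wan_if=$1
    src_alias=$2
    dst_alias=$3
    printf 'pass in quick on %s inet proto esp from <%s> to <%s>\n' \
        "$wan_if" "$src_alias" "$dst_alias"
    for port in 500 4500; do   # 500 = ISAKMP, 4500 = NAT-T
        printf 'pass in quick on %s inet proto udp from <%s> to <%s> port %s\n' \
            "$wan_if" "$src_alias" "$dst_alias" "$port"
    done
}

ipsec_wan_pass_rules em0 acl_remote_sites acl_wan_1st_ips
```

The only things that change between the clones are the protocol and the destination port, which is why cloning the first rule is the quickest way to build the other two.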

In my case, within 5 minutes of applying these rules, my remote firewalls were connecting to my High Availability cluster. It did take about 5 minutes though.

YMMV, but leave a "thumbs up" or something if this helped you. :-)
#10
Virtual private networks / OpenVPN - Routing bug?
January 23, 2024, 04:55:03 PM
Hey folks!

I'm not sure if we call this a bug or ... what.

Here's what I've discovered, and reproducing it seems easy enough.

On a given firewall for a multi-site (10+ sites) organization, I previously used IPSec for the site-to-site connections.

IPSec tunnels work, but they do not allow for multi-WAN failover. For instance, we've recently moved from using "Dedicated" circuits (Costly, slow) to using multiple carriers with "Best Effort" circuits (Inexpensive, and generally fast). So we have two ISPs coming into each building. If the primary connection fails, the secondary connection seamlessly takes over by utilizing Gateway Groups.

However, IPSec tunnels tend to rely heavily on IP addresses for connections and identifiers. There are no options for "Try this connection as your primary, then this one as your fallback, and then on the other end, try this connection as your primary, and this other connection as your fallback."

This is where OpenVPN shines. Admittedly you have to define one site as a "parent/server" site and the other as a "child/client" site, but the multiple connection points thing is much easier to do, because in the client portion of OpenVPN, I can say "connect here on this port first, and then here on this port second."

However--

The desire by management for RoadWarrior VPN connections such that "once you connect to one site, you are connected to all sites" has become a problem. In my OpenVPN RoadWarrior configuration, I can define "Remote Networks" and include all the subnets of all the other sites. Works like a charm, until it creates a massive problem.

We noticed this when one internet connection was behaving flaky, and we were failing over to the other internet connection. We would (seemingly randomly) have trouble with sites connecting to other sites. And after about 6 months of frustration, I think I've finally found what's happening.

I started putting the Client or Server interface in the name of the OpenVPN profile. So for instance, parent/servers were ovpns{1,2,3,4,5,...} and child/clients were ovpnc{1,2,3,4,5}. This corresponds to the netif field in System >> Routes >> Status.

Let's assume that the RoadWarrior VPN connection, which was created when I first commissioned the site, is "ovpns1". And then after that, depending on the site, we added "ovpns{2,3,4...}" or "ovpnc{1,2,3...}".

When a site would stop routing to another site, I'd have to go in, disable the OpenVPN on both sides, go to the routes table, clear any leftover routes to the destination network(s) on each side out, and then restart the server, then restart the client. And this is where it gets interesting...

For many sites, I migrated them back to IPSec for stability reasons, and if the internet connection is being flaky, I'm just deactivating one IPSec tunnel and activating the second one. Which I can do automatically using Monit. But today I went to a site where I still had the OpenVPN Server/Client setup for the site-to-site, and I noticed when I went into the routes table that the "RoadWarrior" OpenVPN Server (ovpns1) had routes to all the other networks still there, even though no one was connected to the RoadWarrior VPN. So I cleared out all the ovpns1 routes to other sites and restarted the OpenVPN tunnel. It's back alive again.

Here's my theory:
All is well until someone connects to the RoadWarrior VPN connection, at which point the routes for the RoadWarrior VPN are created and it disrupts the Site-to-Site routes. Then that person disconnects and the routes do not delete. Which would explain the unpredictability of when this happens.

Here's my proposed fix/solution:

  • Routes need "weights" or "costs" assigned to them. The RoadWarrior routes need a "higher cost" or "lower weight" than the Site to Site VPN tunnel routes. And there needs to be a way to script out clearing routes, because I have not found that command in BSD yet.
  • There needs to be a cleaner way to delete routes via scripts. I've figured out ways to start and stop OpenVPN and IPSec tunnels using Monit. It would be nice if there was a way to write a global script that clears routes after I stop a tunnel and before I start it back up again.
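One possible way to script the route cleanup is sketched below. It assumes the FreeBSD `netstat -rn` layout in which the interface name is the fourth column (Destination, Gateway, Flags, Netif) and that `route -n delete <destination>` removes a single entry. Both function names are hypothetical, and the routes in the demo are made-up sample data.

```shell
# Read "netstat -rn" style lines on stdin and print the destination of every
# route whose Netif column (4th field) matches the given interface name.
routes_on_netif() {
    awk -v ifname="$1" '$4 == ifname { print $1 }'
}

# Print (dry run, the default) or execute a "route delete" for each
# destination still attached to the interface. Set DRY_RUN=0 to delete.
clear_routes_on_netif() {
    netstat -rn -f inet | routes_on_netif "$1" | while read -r dest; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "route -n delete $dest"
        else
            route -n delete "$dest"
        fi
    done
}

# Demo of the parsing step only, on made-up sample routes.
routes_on_netif ovpns1 <<'EOF'
Destination        Gateway            Flags     Netif Expire
10.20.0.0/24       10.8.0.2           UGS      ovpns1
10.30.0.0/24       10.9.0.2           UGS      ovpnc1
EOF
```

Running `clear_routes_on_netif ovpns1` on the firewall in dry-run mode first shows which routes would go before anything is removed.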
#11
Hi all,

It's possible this is covered somewhere and I missed it in my searches. If so, point me in the right direction and thanks in advance!

Under IPSec, there's a setting called Dead Peer Detection that would send an R_U_THERE packet every X seconds and if it didn't get a response, declare the tunnel dead and you could do resulting actions (Clear, Restart, Trap, etc.). It was far from perfect and there were good reasons not to use it in specific scenarios, but it existed as an option and could be helpful.

Under OpenVPN, there doesn't seem to be the same thing. Here's what I'm running into:

  • The tunnel drops because one side or another has an internet issue. While I could go down the rabbit hole of why the ISP sucks, it happens and it's a fact of life.
  • The issue is, when this happens, the firewall on the remote side (and sometimes the local side too) seems to retain routes to that network over the OpenVPN interface. You can see this in System > Routes > Status by searching on the ovpns# or ovpnc#, depending on which side of the tunnel you're looking at in that moment.
  • With these stale routes in place, even if the tunnel re-establishes, the routes don't seem to work.
  • The only solution that I've found that consistently works is to disable the tunnel on both sides, go and clear the routes out of the table above, then re-enable the server, and then re-enable the client.

Ideally I'd like to automate this process. Here's what I'm thinking, and if someone has a more elegant solution, I'm all ears:

I'm thinking all this could be done with Monit.
Set up a check in Monit for every minute, pinging a remote host. I'd suggest a remote host that isn't the remote firewall, because that way you ensure that the route is working all the way through to the end destination and not just to the remote firewall. So in my case, I use the NVRs at the remote site.

Assumptions:

  • Monit is running on each firewall of the tunnel
  • Each running once a minute
  • Each sync'ed with time using NTP

On the 10.0.0.0/24 network side (server side):
ping -4 -c 4 -S [LAN_Address] [Remote_NVR_IP]
(I've also seen in the forums where people do this with tcpdump. I'm not particular. I just need something that can be interpreted as a success/failure)
if [failure]

  • disable the tunnel (what command? **See below). I was thinking psgrep server[##], but it seems psgrep isn't an option in OPNsense, and ps -ax | grep server[##] returns both the real process ID and the grep command's process ID.
    **Discovered that 'pgrep -f server[##]' returns only the real process ID (and 'pkill -f server[##]' kills it directly), but that doesn't disable the tunnel. It just kills the current process.
  • delete all the routes for that tunnel from the routes table (what command?)
  • enable the tunnel (what command?)

On the 192.168.72.0/24 network side (client side):
ping -4 -c 4 -S [LAN_Address] [Remote_NVR_IP]
if [failure]

  • disable the tunnel (what command?)
  • delete all the routes for that tunnel from the routes table (what command?)
  • wait 15 seconds (sleep 15) - this gives the server side time to "reset"
  • enable the tunnel (what command?)

Can the above actions be scripted? If so, does someone have a template for this already in Monit?
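The control flow above can be sketched as a script that Monit (or cron) runs each cycle. Everything that touches the tunnel is a placeholder here: the post leaves the real stop, clear-routes, and start commands as open questions, so `STOP_CMD`, `CLEAR_ROUTES_CMD`, and `START_CMD` are hypothetical variables that default to harmless `echo` calls.

```shell
# Placeholders: replace the echo commands once the real ones are known.
: "${PING_CMD:=ping}"                        # can be swapped for a stub in tests
: "${STOP_CMD:=echo stop-tunnel}"
: "${CLEAR_ROUTES_CMD:=echo clear-routes}"
: "${START_CMD:=echo start-tunnel}"
: "${RESET_DELAY:=0}"                        # the client side would use 15

# Succeeds when the remote host answers 4 pings sourced from the LAN address.
tunnel_alive() {
    $PING_CMD -4 -c 4 -S "$1" "$2" >/dev/null 2>&1
}

# Stop, clear routes, wait, start; stops at the first step that fails.
reset_tunnel() {
    $STOP_CMD && $CLEAR_ROUTES_CMD && sleep "$RESET_DELAY" && $START_CMD
}

# $1 = local LAN address, $2 = remote host to ping (for example the NVR).
check_and_reset() {
    if tunnel_alive "$1" "$2"; then
        echo "tunnel ok"
    else
        reset_tunnel
    fi
}

# Example call (not run here): check_and_reset [LAN_Address] [Remote_NVR_IP]
```

Setting `RESET_DELAY=15` on the client side reproduces the "wait 15 seconds" step, so the server has time to come back before the client reconnects.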

(Edits for clarification)
#12
23.7 Legacy Series / Novatel Cellular LTE modem
December 01, 2023, 06:34:20 PM
This is running OPNSense 23.7.9-amd64, new build.

The hardware I have:

  • ProtectLI FW2B-0 (CPU: Celeron J3060, Mem: 8GB, SSD: 250 GB)
  • PCI-E SIM Card Adapter (Installed in the WiFi/LTE slot)
  • Novatel LTE modem (Installed on the PCI-E SIM card adapter)
  • External Antennae for LTE

The Novatel card shows up on the system as ue0, and I can assign it as the WAN interface.

What I cannot do is define an APN for the SIM card, get information from the SIM card, or connect to the modem at all using cu -l as noted in OPNsense documentation. So the interface is there, but showing as down. I have DHCP and DHCP6 selected for the IP address assignments.

I found this HOWTO and attempted to follow it. But I do not have a /dev/cuau0 or child devices.

I do however see the device when I run the following:

root@fw-026-001:/dev # dmesg | grep Nova
ugen0.4: <Novatel Wireless, Inc. Novatel Wireless HSPA> at usbus0
cdce0: <Novatel Wireless, Inc. Novatel Wireless HSPA, class 239/2, rev 2.00/0.00, addr 3> on usbus0


And running usbconfig, I see the following:
root@fw-026-001:/dev # usbconfig
ugen0.1: <Intel XHCI root HUB> at usbus0, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=SAVE (0mA)
ugen0.2: <vendor 0x1a40 USB 2.0 Hub MTT> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (100mA)
ugen0.3: <Avocent Avocent AVRIQ-USB> at usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON (100mA)
ugen0.4: <Novatel Wireless, Inc. Novatel Wireless HSPA> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (500mA)
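To look at the modem in more detail, its ugen identifier can be pulled out of the usbconfig listing and passed back to usbconfig. The helper below is a hypothetical convenience that relies only on the output format shown above.

```shell
# Print the ugenX.Y identifier of every usbconfig line on stdin whose
# description contains the given text.
usb_device_id() {
    awk -F': ' -v needle="$1" 'index($0, needle) { print $1 }'
}

# Demo with the Novatel line from the listing above.
printf '%s\n' \
    'ugen0.4: <Novatel Wireless, Inc. Novatel Wireless HSPA> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (500mA)' \
    | usb_device_id Novatel

# On the firewall itself this would be:
#   usbconfig | usb_device_id Novatel
#   usbconfig -d ugen0.4 dump_device_desc
```

The device descriptor dump shows how the modem presents itself on the bus, which may help explain why it attaches as a cdce network interface rather than exposing a serial port.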


Since there's no /dev/cuau0 or similar device, I cannot run cu -l /dev/cuau0 or see any child devices like .1 or .init or whatever.

I also tried cdce0 and ugen0.4:
root@fw-026-001:/dev # cu -l /dev/cdce0
/dev/cdce0: No such file or directory
link down

root@fw-026-001:/dev # cu -l /dev/ugen0.4
cu: unsupported speed 9600
root@fw-026-001:/dev # cu -l /dev/ugen0.4 -s 115200
cu: unsupported speed 115200


I can't create a PPP interface as defined in the HOWTO doc because when I go to that screen (/interfaces_ppps_edit.php) there's no Link Interface(s) in the dropdown, even if I unassign the NovaTel ue0 interface from the WAN assignment.

I'm likely missing something stupid due to having no experience configuring an internal, integrated cellular modem into an OPNsense box. I've done this loads of times using a JetPack or MiFi.

What am I missing?

Thanks in advance!
#13
I've been asked by management to add a disclaimer between the OPNsense logo and the Login prompt. The disclaimer is their boilerplate "This is a government/commercial institution device, hacking is a crime and you really shouldn't be here if you aren't supposed to be here, blah blah" message that we have on all the PCs when you press Control-Alt-Delete and before the login prompt appears. This is for the HDMI/VGA console and the SSH console, not the Web UI (though they may want that too eventually).

Is there an easy way for me to add this verbiage to the login screen? In Linux I think this is /etc/motd, but I'm not sure what it is on OPNsense/FreeBSD.

Thanks, in advance!
#14
This doesn't answer your question specifically, but just my two cents on it.... In IPSec for a Phase 1 tunnel, at the very top, there's a field that defines whether either side can attempt to establish the tunnel, or if one side does it immediately, on traffic, or just listens for a connection. I've used this in the past to dictate when tunnels are established.

As far as the "Connections (new)" section is concerned, I'm an old crusty OPNsense user, having switched over around 2016. I'm still confused what this "Connections (new)" section is for. I know what I'd like it to be for - multiple IP addresses for the same location, like for where we have redundant internet connections and if one goes down or is unavailable, it "fails over" to the next one in the list - but I've not found the documentation stating officially what its purpose is.
#15
I've applied these patches to a few firewalls, but looking at them, they seem to only affect the UI, not the underlying code that may create or destroy routes when they are initiated or dropped. Am I being dense, or is this the case?

The issue I'm seeing is that nothing seems to be consistently destroying/deleting the routes when a tunnel drops, and then because there's already "a route" when the tunnel re-establishes, the route command can't do its job. But the old routes are also stale/dead and don't work.