IPsec (site to site) connection "problems" and an interim solution

tofflock · May 24, 2018, 04:15:50 PM

Hi

I have identified a problem with IPsec (site to site) not connecting automatically.

I have two FWs (UK-FW & FR-FW) both running OPNsense 18.1 series:

My tests and investigations described below have been carried out on UK-FW which is up-to-date and has the following overall build:

OPNsense 18.1.8-amd64
FreeBSD 11.1-RELEASE-p10
OpenSSL 1.0.2o 27 Mar 2018

FR-FW also has the same build - i.e. they're both up-to-date at the time of writing.

Both firewalls use a domestic ADSL line for their WAN connection.
I have configured a VPN between the two firewalls.
Both firewalls use a dynamic dns service to facilitate flexible configuration using FQDN, and not hard-coded IP addresses.
This VPN uses a PSK and, when it is up, it works well.

The following statement applies to both firewalls.

Quote: The ADSL line is connected to a modem which has a PPPoE connection to the WAN port on the OPNsense firewall. When the WAN connection comes up, it always has a fresh (& different) public IP address. This new address is associated quite quickly with its FQDN via the dynamic dns service.

This means that after a reboot of either one or both firewalls, the IPsec tunnel will not be established automatically. I recognise that the initialisation of the tunnel now occurs at a different time in the boot sequence than when I reported a problem with v17.1.5 (Thanks for fixing that bug :) ). However, I believe that there is a missing component which means that in my case (i.e. neither firewall can have a fixed public IP address) it can never be established automatically. (Please note, there may be something I've missed (in the configuration) and I am open to suggestions and advice :) )

Here are my observations and fixes:

Let's assume that IPsec between UK-FW and FR-FW is working (traffic is flowing). I will assume in the following scenario that the UK-FW is my local firewall and is described by the details referring to the "Left" side.
This means that the remote firewall (FR-FW) is described by the parameters for the "Right" side.
Now consider what happens when FR-FW drops its ADSL connection, and then reacquires it. After this reconnection, it will have a new public IP address. FR-FW ensures that this new address is updated to the dynamic dns service.
When the WAN comes up, FR-FW also appears to update the IPsec configuration file (/usr/local/etc/ipsec.conf) so that its new WAN address is correctly inserted into the configuration lines:

Code:

conn con1
  .
  .
  left = <my WAN ip address>
  leftid = <my WAN ip address>
  .
  .

The IPsec tunnel will not come up at this point.
If one looks at the two IPsec configuration files on UK-FW, one will observe discrepancies which will never be detected or fixed.
In the IPsec configuration file (/usr/local/etc/ipsec.conf), there is a hard-coded IP address for the remote firewall (FR-FW) in the line which starts " rightid = ". This address is now wrong and there appears to be no automatic fix.
Similarly, the IPsec secrets file (/usr/local/etc/ipsec.secrets) holds a "stale" (old) IP address associated with the shared key (PSK).
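
The stale "rightid" described above can be spotted with a few lines of shell. This is only a minimal sketch, not part of OPNsense: the function names are made up for illustration, and the line format is assumed to be the "  key = value" layout shown in the ipsec.conf excerpt earlier in this post.

```shell
#!/bin/sh
# Hypothetical helpers: compare rightid in an ipsec.conf-style file with the
# address the peer's FQDN currently resolves to.

# Print the value of the first "  KEY = value" line in FILE.
# usage: conf_value KEY FILE
conf_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | head -n 1
}

# Return 0 if rightid in FILE equals CURRENT_IP, non-zero if it is stale.
# usage: rightid_is_current FILE CURRENT_IP
rightid_is_current() {
    [ "$(conf_value rightid "$1")" = "$2" ]
}

# Example use on the firewall (assumes host(1) is installed; xx2.duckdns.org
# is the FR-FW name used later in this thread):
#   ip=$(host -t A xx2.duckdns.org | awk '/has address/ {print $4; exit}')
#   rightid_is_current /usr/local/etc/ipsec.conf "$ip" || echo "rightid is stale"
```

The current address is passed in as an argument so that the comparison can be tried against a copy of the file without touching DNS or the live configuration.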

I have identified two ways to bring the IPsec tunnel back up. They are:

Method 1

Using the GUI (web console at VPN: IPsec: Tunnel Settings), press the "edit" button for either the phase_1 or phase_2 entry. Do not change any parameter; just press the "Save" button.
When prompted, press the "Apply Changes" button.
The IPsec tunnel will come up.
Note: This action has to be carried out on the system which has not just been assigned a new WAN address. In the scenario I described above, this would have been carried out on UK-FW.

Method 2

Working on the system which has not just been assigned a new WAN address, one can correct the errors in the two ipsec configuration files. (This can be done manually with a text editor (*1) or scripted, which is less tedious (*2).)
Then one needs to run the commands:

Code:

/usr/local/sbin/ipsec rereadall
/usr/local/sbin/ipsec reload

The IPsec tunnel will come back up.

Notes *1

Editing ipsec.conf with a text editor (I used vi) is potentially risky. It's certainly tedious! ;)
Make sure you don't delete or change the two spaces at the beginning of the line.
Change only the lines with an incorrect IP address; on the system that kept its WAN address, this is probably the " rightid = " entry.
When editing ipsec.secrets, make sure that you change only the IP address at the beginning of the line.
When you've finished the edit, make sure that the file (ipsec.secrets) is owned by root and has 0600 permissions.
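
For anyone who would rather not do the ipsec.secrets edit by hand, here is a hedged sketch of the same steps in shell. It assumes each PSK line has the form "<peer address> : PSK <secret>" (check your own file first), and "fix_secret" is a name invented for this example.

```shell
#!/bin/sh
# Sketch: swap only the leading address of a PSK line in an
# ipsec.secrets-style file, leaving the secret itself untouched.
# Assumed line format: "<peer address> : PSK <secret>".

# usage: fix_secret FILE OLD_IP NEW_IP
fix_secret() {
    file=$1 new=$3
    # Escape the dots so they match literally rather than as "any character".
    old=$(printf '%s' "$2" | sed 's/\./\\./g')
    # mktemp creates the temporary file with mode 0600.
    tmp=$(mktemp "$file.XXXXXX") || return 1
    # The pattern is anchored at the start of the line, so an identical
    # string later in the line (e.g. inside the secret) is not changed.
    if sed "s/^$old\([[:space:]]*:\)/$new\1/" "$file" > "$tmp"; then
        mv -f "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
    chmod 0600 "$file"
}
```

Run as root, the rewritten file ends up owned by root with 0600 permissions, which matches the manual checklist above.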

Notes *2

Having tested my observations and trying a manual fix (see above), I scripted the checking and correction.
I wrote a shell script (I used bash as I'm more familiar with it), which is defined as a service and can be called from the cron facility (web console at System: Settings: Cron).
The script is run every minute.
The script checks the IP addresses for both sides (right and left) in the ipsec.conf file. It also checks the IP address in the ipsec.secrets file.
If any error is detected, it makes the necessary correction.
If, and only if, any change (correction) is made, it then runs the two ipsec commands shown in the code block above.
My IPsec configurations were simple with only a single Phase_1 and a single Phase_2 defined on each firewall. Therefore, the script was easy to implement as I didn't have to check (& parse) multiple "conn" sections in ipsec.conf and ipsec.secrets.
In my experiments and observations, the IPsec tunnel is quickly reestablished.
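
The check-and-correct logic described in these notes could look roughly like the sketch below. It is not the script actually running on UK-FW and FR-FW; the function and variable names are assumptions, it handles only ipsec.conf, and it relies on the exact "  key = value" layout (two leading spaces) of a single "conn" section.

```shell
#!/bin/sh
# Sketch of one check-and-repair pass over ipsec.conf (illustrative only).

CONF=${CONF:-/usr/local/etc/ipsec.conf}
IPSEC=${IPSEC:-/usr/local/sbin/ipsec}
changed=0

# Set KEY in $CONF to WANT if it currently holds a different value.
# usage: fix_key KEY WANT
fix_key() {
    key=$1 want=$2
    have=$(sed -n "s/^  $key = //p" "$CONF" | head -n 1)
    [ -z "$have" ] && return 0         # key not present: leave the file alone
    [ "$have" = "$want" ] && return 0  # already correct: nothing to do
    # Rewrite only the value, keeping the two leading spaces intact.
    # A copy of the previous file is left in "$CONF.bak".
    sed -i.bak "s/^  $key = .*/  $key = $want/" "$CONF"
    changed=1
}

# Run the two ipsec commands if, and only if, a correction was made.
reload_if_changed() {
    [ "$changed" -eq 1 ] || return 0
    "$IPSEC" rereadall
    "$IPSEC" reload
}

# Typical sequence, with the addresses looked up beforehand:
#   fix_key left    "$my_wan_ip"
#   fix_key leftid  "$my_wan_ip"
#   fix_key rightid "$peer_ip"
#   reload_if_changed
```

Keeping the reload behind a flag means that a once-per-minute cron run does nothing to a healthy tunnel.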

Results of scripting

I have installed my script at both ends of my IPsec tunnel - i.e. on UK-FW and FR-FW.
I can remotely reset the modem on either firewall. The script brings the tunnel back up successfully in each of the three possible scenarios:
- Modem on UK-FW is reset, so UK-FW gets a new public IP address
- Modem on FR-FW is reset, so FR-FW gets a new public IP address
- Both modems are reset at the same time.
The IPsec tunnel comes back up within 2 or 3 minutes of the modem being reset. (Remember that the modem has to negotiate a working connection together with PPPoE authentication, then the dynamic dns update has to go through, and finally the "repair" script runs only once per minute as a cron job.)
I can remotely reboot either firewall. The script brings the tunnel back up successfully in each of the three possible scenarios:
- UK-FW is rebooted, so UK-FW gets a new public IP address
- FR-FW is rebooted, so FR-FW gets a new public IP address
- Both firewalls are rebooted at the same time.
The IPsec tunnel comes back up within 3 or 4 minutes of the firewall being rebooted.

Feature Request / Bug fix
It would be preferable (particularly for users with more complex IPsec configurations) if OPNsense itself monitored the remote end of each VPN connection and kept the IPsec configuration files up-to-date (automagically). IPsec would then have a default "Up" status, rather than the default "Down" status which I observe with a stock OPNsense installation at present.

Alternative approach
A long time ago I learned that hard-coding IP addresses into configurations was best avoided if at all possible. So I experimented with /usr/local/etc/inc/plugins.inc.d/ipsec.inc (which appears to be the PHP code that generates the IPsec configuration files from the user-input data). I found that the "rightid" parameter has to be a hard-coded IP address. IPsec will not work with a FQDN here. Similarly, I couldn't get the secrets file to work with either a FQDN or the special parameter "%any". My reading of the man page suggested that the "%any" parameter should have worked here, but it's probably less desirable for security reasons.

Thanks for reading this long post.
I hope that it might help someone else solve a problem, and better still prompt a fix within either OPNsense or ipsec.

Peter

franco · May 24, 2018, 10:50:20 PM

Hi Peter,

Thanks for this. I'm unsure about a few key points that I'd like you to clarify:

* What are your GUI-bound settings for: My identifier, Peer identifier
* You use the FQDN on both sides for the remote address?
* What is your WAN setup... DHCP?
* What interface is your IPsec phase 1 bound to?

In theory IPsec should restart itself and also regenerate a proper up-to-date config on DHCP IP changes. But it sounds like it's not doing this.

FQDNs are tricky as peer IDs as they need to be resolved at the right time and cannot change unless the IPsec is forcefully restarted.

Cheers,
Franco

tofflock · May 25, 2018, 02:36:04 AM

Hi Franco

First, here's some answers for you:

Quote: What are your GUI-bound settings for: My identifier, Peer identifier

UK-FW: In "VPN: IPsec: Tunnel Settings", My Identifier is set to "Dynamic DNS" with a value of "xx1.duckdns.org"
FR-FW: In "VPN: IPsec: Tunnel Settings", My Identifier is set to "Dynamic DNS" with a value of "xx2.duckdns.org"

xx1.duckdns.org is the name associated with the public IP address of UK-FW
xx2.duckdns.org is the name associated with the public IP address of FR-FW

Peer identifier is set to "Peer IP address" on both UK-FW and FR-FW

Quote: [Do] you use the FQDN on both sides for the remote address?

I guess you mean "Remote gateway" parameter?

If so, then "yes", I use the appropriate FQDN on both sides.
So, on UK-FW the value of the remote gateway is xx2.duckdns.org
and on FR-FW the value of the remote gateway is xx1.duckdns.org

Quote: What is your WAN setup... DHCP?

In "Interfaces: [WAN]" the setting for "IPv4 Configuration Type" is set to "PPPoE"
This is the same on UK-FW and FR-FW.

Quote: What interface is your IPsec phase 1 bound to?

In "VPN: IPsec: Tunnel Settings" (Phase 1 settings), the Interface is set to "WAN"
This is the same on UK-FW and FR-FW.

Now a comment (observation?)

Quote: In theory IPsec should restart itself and also regenerate a proper up-to-date config on DHCP IP changes. But it sounds like it's not doing this.

1. I think I agree with your statement, and the config file is being updated with the "local" (left) data. I'm not sure how this is done because I haven't dug that deeply into the code.

2. However, in the config file on UK-FW, there also exists the (public) IP address for FR-FW. UK-FW is not checking the public IP address of FR-FW (which it can (only) get from DNS) and ensuring that this address is correct within " rightid = " in the config file on UK-FW.

3. No 2 above also applies to FR-FW, which is not monitoring the public address of UK-FW for its own ipsec.conf file.

4. Finally, because I am using a "Mutual PSK" for authentication, you'll also find the public IP address of FR-FW embedded in the ipsec.secrets file of UK-FW.

So, when I wrote my checking and fixing script, I checked all the numerical IP addresses in ipsec.conf associated with left, leftid and rightid parameters and the numerical IP address in ipsec.secrets. If I found any discrepancy, I fixed it (sed is wonderful :) ) and set a changed flag. At the end of the script, I used the state of this flag to indicate whether an ipsec rereadall and an ipsec reload were required.

The current version of ipsec appears to only function with numerical IP addresses. Therefore, it has to be the responsibility of OPNsense to check the validity/accuracy of the data going into ipsec.conf and ipsec.secrets. This is clearly done when the config for Phase_1 or Phase_2 is saved. I think it just needs to happen regularly and routinely in the background. I think the Voldemort (*1) installation {p.*e} used to do this, because that was one of the reasons I moved from IPCop to {p.*e} around 2009. I found that {p.*e} was better than IPCop in restoring an IPsec VPN. (Of course, it now appears that IPCop has withered on the vine, but I digress. Sorry.) What I don't have at present is an installed version of {p.*e} to look at the code. Sorry.

*1 Voldemort - "He Who Must Not Be Named".

I hope the answers and the numbered comments help explain what I think is "wrong". If you need any more information, then just ask.

Cheers, and many thanks for the hard work

Peter

mimugmail · May 25, 2018, 06:39:55 AM

I haven't read everything closely, but consider:

When you start using %any on one side, you can only set the VPN to respond only; not sure if you want to do this.
IPsec with dynamic IPs on both sides is considered to work only with limitations (not specifically on OPNsense, but in general).

There are better solutions when both sides are dynamic. One is OpenVPN; another would be ZeroTier, since you have a central instance in the cloud.

I don't want to push you away from IPsec, but there are better solutions for your problem and you might save some time :)

tofflock · May 25, 2018, 03:06:51 PM

Hi mimu

Many thanks for your reply and suggestions. It was a bit left-field, but it made me stop, think and dig a bit further. What I hadn't said earlier is that what I want to achieve is a bridge between two (private) vlans. IPsec running on the firewall/router that OPNsense provides actually gives me a simple (almost elegant?) solution and allows me to move whatever data I want, from any device, to where I need it. For example, the other day I was in the UK and printed a document to a family member in France. Saved them installing umpteen MB of printer driver to get a single A4 page :)

So I went and investigated your suggestions. Zerotier first, because I know absolutely nothing about it. Looks useful, probably very useful in some scenarios, but it gave me the strong impression that it's a per-device solution. Fantastic (if you're not put off by the cloud word) for some types of application; but I want to connect anything to everything. I don't think ZT can offer me that. Not without installing "client-side" software on almost everything. Also, they're a business and want to make money; I'm not a business (now) and I can't justify that money. Finally, one has to address the cloud word. When I was working, we banned the "just" word. Engineering is about detail and the "just" word implies it's easy and we haven't thought about the detail (it gets left to someone else). In my view, "cloud" is almost up there with "just". I hope this doesn't offend. I like to know exactly where my data is, and where it's going. I use IPsec (and email, and ...) and I hope I know the risks. But that's all (as far as I can make it) under my control. I do use a cloud, but it's my cloud (nextcloud) and it runs on my hardware in my cellar.

Anyway, then I turned to OpenVPN. I use this already as a "road warrior" and I've therefore used it as a bridge from a single device back to base. It works very well in this mode - though I find the network delay from Australia back to the UK a little disconcerting! I went looking on the OpenVPN site to see how they suggest using it as a bridge. My quick read of two examples (URLs below in case anyone else is interested) led me to believe that the solution was more complex than the IPsec solution and involved more hardware in order to do the routing. I have no objection to having more hardware, but each device needs managing and supporting, and my IT support team (me) is already at full stretch, given the day-time jobs of child-minding the g-kids, gardening, building electronics, ...

Anyway, coming back to the technicalities: trying to run a VPN over domestic-grade ADSL between two dynamic nodes is really rather wild, but it's cheap and very useful when it works. What any VPN solution is going to need is for each end to "know" the parameters (principally the address) of the other end. When one end goes down, there is always going to be a lag while it gets itself back up and back in communication with its peer.

I've just gone and looked at my IPsec logs to see how long it's been up. It's currently at 1.5 days, and I suspect the previous resync was me testing and prodding and adjusting the script. It now has 2 logs (I like logs for solving problems :) ) - a hiccup log, so I can see quickly when the last problem was, and a minimalist boring log.
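
A two-log arrangement like the one just described can be sketched in a few lines. The file locations and function names below are placeholders chosen for this example; the thread does not say where the real logs live or what they contain.

```shell
#!/bin/sh
# Sketch: a routine log written on every run and a "hiccup" log written only
# when a correction was made. Paths are hypothetical.

ROUTINE_LOG=${ROUTINE_LOG:-/var/log/ipsec_check.log}
HICCUP_LOG=${HICCUP_LOG:-/var/log/ipsec_hiccup.log}

# Append a timestamped line to FILE.
# usage: log_line FILE MESSAGE
log_line() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$2" >> "$1"
}

# Record one run: always in the routine log, and in the hiccup log only
# when CHANGED is 1.
# usage: record_run CHANGED DETAIL
record_run() {
    changed=$1 detail=$2
    log_line "$ROUTINE_LOG" "check complete, changed=$changed"
    if [ "$changed" -eq 1 ]; then
        log_line "$HICCUP_LOG" "corrected: $detail"
    fi
}
```

The last line of the hiccup log then shows at a glance when the tunnel last needed repair, while the routine log confirms that the cron job is still running.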

So many thanks for taking the time to send me your suggestions. My apologies for the long-winded answer; but I think I'll stay with IPsec for now because it's a good fit for what I need.

Best wishes,

Peter

OpenVPN for site-to-site connections
https://docs.openvpn.net/how-to-tutorialsguides/site-to-site-layer-2-bridging-using-openvpn-access-server/
https://docs.openvpn.net/connecting/site-to-site-routing-explained-in-detail/

poodad · February 04, 2019, 03:15:17 PM

I am seeing this same behavior. I have ipsec set up between an OPNSense system and a Sophos SG UTM. Both sides have dynamic IP addresses and I use DuckDNS to map names to the IP addresses. I have verified that DuckDNS is being updated by both sides when an IP address changes.

The VPN works great until the IP address of the Sophos side changes. OPNSense can "see" the new ip address (if I ping xxxx.duckdns.org from the OPNSense CLI, I get the right ip). However, the VPN stops working. Currently, I reboot OPNSense and everything starts working until the Sophos UTM's ip changes, then it breaks again.

Sophos seems to handle a change to the OPNSense side's ip without problem.
