OPNsense Forum

Topic started by: DenverTech on April 05, 2023, 10:52:29 pm

Title: [Solved] LetsEncrypt issues after v23.1 upgrades? (likely just mine)
Post by: DenverTech on April 05, 2023, 10:52:29 pm
I'm currently on 23.1.5_4. It appears that sometime after I upgraded to 23.1, the ACME updates stopped working. I've done several updates since then with no change.

TL;DR of what I'm seeing:
- Certificate renewals are happening as per normal.
- All renewals are successful (log pics attached, showing renewals on 4/4/2023)
- However, the firewall still serves the old, now-expired certificate for all requests (pics attached, showing expiration on 4/1/2023)
- I have tested this on multiple browsers and machines and all get the same expired certificate reply. Tested in Chrome, Firefox, and Opera, on Windows 11 and on Ubuntu.
- Confirmed that the system IS using the LetsEncrypt certificate as its default. The system only has one certificate and it's this one that's having the issues. Pic attached of that, too.

Any ideas? I've already exported the config and imported it on a new installation with the same results. It seems almost as though the ACME Client plugin broke during the upgrade to 23.1, though no one else is reporting it, so I'm guessing it's just mine. The fact that it's pulling new certs successfully but then not using them is definitely what has me confused. Renewals had been working fine since v19, and this is the first time they've stopped working since then.
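For anyone wanting to confirm from a client which certificate the firewall actually presents, here is a minimal Python sketch. It fetches the served certificate without validating it (so an expired one can still be inspected) and computes its SHA-256 fingerprint, which can then be compared with the fingerprint of the renewed certificate. The function names and the hostname in the comment are my own placeholders, not anything from OPNsense.

```python
import hashlib
import ssl


def pem_fingerprint(pem_text):
    """Return the SHA-256 fingerprint (hex) of a PEM-encoded certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    return hashlib.sha256(der).hexdigest()


def served_fingerprint(host, port=443):
    """Fingerprint of the certificate that host:port presents.

    ssl.get_server_certificate() does not validate the certificate when no
    CA file is given, so an expired certificate is still returned.
    """
    pem = ssl.get_server_certificate((host, port))
    return pem_fingerprint(pem)


# Hypothetical usage (hostname and port are placeholders):
#   print(served_fingerprint("firewall.example.lan", 443))
```

If the printed value differs from the fingerprint of the freshly renewed certificate, the web server is still handing out an older one, which matches the symptom described above.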
Title: Re: LetsEncrypt issues after v23.1 upgrades? (likely just mine)
Post by: Koloa on April 06, 2023, 07:01:33 am
This won't help you in any way, but I just signed on to the forum to report that ACME with the Gandi plugin hasn't been working since the 23.1 upgrades. It worked fine prior to 23.1.x, but since then, as my 90-day certificates have expired, I've been unable to get certificates renewed or issued using DNS-01 via Gandi.

I was able to get it working for HTTP challenges, but that's not what I need.

Even with extremely verbose logging turned on, and doing it from the command line, it wasn't clear why it was failing, particularly given that the configuration on my end had not changed.  I also tried re-issuing my Gandi API key, but, that had no effect. 

Like I said, doesn't directly help you, but, you're not alone...
Title: Re: LetsEncrypt issues after v23.1 upgrades? (likely just mine)
Post by: DenverTech on April 06, 2023, 07:56:50 am
Interesting. Very different, but interesting anyway. With mine, it's successful...but doesn't seem to use the updated cert. Yours sounds more like an authentication issue. I used to use Gandi and it was always a bit iffy on certain things so it's not surprising, but odd that it started with 23.1. Maybe related in some strange way.
Title: Re: LetsEncrypt issues after v23.1 upgrades? (likely just mine)
Post by: abulafia on April 06, 2023, 09:26:37 am
No help here, but I also had some issues a few weeks back where renewals would no longer work.

I *think* it was failing the DNS challenge...?

I ended up deleting and recreating my acme/letsencrypt config.

That was end of February.
Title: Re: LetsEncrypt issues after v23.1 upgrades? (likely just mine)
Post by: Taomyn on April 06, 2023, 11:25:15 am
This is the exact opposite of my experience, as I've just discovered after upgrading to v23 a couple of weeks ago.

I did a fresh install of v23 and imported my old v22 configuration; everything went very smoothly. Today I discovered that none of my LE certificates were getting renewed automatically each night; every attempt failed. I'm using HTTP challenges, not DNS. If I force them manually they work immediately. I've only done a few, but as I don't want them all renewing on the same day every few months, I'm going to stagger the manual renewals across different days.

My only question is whether this will fix it for the next automated renewal months down the line. If this is expected behaviour after importing a config into a fresh install, then at the very least the firewall should warn us.
Title: Re: LetsEncrypt issues after v23.1 upgrades? (likely just mine)
Post by: tofflock on April 06, 2023, 07:48:54 pm
Hi DenverTech

I had a similar situation in the middle of March 2023, when I was running V23.1_6. acme.sh had run successfully because a certificate update was due. I didn't bother looking at the certificate details until I noticed that my browsers (I tried different browsers on different machines too) were all reporting a certificate problem. I looked at the certificate and, sure enough, it had expired. I spent a few hours digging to understand how certificates are stored and referenced in OPNsense. What follows is a summary of how certificates are stored, what caused the problem in my system, and how I fixed it.

All certificates are stored in the config file ( /conf/config.xml ) in a structure that looks like the box below. "<cert>" is at level 2 (with "<opnsense>" at level 1 (top)). There is a separate "<cert>" section for each certificate. The "<refid>" item is unique for every certificate, and is used to select a required certificate.

Code:
<?xml version="1.0"?>
<opnsense>
  <version>11.2</version>
  .
  <cert>
    <refid>a1b2c3d4e5f6</refid>
    <descr>Text Description</descr>
    <crt>[Very long continuous string of the public key of this certificate]</crt>
    <prv>[Very long continuous string of the private key of this certificate]</prv>
  </cert>

Now in another level 2 section denoted by "<system>", there exists a level 3 section denoted by "<webgui>".  See the next box for its structure:

Code:
<?xml version="1.0"?>
<opnsense>
  <version>11.2</version>
  .
  <cert>
    .
  </cert>
  .
  <system>
    .
    <webgui>
      <protocol>https</protocol>
      <ssl-certref>6045008dd0e08</ssl-certref>
      <port>8443</port>
      <ssl-ciphers/>
      <interfaces/>
      <compression>5</compression>
      .
      .
    </webgui>
    .
  </system>



The item "<ssl-certref>" contains the identifier (the "<refid>" value) of the certificate that the web server is to use.

On my system, which was still serving the out-of-date certificate, the identifier (pointer) contained in the "<ssl-certref>" parameter was actually the id of the old certificate, not of the new certificate that had been acquired by acme.sh. That explained why the certificate being served by the web server was out of date.
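To check for this pointer mismatch without reading the XML by eye, something along these lines should work. It is only a sketch using Python's standard library: the function names and the sample refids and descriptions are made up for illustration, and it assumes the layout described above (<cert> blocks directly under <opnsense>, the pointer at system/webgui/ssl-certref).

```python
import xml.etree.ElementTree as ET

# Made-up sample mirroring the layout described above; refids are placeholders.
SAMPLE = """<opnsense>
  <cert>
    <refid>1111111111111</refid>
    <descr>Old ACME cert</descr>
  </cert>
  <cert>
    <refid>2222222222222</refid>
    <descr>New ACME cert</descr>
  </cert>
  <system>
    <webgui>
      <protocol>https</protocol>
      <ssl-certref>1111111111111</ssl-certref>
    </webgui>
  </system>
</opnsense>"""


def list_certs(config_xml):
    """Return [(refid, descr), ...] for every top-level <cert> block."""
    root = ET.fromstring(config_xml)
    return [(c.findtext("refid"), c.findtext("descr")) for c in root.findall("cert")]


def webgui_cert_status(config_xml):
    """Return (certref, descr) for the certificate the web GUI points at.

    descr is None when no <cert> block has a matching <refid>.
    """
    root = ET.fromstring(config_xml)
    certref = root.findtext("system/webgui/ssl-certref")
    for refid, descr in list_certs(config_xml):
        if refid == certref:
            return certref, descr
    return certref, None


print(webgui_cert_status(SAMPLE))
```

Against a real firewall you would pass the file contents instead of SAMPLE, for example the bytes read from a copy of /conf/config.xml, and compare the reported description with the certificate you expect to be active.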

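Fixing the problem
  • I made a backup copy of the config.xml file
  • I located the new acme.sh acquired certificate in its <cert>..</cert> block and noted its refid
  • I edited the config file and updated the certificate reference in the "<ssl-certref>" section with the refid noted in the previous step
  • Rebooted the system
That fixed the problem for me.  What I didn't do was locate the code that updates the <webgui> section after acme.sh has run and try to come up with a hypothesis as to why it wasn't updated correctly when acme.sh ran successfully.
I'll keep an eye on it the next time acme.sh runs to see if it happens again.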
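For anyone repeating the manual fix (back up config.xml, then point "<ssl-certref>" at the refid of the new certificate), the edit could be scripted roughly as below. This is an untested-on-a-firewall sketch, not how OPNsense itself does it: the function name and the ".bak" suffix are my own choices, and because ElementTree rewrites the whole file the formatting may change, so try it on a copy first.

```python
import shutil
import xml.etree.ElementTree as ET


def repoint_webgui_cert(config_path, new_refid):
    """Back up config_path, then set system/webgui/ssl-certref to new_refid.

    Returns the path of the backup copy. Raises ValueError if new_refid does
    not match any <cert><refid>, so a typo cannot leave the GUI pointing at
    a certificate that does not exist.
    """
    tree = ET.parse(config_path)
    root = tree.getroot()

    known_refids = {c.findtext("refid") for c in root.findall("cert")}
    if new_refid not in known_refids:
        raise ValueError("no <cert> block with refid %r" % new_refid)

    certref = root.find("system/webgui/ssl-certref")
    if certref is None:
        raise ValueError("no system/webgui/ssl-certref element found")

    # Keep the untouched original next to the edited file.
    backup_path = config_path + ".bak"
    shutil.copy2(config_path, backup_path)

    certref.text = new_refid
    tree.write(config_path, encoding="utf-8", xml_declaration=True)
    return backup_path
```

As in the manual procedure, a reboot would still be needed afterwards for the web server to pick up the change.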

An Aside

Having fixed the certificate pointer, I went and looked at the certificates from the GUI (System -> Trust -> Certificates ).
I then noticed that the entry for the old (out-of-date) certificate for (ACME Client) now had a little waste bin icon at the end of the line, indicating that it could be deleted.  When I had started my investigation I had noted that there was no waste bin icon for the old certificate.

HTH with your problem

PeterF



Title: Re: LetsEncrypt issues after v23.1 upgrades? (likely just mine)
Post by: DenverTech on April 06, 2023, 09:01:54 pm
@tofflock

Thanks! That sounds VERY similar. Mine only shows one LE cert in System > Trust > Certs, with no delete button, so it hasn't flagged it as having an expired AND a non-expired, so it may be a bit different. However, I'll dig into the pointer in the config later today and let you know if it's the same sort of issue internally. Sounds almost as though the target cert just isn't getting updated in configs randomly. Really weird if that's the case.

Will update later with any new info I find.
Title: Re: LetsEncrypt issues after v23.1 upgrades? (likely just mine)
Post by: DenverTech on April 06, 2023, 09:12:32 pm
I jumped in and checked this, but sadly all refs are a match. Looks like this is a whole new issue. :(



Title: Re: LetsEncrypt issues after v23.1 upgrades? (likely just mine)
Post by: DenverTech on April 06, 2023, 09:18:51 pm
Ok...got a fix, but no idea why/how it broke in the first place.

Things I tried that didn't work:
- Check ref of the cert in config files (it was correct)
- Re-issuing the cert
- Reinstalling ACME plugin
- Import the config onto a new firewall as-is

Thing that did work (this is stupid easy and I should have done it first...not sure why an import to a new firewall didn't work):
- Switch active cert (system > trust > certs) back to the self-issued one
- Reboot (this part was required or the rest didn't work)
- Switch active cert back to LetsEncrypt
- Now, suddenly, it's serving the real cert

Thanks everyone! Gotta love the weird issues.
Title: Re: LetsEncrypt issues after v23.1 upgrades? (likely just mine)
Post by: tofflock on April 07, 2023, 06:02:17 pm
Glad it's sorted for you.  I'm going to watch my certificate the next time LE does an update and see what happens.

I think the protocol now is for you to insert a [SOLVED] at the beginning of your post title, if you're happy that it is.

Good luck!

PeterF