OPNsense Forum

Archive => 18.7 Legacy Series => Topic started by: vince on August 23, 2018, 11:54:08 am

Title: Lets Encrypt - various errors
Post by: vince on August 23, 2018, 11:54:08 am
Hello :) We recently rearranged part of our network architecture and went from a single OPNsense box behind a modem using PPPoE passthrough to an HA setup behind a router. The router has port forwarding enabled, since its firewall cannot be disabled.

With the old setup, creating certificates worked just fine: 1 domain and a few SANs. Now it always fails. Tested with 18.7.1, 18.7 and 18.1.10, with acme.sh 2.7.9 and 2.7.8 (the old setup was running 17.7.12 with acme.sh 1.13).

I've uploaded a redacted log of our ha-primary, running 18.1.10 with acme.sh 2.7.8, to https://file.io/muHdvl - if someone would be so kind as to have a look... Our ha-secondary is already on 18.7.1 with acme.sh 2.7.9 - I can upload a log from that system as well.
It does show some errors, but I don't know where I might have gone wrong. I even temporarily allowed all traffic to the https port, which, to me, rules out the firewall as the source of this problem.
I also have checked the A and CNAME records, they are correct and there is no AAAA record.
Title: Re: Lets Encrypt - various errors
Post by: vince on August 24, 2018, 08:59:44 am
I notice there are quite a few views, yet no reply. Could I have done something better in describing my problem, or ... ?
Title: Re: Lets Encrypt - various errors
Post by: vince on August 24, 2018, 09:31:07 am
Explicit errors that can be found in the linked log file (see first post):

Code: [Select]
[Thu Aug 23 11:16:38 CEST 2018] original='{
  "type": "http-01",
  "status": "invalid",
  "error": {
    "type": "urn:acme:error:connection",
    "detail": "Fetching http://sub.example.com/.well-known/acme-challenge/FH6K-FkTi402Yxnz4GgGH2QmgQ04ZZ7KGlbWbJ3_vIg: Timeout during connect (likely firewall problem)",
    "status": 400
  },
  "uri": "https://acme-staging.api.letsencrypt.org/acme/challenge/KIcdLYd-AGixisDwtryje-eCEjmXPl59j1A2Wj14Nho/162774506",
  "token": "FH6K-FkTi402Yxnz4GgGH2QmgQ04ZZ7KGlbWbJ3_vIg",
  "keyAuthorization": "FH6K-FkTi402Yxnz4GgGH2QmgQ04ZZ7KGlbWbJ3_vIg.Dw8O-XYchKlLNiCK7AvuJE-v2gfYOVv9uF1tfsKz2to",
  "validationRecord": [
    {
      "url": "http://sub.example.com/.well-known/acme-challenge/FH6K-FkTi402Yxnz4GgGH2QmgQ04ZZ7KGlbWbJ3_vIg",
      "hostname": "sub.example.com",
      "port": "80",
      "addressesResolved": [
        "X.X.X.X"
      ],
      "addressUsed": "X.X.X.X"
    }
  ]
}'
I really do not see how the firewall could be an issue here, but maybe someone here knows more about that.

Code: [Select]
[Thu Aug 23 11:16:39 CEST 2018] original='{
  "type": "urn:acme:error:malformed",
  "detail": "Unable to update challenge :: The challenge is not pending.",
  "status": 400
}'

Code: [Select]
[Thu Aug 23 11:16:46 CEST 2018] Diagnosis versions:
openssl:openssl
OpenSSL 1.0.2k-freebsd  26 Jan 2017
apache:
apache doesn't exists.
nginx:
nginx doesn't exists.
socat:
[...]

I did see a 403 in another earlier log too, sadly I seem to have deleted that one already.
edit: got it recreated with 18.7
Code: [Select]
[Fri Aug 24 10:15:03 CEST 2018] original='{
  "type": "http-01",
  "status": "invalid",
  "error": {
    "type": "urn:acme:error:unauthorized",
    "detail": "Invalid response from http://sub.example.com/.well-known/acme-challenge/<token>: \"\u003c!doctype html\u003e\n\u003c!--[if IE 8 ]\u003e\u003chtml lang=\"en\" class=\"ie ie8 lte9 lte8 no-js\"\u003e\u003c![endif]--\u003e\n\u003c!--[if IE 9 ]\u003e\u003chtml lang=\"en\" class=\"",
    "status": 403
  },
  "uri": "https://acme-staging.api.letsencrypt.org/acme/challenge/<challenge>/163133057",
  "token": "<token>",
  "keyAuthorization": "<token>.<key>",
  "validationRecord": [
    {
      "url": "http://sub.example.com/.well-known/acme-challenge/<token>",
      "hostname": "sub.example.com",
      "port": "80",
      "addressesResolved": [
        "X.X.X.X"
      ],
      "addressUsed": "X.X.X.X"
    },
    {
      "url": "https://sub.example.com/.well-known/acme-challenge/<token>",
      "hostname": "sub.example.com",
      "port": "443",
      "addressesResolved": [
        "X.X.X.X"
      ],
      "addressUsed": "X.X.X.X"
    },
    {
      "url": "https://sub.example.com/?url=/.well-known/acme-challenge/<token>",
      "hostname": "sub.example.com",
      "port": "443",
      "addressesResolved": [
        "X.X.X.X"
      ],
      "addressUsed": "X.X.X.X"
    }
  ]
}'
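The validationRecord in the 403 response lists every URL the validation server visited, so the redirect chain can be read straight off it. A minimal Python sketch, assuming the JSON after original= has been copied into a string; the sample is a trimmed copy of the record above with the placeholders kept, and redirect_chain is a name made up for illustration:

```python
import json

# Trimmed copy of the challenge object from the log above (placeholders kept as-is).
CHALLENGE = """
{
  "type": "http-01",
  "status": "invalid",
  "validationRecord": [
    {"url": "http://sub.example.com/.well-known/acme-challenge/<token>", "port": "80"},
    {"url": "https://sub.example.com/.well-known/acme-challenge/<token>", "port": "443"},
    {"url": "https://sub.example.com/?url=/.well-known/acme-challenge/<token>", "port": "443"}
  ]
}
"""


def redirect_chain(challenge_json):
    """Return the (port, url) hops the validation server followed, in order."""
    challenge = json.loads(challenge_json)
    return [(rec["port"], rec["url"]) for rec in challenge.get("validationRecord", [])]


for hop, (port, url) in enumerate(redirect_chain(CHALLENGE), start=1):
    print(f"hop {hop}: port {port} -> {url}")
```

More than one hop means something answered on port 80 with a redirect instead of the challenge file. Here the request ends on an HTTPS URL with ?url=, which looks more like a login or error page than the web server of the acme plugin.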
Title: Re: Lets Encrypt - various errors
Post by: Droppie391 on August 24, 2018, 11:16:34 am
Although I'm not that familiar with how Let's Encrypt works, I do know that it needs access to the host the agent runs on for validation. I'm assuming that the agent on the OPNsense box generates a hash with the local address of the box and NOT with the public address of the router. This must then fail, as Let's Encrypt tries to communicate with your router and not with the OPNsense box (even with port forwarding switched on).
Title: Re: Lets Encrypt - various errors
Post by: vince on August 24, 2018, 11:56:43 am
Interesting idea. However, we have a host (OpenBSD, acme.sh as well) at another location which is behind a firewall as well, so I guess we can rule out that the system acme.sh runs on needs to have a public IP. It just needs to know what its public IP is and needs to be reachable for verification.
Title: Re: Lets Encrypt - various errors
Post by: guest15389 on August 24, 2018, 02:48:57 pm
I don't use Let's Encrypt on my router, but normally the problem comes down to something not starting up or a port being in use.

It looks like it is doing an HTTP check for validation, and the IP address of the box doesn't matter. I use a Debian box with only a private IP on it, and I have port 80 forwarded when I want to do a check.
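
To illustrate why the IP address does not matter: per the ACME spec (RFC 8555, section 8.1), the key authorization served for http-01 is the token plus the base64url SHA-256 thumbprint (RFC 7638) of the account key. No address goes into it. A rough Python sketch; the JWK values are made-up placeholders, not a real key, and the function names are chosen only for illustration:

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, as used by ACME/JOSE."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    """RFC 7638 thumbprint of an RSA JWK: SHA-256 over the required members,
    sorted by name and serialized without whitespace."""
    required = {k: jwk[k] for k in ("e", "kty", "n")}
    canonical = json.dumps(required, sort_keys=True, separators=(",", ":"))
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def key_authorization(token: str, jwk: dict) -> str:
    """token || '.' || thumbprint(account key); no IP address is involved."""
    return f"{token}.{jwk_thumbprint(jwk)}"


# Placeholder account key (NOT a real key) and the token from the log above.
account_jwk = {"kty": "RSA", "e": "AQAB", "n": "placeholder-modulus"}
token = "FH6K-FkTi402Yxnz4GgGH2QmgQ04ZZ7KGlbWbJ3_vIg"
print(key_authorization(token, account_jwk))
```

So the box only has to serve that string at the challenge URL; whether it sits behind NAT or a port forward does not change the value.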

Now I use cloudflare and it just does DNS validation.

I'm not sure where it is on OPN, but can you check the .conf file for letsencrypt as it should show the method used for validation and that might point you to something.

# Options used in the renewal process
[renewalparams]
authenticator = dns-cloudflare

Title: Re: Lets Encrypt - various errors
Post by: vince on August 27, 2018, 11:42:36 am
Soo, I just discovered that I was always running into the DNS rebinding check; that 501 error page was all Let's Encrypt could see, and thus it could not verify. I've since disabled the GUI on the WAN port.

Also - I found a file that is apparently responsible for the redirects (acme_anchor_rules); it redirects port 80 to 40k-something. When I use the packet capture I always see TLS requests from Let's Encrypt. The acme.sh.log does list port 80 though - which times out.

Is there a setting that always redirects 80 to 443, or something like that?

Could this possibly be a NAT-issue? NAT is
Code: [Select]
nat on WAN_IF inet from $LOCAL to !LOCAL -> WAN_CARP_IP port 1024:65535
Title: Re: Lets Encrypt - various errors
Post by: vince on August 28, 2018, 10:46:30 am
So, further diving into this and still no solution :/

1. Router has a port-forward 80&443 to opnsense
2. opnsense allows access from external to opnsense:80&443 (GUI is OFF for the WAN_IF)
3. opnsense has a port-forward 80&443 to localhost:43580
4. on localhost:43580 is the lighttpd run by the acme-plugin (which is always running, not just when needed, which I find a little weird)

acme.sh still shows "Timeout during connect", "status: 400" BUT when I access that manually I can download the challenge
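
One way to compare a manual fetch with what the validation server sees is to request the challenge URL over plain HTTP on port 80 without following redirects, from a machine outside the network. A small Python sketch; probe and the URL in the comment are assumptions for illustration, not part of acme.sh or the plugin:

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following 3xx responses so the first answer stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def probe(url, timeout=10):
    """Return (status, Location header or None) for a single GET, no redirects.

    Connection failures and timeouts are not caught here; they surface as OSError.
    """
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        # 3xx/4xx/5xx end up here because redirects are not followed.
        return err.code, err.headers.get("Location")


# Example call (hypothetical URL; substitute the real host and token from acme.sh.log):
# print(probe("http://sub.example.com/.well-known/acme-challenge/<token>"))
```

A 200 with no Location from outside would match the manual download. A 3xx pointing to https:// or a timeout would match what the log shows.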

Does anyone have ideas or pointers as to what could be the issue here?