Messages - gazd25

#1
Hi Franco,

I had thought something similar, but then ruled the RA out because the issue occurs when testing pings from the firewall itself once IPv6 stops routing.

During earlier testing, when I noticed IPv6 had stopped routing, I went onto the firewall and set up a ping from the interface diagnostics to a Cloudflare DNS IPv6 address; the error I saw was "No route to host". After that I restarted the routing service manually from the dashboard, ran the same ping from the interface diagnostics again, and it worked correctly with no loss, and all IPv6 LAN traffic started forwarding again too.

Or do you mean an RA from the ISP side? Not sure I can do much about that other than what I'm already doing to work around it :)

Is there a way I could tell in the logs if another RA is being received from the ISP?
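
One way I could presumably check for incoming RAs myself, rather than hunting through the logs, is a packet capture on the WAN from the firewall shell; something along these lines (pppoe0 is just an assumption for my WAN device name) should show ICMPv6 type 134 Router Advertisements as they arrive:

# capture IPv6 Router Advertisements (ICMPv6 type 134) arriving on the WAN
tcpdump -ni pppoe0 -vv 'icmp6 and ip6[40] == 134'

(That byte-offset filter only matches when the ICMPv6 header follows the fixed IPv6 header directly, with no extension headers, which is normally the case for RAs.)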

When you were helping me resolve a previous issue with IPv6, after I was forced to change from setting a static IPv6 address to dynamic, as I recall it became obvious that my ISP weren't even sure which servers were giving out the DHCPv6 info/prefix to me, because they hadn't set any up; after investigation it turned out to be a set of upstream Entanet/Cityfibre servers doing it.

Thanks

Gareth
#2
I also have shared forwarding enabled on my config.

I've tried switching to Quad9 as my monitor IP like yours, Opnfwb. I can't see it making much difference, but there's no harm in trying. I'm not able to reboot today for testing, so I'll resume testing soon.

Thanks for the help guys.
#3
Initial indications are that the changes I've applied, similar to yours Opnfwb, don't seem to have resolved the issue; it still occurs intermittently.

That said, I now have monitoring of it, so as a workaround I was thinking of using Monit to restart the routing service if it detects that the IPv6 gateway has been down for longer than a certain time.

Looking into that now to see if it's possible.
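
As a very rough sketch of what I have in mind (the service name, cycle counts and probe command here are just assumptions on my part, not a tested config), Monit could run a ping probe against the IPv6 monitor address and kick the routing configure script when it fails for a few cycles:

# untested sketch of a Monit check program for the IPv6 gateway
check program ipv6_gateway_probe with path "/sbin/ping6 -c 3 2606:4700:4700::1111"
    every 2 cycles
    if status != 0 for 3 cycles then exec "/usr/local/etc/rc.routing_configure"

The exec target is the same routing configure script that shows up in the routing logs, so this would just be automating the manual restart I'm doing from the dashboard.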
#4
I've added the settings as per your advice Opnfwb, but using one of Cloudflare's IPv6 addresses as my monitor: 2606:4700:4700::1111.

My IPv4 gateway was already using gateway monitoring, so there was no need to set anything there.

I can already see that the upstream IPv6 gateway being monitored changes dynamically in the dashboard at each reboot, similarly to the way IPv4 always has; I'm guessing it's based on what is handed out dynamically during the PD.
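
Incidentally, a quick way to double-check which monitor address dpinger has actually picked up after a reboot (just a shell-level check) is to look at its running processes, since the monitor IP appears on the dpinger command line:

# show the running dpinger instances and the monitor addresses they were started with
ps ax | grep '[d]pinger'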

I'll monitor and let you know the outcome.

Many thanks again for the help.
#5
Thanks Opnfwb, I'll give this a try when I get a minute free this weekend.
#6
Just done a little further testing. It's important to note that when IPv6 routing fails, I have also tried to ping from an IPv6 interface on the OPNsense firewall to an internet IPv6 address, and the error I see is "No route to host".

Restarting the routing service then allows the same ping, set up in the diagnostics from the firewall host to the internet address, to succeed with no losses. So while I thought it was only routing from the LAN side to the internet that was failing, it appears the firewall host itself also cannot reach the IPv6 internet while in this state.
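
For anyone else chasing the same thing, the obvious shell-level check in that state (standard FreeBSD netstat, nothing OPNsense-specific) would be whether the IPv6 default route is still present at all:

# list the IPv6 routing table and check for a default route
netstat -rn -f inet6 | grep -i default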

Thanks

Gareth
#7
I forgot to add, I'm currently running the latest stable release 24.1.6, though this issue has persisted for a long time throughout a fair number of updates.

Thanks

Gareth
#8
Hi All,

Firstly, I'd like to thank everybody for their sterling work on OPNsense, people like me would be much worse off without it, so thank you very much to all contributors.

I've been refining my OPNsense config for some considerable time, and while it's now relatively complex, I have reached a very positive place with pretty much everything I want working correctly.

I have been having a minor problem for some time; I think it actually started back around the 23.1 release, which was around the time I first deployed IPv6 on my network. It's more of a niggle than a serious issue, but the fact that I can replicate the fault fairly consistently does suggest a potential timing issue in the code at boot might be responsible.

My system takes the IPv6 /56 delegated by my ISP and uses track interface on my LAN network to publish a /64 internally. DHCPv6 sends me my prefix from the ISP; they don't publish me an IP, so an auto-assigned one is set, but this is relatively normal and all traffic routes and works as expected.

The problem is that after boot-up the system will be working and routing IPv6 correctly, then an unknown number of minutes later, for some reason, it will stop routing. When this happens, I go to the dashboard, restart the routing service manually, and it starts routing IPv6 again; until I reboot next time, it'll continue working as expected.

I would say the above occurs on maybe 9 out of 10 boots, and occasionally, for a reason I also can't define, it simply continues to work as expected.

I'm hoping one of the experts here can help me get to the bottom of the root cause and a fix. I'm happy to collect logs and test as needed, since I run OPNsense in a VM with easy snapshot and rollback capability.

Many thanks

Gareth

#9
23.1 Legacy Series / Re: ACME LetsEncrypt + Cloudflare
August 18, 2023, 02:04:24 PM
Hi Skydiver,

It's been a long time since I set this up myself, but I'll try and offer what help I can.

What I can tell you based on your picture is that my config looks a little different: under the Global API Key section mine is empty, and I've only got config under the "Restricted API Token" section. I've attached a picture to show this.

I looked in my Cloudflare setup page and it looks as if the "CF Account ID" field is populated with the number that appears down the right-hand side of the specific DNS domain's dashboard page on Cloudflare.

I've also created a single restricted API token under the API section, which is in "Profile" on Cloudflare; it looks like the attached pictures.

Essentially, my token has Zone Read and Zone DNS Edit rights.
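
If it's useful, a restricted token like that can be sanity-checked outside OPNsense with Cloudflare's token verify endpoint (substitute your own token value; this is just the generic API call, nothing specific to the ACME plugin):

# verify that the Cloudflare API token is valid and active
curl -s -H "Authorization: Bearer YOUR_API_TOKEN" \
     https://api.cloudflare.com/client/v4/user/tokens/verify

A healthy response comes back with "success": true and shows the token status.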

This has worked pretty flawlessly for me, other than the one problem I had, which turned out to be because the IP address I was accessing from had changed from IPv4 to IPv6, so the API was refusing access because I'd used a client IP address filter to secure it. I wouldn't recommend configuring that filter unless you are accessing from a fixed IP as I am, so just leave it open.

Hope this is of some help to you.

Thanks

Gareth



#10
23.1 Legacy Series / Re: ACME Cert Renewals
June 12, 2023, 08:02:38 PM
Well,

As is always the best way, I solved my own problem.

There were some changes a little while back, related to my IPv6 configuration, as a result of a change to the PPPoE initiation in the 23.1.7 version of OPNsense. For more detail on that, see here:

https://forum.opnsense.org/index.php?topic=33864.45

In any case, after trying pretty much everything else I could think of, I began investigating the Cloudflare API as a possible culprit for the failure to renew on the ACME client.

It turned out I had locked down the API calls to a specific token allowing only my old static IPv4 and IPv6 addresses to make the request; the IPv6 address has of course now changed because of moving to PPPoEv6 on my WAN interface.

It seems my firewall was using IPv6 to contact the Cloudflare API and then not falling back to IPv4 when the request failed because the API controls disallowed it, which also led to a fairly nonsensical error being logged on OPNsense that bore no resemblance to what was actually going on.

In any case, I did a little testing to ensure I knew which of my firewall's IPv6 addresses the Cloudflare API was receiving the request from, altered the API token settings on Cloudflare to allow this IPv6 address to initiate requests to the API and, hey presto, my certificates are now both renewing correctly again.
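
I won't pretend this is exactly how I tested it, but a simple way to see which source address a remote service receives from the firewall is to force an IPv6-only request to one of the "what is my IP" services from the shell (icanhazip.com is just one example of such a service):

# ask an external service which public IPv6 address our requests come from
curl -6 -s https://icanhazip.com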

Hope my little journey helps somebody else in the future :)

Thanks

Gareth

#11
23.1 Legacy Series / ACME Cert Renewals
June 12, 2023, 03:49:19 PM
Hi Guys,

On my up to date OPNsense 23.1.9-amd64 firewall, I've noticed that my ACME certificate renewals are both now showing as failed validation in the logs as below:

2023-06-12T14:32:53   acme.sh   [Mon Jun 12 14:32:53 BST 2023] Error add txt for domain:_acme-challenge.contoso.com
2023-06-12T14:32:53   acme.sh   [Mon Jun 12 14:32:53 BST 2023] invalid domain

I can't see much history in the logs, but it seems to have shown the same error for the last few renewal attempts, which happen automatically at midnight.

The ACME renewal process uses the Cloudflare DNS validation method and no config changes have been made at all. Until recently this has always worked very well for me without issues.

I did run an update this morning and noticed a new ACME script was brought down, so I wondered if there have been any changes which might have had an impact?

I also tried to force a renewal and noted that the expected extra TXT record never appears in Cloudflare DNS, so it does appear something has changed, but it's difficult to say for sure.
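
One way to separate a token/zone problem from a plugin problem would be to try creating a TXT record through the Cloudflare API by hand (zone ID and token are placeholders here, and the test record can be deleted again afterwards):

# attempt to create the ACME challenge TXT record directly via the Cloudflare API
curl -s -X POST "https://api.cloudflare.com/client/v4/zones/YOUR_ZONE_ID/dns_records" \
     -H "Authorization: Bearer YOUR_API_TOKEN" \
     -H "Content-Type: application/json" \
     --data '{"type":"TXT","name":"_acme-challenge.contoso.com","content":"manual-test"}'

If that call fails too, the issue is on the Cloudflare/token side rather than in the new ACME script.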

I've got snapshot and rollback capability, so I'm going to try a few different things in testing, but I thought it was worth raising first to see if it's just me.

Thanks for any help.

Gareth
#12
Hi All,

I realise I've gone a bit quiet since Franco has been helping me get to the bottom of the problem I was experiencing, so I wanted to provide an update.

After a couple more patches and troubleshooting steps supplied by Franco, we were able to identify that the problem was a result of slight misinformation from my ISP combined with a change to the code in 23.1.7_3, which means a PPPoEv6 request is no longer sent by OPNsense when assigning a static IPv6 address to the WAN interface. Previously it was sent, but really, why would it be needed with a static address? And it was entirely invisible to me until this problem occurred.

Well, as it turns out, if my ISP's equipment didn't receive that request at the time of the PPPoE initiation, they were disabling IPv6, which was why I lost outbound IPv6 routing.

I have been able to resolve it by changing my WAN interface IPv6 setting to PPPoEv6 and deleting the old, now unneeded gateway.

I did a full rollback to 23.1, applied no patches and performed the above steps, then updated to 23.1.7_3, and this resolved my issue fully.

I hope this gives some help to others experiencing the same or similar issues.

I also wanted to take the opportunity to thank Franco for his amazing help, it's no wonder he is a hero member!!!

And further thanks to everybody involved with OPNsense, I personally am hugely grateful for the great support, guidance and great product you provide.

Thanks

Gareth
#13
Hi Franco,

Just rebooted and taken a copy of the routing table from System > Routes > Status, which is hopefully what you are after, and emailed it over to you.

Thanks

Gareth
#14
Hi Franco,

So I've just applied the 48855143b patch on top of 766f1f0c5a3.

Same as before, no change to the IPv6 gateway which remains offline.
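
For anyone following along at home, these development patches are applied from the firewall shell with the opnsense-patch utility using the commit hashes Franco posted, roughly like this:

# apply the two test patches from the opnsense/core repository
opnsense-patch 766f1f0c5a3
opnsense-patch 48855143b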

Thanks

Gareth
#15
Hi Franco,

So just applied the patch and I'm afraid it doesn't seem to alter the behaviour at all.

I applied the 766f1f0c5a3 patch to a newly updated 23.1.7_3 and it didn't seem to have any impact; I rebooted a couple of times to rule out any timing issue.

Next, to make doubly sure, I reverted to my working 23.1 using a VMware snapshot, installed the 23.1.7_3 update, added the 766f1f0c5a3 patch again and rebooted a few more times to check whether the issue was timing-related, but the behaviour remained consistent.

Whilst doing this, I did remember something that happens rarely but occasionally, and all this rebooting has made the issue more obvious. Sometimes, maybe once out of every 5 boots, I will have to click the dpinger play button manually to start the IPv6 gateway. It had almost slipped my mind since it's rare I need to do it, but it does occasionally happen when running an update or rebooting, and when I manually clicked the dpinger button the gateway had always previously started correctly.

I have a feeling that behaviour has been there for a while, but it happened so rarely that it was simpler just to click the button and not worry.

I wonder if some of the new code you have introduced has crystallised that issue in a way it hadn't on the previous version?

In any case, I also tried manually clicking the dpinger button after the 23.1.7_3 update and the patch, and I'm afraid it doesn't start the gateway, so no IPv6 traffic.

I did also run the "opnsense-log | grep refusing" command against the updated and patched version and it shows the same error:

<11>1 2023-05-16T10:08:45+01:00 OPNSense.local opnsense 78232 - [meta sequenceId="8"] /usr/local/etc/rc.routing_configure: ROUTING: refusing to set inet6 gateway on addressless wan
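
Given that "addressless wan" message, one thing that could also be captured straight after boot (pppoe0 is just an assumption for the WAN device name) is whether the WAN interface actually has a global IPv6 address at that point:

# show only the IPv6 addresses currently assigned to the WAN interface
ifconfig pppoe0 inet6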

Let me know what else I can do to help.

Thanks

Gareth