Hi,
after upgrading to 24.7.4 my ZeroTier connections are dead.
After moving back to 24.7.3 they are working again.
What is wrong?
Greetings Mario
It stopped working here too after the update.
I migrated to OpenVPN in a hurry, but I would like to keep using ZeroTier.
Same here. Reverted back to 24.7.3-1 for now.
How do I roll back? And how can I do it remotely?
This seems to be a common theme, but it's a wild situation. Which change, which component update in this release affects ZeroTier operation?
Cheers,
Franco
There have been issues in the past, mostly because of routing problems within OPNsense, I guess. In all those cases the device was properly connected to ZT and was able to see all neighbours, but there was no communication with OPNsense. A zt leave and zt join fixed it (or removing the checkmark for the network in the GUI).
This time it's different. Everything works with 24.7.3_1 (and with many prior releases, too; the issue above was last seen sometime earlier this year). After installing 24.7.4, ZT no longer works.
A zt leave and zt join don't solve it anymore, no comms.
The firewall rules to allow data flow from the ZT network to other networks or the firewall itself don't show any states in inspect mode.
Reverting to 24.7.3_1 restores ZT connectivity immediately.
Question is: how to hunt down "which change, which component update of this affects ZeroTier operation"? Any directions on where to start?
=======
Initially I thought it was in the kernel - reverted the kernel to 24.7.3 - no dice.
=======
Then I reverted the OPNsense package - restarted ZT - success.
- kernel and OPNsense on 24.7.3 - the rest on 24.7.4
=======
Reinstalled 24.7.4 kernel - so fully on 24.7.4 except OPNsense which is on 24.7.3 - success again.
=======
With the kernel out of the way - which was the only sane option :) - best guess now is this is caused by PHP (?)
> Reverting to 24.7.3_1 restores ZT connectivity immediately.
Can someone tell me what this means? A full revert? That still leaves the question of whether this is a kernel or a core issue...
Cheers,
Franco
Ok so it's core but that still leaves a number of guesses:
https://github.com/opnsense/changelog/blob/ffb1c305508e360a4bcaa131e562d56d393ef2b4/community/24.7/24.7.4#L27-L53
It would probably be best if someone with the issue could do a bisect on stable/24.7 between tags 24.7.3 (good) and 24.7.4 (bad).
Cheers,
Franco
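For anyone wanting to follow the bisect suggestion, here is a minimal sketch of how the candidate commits between the two tags could be listed. It assumes a local clone of https://github.com/opnsense/core in which the tags 24.7.3 and 24.7.4 exist; list_candidates is a hypothetical helper name, not an OPNsense tool.

```shell
#!/bin/sh
# Sketch only: list the commits that are in the "bad" tag but not in the
# "good" tag, oldest first, so they can be tried one at a time.
# Arguments: path to a local git clone, good tag, bad tag.
list_candidates() {
    repo=$1
    good=$2
    bad=$3
    git -C "$repo" log --reverse --format='%h %s' "$good..$bad"
}

# Example (assumes the clone lives in ./core):
#   list_candidates ./core 24.7.3 24.7.4
#
# git's built-in bisect can do the halving for you:
#   git -C ./core bisect start 24.7.4 24.7.3   # bad revision first, then good
#   ... test the checked-out revision, then mark it ...
#   git -C ./core bisect good                  # or: git -C ./core bisect bad
#   git -C ./core bisect reset                 # when finished
```

How each checked-out revision gets onto the firewall for testing is not covered here; that part depends on your setup.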
Quote from: franco on September 13, 2024, 09:39:39 AM
> Reverting to 24.7.3_1 restores ZT connectivity immediately.
Can someone tell me what this means? A full revert? That still leaves the question of whether this is a kernel or a core issue...
It means the whole system was reverted to 24.7.3_1.
Same issue on my end.
I run a ZeroTier tunnel between an OPNsense Business Edition (home) and an OPNsense Community Edition running at my hosting provider. Right after updating the Community Edition to 24.7.4, ZeroTier was dead. Both ZeroTier installations are shown as online, but none of the devices can ping each other on their ZeroTier IP or on any other IP that is routed over this tunnel.
I will try downgrading my OPNsense to 24.7.3 as a workaround for now.
Update: Downgrade with "opnsense-revert -r 24.7.3_1" worked, traffic is flowing again :)
Since nobody helped further so far my best guess is https://github.com/opnsense/core/commit/1dba25fed8 and someone will need to confirm or deny that's the one. I'm assuming we're talking about assigned ZeroTier interfaces? At first glance this has nothing to do with ZeroTier...
Cheers,
Franco
Hi,
sorry, I do not understand. What should I do?
Greetings Mario
# opnsense-patch 1dba25fed8
Doesn't matter if 24.7.3 or 24.7.4.
If applied to 24.7.3 it should get worse. If applied to 24.7.4 it should be better -- given that it's the bad commit in question.
What I'm trying to say here is that nothing related to ZT was changed, so it's unclear what the problem is.
First we find out which commit. Then we need to figure out why ZT doesn't like it. Someone with the problem needs to do this.
Cheers,
Franco
I did this on 24.7.4
and it helped!
root@OPNsense:~ # opnsense-patch 1dba25fed8
Fetched 1dba25fed8 via https://github.com/opnsense/core
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|From 1dba25fed8686f865a9425bd08c04a01075c94e1 Mon Sep 17 00:00:00 2001
|From: Franco Fichtner <franco@opnsense.org>
|Date: Fri, 7 Jun 2024 22:22:02 +0200
|Subject: [PATCH] interfaces: force regeneration of link-local on spoofed MAC;
| closes #4430
|
|(cherry picked from commit a2ac1999f37ee98da22b6edd42c430c8dbb6534b)
|(cherry picked from commit 7669567944ec26ffea088a636482e04b0e9912d6)
|---
| src/etc/inc/interfaces.inc | 35 +++++++++++++++++++++++++++++++----
| 1 file changed, 31 insertions(+), 4 deletions(-)
|
|diff --git a/src/etc/inc/interfaces.inc b/src/etc/inc/interfaces.inc
|index 1b672e7f50..152155dafa 100644
|--- a/src/etc/inc/interfaces.inc
|+++ b/src/etc/inc/interfaces.inc
--------------------------
Patching file etc/inc/interfaces.inc using Plan A...
Reversed (or previously applied) patch detected! Assuming -R.
Hunk #1 succeeded at 2316 (offset 32 lines).
done
All patches have been applied successfully. Have a nice day.
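Note the "Assuming -R" line in the log above: on 24.7.4 the commit is already present, so running the command removes it instead of adding it. A minimal sketch for checking captured output follows; patch_direction is a hypothetical helper name, and it only looks for that one message.

```shell
#!/bin/sh
# Sketch only: read captured opnsense-patch output on stdin and report
# whether the commit was removed ("reverted") or added ("applied").
# It relies solely on the "Assuming -R" message seen in the log above.
patch_direction() {
    if grep -q 'Assuming -R'; then
        echo reverted
    else
        echo applied
    fi
}

# Example:
#   opnsense-patch 1dba25fed8 2>&1 | tee /tmp/patch.log
#   patch_direction < /tmp/patch.log
```

Running the same command a second time would flip the direction again, so it is worth checking which state you ended up in.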
opnsense-patch 1dba25fed8
It solved it for me!
Thanks, Franco.
Hi,
Works a treat!
Thanks, Franco!
Confirmed. Patch works here as well. Thanks
Confirmed working with the patch.
Quote from: franco on September 13, 2024, 04:28:19 PM
# opnsense-patch 1dba25fed8
Applied to 24.7.4 and ZT is working again after a reboot.
The patch is just for triage. ZeroTier has an issue with the auto-link-local flag, and I have no way of testing this. Could someone with the setup please take a closer look at the ifconfig output in the working and non-working cases?
My assumption is still that this is true for assigned ZeroTier interfaces, but maybe I missed someone confirming that. And is this an IPv4 or IPv6 tunnel?
Cheers,
Franco
I'm using only an IPv4 tunnel.
Quote from: franco on September 13, 2024, 09:05:33 PM
I have no way of testing this
I can set up a public ZT network for you to play with, just drop me a line
Quote from: franco on September 13, 2024, 09:05:33 PM
so someone with the setup please take a closer look at ifconfig in the working and non-working case.
Just the ZT part:
Non working:
REDACTED: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 5000 mtu 2800
description: ZeroTier (opt2)
options=80000<LINKSTATE>
ether 58:9c:fc:10:92:2f
inet 172.27.8.25 netmask 0xffff0000 broadcast 172.27.255.255
inet6 fe80::5a9c:ffff:ffff:ffff%REDACTED prefixlen 64 scopeid 0x7
groups: tap
media: Ethernet 1000baseT <full-duplex>
status: active
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Opened by PID 66352
Working after applying the patch:
REDACTED: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 5000 mtu 2800
description: ZeroTier (opt2)
options=80000<LINKSTATE>
ether 7a:fd:ba:es:1f:1c
hwaddr 58:9c:fc:10:92:2f
inet 172.27.8.25 netmask 0xffff0000 broadcast 172.27.255.255
inet6 fe80::5a9c:ffff:ffff:ffff%REDACTED prefixlen 64 scopeid 0x7
groups: tap
media: Ethernet 1000baseT <full-duplex>
status: active
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Opened by PID 61635
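The difference between the two outputs above is the ether/hwaddr pair: in the working case the interface carries a different ether address and the original one shows up as hwaddr. A minimal sketch for spotting this in ifconfig output follows; mac_status is a hypothetical helper name, and the rule "a separate hwaddr line means the MAC was overridden" is an assumption drawn from the outputs in this thread.

```shell
#!/bin/sh
# Sketch only: read ifconfig output for one interface on stdin and report
# whether the active MAC ("ether") differs from the hardware one ("hwaddr").
# Assumption: a missing hwaddr line means the MAC was not overridden.
mac_status() {
    awk '
        $1 == "ether"  { ether = $2 }
        $1 == "hwaddr" { hwaddr = $2 }
        END {
            if (ether == "")
                print "unknown"
            else if (hwaddr == "" || hwaddr == ether)
                print "not-overridden ether=" ether
            else
                print "overridden ether=" ether " hwaddr=" hwaddr
        }
    '
}

# Example (the interface name is a placeholder):
#   ifconfig ztXXXXXXXX | mac_status
```

Going by this thread, "overridden" is the state in which ZeroTier traffic flows, since ZeroTier sets its own MAC on the interface.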
Quote from: franco on September 13, 2024, 09:05:33 PM
My assumption is still that this is true for assigned ZeroTier interfaces, but maybe I missed someone confirming that.
The ZT networks are assigned to an interface in my case, yes.
Quote from: franco on September 13, 2024, 09:05:33 PM
And is this an IPv4 or IPv6 tunnel?
IPv4 in the tunnel
Thanks, I see the issue is somewhat similar to LAGG interfaces: ZeroTier modifies the Ethernet address of the device on its own. That certainly isn't great. I'll propose a patch next week.
Cheers,
Franco
Quote from: franco on September 13, 2024, 11:26:28 PM
ZeroTier modifies the Ethernet address of the device on its own.
It has to. Each device in a ZT network has its own MAC which is calculated from the member id of that device. This address does not change as long as the member id doesn't change which it only does if someone manually resets the member id and therefore makes it a new device to ZT.
ZT needs its own MAC because it works as a SDWAN switch and needs arp to function.
That answers the question why nothing arrived at the firewall. It was just impossible to send Ethernet frames to the ZT network member MAC from OPNsense.
FYI, 24.7.4_1 only fixes a PPP regression, not the ZT issue.
If in need of the PPP fix make sure to reapply the patch Franco posted earlier.
Otherwise there's no need to do anything until Franco has the ZT patch out.
Quote from: pbk on September 14, 2024, 12:11:51 AM
ZT needs its own MAC because it works as a SDWAN switch and needs arp to function.
While that seems clear it's also broken by design from the start because you could override the MAC address from the interface settings which obviously is a bad idea then.
Cheers,
Franco
Quote from: franco on September 14, 2024, 08:39:41 AM
While that seems clear it's also broken by design from the start because you could override the MAC address from the interface settings which obviously is a bad idea then.
Layer 2 over WAN links is broken by design. Just route, folks.
Just to note that ZeroTier has worked practically flawlessly since 2018, which was the last time someone actively maintained it. But in any case more maintenance would be better... ;)
https://github.com/opnsense/core/commit/dfd9f1766d
https://github.com/opnsense/plugins/commit/4f9e03089
# opnsense-revert opnsense os-zerotier && opnsense-patch dfd9f1766d && opnsense-patch -c plugins 4f9e03089
I'm not considering a hotfix for this, for the same reason that so much care was taken with the initial request for the spoofmac behaviour improvements:
https://github.com/opnsense/core/issues/4430
To release this into 24.7.5 it will need a good portion of non-ZT testing as well. The normal road forward would be to include it into 24.7.6 at the earliest.
Cheers,
Franco
Not quite there yet: after the patches the separate ether address is gone and the hardware address is now displayed as ether instead:
Pre-patches with the initial workaround
ztagimXXXXX: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 5000 mtu 2800
description: ZeroTier (opt2)
options=280401<RXCSUM,LRO,LINKSTATE,RXCSUM_IPV6>
ether 3f:44:6a:e3:2c:52
hwaddr 55:8a:dd:21:95:03
inet 192.168.29.6 netmask 0xffffff00 broadcast 192.168.29.255
inet6 fe80::5%ztagimXXXXX prefixlen 64 scopeid 0xd
inet6 fca2::1 prefixlen 40
groups: tap
media: Ethernet 1000baseT <full-duplex>
status: active
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Opened by PID 54657
Post-patches and rebooted
ztagimXXXXX: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 5000 mtu 2800
description: ZeroTier (opt2)
options=280401<RXCSUM,LRO,LINKSTATE,RXCSUM_IPV6>
ether 55:8a:dd:21:95:03
inet 192.168.29.6 netmask 0xffffff00 broadcast 192.168.29.255
inet6 fca2::1 prefixlen 40
inet6 fe80::5%ztagimXXXXX prefixlen 64 scopeid 0xd
groups: tap
media: Ethernet 1000baseT <full-duplex>
status: active
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Opened by PID 64048
Some output has been edited of course :)
Ok so the core patch did not apply for one reason or another... I made a backport and edited the original post.
Thanks for testing so far.
Cheers,
Franco
Just to confirm for everyone, patches are working now. Don't forget to restart the service :)
service zerotier restart
Quote from: franco on September 14, 2024, 09:25:12 PM
Ok so the core patch did not apply for one reason or another... I made a backport and edited the original post.
Thanks for testing so far.
Cheers,
Franco
Which one should we use?
opnsense-patch 1dba25fed8
OR
opnsense-revert opnsense os-zerotier && opnsense-patch dfd9f1766d && opnsense-patch -c plugins 4f9e03089
The second one (the longer chain) is the one to use. Note that the two revert commands are only needed if the previous patches have been attempted by the people testing. If unsure, it is best to run this chain of commands in full; it is absolutely safe.
I've added the last step of restarting the service to the chain as well.
opnsense-revert opnsense os-zerotier && opnsense-patch dfd9f1766d && opnsense-patch -c plugins 4f9e03089 && service zerotier restart
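Since the thread reports the problem only on 24.7.4 and 24.7.4_1 (24.7.3_1 works), a small guard can keep the chain from being run on other versions. This is a minimal sketch; zt_patch_needed is a hypothetical helper name, and the version string is assumed to be passed in by hand, as shown on the dashboard.

```shell
#!/bin/sh
# Sketch only: succeed (exit status 0) when the given OPNsense version is
# one this thread reports as affected, i.e. 24.7.4 or 24.7.4_N.
zt_patch_needed() {
    case "$1" in
        24.7.4|24.7.4_*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example, using the chain from the post above:
#   if zt_patch_needed "24.7.4_1"; then
#       opnsense-revert opnsense os-zerotier && opnsense-patch dfd9f1766d && \
#           opnsense-patch -c plugins 4f9e03089 && service zerotier restart
#   fi
```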
My suspicion is that
# service zerotier restart
would temporarily fix this anyway?
Cheers,
Franco
Hi Franco,
concerning ... "would temporarily fix this anyway?"
I applied before mentioned patches yesterday around lunchtime. I haven't done anything since with OPNsense. And zerotier is still working as it should.
A restart of zerotier didn't do anything when the problem occurred for the first time a couple of days ago with the unpatched version.
Ok, depending on what the zerotier binary does it may do down/up on top of fiddling with the MAC address of the interface, which would cause it to break again. It's a tough spot to be in for a VPN. ;)
Cheers,
Franco
Quote from: newsense on September 16, 2024, 05:19:55 AM
I've added the last step of restarting the service to the chain as well.
opnsense-revert opnsense os-zerotier && opnsense-patch dfd9f1766d && opnsense-patch -c plugins 4f9e03089 && service zerotier restart
I'd like to assist with testing, but just one question before starting. If I apply these patches, will I need to revert them before or after updating to a later 24.7.5 or 24.7.6 release that includes the fix? Or will updating to one of these versions simply overwrite the patches, with nothing left to do afterwards?
These patches will be in 24.7.5+, most likely with the latest version of ZT that was released late last week.
Quote from: pkirsche on September 16, 2024, 06:14:28 PM
I'd like to assist with testing, but just one question before starting. If I apply these patches, will I need to revert them before or after updating to a later 24.7.5 or 24.7.6 release that includes the fix? Or will updating to one of these versions simply overwrite the patches, with nothing left to do afterwards?
It's complicated.
The ZT plugin change will probably stick if we update with 24.7.5 or not. The core change will be scrubbed with 24.7.5 when it isn't included (maybe because it got pushed to 24.7.6). If it's included it gets scrubbed, too, but will work regardless (because it's being included in the update).
24.7.5 is doable for next week. But I need to recheck the LAGG case and the general ability to spoof the MAC address.
Cheers,
Franco
Quote from: newsense on September 16, 2024, 05:19:55 AM
This one is to be used. Note that the two revert commands are only needed if the previous patches have been attempted by the people testing. If unsure it is best to run this chain of commands in full, it is absolutely safe.
I've added the last step of restarting the service to the chain as well.
opnsense-revert opnsense os-zerotier && opnsense-patch dfd9f1766d && opnsense-patch -c plugins 4f9e03089 && service zerotier restart
Thanks for this! My ZeroTier mysteriously stopped working a few days ago, and I noted today that this coincided with the upgrade to 24.7.4_1. After a fair amount of trying out stuff (restarting ZT service or trying out a different network didn't work either), I had nearly given up but thankfully discovered this thread. Applying those patches as suggested fixed my ZeroTier connectivity and routing - works just like before now!
Quote from: franco on September 16, 2024, 07:47:25 PM
The ZT plugin change will probably stick if we update with 24.7.5 or not. The core change will be scrubbed with 24.7.5 when it isn't included (maybe because it got pushed to 24.7.6). If it's included it gets scrubbed, too, but will work regardless (because it's being included in the update).
I really hope these patches stick after those updates!
Cheers,
Pranay
We will likely push all of them into 24.7.5. :)
Cheers,
Franco
Confirmed, the patch works.
I can confirm that Zerotier works on a fresh install of 24.7.5. I will update one of the previously affected 24.7.4 systems (that was working again thanks to the script posted by Newsense) tonight to verify that as well. Zerotier was not working on multiple fresh installs of 24.7.4 (as well as upgraded units) previously. Thank you for such a rapid solution (both the script and the fix in 24.7.5).
And I can confirm that a 27.4_1 [EDIT: that should be 24.7.4_1] system that was:
* Non-functional with Zerotier.
* Had been modified with the script from Newsense so that Zerotier was working.
* Was upgraded to 24.7.5 and Zerotier continues to work correctly.
> And I can confirm that a 27.4_1 system that was:
That should read 24.7.4_1 I believe.
Cheers,
Franco
Yes, very much so. Late night and cross-eyed. 24.7.4_1 indeed.
Another confirmation... ZeroTier works again like a charm after 24.7.5 update. Thanks everyone involved!!!