Messages - Kallex

#1
Quote from: Kallex on August 24, 2021, 11:15:50 PM
I can try to; we're on a production environment, so the earliest I can try it is on the weekend.

I guess that's not the "Stable Business Branch" release, can I easily roll back to the last stable one after checking that version out?

I'll report back regardless of whether I could test it or not.

EDIT: Realized it's indeed a business release. I'll test it on the weekend at the latest and report back.

I got to test it now. My issue does not replicate anymore with this newest version, thank you :-).

So initially I had performance issues on routing VLAN <=> LAN through ax0 (10 GbE) on Deciso DEC 840. After this patch the issue is clearly gone.

I don't have any real performance numbers between VLANs, but the clear "laggy issue" is entirely gone now.
#2
I can try to; we're on a production environment, so the earliest I can try it is on the weekend.

I guess that's not the "Stable Business Branch" release, can I easily roll back to the last stable one after checking that version out?

I'll report back regardless of whether I could test it or not.

EDIT: Realized it's indeed a business release. I'll test it on the weekend at the latest and report back.
#3
Ok, that was nice and clean to confirm.

To clarify the terms below: the Deciso 840 has 4x GbE ports (igb0,1,2,3) and 2x 10 GbE SFP+ ports (ax0,ax1).

The issue with the Deciso 840 is the 10 GbE SFP+ ports routing VLAN traffic. In my case it was supposed to route the traffic alongside untagged LAN traffic, so this is the scenario I can confirm.


1. Before changes - VLAN routing worked

Before using SFP+ ports I had LAN + VLAN routed with igb0 interface. Everything worked well, no issues.

2. After changes - VLAN routing broken (affecting other routing too)

After moving LAN + VLAN over to the SFP+ port (ax0), the issues started. Whenever VLAN traffic was routed, there were heavy lag spikes on non-VLAN traffic as well. I don't have performance numbers, but the traffic wasn't heavy, yet it heavily affected the whole physical interface.

3. Fixed with moving VLAN to igb0 while keeping LAN on ax0

As I knew the "everything on igb0" setup worked, I wanted to see whether it's enough to move just the VLAN to igb0 and keep LAN on ax0. It required some careful "tag-denial" on switch routes to not "loop" either untagged or VLAN traffic, but the solution worked.

EDIT: Of course this workaround/fix was only feasible because my VLAN networks didn't need the 10 GbE in the first place.


As I need to change 2x managed switches and be very careful not to make my OPNsense inaccessible, I'm hesitant to try "the other way around": moving VLANs to SFP+ and LAN to igb0, just to test whether VLAN routing as a whole is broken or whether the issue only occurs when LAN/VLAN traffic is "routing back" through the same physical interface.

I also didn't test the 10 GbE speeds (no sensible way to test it right now through OPNsense), but the lagging/latency issue was so clear, that there obviously was something not working.
#4
If you meant me, no I don't have Sensei and I believe (can't even right now find the setting) I don't have IPS enabled (at least not on purpose).

We do use traffic shaping policies for 2x WANs, but that's about it. All the other is just basic (rule limited) routing between LAN/VLANs.

I didn't touch anything on the recent change, except moved the LAN (+ VLANs associated with it) from igb0 interface to ax0.

I'll revert the configuration soonish (hopefully today), as the 10 GbE wasn't really utilized yet and the issue is really easy to spot right now. So I'll get more info about my scenario soon.
#5
Just to chip in and offer possibly "standard" hardware approach.

I'm using Deciso's "own" hardware, which should help replicating/reproing the issue.

Deciso DEC 840 with OPNsense 21.4.2-amd64, FreeBSD 12.1-RELEASE-p19-HBSD

I have one main VLAN routing to untagged (main LAN). I upgraded my main switch to 10 Gbps and changed my LAN+VLAN interface from a GbE port to an SFP+ port at 10 GbE.

Everything else works well, but VLAN <=> LAN routing causes massive lag on completely separate routing (like 400-1000ms spikes); the extreme one being CPU spike up to 80%+ which caused several seconds of 1000-1300ms spikes on separate routing (light traffic).
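A minimal sketch of how spikes like these could be picked out of ping output instead of being judged by eye. It assumes reply lines in the usual `time=... ms` form; the function name `find_spikes`, the 400 ms threshold, and the sample lines are made up for illustration and are not captured from the DEC 840.

```python
import re

# Matches the round-trip time in a typical ping reply line, e.g. "time=12.3 ms".
RTT_PATTERN = re.compile(r"time[=<]([\d.]+)\s*ms")


def find_spikes(ping_lines, threshold_ms=400.0):
    """Return (line_index, rtt_ms) for every reply at or above threshold_ms."""
    spikes = []
    for index, line in enumerate(ping_lines):
        match = RTT_PATTERN.search(line)
        if match is None:
            continue  # header, summary, or timeout line
        rtt_ms = float(match.group(1))
        if rtt_ms >= threshold_ms:
            spikes.append((index, rtt_ms))
    return spikes


# Made-up sample lines (hypothetical addresses and timings).
sample = [
    "64 bytes from 192.0.2.10: icmp_seq=1 ttl=64 time=0.8 ms",
    "64 bytes from 192.0.2.10: icmp_seq=2 ttl=64 time=1040 ms",
    "64 bytes from 192.0.2.10: icmp_seq=3 ttl=64 time=1.1 ms",
]
print(find_spikes(sample))  # prints [(1, 1040.0)]
```

Running a ping against a host on the "separate routing" path while VLAN <=> LAN traffic is flowing, and feeding the saved output through something like this, would give rough numbers to attach to a report.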

I will reconfigure (likely today) the VLAN parts to a separate GbE interface and see if that resolves the issue; the next step will be restoring the whole network to GbE ports (as it was before).

I did install a new switch in the network, so it might play a part in this, but based on the behaviour, it seems unlikely.
#6
Didn't seem to help.

I removed the static entry and allowed DHCP to serve it dynamically. Then I turned off the client (let the lease go offline), restarted the DHCP server (to allow the lease to be removed), and removed the lease.

Then I configured the static entry for a different MAC (my other VM) with that IP and started the "wrong MAC" system again. It still got the same lease, even though that IP was already reserved for static use.


In my initial discovery, I had the original server long-running (months) and then turned it off for maintenance... and while adding new devices to the IoT network, I noticed them getting the server's IP. So there weren't any dynamic leases competing with the static one then.

I'll try to dig into the system's dhcp.conf etc. myself to see if I can spot something.
#7
That would make sense. Let me try that.

I also now understand what you mean by dhcpd being the ISC version. I tried to dig into their docs and limitations on this; it seems to be a somewhat recognized "issue", but it's not described clearly enough to say that it explains this behaviour.
#8
I tested it.

The DHCP gives the same IP to both devices. Ping "almost works"; it's actually interesting behavior.

So the "reserved IP" is still handed to the device that it was initially kind of reserved for.
#9
Quote from: chemlud on August 16, 2021, 05:21:49 PM
Quote from: Kallex on August 16, 2021, 04:51:44 PM
To continue a bit more for the security implication of this. When the "reserved IP device" comes online, whichever IP it might acquire:


  • Either it collides with existing IP - breaking the network for those devices
  • Or it gets a new one and again, is not within its firewall-ruleset bound to its "reserved" IP

So either way it's quite a problem.

Or it gets no IP with "no free leases"? Have you tried?

For networks with specific rules for dedicated/reserved IPs based on MAC I don't hand out any IPs to unknown MACs (checkbox for DHCP). End of story (I guess)...

Thanks, I tried to be too clever here. I'll do the testing and report back to conclude and not leave this hanging.
#10
I'm not sure what you mean by research not done..? If my examples/tests above lack something, I can revisit and describe them better.


Simply put; I think you should either:

Disable the feature that allows selecting a static IP within the dynamic pool
- Which, as I understood, you're likely not going to do - and I agree here, it's a desirable feature to have

or

Ensure that the reserved IPs are then not assigned to any other MAC
- If you can't guarantee this, you're opening a can-o-worms, including the firewall-security issue
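The second option can be sketched as a toy allocator. This is only an illustration of the desired behaviour, not how isc-dhcpd works internally; the function `next_free_address` and its parameters are hypothetical.

```python
import ipaddress


def next_free_address(pool_start, pool_end, reserved, leased, client_mac):
    """Pick an address for client_mac without handing out another MAC's reservation.

    reserved: dict mapping IP string -> MAC that owns the reservation
    leased:   dict mapping IP string -> MAC currently holding the lease
    """
    # A client with a reservation always gets its own address.
    for ip, mac in reserved.items():
        if mac == client_mac:
            return ip

    start = int(ipaddress.IPv4Address(pool_start))
    end = int(ipaddress.IPv4Address(pool_end))
    for value in range(start, end + 1):
        ip = str(ipaddress.IPv4Address(value))
        if ip in reserved or ip in leased:
            continue  # reserved for another MAC, or already in use
        return ip
    return None  # corresponds to the "no free leases" case
```

With `reserved = {"192.168.10.100": "aa:aa:aa:aa:aa:aa"}` and a pool of `192.168.10.100` to `192.168.10.102`, an unknown MAC would get `192.168.10.101` even while the reserved device is offline.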


Right now the current UI/documentation doesn't in any way indicate, that "You shouldn't try to reserve from within a pool".

The overlapping, or just "stealing the reserved IP", also compromises firewall rules.


I can solve my own issue myself now that I know the behavior, by simply moving my reservations outside the dynamic pool.

But I think you might want to revisit that part about the firewall rule compromise, because I don't think it's a small issue.
#11
To continue a bit more for the security implication of this. When the "reserved IP device" comes online, whichever IP it might acquire:


  • Either it collides with existing IP - breaking the network for those devices
  • Or it gets a new one and again, is not within its firewall-ruleset bound to its "reserved" IP

So either way it's quite a problem.
#12
Quote from: pmhausen on August 16, 2021, 03:27:08 PM
@kallex You might be correct here but this is how the dhcpd works and nothing OPNsense could change. You would need to file a bug ticket with the isc-dhcpd project to get this fixed.

Do you mean this particular dhcpd implementation that OPNsense is bound to use?

I mean I can understand that this isc-dhcpd project might be the right place, but I'd like to argue that the feature as-it-is right now, is not working properly.


I mean I Googled around just generic "dhcpd static ip reservations" (just picked few random - seem to follow pattern);

https://serverfault.com/questions/768655/how-dhcpd-handles-static-ips-vs-dhcp-reservations
https://www.itsfullofstars.de/2019/02/assign-a-static-ip-to-dhcp-client/

Both revolve around "can be inside the scope or outside the scope depending on the implementation", and both also mention leases that dictate which MAC address has which IP. Both also happen to match my previous experience with a similar feature on consumer/prosumer hardware (Asus and Ubiquiti Unifi).

Focusing on leases is where I can clearly point to the reservation part of the issue: why do I have two different MACs for the same IP in my active leases list? That's quite an anomaly.
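A minimal sketch of spotting that anomaly automatically, assuming the active leases have already been exported as (IP, MAC) pairs. The function `conflicting_leases` and the sample addresses are hypothetical, not taken from the actual lease table.

```python
from collections import defaultdict


def conflicting_leases(leases):
    """Return {ip: sorted MACs} for every IP that appears with more than one MAC."""
    macs_by_ip = defaultdict(set)
    for ip, mac in leases:
        macs_by_ip[ip].add(mac.lower())  # normalise case before comparing
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Made-up lease entries for illustration only.
sample_leases = [
    ("192.168.10.50", "00:11:22:33:44:55"),
    ("192.168.10.51", "00:11:22:33:44:66"),
    ("192.168.10.50", "66:77:88:99:AA:BB"),
]
print(conflicting_leases(sample_leases))
```

Any IP that shows up in the result is one where a second device would inherit the firewall rules written for the first.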


Bottom line: as the feature stands now, it leads the end user to assume that a reservation inside the dynamic pool is OK, which actually makes it a (quite severe) security issue.

The device that "steals" the reserved IP also gets all the firewall rules of the reserved IP device. One main use case for reserving IPs is exactly this - having custom security/firewall rules for them.
#13
Quote from: franco on August 13, 2021, 01:44:29 PM
21.1 changed this:

o dhcp: removed the need for a static IPv4 being outside of the pool (contributed by Gauss23)

But I'm still thinking about this feature improvement: what would be the point of allowing the static IP to be within the pool, except exactly to give the dedicated IP to that MAC address from the dynamic pool?

This is what I experienced without issues.

The problem was that, because there was no reservation, the address was given out elsewhere when the reserved client was offline.

Looking at it from this perspective, this seems like a bug to me. The MAC/IPs listed within the pool of course need to be reserved for those MACs.
#14
Thank you all for the patience.

I'm sorry - it took me a while to really understand the dynamic pool vs. "same subnet, outside dynamic pool ranges". So the DHCP server will then serve the addresses still outside its dynamic ranges..?

Which makes sense now, when I get what you mean.

My previous experience was with the Asus RT series and the Ubiquiti Unifi series, where there is similar functionality in the UI, and where it indeed does reserve the current IP for that MAC address from the dynamic pool as well.


I don't insist on this being a bug, but the documentation and the UI/current behaviour could use some better tooltips. It might also be worth rechecking whether that "prosumer/consumer router" behaviour is indeed as I have experienced it, and whether it's something to be mirrored in OPNsense as well.

I'll experiment with the proper understanding now... and remove the IPs from the dynamic pools (it does make sense that way too, from a logical management perspective and all).
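A minimal sketch of checking which static mappings still sit inside the dynamic range before moving them out. It assumes the pool is a single start/end range and the mappings are available as a MAC -> IP dictionary; `mappings_inside_pool` and the example addresses are hypothetical.

```python
import ipaddress


def mappings_inside_pool(pool_start, pool_end, static_mappings):
    """Return the static mappings (MAC -> IP) whose IP falls inside the dynamic pool."""
    start = ipaddress.IPv4Address(pool_start)
    end = ipaddress.IPv4Address(pool_end)
    return {
        mac: ip
        for mac, ip in static_mappings.items()
        if start <= ipaddress.IPv4Address(ip) <= end
    }


# Made-up example: a pool of .100-.199 and two static mappings.
example_mappings = {
    "00:11:22:33:44:55": "192.168.30.150",  # inside the pool
    "00:11:22:33:44:66": "192.168.30.20",   # outside the pool
}
print(mappings_inside_pool("192.168.30.100", "192.168.30.199", example_mappings))
```

Every mapping returned here is a candidate for moving to an address outside the range.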


About the pool size:

I have several VLAN-separated networks; the one where I spotted this issue is an IoT-specific net. A pool of 100 is plenty there, when there are around 25 logical devices in it now, each being added explicitly. But I understand the point.

#15
Now, after testing this further and reproducing it while varying the obvious settings, I think this is simply a bug.

Not an opinion-based feature or anything, just simply not working :).

What's the proper way of reporting bugs? Create an issue in GitHub... to which repository?