Hello everyone. :)
Just set up my account after reading the forums for over a year now (thanks for all the help I've gotten from reading here so far!). I have moved into a new flat and set up a new MS-01 as my OPNsense box. Everything has worked so far, but I have an extremely weird DHCP behavior that I can't seem to figure out, so I wanted to ask for help:
The most important thing first:
I upgraded this morning to OPNsense 24.7.4, as I thought it might be a bug that is already resolved but this didn't help. Back to the setup:
Behind my Sense sits my Main Switch, a Zyxel ("S1" @10.0.0.10). Attached to the switch I have a NAS and a Proxmox server as well as two other switches, a TP-Link in my office ("S2" @ 10.0.0.20) and a Mikrotik in the living room that also acts as an access point ("S3" @ 10.0.0.30).
When I am in my network, for example connected via my workstation at 10.0.200.11 (WorkVLAN 200, connected to S2) I can reach the Zyxel at 10.0.0.10. I can also reach the Sense from there and reach out to the internet.
However, in my Sense GUI S1 shows as offline. When I connect to my Sense via WireGuard with my phone, I also can*not* see S1 when I browse to 10.0.0.10. When my phone is in my home network via the S3 access point, however, I can see it. That means the whole network that sits behind S1 can see S1, but everything in front of it (just the OPNsense and the remote devices that come in via WireGuard) cannot.
Now.. this is where it gets confusing:
When I wireguard in from outside, I can however reach S2 and S3 that sit *behind* S1. I can also reach all my services on the Proxmox server. S2, S3, Proxmox and NAS, all connected to S1 show up as "online" in the DHCP leases.
So far so confusing.. now let's add some more confusion:
I figured something must be wrong with the switch so I reset it. No change. So I removed the static DHCP lease for S1 and gave it a dynamic IP. It now showed up at 10.0.0.13 as online but I could not reach it from anywhere, neither from Wireguarded phones on the Sense nor from the Workstation behind S2. When reassigning a static lease for 10.0.0.10 again, it showed as offline, *however* it did receive this IP address and was now accessible again from behind it but not from in front.
I played that through several times with static and dynamic leases, rebooted the Sense and S1 many times, reset S1 countless times but nothing changes.
It has all worked in the past, and I only recently set up WireGuard to Proton, so I first thought that the Sense might be routing traffic destined for S1 through the tunnel to Proton instead of through the SFP+ port connected to S1. On closer thought this doesn't make much sense either, because S2 and S3, which sit behind the same SFP+ trunk, are reachable. I also just recently added S3 and the access point, and when I plug my laptop into the wall socket where S2 is usually connected, S1 routes me to S3 instead of out to the Sense. So it might also have something to do with my VLAN setup, but that is already quite far-fetched.
Furthermore I have a lot of errors in my DHCP log that look like this for pretty much every static IP that I have assigned:
2024-09-15T09:09:52 Error dhcpd from the dynamic address pool for 10.0.20.0/24
2024-09-15T09:09:52 Error dhcpd Remove host declaration s_opt3_1 or remove 10.0.20.20
2024-09-15T09:09:52 Error dhcpd Dynamic and static leases present for 10.0.20.20.
2024-09-15T09:01:58 Error dhcpd from the dynamic address pool for 10.0.20.0/24
2024-09-15T09:01:58 Error dhcpd Remove host declaration s_opt3_0 or remove 10.0.20.10
2024-09-15T09:01:58 Error dhcpd Dynamic and static leases present for 10.0.20.10.
2024-09-15T09:01:55 Error dhcpd from the dynamic address pool for 10.0.20.0/24
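For anyone reading along: each of these errors appears to be a single dhcpd message split over three log lines, listed newest first. Here is a minimal Python sketch, assuming the log text is pasted in exactly as shown above, that stitches the pieces back together and lists which static mappings collide with a dynamic pool. The function name `conflicts` and the variable names are made up for illustration.

```python
import re

# Log excerpt copied from above (newest entries first).
LOG = """\
2024-09-15T09:09:52 Error dhcpd from the dynamic address pool for 10.0.20.0/24
2024-09-15T09:09:52 Error dhcpd Remove host declaration s_opt3_1 or remove 10.0.20.20
2024-09-15T09:09:52 Error dhcpd Dynamic and static leases present for 10.0.20.20.
2024-09-15T09:01:58 Error dhcpd from the dynamic address pool for 10.0.20.0/24
2024-09-15T09:01:58 Error dhcpd Remove host declaration s_opt3_0 or remove 10.0.20.10
2024-09-15T09:01:58 Error dhcpd Dynamic and static leases present for 10.0.20.10.
2024-09-15T09:01:55 Error dhcpd from the dynamic address pool for 10.0.20.0/24
"""

POOL_RE = re.compile(r"from the dynamic address pool for (\S+)")
HOST_RE = re.compile(r"Remove host declaration (\S+) or remove (\d+\.\d+\.\d+\.\d+)")


def conflicts(log_text):
    """Return unique (host declaration, ip, pool subnet) tuples in log order."""
    found = []
    pool = None
    for line in log_text.splitlines():
        pool_match = POOL_RE.search(line)
        if pool_match:
            # In newest-first order the pool line comes just before its host line.
            pool = pool_match.group(1)
            continue
        host_match = HOST_RE.search(line)
        if host_match:
            entry = (host_match.group(1), host_match.group(2), pool)
            if entry not in found:
                found.append(entry)
    return found


for host, ip, pool in conflicts(LOG):
    print(f"{host}: static {ip} lies inside the dynamic pool of {pool}")
```

Going by the message text itself, the likely fix is to keep static mappings outside the dynamic range of that interface (or shrink the range), though that may well be unrelated to the S1 problem.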
It is an extremely odd behavior and a bit above my league. Does anyone have an idea? I'd appreciate the help a lot.
Quote from: lively1355 on September 15, 2024, 11:40:33 AM
when I plug in my laptop to the wall where S2 is usually connected to, S1 routes me to S3 instead of out to the Sense.
This statement doesn't make sense: switches don't route (unless they're layer-3 switches, in which case you need to describe them further, as that would be very relevant to the problems you're describing). What test are you performing? What do you expect to happen? What actually happens?
My gut feeling is that you have an incorrect netmask on one or more of your firewall interfaces, resulting in overlapping subnets... or something along those lines...
Maybe that sentence was a little confusing, sorry. Let me clear it up:
S1 (Zyxel XGS1250) is connected to S2 (TP-Link TL-SG108E, which sits in a different room) via cables in the wall. If I plug my laptop into that wall socket, my traffic goes through the wall to S1, which then sends the packets to S3 (MikroTik hAP ax²) instead of upstream to the Sense. If I plug in S2 and connect devices to it, then traffic gets sent through S2 to S1 and from there on out.
My network is as follows:
I have my LAN which is 10.0.0.0/24. In the LAN there's only the switches and nothing else.
I have a server vLAN (10) which holds my NAS and my Proxmox Server. They are in the 10.0.10.0/24 range.
I have another "services" vLAN (20), which holds all the services that run on the Proxmox Server (like the PiHole etc) in the 10.0.20.0/24 range.
Then there's a work vlan (200) at 10.0.200.0/24 and an IoT vlan (90) at 10.0.90.0/24 as well as two networks for traffic coming in through wireguard at 10.0.100.0/24 and 10.0.101.0/24.
The network is therefore quite separated and I should not have overlapping subnets, at least to my knowledge and understanding.
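As a sanity check on the addressing above, here is a small Python sketch using the standard `ipaddress` module. It assumes every network really is configured as a /24, as listed; the labels "WireGuard A" and "WireGuard B" are placeholder names. It also shows what a single mistyped mask on the LAN interface (a hypothetical /16, not something seen in the actual config) would do.

```python
from ipaddress import ip_network
from itertools import combinations

# Subnets as described in this post; all masks are assumed to be /24.
SUBNETS = {
    "LAN": "10.0.0.0/24",
    "Servers (VLAN 10)": "10.0.10.0/24",
    "Services (VLAN 20)": "10.0.20.0/24",
    "IoT (VLAN 90)": "10.0.90.0/24",
    "Work (VLAN 200)": "10.0.200.0/24",
    "WireGuard A": "10.0.100.0/24",
    "WireGuard B": "10.0.101.0/24",
}


def overlapping(subnets):
    """Return every pair of network names whose address ranges overlap."""
    nets = {name: ip_network(cidr) for name, cidr in subnets.items()}
    return [(a, b) for a, b in combinations(nets, 2) if nets[a].overlaps(nets[b])]


# With the masks as listed, no pair overlaps.
print(overlapping(SUBNETS))

# Hypothetical typo: LAN entered as /16 instead of /24.
mistyped = dict(SUBNETS, LAN="10.0.0.0/16")
for a, b in overlapping(mistyped):
    print(f"{a} overlaps {b}")
```

On paper the plan is clean, so the thing to compare against is the mask actually set on each interface in the GUI.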
What is supposed to happen is that S1 shows up as connected in OPNsense, which it doesn't. It is also supposed to be reachable from that side (via the mobile devices that reach the Sense over WireGuard). It is highly confusing that the whole network is reachable *through* the switch but the switch itself is only accessible from behind the switch and not from in front.
Testing was performed by connecting multiple devices to the network behind the switch and connecting to it, which is possible from everywhere behind it.
In short: Everything works except communication between the switch and the Sense. Everything communicating *through* the switch with the Sense works, and everything behind the switch communicating with each other or with the switch works as well.
There's simply no direct connection between the Sense and the switch when it's assigned a static IP of 10.0.0.10. When it's assigned a dynamic IP (for example 10.0.0.13 handed out by the Sense) it shows up as connected in the Sense. When it is handed the static IP of 10.0.0.10, the switch actually receives the IP and responds to requests on that IP but shows as disconnected on the Sense.
I have to also note that this very setup has worked flawlessly before. For some reason the connection between switch and Sense just broke and I can't figure out why, especially because everything else gets passed through accordingly.