[solved] How do I debug this? clients connected through AP can't ping router

Started by pyroclasticglow, March 16, 2025, 05:00:20 PM

Apologies if this isn't the best venue for this sort of newb question; if there is a better one please let me know.

I've recently installed OPNsense (25.1.2, on a Qotom box) because I wanted to "learn a bit more networking", and, well... careful what you wish for?

My installation is very vanilla. The only maybe-slightly-funny thing right now is that I have four of the ethernet interfaces on the router box bridged; this is temporary while I wait to get a switch.

I have two machines connected directly via ethernet, and this is working fine.
I also have an EAP610 connected via ethernet, and this has... been funny.

Basically I'm intermittently running into an issue where devices connected to the AP are receiving IP addresses via DHCP (from OPNsense) and are showing up in the DHCP leases list, but they can't access the router or the WAN.

- pinging the router fails (no route to host; wireshark shows a bunch of unanswered ARP broadcasts looking for 192.168.1.1, the router ip)
- pings from the router reach the client, but it doesn't reply
  2368 2966.036740 192.168.1.1 192.168.1.102 ICMP 98 Echo (ping) request  id=0x91cc, seq=6/1536, ttl=64 (no response found!)
- but pinging other things on the network is fine (clients can ping the AP and other clients)
- this AP was working fine with the previous router, and was working for a while with this one
- I haven't added any firewall rules

The other bit of relevant information is that this was working fine until I started messing about with static DHCP reservations. Since then the error has recurred a few times, and has also resolved itself a few times, often on a timescale that coincides with the max DHCP lease length (as in, last time I was putzing around in the morning before work, this problem occurred, and then wireless wasn't working for about 24h).


So: how should I be thinking about this problem? What are useful debugging steps? I know how computers work and I'm comfortable in the command line, but I don't know much about networking.

I'd tend to suspect your bridge setup. You should have the bridge assigned as your LAN interface, configured with the IP address 192.168.1.1, and the member devices should not be assigned to interfaces (at least not ones that have IP addresses). Is that the case? There's a bit of a dance you have to go through to set this up: https://docs.opnsense.org/manual/how-tos/lan_bridge.html

Another typical problem area with bridges is ARP. Check the ARP tables on all devices against the DHCP leases and the expected MACs.
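A rough way to do that comparison without eyeballing two lists is sketched below. It assumes FreeBSD-style `arp -an` output pasted in as text and a hand-typed dict of leases; the sample ARP lines and the MAC for 192.168.1.102 are made up for illustration and are not taken from this thread's router.

```python
import re

# Matches FreeBSD-style `arp -an` lines, for example:
# "? (192.168.1.115) at fa:4c:d1:56:48:d3 on bridge0 expires in 1176 seconds [bridge]"
# Entries shown as "(incomplete)" have no MAC and are deliberately not matched.
ARP_LINE = re.compile(r"\((?P<ip>\d+\.\d+\.\d+\.\d+)\) at (?P<mac>[0-9a-fA-F:]{17})")


def parse_arp(text):
    """Return {ip: mac} for every resolved entry in `arp -an` style output."""
    table = {}
    for line in text.splitlines():
        m = ARP_LINE.search(line)
        if m:
            table[m.group("ip")] = m.group("mac").lower()
    return table


def compare(arp_table, leases):
    """Return (ip, lease_mac, arp_mac) for each lease the ARP table lacks or contradicts."""
    mismatches = []
    for ip, lease_mac in sorted(leases.items()):
        arp_mac = arp_table.get(ip)  # None when the entry is missing or incomplete
        if arp_mac != lease_mac.lower():
            mismatches.append((ip, lease_mac.lower(), arp_mac))
    return mismatches


# Illustrative input only (hypothetical values).
SAMPLE_ARP = """\
? (192.168.1.1) at 58:9c:fc:10:ad:0b on bridge0 permanent [bridge]
? (192.168.1.115) at fa:4c:d1:56:48:d3 on bridge0 expires in 1176 seconds [bridge]
? (192.168.1.102) at (incomplete) on bridge0 expired [bridge]
"""
SAMPLE_LEASES = {
    "192.168.1.115": "fa:4c:d1:56:48:d3",
    "192.168.1.102": "aa:bb:cc:dd:ee:ff",  # hypothetical MAC
}

if __name__ == "__main__":
    for ip, lease_mac, arp_mac in compare(parse_arp(SAMPLE_ARP), SAMPLE_LEASES):
        print(f"{ip}: lease says {lease_mac}, ARP table says {arp_mac}")
```

Anything it prints is a lease whose IP is either unresolved in the ARP table or resolved to a different MAC, which is where to look first.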

I use four bridges on my firewall, and haven't had a (connectivity) problem with either AP I have (an old Linksys and a Netgear running OpenWRT). Mine are configured as bridges (dang, I swear I really don't have a bridge fetish) so that DHCP and filtering are done by the firewall. It sounds as though your setup is similar. I don't use any static DHCP or NAT, though.

@dseven I followed that guide for setting up the bridge, and it was working until I started to fiddle with DHCP.

I _did_ have an extra interface assignment for the interface that is being used for WAN, but it was disabled. Removing it and rebooting has changed nothing.

@pfry the ARP table also looks fine, it follows what I see in the leases.

Something else I just noticed: in my DHCP logs I'm seeing a lot of redundant requests from the devices that I'm having trouble with, like:

2025-03-16T16:27:19-04:00 Informational dhcpd DHCPACK on 192.168.1.115 to fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:27:19-04:00 Informational dhcpd DHCPREQUEST for 192.168.1.115 from fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:27:19-04:00 Debug dhcpd reuse_lease: lease age 299 (secs) under 25% threshold, reply with unaltered, existing lease for 192.168.1.115
2025-03-16T16:27:19-04:00 Informational dhcpd DHCPACK on 192.168.1.115 to fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:27:19-04:00 Informational dhcpd DHCPREQUEST for 192.168.1.115 from fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:27:19-04:00 Debug dhcpd reuse_lease: lease age 299 (secs) under 25% threshold, reply with unaltered, existing lease for 192.168.1.115
2025-03-16T16:26:45-04:00 Informational dhcpd DHCPACK on 192.168.1.115 to fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:26:45-04:00 Informational dhcpd DHCPREQUEST for 192.168.1.115 from fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:26:45-04:00 Debug dhcpd reuse_lease: lease age 265 (secs) under 25% threshold, reply with unaltered, existing lease for 192.168.1.115
2025-03-16T16:26:45-04:00 Informational dhcpd DHCPACK on 192.168.1.115 to fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:26:45-04:00 Informational dhcpd DHCPREQUEST for 192.168.1.115 from fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:26:45-04:00 Debug dhcpd reuse_lease: lease age 265 (secs) under 25% threshold, reply with unaltered, existing lease for 192.168.1.115

and that seems weird? 
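To put numbers on "redundant", a small sketch like the one below could tally the requests. It assumes the log lines keep exactly the layout pasted above (timestamp, severity, `dhcpd`, message); the function names are my own and nothing here is an OPNsense tool.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the DHCPREQUEST / DHCPACK lines in the layout shown in the excerpt.
LOG_LINE = re.compile(
    r"^(?P<ts>\S+) \S+ dhcpd (?P<kind>DHCPREQUEST|DHCPACK) (?:for|on) (?P<ip>\S+) "
    r"(?:from|to) (?P<mac>[0-9a-f:]{17})"
)


def count_requests(lines):
    """Count DHCPREQUEST messages per (timestamp, client MAC)."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and m.group("kind") == "DHCPREQUEST":
            counts[(m.group("ts"), m.group("mac"))] += 1
    return counts


def gaps_seconds(counts):
    """Seconds between consecutive distinct timestamps that carried a request."""
    stamps = sorted({datetime.fromisoformat(ts) for ts, _ in counts})
    return [(later - earlier).total_seconds() for earlier, later in zip(stamps, stamps[1:])]


# The DHCPREQUEST and Debug lines from the excerpt above (ACK lines omitted for brevity).
SAMPLE_LOG = [
    "2025-03-16T16:27:19-04:00 Informational dhcpd DHCPREQUEST for 192.168.1.115 from fa:4c:d1:56:48:d3 (iPhone) via bridge0",
    "2025-03-16T16:27:19-04:00 Debug dhcpd reuse_lease: lease age 299 (secs) under 25% threshold, reply with unaltered, existing lease for 192.168.1.115",
    "2025-03-16T16:27:19-04:00 Informational dhcpd DHCPREQUEST for 192.168.1.115 from fa:4c:d1:56:48:d3 (iPhone) via bridge0",
    "2025-03-16T16:26:45-04:00 Informational dhcpd DHCPREQUEST for 192.168.1.115 from fa:4c:d1:56:48:d3 (iPhone) via bridge0",
    "2025-03-16T16:26:45-04:00 Informational dhcpd DHCPREQUEST for 192.168.1.115 from fa:4c:d1:56:48:d3 (iPhone) via bridge0",
]

if __name__ == "__main__":
    counts = count_requests(SAMPLE_LOG)
    for (ts, mac), n in sorted(counts.items()):
        print(f"{ts} {mac}: {n} request(s)")
    print("gaps between bursts (s):", gaps_seconds(counts))
```

On the excerpt this shows the same client sending two requests within a single second, with the next pair 34 seconds later, which matches the jump in reported lease age from 265 to 299.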

Have you configured a very short DHCP lease time? That may cause client and server to disagree on when it's time to renew.

As for the ARP problem - if you're seeing unanswered ARP requests on the client, try tcpdump on the bridge interface on OPNsense - do you see the ARP requests there?

The max lease time is set to four hours, default is two hours.
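For what it's worth, the arithmetic behind those settings can be sketched as below. The T1/T2 fractions (50% and 87.5% of the lease) are the RFC 2131 defaults. Reading the "25% threshold" in the `reuse_lease` log message as a percentage of the granted lease length is an assumption based on the wording of that message, and the thread doesn't show whether clients were granted the default or the max lease, so both are computed.

```python
DEFAULT_LEASE_S = 2 * 60 * 60  # "default is two hours"
MAX_LEASE_S = 4 * 60 * 60      # "max lease time is set to four hours"


def renewal_timers(lease_seconds):
    """RFC 2131 default client timers: T1 (renew) at 50%, T2 (rebind) at 87.5%."""
    return {"T1": lease_seconds * 0.5, "T2": lease_seconds * 0.875}


def cache_threshold_seconds(lease_seconds, threshold_percent=25):
    """Lease age below which the server would reuse the lease (assumed meaning of the log's 25%)."""
    return lease_seconds * threshold_percent / 100


def is_early_request(lease_age_seconds, lease_seconds):
    """True when a request arrives before the client's normal T1 renewal time."""
    return lease_age_seconds < renewal_timers(lease_seconds)["T1"]


if __name__ == "__main__":
    for label, lease in (("default", DEFAULT_LEASE_S), ("max", MAX_LEASE_S)):
        timers = renewal_timers(lease)
        print(
            f"{label}: lease={lease}s T1={timers['T1']:.0f}s T2={timers['T2']:.0f}s "
            f"25% threshold={cache_threshold_seconds(lease):.0f}s"
        )
    # Lease ages reported in the log excerpt earlier in the thread.
    for age in (265, 299):
        print(f"age {age}s before T1 (default lease)?", is_early_request(age, DEFAULT_LEASE_S))
```

Under those assumptions a client on a two-hour lease would not normally renew until the one-hour mark, so requests at a lease age of 265 s and 299 s are far earlier than a routine renewal for either lease length.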

I do see the ARP requests on the bridge, and they are being replied to... is it suspicious that the replies are addressed to the MAC address and not the IP? They look like:

2.205907 fa:4c:d1:56:48:d3 Broadcast ARP 64 Who has 192.168.1.1? Tell 192.168.1.115
2.205912 FreeBSDFound_10:ad:0b fa:4c:d1:56:48:d3 ARP 46 192.168.1.1 is at 58:9c:fc:10:ad:0b
2.487226 192.168.1.2 255.255.255.255 UDP 843 41450 → 29810 Len=801
2.650812 Apple_31:ed:b2 Broadcast ARP 64 Who has 192.168.1.1? Tell 192.168.1.9
2.650818 FreeBSDFound_10:ad:0b Apple_31:ed:b2 ARP 46 192.168.1.1 is at 58:9c:fc:10:ad:0b
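On the MAC-vs-IP question: ARP rides directly on Ethernet, so a reply is a unicast frame to the requester's hardware address and the IP only appears inside the ARP payload. A destination column showing a MAC is therefore what a normal reply looks like. A small sketch for pairing requests with replies in a capture summary like the one above follows; it assumes the Wireshark-style column layout shown there, and the regexes and function name are my own.

```python
import re

# "2.205907 <src> Broadcast ARP 64 Who has <target>? Tell <asker>"
REQUEST = re.compile(
    r"^(?P<t>\d+\.\d+) (?P<src>\S+) Broadcast ARP \d+ Who has (?P<target>\S+)\? Tell (?P<asker>\S+)$"
)
# "2.205912 <src> <dst> ARP 46 <ip> is at <mac>"
REPLY = re.compile(
    r"^(?P<t>\d+\.\d+) (?P<src>\S+) (?P<dst>\S+) ARP \d+ (?P<ip>\S+) is at (?P<mac>\S+)$"
)


def pair_arp(lines):
    """Return (asker_ip, target_ip, answered_mac, delay_in_microseconds) per request.

    Requests that never see a matching reply come back with None for the last two fields.
    """
    pending = {}  # (requester layer-2 address, target IP) -> (asker IP, request time)
    answered = []
    for raw in lines:
        line = raw.strip()
        req = REQUEST.match(line)
        if req:
            pending[(req["src"], req["target"])] = (req["asker"], float(req["t"]))
            continue
        rep = REPLY.match(line)
        if rep and (rep["dst"], rep["ip"]) in pending:
            asker, t0 = pending.pop((rep["dst"], rep["ip"]))
            delay_us = round((float(rep["t"]) - t0) * 1_000_000)
            answered.append((asker, rep["ip"], rep["mac"], delay_us))
    unanswered = [(asker, target, None, None) for (_, target), (asker, _) in pending.items()]
    return answered + unanswered


# The ARP lines from the capture above.
SAMPLE_CAPTURE = [
    "2.205907 fa:4c:d1:56:48:d3 Broadcast ARP 64 Who has 192.168.1.1? Tell 192.168.1.115",
    "2.205912 FreeBSDFound_10:ad:0b fa:4c:d1:56:48:d3 ARP 46 192.168.1.1 is at 58:9c:fc:10:ad:0b",
    "2.650812 Apple_31:ed:b2 Broadcast ARP 64 Who has 192.168.1.1? Tell 192.168.1.9",
    "2.650818 FreeBSDFound_10:ad:0b Apple_31:ed:b2 ARP 46 192.168.1.1 is at 58:9c:fc:10:ad:0b",
]

if __name__ == "__main__":
    for asker, target, mac, delay_us in pair_arp(SAMPLE_CAPTURE):
        status = f"answered with {mac} after {delay_us} us" if mac else "NO REPLY"
        print(f"{asker} asked for {target}: {status}")
```

Running the same pairing on a capture taken on the client side would show whether those replies ever make it back through the AP.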

Also, to make this additionally fun, my laptop is now connecting fine via the AP, although other devices are not. This happened without me making any changes to firewall settings (the only thing I can think of that I did was establish a VNC connection to another machine on the LAN).

Quote from: pyroclasticglow on March 16, 2025, 09:31:28 PM
Something else I just noticed: in my DHCP logs I'm seeing a lot of redundant requests from the devices that I'm having trouble with, like:

2025-03-16T16:27:19-04:00 Informational dhcpd DHCPACK on 192.168.1.115 to fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:27:19-04:00 Informational dhcpd DHCPREQUEST for 192.168.1.115 from fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:27:19-04:00 Debug dhcpd reuse_lease: lease age 299 (secs) under 25% threshold, reply with unaltered, existing lease for 192.168.1.115
2025-03-16T16:27:19-04:00 Informational dhcpd DHCPACK on 192.168.1.115 to fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:27:19-04:00 Informational dhcpd DHCPREQUEST for 192.168.1.115 from fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:27:19-04:00 Debug dhcpd reuse_lease: lease age 299 (secs) under 25% threshold, reply with unaltered, existing lease for 192.168.1.115
2025-03-16T16:26:45-04:00 Informational dhcpd DHCPACK on 192.168.1.115 to fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:26:45-04:00 Informational dhcpd DHCPREQUEST for 192.168.1.115 from fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:26:45-04:00 Debug dhcpd reuse_lease: lease age 265 (secs) under 25% threshold, reply with unaltered, existing lease for 192.168.1.115
2025-03-16T16:26:45-04:00 Informational dhcpd DHCPACK on 192.168.1.115 to fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:26:45-04:00 Informational dhcpd DHCPREQUEST for 192.168.1.115 from fa:4c:d1:56:48:d3 (iPhone) via bridge0
2025-03-16T16:26:45-04:00 Debug dhcpd reuse_lease: lease age 265 (secs) under 25% threshold, reply with unaltered, existing lease for 192.168.1.115

and that seems weird?

That might be just a TP-Link thing. My TP-Link APs (consumer grade, not Omada) are doing it as well. I suspect they are (ab)using DHCP as some kind of connectivity probe. I find it quite annoying.

Those DHCP messages basically mean that the host (identified by its MAC) wants to extend the lease for that specific IP.

What is your lease time set for DHCP?
Does it happen for all devices?

Regards,
S.

- DHCP max lease time is 4 hours.
- This is happening to all devices connected via the AP.

My current plan is to grab a switch tonight and eliminate the bridge, and see if that helps, and if it doesn't I'll try out a different AP.

So I installed a new switch last night and deleted the bridge and... everything started working again.

I'm not 100% ready to trust again, but this does feel different than the other intermediate moments of functionality I've had previously; I'll update this post if it goes wonky again.