I upgraded from a Netgear Orbi RBR850 to opnsense. I'm running 25.7.5 with the following plugins:
- IGMP Proxy
- mDNS Repeater
- UDP Broadcast Relay
- Universal Plug and Play
In an effort to incrementally introduce complexity whilst focusing on stabilising my configuration, I am trying to replicate my previous network setup.
I have two interfaces:
- WAN - connects to a fibre ONT
- LAN - connects to my unmanaged switch
My LAN contains a few servers with static IPs including my services server which hosts Home Assistant. Prior to moving to opnsense, Home Assistant was working with no issues. Since the upgrade the service is restarting with the same errors:
WARNING (MainThread) [zigpy.application] Watchdog failure
WARNING (MainThread) [zigpy.backups] Failed to create a network backup
ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved (None)
WARNING (MainThread) [bellows.thread] Attempted to use a closed event loop
ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [139864590888416] Unexpected exception
WARNING (MainThread) [py.warnings] /usr/local/lib/python3.13/asyncio/base_events.py:2035: RuntimeWarning: coroutine 'ClusterHandler.async_initialize' was never awaited
WARNING (MainThread) [py.warnings] /usr/local/lib/python3.13/asyncio/base_events.py:2051: RuntimeWarning: coroutine 'Device.async_initialize' was never awaited
ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [139864590888416] Error during service call to light.turn_on: Failed to send request: ApplicationController is not running
WARNING (MainThread) [homeassistant.components.mqtt.client] Error returned from MQTT server: The connection was lost.
ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [139864590888416] Error during service call to light.turn_on: Failed to send request: ApplicationController is not running
WARNING (MainThread) [zha.decorators] [<Task pending name='sensor_state_poller_00:0d:6f:00:05:42:89:88-1-2820_PolledElectricalMeasurement' coro=<periodic.<locals>.scheduler.<locals>.wrapper() running at /usr/local/lib/python3.13/site-packages/zha/decorators.py:92> cb=[set.remove()]>] Failed to poll using method [zha.application.platforms.sensor::PollableSensor._refresh]
These appear to indicate that the service can't see the mqtt server or the websocket, which is causing other components to fail.
I'd like to troubleshoot this but I'm not sure the best way to start. I've started Packet Capture on the server ip and port for Home Assistant but nothing seems to be included.
OpnSense is not involved in the intra-LAN communication at all. Unless you use separated VLANs, that is.
What seems to happen here is that a connection to your MQTT server cannot be established. More often than not, the MQTT server has to be specified in configurations. If you have HomeAssistant configured via DHCP and now its IP is different, you will have to reconfigure the MQTT clients.
Most people with HA get an MQTT server by installing the Mosquitto add-on. Be sure it's started and visible, make sure you can ping from the HA instance to the Mosquitto IP (it should be the same, but check that it pings), and see if the port is up. As mentioned, if it's connected via a static IP, make sure that didn't change.
MQTT Explorer can be helpful as you can connect independently to the MQTT server to make sure it's up.
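If you'd rather check from a shell than from a GUI, something like the sketch below does a similar job. It assumes the mosquitto-clients package (which provides mosquitto_sub) is installed on the machine you test from, and that the broker publishes the usual Mosquitto $SYS topics. check_broker is just a name made up for this example. Add -u and -P to the mosquitto_sub call if your broker requires credentials.

```shell
# Ask the broker for its retained version topic and exit after one message.
# check_broker is a hypothetical helper name; mosquitto_sub comes from mosquitto-clients.
check_broker() {
    host=$1
    port=${2:-1883}
    if ! command -v mosquitto_sub >/dev/null 2>&1; then
        echo "mosquitto_sub not installed" >&2
        return 2
    fi
    # -C 1: stop after one message, -W 5: give up after 5 seconds
    mosquitto_sub -h "$host" -p "$port" -t '$SYS/broker/version' -C 1 -W 5
}

# Usage: check_broker <broker-ip> 1883
```

A version string coming back means the broker accepted a full MQTT session, which is a stronger check than an open TCP port. An error message, or no output within the timeout, points at the broker side.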
If the MQTT service and HA are in the same VM or container and if said "thing" changed its IP address by the switch of the firewall, that might explain it.
Combining @meyergru's and @Linwood's comments 🙂
Prior to migrating from my Netgear router to opnsense, my configuration was solid and tested. Here's how it is set up:
- 192.168.1.93:1883 - MQTT
- 192.168.1.93:5000 - MQTT UI
- 192.168.1.93:8123 - Home Assistant
- 192.168.1.93:1880 - Node-Red
Home Assistant has an integration to MQTT using 192.168.1.93:1883
It's been stable for over 3 years. Shifting to opnsense has caused the errors in Node-Red and Home Assistant. The ip addresses have not changed. MQTT UI shows traffic from zwave-js-ui, frigate, teslamate, doubletake, and homeassistant.
Any suggestions on how I could troubleshoot the issue using the diagnostic tools in opnsense?
services on same ip but different port. OK.
So have you verified the same server definitively has the same address and can be reached? I'm thinking along the lines of the dhcp server setup on OPN. Usual dig / ping / telnet (to open session) would be a start.
Quote from: instantdreams on October 13, 2025, 05:17:23 PM
Any suggestions on how I could troubleshoot the issue using the diagnostic tools in opnsense?
At first glance, data between HA, NR and Mosquitto would appear to be all local, on-net (i.e. same subnet, same VLAN) and so does not even pass through OPNsense.
@cookiemonster's question is good, but unless I have just missed a clue, everything you are saying implies OPNsense is not involved. Can you go through your setup and think about it and what you changed and share any theory of even how OPNsense plays any role in the setup described?
For example, you have IP addresses -- is ANYTHING using DNS names, and maybe OPNsense is now the DNS server and different?
Is there any chance that something changed on the host around the same time -- is this HAOS? Or if it's housed in your own Linux box, did something like AppArmor change?
Please don't take this wrongly, but so far it's kind of like saying "I turned on the back yard light and my toilet overflowed, what's wrong with my light". :)
You have to find the logical connection between the two, then I think people can help debug what's wrong with that aspect.
Yes I get the local-only traffic and therein lies the question. Verify it still goes and gets where it is -supposed- to be.
My thinking is the OP has his host with a static IP on the old router. Now he has OPN as the new router, and although traffic does not go through it, the host(s) still require the new router to dish out their IP addresses, whether static reservations or dynamic.
With the recent ISC DHCP to dnsmasq transition, it might not yet be set up correctly to have DHCP reservations. Hence I am suggesting checking that basic first.
@cookiemonster Services on the same server but different port is relatively standard and I am confirming that the same ip address and ports are accessible from the network prior to moving to opnsense and after stabilising with opnsense.
@Linwood I made sure everything uses ip addresses, but I will check that to confirm. Everything is in a docker container, and exposed via host ip and port. I appreciate your thought process, when you are inside a bug it's hard to step back.
I have 5 servers running Debian, each of which uses a static IP:
- edge1 - 192.168.1.91
- edge2 - 192.168.1.92
- services - 192.168.1.93
- security - 192.168.1.94
- media - 192.168.1.95
The routers (Netgear or opnsense) have never needed to issue an ip address, and do recognise the servers on the network.
opnsense is using unbound and dnsmasq for DNS and DHCP. I've tried to change this to using my two pi-holes but it fails each time, that's a separate issue.
opnsense is configured to:
- use domain home.arpa
- have no specified dns servers
- serve the web gui on port 8443
Unbound is configured to:
- Be Enabled
- Override example.com to 192.168.1.91 where traefik will reverse proxy requests
dnsmasq is configured to:
- Be Enabled
- Listen on LAN
- DHCP FQDN
- DHCP local domain
- DHCP register firewall rules
- Register the 5 servers under Hosts
- Serve IP addresses in the range 192.168.1.100 to 192.168.1.245
Here's how dig resolves for the hostnames:
$ docker exec homeassistant dig +noall +answer homeassistant.example.com
homeassistant.example.com. 3600 IN A 192.168.1.91
$ docker exec homeassistant dig +noall +answer mqtt.example.com
mqtt.example.com. 3600 IN A 192.168.1.91
$ dig +noall +answer homeassistant.example.com
homeassistant.example.com. 3600 IN A 192.168.1.91
$ dig +noall +answer mqtt.example.com
mqtt.example.com. 3600 IN A 192.168.1.91
Here's how nc feels about the ports:
$ nc -vz 192.168.1.93 1883
services.example.com [192.168.1.93] 1883 (?) open
$ nc -vz 192.168.1.93 8123
services.example.com [192.168.1.93] 8123 (?) open
$ nc -vz 192.168.1.93 5000
services.example.com [192.168.1.93] 5000 (?) open
$ nc -vz 192.168.1.93 1880
services.example.com [192.168.1.93] 1880 (?) open
$ docker exec homeassistant nc -vz 192.168.1.93 1883
192.168.1.93 (192.168.1.93:1883) open
$ docker exec homeassistant nc -vz 192.168.1.93 8123
192.168.1.93 (192.168.1.93:8123) open
$ docker exec homeassistant nc -vz 192.168.1.93 5000
192.168.1.93 (192.168.1.93:5000) open
$ docker exec homeassistant nc -vz 192.168.1.93 1880
192.168.1.93 (192.168.1.93:1880) open
This is a real puzzler.
It now sure looks like a DNS problem. If you configured the MQTT clients to use DNS names instead of IPs, that would explain it.
Also: why do all of the server names resolve to 192.168.1.91 but then you check for the open ports on another server (192.168.1.93)?
Keep in mind that resolution of local, unqualified names has its quirks, like if some clients add a search domain to names and others do not. Thus, a request for "mqtt" might result in "mqtt.example.com" on one machine, but just "mqtt" on another (and one might fail). Also, you seem to mix example.com and home.arpa within your network.
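To see whether two machines would expand a short name differently, you can compare their resolver search lists. A rough sketch, assuming Linux hosts with an /etc/resolv.conf-style file; search_domains is a made-up helper name and /tmp/ha-resolv.conf is an arbitrary scratch path:

```shell
# Print the domains a resolver would append to unqualified names,
# taken from the "search" or "domain" lines of a resolv.conf-style file.
search_domains() {
    awk '$1 == "search" || $1 == "domain" { for (i = 2; i <= NF; i++) print $i }' "${1:-/etc/resolv.conf}"
}

# On the host:           search_domains
# For the HA container:  docker exec homeassistant cat /etc/resolv.conf > /tmp/ha-resolv.conf
#                        search_domains /tmp/ha-resolv.conf
```

If one list contains example.com and the other home.arpa (or nothing), a bare name such as "mqtt" can resolve on one machine and fail on the other. Note that dig does not apply the search list unless you pass +search, whereas getent hosts goes through the system resolver.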
Quote:
@cookiemonster Services on the same server but different port is relatively standard and I am confirming that the same ip address and ports are accessible from the network prior to moving to opnsense and after stabilising with opnsense.
Yes it is pretty standard. I wasn't saying otherwise ;)
Network connectivity at IP level seems OK then. And it has been established that they are on the same network segment (and same host). DNS is another matter, so we might need to diagnose that. No routing required of course. You have now done the basic tests I was thinking of, so I'm leaning towards the application side now.
BTW if you have a flat network, may I ask why are you using those plugins which are normally to relay broadcast traffic between networks? Unrelated of course, just in case it shines some strange light.
I apologize for being dense but do you have an example of the actual problem occurring?
For example, if Home Assistant is supposed to connect to the MQTT server, do you have a log of that failing that shows HOW it connects? Like from the integration page, showing it uses IP (vs name)?
If MQTT connection is the problem AND you are using static IP, we can stop talking about DNS.
Alternatively, run MQTT Explorer and enter the IP and credentials, e.g. as below, and use explicit IP addresses from a PC on that same network, and see if it can connect. If it can't -- easy to debug. If it can, see what's different about HA.
I really suggest using explicit IP addresses and not names if this is all internal and on the same network, as that takes mDNS and DNS out of the picture.
(https://photos.smugmug.com/photos/i-2nRCGR3/0/L8mN3D4RC7MFtL5MFJQsNrShKrjnRkLpMk2P7S2rf/O/i-2nRCGR3.jpg)
@meyergru All the endpoints are using ip addresses to avoid any DNS confusion, which leads me to assume the issue is a firewall one rather than a DNS one. The hostnames resolve to 192.168.1.91 because of the override in Unbound DNS which forwards all requests for example.com to my Traefik reverse proxy. To be clear, all inter-machine communication uses ip addresses and ports. There are no FQDNs or hostnames.
@cookiemonster I am using all multicast DNS plugins because without them Sonos wasn't working. I plan to slowly remove them and test to confirm what I actually need. I currently have:
- A set of floating rules to allow SSDP, mDNS, GDM, Plex, Sonos, Spotify, and Windows Sharing
- IGMP Proxy between WAN and LAN for all internal ip ranges
- mDNS Repeater between WAN and LAN
- UDP Broadcast Relay for SSDP, mDNS, GDM, and Sonos
- Universal Plug and Play allowing all UPnP IGD and NAT-PMP mappings by default
My goal is to replicate what I had with my commercial router before hardening things.
@Linwood A very good question, let me define the behaviour I am experiencing and hopefully it'll add clarity.
Issue
Web connection to Home Assistant fails every 40-60 seconds. Logs indicate a websocket issue.
Host
Intel NUC7CJYHN 16GB RAM 500GB SSD
Services
Service | IP and Port | MQTT Topic | HA Connection | Log Issues | Purpose |
homeassistant | 192.168.1.93:8123 | homeassistant 192.168.1.93:1883 | n/a | websocket related | home automation hub |
mqtt | 192.168.1.93:1883 | n/a | any 192.168.1.93:1883 | none | message broker |
node-red | 192.168.1.93:1880 | 192.168.1.93:1883 | 192.168.1.93:8123 | none | visual automation engine |
zigbee2mqtt | 192.168.1.93:8321 | zigbee2mqtt 192.168.1.93:1883 | mqtt discovery | none | zigbee coordinator |
zwave-js-ui | 192.168.1.93:8091 | zwave 192.168.1.93:1883 | websocket 192.168.1.93:3000 | none | zwave coordinator |
All services are run in docker containers. All connections use host IP addresses and ports.
Behaviour
Since migrating from my Netgear router to opnsense, Home Assistant has been unstable. The UI will become unresponsive and restart multiple times.
Initial errors in the log files indicated timeouts, usually with the Zigbee Home Automation service.
homeassistant | 2025-10-13T14:15:59.106199110Z bellows.ash.NcpFailure: NcpResetCode.ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT
homeassistant | 2025-10-13T14:15:59.229743107Z 2025-10-13 08:15:59.184 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [140318389399968] Unexpected exception
homeassistant | 2025-10-13T14:15:59.231156264Z bellows.ash.NcpFailure: NcpResetCode.ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT
homeassistant | 2025-10-13T14:15:59.465955299Z 2025-10-13 08:15:59.464 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [140318389399968] Error during service call to light.turn_on: Failed to send request: ApplicationController is not running
homeassistant | 2025-10-13T14:17:52.757669048Z 2025-10-13 08:17:52.751 WARNING (MainThread) [homeassistant.components.media_player] Updating webostv media_player took longer than the scheduled update interval 0:00:10
homeassistant | 2025-10-13T14:17:52.769933176Z 2025-10-13 08:17:52.751 WARNING (MainThread) [homeassistant.helpers.entity] Update of media_player.lg_webos_tv_bedroom1 is taking over 10 seconds
homeassistant | 2025-10-13T14:17:52.919683463Z 2025-10-13 08:17:52.910 ERROR (MainThread) [homeassistant.components.tautulli] Error fetching tautulli data: Request timeout for 'http://192.168.1.95:8282/api/v2?apikey=[REDACTED_API_TOKEN]&cmd=get_home_stats'
homeassistant | 2025-10-13T14:17:52.937803502Z 2025-10-13 08:17:52.931 WARNING (MainThread) [zigpy.application] Watchdog failure
homeassistant | 2025-10-13T14:17:53.302102073Z 2025-10-13 08:17:53.301 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [140318389399968] Error during service call to light.turn_on: Failed to send request: ApplicationController is not running
Troubleshooting
In an effort to resolve this, the following steps were taken:
- Add Firewall Floating Rule for ports 8123 and 1883 - no obvious impact
- Migrate from Zigbee Home Automation (ZHA) to zigbee2mqtt - reduced errors in Home Assistant but did not change behaviour
- Change version of Home Assistant from latest to 2025.9 - no obvious impact
Current State
I am running Home Assistant 2025.9 in host networking mode. The web UI will work for 30-50 seconds then become unresponsive. The message "Connection lost. Reconnecting..." appears on the UI and 20-30 seconds later the page is responsive again.
But how can it be a firewall issue when the traffic is local on the LAN and never passes OpnSense?
I'm a network engineer. When I am out of ideas I pull the big guns. I.e. a packet trace.
Do a packet trace on OPNsense and try to find evidence for any of that traffic even leaving your HA host. If that evidence is found, then try to find out *why*.
Traffic from a host to its own IP address - even if that IP address is bound to a physical interface - is routed through the loopback IF. It should never be seen on the wire. So watch the wire. Proceed from what you find (or don't find).
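A quick way to check the loopback point on the Linux host itself, before capturing anything, is to ask the kernel how it would route to the address. The sketch below only classifies the first line of `ip route get` output; route_kind is a made-up helper name, and igc1 in the tcpdump comment is a hypothetical interface name, so substitute your own LAN interface.

```shell
# Classify the first line of `ip route get ADDR` output read from stdin.
# A line starting with "local" means the kernel delivers the traffic on this host
# (via loopback) rather than sending it out of a physical interface.
route_kind() {
    IFS= read -r line || :
    case "$line" in
        local\ *) echo "loopback" ;;
        "")       echo "unknown" ;;
        *)        echo "on-wire" ;;
    esac
}

# On the services host (Linux with iproute2):
#   ip route get 192.168.1.93 | route_kind
# On OPNsense, watching the LAN side (igc1 is a hypothetical interface name):
#   tcpdump -ni igc1 host 192.168.1.93 and tcp port 1883
```

If the host reports "loopback" and the capture on OPNsense stays empty for that filter, the firewall never sees the MQTT traffic and the search can move back to the host.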
Is there any way to put the old router back and remove opnsense?
If it still fails the same way you learn a lot. If it doesn't fail, the packet trace idea is perhaps the best place to go.
You mention running IGMP Proxy between WAN and LAN; you also talk about a reverse proxy. Is your access to Home Assistant somehow from the internet and not from your local LAN? What happens if the internet is down (say back with the old router) -- did anything break?
I thought the issue was you can't connect to MQTT -- did that start working?
My suggestion is not at odds with Patrick's but is a different dimension -- pick ONE thing that is a failure you think relates to opnsense, one single thing, and describe it fully, and see if you can get more details including packet traces. The errors shown look like zigbee related (when I search for bellows for example I get a lot of hits about zigbee). Are all your problems actually originating with zigbee? (The UI -- you mean the HA UI not the Zigbee2MQTT UI?)
Finally, a lot of what you are talking about looks like WAN related stuff. HA is mostly local (with some cloud integrations of course), and everything you've mentioned should be local. It would be very helpful if, in picking something to concentrate on, you pick something unrelated to the internet. Don't access HA from the internet, access it locally (you are, right? Not with something thru Nabu Casa or some proxy?).
But... pick one thing that fails and figure out what you can. MQTT (the service, not zigbee2mqtt) is pretty straightforward -- if HA can't talk to MQTT, do as I suggested and see if you can connect independently, then see what happens. But pick one thing. And for us to help, try to stick with one specific failure, not jump around. It will help a lot.
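One way to choose that single thing is to count which logger produces most of the warnings and errors. A minimal sketch, assuming the log line layout shown earlier in the thread, with the logger name in brackets right after "(MainThread)"; tally_loggers is a made-up helper name:

```shell
# Count log lines per logger, most frequent first.
# Reads Home Assistant log lines on stdin; lines without "(MainThread) [name]" are ignored.
tally_loggers() {
    sed -n 's/.*(MainThread) \[\([^]]*\)\].*/\1/p' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}

# Usage, with the container name used in this thread:
#   docker logs homeassistant 2>&1 | tally_loggers
```

If the top entries are zigpy, bellows or zha, that points at the Zigbee stack rather than anything on the network.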