Messages - Galaxy

#1
HomeAssistant is currently set to "host" (per some guide I followed years ago) and it communicates with a lot of other containers. Some of them are on "bridge" (sonarr/radarr), some are on "host" (NodeRed), and some are on a custom network used for external access through Swag. I don't think putting every container that talks with HomeAssistant in the same network is viable for my setup (or necessary, based off my past experience).

All of my other HA integrations are working fine except the UniFi one, which is the only one on the "bond0" network with its own IP. And even that one used to work fine until the recent updates.

This all seems to indicate that the issue has something to do with my Unraid server using multiple IP addresses. When the traffic is HOST-IP:PORT-X > HOST-IP:PORT-Y, it works. But as of the recent OPNsense updates, HOST-IP > "container with its own IP" doesn't work anymore.

So as far as I can tell the issue is with this "LAN U-Turn" through OPNsense - not something within the Unraid/Docker networking. I could be wrong though...

I am using a switch (USW Enterprise 8 PoE), but I think it's just a dumb switch - I've been assuming it has nothing to do with this, and that OPNsense is what is handling this with the "NAT Reflection" settings.
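
The working and failing paths described above (host IP with a published port versus a container with its own IP) can be compared with a plain TCP connect test run from the Unraid console or from inside the HomeAssistant container. This is only a minimal sketch: the function name is made up for illustration, and any addresses and ports passed to it have to come from the actual setup.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts and
        # "connection refused") when the target cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling it once with the server's own IP and a published port, and once with the IP assigned to the UniFi container, shows whether only the second path fails. A False result does not say where the packet was dropped, only that the TCP handshake did not complete.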
#2
After the past few updates my Unraid server loses connection to the internet. LAN still works fine, but I cannot ping 8.8.8.8 from the Unraid console. Weirdly, after a few hours (maybe 8-24) the problem fixes itself.

This last update, though, I didn't want to wait and started troubleshooting. After a lot of back and forth, ChatGPT suggested running the command: pfctl -F all -f /tmp/rules.debug

This fixed most of it! I could ping out from the Unraid console and my externally visible services were accessible again. But suddenly I was experiencing issues with certain Docker containers not seeing each other. My linuxserver/unifi-network-application container was down because it couldn't see the MongoDB container. After a bunch of messing around I finally tried changing the MongoDB network type from "Bridge" to "bond0" and gave it its own IP address. That fixed that issue... but I don't know why.

Then I realized my UniFi Network integration in HomeAssistant was down, and could not reach the unifi-network-application Docker container either. I've tried changing my NAT > Outbound settings from Automatic to Hybrid and creating a "LAN-net to any" rule. No dice. I tried toggling on "Reflection for port forwards" and that broke everything again. (I ran the same command again to fix it.)

I would really appreciate some help. Clearly something has changed with how OPNsense handles LAN-LAN NAT reflection but I cannot figure out what or how to adjust for it.

(Edit)

For what it's worth, here's ChatGPT's diagnosis after all of my back and forth with it:

Quote: What Is Going On With Your Network

The Initial Change: Docker containers on a bond0 (MACVLAN) network are intentionally isolated from the Unraid host and its internal Docker bridge network. This is why your UniFi container (bond0) could not see the MongoDB container (bridge). Your fix—moving MongoDB to bond0—was logical, but it created a new dependency.

The Core Problem Begins: Now, both HomeAssistant and UniFi are on the same bond0 network, but they act like two separate computers on your LAN. When they try to talk to each other, Unraid sends the traffic to your main gateway—OPNsense. Your network is now depending on your router to handle traffic that is moving between two applications on the same server.

The OPNsense Update: A recent update to OPNsense appears to have introduced a bug or a change in behavior. It is no longer correctly "reflecting" this LAN-to-LAN traffic. So, when traffic from HomeAssistant goes to OPNsense, OPNsense fails to route it back to the UniFi container, and the connection times out.

The Cascade Failure: When you enabled "Reflection for port forwards" and reset the states, you triggered this underlying bug in a catastrophic way. The firewall's rule-generation engine failed and did not correctly apply the outbound NAT rules, which is what broke your server's internet connection.

The Temporary Fix: The pfctl -F all -f /tmp/rules.debug command works because it bypasses the buggy rule-application engine. It manually flushes all the broken live rules and loads a fresh, correct set directly from the configuration file. This proves your saved configuration is correct, but the firewall's live state is not.
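
One part of the quoted diagnosis that can be checked directly is whether traffic between two addresses would be expected to go to the gateway at all: a host normally forwards a packet to its gateway only when the destination lies outside its own subnet. The sketch below uses Python's standard `ipaddress` module; the function name is hypothetical, and the subnet and addresses have to be taken from the real network.

```python
import ipaddress


def same_subnet(ip_a: str, ip_b: str, network: str) -> bool:
    """Return True if both addresses fall inside the given network."""
    # strict=False accepts a host address with a prefix, e.g. "192.168.1.5/24".
    net = ipaddress.ip_network(network, strict=False)
    return ipaddress.ip_address(ip_a) in net and ipaddress.ip_address(ip_b) in net
```

If the host IP and the container IP are in the same subnet, the traffic would usually be delivered directly on the LAN rather than through the router, so this is worth confirming before changing more NAT settings.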

#3
I should add, I'd like something small enough to fit on this little shelf. Pretty sure the VP6630 or Topton box above will fit here (although I'll probably have to move the turtle  :()

#4
I just got 5Gbps fiber installed and I'm searching for new hardware capable of handling it.

I'm currently running a Protectli FW4C and I love it, so my first inclination is to stick with Protectli... except the only 10GbE capable boxes they offer are their big 6-Port VP6600 Series. I'll go that route if I have to, but those seem overly big and expensive for my needs (I don't need 6 ports, I'm only gonna use 2-3). Then again I absolutely love the little speaker in my FW4C that lets me know when the system is shutting down or booted up; so I'm seriously considering dropping $700+ on a Protectli VP6630 because of that and the fact that I trust their build quality.

The only other option I've found is this Topton N305 MiniPC on AliExpress, but apparently the NIC in it is ancient and the build quality is suspect.

What do you guys recommend? What would you use to run OPNsense on 5Gb fiber?
#5
I won't pretend to understand how or why, but apparently this issue was somehow being caused by my Comcast rental modem. Why did rebooting OPNsense temporarily fix it? No idea, but that red herring cost me a shitload of wasted time.

I replaced the Comcast modem with a new MB8611 modem, and sure enough this goddamn API issue went away! Unfortunately the new modem also brought my upload speeds down from 100Mbps to about 2Mbps. 🤦‍♂️ I just can't win. Temporarily back on the Comcast modem since I'd rather have the API issue on some devices than garbage upload speed on everything. I'm just happy to finally know where the issue is coming from, even if I can't comprehend how a bridged modem could be causing a problem like this. Gonna return the MB8611 and get an Arris S33, hopefully that will do better with upload speed.
#6
Well after a dozen hours of troubleshooting I'm throwing in the towel. If you manage to figure anything out please post what you find here, I'll keep checking back.

If nobody figures out what's going on here I guess I'll just switch back to pfsense. I don't know what else to try and I'm getting almost 0 help from the community.
#7
I have AdGuardHome as a plugin and all my DNS routes to that through Unbound. Disabling the AdGuard service doesn't seem to change anything so I ruled that out as a cause. At first I thought this was a DNS issue too but now I'm not so sure.

Does everything work ok for you in the few minutes following an OPNsense reboot? That's one of the distinguishing features of this issue for me, and I feel like it's a big hint as to the cause.

Also I've noticed it's only a problem through the official apps. If I open the CoinGecko app it spins and spins then times out, but if I open a browser on that same device and go to coingecko.com it loads instantly. Is your experience the same?
#8
Found something in my firewall log that might shed some light on this. It's blocking some connections to my phone, including LAN connections between my server and phone. I'm only using the default block rules on the WAN interface. Can somebody please explain what's going on here?

When I click the rid hyperlink I just get a blank page, and I don't see how to look up by rulenr. How can I figure out which rule is doing this!!???

See attached pics. My phone is 192.168.1.150, and I have no special LAN block rules. Why the hell are connections to my phone being blocked?


[EDIT] Apparently the logs could be red herrings, and it may just be a dropped connection or something and not necessarily OPNsense actively blocking a connection. The fact that everything works fine for a few minutes following a reboot leads me to believe this isn't being caused by a rule (presumably those take effect instantly). It's like something is crashing after 5 minutes and certain types of traffic don't get routed anymore (specifically, API connections apparently).
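
To pin down when things stop working after a reboot, and whether name resolution or the connection itself fails first, a small logger can be left running on an affected machine. This is a rough sketch with made-up function names; the host to probe (for example coingecko.com), the interval, and the duration are all assumptions to adjust. Note that "ok" only means the TCP handshake completed, not that the app's API call would succeed.

```python
import socket
import time
from typing import Iterable, List, Optional, Tuple

Sample = Tuple[float, str]  # (timestamp in seconds, probe status)


def probe(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return "ok", "dns_fail", or "tcp_fail" for one connection attempt."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_fail"
    ip = infos[0][4][0]  # first resolved address
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return "ok"
    except OSError:
        return "tcp_fail"


def first_failure(samples: Iterable[Sample]) -> Optional[Tuple[float, str]]:
    """Return (seconds since the first sample, status) of the first non-ok probe."""
    start = None
    for ts, status in samples:
        if start is None:
            start = ts
        if status != "ok":
            return (ts - start, status)
    return None


def watch(host: str, interval: float = 15.0, duration: float = 900.0) -> List[Sample]:
    """Probe host every `interval` seconds for `duration` seconds and log results."""
    samples: List[Sample] = []
    begin = time.monotonic()
    while time.monotonic() - begin < duration:
        status = probe(host)
        now = time.monotonic()
        samples.append((now, status))
        print(f"{now - begin:7.1f}s  {host}  {status}")
        time.sleep(interval)
    return samples
```

Starting `watch("coingecko.com")` right after a reboot and passing the result to `first_failure` gives a rough time for when the problem begins and whether the first failure was DNS or TCP.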
#9
This seems similar to the issue I'm having:
https://forum.opnsense.org/index.php?topic=33559.0

Although most of mine aren't just slow, but time out entirely. All work fine on cellular.
#10
Many of my Android apps and HomeAssistant integrations have had no connection since the last update. The only reason I know it has something to do with OPNsense is that for about 5 minutes after a reboot the issue goes away entirely and everything works fine.

I have no connection in many Android apps, and every HomeAssistant integration that relies on the cloud has issues connecting. When I reboot OPNsense all of the above works for about 5 minutes, but then the issue inevitably returns.

So far I've had no issues from my main PC. The issue only seems to affect certain apps on my phone, and HomeAssistant on my server.

Any help would be appreciated. I'm at my wits end trying to troubleshoot this. Attached my DNS config since that's my best guess on where this issue is stemming from.