Messages - BoneStorm

#1
Hi,

192.168.100.x and 192.168.1.x differ only in the third octet and are easy to overlook; any chance you just have a typo confusing 1 with 100, or vice versa, somewhere?

The primary focus should be to fix the DNS setup so the NAS names no longer resolve to the NAS-only private network (.1).
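
For illustration only: if the goal is that clients resolve a NAS by name to its 192.168.100.x address, a host override in Services / Unbound DNS / Overrides does that; the raw Unbound equivalent would look roughly like the snippet below, where the hostname and the final octet are made-up placeholders.

    # unbound.conf style sketch; nas1.home.lan and .10 are hypothetical values
    server:
        local-data: "nas1.home.lan. IN A 192.168.100.10"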

In the unlikely event you cannot fix this, a workaround would be to teach OPNsense a route to these NAS IPs (a shell-level sketch follows the list below).

system / routes
* add 192.168.1.<NAS1-IP> via gateway 192.168.100.<NAS1-IP>
* add 192.168.1.<NAS2-IP> via gateway 192.168.100.<NAS2-IP>
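
For a quick test from the OPNsense shell before persisting these in the GUI, the FreeBSD equivalent would look roughly like this (same placeholders as above; routes added this way do not survive a reboot or config reload):

    route add -host 192.168.1.<NAS1-IP> 192.168.100.<NAS1-IP>
    route add -host 192.168.1.<NAS2-IP> 192.168.100.<NAS2-IP>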

This will allow OPNsense to act as a router and understand how to reach these NAS private-network IPs. But please be aware this can cause asymmetric routing: clients reaching NAS1 on 192.168.1.<NAS1> go via OPNsense, but NAS1's answers can go back directly, since NAS1 knows that clients from 192.168.100.0/24 are directly connected. The OPNsense firewall's connection tracking will not like this. If you really go down this path, you need to make the firewall stateless for this particular traffic.
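
A minimal sketch of what stateless means here, written as raw pf rules; the interface name igb1 and the exact networks are assumptions, and in the OPNsense GUI the equivalent is setting the rule's advanced "State Type" to none:

    # hypothetical LAN interface igb1; pass client-to-NAS traffic without state tracking
    pass in quick on igb1 from 192.168.100.0/24 to 192.168.1.0/24 no state
    pass out quick on igb1 from 192.168.100.0/24 to 192.168.1.0/24 no state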

Hope it helps anyway.

#2
Hi Matzke,

Quote from: Matzke on April 06, 2024, 02:49:23 PM
It's still present in OPNsense 24.1.5_3-amd64

It seems that it's enough to manually restart service "routing".
...

That makes me a bit more confident to give the upgrade another try in the next few days, once spare time permits. Thanks for that information.
#3
I found this thread while searching for my strange upgrade problem; it seems to be the only reference, but the signature fits. Please read below for a workaround.

I'm running a physical HA setup of OPNsense and upgraded from 23.7.12_5 to 24.1.2. I had just fixed my HA setup prior to the upgrade and tested it well, so I'm confident things broke during the upgrade itself.

My WAN runs with a fixed private VIP with CARP enabled, and the WAN default GW is ping-monitored. Right after the upgrade things were fine, so I moved forward and upgraded the other node too. Then after some minutes the misbehavior became visible (the shell checks I used are sketched after this list):

* DNS broke - no name resolution
* GW pings failed - declaring GW down
* tcpdump on WAN showed ICMP packets leaving OPNsense and being answered successfully by the remote side
* the OPNsense shell ping, however, reported timeouts
* same signature for DNS: queries left, but Unbound reported server failure
* existing connections (flows in the connection table) were successfully kept, and cached DNS records were still served, so it was not immediately obvious that things were going wrong
* tcpdump attached to pflog0 did not indicate any drops
* for troubleshooting I added permit ip any any statements to WAN ingress; no luck
* pfctl -d (disabling pf) made the OPNsense shell ping to the directly connected WAN default GW work instantly
* the issue persisted through multiple reboots, including with the other HA node held down artificially to reduce noise
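
For reference, the checks above were roughly of this form from the OPNsense shell; <wan-if> and <wan-gw> are placeholders since I'm not posting the real interface name and gateway here:

    tcpdump -ni <wan-if> icmp    # ICMP leaves on WAN and replies come back
    ping -c 3 <wan-gw>           # ...yet the shell ping itself times out
    tcpdump -nei pflog0          # nothing logged as dropped by pf
    pfctl -d                     # with pf disabled the same ping works instantly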

I tried to make sense of the pfctl rules summary in the web GUI to see where things went wrong, but could not pinpoint an issue there.
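
In case someone wants to compare rulesets between the two releases, the loaded rules and states can also be dumped from the shell with standard pfctl calls, which is easier to diff than the GUI summary:

    pfctl -sr     # show the loaded filter rules
    pfctl -vvsr   # verbose output including per-rule evaluation counters
    pfctl -ss     # show the current state table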

Workaround:
* I pulled the config backup from history (taken prior to the upgrade) on both nodes
* did a fresh install of the old 23.7 release (from an old stick I had around)
* loaded the config and restored the cluster

Hope this helps, either to confirm this is a real issue or to spread the word about a workaround which worked for me(tm).