Messages - db

#1
Update: I seem to have discovered what was going on. I had a wifi extender (TP-Link RE605X) that I was using to bridge an ethernet-only device into the wifi network, but it seems to have some crazy behavior of announcing itself as the owner of the IPs of basically every other device on the wifi network, including the gateway. For example:

12/18/2022 10:17:37 AM    Flip flop    A6-A2-F4-97-F2-4A takes other's IP: 10.0.1.1
12/18/2022 10:17:37 AM    Flip flop    A6-A2-F4-97-F2-4A takes other's IP: 10.0.102.1
12/18/2022 10:17:37 AM    Flip flop    A6-A2-F4-97-F2-4A takes other's IP: 10.0.0.1
12/18/2022 10:17:37 AM    Flip flop    A6-A2-F4-97-F2-4A takes other's IP: 10.0.3.1
12/18/2022 10:17:37 AM    Flip flop    A6-A2-F4-97-F2-4A takes other's IP: 10.0.4.1
12/18/2022 10:17:37 AM    Flip flop    A6-A2-F4-97-F2-4A takes other's IP: 10.0.10.1
etc...

Not sure why it's trying to proxy ARP for the entire world. It feels like some sort of bug, but I'm not sure; I've posted in their forums but no reply yet. I have a RE600X configured in exactly the same way and it doesn't do this. In any event, what would happen is that when I thought I was pinging the gateway, I was actually going through this extender... and so was the rest of the entire network, probably. Then I suppose the gateway would send a gratuitous ARP (GARP) again and I'd be good for a while, rinse/repeat.

I suppose this type of thing might be a reason why an arpwatch module for opnsense would be nice. There seems to be one for pfsense; does anyone know of an effort to do the same in opnsense?
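For anyone wanting to spot this sort of thing from a log dump, here is a rough Python sketch that groups the "Flip flop" lines above by MAC and flags any MAC claiming several different IPs. The function names and the threshold of 3 are my own assumptions, and the regex is only matched against the log format shown in this post, not against anything arpwatch itself documents.

```python
import re
from collections import defaultdict

# Matches lines shaped like the ones pasted above, e.g.
# "12/18/2022 10:17:37 AM    Flip flop    A6-A2-F4-97-F2-4A takes other's IP: 10.0.1.1"
LINE_RE = re.compile(
    r"Flip flop\s+(?P<mac>(?:[0-9A-Fa-f]{2}-){5}[0-9A-Fa-f]{2})"
    r" takes other's IP: (?P<ip>\S+)"
)

SAMPLE = [
    "12/18/2022 10:17:37 AM    Flip flop    A6-A2-F4-97-F2-4A takes other's IP: 10.0.1.1",
    "12/18/2022 10:17:37 AM    Flip flop    A6-A2-F4-97-F2-4A takes other's IP: 10.0.102.1",
    "12/18/2022 10:17:37 AM    Flip flop    A6-A2-F4-97-F2-4A takes other's IP: 10.0.0.1",
    "12/18/2022 10:17:37 AM    Flip flop    A6-A2-F4-97-F2-4A takes other's IP: 10.0.3.1",
    "12/18/2022 10:17:37 AM    Flip flop    A6-A2-F4-97-F2-4A takes other's IP: 10.0.4.1",
    "12/18/2022 10:17:37 AM    Flip flop    A6-A2-F4-97-F2-4A takes other's IP: 10.0.10.1",
]


def claimed_ips_by_mac(lines):
    """Map each MAC to the set of IPs it was logged as taking over."""
    claims = defaultdict(set)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            claims[match.group("mac").upper()].add(match.group("ip"))
    return claims


def suspicious_macs(lines, threshold=3):
    """Return MACs claiming at least `threshold` distinct IPs (threshold is arbitrary)."""
    return {
        mac: sorted(ips)
        for mac, ips in claimed_ips_by_mac(lines).items()
        if len(ips) >= threshold
    }


if __name__ == "__main__":
    for mac, ips in suspicious_macs(SAMPLE).items():
        print(mac, "claims", len(ips), "IPs:", ", ".join(ips))
```

A single device claiming the gateway plus a handful of other addresses in the same second is a pretty strong hint that something is answering ARP on behalf of hosts it shouldn't.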
#2
Interesting, I haven't looked at that. I did learn that this issue happens on multiple hosts, not just the one, and at the same times. (I just left a ping running every 5s and, sure enough, when I notice the issue on one host it's happening to _some_ of the others, but not all...).

Frustrated, I ran another cable along the top of the walls between the switches. I thought I might have resolved it, as the ping test was good all night, but I encountered the same issue this morning. I noticed the zyxel had broadcast storm control enabled. The logs don't seem to indicate that it was being enforced, but I turned it off anyway, because I have no idea what else to do.

Seeing as it's affecting multiple hosts now, however, I'm back to wondering if there might be something happening with opnsense here. I don't know what, but it's the only point of commonality that I haven't switched out. I can't log into the opnsense box directly from an affected host while this is happening, but I can ssh into a neighboring host which can then ssh into the opnsense box (?), and everything seems fine: 99% idle, no single core pegged or anything like that, tons of memory free. Pinging from opnsense to any non-affected host is fine.

The affected hosts effectively have no internet (as they can't seem to communicate with opnsense). Ping still 'works' between affected hosts and opnsense though it's quite a bit more latent than normal. Ping between other hosts and affected hosts seems normal.

Maybe something quirky is happening with the firewall? Something that would only occur on random occasions and then self-resolve? I know I'm grasping at straws here...
#3
No VLANs, it's a flat network. It's too far to plug the host directly into the zyxel.

Bad cable is something I'm starting to suspect, as the link between this switch and the zyxel is only 5GbE. Is it possible that a bad cable would cause issues that will only show up between two hosts? What's odd to me is that the bad ping is only between the host and the gateway, but the host can ping other hosts on the network without issue. Also, it will resolve itself eventually; in the last two captures I have, it resolved in ~20 min and ~45 min.
#4
Update, disabled ethernet on the host, connected via Wifi instead (so different adapter, driver, etc). Same behavior! I'm very confused. Maybe I was mistaken and this does somehow have something to do with opnsense? Completely odd to me that the high/bad ping is only between the host and opnsense, and nothing else. I seem to be able to fix it by momentarily unplugging the switch and plugging it back in. (On wifi, the host is connected to an AP which is connected to the same switch it was in before).

I've also tried swapping out the switch for another, same issue.
#5
Hi folks, I know this probably isn't an opnsense issue but thought there may be someone with ideas reading these forums anyway :).

I've been tearing my hair out with a difficult-to-diagnose network issue which manifests as intermittent high ping to the gateway (which is opnsense), only from one host. This will occur for seemingly random amounts of time, maybe only seconds, or maybe 20+ minutes, a few times every day. Packet captures on that host reveal nothing that seems very interesting, other than a lot of TCP retries/spurious retransmits etc., which I suspect are symptoms and not the cause. The host is unusable from a network standpoint when this happens (no pages load, etc).

What's odd is that when this happens, this same host can ping any other host on the network without issue, including others on the same switch. The opnsense box can also ping any other host just fine, and they can ping it. Pinging from the opnsense box to the affected host is also slow. Packet captures on the opnsense box reveal that it receives pings and replies immediately (~0.1ms), on the affected host the captures show pings and replies being far apart (10-250ms).
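To make the "when does it go bad" question easier to answer, something like the following Python sketch could pull round-trip times out of saved ping output and flag the slow replies. The sample lines are made up for illustration (192.0.2.1 is a documentation address, not my gateway), the 10 ms threshold is just the low end of the range I mentioned, and the regex assumes the common `time=... ms` format that most ping implementations print.

```python
import re

# Typical ping reply lines end in "time=0.412 ms"; some print "time<1 ms".
TIME_RE = re.compile(r"time[=<]([0-9.]+)\s*ms")

# Hypothetical sample output, not a real capture from the affected host.
SAMPLE = [
    "64 bytes from 192.0.2.1: icmp_seq=1 ttl=64 time=0.412 ms",
    "64 bytes from 192.0.2.1: icmp_seq=2 ttl=64 time=187 ms",
    "Request timeout for icmp_seq 3",
    "64 bytes from 192.0.2.1: icmp_seq=4 ttl=64 time=0.398 ms",
]


def rtts_ms(ping_lines):
    """Extract round-trip times in milliseconds, skipping lines without one."""
    rtts = []
    for line in ping_lines:
        match = TIME_RE.search(line)
        if match:
            rtts.append(float(match.group(1)))
    return rtts


def slow_replies(ping_lines, threshold_ms=10.0):
    """Return RTTs at or above the threshold (threshold is an assumption)."""
    return [rtt for rtt in rtts_ms(ping_lines) if rtt >= threshold_ms]


if __name__ == "__main__":
    all_rtts = rtts_ms(SAMPLE)
    print("replies:", len(all_rtts), "slow:", slow_replies(SAMPLE))
```

Running the same thing against ping logs from several hosts at once would show whether the slow periods line up in time.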

The opnsense box is connected to a zyxel switch, the affected host is connected to a mikrotik switch which is connected to that same zyxel switch. Nothing on either switch seems interesting either, no massive error counts or anything like that.

I'm led to believe the issue lies with the host itself and not the network, but I have no idea what... It's connected with an Asus XG-C100C adapter. I found updated drivers for it and installed them, but that hasn't fixed the problem.

Anyone have any ideas? Things I might try, other things to look for?
#6
I suppose a sledgehammer option would be to run another opnsense box in front of at least one of these, and have it NAT through 100.64.0.1 to something else so the first opnsense box sees a different IP... unless there might be some way to do that with virtual nics on one opnsense box but I can't come up with a way how.

I am sort of confused how the load balancing is working right now. I'd say maybe the opnsense stats are lying to me but if I navigate to any 'what is my ip' site I will see both external IPs.
#7
That's unfortunate, I'm not in control of the gateway IPs. The gateway group using both as Tier 1 does seem to 'work' however (I'm seen coming from both IPs).

I'm a bit confused when you say this isn't supported in FreeBSD, does this not do what I think it's doing?

route get 208.67.222.222
route to: dns.opendns.com
destination: dns.opendns.com
gateway: 100.64.0.1
interface: igb4

route change -net 208.67.222.222 -interface igb5

route get 208.67.222.222
route to: dns.opendns.com
destination: dns.umbrella.com
interface: igb5
#8
General Discussion / Multi-WAN Bug/Oversight?
May 16, 2022, 08:24:43 PM
Background:

I have a multi-wan setup with multiple ISPs, for both fault tolerance as well as to increase available bandwidth on my network.

I recently purchased a second service from one of those providers, and added it as another WAN. Both of these are Tier 1 in a gateway group, and most of my traffic is directed through this gateway group as a load balanced group with sticky connections.

I've only had this set up for a few days, and for the most part, it seems to be working (so far as load balancing goes), but there are some.. oddities, which I believe are due to both WAN gateways having the same gateway IP (due to them being from the same ISP).

Relevant Info:

WAN4 is cgnat and has a gateway address of 100.64.0.1

WAN5 is cgnat and has a gateway address of 100.64.0.1

I manually added monitor ips for each, because otherwise having them both use 100.64.0.1 seemed like it would really only be monitoring one of them (because of routing).

WAN4 has monitor IP of 1.0.0.1

WAN5 has monitor IP of 1.1.1.1

WAN4 is on interface IGB4

WAN5 is on interface IGB5

I have DNS servers assigned to each:

WAN4 should be 208.67.220.220

WAN5 should be 208.67.222.222

Both also have IPs assigned from the ISP:

WAN4 has IP 100.68.*.*

WAN5 has IP 100.120.*.*

Issues:

Mostly, routing seems to be based on gateway IP and not interface, so I have:

default, 100.64.0.1, UGS, igb4

1.0.0.1, 100.64.0.1, UGHS, igb4

(nothing for 1.1.1.1)

208.67.220.220, 100.64.0.1, UGHS, igb4

208.67.222.222, 100.64.0.1, UGHS, igb4

Notice the issues? Monitor IP for WAN5/IGB5 of 1.1.1.1 will be routed via the default gateway, so will actually be monitoring WAN4/IGB4. Also DNS will only be using WAN4/IGB4.
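To spell the mismatch out, here is a small Python sketch that compares the routes listed above against the interface each monitor/DNS IP is supposed to leave on. It is deliberately simplified: it only understands host routes plus the default route (no longest-prefix matching), and the `EXPECTED` mapping is just my intended setup from the "Relevant Info" list, not anything opnsense exports.

```python
# Routes as listed in this post: (destination, gateway, flags, interface).
ROUTES = [
    ("default", "100.64.0.1", "UGS", "igb4"),
    ("1.0.0.1", "100.64.0.1", "UGHS", "igb4"),
    ("208.67.220.220", "100.64.0.1", "UGHS", "igb4"),
    ("208.67.222.222", "100.64.0.1", "UGHS", "igb4"),
]

# Where each monitor/DNS IP is meant to go, per the WAN4/WAN5 settings above.
EXPECTED = {
    "1.0.0.1": "igb4",         # WAN4 monitor IP
    "1.1.1.1": "igb5",         # WAN5 monitor IP
    "208.67.220.220": "igb4",  # WAN4 DNS
    "208.67.222.222": "igb5",  # WAN5 DNS
}


def egress_interface(ip, routes):
    """Return the interface of a matching host route, else the default route's."""
    default_iface = None
    for dest, _gateway, _flags, iface in routes:
        if dest == ip:
            return iface
        if dest == "default":
            default_iface = iface
    return default_iface


def mismatches(expected, routes):
    """Map each IP to (wanted, actual) where the route table disagrees."""
    result = {}
    for ip, wanted in expected.items():
        actual = egress_interface(ip, routes)
        if actual != wanted:
            result[ip] = (wanted, actual)
    return result


if __name__ == "__main__":
    for ip, (wanted, actual) in mismatches(EXPECTED, ROUTES).items():
        print(f"{ip}: expected {wanted}, table says {actual}")
```

With the table above, this reports both WAN5 addresses (1.1.1.1 and 208.67.222.222) as leaving via igb4.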

I can manually edit the route table (using 'route') and make it all make sense, but something overwrites my changes after a minute or so.

Is this a bug or oversight? Am I doing something odd having two connections from the same ISP (and thus the same gateway IP)? It doesn't seem that odd.

Am I doing something wrong, or should I file this as a bug?
#9
Still hoping someone might have some insight here, a more direct question:

Has anyone ever set up opnsense with two independent WAN connections from the same provider? If so, did having the same gateway for both cause problems?
#10
(re-posting in General Discussion)

I intend to set up multi-WAN with two services from the same provider (for throughput, round robin). The provider is cgnat, so each interface will receive IPs in the same subnet and both will have the same gateway.

This leads me to believe I will have issues with routes. For one thing, it seems there is an autogenerated route for the gateway IP to a specific interface, and routes for any configured DNS and monitor IPs to that gateway IP. Obviously that won't work.

My question is, will opnsense just perform magic to make this work somehow? ;)

Or if not, do you think this is something I can resolve myself with static routes? Will I have to disable any automatic routes?

Thanks!
#11
I intend to set up multi-WAN with two services from the same provider (for throughput, round robin). The provider is cgnat, so each interface will receive IPs in the same subnet and both will have the same gateway.

This leads me to believe I will have issues with routes. For one thing, it seems there is an autogenerated route for the gateway IP to a specific interface, and routes for any configured DNS and monitor IPs to that gateway IP. Obviously that won't work.

My question is, will opnsense just perform magic to make this work somehow? ;)

Or if not, do you think this is something I can resolve myself with static routes? Will I have to disable any automatic routes?

Thanks!
#12
Thanks chbmb, I set this up on my system for the ps5 and it seems to work great. I'm not really sure what this is doing under the hood but, hey, it seems to do the trick :).

I do have a multi-wan setup, so for anyone looking to do the same, I just duplicated the rule, one for each WAN. Not sure if there might have been a more efficient way to do that.

Also, chbmb, if your alias was a screenshot from your actual setup, just thought I'd note for you that you seem to have a typo, alias content is 192.68.0.70-192.168.0.71 when you probably meant 192._1_68.0.70-192.168.0.71.
#13
Played with Sensei a bit and it seems like it _almost_ does this, but seems to only work for closed connections and doesn't take NAT into account. Unless maybe there's something there I'm missing? I can't be the only one who wants to see this sort of data, can I? ;)
#14
I'm wondering if something like this already exists and I'm missing it, or if not, if it's even possible (perhaps if only via CLI?).

What I'd like to see is a list of current connections, bandwidth use, etc, which includes the LAN client (IP and/or DNS name) all the way through to which external interface it's using in a multi-WAN set up (as in, taking NAT into account). Bonus for being able to see historic data for closed connections but I'm really mostly interested in quickly answering "what client is uploading on which particular WAN _right now_", etc.

I suppose the view I'm looking for might look something like this:

LAN Client IP | LAN Client Port | Firewall IP | Firewall Port | Destination IP | Destination Port | Protocol | WAN Interface | Bytes Up | Bytes Down | Mbps Up | Mbps Down

And if we were to have some possible historic data maybe also 'Start Timestamp | End Timestamp'.
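To make the idea a bit more concrete, here is a rough Python sketch of what one row of that view might look like as a data structure, plus the rate calculation and a "who is uploading on this WAN" query. All of the names here are hypothetical (nothing below is an existing opnsense API), and the sample addresses are documentation ranges rather than real clients.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConnectionRow:
    """One row of the wished-for view; field names mirror the columns above."""
    lan_client_ip: str
    lan_client_port: int
    firewall_ip: str
    firewall_port: int
    destination_ip: str
    destination_port: int
    protocol: str
    wan_interface: str
    bytes_up: int
    bytes_down: int
    start_timestamp: Optional[float] = None  # only for historic data
    end_timestamp: Optional[float] = None    # None while the connection is open


def mbps(byte_delta, seconds):
    """Convert a byte count over an interval into megabits per second."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    return byte_delta * 8 / seconds / 1_000_000


def top_uploaders(rows, wan_interface):
    """Rows using the given WAN interface, largest upload first."""
    matching = [row for row in rows if row.wan_interface == wan_interface]
    return sorted(matching, key=lambda row: row.bytes_up, reverse=True)


if __name__ == "__main__":
    rows = [
        ConnectionRow("192.0.2.10", 50000, "198.51.100.2", 40000,
                      "203.0.113.5", 443, "tcp", "igb4", 5_000_000, 100_000),
        ConnectionRow("192.0.2.11", 50001, "198.51.100.2", 40001,
                      "203.0.113.6", 443, "tcp", "igb4", 9_000_000, 200_000),
    ]
    for row in top_uploaders(rows, "igb4"):
        print(row.lan_client_ip, row.bytes_up)
```

The Mbps columns would come from sampling the byte counters twice and passing the difference and the elapsed time to `mbps`.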

Any thoughts?
#15
I was running into an odd issue with a balancing group in my multi wan setup, and disabling sticky connections seemed to fix it. I'm not sure if this is some sort of bug with sticky conns that I've happened to stumble across, or if it's just something unique about my set up (though I haven't changed too much from install).

I basically followed the recipe for setting up multi-wan. I have 3 WAN interfaces, WAN, OPT1 and OPT2. All seem to work fine. I created two gateway groups, one as failover (WAN - Tier 1, OPT1 - Tier 2, OPT2 - Tier 3). I created a rule on my LAN interface to drive all 'LAN net' traffic to the failover group, and this seemed to work fine, all traffic worked as expected, passing through WAN.

I also created a balancing group, with two GWs as T1 and one as T2 (WAN - Tier 1, OPT1 - Tier 1, OPT2 - Tier 2). When I switched the rule on my LAN interface to use this balancing group, odd things happened. What I would see is the firewall blocking traffic on the LAN interface which had the src IP of the public WAN interface (that is, these packets have a src of the ISP-provided public IP for the WAN interface). I'd get firewall logs like this:


lan Mar 22 13:21:19 <WAN1 ip>:37526 205.196.6.165:443 tcp Default deny rule
lan Mar 22 13:21:19 10.0.1.11:62687 205.196.6.165:443 tcp Default to Download Balancing GWG
lan Mar 22 13:21:19 <WAN1 ip>:40375 205.196.6.142:443 tcp Default deny rule
lan Mar 22 13:21:19 10.0.1.11:62686 205.196.6.142:443 tcp Default to Download Balancing GWG
lan Mar 22 13:21:18 <WAN1 ip>:18871 205.196.6.165:443 tcp Default deny rule
lan Mar 22 13:21:18 10.0.1.11:62685 205.196.6.165:443 tcp Default to Download Balancing GWG


Odd, no? Trying to curl anything would result in 1/2 of the attempts hanging.
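In case it helps anyone looking at similar logs, here is a rough Python sketch that parses the lines above and tallies, per destination, how many entries hit the balancing rule versus the default deny rule. The regex is only fitted to the log format as pasted in this post (including my `<WAN1 ip>` placeholder), so treat it as an assumption rather than the actual opnsense log format.

```python
import re
from collections import Counter

# Fitted to the lines pasted above: interface, timestamp, src:port, dst:port,
# protocol, then the rule label. The src may be the "<WAN1 ip>" placeholder.
LOG_RE = re.compile(
    r"^(?P<iface>\S+) (?P<when>\w{3} +\d+ [\d:]+) "
    r"(?P<src>.+?):(?P<sport>\d+) (?P<dst>[\d.]+):(?P<dport>\d+) "
    r"(?P<proto>\S+) (?P<label>.+)$"
)

SAMPLE = [
    "lan Mar 22 13:21:19 <WAN1 ip>:37526 205.196.6.165:443 tcp Default deny rule",
    "lan Mar 22 13:21:19 10.0.1.11:62687 205.196.6.165:443 tcp Default to Download Balancing GWG",
    "lan Mar 22 13:21:19 <WAN1 ip>:40375 205.196.6.142:443 tcp Default deny rule",
    "lan Mar 22 13:21:19 10.0.1.11:62686 205.196.6.142:443 tcp Default to Download Balancing GWG",
    "lan Mar 22 13:21:18 <WAN1 ip>:18871 205.196.6.165:443 tcp Default deny rule",
    "lan Mar 22 13:21:18 10.0.1.11:62685 205.196.6.165:443 tcp Default to Download Balancing GWG",
]


def parse(lines):
    """Return one dict per line that matches the expected layout."""
    return [m.groupdict() for m in map(LOG_RE.match, lines) if m]


def label_counts_by_destination(lines):
    """Count (destination IP, rule label) pairs."""
    counts = Counter()
    for entry in parse(lines):
        counts[(entry["dst"], entry["label"])] += 1
    return counts


def denied_sources(lines):
    """Source addresses seen on entries that hit the default deny rule."""
    return {e["src"] for e in parse(lines) if e["label"] == "Default deny rule"}


if __name__ == "__main__":
    for (dst, label), count in sorted(label_counts_by_destination(SAMPLE).items()):
        print(dst, label, count)
    print("denied sources:", denied_sources(SAMPLE))
```

In the sample, every passed connection from 10.0.1.11 has a matching denied entry to the same destination, and every denied entry has the WAN address as its source.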

Disabling sticky connections seems to make everything work fine, as expected. Curling swaps between the balanced WANs, and so far I haven't seen any adverse effects. But sticky seems like a good idea in general... Any thoughts on what I might follow up with, to see if there really is some oddity with my set up, or if this is just something broken with opnsense?