I'm at a loss trying to understand my issues and could do with some help please.
I am running OPNsense on Proxmox. I'm passing through a trunk with VLANs 10, 20, 30, 40, 50 and 60. I also have another Ethernet interface that I use for a PPPoE connection.
My internet works, seems reliable, and has no issues from my main VLANs of 30 (servers) and 50 (WiFi).
I created, for each VLAN, a firewall rule allowing access to everything. I just wanted it to work first; the plan was then to set up proper rules and disable the allow-all rule.
But I'm getting hung SSH, NFS and CIFS connections. Everything to the internet is fine, but inter-VLAN traffic just seems to work intermittently and badly.
Any advice or ideas on how to diagnose this, please?
Working ping but hanging bulkier traffic can be caused by MTU mismatches. Perhaps some inconsistent jumbo frames?
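A quick way to check is to ping across the VLANs with the don't-fragment bit set and the largest payload that still fits in a 1500-byte MTU (the target address is just an example; use a host in the other VLAN):
# FreeBSD / OPNsense shell: 1472 bytes of payload + 28 bytes of IP/ICMP headers = 1500
ping -D -s 1472 10.150.30.10
# Linux equivalent
ping -M do -s 1472 10.150.30.10
If the large pings fail while small ones get through, something on the path has a smaller MTU.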
Maybe I don't understand correctly, but
Quote
I am running OPNsense on Proxmox. I'm passing through a trunk with VLANs 10, 20, 30, 40, 50 and 60. I also have another Ethernet interface that I use for a PPPoE connection.
My internet works, seems reliable, and has no issues from my main VLANs of 30 (servers) and 50 (WiFi).
But then you show rules for VLAN50 as well as the live log.
So which one exactly has the problem here: VLAN50, or the other VLANs 10, 20, 40, 60?
Can you ping from a device in these VLANs/subnets to the gateways?
Do you have unique subnets for these VLANs on OPNsense?
Do you have the proper subnet mask configured on devices in these VLANs?
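For example, from a test client (just a sketch; I'm assuming the OPNsense VLAN interfaces sit on the .1 of each /24):
ip addr show            # Linux: confirm the client's address and /24 mask
ping -c 3 10.150.50.1   # the gateway of the client's own VLAN
ping -c 3 10.150.30.1   # the gateway of the VLAN you are trying to reach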
Regards,
S.
Quote from: bartjsmit on November 13, 2024, 09:14:03 AM
Working ping but hanging bulkier traffic can be caused by MTU mismatches. Perhaps some inconsistent jumbo frames?
All my MTUs should be 1500 or less. It's certainly less for the WireGuard and PPPoE interfaces; the rest are configured as 1500.
Quote from: Seimus on November 13, 2024, 11:08:58 AM
Maybe I don't understand correctly, but
Quote
I am running OPNsense on Proxmox. I'm passing through a trunk with VLANs 10, 20, 30, 40, 50 and 60. I also have another Ethernet interface that I use for a PPPoE connection.
My internet works, seems reliable, and has no issues from my main VLANs of 30 (servers) and 50 (WiFi).
But then you show rules for VLAN50 as well as the live log.
So which one exactly has the problem here: VLAN50, or the other VLANs 10, 20, 40, 60?
Can you ping from a device in these VLANs/subnets to the gateways?
Do you have unique subnets for these VLANs on OPNsense?
Do you have the proper subnet mask configured on devices in these VLANs?
Regards,
S.
I'm having issues with inter-VLAN traffic, going from 50 to 30. All traffic 30-to-30, which doesn't touch OPNsense, is fine. Surprisingly, web traffic from 50 to 30 seems to work fine too, but that may just be because the packet sizes are small, or because the session state differs from CIFS/NFS/SSH.
All my masks are /24; each VLAN is on a different subnet: 10.150.10.0/24 for VLAN 10, 10.150.30.0/24 for VLAN 30, etc.
All devices can ping both the gateways and the upstream servers.
All the devices can see each other; it seems that the session state is perhaps being reset by pf.
I've just tested it: if I turn the packet filter off, all my problems go away.
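In case it helps anyone repeating the test, the packet filter can be toggled from an OPNsense shell with pfctl (re-enable it straight after testing):
pfctl -d   # disable the packet filter
pfctl -e   # enable it again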
more information:
root@OPNsense:~ # pfctl -si
Status: Enabled for 0 days 02:58:23 Debug: Loud
Interface Stats for vtnet0_vlan30 IPv4 IPv6
Bytes In 78388748 0
Bytes Out 1494252352 0
Packets In
Passed 828852 0
Blocked 162 0
Packets Out
Passed 1374002 0
Blocked 0 0
State Table Total Rate
current entries 1768
searches 45657736 4265.9/s
inserts 653014 61.0/s
removals 651256 60.8/s
Counters
match 728116 68.0/s
bad-offset 0 0.0/s
fragment 0 0.0/s
short 0 0.0/s
normalize 4 0.0/s
memory 0 0.0/s
bad-timestamp 0 0.0/s
congestion 0 0.0/s
ip-option 0 0.0/s
proto-cksum 0 0.0/s
state-mismatch 37898 3.5/s
state-insert 10 0.0/s
state-limit 0 0.0/s
src-limit 0 0.0/s
synproxy 0 0.0/s
map-failed 0 0.0/s
You have this in there >
Quote
state-mismatch 37898 3.5/s
If a FW sees out-of-order TCP it will block it; TCP-based traffic can pass through a FW only after a handshake is established:
S > D: TCP S
D > S: TCP SA
S > D: TCP A
Check the live log. Create a filter with the specific source and destination you will test from and to. Then, if you see a blocked session appear, click the magnifying glass and check the TCP flags.
If there really is out-of-order TCP, it means your traffic is leaking somewhere or there is asymmetric routing.
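If you want to confirm it outside the GUI, a capture on both of the firewall's VLAN interfaces will show whether both halves of the handshake actually pass through OPNsense (the second interface name and the host addresses below are only examples):
tcpdump -ni vtnet0_vlan30 host 10.150.50.20 and host 10.150.30.10 and tcp port 22
tcpdump -ni vtnet0_vlan50 host 10.150.50.20 and host 10.150.30.10 and tcp port 22
If you see the SYN go out towards the server but the SYN/ACK never comes back through the firewall, the replies are taking another path.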
Regards,
S.
Quote from: Seimus on November 16, 2024, 03:01:07 AM
You have this in there >
Quote
state-mismatch 37898 3.5/s
If a FW sees out-of-order TCP it will block it; TCP-based traffic can pass through a FW only after a handshake is established:
S > D: TCP S
D > S: TCP SA
S > D: TCP A
Check the live log. Create a filter with the specific source and destination you will test from and to. Then, if you see a blocked session appear, click the magnifying glass and check the TCP flags.
If there really is out-of-order TCP, it means your traffic is leaking somewhere or there is asymmetric routing.
Regards,
S.
Thank you!!
I found that I had a server with a foot in both VLAN 30 and VLAN 50. Both had the same default gateway, but the server decided to reply out of the interface that was already in the destination VLAN rather than responding via the default gateway.
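For anyone hitting the same thing: on a Linux server with a leg in both VLANs you can see that choice directly (interface names and addresses here are made up):
ip route get 10.150.50.20
# 10.150.50.20 dev eth1 src 10.150.50.5
# i.e. replies to the VLAN 50 client leave straight out of the local VLAN 50 leg,
# bypassing OPNsense, so pf only ever sees half of the TCP conversation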
Much appreciated!
That's how it's supposed to work. A host will always prefer a locally connected interface over a static route. Don't connect hosts via more than one interface/network.
Quote from: Patrick M. Hausen on November 18, 2024, 01:36:30 PM
That's how it's supposed to work. A host will always prefer a locally connected interface over a static route. Don't connect hosts via more than one interface/network.
NM, didn't read enough of the posts ;)
Just a comment though: more specific routes are always preferred regardless of connection, but in this case the subnet masks were the same, and then distance/metric/connection become relevant.
Quote from: bimbar on November 18, 2024, 01:58:49 PM
Just a comment though: more specific routes are always preferred regardless of connection, but in this case the subnet masks were the same, and then distance/metric/connection become relevant.
Correct, I overgeneralised a bit, sorry. More specific routes will take precedence over locally connected ones.
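As a small made-up example of that ordering on a Linux host, a /32 static route wins over the connected /24, which in turn wins over the default route:
ip route
# default via 10.150.30.1 dev eth0
# 10.150.30.0/24 dev eth0 proto kernel scope link src 10.150.30.10
# 10.150.30.99 via 10.150.30.1 dev eth0
Traffic to 10.150.30.99 goes via the gateway even though the /24 is directly connected, because the /32 is more specific.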
In hindsight I should have known. I have come across problems like this before.
The issue was caused by the fact that I had implemented VLANs and wanted to maintain connectivity in case I broke a VLAN and needed to reach it from elsewhere in order to fix it. Not a dissimilar situation to having something visible on a management interface on one VLAN versus a service interface on another VLAN.
It is sometimes difficult to make the leap of understanding that a directly connected interface has a higher priority than the default gateway.
I apologise if you think this was a waste of people's time. It was not intended to be.
No worries,
a multi-homed setup is not unusual. You just need to make sure everything is configured well from a routing perspective.
As mentioned, in your case more specific routes will take precedence; if they are equal, then administrative distance plays a huge role. Directly connected has a better AD than a static route.
See >
Quote
Route Source Default Distance Values
Connected interface 0
Static route 1
Enhanced Interior Gateway Routing Protocol (EIGRP) summary route 5
External Border Gateway Protocol (BGP) 20
Internal EIGRP 90
Interior Gateway Routing Protocol (IGRP) 100
Open Shortest Path First (OSPF) 110
Intermediate System-to-Intermediate System (IS-IS) 115
Routing Information Protocol (RIP) 120
Exterior Gateway Protocol (EGP) 140
On Demand Routing (ODR) 160
External EIGRP 170
Internal BGP 200
Unknown* 255