weird / slow upload speed to PBR squid

Started by Lukas L., February 11, 2022, 01:39:16 PM

I have recently switched from 21.7.8 to 22.1, only to find out that upload via our PBRed Squid proxy is poor (just 10 Mbps per single server). We use the proxy for HTTP/HTTPS inspection and light HTTP caching, and for terminating ads and malware connections - so really no real MITM for users.

I have not changed my configuration - my rule only changes the gateway on matching packets. This worked fine for some time, as my download and upload speeds were about right for my ISP's speeds (1 Gbps). But after upgrading to 22.1 it dropped, and some web services are malfunctioning (e.g. uploading PDF forms to government servers times out, etc.).

My OPNsense runs on an ESXi 6.7U2 host with Intel I210 and I350 NICs using the igbn driver. OPNsense has its vNICs as vmxnet3. LANs are on a single vSwitch with ESXi-tagged VLANs; the WAN is on a separate vSwitch. All MTUs are set to 1500. All offloading in OPNsense is disabled. Zenarmor runs on all the LAN interfaces and Suricata on the WAN. I have one interface with VLANs in OPNsense - parent interface assigned and enabled. The only tunables were queue settings for vmx, but I dropped them recently. I know (from this forum as well) they have been replaced by the dev.vmx.0.iflib... keys anyway.
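For reference, the iflib replacements for the old vmx queue tunables are loader tunables, so they go into /boot/loader.conf and need a reboot. Something along these lines (the queue counts are just example values, not a recommendation):

```shell
# /boot/loader.conf - iflib-style vmx queue tunables on FreeBSD 13
# (example values; match them to your vCPU count and reboot)
dev.vmx.0.iflib.override_nrxqs="4"
dev.vmx.0.iflib.override_ntxqs="4"
```
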

I have tested all connections with iperf3 - client to firewall, client to proxy, firewall to proxy - with similar results, ~900 Mbps TCP. While testing UDP to the proxy everything went OK, but testing against OPNsense itself prints:

iperf3: OUT OF ORDER - incoming packet = 54887 and received packet = 0 AND SP = 277090
iperf3: OUT OF ORDER - incoming packet = 221983 and received packet = 0 AND SP = 277226
iperf3: OUT OF ORDER - incoming packet = 178360 and received packet = 0 AND SP = 277228
iperf3: OUT OF ORDER - incoming packet = 229939 and received packet = 0 AND SP = 281851
iperf3: OUT OF ORDER - incoming packet = 266111 and received packet = 0 AND SP = 282071
iperf3: OUT OF ORDER - incoming packet = 222533 and received packet = 0 AND SP = 282784
iperf3: OUT OF ORDER - incoming packet = 208849 and received packet = 0 AND SP = 282826
iperf3: OUT OF ORDER - incoming packet = 258064 and received packet = 0 AND SP = 283012
iperf3: OUT OF ORDER - incoming packet = 258554 and received packet = 0 AND SP = 283081
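For reference, the tests above were along these lines (addresses are examples from my setup):

```shell
# on the target (firewall or proxy):
iperf3 -s

# from the client: TCP baseline, then UDP near line rate
# (the UDP run is what triggers the OUT OF ORDER messages above)
iperf3 -c 10.0.0.1 -t 30
iperf3 -c 10.0.0.1 -u -b 900M -t 30
```
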


When I set up the client to use the proxy directly, everything works fine (930/930 Mbps in internet speed tests). But if I put it through OPNsense (i.e. transparent mode) I get 930 Mbps down / 8 Mbps up (8 Mbps to a single server, about 30 Mbps to multiple servers).

So I went haywire and looked everywhere - but as I have a test installation available on another server (actually upgraded to 22.1 as well), I was able to confirm the behaviour is the same there.

Later today I grabbed the 21.7 ISO, reinstalled it over a clone of my test installation VM, imported the config and tested it. Now my incoming and outgoing rates on the proxy easily reach 900 Mbps.

So I think I have ruled out my infrastructure, cables, VLANs, the proxy installation and its settings (Debian 10), and the firewall rules. I understand OPNsense underwent a major OS version change - but the same configuration is not working well anymore.

What settings should I look at? Am I missing something? I'm not really used to FreeBSD systems, but I did not find anything useful - actually not even here...

I'm going to let my test VM clone keep running 21.7 (I must remember to take snapshots/backups before OPNsense upgrades - this is not the first time I got burned a little) and run my main rig on 22.1, in case somebody helps me rule it out...

THX



1) data without policy-based route - full speed (outgoing from the router)
2) data sent directly to the proxy via browser/system setting - full speed
3) data targeted to the proxy GW - download full speed, upload 10 Mbps

Is the gateway in 3) the same as in 1), or a different one?

No - the proxy has its own default gateway; it's not sending data back to OPNsense to route out to the internet. What's weirder is that with 21.7.8 it works fast - I've had no reason to question my PBR rule so far.

I still have no idea about the cause; I'm just trying to understand what the case could be.
So when you disable PBR and the proxy traffic goes out your standard default gateway, it's fast.

Any chance to change the default gateway to the PBR line and test?

February 17, 2022, 02:48:14 PM #7 Last Edit: February 17, 2022, 02:49:50 PM by Lukas L.
If I disable the PBR rule, data goes to the OPNsense default gateway and it's fast up and down. If I set the proxy in my browser, data goes fast as well. But when I use the PBR rule - data should (should - but see the test below) go to the proxy and then out via its default gateway - download speed is OK, but upload is only 10 Mbps.
And it all started with the 22.1 upgrade. I left my other OPNsense instance upgraded to 22.1 and reinstalled my main rig to 21.7.8, and my speeds are OK.

In fact, I just did this test - client with OPNsense as its default gateway (I'm watching traffic with iptraf-ng on the proxy):
OPNsense 21.7.8 - I can confirm hundreds of Mbps (e.g. 914/920 Mbps) up and down going through the proxy interface (10k pps both ways).

OPNsense 22.1_1 - I can read hundreds of Mbps out of the proxy (10k pps both ways), and then a 40 Mbps multi-stream upload (rychlost.cz) on the client, but only <500 kbps (about 50 pps) into the proxy (i.e. the client's output).

So with OPNsense 22.1, the upload data seems to go somewhere else - but why, and how?

Firewall : Settings : Advanced : Shared Forwarding - can you uncheck it and try again?

It's unchecked in both versions - needless to say, the proxy is not running inside OPNsense; it's a standalone installation in its own VM. OPNsense has only one WAN.

Oh, so it could be anything that came with FreeBSD 13, or hypervisor compatibility :/
Hm, I cannot replicate such a setup here, sorry.

March 02, 2022, 11:54:11 AM #11 Last Edit: March 02, 2022, 11:59:00 AM by Lukas L.
I believe I've found something. I turned on "Log packets matched from the default pass rules put in the ruleset" on both installations - so I hope I have the same settings.

On 21.7 (the fast one) I have this in the live firewall log:

vmx_2 Mar 2 11:38:39 10.0.0.20:52258 81.0.212.201:443 tcp PROXY redirect
vmx_2 Mar 2 11:38:39 10.0.0.20:52257 88.86.101.2:443 tcp PROXY redirect
vmx_2 Mar 2 11:38:39 10.0.0.20:52256 81.0.212.201:443 tcp PROXY redirect
vmx_2 Mar 2 11:38:39 10.0.0.20:52255 52.210.251.145:443 tcp PROXY redirect
vmx_2 Mar 2 11:38:39 10.0.0.20:52254 52.210.251.145:443 tcp PROXY redirect
vmx_2 Mar 2 11:38:39 10.0.0.20:52253 77.78.109.86:443 tcp PROXY redirect
vmx_2 Mar 2 11:38:39 10.0.0.20:52252 77.78.109.86:443 tcp PROXY redirect
vmx_2 Mar 2 11:38:35 10.0.0.20:52251 142.251.36.67:443 tcp PROXY redirect
vmx_2 Mar 2 11:38:26 10.0.0.20:52250 52.210.251.145:443 tcp PROXY redirect


but on OPNsense 22.1.2:

vmx_2 2022-03-02T11:31:51 10.0.0.20:52049 81.0.212.201:443 tcp let out anything from firewall host itself
vmx_2 2022-03-02T11:31:51 10.0.0.20:52049 81.0.212.201:443 tcp PROXY redirect
vmx_2 2022-03-02T11:31:51 10.0.0.20:52048 88.86.101.2:443 tcp let out anything from firewall host itself
vmx_2 2022-03-02T11:31:51 10.0.0.20:52048 88.86.101.2:443 tcp PROXY redirect
vmx_2 2022-03-02T11:31:51 10.0.0.20:52047 18.203.13.148:443 tcp let out anything from firewall host itself
vmx_2 2022-03-02T11:31:51 10.0.0.20:52047 18.203.13.148:443 tcp PROXY redirect
vmx_2 2022-03-02T11:31:51 10.0.0.20:52046 18.203.13.148:443 tcp let out anything from firewall host itself
vmx_2 2022-03-02T11:31:51 10.0.0.20:52046 18.203.13.148:443 tcp PROXY redirect
vmx_2 2022-03-02T11:31:51 10.0.0.20:52045 77.78.109.86:443 tcp let out anything from firewall host itself
vmx_2 2022-03-02T11:31:51 10.0.0.20:52045 77.78.109.86:443 tcp PROXY redirect
vmx_2 2022-03-02T11:31:51 10.0.0.20:52044 77.78.109.86:443 tcp let out anything from firewall host itself
vmx_2 2022-03-02T11:31:51 10.0.0.20:52044 77.78.109.86:443 tcp PROXY redirect


The PBR rule has "quick" ticked and sloppy state set.
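For context, the generated rule should roughly correspond to pf syntax like this (interface name, proxy address and source net are illustrative examples from my setup, not the literal text OPNsense generates):

```
# approximate pf equivalent of the "quick" PBR rule with sloppy state:
# route-to sends matching LAN web traffic to the proxy instead of the default route
pass in quick on vmx2 route-to (vmx2 10.0.0.250) inet proto tcp \
    from 10.0.0.0/24 to any port { 80 443 } keep state (sloppy)
```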

I think the firewall sends the data to the proxy and also to (default) routing - for example:
one logged packet to the proxy: seq 875304436
next line: seq 875304436 (let out anything from firewall host itself)

I don't see any "let out anything from firewall host itself" entries in the live log view on the 21.7.x installation.

Am I missing something on 22.x? As I wrote, I have two installations as VMs - I recently upgraded both, but after these difficulties I reverted my main rig to 21.7 and used the other one as a guinea pig for 22.1.

I've wiresharked both installations from my test client and I see many TCP out-of-order and TCP duplicate packets (mainly ACKs, but also some PSHs). I tried to analyze the L2 headers to see which host sends the data - it seems the request goes to the configured default GW MAC and the data returns from the proxy. Without the PBR rules, the request goes to the default GW MAC and returns the same way. But with 22.1 as the default GW I see many TCP retransmits, out-of-order ACKs, etc.
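In case anyone wants to repeat the capture analysis, this is roughly what I did (interface name, client address and capture file name are examples):

```shell
# Live capture with Ethernet headers (-e) and no name resolution (-n),
# so the destination MAC of each outgoing request is visible
tcpdump -eni vmx2 -c 200 'tcp and host 10.0.0.20'

# In a saved capture, list the segments Wireshark's TCP analysis flags
tshark -r upload.pcap \
    -Y 'tcp.analysis.out_of_order || tcp.analysis.duplicate_ack || tcp.analysis.retransmission'
```
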