Bad OpenVPN performance. aesni not working

Started by cybernik, April 12, 2021, 12:41:36 AM

Hello all,

I switched from pfSense to OPNsense but have problems with OpenVPN performance.
I have two servers running VMware. Both servers are connected to the Internet via Gigabit.
OPNsense (21.1.4) is installed as a virtual machine with the same resources as the old pfSense (2.4).
It seems that AES-NI is not working, because when I run the command
/usr/local/bin/openssl speed -elapsed -evp aes-256-cbc

I get the following response on OPNsense:
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 80113537 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 64 size blocks: 24620182 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 6538147 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 1664795 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 8192 size blocks: 209209 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 104272 aes-256-cbc's in 3.00s
Max 14Mbit

I get the following response on pfSense:
Doing aes-256-cbc for 3s on 16 size blocks: 619734 aes-256-cbc's in 0.67s
Doing aes-256-cbc for 3s on 64 size blocks: 562171 aes-256-cbc's in 0.63s
Doing aes-256-cbc for 3s on 256 size blocks: 539758 aes-256-cbc's in 0.59s
Doing aes-256-cbc for 3s on 1024 size blocks: 359694 aes-256-cbc's in 0.35s
Doing aes-256-cbc for 3s on 8192 size blocks: 126727 aes-256-cbc's in 0.15s
More than 200 Mbit

What am I doing wrong?
I searched the wiki and tried a lot of things, but without success.
Does anyone have an idea?


You mention two servers, one running pfSense, and the other running OPNsense. Are both of these hosts identical in their physical hardware?

I fired up my VMware lab quickly to test this.

This is an ESXi 6.7 host with the following specs:
32GB RAM
Xeon E3-1220v3

Both firewall VMs are VM Hardware Version 14. Each has 2x vCPU and 4GB of RAM. Both VMs have open-vm-tools installed. Both were run on the same host to rule out any physical hardware differences.

OPNsense 21.1.4 results:
root@OPNsense:~ # /usr/local/bin/openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 60248450 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 64 size blocks: 21389420 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 256 size blocks: 5989001 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 1024 size blocks: 1552439 aes-256-cbc's in 3.02s
Doing aes-256-cbc for 3s on 8192 size blocks: 194846 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 16384 size blocks: 97346 aes-256-cbc's in 3.00s
OpenSSL 1.1.1k  25 Mar 2021
built on: Mon Apr  5 11:24:23 2021 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) blowfish(ptr)
compiler: cc -fPIC -pthread -Wa,--noexecstack -Qunused-arguments -O2 -pipe  -DHARDENEDBSD -fPIE -fPIC -fstack-protector-all -fno-strict-aliasing -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -D_THREAD_SAFE -D_REENTRANT -DNDEBUG
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc     320490.46k   455122.41k   509733.99k   527153.59k   530677.50k   531638.95k


pfSense 2.4.5 results:
[2.4.5-RELEASE][root@pfSense.localdomain]/root: openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 92354172 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 64 size blocks: 24674219 aes-256-cbc's in 3.04s
Doing aes-256-cbc for 3s on 256 size blocks: 6170823 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 1024 size blocks: 1572441 aes-256-cbc's in 3.05s
Doing aes-256-cbc for 3s on 8192 size blocks: 199125 aes-256-cbc's in 3.06s
OpenSSL 1.0.2u-freebsd  20 Dec 2019
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: clang
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc     491276.22k   519617.49k   525209.16k   528469.20k   532647.18k
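
For anyone comparing these runs: the summary row at the bottom can be reproduced from the "Doing ..." lines, since each figure is count x block size / elapsed time, expressed in thousands of bytes per second. Below is a minimal Python sketch under that assumption. The regular expression and the function name throughput_kbytes are my own for illustration and are not part of OpenSSL.

```python
import re

# Matches one progress line of `openssl speed` output, for example:
# Doing aes-256-cbc for 3s on 16384 size blocks: 97346 aes-256-cbc's in 3.00s
LINE_RE = re.compile(
    r"Doing \S+ for \d+s on (\d+) size blocks: (\d+) \S+ in ([\d.]+)s"
)

def throughput_kbytes(line):
    """Return (block_size, thousands of bytes per second), or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    block_size = int(m.group(1))   # bytes per operation
    count = int(m.group(2))        # operations completed
    elapsed = float(m.group(3))    # seconds, as printed (two decimals)
    return block_size, count * block_size / elapsed / 1000.0

if __name__ == "__main__":
    sample = "Doing aes-256-cbc for 3s on 16384 size blocks: 97346 aes-256-cbc's in 3.00s"
    print(throughput_kbytes(sample))
```

Fed the 16384-byte line from the OPNsense run above, this gives roughly 531638.95, which matches the reported 531638.95k. Some of the other columns come out slightly different, presumably because the printed times are rounded to two decimals.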

Are these real world values you can measure on your OpenVPN?

Doing aes-256-cbc for 3s on 8192 size blocks: 126727 aes-256-cbc's in 0.15s

It says it runs the computation for 3 seconds but finishes it in 0.15s measured time? This cannot be correct.



Cheers,
Franco
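
The check described above (requested duration versus reported time) can be automated. This is a rough Python sketch, not an official tool: the function name suspicious_runs and the 0.9 threshold are arbitrary choices of mine. Note that the pfSense output in the first post lacks the "You have chosen to measure elapsed time" banner, so one possible explanation (an assumption, not confirmed in this thread) is that -elapsed was omitted there and the times shown are user CPU time.

```python
import re

# Captures requested duration, block size, and reported time from a progress line.
LINE_RE = re.compile(
    r"Doing \S+ for (\d+)s on (\d+) size blocks: \d+ \S+ in ([\d.]+)s"
)

def suspicious_runs(output, min_fraction=0.9):
    """Return (block_size, requested_s, reported_s) for lines whose reported
    time is below min_fraction of the requested duration."""
    flagged = []
    for line in output.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue
        requested = float(m.group(1))
        block_size = int(m.group(2))
        reported = float(m.group(3))
        if reported < min_fraction * requested:
            flagged.append((block_size, requested, reported))
    return flagged

if __name__ == "__main__":
    text = (
        "Doing aes-256-cbc for 3s on 8192 size blocks: 209209 aes-256-cbc's in 3.00s\n"
        "Doing aes-256-cbc for 3s on 8192 size blocks: 126727 aes-256-cbc's in 0.15s\n"
    )
    for block_size, requested, reported in suspicious_runs(text):
        print(f"{block_size}-byte blocks: asked for {requested}s, reported {reported}s")
```

With the two 8192-byte lines from the first post, only the pfSense line (0.15s) is flagged. The OPNsense line (3.00s) passes.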

This is someone who is posting erroneous results with manually edited times with a Subject "Bad OpenVPN performance. aesni not working" so it would be an eye catcher on the forum. Here are results from my OpenBSD 6.8 firewall which is using LibreSSL 3.2.2:

bsd# openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 77539523 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 38732198 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 256 size blocks: 9877820 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 1024 size blocks: 2479467 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 8192 size blocks: 310218 aes-256-cbc's in 3.01s
LibreSSL 3.2.2
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: information not available
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc     413330.92k   823613.83k   840124.81k   843534.86k   844312.02k


You can see the times on the right side are all around 3 seconds for my tests. It would seem cybernik is here to spread misinformation and make it seem OPNsense performs poorly compared to pfSense which is not the case as shown by opnfwb.

Quote from: packet loss on April 12, 2021, 08:31:59 PM
You can see the times on the right side are all around 3 seconds for my tests. It would seem cybernik is here to spread misinformation and make it seem OPNsense performs poorly compared to pfSense which is not the case as shown by opnfwb.
Let's go easy on this person. It's their first post and they state these are virtualized instances. It's easy to have some nested issues when dealing with VMs, I don't think there is any deliberate misinformation at this point. Just someone trying to get a new config up and running. I hope they come back and post more details and hopefully we can help them get to the bottom of it.

However I do agree with you that from all the tests that we can check so far, I'm not seeing any performance difference between OPNsense and pfSense. On top of that, the posted results do seem odd for the OP's pfSense numbers.