Hi all,
I know there are already some threads discussing OpenVPN performance, but most of those problems turned out to be caused by bad configurations, and I don't think that's the problem in my case.
My setup is as follows:
I've got an OpenVPN server running on a quite powerful VPS (host-passthrough of two Xeon E5-2680v4 cores) which performs really well. The OS is Debian 9. OpenSSL benchmarking shows the following results on that VPS:
VPS $ openssl speed -elapsed -evp aes-256-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-gcm for 3s on 16 size blocks: 48248172 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 64 size blocks: 35021689 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 256 size blocks: 18801677 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 1024 size blocks: 6310897 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 8192 size blocks: 927724 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 16384 size blocks: 473874 aes-256-gcm's in 3.00s
OpenSSL 1.1.0f 25 May 2017
built on: reproducible build, date unspecified
options:bn(64,64) rc4(16x,int) des(int) aes(partial) blowfish(ptr)
compiler: gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DRC4_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPADLOCK_ASM -DPOLY1305_ASM -DOPENSSLDIR="\"/usr/lib/ssl\"" -DENGINESDIR="\"/usr/lib/x86_64-linux-gnu/engines-1.1\""
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-gcm     257323.58k   747129.37k  1604409.77k  2154119.51k  2533305.00k  2587983.87k
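As a sanity check on how the summary row relates to the raw "Doing ..." lines above, here is a minimal Python sketch. The function name throughput_k is made up for illustration; the input numbers are taken from the 16384-byte line of the VPS run.

```python
def throughput_k(blocks: int, block_size: int, seconds: float) -> float:
    """Return throughput in OpenSSL's 'k' unit (1000s of bytes per second).

    Hypothetical helper, not part of OpenSSL.
    """
    return blocks * block_size / seconds / 1000.0


# Raw line from the VPS run: 473874 blocks of 16384 bytes in 3.00s
vps_16384 = throughput_k(473874, 16384, 3.00)
print(f"{vps_16384:.2f}k")  # prints 2587983.87k, matching the summary row
```

The same formula only roughly reproduces the OPNsense row, presumably because the elapsed times shown there (3.04s, 3.05s) are rounded for display.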
Then, there is an OpenVPN client on an OPNsense running locally as KVM guest on a Pentium J3710 with all 4 cores passed through. OpenSSL benchmarks on that OPNsense are as follows:
root@OPNsense:~ # /usr/local/bin/openssl speed -elapsed -evp aes-256-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-gcm for 3s on 16 size blocks: 20936402 aes-256-gcm's in 3.05s
Doing aes-256-gcm for 3s on 64 size blocks: 9676093 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 256 size blocks: 3293017 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 1024 size blocks: 907450 aes-256-gcm's in 3.01s
Doing aes-256-gcm for 3s on 8192 size blocks: 117760 aes-256-gcm's in 3.04s
OpenSSL 1.0.2n 7 Dec 2017
built on: reproducible build, date unspecified
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: cc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -pthread -D_THREAD_SAFE -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -O3 -Wall -O2 -pipe -fPIE -fPIC -Werror -Qunused-arguments -fstack-protector-all -fno-strict-aliasing -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-gcm     109661.77k   206423.32k   281004.12k   308938.41k   317430.10k
These values seem reasonable to me. Without the -evp flag I get ~30 MB/s, which makes sense. However, it makes no difference whether aesni.ko is loaded: after unloading it, the speeds stay exactly the same. This is probably because OpenSSL uses its own AES-NI code path in userspace rather than going through aesni.ko.
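For scale, here is a rough conversion of the largest-block figures from both benchmarks into Mbit/s. This is only a sketch: the helper name k_to_mbit is invented for illustration, and a single-threaded synthetic benchmark on large blocks is an upper bound, not what a VPN tunnel with small packets would reach.

```python
def k_to_mbit(k_bytes_per_s: float) -> float:
    """Convert OpenSSL's 'k' figure (1000s of bytes/s) to Mbit/s.

    Hypothetical helper, not part of OpenSSL.
    """
    return k_bytes_per_s * 1000 * 8 / 1_000_000


opnsense_8192 = k_to_mbit(317430.10)  # 8192-byte blocks on the OPNsense guest
vps_16384 = k_to_mbit(2587983.87)     # 16384-byte blocks on the VPS
print(f"OPNsense: {opnsense_8192:.0f} Mbit/s, VPS: {vps_16384:.0f} Mbit/s")
```

So raw AES-256-GCM on the OPNsense guest comes out at roughly 2.5 Gbit/s on one core.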
The OpenVPN config looks as follows:
dev ovpnc1
verb 1
dev-type tun
dev-node /dev/tun1
writepid /var/run/openvpn_client1.pid
script-security 3
daemon
keepalive 10 60
ping-timer-rem
persist-tun
persist-key
proto udp
cipher AES-256-GCM
auth SHA512
up /usr/local/etc/inc/plugins.inc.d/openvpn/ovpn-linkup
down /usr/local/etc/inc/plugins.inc.d/openvpn/ovpn-linkdown
tls-client
client
nobind
management /var/etc/openvpn/client1.sock unix
remote xxxxx
ca /var/etc/openvpn/client1.ca
cert /var/etc/openvpn/client1.cert
key /var/etc/openvpn/client1.key
resolv-retry infinite
compress
Now comes the curious part. An iperf3 run through the VPN link from server to client gives 10-25 Mbit/s, even though the DSL line it goes through provides 50 Mbit/s. Running iperf directly over WAN (without the VPN) confirms that the full 50 Mbit/s is available. The CPU of the OPNsense host is at 25%, meaning one of the four cores is under full load.
Changing the cipher to AES-256-CBC gives roughly the same results.
I also tried connecting to the OpenVPN server directly from an i7 laptop behind the OPNsense box, reaching 48 Mbit/s without problems.
So apparently OpenVPN is not using the hardware crypto while OpenSSL is... Does anyone have an idea why this is the case?
Thanks in advance!
Best regards
Robert
EDIT: OPNsense version is as follows:
OPNsense 18.1.5-amd64
FreeBSD 11.1-RELEASE-p8
OpenSSL 1.0.2n 7 Dec 2017