OPNsense Forum

Archive => 16.7 Legacy Series => Topic started by: hedberg on January 04, 2017, 02:44:00 pm

Title: 100.000+ NTP queries a second
Post by: hedberg on January 04, 2017, 02:44:00 pm
I have purchased a new NTP server that is able to handle 100.000 NTP queries a second. It is going to be a part of the pool.ntp.org project and I expect quite a bit of load on it.

I was warned by the manufacturer that a lot of network equipment and firewalls might have problems handling 100.000+ requests a second, or about 100Mbit of traffic in very small packets. I assume it is because most modern firewalls do stateful inspection, and it probably requires a lot of memory to serve that many small packets.
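
As a rough sanity check on the manufacturer's figure, the per-packet overhead can be added up. The sketch below is in Python and assumes a plain 48-byte NTP packet (no extension fields or authentication), IPv4 without options, and untagged Ethernet; the constant and function names are only illustrative.

```python
# Per-packet sizes in bytes. Assumptions: IPv4 without options, untagged
# Ethernet, NTP packet without extension fields or authentication.
NTP_PAYLOAD = 48
UDP_HEADER = 8
IPV4_HEADER = 20
ETHERNET_HEADER_AND_FCS = 14 + 4
PREAMBLE_AND_GAP = 8 + 12  # preamble/SFD plus inter-frame gap

IP_PACKET_BYTES = NTP_PAYLOAD + UDP_HEADER + IPV4_HEADER
WIRE_BYTES = IP_PACKET_BYTES + ETHERNET_HEADER_AND_FCS + PREAMBLE_AND_GAP


def mbit_per_second(packets_per_second: int, bytes_per_packet: int) -> float:
    """Convert a packet rate to Mbit/s for a fixed packet size."""
    return packets_per_second * bytes_per_packet * 8 / 1_000_000


ip_level = mbit_per_second(100_000, IP_PACKET_BYTES)
on_wire = mbit_per_second(100_000, WIRE_BYTES)
print(f"IP level: {ip_level:.1f} Mbit/s, on the wire: {on_wire:.1f} Mbit/s")
```

Under these assumptions 100.000 queries a second come to about 61 Mbit/s at the IP level and about 91 Mbit/s on the wire in each direction, which is in line with the "about 100Mbit" figure.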

Currently I have OPNsense installed on VMware on an Atom 2750 based motherboard. It has 32GB of memory, with 2GB and 2 of the 8 cores allocated to OPNsense at the moment. It has four Gbit Intel interfaces and the internet connection is 500/500Mbit. For those who might be interested in the NTP server, it is a LeoNTP.

Has anybody tried this on OPNsense with a similar hardware platform? I would be grateful for any suggestions or concerns you might have.

(This is installed in a private home, so nobody else is affected if the firewall can’t cope with it.)




Title: Re: 100.000+ NTP queries a second
Post by: franco on January 05, 2017, 04:43:02 pm
Hi hedberg,

That's an interesting use case! Is the OPNsense actively answering queries or just doing the routing for another server? For the latter, we've had people who route WiFi for a full stadium of people through OPNsense just fine.

It sounds a bit low with two cores and 2 GB RAM if it is answering the queries too. You'd want it to be able to process multiple threads at the same time, so 4 or more CPUs should help to ease the load. RAM depends on the current usage; it may be alright.

How is the system performing under the current setup in terms of system load and CPU usage?


Cheers,
Franco
Title: Re: 100.000+ NTP queries a second
Post by: hedberg on January 06, 2017, 10:42:49 am
Currently it is configured to route the NTP traffic to an internal Linux server on an internal IP (NAT), and I would like it to do the same with the new appliance box (LeoNTP).

Yesterday it was made part of the Chinese pool of NTP servers together with other volunteers, and the traffic went through the roof, or so I thought anyway. In reality it was only 5-10Mbit, but it sure made the firewall work. The number of states went to approx. 300.000 with only 5-10Mbit of traffic, so I am happy I added more memory (8GB) and 2 extra CPUs a couple of hours before. With 8GB the default limit on the number of states is around 800-900.000, and it made good use of it.

Is it because of NAT that it keeps a state for each UDP packet?

Currently I am considering re-installing the firewall so that OPNsense isn't installed in a VM but gets all the hardware to play with, but it still seems (to me) that it would fail if I actually received just 15Mbit of NTP traffic.
Title: Re: 100.000+ NTP queries a second
Post by: fabian on January 06, 2017, 11:59:11 am
Just an idea: Can you try to turn off state tracking for this service (advanced firewall settings) - note that you will need to pass the reverse channel too when state tracking is disabled.

Another idea is changing the state timeout of UDP to something less so it will also free the state tracking entry earlier.

Title: Re: 100.000+ NTP queries a second
Post by: s4rs on January 06, 2017, 04:35:00 pm
Just curious if you ran iperf through the firewall for a baseline? I have a dual core 1.7GHz Celeron that I can get 400Mb/s with using TCP iperf (17.1.b). I would expect UDP to perform better. This is running OPNsense under Fedora 25 Server as a KVM guest. BTW in my testing 17.1.b performs much better under a VM than 16.7.11 does.
Title: Re: 100.000+ NTP queries a second
Post by: fabian on January 06, 2017, 05:24:13 pm
Just curious if you ran iperf through the firewall for a baseline? I have a dual core 1.7GHz celeron that I can get 400Mb/s with tcp iperf (17.1.b). I would expect udp to perform better. This is running Opnsense under Fedora 25 Server as a KVM guest. BTW in my testing 17.1.b performs much better under a VM than does 16.7.11

UDP does not have connections, so real state tracking is not possible without knowing the protocol. Because of this, firewalls use a timeout for UDP to have some kind of state for connection tracking.
A long timeout will therefore result in a lot of state table entries, because each entry is kept until no packets have been forwarded for the given time. With TCP you have a defined end of a connection: when you receive a segment with the FIN or RST flag (a timeout is still possible, for example if the connection breaks, but it should not happen often).
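
The effect of the timeout on the table size can be estimated with a back-of-the-envelope rule: the number of entries in steady state is roughly the rate of new flows multiplied by the time each entry lives. The Python sketch below assumes that every query arrives from a new source address/port pair and creates its own state; the 60 and 10 second lifetimes are illustrative values, not settings read from this firewall.

```python
def steady_state_entries(new_flows_per_second: float, state_lifetime_s: float) -> float:
    """Approximate state table size: arrival rate times entry lifetime."""
    return new_flows_per_second * state_lifetime_s


def implied_flow_rate(observed_states: float, state_lifetime_s: float) -> float:
    """Invert the same rule: flow rate that would sustain an observed table size."""
    return observed_states / state_lifetime_s


# Hypothetical lifetimes of 60 s and 10 s (assumptions for illustration only).
print(implied_flow_rate(300_000, 60))     # flows/s behind 300.000 states
print(steady_state_entries(100_000, 60))  # entries at 100.000 queries/s, 60 s
print(steady_state_entries(100_000, 10))  # entries at 100.000 queries/s, 10 s
```

With a hypothetical 60 second lifetime, the roughly 300.000 states reported earlier would correspond to about 5.000 new flows a second, and 100.000 queries a second would need about 6 million entries. A 10 second lifetime would cut that to about 1 million.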
Title: Re: 100.000+ NTP queries a second
Post by: hedberg on January 06, 2017, 07:56:37 pm
Just an idea: Can you try to turn off state tracking for this service (advanced firewall settings) - note that you will need to pass the reverse channel too when state tracking is disabled.

Is that for the firewall rule itself or for the entire firewall?

Another idea is changing the state timeout of UDP to something less so it will also free the state tracking entry earlier.

I can't seem to find that option. Is it a command line thing I need to add?
Title: Re: 100.000+ NTP queries a second
Post by: will on January 06, 2017, 08:05:14 pm
Question: Why are you even putting this device behind NAT? A firewall is one thing, but NAT should not be used here in my frank opinion.

Anyway, what you are really going to care about here is how fast your box can forward traffic in packets per second (PPS), not bit/s, because as you have discovered the actual throughput is very low. The packets are also small, which is more taxing on the CPU.

A useful tool to hammer your box with here is something like Cisco TRex (https://trex-tgn.cisco.com).

Here are a few pointers though:

1) Run the OPNsense box on bare metal, or if you must use a VM then at least use some form of direct-io to attach the NICs directly.

2) OPNsense is a software router, so performance is CPU and memory bound; get the fastest you can in both cases. The Atoms are great boxes, but if outright pps is what you are chasing then an E3 or E5 Xeon is what you should be going for. Look for the "frequency optimised" chips perhaps: more GHz, fewer cores.

3) Set the firewall to expire state entries aggressively - Firewall > Settings > Advanced "Firewall Optimization - Aggressive"
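
Short of a full traffic generator, a few lines of Python can send NTP-sized UDP datagrams and report the rate achieved on the sending side. This is only a sketch: the function names are made up for illustration, the request is a bare 48-byte client packet (version 4, mode 3) with no timestamps filled in, and a single Python loop will be much slower than a dedicated generator such as TRex.

```python
import socket
import struct
import time


def build_ntp_request() -> bytes:
    """Return a minimal 48-byte NTP client request.

    First byte: LI=0, VN=4, Mode=3 (client) -> 0b00_100_011 = 0x23.
    The remaining 47 bytes are left as zero padding.
    """
    return struct.pack("!B47x", 0x23)


def send_burst(host: str, port: int, count: int) -> float:
    """Send `count` requests back to back; return the packets/s achieved."""
    packet = build_ntp_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        start = time.perf_counter()
        for _ in range(count):
            sock.sendto(packet, (host, port))
        elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")
```

For example, `send_burst("192.0.2.10", 123, 10000)` would send 10.000 requests to a test server at the placeholder address 192.0.2.10 and return the send rate; it says nothing about how many were answered. It should only be pointed at a server you operate yourself.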

Title: Re: 100.000+ NTP queries a second
Post by: hedberg on January 06, 2017, 08:17:22 pm
Just curious if you ran iperf through the firewall for a baseline? I have a dual core 1.7GHz celeron that I can get 400Mb/s with tcp iperf (17.1.b). I would expect udp to perform better. This is running Opnsense under Fedora 25 Server as a KVM guest. BTW in my testing 17.1.b performs much better under a VM than does 16.7.11

I have not done any systematic testing on it; I just did some basic speed tests. However, it can easily move much more data when the packets are larger. Using FTP it very easily transfers 500Mbit from the Internet to an internal VM on the same host and places it on an SMB share on a Synology box. I also did a test with ntttcp on a smaller box, a J1900 with the same type of NICs. It could quite easily move 1Gbit between interfaces, but again with larger packets.
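
The difference between the two kinds of load shows up when the same bandwidth is converted to packets per second. A minimal Python sketch, assuming IPv4 over untagged Ethernet with 38 bytes of framing overhead per packet (header, FCS, preamble and inter-frame gap), 1500-byte IP packets for bulk transfers such as FTP, and 76-byte IP packets for NTP:

```python
# Ethernet header (14) + FCS (4) + preamble/SFD (8) + inter-frame gap (12).
FRAMING_OVERHEAD = 14 + 4 + 8 + 12


def packets_per_second(mbit_per_s: float, ip_packet_bytes: int) -> float:
    """Packets per second needed to fill a link rate with one packet size."""
    wire_bits = (ip_packet_bytes + FRAMING_OVERHEAD) * 8
    return mbit_per_s * 1_000_000 / wire_bits


bulk = packets_per_second(500, 1500)  # full-size packets (assumed for FTP)
ntp = packets_per_second(500, 76)     # 48-byte NTP payload + UDP + IPv4 headers
print(f"bulk: {bulk:,.0f} pps, NTP-sized: {ntp:,.0f} pps")
```

At 500 Mbit/s in one direction that is about 40.600 packets a second with full-size packets, but about 548.000 with NTP-sized ones.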

I realize it sounds like I am complaining - I am not - I just find it fascinating and would like to optimize it as much as possible.

 
Title: Re: 100.000+ NTP queries a second
Post by: hedberg on January 06, 2017, 08:40:48 pm
Question: Why are you even putting this device behind NAT, a firewall is one thing but NAT should not be used here in my frank opinion.

I do not have any other option. This is installed in a private home, and the affordable Internet connections don't offer the possibility of multiple IPs.

Anyway, really what you are going to care about here is how fast your box can forward traffic in packets per-second (PPS), not bit/s because as you have discovered the actual throughput is very low.  Also small sized packets, which will be more taxing on the CPU.

A useful tool to hammer your box with here is something like Cisco TRex (https://trex-tgn.cisco.com).

Here are a few pointers though:

1) Run the OPNsense box on bare metal, or if you must use a VM then at least use some form of direct-io to attach the NICs directly.

2) OPNsense is a software router, performance is CPU and memory bound, get the fastest you can in both cases - the Atoms are great boxes but if outright pps is what you are chasing then an E3 or E5 Xeon is what you should be going for, look for the "frequency optimised" chips perhaps, more GHz less cores.

3) Set the firewall to expire state entries aggressively - Firewall > Settings > Advanced "Firewall Optimization - Aggressive"

I'll definitely try bare metal. I set the Firewall Optimization option to aggressive and will monitor whether it gives me problems. A Xeon will probably be a little too expensive - power is unfortunately quite expensive here and most of the models seem quite expensive - but I'll see how much performance I can get from the existing box.

Thanks for the pointers.
Title: Re: 100.000+ NTP queries a second
Post by: fabian on January 07, 2017, 04:55:22 pm
Just an idea: Can you try to turn off state tracking for this service (advanced firewall settings) - note that you will need to pass the reverse channel too when state tracking is disabled.

Is that for the firewall rule itself or for the entire firewall?

This is for the rules - one forward, one backward (no state tracking makes rules more complex).

Another idea is changing the state timeout of UDP to something less so it will also free the state tracking entry earlier.

I can't seem to find that option. Is it a command line thing I need to add?


It is a setting for PF (see https://www.openbsd.org/faq/pf/options.html). If it is not in the GUI, you will need to add a feature request.
Title: Re: 100.000+ NTP queries a second
Post by: hedberg on January 07, 2017, 10:48:42 pm
Thanks for your help.

I have added a request on GitHub - I hope that is the correct way.