I will try to see if any of these make a difference, but in general I am very skeptical that they will, especially since nobody from the forum owners has replied with anything meaningful since this thread started (apart from basically saying it is not practical to compare BSD and Linux).
https://calomel.org/freebsd_network_tuning.html

# Disable Hyper Threading (HT), also known as Intel's proprietary simultaneous
# multithreading (SMT), because implementations typically share TLBs and L1
# caches between threads, which is a security concern. SMT is likely to slow
# down workloads not specifically optimized for SMT if you have a CPU with more
# than two(2) real CPU cores. Secondly, multi-queue network cards are as much
# as 20% slower when network queues are bound to real CPU cores as well as SMT
# virtual cores, due to interrupt processing inefficiencies.
machdep.hyperthreading_allowed="0"  # (default 1, allow Hyper Threading (HT))

# Intel igb(4): The Intel i350-T2 dual port NIC supports up to eight(8)
# input/output queues per network port; the card has two(2) network ports.
#
# Multiple transmit and receive queues in network hardware allow network
# traffic streams to be distributed into queues. Queues can be mapped by the
# FreeBSD network card driver to specific processor cores, leading to reduced
# CPU cache misses. Queues also distribute the workload over multiple CPU
# cores, process network traffic in parallel and prevent network traffic or
# interrupt processing from overwhelming a single CPU core.
#
# http://www.intel.com/content/dam/doc/white-paper/improving-network-performance-in-multi-core-systems-paper.pdf
#
# For a firewall under heavy CPU load we recommend setting the number of
# network queues equal to the total number of real CPU cores in the machine
# divided by the number of active network ports. For example, a firewall with
# four(4) real CPU cores and an i350-T2 dual port NIC should use two(2) queues
# per network port (hw.igb.num_queues=2). This equals a total of four(4)
# network queues over two(2) network ports, which map to four(4) real CPU
# cores. A FreeBSD server with four(4) real CPU cores and a single network port
# should use four(4) network queues (hw.igb.num_queues=4). Or, set
# hw.igb.num_queues to zero(0) to allow the FreeBSD driver to automatically set
# the number of network queues to the number of CPU cores. It is not recommended
# to allow more network queues than real CPU cores per network port.
#
# Query total interrupts per queue with "vmstat -i" and use "top -CHIPS" to
# watch CPU usage per igb0:que. Multiple network queues will trigger more total
# interrupts compared to a single network queue, but the processing of each of
# those queues will be spread over multiple CPU cores, allowing the system to
# handle increased network traffic loads.
hw.igb.num_queues="2"  # (default 0, queues equal the number of CPU real cores)

# Intel igb(4): FreeBSD puts an upper limit on the number of received
# packets a network card can process at 100 packets per interrupt cycle. This
# limit is in place because of inefficiencies in IRQ sharing when the network
# card is using the same IRQ as another device. When the Intel network card is
# assigned a unique IRQ (dmesg) and MSI-X is enabled through the driver
# (hw.igb.enable_msix=1), then interrupt scheduling is significantly more
# efficient and the NIC can be allowed to process packets as fast as they are
# received. A value of "-1" means unlimited packet processing and sets the same
# value to dev.igb.0.rx_processing_limit and dev.igb.1.rx_processing_limit. A
# process limit of "-1" is around one(1%) percent faster than "100" on a
# saturated network connection.
hw.igb.rx_process_limit="-1"  # (default 100 packets to process concurrently)
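For context on applying the quoted settings to an APU2: the three values above are boot-time tunables. Below is a minimal sketch, assuming OPNsense applies them from System > Settings > Tunables (or from /boot/loader.conf.local on plain FreeBSD) after a reboot, and assuming two of the board's igb ports are actively used. Note the APU2's AMD GX-412TC has four real cores and no SMT, so the hyperthreading line is harmless but moot there.

machdep.hyperthreading_allowed="0"   # no effect on the APU2 (no SMT), kept for parity with the guide
hw.igb.num_queues="2"                # 4 real cores / 2 active igb ports = 2 queues per port (assumption)
hw.igb.rx_process_limit="-1"         # unlimited RX processing per interrupt, requires MSI-X

# After rebooting, verify the queue layout and per-queue CPU load:
vmstat -i | grep igb                 # expect one MSI-X interrupt line per queue
top -CHIPS                           # watch the igb0:que kernel threads spread across cores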
Quote from: ricsip on August 06, 2018, 02:18:30 pm
Hello pylox, all,
just to be clear: I am testing through a plain IP+NAT connection (PPPoE was mentioned as a possible bottleneck, but not tested YET), and that simple test setup reaches only approx. 40-50% of the maximum possible throughput. If I add PPPoE, it will be even slower. That's the point of this thread: trying to find at least one credible person who is currently using an APU2 with OPNsense and can confirm their speed reaches 85-90% of gigabit (at least), even over PPPoE! The next round will then be to see what needs to be fine-tuned to get the same performance at my ISP.

Hi ricsip,

this is very hard to find. Unfortunately I do not have a test setup with an APU2 (and not much time). But you can try different things (a measurement sketch follows after this list):

1. Change these tunables and measure...
vm.pmap.pti="0"  # (disable Meltdown patch - this is an AMD processor)
hw.ibrs_disable="1"  # (disable Spectre patch temporarily)

2. Try to disable igb flow control for each interface and measure:
hw.igb.<x>.fc=0  # (x = number of interface)

3. Change the network interface interrupt rate and measure:
hw.igb.max_interrupt_rate="16000"  # (start with 16000, can be increased up to 64000)

4. Disable Energy Efficiency for each interface and measure:
dev.igb.<x>.eee_disabled="1"  # (x = number of interface)

Should be enough for the first time... ;-)

regards pylox
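To make those four steps reproducible, here is a minimal test-cycle sketch. It assumes igb0 as the interface under test and 192.168.1.10 as a placeholder iperf3 server on the LAN side; the tunables in steps 1-3 are most safely applied as boot-time tunables (add them on the OPNsense tunables page, reboot, re-test), while step 4 can be toggled at runtime.

# Step 4 at runtime, one interface at a time:
sysctl dev.igb.0.eee_disabled=1      # disable Energy Efficient Ethernet on igb0

# Steps 1-3 (vm.pmap.pti, hw.ibrs_disable, hw.igb.<x>.fc,
# hw.igb.max_interrupt_rate) as boot-time tunables: add, reboot, re-test.

# Measure after every single change so each knob's effect stays visible:
iperf3 -c 192.168.1.10 -t 30 -P 4    # 30 s, 4 parallel streams through the NAT path (placeholder host)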
Looks like I'm the only one willing to chip in?