Messages - rmayr

#1
22.1 Legacy Series / Re: DNS aliases not resolving
March 11, 2022, 12:18:39 PM
I have exactly the same issue.
#2
Hi everybody,

I am currently facing a strange issue that I can't understand and hope that somebody can enlighten me.

Setup:
* Upstream connection (igb0) is behind an ISP router and hence already behind a NAT. The MTU on that interface is 1500.
* WireGuard tunnel to another OPNsense instance; the MTU of these interfaces is manually set to 1420.
* Pinging from a Linux host on one side to a Linux server on the other side using "ping -4 -s 1392 -M want <serverIP>" works as expected, and replies arrive with a latency of 10-11 ms.
* Doing the same with "ping -4 -s 1394 -M want <serverIP>" yields on the Linux client:

PING <serverIP> (<serverIP>) 1394(1422) bytes of data.
From 192.168.64.254 icmp_seq=1 Frag needed and DF set (mtu = 1420)


(192.168.64.254 is the static IP on the LAN interface of OPNsense, while the default route of the Linux client is 192.168.64.1, which is a virtual CARP IP. This seems to work as expected, with the exception of replies not being received.)
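For reference, the size arithmetic: 1392 bytes of ICMP payload + 8 bytes ICMP header + 20 bytes IPv4 header = 1420 bytes, which exactly matches the tunnel MTU, while 1394 + 28 = 1422 bytes no longer fits. To double-check the effective path MTU from the Linux client, something like this should work (<serverIP> again being a placeholder):

# probe the path MTU hop by hop; "pmtu 1420" should appear once the tunnel constrains the path
tracepath -4 <serverIP>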

Sniffing on the firewall itself on the WireGuard interface:

root@firewall2:~ # tcpdump -n -i wg1 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wg1, link-type NULL (BSD loopback), capture size 262144 bytes
11:44:11.804655 IP <clientIP> > <serverIP>: ICMP echo request, id 77, seq 513, length 1400
11:44:11.804689 IP <clientIP> > <serverIP>: ip-proto-1
11:44:11.815346 IP <serverIP> > <clientIP>: ICMP echo reply, id 77, seq 513, length 1400
11:44:11.815501 IP <serverIP> > <clientIP>: ip-proto-1


So the (fragmented) replies are coming back in through the WireGuard tunnel.

However, these are dropped by the default rule:

VPN_INS      2022-03-11T12:01:49   <serverIP>   <clientIP>   icmp   Default deny rule

(VPN_INS is the assigned interface name for wg1.)

What I don't understand: why would unfragmented ICMP replies correctly match the state table entry created by the ICMP echo request, but not match when fragmented? Is interface scrubbing (which is not turned off) messing up the replies here?
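One way to check whether really only the fragments are being dropped would be to capture just the fragmented IPv4 packets on the WireGuard interface; a sketch using the usual tcpdump filter for a set MF bit or a non-zero fragment offset:

# show only IPv4 fragments (MF bit set or fragment offset != 0)
tcpdump -n -i wg1 '(ip[6:2] & 0x3fff) != 0'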
#3
22.1 Legacy Series / Re: IPv6 working properly???
February 02, 2022, 08:13:42 PM
I can confirm that (on an APU4d4) IPv6 PD/track-based router advertisements only work on 22.1 without the option "Allow manual adjustment of DHCPv6 and Router Advertisements" at the moment. When it is selected, RAs just stop shortly after radvd restarts, and DHCPv6 doesn't seem to respond reliably. I have not yet tried to debug the differences in the generated radvd.conf.
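If anybody wants to narrow this down, diffing the generated config between the two states should show what the option changes; a sketch, assuming the generated file lives at /var/etc/radvd.conf:

# snapshot the generated config, toggle the option in the GUI, then compare
cp /var/etc/radvd.conf /tmp/radvd.conf.before
diff -u /tmp/radvd.conf.before /var/etc/radvd.conf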
#4
Another confirmation that clearing the cache in Firefox solved this issue.
#5
This seems to be fixed in 21.1.7. Thanks!
#6
With my standard configuration, unbound can no longer start after updating to 21.1.6. I tracked the issue down to a wrong format being written to /var/unbound/host_entries.conf when multiple domains are listed in the "domain search list": all of the domains are then added to each static lease entry in /var/unbound/host_entries.conf instead of only the "domain name" for the respective host. This prevents unbound from starting.

The current workaround is to remove the additional domains from the "domain search list" and set it to the same value as "domain name", but this breaks some other use cases in my setup.
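For anybody hitting the same problem: unbound ships a config checker that points at the offending line; a sketch, assuming the config path OPNsense uses:

# validate the generated configuration; reports the first error and its line number
unbound-checkconf /var/unbound/unbound.conf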
#8
The concerning bit is the heavy side effect that merely having IPsec enabled has on completely unrelated traffic. It points to a general performance bottleneck in the kernel.
#9
Further tests on an 8-core Proxmox VM server with 4 cores assigned to an OPNsense test instance show a throughput limit of 1.6 Gbps with the CPU not fully loaded (only 2 of the 4 cores in the VM being used). Adding traffic flow analysis and Suricata into the mix, I am not sure how hardware like that sold by Deciso would reach 5 Gbps with the current OPNsense version. What is the big difference we are missing?
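In case it helps with reproducing: to rule out a single-stream (and thus single-core) limit, testing with several parallel iperf3 streams is the obvious approach; a sketch, with <serverIP> as a placeholder for a host on the far side of the firewall:

# four parallel TCP streams for 30 seconds
iperf3 -c <serverIP> -P 4 -t 30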
#10
Quote from: mimugmail on April 15, 2021, 08:51:22 PM
Quote from: rmayr on April 15, 2021, 05:48:44 PM
Indeed. This happens not only for LAN->WAN traffic, but also for traffic between two different internal segments (e.g. LAN and DMZ) with no NAT involved and only directly connected routes in use. I have not yet tried VTI instead of policy-based IPsec, but this issue may make OPNsense a non-starter for the intended production use at our university institute (which is why I am now spending far too much time putting OPNsense through such tests).

You really want to run a university institute in production with an APU device??  :o

No, not on an APU - this is my test device for tracking down some of the issues in parallel to a VM installation (which actually seems to have the same performance issues). We would only put it into production on faster hardware, but I don't expect such bottlenecks to necessarily disappear there. We are aiming for at least 2-3 Gbps, ideally 5 Gbps, of throughput between some of the segments, definitely need IPsec and flow analysis, and would like (but don't strictly require) IDS/IPS enabled. Given our experience so far, I am not sure how likely that is.
#11
Quote from: Ricardo on April 15, 2021, 03:29:45 PM
That enabling policy-based IPsec immediately halves the throughput, even if the traffic bypasses the VPN tunnel, is very concerning. I also have some policy-based VPN tunnels, so this may further limit my WAN speed even for traffic that is not routed into the VPN tunnel. Big mess, I have to say, and years can pass without resolution :(

Indeed. This happens not only for LAN->WAN traffic, but also for traffic between two different internal segments (e.g. LAN and DMZ) with no NAT involved and only directly connected routes in use. I have not yet tried VTI instead of policy-based IPsec, but this issue may make OPNsense a non-starter for the intended production use at our university institute (which is why I am now spending far too much time putting OPNsense through such tests).
#12
Further data points: having flowd_aggregate running (with all local VLAN interfaces monitored) costs around 50 Mbps of throughput when samplicate is stopped, and about 250 Mbps when both are running. But this part is, if not good, then at least explainable, as it certainly adds CPU load. The IPsec-related throughput drop for streams not hitting IPsec tunnels (which stacks with the netflow drop, i.e. with both enabled I only average around 250 Mbps total throughput) is what puzzles me.
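To attribute the netflow part of the drop, the CPU time of the two daemons can be watched while a test runs; a quick sketch:

# check how much CPU flowd_aggregate and samplicate consume under load (%CPU column)
ps auxww | egrep 'flowd|samplicate'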
#13
And just to confirm: yes, there are two hosts on different sides of the box, one iperf3 server, one client.

coreboot has been updated to the latest available version. PowerD is running and normally set to Hiadaptive, as I actually want to save some power during the long stretches with little traffic. A quick comparison doesn't show a measurable difference between Hiadaptive and Maximum, though performance drops when I disable PowerD altogether (probably confirming the suspicion that the CPU is stuck at 600 MHz without it running).
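For anyone wanting to verify the 600 MHz suspicion directly, the cpufreq sysctls can be read while a test runs; a sketch, assuming the cpufreq driver is attached:

# current core frequency and the levels powerd can step through
sysctl dev.cpu.0.freq dev.cpu.0.freq_levels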
#14
Update: I found the culprit for the drop of more than 1/3 in throughput: merely enabling IPsec (with configured tunnels up and running) drops locally routed performance from 750-800 Mbps to 500 Mbps for traffic that doesn't go through the tunnel. This is with IPsec policies, not with a virtual tunnel device (VTI).
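One plausible explanation: with policy-based IPsec, every forwarded packet is looked up against the security policy database, whether or not it matches a tunnel. The installed policies can be inspected with setkey; a sketch:

# dump the security policy database that every packet is matched against
setkey -DP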
#15
I have been struggling with performance on the APU4. While in initial testing I was able to get around 700 Mbit/s with 2 iperf3 streams, with my fully configured firewall rule set (but minimal rules on the actual path I am testing) I am now down to around 250 Mbit/s and can't get it any higher.

Settings from this thread, from https://www.reddit.com/r/homelab/comments/fciqid/hardware_for_my_home_pfsense_router/fjfl8ic/, and from https://teklager.se/en/knowledge-base/opnsense-performance-optimization/ have all been applied, and I am not sure when the performance drop occurred.

What is the best way to debug what's going on here? This is quite frustrating, as I know the hardware to be capable of full Gbit/s routing.
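For reference, the obvious first checks I know of on the FreeBSD side would be watching where the CPU time goes while iperf3 runs; a sketch with stock tools:

# per-CPU load including kernel and interrupt threads
top -SHP
# interrupt rates per device, to spot a saturated NIC queue
vmstat -i
# per-second traffic totals, to watch for drops during a test
netstat -hw 1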