Messages - sepahewe

#1
I ran into a similar issue.

I have VLANs on a LAGG and I want to enable IDP, but when I do, network connectivity stops. The log shows:
generic netmap attach emulated adapter for lagg0 created

and a bit of googling suggests that the LAGG driver doesn't support netmap, which causes the issue. I then tried to enable it directly on the physical interfaces, but they are not visible in Suricata and I can't assign them in Interfaces because they are busy as members of the LAGG.

Edit: I'm running 23.7.9
#2
Hi,

Having posted a few times in this thread, here's how I resolved my issues.

TL;DR: RTFM, and the mlx4 driver doesn't support RSS on FreeBSD.

I have two dual-port cards involved: an Intel X552 (driver ixgbe) and a Mellanox ConnectX-3 Pro (driver mlx4). When I replaced the ConnectX-3 with a ConnectX-5 (driver mlx5), everything worked.

What fooled me was that, when reading up on the capabilities of mlx4, the information provided, even by Mellanox, isn't always clear about whether it refers to Linux or BSD, so I understood that it should work. The NIC supports RSS and mlx4 on Linux supports RSS, but mlx4 on FreeBSD does not.

mlx5 supports RSS on FreeBSD, so changing NIC to ConnectX-5 solved it.
#3
Hi,

I had some time to kill so I reran my tests. One difference I see is that with RSS enabled the firewall closes its own outgoing connections with an RST.

My firewall is 192.168.192.1 and in my test I ran curl http://192.168.192.30:8123 while capturing packets.

RSS disabled:
19 2.395679 192.168.192.1 192.168.192.30 TCP 74 52726 → 8123 [SYN] Seq=0 Win=65228 Len=0 MSS=1460 WS=128 SACK_PERM TSval=2821030256 TSecr=0
20 2.395947 192.168.192.30 192.168.192.1 TCP 74 8123 → 52726 [SYN, ACK] Seq=0 Ack=1 Win=65160 Len=0 MSS=1460 SACK_PERM TSval=2913189312 TSecr=2821030256 WS=128
21 2.396029 192.168.192.1 192.168.192.30 TCP 66 52726 → 8123 [ACK] Seq=1 Ack=1 Win=65792 Len=0 TSval=2821030256 TSecr=2913189312
22 2.396311 192.168.192.1 192.168.192.30 HTTP 148 GET / HTTP/1.1


RSS enabled:
68 24.248066 192.168.192.1 192.168.192.30 TCP 74 19224 → 8123 [SYN] Seq=0 Win=65228 Len=0 MSS=1460 WS=128 SACK_PERM TSval=187982256 TSecr=0
69 24.248327 192.168.192.30 192.168.192.1 TCP 74 8123 → 19224 [SYN, ACK] Seq=0 Ack=1 Win=65160 Len=0 MSS=1460 SACK_PERM TSval=2911919337 TSecr=187982256 WS=128
70 24.248375 192.168.192.1 192.168.192.30 TCP 66 19224 → 8123 [ACK] Seq=1 Ack=1 Win=65792 Len=0 TSval=187982256 TSecr=2911919337
71 24.248517 192.168.192.1 192.168.192.30 TCP 66 19224 → 8123 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0 TSval=187982257 TSecr=2911919337


With RSS disabled, curl works and the TCP connection is established.

With RSS enabled, the firewall itself kills TCP with an RST. I don't understand why...
#4
Having done some more research and Google-fu, I don't have a solution, but I do have some more insight.

It seems BSD doesn't add RSS support for outgoing connections, and that causes issues for HAProxy. I found others who experienced the same issue and proposed patches for HAProxy, but unfortunately those patches weren't accepted:
https://www.mail-archive.com/haproxy@formilux.org/msg34548.html
https://lists.freebsd.org/pipermail/freebsd-transport/2019-June/000247.html

Both links have the same author. His reasoning seems sound and his patches work for him, but my own 5 cents is that they rely on a symmetric Toeplitz key, even though he doesn't mention that. If my hunch is right, the default Microsoft key won't work well. A symmetric key is also required for Suricata, to ensure each thread sees both the inbound and outbound flows of the same conversation. What he does is use the same RSS hash to pick outgoing ports so that the connection lands on the same CPU and RSS queue that the incoming traffic will hit, thereby not only ensuring the connection works but also avoiding CPU context switches, even in HAProxy.
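To illustrate the point about symmetric keys, here is a minimal sketch of the Toeplitz hash over a TCP/IPv4 4-tuple. It is not taken from the patches or from any kernel; the function names are hypothetical, the "default" key is the widely published one from Microsoft's RSS documentation (worth verifying before relying on it), and the symmetric key is simply the 16-bit pattern 0x6d5a repeated. The addresses and ports come from the capture in the earlier post.

```python
import ipaddress
import struct

# Assumption: the commonly published default RSS key (40 bytes).
MS_DEFAULT_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)
# A key that repeats every 16 bits hashes both directions identically,
# because swapping addresses (32 bits) or ports (16 bits) shifts each
# input bit by a multiple of the key's period.
SYMMETRIC_KEY = bytes.fromhex("6d5a") * 20


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of data under key."""
    key_bits = len(key) * 8
    if key_bits < len(data) * 8 + 32:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i in range(len(data) * 8):
        # For every set input bit (MSB first), XOR in the 32-bit
        # window of the key that starts at bit position i.
        if data[i // 8] & (0x80 >> (i % 8)):
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def tcp4_tuple(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bytes:
    """Pack the hash input: source IP, destination IP, source port, destination port."""
    return (
        ipaddress.IPv4Address(src_ip).packed
        + ipaddress.IPv4Address(dst_ip).packed
        + struct.pack("!HH", src_port, dst_port)
    )


out = tcp4_tuple("192.168.192.1", 19224, "192.168.192.30", 8123)
back = tcp4_tuple("192.168.192.30", 8123, "192.168.192.1", 19224)

for name, key in (("default", MS_DEFAULT_KEY), ("symmetric", SYMMETRIC_KEY)):
    h_out, h_back = toeplitz_hash(key, out), toeplitz_hash(key, back)
    print(f"{name:9s} out={h_out:#010x} back={h_back:#010x} same={h_out == h_back}")
```

With the symmetric key both directions of a flow produce the same hash, so they map to the same RSS bucket. With a non-periodic key the two directions generally hash differently.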

I tried limiting HAProxy to 1 process and 1 thread, hoping that could work as a very quick, if performance-limited, fix, but unfortunately it did not.

You could argue that HAProxy is not the right place to solve this, as it intertwines the layers, but RSS awareness in HAProxy also prevents CPU context switches between net.inet and HAProxy. His patches also hard-code the default hash key. Instead, he should have asked the kernel for the key (I don't know if that's possible today) to make sure the same key is used. To satisfy Suricata's requirement, I would also wish for the possibility of setting a symmetric key.
#5
Hi,

I tried enabling RSS and Suricata works, with a better spread of CPU load and better performance. However, haproxy runs into issues: it can't connect to anything, neither for health checks nor for live traffic. Based on an earlier comment about so_reuseport, I changed my config to simple binds and enabled noreuseport for haproxy, but haproxy still fails to connect.

Successes are very sporadic, around 10%, which is rare enough that a health check never clears. Since I have 8 RSS queues, it is almost as if haproxy only gets traffic from 1 queue, which would amount to a 12.5% success rate.
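The arithmetic behind that guess can be sketched as follows. This is a simplified model, not a measurement: it assumes replies are spread uniformly over the queues, that only one queue delivers to haproxy, and that checks are independent. The function names are hypothetical, and `rise` refers to HAProxy's server option for the number of consecutive successful checks needed to mark a server up (default 2).

```python
from fractions import Fraction


def success_rate(total_queues: int, working_queues: int = 1) -> Fraction:
    """Chance that a connection lands on a queue haproxy can read from."""
    return Fraction(working_queues, total_queues)


def chance_of_clearing(p: Fraction, rise: int) -> Fraction:
    """Chance that `rise` independent checks in a row all succeed."""
    return p ** rise


p = success_rate(8)
print(float(p))                         # 0.125
print(float(chance_of_clearing(p, 2)))  # 0.015625
```

Under these assumptions a single connection succeeds 12.5% of the time, close to the observed ~10%, while two successful checks in a row happen only about 1.6% of the time.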

I've tried all combinations of net.inet.rss.enable and noreuseport, with and without health checks, and success or failure depends entirely on net.inet.rss.enable. The error reported by haproxy is "Layer4 timeout".

Driver: ix
NIC: Intel D-1500 SoC 10 GbE (X552)
OPNsense: 22.1.7_1

I'm more than happy to help with testing, but would appreciate any suggestions on which direction to start in.
#6
I ran into the same issue, and can verify that it is indeed the combination of a spoofed MAC and IDS that triggers it.
#7
Hi,

Thanks for pointing me in the right direction :) While gathering the files you asked for, I noticed the following error in configd.log:

Quote: Script action failed with Command '/usr/local/bin/upsc 'ups@oldupsserver.domain.com''

The hostname is incorrect (oldupsserver.domain.com) and should be localhost. Looking at my config, I noticed that oldupsserver.domain.com was still set for the NUT netclient, and that fooled the diagnostics page into querying the remote NUT server instead of my usbhid UPS.

I did use netclient before, but have now moved my UPS to the firewall instead. When doing so I disabled netclient, then enabled and configured usbhid. Netclient was disabled, but the config line for the remote host was still left in place. Removing the FQDN from the netclient's "IP address" field made everything work.
#8
Hi, I'm reusing this thread as I have the very same problem. I hope that's OK...

The issue is the same: the diagnostics page is empty. NUT itself works, and if I run 'upsc ups@localhost' I get my UPS's info.

Turning on debugging in my browser, I see several requests to upsstatus, and the response is 200 with '{"response":null}' in the payload.

So the network works, the diagnostics page can access upsstatus, and upsc works, but something between the API and upsc fails...

Any suggestions as to where I should look?
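One way to narrow this down is to run upsc the way a backend script would and check both the exit status and the parsed output. The following is only a sketch of that idea, not the plugin's actual code: the function names are hypothetical, the upsc path is the one that shows up in configd.log, and the target should be whatever the diagnostics page is really configured to query.

```python
import subprocess


def parse_upsc(text: str) -> dict[str, str]:
    """Parse upsc's 'variable: value' lines into a dictionary."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            values[key.strip()] = value.strip()
    return values


def query_ups(target: str = "ups@localhost",
              upsc: str = "/usr/local/bin/upsc") -> dict[str, str]:
    """Run upsc against target and return its variables, or raise on failure."""
    proc = subprocess.run([upsc, target], capture_output=True, text=True, timeout=10)
    if proc.returncode != 0:
        raise RuntimeError(f"upsc {target} failed: {proc.stderr.strip()}")
    return parse_upsc(proc.stdout)


if __name__ == "__main__":
    try:
        print(query_ups())
    except (OSError, RuntimeError, subprocess.TimeoutExpired) as exc:
        # An error here, while 'upsc ups@localhost' works by hand, would
        # point at the target or environment the backend uses.
        print(f"query failed: {exc}")
```

If the manual query succeeds but the same call with the configured target fails, the null response most likely comes from the target rather than from NUT itself.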