I don't think this is specific to ssh connections, but ssh is where I am experiencing the problem.
An ssh connection from a host on [LAN] to a host on [Untrusted] is dropped after 30-60 seconds.
In the ssh window, I run `watch uptime` to generate activity every 2 seconds. I can see the terminal updating every 2 seconds for a while, before it freezes. After some more time, my ssh client exits with `client_loop: send disconnect: Connection reset by peer`.
See attached images for data from Firewall: Log Files: Live View.
My setup
internal network [LAN]: 192.168.32.0/24
internal network [Untrusted]: 192.168.33.0/24
Firewall:Rules:LAN has "Default allow LAN to any rule" which allows any IPv4 traffic from LAN to anywhere
If I ssh from LAN to a server on the internet, the connection works fine.
The host on the [Untrusted] lan sees the real IP of the [LAN] host; there is no NAT performed.
Questions:
1. Why does the initial ssh connection hit the "let out anything from the firewall itself" rule? I thought it would hit the "Default allow LAN to any rule".
2. Isn't opnsense supposed to keep track of the allowed connection, until the connection is closed? Why does the connection hit the "Default deny rule" after a while?
(https://forum.opnsense.org/index.php?action=dlattach;topic=27848.0;attach=21569;image)
Firewall:Diagnostics:States claims that the connection has been closed, but I can still use the connection for a few more seconds when this happens. See the image below.
(https://forum.opnsense.org/index.php?action=dlattach;topic=27848.0;attach=21577;image)
Try setting Firewall Optimization (Firewall -> Settings -> Advanced) to conservative ...
Thanks! That does seem to do the trick. The ssh connection has been working for 8 minutes now. Before, it never lasted more than about a minute.
I'm a bit curious why this setting works, and how much impact it will have on the firewall performance.
I did not notice any impact. But we use this on quite performant hardware, so on smaller devices the experience might differ ...
Seems like I spoke too soon. The ssh connection was dropped after 15 minutes. Doing another test now to verify if the same happens again.
can you post the output of pfctl -st?
Yup, got kicked off again after 15 minutes.
# pfctl -st
tcp.first 3600s
tcp.opening 900s
tcp.established 432000s
tcp.closing 3600s
tcp.finwait 600s
tcp.closed 180s
tcp.tsdiff 60s
udp.first 300s
udp.single 150s
udp.multiple 900s
icmp.first 20s
icmp.error 10s
other.first 60s
other.single 30s
other.multiple 60s
frag 30s
interval 10s
adaptive.start 0 states
adaptive.end 0 states
src.track 0s
LGTM ...
Maybe other settings mess it up (check other settings in Firewall -> Settings -> Advanced)
Is there another device involved in between?
Otherwise you can play with the ClientAlive* settings on the ssh server side and/or ServerAlive* settings on the ssh client side ...
I have now verified multiple times that the connection is dropped after approximately 15 minutes.
The tcp.opening parameter is 900s, which is 15 minutes. I wonder if that is a coincidence.
If I change back to "normal", tcp.opening becomes 30s, which matches the timeouts I had in the beginning pretty well.
So it seems the firewall closes the connection after the tcp.opening timer expires, instead of identifying the traffic as established.
I don't think ClientAlive/ServerAlive settings will make any difference, since there is continuous traffic on the ssh connection. Those settings (afaik) are only applicable if you leave the terminal idle. Besides, if I move the host to the same lan, there is no need to tweak the settings.
There are two managed (VLAN-capable) switches in between. But would they make a difference, given that changing the timeouts on the firewall changes how soon the connection is dropped?
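To check whether the 15-minute match is more than a coincidence, the observed drop time can be compared against every TCP timer in the `pfctl -st` output. This is only a minimal sketch: it assumes the output format pasted earlier in this thread, and the function names `parse_timeouts` and `matching_timers` as well as the 10% tolerance are made up for illustration, not part of any pf tooling.

```python
import re

# TCP timers copied from the pfctl -st output posted in this thread
# (Firewall Optimization set to "conservative").
PFCTL_ST = """\
tcp.first 3600s
tcp.opening 900s
tcp.established 432000s
tcp.closing 3600s
tcp.finwait 600s
tcp.closed 180s
tcp.tsdiff 60s
"""

def parse_timeouts(text):
    """Return {timer name: seconds} for lines such as 'tcp.opening 900s'."""
    timeouts = {}
    for line in text.splitlines():
        m = re.match(r"^\s*([a-z]+\.[a-z]+)\s+(\d+)s\s*$", line)
        if m:
            timeouts[m.group(1)] = int(m.group(2))
    return timeouts

def matching_timers(timeouts, observed_seconds, tolerance=0.10):
    """Names of timers within +/- tolerance of the observed drop time."""
    low = observed_seconds * (1 - tolerance)
    high = observed_seconds * (1 + tolerance)
    return sorted(name for name, secs in timeouts.items() if low <= secs <= high)

timeouts = parse_timeouts(PFCTL_ST)
print(matching_timers(timeouts, 15 * 60))  # prints ['tcp.opening']
```

A single matching timer does not prove the cause, but it narrows down which state the firewall believes the connection is in when it expires.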
tcp.opening 900s
... is the same here and I have no problems ...
If you start ssh with '-vvv' do you see any keepalive packets?
I usually deploy a client keepalive sshd setting to my servers via ansible
ansible main ± git grep ClientAlive
roles/bootstrap/templates/sshd_config.j2:ClientAliveInterval {{ ssh_ClientAliveInterval | default(15) }}
roles/bootstrap/templates/sshd_config.j2:ClientAliveCountMax {{ ssh_ClientAliveCountMax | default(3) }}
This, in combination with the conservative settings mentioned above, has worked for me for years ... so maybe you should give the ClientAlive stuff a try?
I added ClientAliveInterval 15 and ClientAliveCountMax 3 to sshd_config on the remote host and restarted sshd. Keepalives are sent every 15 seconds, but the firewall still expires the connection:
debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
client_loop: send disconnect: Connection reset by peer
Edit: same, but with the conservative setting:
Apr 08 17:58:34 debug1: Sending command: while true; do uptime; sleep 60; done
Apr 08 17:58:34 17:58:33 up 29 days, 22:35, 1 user, load average: 0.57, 0.16, 0.04
Apr 08 17:58:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 17:59:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 17:59:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 17:59:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 17:59:34 17:59:33 up 29 days, 22:36, 1 user, load average: 0.21, 0.13, 0.04
Apr 08 17:59:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:00:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:00:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:00:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:00:34 18:00:33 up 29 days, 22:37, 1 user, load average: 0.12, 0.12, 0.04
Apr 08 18:00:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:01:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:01:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:01:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:01:34 18:01:33 up 29 days, 22:38, 1 user, load average: 0.04, 0.10, 0.03
Apr 08 18:01:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:02:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:02:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:02:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:02:34 18:02:33 up 29 days, 22:39, 1 user, load average: 0.02, 0.08, 0.03
Apr 08 18:02:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:03:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:03:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:03:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:03:34 18:03:33 up 29 days, 22:40, 1 user, load average: 0.00, 0.06, 0.02
Apr 08 18:03:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:04:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:04:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:04:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:04:34 18:04:33 up 29 days, 22:41, 1 user, load average: 0.05, 0.06, 0.02
Apr 08 18:04:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:05:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:05:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:05:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:05:34 18:05:33 up 29 days, 22:42, 1 user, load average: 0.02, 0.05, 0.02
Apr 08 18:05:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:06:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:06:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:06:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:06:34 18:06:33 up 29 days, 22:43, 1 user, load average: 0.00, 0.04, 0.01
Apr 08 18:06:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:07:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:07:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:07:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:07:34 18:07:33 up 29 days, 22:44, 1 user, load average: 0.00, 0.03, 0.00
Apr 08 18:07:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:08:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:08:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:08:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:08:34 18:08:33 up 29 days, 22:45, 1 user, load average: 0.05, 0.04, 0.00
Apr 08 18:08:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:09:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:09:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:09:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:09:34 18:09:33 up 29 days, 22:46, 1 user, load average: 0.02, 0.03, 0.00
Apr 08 18:09:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:10:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:10:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:10:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:10:34 18:10:33 up 29 days, 22:47, 1 user, load average: 0.00, 0.02, 0.00
Apr 08 18:10:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:11:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:11:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:11:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:11:34 18:11:33 up 29 days, 22:48, 1 user, load average: 0.00, 0.02, 0.00
Apr 08 18:11:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:12:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:12:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:12:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:12:34 18:12:33 up 29 days, 22:49, 1 user, load average: 0.00, 0.01, 0.00
Apr 08 18:12:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:13:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:13:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:13:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:13:34 18:13:33 up 29 days, 22:50, 1 user, load average: 0.00, 0.00, 0.00
Apr 08 18:13:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:14:08 client_loop: send disconnect: Connection reset by peer
A case of asymmetric routing, possibly?
Thanks @pmhausen, that was it! The remote host was connected to the LAN through the wireless adapter, so all traffic from the remote host was delivered locally, without passing through opnsense. No wonder the state table couldn't keep up.
After turning off the wlan adapter, I can use ssh without keepalives and with Firewall Optimization set to "normal".
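For anyone wondering why the drop lined up with tcp.opening, here is a toy model of the situation. It is not pf's actual algorithm: it simply assumes that a state whose reply direction is never seen by the firewall stays in the "opening" phase and is not refreshed by one-way traffic, which is consistent with the drops reported above but is not taken from pf's source. The function name `first_drop` is hypothetical; the two timer values are the "conservative" ones from the `pfctl -st` output in this thread.

```python
TCP_OPENING = 900         # seconds, from the pfctl -st output above
TCP_ESTABLISHED = 432000  # seconds, from the pfctl -st output above

def first_drop(duration, reply_visible, interval=15):
    """Return the second at which the modelled state expires, or None.

    duration      -- how long the session is simulated, in seconds
    reply_visible -- True if the firewall sees the server's replies
    interval      -- seconds between packets (15 s keepalives here)
    """
    # Assumption: the state only counts as established when the
    # firewall has seen traffic in both directions.
    established = reply_visible
    last_refresh = 0
    for t in range(interval, duration + 1, interval):
        limit = TCP_ESTABLISHED if established else TCP_OPENING
        if t - last_refresh >= limit:
            return t
        if established:
            # Assumption: only a fully tracked state gets its timer reset.
            last_refresh = t
    return None

print(first_drop(3600, reply_visible=True))   # prints None
print(first_drop(3600, reply_visible=False))  # prints 900
```

With replies bypassing the firewall, the model drops the session at 900 seconds no matter how many keepalives are sent, which is the pattern seen in the debug log above.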
Quote from: pmhausen on April 08, 2022, 06:11:01 PM
A case of asymmetric routing, possibly?
Brilliant! Thanks a million. I've been fighting this problem for months. No issue when remote, but internally, my sessions expire quickly. Lo and behold, my system has multiple interfaces, only one of which listens for ssh. Looks like it's time to do some more network tweaking!
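A quick way to spot this kind of setup is to check whether the server has an interface in the same subnet as the client. If it does, the server can answer on-link and the replies never cross the firewall. A minimal sketch using Python's standard `ipaddress` module; the host addresses below are hypothetical (only the two /24 networks come from this thread), and `reply_bypasses_firewall` is an illustrative name:

```python
import ipaddress

def reply_bypasses_firewall(client_ip, server_addresses):
    """True if the server has an interface in the client's own subnet.

    In that case the server can deliver replies directly to the client,
    so the firewall sees the requests but not the answers.
    """
    client = ipaddress.ip_address(client_ip)
    return any(
        client in ipaddress.ip_interface(addr).network
        for addr in server_addresses
    )

# Hypothetical addresses: wired NIC on [Untrusted], wireless NIC on [LAN].
server = ["192.168.33.20/24", "192.168.32.57/24"]

print(reply_bypasses_firewall("192.168.32.10", server))      # prints True
print(reply_bypasses_firewall("192.168.32.10", server[:1]))  # prints False
```

The second call corresponds to the server with its wireless adapter turned off: only the [Untrusted] address remains, so replies have to go back through the firewall.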
Quote from: pmhausen on April 08, 2022, 06:11:01 PM
A case of asymmetric routing, possibly?
I have also found this to be a problem with other protocols. In particular, IMAP and SUBMISSION were affected when I had those services running on a DMZ network with a server that had a NIC on both LANs.
In my case the session dropped at 64K bytes rather than an elapsed time, but I presume this too is related to asymmetric state tracking - or lack thereof.
It's a very subtle problem and hard to solve, so someone should give pmhausen's quote some stars or pin it or something so it stays fresh in our minds.