[SOLVED] ssh connection between internal networks dropped after 30-60 seconds

Started by mfalkvidd, April 08, 2022, 01:44:30 PM

I don't think this is specific to ssh connections, but ssh is where I am experiencing the problem.

ssh connection from a host on [LAN] to a host on [Untrusted] is dropped after 30-60 seconds.

In the ssh window, I run `watch uptime` to generate activity every 2 seconds. I can see the terminal updating every 2 seconds for a while, before it freezes. After some more time, my ssh client exits with `client_loop: send disconnect: Connection reset by peer`.

See attached images for data from Firewall: Log Files: Live View.

My setup
internal network [LAN]: 192.168.32.0/24
internal network [Untrusted]: 192.168.33.0/24

Firewall:Rules:LAN has "Default allow LAN to any rule" which allows any IPv4 traffic from LAN to anywhere

If I ssh from LAN to a server on the internet, the connection works fine.

The host on the [Untrusted] network sees the real IP of the [LAN] host; no NAT is performed.

Questions:
1. Why does the initial ssh connection hit the "let out anything from the firewall itself" rule? I thought it would hit the "Default allow LAN to any rule" rule.
2. Isn't opnsense supposed to keep track of the allowed connection, until the connection is closed? Why does the connection hit the "Default deny rule" after a while?


Firewall:Diagnostics:States claims that the connection has been closed. But I can still use the connection when this happens (for a few more seconds). See the image below.

Try setting Firewall Optimization (Firewall -> Settings -> Advanced) to conservative ...

Thanks! That does seem to do the trick. The ssh connection has been working for 8 minutes now. Before, it never lasted more than about a minute.

I'm a bit curious why this setting works, and how much impact it will have on the firewall performance.

I did not notice any impact. But we use this on quite performant hardware, so on smaller devices the experience might differ ...

Seems like I spoke too soon. The ssh connection was dropped after 15 minutes. Doing another test now to verify if the same happens again.


Yup, got kicked off again after 15 minutes.
# pfctl -st
tcp.first                  3600s
tcp.opening                 900s
tcp.established          432000s
tcp.closing                3600s
tcp.finwait                 600s
tcp.closed                  180s
tcp.tsdiff                   60s
udp.first                   300s
udp.single                  150s
udp.multiple                900s
icmp.first                   20s
icmp.error                   10s
other.first                  60s
other.single                 30s
other.multiple               60s
frag                         30s
interval                     10s
adaptive.start                0 states
adaptive.end                  0 states
src.track                     0s

LGTM ...
Maybe other settings mess it up (check other settings in Firewall -> Settings -> Advanced)
Is there another device involved in between?
Otherwise you can play with the ClientAlive* settings on the ssh server side and/or ServerAlive* settings on the ssh client side ...

I have now verified multiple times that the connection is dropped after approximately 15 minutes.

The tcp.opening parameter is 900s, which is 15 minutes. I wonder if that is a coincidence.

If I change back to "normal", tcp.opening becomes 30s, which matches pretty well with the timeouts I had in the beginning.

So it seems the firewall closes the connection after the tcp.opening timer expires, instead of identifying the traffic as established.
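The comparison between the observed drop time and the configured timers can be done mechanically. The following is only a rough sketch: it parses the TCP lines of the `pfctl -st` output posted above and reports which timer lies closest to the roughly 15 minutes observed. The function names `parse_timeouts` and `closest_timeout` are made up for this illustration, and only the tcp.* entries are considered because the dropped connection is TCP.

```python
import re

# TCP lines copied from the `pfctl -st` output above (conservative optimization).
PFCTL_OUTPUT = """\
tcp.first                  3600s
tcp.opening                 900s
tcp.established          432000s
tcp.closing                3600s
tcp.finwait                 600s
tcp.closed                  180s
tcp.tsdiff                   60s
"""


def parse_timeouts(text):
    """Return a dict mapping each timeout name to its value in seconds."""
    timeouts = {}
    for line in text.splitlines():
        match = re.match(r"^(\S+)\s+(\d+)s$", line.strip())
        if match:
            timeouts[match.group(1)] = int(match.group(2))
    return timeouts


def closest_timeout(timeouts, observed_seconds):
    """Return the name of the timeout nearest to the observed drop time."""
    return min(timeouts, key=lambda name: abs(timeouts[name] - observed_seconds))


timeouts = parse_timeouts(PFCTL_OUTPUT)
# The connection was dropped after approximately 15 minutes.
print(closest_timeout(timeouts, 15 * 60))
```

A match like this is only a hint, not proof; it suggests which state the firewall believes the connection is in when it expires.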

I don't think ClientAlive/ServerAlive settings will make any difference, since there is continuous traffic on the ssh connection. Those settings (afaik) are only applicable if you leave the terminal idle. Besides, if I move the host to the same lan, there is no need to tweak the settings.

There are two managed (VLAN-capable) switches in between. But would they make a difference, given that changing the timeouts on the firewall changes how soon the connection is dropped?

tcp.opening                 900s
... is the same here and I have no problems ...
If you start ssh with '-vvv' do you see any keepalive packets?

I usually deploy a client keepalive sshd setting to my servers via ansible
ansible main ± git grep ClientAlive
roles/bootstrap/templates/sshd_config.j2:ClientAliveInterval {{ ssh_ClientAliveInterval | default(15) }}
roles/bootstrap/templates/sshd_config.j2:ClientAliveCountMax {{ ssh_ClientAliveCountMax | default(3) }}

This, in combination with the conservative settings mentioned above, has worked for me for years ... so maybe you should give the ClientAlive stuff a try?

I added ClientAliveInterval 15 and ClientAliveCountMax 3 to sshd_config on the remote host and restarted sshd. Keepalives are sent every 15 seconds, but the firewall still expires the connection:
debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
client_loop: send disconnect: Connection reset by peer



Edit: same, but with the conservative setting:

Apr 08 17:58:34 debug1: Sending command: while true; do uptime; sleep 60; done
Apr 08 17:58:34  17:58:33 up 29 days, 22:35,  1 user,  load average: 0.57, 0.16, 0.04
Apr 08 17:58:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 17:59:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 17:59:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 17:59:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 17:59:34  17:59:33 up 29 days, 22:36,  1 user,  load average: 0.21, 0.13, 0.04
Apr 08 17:59:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:00:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:00:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:00:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:00:34  18:00:33 up 29 days, 22:37,  1 user,  load average: 0.12, 0.12, 0.04
Apr 08 18:00:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:01:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:01:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:01:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:01:34  18:01:33 up 29 days, 22:38,  1 user,  load average: 0.04, 0.10, 0.03
Apr 08 18:01:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:02:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:02:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:02:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:02:34  18:02:33 up 29 days, 22:39,  1 user,  load average: 0.02, 0.08, 0.03
Apr 08 18:02:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:03:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:03:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:03:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:03:34  18:03:33 up 29 days, 22:40,  1 user,  load average: 0.00, 0.06, 0.02
Apr 08 18:03:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:04:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:04:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:04:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:04:34  18:04:33 up 29 days, 22:41,  1 user,  load average: 0.05, 0.06, 0.02
Apr 08 18:04:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:05:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:05:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:05:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:05:34  18:05:33 up 29 days, 22:42,  1 user,  load average: 0.02, 0.05, 0.02
Apr 08 18:05:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:06:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:06:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:06:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:06:34  18:06:33 up 29 days, 22:43,  1 user,  load average: 0.00, 0.04, 0.01
Apr 08 18:06:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:07:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:07:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:07:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:07:34  18:07:33 up 29 days, 22:44,  1 user,  load average: 0.00, 0.03, 0.00
Apr 08 18:07:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:08:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:08:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:08:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:08:34  18:08:33 up 29 days, 22:45,  1 user,  load average: 0.05, 0.04, 0.00
Apr 08 18:08:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:09:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:09:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:09:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:09:34  18:09:33 up 29 days, 22:46,  1 user,  load average: 0.02, 0.03, 0.00
Apr 08 18:09:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:10:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:10:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:10:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:10:34  18:10:33 up 29 days, 22:47,  1 user,  load average: 0.00, 0.02, 0.00
Apr 08 18:10:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:11:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:11:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:11:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:11:34  18:11:33 up 29 days, 22:48,  1 user,  load average: 0.00, 0.02, 0.00
Apr 08 18:11:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:12:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:12:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:12:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:12:34  18:12:33 up 29 days, 22:49,  1 user,  load average: 0.00, 0.01, 0.00
Apr 08 18:12:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:13:04 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:13:19 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:13:34 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:13:34  18:13:33 up 29 days, 22:50,  1 user,  load average: 0.00, 0.00, 0.00
Apr 08 18:13:49 debug1: client_input_channel_req: channel 0 rtype keepalive@openssh.com reply 1
Apr 08 18:14:08 client_loop: send disconnect: Connection reset by peer

A case of asymmetric routing, possibly?

Thanks @pmhausen, that was it! The remote host was also connected to the LAN through its wireless adapter, so all its replies to the LAN host were delivered directly, without passing through opnsense. The firewall only ever saw one direction of the connection, so no wonder the state table couldn't keep up.

After turning off the wlan adapter, I can use ssh without keepalives and with Firewall Optimization set to "normal".
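For anyone hitting the same symptom, here is a simplified sketch of why the second adapter caused asymmetric routing. It only checks whether the peer's address falls inside a subnet the host is directly attached to; real routing tables use longest-prefix matching and metrics, which this ignores. The interface names `eth0` and `wlan0`, the client address 192.168.32.10, and the function `reply_path` are assumptions for illustration; only the two subnets come from the setup described at the top of the thread.

```python
import ipaddress


def reply_path(interface_networks, peer_ip, gateway_label="via firewall"):
    """Return the interface a reply to peer_ip would leave on.

    If the peer is inside a directly attached subnet, the reply goes out
    on that interface; otherwise it is sent to the default gateway.
    """
    peer = ipaddress.ip_address(peer_ip)
    for name, cidr in interface_networks.items():
        if peer in ipaddress.ip_network(cidr):
            return name
    return gateway_label


# Hypothetical ssh client on [LAN].
client_ip = "192.168.32.10"

# Remote host with the wireless adapter still joined to [LAN].
with_wlan = {"eth0": "192.168.33.0/24", "wlan0": "192.168.32.0/24"}
# Remote host after the wireless adapter was turned off.
without_wlan = {"eth0": "192.168.33.0/24"}

print(reply_path(with_wlan, client_ip))     # replies bypass the firewall
print(reply_path(without_wlan, client_ip))  # replies return through the firewall
```

In the first case the firewall sees the client's packets but never the server's replies, which fits the states expiring on the tcp.opening timer instead of being treated as established.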