UPDATE: The issue was a Zenarmor policy that was blocking SSH connections.
I recently set up my own Opnsense server at home, and everything has been going great except for a couple small issues. The main one is that I cannot SSH from a device on one subnet to a device on another subnet. Before I explain the issue in detail, I will explain my current network setup.
My network is as follows:
- Internet comes into house to ISP router (subnet 192.168.0.1/24).
- Opnsense box:
  - WAN: Attached to ISP router (address 192.168.0.10)
  - LAN1: Goes to switch for wired network (subnet 192.168.10.1/24).
  - LAN2: Goes to wireless AP for wireless network (subnet 192.168.11.1/24).
I have a box with SSH open on the LAN1 subnet, and I realized I could not SSH into it from my laptop on the LAN2 network (wireless). The specific error shown is "Connection closed by 192.168.10.x port 22". Now if I instead SSH from a machine on the LAN1 subnet (same as the SSH server) it works with no issue. I assumed at first that a firewall rule was blocking it, but I have confirmed from the logs that the firewall is passing the traffic correctly.
To make it more interesting, I observe similar behaviour when I try and SSH from my laptop (LAN2) to my cloud server which has a public IP address. If I try and SSH from LAN2, I get the same "Connection closed" error, but if I instead connect to the wireless hotspot on my phone and SSH again, there is no issue.
Because of these symptoms, I am inclined to believe this is a NAT issue, but I am confused for a couple reasons:
Reason 1: There is connectivity between subnets. I can ping between subnets no problem, and I even set up a netcat tunnel on the SSH port between the machines on separate subnets without issue.
Reason 2: My previous network setup had my AP in router mode with the ISP router plugged into its WAN port, so I did not have any separate subnets like I do now. NAT-wise, though, it was pretty much the same setup as far as the SSH connection to my cloud server goes, and that connection did work on that setup.
In my NAT configuration, I have the following rules:
- Interface: WAN, Source: Loopback net, Destination: *, NAT address: Interface address.
- Interface: WAN, Source: LAN net, Destination: *, NAT address: Interface address.
- Interface: WAN, Source: LAN2 net, Destination: *, NAT address: Interface address.
- Interface: LAN, Source: LAN net, Destination: LAN2 net, NAT address: Interface address.
- Interface: LAN2, Source: LAN2 net, Destination: LAN net, NAT address: Interface address.
Am I missing something here? What stumps me is that if I can successfully create a netcat tunnel between the two devices, why would SSH not work? I am relatively new to firewalls and opnsense so any help is appreciated. Thanks.
I would not create NAT rules for LAN1 and LAN2, but instead create firewall rules and set up routing. The way you are doing it will probably confuse the machines, because the IPs are NATed both ways, which cannot work.
Outbound NAT is only for WAN interfaces where you have to "hide" behind the one routable IP your ISP gives you. The direction LAN -> LAN2 should work right from the start, because by default there is an "allow all" rule in place for the first LAN. For any new LAN, you must create it yourself or be more specific in what you want to allow.
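The routing point above can be illustrated with a minimal sketch, assuming the two /24 networks from the opening post. The 192.168.10.20 and 192.168.11.50 addresses are hypothetical hosts; 192.168.10.91 is the SSH server mentioned in the thread. Hosts on the same network reach each other directly, while hosts on different networks hand the packet to their gateway, which routes it without any address translation.

```python
import ipaddress

# Subnets as described in the opening post (LAN1 and LAN2 behind the firewall).
LAN1 = ipaddress.ip_network("192.168.10.0/24")
LAN2 = ipaddress.ip_network("192.168.11.0/24")


def same_subnet(src: str, dst: str, networks=(LAN1, LAN2)) -> bool:
    """Return True if src and dst both fall inside one of the listed networks."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    return any(src_ip in net and dst_ip in net for net in networks)


# 192.168.10.20 and 192.168.11.50 are hypothetical client addresses.
print(same_subnet("192.168.10.20", "192.168.10.91"))  # True: delivered directly on LAN1
print(same_subnet("192.168.11.50", "192.168.10.91"))  # False: routed through the firewall
```

In the second case the firewall only needs a pass rule on the LAN2 interface; the source address stays 192.168.11.50 all the way to the server.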
@meyergru Thanks for the great explanation. I see what you are saying. Now that I think about it, you're right, I shouldn't need NAT between the two interfaces. I have now removed those NAT rules. I had already set up a LAN2-to-any firewall rule to match the auto-generated LAN rule of the same kind. They are as follows:
- Interface: LAN, Protocol: IPv4 *, Source: LAN net, Destination: *.
- Interface: LAN2, Protocol: IPv4 *, Source: LAN2 net, Destination: *.
That should be sufficient, right? I can even see the firewall passing this traffic in the logs, but it's still not working. To me (someone inexperienced with networking) this seems like it should be working, and I can't find the root of the issue. My initial assumption was that something in the SSH configuration was not set up correctly, but I have SSH configs for these hosts that worked no problem before I set up this firewall, and SSH from the same subnet works without issue; it's only across subnets that it is problematic. So to me it seems like it has to be something firewall related. I am interested to know your thoughts.
If you can access the ports with tools like netcat or nmap, you should be good network-wise. But also, there might be some other local firewall rules in place from things like ufw, network restrictions that only allow connections from your local subnets or maybe wrong ciphers when the source and destination disagree on what ciphers are allowed. The latter would apply only to SSH traffic, obviously.
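For anyone without netcat or nmap at hand, a rough equivalent of that port check is sketched below. It is a minimal sketch, assuming only the Python standard library; `tcp_port_open` is a hypothetical helper name. Note that it only proves the TCP handshake completes. It says nothing about whether the SSH protocol exchange that follows will survive.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling `tcp_port_open("192.168.10.91", 22)` from a LAN2 client would be the cross-subnet check discussed here.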
Are you still getting this connection closed error or does ssh just hang on the client side?
If you had enabled logging on your IN rules on the LAN2 interfaces, you would see an entry in the log first.
IN on LAN2 (from the client), OUT on LAN (towards the server).
Observing reply traffic is a little more difficult but not that hard with SSH.
Head to Interfaces > Diagnostics > Packet Capture and filter on port 22.
You should observe the packets triggering the log entries above, then the replies back (from the server on LAN, to the client on LAN2)...
If you don't see a reply hitting LAN (from the server), you'll have to dig deeper on that server.
@meyergru I agree that it should be fine network-wise at this point. No firewalls are set up on the servers, and if I connect my laptop to the LAN network instead (via wired ethernet) I can SSH no problem. It's certainly stumping me.
@EricPerl Yes I am still getting the connection closed error. Specifically "Connection closed by 192.168.10.91 port 22" in which 192.168.10.91 is the SSH server on the LAN network. I ran the diagnostics like you recommended and as far as I can tell it seems like the reply is getting back from LAN to LAN2. I've attached images of the diagnostic run for you to look at. One is for the LAN network and the other is for LAN2 (OPT1). It seems they are sorted by interface name rather than timestamp but looking at the timestamps seems to show a reply being sent. I am interested to know your thoughts. Thanks.
@user88 please try and report on ssh -vvv <login>
It looks to me like Opnsense is fine, by your report, while the message is typical of incorrect credentials although it may be something else.
Hi @passeri, thanks for the reply. I have attached an image of the ssh -vvv output you requested. By my own analysis, it seems that it's failing in the key exchange identification function. I was able to find a related post (https://serverfault.com/questions/1015547/what-causes-ssh-error-kex-exchange-identification-connection-closed-by-remote) on serverfault in which the chosen answer describes the root cause of the error message. So it seems that based on this error, the client side determined that there was no process listening on the other side of the connection, whether that's because the server closed it or something else did.
What's odd is that the auth logs at "/var/log/auth.log" on the SSH server show "Connection closed by <client ip address> port 52958 [preauth]" when I try to SSH from the client. So it seems each side is saying that the other closed the connection. Obviously both can't be true, so either the error wording is misleading or maybe something else is closing the connection? This would be far less puzzling if I wasn't able to SSH into this server easily from the LAN subnet. Is there any other process that could be interfering with this? I do have Zenarmor running but I don't see how that could interfere with it.
I believe you have no matching key. Observe you have "type -1" after each test of a key or certificate type. If I run the same command to SSH into my server on a different subnet through Opnsense I get a "type 0" for id_rsa and login succeeds at the point where your connection fails.
You will see in the link you provided, where there is successful connection and subsequent failure from another cause, that the first listing has "/id_rsa type 0" hence it found a matching key and could continue. -1 is the standard POSIX error code.
The fact you can login from another subnet is a distraction. It offers no guarantee that you have keys set up correctly on the machine on the second subnet. I would check all of those carefully, or try moving the working machine across.
I am making a best estimate here from the information. There are other possibilities. My "not a guru" status is unthreatened. :-)
The thing is I am not using an SSH key as authentication. This SSH server uses password auth. When I say I can successfully login from another subnet, I mean that when I take my laptop which is unable to SSH from subnet 1 and turn off the wifi and connect it to subnet 2 (via ethernet), I can successfully SSH into the SSH server that resides on subnet 2.
Looking at the -vvv output for SSH on this successful connection from subnet 2, I see that the type is also showing as -1 (error) because each of the keys it's checking are for other servers, and this SSH server uses password auth, which it eventually defaults to when I SSH from subnet 2, but for some reason it never reaches this stage when initiating from subnet 1.
To make things more interesting, I just tried to push some changes to my github over SSH and I get the following error:
Connection closed by 140.11.121.3 port 22
fatal: Could not read from remote repository.
This issue persists whether I am connected to subnet 1 or subnet 2. But if I instead connect to the wireless network from my ISP's router, bypassing my opnsense firewall, I can push over SSH with no problem, and SSH to my cloud server as well with no issue. So something has to be wrong with my opnsense setup. If it's not the firewall rules and it's not the NAT configuration, what could it be?
My erroneous assumption about use of keys, sorry.
I will go back to read again. The new test about accessing Github is interesting although the difference in error message also creates some uncertainty.
Given I and others cross subnets using SSH through Opnsense without a problem, there should be a comparison, test or review we can make somewhere to locate your issue.
I know that you can forbid password authentication for SSH for specific users or root. In the latter case, you would get that message if you did not provide a key.
You can also forbid password authentication for different subnets, for example with something like "Host 192.168.10.*", see the ssh_config(5) manpage. I would suggest looking at the SSHD configuration on the host. The specific setting can be hidden in different places, e.g. /etc/ssh/sshd_config, /etc/ssh/sshd_config.d/*.conf, ~user/.ssh/....
Guessing from the "fatal: Could not read from remote repository." message, your SSH host is a local Git repository, so I bet the configuration is more complex than you might think.
FWIW, I bet this is not a network problem any more.
Quote from: meyergru on March 10, 2025, 08:32:09 AM
FWIW, I bet this is not a network problem any more.
This ☝️
As soon as you see verbose output when you try to connect, the network part is out of the way. Info flows forth and back.
This is confirmed by packet capture.
If configuration requires the use of keys when accessing from a different network, you know what to do...
I agree with your sentiment but I am having a really hard time deciphering exactly what else could be the problem. The ssh configuration on this server is extremely basic. The only uncommented lines of the configuration are as follows:
Include /etc/ssh/sshd_config.d/*.conf
Port 22
PasswordAuthentication yes
KbdInteractiveAuthentication no
UsePAM yes
X11Forwarding yes
PrintMotd no
AcceptEnv LANG LC_*
Subsystem sftp /usr/lib/openssh/sftp-server
In fact, the only thing I changed in the SSH config after the fresh install of Ubuntu was uncommenting "Port 22" and "PasswordAuthentication yes".
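Listing the uncommented lines by hand is easy to get wrong, so here is a minimal sketch of how that listing could be produced. `active_directives` is a hypothetical helper, and it does not follow the Include line, so anything under /etc/ssh/sshd_config.d/ would still need to be checked separately. Running `sshd -T` on the server (usually as root) should print the effective configuration, which is probably the more reliable check.

```python
from typing import List, Tuple


def active_directives(config_text: str) -> List[Tuple[str, str]]:
    """Return (keyword, arguments) for every line that is neither blank nor a comment."""
    directives = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # keyword, then the rest of the line
        keyword = parts[0]
        arguments = parts[1].strip() if len(parts) > 1 else ""
        directives.append((keyword, arguments))
    return directives


# Shortened sample based on the configuration quoted above.
sample = """
#PermitRootLogin prohibit-password
Include /etc/ssh/sshd_config.d/*.conf
Port 22
PasswordAuthentication yes
"""
for keyword, arguments in active_directives(sample):
    print(keyword, arguments)
```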
@meyergru The github host is not local, it is the actual github.com server. I certainly was able to push to it before installing my opnsense firewall, and nothing else has changed on my network except the addition of the firewall. To push my changes, I could not even push from the LAN network after connecting my laptop to it, I had to connect to my ISP access point and I was able to push with no issues.
I am no network expert, but it seems like the issue only presents itself when the destination is outside the current subnet, which keeps making me think it's a NAT issue. I plan to set up a fresh SSH box on my LAN2 network and try to SSH from the LAN network. If that doesn't work, I don't see how it could be anything other than a network problem.
Hmm, after the IP_TOS output, there's this:
debug1: Connection established.
If you didn't have any connectivity (e.g. change the port), you wouldn't get that.
A bit further down, your output diverges from mine here:
debug1: Local version string SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.5
debug1: Remote protocol version 2.0, remote software version OpenSSH_9.6p1 Ubuntu-3ubuntu13.5
I don't see the remote part.
You might want to compare with output when it works.
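To make that comparison less of an eyeball exercise, the milestones quoted above can be checked mechanically. This is a minimal sketch, assuming the three debug1 marker strings shown in this post; the stage names are made up for illustration and the sample log is abbreviated.

```python
from typing import List

# Marker strings taken from the ssh -v lines quoted above; stage names are illustrative.
STAGES = [
    ("tcp_connected", "debug1: Connection established."),
    ("local_version_logged", "debug1: Local version string"),
    ("remote_version_received", "debug1: Remote protocol version"),
]


def stages_reached(verbose_output: str) -> List[str]:
    """Return the names of the handshake milestones present in an ssh -v log."""
    return [name for name, marker in STAGES if marker in verbose_output]


# Abbreviated log resembling the failing case described in this thread.
failing_log = """debug1: Connection established.
debug1: Local version string SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.5
Connection closed by 192.168.10.91 port 22
"""
print(stages_reached(failing_log))  # ['tcp_connected', 'local_version_logged']
```

A log from a working session should additionally report `remote_version_received`; if it is missing, the connection died before the server's version string reached the client.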
I did not follow all the details (sorry) but this smells like elves ... er ... like some overlooked NAT rule that leads to the SSH connection going to a different server to me.
Is it possible that your client is now jailed by the server because it didn't complete auth?
https://www.openssh.com/txt/release-9.8
Check out the new features section.
Of course, I don't know that the server is 9.8...
It does feel like something NAT related, but then again I can create a netcat tunnel on port 22 no problem and send data back and forth...
OpenSSH version is 9.6p1, but I don't think it could be that anyways because I can SSH in successfully by just moving my laptop from the wireless network (LAN2) to the wired network (LAN1). To me, that removes all possibility for issues with the server or client config and leaves something else in the middle. What that is, I don't know yet. Everything seems to be checking out except it's still not working...
The packet captures didn't indicate anything like NAT issues. IP:port was equal on both interfaces.
The only thing worth checking there was that MAC addresses matched NICs involved.
If your server is 9.6p1 (like mine), then it can't be that. Your client showed 9.8 so I thought it might apply to your server as well.
Apparently, for that feature, the penalty is tracking clients by IP/subnet so changing network would be sufficient to clear that hurdle.
I don't know how early the connection is dropped.
MTU issue? As a quick test, try lowering the MTU on the client manually and see if SSH works...
LibreSSL incompatibility perhaps? I suggest putting another SSH machine that uses OpenSSL instead of LibreSSL on the same network segment and trying again.
The debug info seems to suggest this is not a networking problem, given the successful connection messages and the unsuccessful ones relating to keys not matching. The successful connection from within the same segment says otherwise, I get that. An interesting one. I would keep diagnosing the application nevertheless. The usual elimination process applies ;)
Do you have a port forward from WAN to port 22 that could be misconfigured and catches your SSH connections?
No luck with the MTU unfortunately. I have no port forward on the WAN for port 22.
@cookiemonster What part of the setup concerns you with libressl? My limited knowledge seems to recall that it's an OpenBSD fork of OpenSSL. My laptop and server both use OpenSSL as far as I know. Maybe I'm missing something?
User88, I re-read your opening post. What happens if you disconnect the AP from Opnsense, connecting your computer directly to that interface? I did not spot that you had tried that.
Also, is your OPNSense default apart from configuring interface IPs and a single firewall rule allowing LAN2 net to Any? On your description nothing more should be required for operation.
Sorry, TL;DR, but... have we confirmed that the ssh connection is actually making it to the intended destination host? Could do this with either a packet capture (on the ssh server itself), or maybe ssh server logs? I'm leaning towards some sort of NAT redirect now too....
My read of reply #5 (per my request) was that some traffic was flowing forth and back through both interfaces involved, without any NAT or PF.
I suggested to check if the MAC addresses matched systems involved. Not confirmed yet.
Reply #7 has a Connection established line (right before the big white block) that matches. IME, you don't get there with broken connectivity.
It somehow seems to fail early during key exchange, not even displaying the remote version.
Apparently, the same systems can complete the connection when on the same LAN.
IMO, that eliminates issues around incompatible clients.
I thought I was onto something when I noticed 9.8 and the new fail2ban feature but the server is 9.6p1...
I'm out of ideas.
Given it is known to work on the same subnet, I proposed above to simplify by eliminating the wireless access point and configure the most basic working Opnsense. That will eliminate doubt or distractions around the actual problem. In their recent post user88 was still referring to a potential NAT problem, which (if it or anything similar exists) must lie in configuration or in the wAP.
@dseven Yes, we have confirmed that. I did a packet capture (shown on the previous page of this thread) and it does reach the destination server. @EricPerl I just went to double check and yes the MAC addresses all match up with what they should be so that must not be the issue.
@passeri I will try disconnecting the AP and just connecting my laptop tonight. Sorry for the late replies, have been busy. Will update soon.
I just tried connecting my laptop directly to the LAN2 port of my opnsense box. SSH still disconnects the same as before when I try to connect to my server on LAN1. There is something that is wrong with the whole opnsense here in my opinion, because not only is it doing this across subnets, but I cannot use SSH to push code to github.com either. It fails with the same symptoms. So basically if the SSH connection goes through an interface then it doesn't work.
As I said, I'm out of ideas as to what's out of whack with your entire setup, but SSH is in fact the first use case I typically try when troubleshooting.
Port forwarding, inter-VLAN, VPN home... On Ubuntu, Windows, a bunch of networking devices and so on.
I've had issues with incompatible crypto and password availability. Nothing like your issue.
@user88, that clarifies to a point. For the other half, have you tried with a fresh, purely default Opnsense?
Hmm, after additional research, I'm no longer sure the "Connection established" log indicates a successful roundtrip.
Couple more ideas:
1. Check /etc/hosts.allow & /etc/hosts.deny
2. Start a second SSH daemon with full debug on an alternate port, e.g. sshd -ddd -p 2222 (you may need to modify your FW rules, and you also have to specify that port on the client side). The server debug output may reveal interesting data.
3. The packet captures were promising. We only looked at the existence of packets though. The first few may actually be clear since key exchange has not occurred yet. Can you get the content? It would be nice to see the remote version in the high details output.
For example:
[Attached image: Screenshot 2025-03-15 115043 SSH.png]
Interestingly, when using the Windows client, the remote string is sent first. The WSL client does this the other way around.
I saw something along these lines in the code, but I have not looked for the trigger for that behavior.
IMO, seeing a packet with server issued content would exonerate OPN.
Such capture is best done on the interface of the client.
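A quick way to look for server-issued content without reading a capture is to connect and wait for the server's identification line. This is a minimal sketch, assuming only the Python standard library; `read_ssh_banner` is a hypothetical helper, and failures to connect at all are deliberately left to raise so they are not confused with "connected, but no banner arrived".

```python
import socket


def read_ssh_banner(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Return the server's 'SSH-...' identification line, or '' if none arrives."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with sock.makefile("rb") as reader:
            try:
                # The protocol allows other lines before the version string,
                # so skip a bounded number of them.
                for _ in range(20):
                    line = reader.readline(1024)
                    if not line:
                        return ""  # connection closed before a banner was seen
                    text = line.decode("ascii", errors="replace").rstrip("\r\n")
                    if text.startswith("SSH-"):
                        return text
            except OSError:
                return ""  # reset or timed out while waiting for the banner
    return ""
```

Run from the LAN2 client, `read_ssh_banner("192.168.10.91")` returning an empty string while the same call from LAN1 returns something like the `SSH-2.0-OpenSSH_9.6p1 ...` string would point at something between client and server closing the connection.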
Alright, after carefully going through my full configuration of Opnsense it turns out that Zenarmor was the issue. There was a policy configured under "App Controls -> Remote Access" which was blocking SSH. Sure enough I turned it off and everything was fine. A wild goose chase that should never have happened... Thanks to everyone here for all their help and to @passeri for suggesting I just try with a default Opnsense!
If there is a way to mark this as solved, please let me know and I will do that.
Geez... How the hell does this align with the packet captures showing traffic????
I'm aware of the order in which Zenarmor and OPN handle traffic, but for traffic to be visible in OPN and blocked by Zenarmor, packets have to be dropped on the way out... which is pretty weird.
Anyway, I never saw the Zenarmor mention.
It's also the first thing I would have disabled in this scenario.
Overlapping layers of security just make troubleshooting a pain.
Oh, just edit the subject of the OP.
Ya, puzzles me too. Definitely should have just disabled it. Thanks again for the help. Marked as solved!