Show posts
Messages - mickbee

#1
Hi Franco, thanks again for following up!

I tried the upgraded kernel and it doesn't seem to change much, but then again, the VM sits behind a physical OPNsense box which has the issue in question and which I can't really touch (it's several thousand km away with no one to help fix it if it becomes unresponsive).

I have another one next to me which suffers from the same bug, but it needs to stay up for the next 7 hours. I'll try as soon as people go home and report back.

Still, since the patch will already be included, I suppose thorough testing can wait for the weekend?

regards,
m.

#2
I have one device I can test and roll back (a VM) without disrupting some real-time stuff - I can do that tomorrow morning CET and feed back immediately. This is exciting!

@Franco, really appreciate your efforts and help in this space!
#3
Quote from: franco on May 22, 2017, 08:14:26 AM
What's your rule on the IPsec tab? Isn't it easier to use any -> any there?

Hi guys and Franco, any news on this? This has been an issue since last December already and there's still no sign of a fix :(

@Franco, I never thought I'd see the day when someone suggests using any -> any rules on an open-source firewall product's forum!?
#4
Quote from: franco on February 03, 2017, 01:20:51 PM
I missed this thread, sorry :(

Try the IPsec sysctl fix too:

# sysctl net.inet.ipsec.filtertunnel=1

There are some fixes we're testing right now, takes some time to gather conclusive data. But we'll report back soon. The noroute kernel works in the meantime.


Cheers,
Franco

No worries at all Franco, feel free to close the duplicate thread.

I tried the sysctl command and it makes no difference. I'll run the any/any/any rule on the IPsec tab for the time being (as it seems to work) and wait for the kernel-level fix in the next release.
#5
Quote from: franco on February 03, 2017, 12:07:04 PM
It's indeed the commit, thanks for analysis and testing to djGrrr and Martinez!

The issue is a bit tricky. I think we're seeing something new in the network stack. On FreeBSD, packets for specific gateways were hijacked and never saw the rest of the stack, which made them completely unusable with the Captive Portal or Traffic Shaping. Since the routing is now only tagged, there is a priority issue with whether the policy route is being enforced or not. In this case, not so much anymore.

In any case, this kernel will retain the old behaviour:

# opnsense-update -kr 17.1-noroute
# /usr/local/etc/rc.reboot

This is a priority item for 17.1.1 and something that did not come up in testing, all through RC1. djGrrr, do you know why that could be? It's part of a configuration difference that's not clear yet.


Cheers,
Franco

Hi Franco - I reported this while 17.1 was in beta and RC:
https://forum.opnsense.org/index.php?topic=4313.0

In any case, 17.1 stable still has the issue: IPsec rules don't trigger UNLESS I put a rule at the top with any/any/any, but that's kind of beside the point.

I'll try the other kernel tonight once there's no traffic on that one box. I'm holding off on upgrading the few other boxes until this is fixed. Let me know if I can help or provide logs to support the troubleshooting!

Thanks!
#6
16.7 Legacy Series / Re: IPSEC ICMP
January 22, 2017, 08:03:10 PM
Hi John,

Did this get fixed since? Curious, as I've had a similar issue with routing tables being broken on one pfSense (and now OPNsense) VM instance with 2 WAN connections.

thanks!
Mike
#7
17.1 Legacy Series / IPSEC fw rules don't trigger
January 21, 2017, 11:46:05 PM
Hi guys,

Another odd issue I came across; the scenario is as follows:

APU2 running OPNsense 16.7.13-amd64 on FreeBSD 10.3-RELEASE-p14, connected via an IKEv2 LAN-to-LAN IPsec tunnel to a:
Soekris 5501-70 running OPNsense 17.1.r1-i386 on FreeBSD 11.0-RELEASE-p5

The tunnel is up at the time I'm making my tests - this is confirmed by seeing the traffic in the firewall log on the Soekris box. Both boxes' configs have appropriate rules (using aliases) allowing TCP and ICMP from a number of networks to others.

The log confirms that pings arrive at the remote box and get blocked; clicking the green arrow on the log entry creates an easy rule, but even after a filter reload, all ping attempts are still blocked.

Note that the same applies to all (around 10) rules within the IPsec tab - most rely on aliases for source/destination/destination port, but two are IP -> IP / any, and those don't work either.

No other 17.1 boxes (I have 2 more, but on different hardware/VMs and on 10.3 instead) display the same behavior.
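In case it helps anyone hitting the same thing, here is a quick sketch of how I'd sanity-check whether the IPsec-tab rules actually made it into pf (this assumes shell access on the box; on FreeBSD, decapsulated IPsec traffic is filtered on the enc0 interface):

```shell
# Show the loaded pf ruleset and pick out the rules bound to the IPsec
# interface; the easy rule created from the log entry should appear here
# after a successful filter reload.
pfctl -sr | grep enc0

# Watch blocked packets live on the pf log interface to see which rule is
# actually matching the pings.
tcpdump -n -e -ttt -i pflog0 icmp
```

If the easy rule shows up in the ruleset but tcpdump still shows the block, the problem is rule ordering or policy routing rather than a missing rule.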
#8
16.7 Legacy Series / Re: IPSEC issues latest stable
January 15, 2017, 11:05:46 PM
Actually, just to demonstrate how bad this is, have a look at the attached graph. I'm using LibreNMS running on a VMware ESXi VM behind one OPNsense box, polling numerous OPNsense and pfSense hosts, amongst others.

See how the VPN tunnel traffic drops frequently over a 24h period; note the recent SNMP poll logs, which show the same symptom - there does seem to be a pattern there:

2017-01-17 18:15:04   Device status changed to Down from icmp check.   
2017-01-17 17:35:14   Device status changed to Up from check.   
2017-01-17 17:25:07   Device status changed to Down from icmp check.   
2017-01-17 16:35:05   Device status changed to Up from check.   
2017-01-17 16:30:05   Device status changed to Down from icmp check.   
2017-01-17 13:30:04   Device status changed to Up from check.   
2017-01-17 13:25:05   Device status changed to Down from icmp check.   
2017-01-17 12:00:05   Device status changed to Up from check.   
2017-01-17 11:10:05   Device status changed to Down from icmp check.   
2017-01-17 10:25:05   Device status changed to Up from check.

On the contrary, have a look at the same LibreNMS host polling another pfSense 2.3.2 host over SNMP; note that the one break was due to me changing settings and rebooting a few hosts at the same time, with network convergence taking a while.

Both use the same setup (IKEv2, main mode, IPs as identifiers, PFS groups, ciphers); both have global IPs assigned to their interfaces (no NAT) and both use a LAN-to-LAN phase 2 setup. I'm lost :(
#9
16.7 Legacy Series / Re: IPSEC issues latest stable
January 15, 2017, 07:23:13 PM
No real difference from an end-user perspective; inconsistent IPsec tunnel behavior across 16.7.11-13 and 17.1b (on both FreeBSD 10.3 and 11.0).
#10
16.7 Legacy Series / Re: IPSEC issues latest stable
January 15, 2017, 06:13:38 PM
Happy new year as well, Franco! :)

I went from pfSense 2.3.2-RELEASE-p1 to OPNsense 16.7.11 and through to .13, but as far as I can recall the changelog mentioned nothing relevant to charon or strongSwan.
#11
16.7 Legacy Series / Re: IPSEC issues latest stable
January 15, 2017, 03:36:13 PM
I guess there's no fix for this.

Bottom line: I had IPsec tunnels stable for days with the latest pfSense, and migrating to OPNsense broke them even though the same settings are in use. What's funny is that I still have 2 pfSense devices, and those are able to keep IPsec tunnels stable with my OPNsense boxes.

So it seems that OPNsense -> OPNsense IPsec has issues.

What logs can I provide to have someone much smarter than me look at it? :)
#12
So, I finally have a bit more time to dig around, and I noticed something odd.

First, though, the setup explained - the topology looks as follows:

ISP fibre -> media converter -> WAN on an APU1 board running OPNsense -> two local subnets, each with a dedicated Ethernet interface.

Now, there's an ESXi box sitting somewhere on the local network attached to ETH2, which has a few virtual networks for separation behind yet another OPNsense instance, this time virtual.

So the VM has its WAN on the same switch as the ISP and APU WAN links - I hope this isn't confusing.

Finally, the ISP assigns (by DHCP) an IP which the APU WAN port receives; that's X.Y.A.39.
Additionally, the ISP routes an entire X.Y.B.88/28 network towards the same IP.

So, as before with pfSense, I assign the .89 and .90 IPs as virtual IP aliases on the same APU WAN port; I use those for IPsec and traffic, whereas the actual DHCP-assigned IP is for pure management.

Now, the VM instance has a static IP of X.Y.B.91 on its WAN link (again, in the same broadcast domain) and .92, .93 and .94 assigned as virtual IPs on the same WAN port. Those are used for port forwarding (web and email hosting) and IPsec tunnels.

Obviously, since the APU needs to act as the gateway for the /28 network, I have proper firewall rules in place; there is no NATing of any sort, and I see the traffic from the outside (HTTP or email) destined for .91-.94 passing through the WAN interface; that works.

Analogously, the VM has a default route of .89 and is able to reach the internet just fine for upgrades, time sync, etc.

Both OPNsense hosts have each other's MAC addresses in their ARP tables just fine; ping works both ways; all good.

The VM does gateway monitoring (apinger) for the APU's .89 and that shows as Online - but once I configure gateway monitoring on the APU to check the VM's .91, it always shows Offline. Note that there are some subnets behind the VM and behind the APU, and routing between those (with static routes) also works, so I have no idea where this comes from.
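Since both boxes carry several addresses on the same segment, one thing worth checking (just a guess at narrowing it down, assuming shell access) is whether .91's reachability depends on which source address the probe leaves from - the monitor's source selection could differ from a plain ping:

```shell
# On FreeBSD, ping -S forces the source address of the probe.
# If .91 answers pings sourced from the APU's real IP but not from a
# virtual IP (or the other way around), the monitor's source-address
# selection is the likely culprit.
ping -c 3 -S X.Y.A.39 X.Y.B.91
ping -c 3 -S X.Y.B.89 X.Y.B.91

# Double-check the ARP entry for the VM didn't go stale on the APU side.
arp -an | grep X.Y.B.91
```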

Now, the reason why I think it might be related to my non-working IPsec tunnels is that when I check log entries involving one of the other 7 nodes (which are geographically remote and use different ISPs), the VM's IPsec log shows:
charon: 14[KNL] Z.V.X.41 is not a local address or the interface is down

I know this is a lot; any guidance on what else to check or what to expect from OPNsense would be helpful, though!
#13
16.7 Legacy Series / Re: IPSEC issues latest stable
December 31, 2016, 11:27:42 AM
It's actually set to IKEv2 for all tunnels except the mobile one (which does work in the 2 locations where it's deployed).

All settings are equal; own and peer identifiers are set to the IP addresses of the devices in question, and those aren't behind any NAT (globally routable IPs assigned to the interfaces the tunnels bind to).

I have a total of 9 devices in a few countries; randomly, after migrating to OPNsense, some tunnels went up the second I clicked save/apply and the others never did. All entries were created by hand, so I rule out pfSense config XML parsing issues. Random! :) I hate randomness when computer systems should be deterministic ;)
#14
16.7 Legacy Series / Re: IPSEC issues latest stable
December 31, 2016, 12:40:22 AM
Thanks for having a read, Fabian!

Logs, obviously: I set the log level to 'control' for most items on the list. I get a lot of the following messages:
charon: 11[KNL] unable to query SAD entry with SPI XXXXXXXX: No such file or directory (2)
charon: 11[JOB] deleting half open IKE_SA after timeout

One other 17.1b box (thought I'd try to see if the upgrade changes anything, but it didn't) also reports:
charon: 13[MGR] checkin and destroy IKE_SA (unnamed)[19]
charon: 13[IKE] IKE_SA (unnamed)[19] state change: CONNECTING => DESTROYING

I understand that without context this is still perhaps not detailed enough - the other messages are the usual ones I'd expect strongSwan to generate.

As for DPD, I've never had a good experience with it; for the past years it only made my IPsec tunnels unstable when using pfSense. I gave it a try following your suggestion and was about to say it's much better now for the tunnels that do come up, but I need more testing to be sure. Still, the other ones just won't ever come up, and it's not a config mismatch issue... happy to hear your thoughts.
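For whoever picks this up: the charon lines quoted above can be grepped out of the IPsec log to see how often SAs die before establishing. A rough sketch (the sample file here is hypothetical, just to demonstrate the filters):

```shell
# Hypothetical sample standing in for the real IPsec log, only to demo the filters.
cat > /tmp/charon-sample.log <<'EOF'
charon: 11[KNL] unable to query SAD entry with SPI c0ffee00: No such file or directory (2)
charon: 11[JOB] deleting half open IKE_SA after timeout
charon: 13[IKE] IKE_SA (unnamed)[19] state change: CONNECTING => DESTROYING
charon: 05[IKE] IKE_SA example-conn[20] state change: CONNECTING => ESTABLISHED
EOF

# Count tunnels that timed out while still half open (never reached ESTABLISHED):
grep -c 'half open IKE_SA' /tmp/charon-sample.log        # → 1

# Summarize state transitions, to spot CONNECTING => DESTROYING loops at a glance:
grep -o 'state change: .*' /tmp/charon-sample.log | sort | uniq -c
```

A tunnel that only ever shows CONNECTING => DESTROYING (and never ESTABLISHED) points at negotiation failing outright rather than flapping after the fact.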
#15
16.7 Legacy Series / IPSEC issues latest stable
December 28, 2016, 03:09:30 PM
Hi guys,

Thanks for all the great work you're doing, OPNsense is awesome! I'm saying that after having used pfSense for many, many years on all sorts of platforms.

To the point: I migrated some pfSense boxes to OPNsense the other day while retaining my IPsec mesh config (with around 9 boxes doing network-to-network as required). Most settings are as follows:

IKEv2, default conn, IPv4, via the WAN interface (or a virtual IP on the WAN interface), main mode, mutual PSK, IP addresses as identifiers, AES128/SHA1, DH group 2, default lifetime, no DPD or NAT-T.

Phase 2 is a LAN-to-LAN IPv4 tunnel, ESP, Blowfish128/MD5, PFS group 2 with default lifetime, and ping set to target the remote gateways' internal IPs.
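For reference, the GUI settings above roughly correspond to a strongSwan conn like the following. This is an illustrative sketch, not what OPNsense literally generates - the conn name and the placeholder addresses are made up, and DH group 2 / PFS group 2 both map to modp1024:

```
conn lan-to-lan-example
    keyexchange=ikev2
    authby=psk
    left=<local WAN or virtual IP>
    leftid=<local WAN or virtual IP>
    leftsubnet=<local LAN>
    right=<peer WAN IP>
    rightid=<peer WAN IP>
    rightsubnet=<peer LAN>
    ike=aes128-sha1-modp1024!
    esp=blowfish128-md5-modp1024!
    dpdaction=none
    auto=start
```

Comparing the generated config on both ends at this level (rather than in the GUI) can make a phase 1/phase 2 proposal mismatch obvious.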

All worked fine on pfSense, being super stable; now I'm getting tunnels dropping every few minutes or hours at random, staying offline for a few minutes and then coming back. Some tunnels never come up anymore (always the same ones), but examining their respective configs on both ends of a given link (dumping the XML and checking what's in it) shows no inconsistencies.

Is there a known bug? (I tried looking; nothing seems to match.) Hence my question - unless there are reasons why the above settings would yield poor results on OPNsense?

thanks and happy new year everyone!