Messages - kpiq

#1
General Discussion / Re: Link fault detection
December 27, 2023, 10:10:07 AM

@Franco


When you get a chance, I've posted a reply to your August statements, dated December 7.


Regards


Pedro

Quote from: franco on August 31, 2023, 09:05:36 PM
Hi Pedro,

Yes, there are high thresholds for both packet loss and latency that when reached will mark the connection as "down" regardless of the actual disrupted link state (which is the traditional packet loss 100%). It also depends on what gateway group trigger is used. All these values can be tweaked per gateway if that helps.

Let me know how this progresses on your end in any case.


Thanks,
Franco
#2
General Discussion / Re: Link fault detection
December 07, 2023, 08:39:53 PM

Franco,

Sorry it took so long to share the results of our testing.  Link fault detection in OPNsense 23.7 and up works fine, but there is something still not right with FRR/OSPF. 


We finally scheduled a test of an ISP fiber disconnect on one of the firewalls.  We monitored the OSPF LSA notifications and finally saw OSPF remove the external LSA record for the firewall where the fiber was disconnected.


For a minute our routing tables adjusted to use the other firewall, and I was able to ping a few Internet sites.  But then OSPF started a shutdown/restart loop on the firewall where the fibers were disconnected and began announcing its external LSA as if it were still connected.  That resulted in full connectivity loss (the disconnected firewall had the higher OSPF gateway metric), even though the other firewall was up.


The team decided that we will not rely on OSPF for gateway switchover.  We may try CARP, and we have the Ethernet links needed, but I'm not sure CARP is meant for two firewalls that are 1,000 miles apart with a latency of 25 ms.


I'm sorry this did not work as expected this time around.  Maybe it was the wrong way of achieving our goal of reliable uptime through gateway redundancy.

Thanks for all your help. 

Regards

Pedro
#3
General Discussion / Re: Link fault detection
August 31, 2023, 05:30:37 PM
Franco,


My apologies for jumping the gun.  It seems we have circuit trouble with one of our Internet providers, with latency and packet loss just outside our gateway.   That must be why OPNsense was bringing the WAN port down, as it should.  If I recall correctly OPNsense uses latency and packet loss in the gateway monitoring calculation, right?
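
As a rough illustration of the behavior asked about here, the sketch below marks a gateway as down once either measured latency or packet loss reaches a high threshold, even though the link itself still carries traffic. The threshold numbers and all names (`Thresholds`, `gateway_is_down`) are assumptions made for this example, not values or code taken from OPNsense or dpinger.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Illustrative numbers only (assumed for this sketch); the real
    # values are tunable per gateway.
    latency_high_ms: float = 500.0
    loss_high_pct: float = 20.0


def gateway_is_down(latency_ms: float, loss_pct: float,
                    limits: Thresholds = Thresholds()) -> bool:
    """Return True when latency or loss has reached its high threshold."""
    return (latency_ms >= limits.latency_high_ms
            or loss_pct >= limits.loss_high_pct)


# A lossy circuit is reported as down although packets still get through.
print(gateway_is_down(latency_ms=40.0, loss_pct=25.0))
```

With thresholds like these, a degraded ISP circuit can look the same to the monitor as a disconnected fiber, which would match the intermittent "down" state described in this thread.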


Thanks for your continued efforts, time, and support.  It seems like now we'll have to wait to perform the fiber disconnect testing until the ISP trouble is over.


Regards


Pedro
#4
General Discussion / Re: Link fault detection
August 30, 2023, 01:35:18 PM
No.  We use an upstream gateway and a /30 static public IP address.


Regards


Pedro
#5
General Discussion / Re: Link fault detection
August 30, 2023, 07:19:11 AM
Well, 23.7.2 absolutely killed my preparation, before I got a chance to demonstrate that the fiber disconnect will trigger default gateway changes and that those will propagate via OSPF causing a proper failover from one firewall/ISP connection to another.


After the 23.7.2 update, the gateway monitoring - with the default settings and the same Monitoring IP as before - forced my firewall to believe that the WAN link was down, intermittently.  We did thorough troubleshooting: cleaned the fibers connected to the WAN port and replaced the SFP.


Was about to call the ISP to troubleshoot the circuit when I took a tcpdump and saw traffic traversing the WAN port.  Decided to try disabling gateway monitoring.   That worked around our trouble.  Would love to use Gateway Monitoring, but it broke my network.  Will appreciate your help.
#6
General Discussion / Re: Link fault detection
July 31, 2023, 07:27:36 PM
Quote from: kpiq on June 01, 2023, 02:50:03 PM
@franco Great.  Will be monitoring this conversation.

Cheers

@franco

I just upgraded my lab to OPNsense 23.7 and will be testing and observing it for a week or two before proposing the upgrade to the production firewalls that were previously failing to fail over.

Appreciate all the hard work, your time, and attention. Will keep you posted.

Regards

Pedro
#7
General Discussion / Re: Link fault detection
June 21, 2023, 04:50:23 PM
Quote from: franco on June 02, 2023, 02:31:26 PM
Hi Pedro,

I'm not sure. I wouldn't use a gateway on an interface that's not supposed to reach an external router, but perhaps this works. All I'm trying to say is that this seems like an uncommon approach.


Cheers,
Franco

Thanks for your prompt action to commit frr8 (https://github.com/FRRouting/frr/issues/13597).  I previously suggested a very unorthodox configuration as a shortcut to the need for gateway failover/switchover.  Instead of playing around with features I don't understand well, I'll propose something else, more directly related to Gateway Monitoring than Link Fault Detection.

I understand that relying on someone else's technology is not the first choice for sound development.  But, there are at least two methods widely used over the Internet to verify network connectivity:   Microsoft's NCSI and Android's GoogleConnectivityCheck.

Would it be open to consideration to add a feature to OPNsense that would add choices to the "Monitor IP" option in the Single Gateways?  The feature would be to choose between an IP address to ping (dpinger), the methods used by Microsoft's NCSI and GoogleConnectivityCheck, or to use the FQDN of your ISP's speedtest site.
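
To make the proposal more concrete, here is a minimal sketch of an HTTP-based connectivity probe in the style of those two checks. The endpoint URLs and expected responses are the ones commonly documented for Microsoft's and Android's checks, but treat them as assumptions to verify; the names `Probe`, `response_matches`, and `run_probe` are hypothetical and are not part of OPNsense or dpinger.

```python
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Probe:
    url: str
    expect_status: int
    expect_body: Optional[str]  # None means the body is not checked


# Assumed endpoints and expected replies; verify before relying on them.
NCSI = Probe("http://www.msftconnecttest.com/connecttest.txt",
             200, "Microsoft Connect Test")
ANDROID = Probe("http://connectivitycheck.gstatic.com/generate_204",
                204, None)


def response_matches(probe: Probe, status: int, body: str) -> bool:
    """Decide whether a response looks like real Internet access."""
    if status != probe.expect_status:
        return False
    return probe.expect_body is None or body.strip() == probe.expect_body


def run_probe(probe: Probe, timeout: float = 3.0) -> bool:
    """Fetch the probe URL; any network error counts as 'not connected'."""
    try:
        with urllib.request.urlopen(probe.url, timeout=timeout) as resp:
            body = resp.read(256).decode("utf-8", "replace")
            return response_matches(probe, resp.status, body)
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return False
```

Unlike a plain ICMP ping, a check like this also fails when a captive portal or upstream outage returns a page other than the expected one, at the cost of depending on a third party's web service.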

I'm sure there will be legal and other reasons that will weigh in when making that choice.  Hope it's something feasible.

Regards

Pedro
#8
General Discussion / Re: Link fault detection
June 08, 2023, 03:22:59 AM
Hmmm.... I can't find a single reference to frr in that case.  Will read in more depth.

Thanks.
#9
General Discussion / Re: Link fault detection
June 07, 2023, 11:14:57 PM
I know.   Very unorthodox.  Trying to get some hardware to test it in the lab.

Appreciate all the help!

Gracias...
#10
General Discussion / Re: Link fault detection
June 02, 2023, 05:25:02 AM
Franco,

I guess I answered my own question (above) about "link fault detection" by tinkering around with my home network.  Ran ifconfig before and after disconnecting the WAN cable, and saw the carrier loss detected by FreeBSD.
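
For anyone who wants to repeat that check in a script, here is a minimal sketch that reads the `status:` line from `ifconfig` output. The sample output, the interface name `igb0`, and the helper names are made up for illustration; only the `status: active` versus `status: no carrier` distinction is what FreeBSD's `ifconfig` reports for Ethernet interfaces.

```python
import re
from typing import Optional

# Illustrative ifconfig output (assumed, not captured from a real firewall).
SAMPLE_UP = (
    "igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
    "\tether 00:00:5e:00:53:01\n"
    "\tmedia: Ethernet autoselect (1000baseT <full-duplex>)\n"
    "\tstatus: active\n"
)
SAMPLE_DOWN = SAMPLE_UP.replace("status: active", "status: no carrier")


def link_status(ifconfig_output: str) -> Optional[str]:
    """Return the text after 'status:' or None if no such line exists."""
    match = re.search(r"^[ \t]*status:[ \t]*(.+?)[ \t]*$",
                      ifconfig_output, re.MULTILINE)
    return match.group(1) if match else None


def has_carrier(ifconfig_output: str) -> bool:
    """True only when the interface reports an active link."""
    return link_status(ifconfig_output) == "active"


print(link_status(SAMPLE_UP), "/", link_status(SAMPLE_DOWN))
```

On a real system the input would come from running `ifconfig` for the WAN interface, for example through `subprocess.run`, before and after pulling the cable.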

Now, I was just reviewing the WAN_GW gateway configuration (system > gateway > single) and noticed the "Far Gateway" choice, described as "This will allow the gateway to exist outside of the interface subnet".  WAN_GW is using the WAN interface and is defined as Upstream Gateway, with Far Gateway unchecked.

I definitely don't know what I'm talking about here, but please hear me out.   Even if I wasn't using frr, would this re-route traffic to the other firewall if WAN is down, and vice versa?

- Create another Single Gateway, this one tied to the LAN interface, with a static IP address pointing to the default gateway of my other firewall, Upstream unchecked and Far Gateway checked.
- Include this new single gateway as a Tier2 gateway in the Gateway Group that already has WAN_GW defined as a Tier1 member.
- Replicate the same setup on the other firewall, reversing the gateway order in the Gateway Group.
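
The intended failover behavior of that setup can be sketched as follows, assuming a gateway group always prefers the lowest-numbered tier that still has an online member. `WAN_GW` is the gateway named above; `PEER_GW` and the function `pick_gateways` are hypothetical names used only for this example.

```python
from typing import Dict, List


def pick_gateways(tiers: Dict[str, int], up: Dict[str, bool]) -> List[str]:
    """Return the online gateways in the lowest-numbered tier that has any."""
    online = [name for name in tiers if up.get(name, False)]
    if not online:
        return []  # nothing reachable, so no default route to offer
    best_tier = min(tiers[name] for name in online)
    return sorted(name for name in online if tiers[name] == best_tier)


# Tier 1 is the local WAN, Tier 2 points at the other firewall over the LAN.
group = {"WAN_GW": 1, "PEER_GW": 2}

print(pick_gateways(group, {"WAN_GW": True, "PEER_GW": True}))
print(pick_gateways(group, {"WAN_GW": False, "PEER_GW": True}))
```

Whether the second case actually happens depends on the gateway monitor marking `WAN_GW` as down, so the quality of fault detection still decides how well this arrangement fails over.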

Thanks for your patience.

Regards

Pedro
#11
General Discussion / Re: Link fault detection
June 01, 2023, 02:50:03 PM
@franco Great.  Will be monitoring this conversation.

Cheers
#12
General Discussion / Re: Link fault detection
June 01, 2023, 01:37:18 PM
Thanks.  The situation encountered is that, upon 1) WAN link down, 4) frr appears as if nothing happened.  Because I am not certain whether this is an frr problem or an OPNsense problem I entered issues in the respective github repositories, for frr and OPNsense.  Here are the links.

https://github.com/opnsense/plugins/issues/3445
https://github.com/FRRouting/frr/issues/13597

The frr maintainers already answered, asking me to upgrade frr 7.5.1 to a newer release, but in OPNsense you can't choose plugin versions.  So I'm at a loss as to how to help those who are already trying to help.

Most carriers don't support BFD for consumer-grade Internet service.  I end up depending on dpinger, but ICMP is not really meant to detect link faults.  Anyway, the problem at hand seems to be somewhere in a gray area. 

If only I knew which link fault detection protocols are used by OPNsense (I guess they would apply to all interfaces, but I'm interested in the WAN link) I could talk to my ISPs intelligently about it. 

Will appreciate your time and effort.
#13
General Discussion / Link fault detection
June 01, 2023, 12:43:46 PM
Running OPNsense 23.1 on four firewalls, each on separate sites with their respective Internet connections.  Some ISPs update and reboot their cable modems without warning.

I've configured the WAN interfaces with their Monitor IP, so dpinger should catch faults.  Monit does a fair job of catching the interruptions, but it's limited to running the gateway_alert script on the "cron" schedule, every minute of the hour.

dpinger is obviously catching the faults because I can see the WAN interface down on the dashboard.  From trial and error, though, OSPF doesn't seem to be noticing the faults.  It's not generating LSAs for the events.

I'm interested in learning about the link fault detection methods used by OPNsense, whether they're configurable, and if they're not, what to expect.  I run LLDP and CDP on my LAN, yet OPNsense doesn't seem to be talking with my other devices even though my rules allow any to any for all protocols in my LAN.  If my ISPs support BFD, CDP, or others, that could help, but it's important to know what to expect from OPNsense in order to be educated and not waste others' time.
#14
It seems like there are not very many options with the hardware I have.  Don't know if this contaminates this subject.  Let me know if I should open a new topic for this.

The goal is to have route (default gateway) redundancy over multiple Internet connections on different firewalls.  Based on this, my next question is: with two OPNsense firewalls in different states, each with its own ISP (static public IPs), is CARP an option when one of them loses Internet connectivity?  I am under the impression that CARP (or VRRP in other cases) would be used for firewall failover, when the firewalls are colocated and are using the same Internet connection, not for route redundancy.

Appreciate your time and attention.

Quote from: nzkiwi68 on May 28, 2023, 10:54:22 PM
This is not really the answer you are looking for, I know....

I have over the years had many issues with OSPF, running it on switches, pfSense and OPNsense and made the decision a few years ago to move to BGP.

#15
Wow, excellent, thanks. 

From what I've researched, it seems that there's been a long-term issue with FreeBSD not deleting IP addresses removed by frr/ospf 7.5 through 8.4.1.  That seems to be the most likely issue with my network.

I would consider BGP if our MultiLayer Switches supported it, but they don't.

Really appreciate it, expands my perspective.

Regards

Pedro