Messages - kbrennan1

#1
General Discussion / Re: Gateway Monitoring
September 15, 2015, 01:25:57 PM
Ok, so I have a rough workaround to this problem.

I've added two cron jobs, via the config XML file, that run every 60 seconds:
1: killall -9 apinger
2: /usr/local/sbin/apinger -c /var/etc/apinger.conf

It is pretty ugly, and a failure can take up to 60 seconds to be noticed. Failback is instant.
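
For anyone trying the same workaround, the two commands could be wrapped in one script so that a single cron entry does both steps in order. This is only a rough sketch: the binary and config paths are the ones quoted above, while the `restart_apinger` function name and the `DRY_RUN` switch are made up here for illustration and are not part of OPNsense or apinger.

```shell
#!/bin/sh
# Sketch: restart apinger in one step instead of two separate cron jobs.
# Paths come from the post above; DRY_RUN is a hypothetical switch that
# only prints the commands instead of running them.
APINGER_BIN="${APINGER_BIN:-/usr/local/sbin/apinger}"
APINGER_CONF="${APINGER_CONF:-/var/etc/apinger.conf}"

restart_apinger() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "killall -9 apinger"
        echo "$APINGER_BIN -c $APINGER_CONF"
        return 0
    fi
    # Ignore the error if no apinger process is currently running.
    killall -9 apinger 2>/dev/null || true
    # Start a fresh instance with the generated configuration.
    "$APINGER_BIN" -c "$APINGER_CONF"
}
```

Adding a final `restart_apinger` line to the script and pointing one cron job at it should keep the kill and the start from racing each other, which two independent jobs on the same schedule could in principle do.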

I still think this is a bug, but my skills do not allow me to go any further.
Any tips/suggestions etc would be great!

Thanks

Kevin
#2
General Discussion / Gateway Monitoring
September 11, 2015, 11:35:37 AM
Hi All,

I am a recently landed m0n0wall migrant trying to get gateway group failover working!

I'm having an issue with gateway groups and monitoring upstream IP addresses.

My setup is running on a Deciso A10 SSD appliance with version 15.7.11. There are two WAN interfaces to different ISPs. One is Tier 1 and the other is Tier 2 in the gateway group. I have turned off the default "disable gateway monitoring" option, and I have no other non-default firewall or NAT rules set at the moment, as this is a new installation.

I cannot monitor the upstream gateway IP, as it will always be available: it is on the ISP CPE and there is Ethernet presentation, so even if the ISP fails, the monitor will always succeed. For the same reason, monitoring link up/down events will not work either. I have set the gateway monitor to use packet loss as the only metric.

My issue is that when I monitor 8.8.8.8 from the Tier 1 interface and 8.8.4.4 from the Tier 2 interface, they never fail, even when I disconnect the ISP side of the CPE. The only things that will trigger the failure condition are either disconnecting the physical WAN port on OPNsense or restarting the apinger service. The gateway system logs do not show the failure (until I unplug the cable or restart the service).
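
One way to cross-check the monitor by hand, independently of apinger, is to ping the same target from the WAN address and pull the loss figure out of the summary line. The helper below is a small sketch; `loss_percent` is a hypothetical name, and it only parses the "% packet loss" summary that ping prints.

```shell
#!/bin/sh
# Sketch: extract the packet-loss percentage from ping's summary line.
# Reads ping output on stdin and prints the number without the % sign.
loss_percent() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}
```

On a FreeBSD-based system, something like `ping -c 5 -S <tier 1 WAN address> 8.8.8.8 | loss_percent` should then show whether the upstream path is really down while the monitor still reports the gateway as up. The source address is a placeholder to fill in.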

Once the failover condition has been triggered, outbound routing is as expected. The failback process works with no issues.

I initially tested this configuration in VMware and I put it down to a virtualisation oddity, but now that I can recreate the same issue on a physical device I'm not so sure.

I found a few other monitoring problems on these boards, but they were related to the service not actually starting.

I'd be grateful for any suggestions.

Cheers

Kevin


**EDIT**
I've recreated this setup with a packet sniffer and I can see that the only time apinger attempts to send an ICMP packet is either on service startup or when there is an active failure condition. It *never* sends an ICMP packet when it thinks the gateway is up.
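
For anyone wanting to reproduce the observation above, the echo requests in a capture can be counted from tcpdump's text output. This is a rough sketch; `count_echo_requests` is a hypothetical helper name, and it assumes tcpdump's usual one-line-per-packet format with numeric addresses (`-n`).

```shell
#!/bin/sh
# Sketch: count ICMP echo requests sent to a monitor IP.
# Reads tcpdump text output on stdin; the target IP is the first argument.
count_echo_requests() {
    target="$1"
    # -F matches the string literally, so dots in the IP are not wildcards.
    # grep -c exits non-zero on zero matches, hence the "|| true".
    grep -cF -- "> ${target}: ICMP echo request" || true
}
```

A saved capture could be checked with `tcpdump -n -r <capture file> icmp | count_echo_requests 8.8.8.8`. A count that stays flat while the gateway is marked up would match what the sniffer showed.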

Another oddity I noticed was that the gateway section in the XML config file only closed the tags if the default configuration was present. I configured the gateway explicitly using the default values and the config xml file was correct.

Has anyone had any issues like this in the past?
I was wondering if a cron job to restart the apinger service every X seconds would work. I think it would, but I lack the knowledge to script that in such a way that it would persist after a reload.