Hi,
since around 23.7.4 dpinger is broken for an IPv4 Gateway on two of my machines. Hardware is entirely different (Intel CPU + NIC vs. AMD CPU + Mellanox NIC).
I can't see anything in the logs that would help identify the root cause. This is particularly annoying as one system is set up with a gateway group to allow failover to another WAN interface.
Oddly enough, the other WAN interface (also IPv4) doesn't exhibit this issue. Could this be related to the interface settings? At least those are a factor that's identical in both systems experiencing this issue (ISP Vodafone Germany, supersede dhcp-server-identifier 255.255.255.255, reject leases 192.168.100.1, custom MAC address).
Patches fb336e3 and 89ee410 didn't solve this either.
Any support to get to the bottom of this is highly appreciated!
Hi,
could you be more specific on your issue:
- Is DPinger still "RUNNING" in services or stopped?
- Does it help to restart it?
- Is there any information in your log file about dpinger when you reconnect?
- What kind of IPs are you pinging? Is it the next hop or some far host?
Hi,
Long time no see. Hope you are doing good!
Can you be a little more specific?
What's the error message? What does it try to start on the command line? How are your gateways set up (far gateway used)?
Cheers,
Franco
For me, enabling 'Disable Host Route' on my problematic gateway (I think) helped.
Maybe my setup is similar to yours:
- 2 ISP connections
- 2 separate opnsense routers
- Failover network between them, to facilitate 'cross' failover if 1 ISP is down
On the failover network, just a /30 between them, I have to enable 'Disable Host Route' - otherwise, if I reboot router a), Dpinger on router b) sees the gateway as down but does not recover to an 'online' status on its own (without a Dpinger restart).
I mentioned at the beginning 'I think it helped' - I can now reboot either router, the other router detects the gateway as down and now DOES recover when the rebooted-router comes back on line (without a Dpinger restart).
Hi,
sorry for the delayed reply, it's been a busy start of the week!
Quote from: tron80 on October 23, 2023, 09:31:59 AM
- Is DPinger still "RUNNING" in services or stopped?
- Does it help to restart it?
- Is there any information in your log file about dpinger when you reconnect?
- What kind of IPs are you pinging? Is it the next hop or some far host?
-Dpinger is still listed as running, in fact the second interface is being monitored as always.
-The Dpinger attached to the affected WAN interface is stopped and doesn't come up when restarted (tried via GUI).
-The only information from the gateway log is "Reloaded gateway watcher configuration on SIGHUP".
-The monitoring IPs used are 8.8.8.8, 1.1.1.1 and others. I tried several to no avail.
Quote from: franco on October 23, 2023, 09:36:32 AM
Long time no see. Hope you are doing good!
Can you be a little more specific?
What's the error message? What does it try to start on the command line? How are your gateways set up (far gateway used)?
Thank you very much I'm well! I hope the same is true for you :D
It's been quite some time indeed, one could say OPNsense has been running too well ;D
Oh I wish there was an error message :( Other than "Reloaded gateway watcher configuration on SIGHUP" I can't see anything, even directly from the command line. A second dpinger thread required for the affected WAN simply never comes online. ???
The affected WAN gateway (Vodafone Germany, DOCSIS 3.1, TC4400 modem) is set up as upstream and far. This worked for years before the update but one should never discount the possibility of the ISP breaking things... Thus I played around with several variations of these settings as well as "disable host route" just to be sure. The WAN gateway that isn't affected (i.e. failover 5G) is configured as far gateway and nothing else.
Quote from: iMx on October 24, 2023, 08:57:09 AM
For me, enabling 'Disable Host Route' on my problematic gateway (I think) helped.
Maybe my setup is similar to yours:
- 2 ISP connections
- 2 separate opnsense routers
- Failover network between them, to facilitate 'cross' failover if 1 ISP is down
Thanks for chiming in! :)
My setup is quite different as both ISP uplinks are attached to the same OPNsense box with the second OPNsense box (just one ISP) at another independent location.
Right now I can only test on the primary box as messing around with gateways breaks remote access.
good day
I have the same problem with my gateway. I already updated to the latest version today, and I still have the same problem.
Quote from: bulmaro on October 25, 2023, 04:46:57 PM
good day
I have the same problem with my gateway. I already updated to the latest version today, and I still have the same problem.
Hi,
the issues are different as I don't get such an error message. Hence, I suspect different root causes.
Hi,
just updated to OPNsense 23.7.8, unfortunately the issue persists :-\
Any ideas on how to proceed as I can't seem to get any useful info from logs?
Thanks!
I have the same issue. I am running the business edition:
I moved from pfSense about 2 years ago when I bought the DEC670. This used to work: dpinger was automatically configured to ping the default gateway of the connection. It no longer does this.
It was suggested to add public DNS servers as the monitor IP. I did this, for example 8.8.8.8. It works until you reconnect, and then the gateway shows offline. You have to go back to System > Gateways > Single, click edit (do absolutely nothing) and just click save, and it comes back online.
I booted up my previously used SG-3100, updated it to the latest release and noticed it still shows the default gateway as the address for dpinger to monitor. If I click reconnect, it automatically updates dpinger to ping the new gateway address.
Hi,
just to add another "same for me" to this thread:
I have a MultiWAN setup, too. One line fibre with DHCP and static IP and the other a DSL leased line with periodic reset every 24h.
When the reset of the DSL line occurs, my MultiWAN setup will not route any traffic to the DSL line.
I have to manually disable the device and re-enable it. Once reset, the distribution is running fine again.
Is someone able to file a bug report?
/KNEBB
Hi,
is someone going to report a bug for this?
I use 23.7.8_1 and still have to dis- and re-enable the DSL device manually.
Thanks!
/KNEBB
If you feel it needs to be reported as such, what prevents you from reporting it yourself?
Hi,
the issue persists in OPNsense 23.7.9.
Unfortunately logging is proving less than helpful. Any idea on how to diagnose this?
As it stands, Multi-WAN failover is broken because of this issue. :(
Well, turns out this one is a combined OPNsense AND layer 8 issue:
https://github.com/opnsense/core/issues/6907
Increasing "Time period" to a higher value allowed Dpinger to come up again.
got it.
so I increased probe time to 7 seconds.
seems simple enough
This is an issue still with both the latest business and community
I've changed probe time to 8 and 10
If a vpn connection suddenly stops or is restarted manually it is never monitored again until you edit the gateway and simply click save
It goes green again and starts pinging at the set interval
Quote from: DEC670airp414user on January 08, 2024, 11:40:03 AM
This is an issue still with both the latest business and community
I've changed probe time to 8 and 10
It goes green again and starts pinging at the set interval
When you say you changed probe time to 8 and 10, did you mean:
Probe Interval: 8
Time Period: 10
?
Quote from: Mr.Goodcat on December 31, 2023, 01:52:03 PM
Well, turns out this one is a combined OPNsense AND layer 8 issue:
https://github.com/opnsense/core/issues/6907
Increasing "Time period" to a higher value allowed Dpinger to come up again.
I was looking at this, but has everyone having issues actually changed their "Time Period" from the default of 60? Because that's well above 2.1 times the default probe interval of 1.
I hadn't touched the time period, everything was default but if a gateway went offline it doesn't come back up without intervention.
Is the github link a separate issue and all we need to do is up the probe time so Dpinger doesn't just stop?
I'm going to test the following:
Probe Interval: 5
Time Period: 30
Loss Interval: 4
Not sure if I need to change the Loss Interval though from default for this longer Probe Time.
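
For anyone who wants to sanity-check their numbers before saving the gateway, here is a rough Python sketch of the rule as I understand it from this thread: "Time Period" should be at least 2.1 times the "Probe Interval". The 2.1 factor is taken from the discussion here, not read from the OPNsense or dpinger source, and the function names are made up for illustration. Whether the loss interval also enters the real validation is not covered by this check.

```python
# Rough check of the "Time Period >= 2.1 x Probe Interval" rule mentioned
# in this thread. The factor is an assumption taken from the discussion,
# not from the OPNsense or dpinger source code.
ASSUMED_MIN_RATIO = 2.1


def minimum_time_period(probe_interval: float, ratio: float = ASSUMED_MIN_RATIO) -> float:
    """Smallest time period (seconds) that satisfies the assumed rule."""
    return ratio * probe_interval


def time_period_ok(probe_interval: float, time_period: float,
                   ratio: float = ASSUMED_MIN_RATIO) -> bool:
    """True if time_period is at least ratio times probe_interval."""
    return time_period >= minimum_time_period(probe_interval, ratio)


if __name__ == "__main__":
    # (label, probe interval in seconds, time period in seconds)
    candidates = [
        ("defaults from this thread", 1, 60),
        ("values I am going to test", 5, 30),
        ("probe raised, period too short", 5, 10),
    ]
    for label, probe, period in candidates:
        verdict = "ok" if time_period_ok(probe, period) else "too short"
        print(f"{label}: probe={probe}s period={period}s -> {verdict}")
```

By this check both the defaults (1 / 60) and my test values (5 / 30) pass, so if the rule really is just the 2.1 factor, a gateway on defaults that never recovers would point to a separate problem.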
upon further testing and much waiting, yes I believe this is actually working
not the way I had expected but it does recover and go back online for all 4 tunnels I have
Dpinger is back to normal functionality
No longer requiring external dns....
Update to 24.1