
Messages - RTSW

#1
A little update on this: on Wednesday I went to System: Gateways: Group, clicked Edit on the failover group, didn't change anything, and saved it.

Also, for the monitor IPs I used 1.0.0.1 for ISP 1 and 8.8.8.8 for ISP 2, then saved (before, I was using 1.0.0.1 and 1.1.1.1).

After these changes I ran some tests to trigger the failover; all the errors are gone and online availability behaved as expected.
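
A quick way to double-check that dpinger actually picked up the new monitor IPs is to look at the running processes; this is just a sketch, but the monitor address for each gateway should show up in each dpinger instance's arguments:

# list the running dpinger instances and their arguments;
# the monitor address for each gateway appears on its command line
ps ax -o command | grep '[d]pinger'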

I will update this if something happens again.
#2
Quote from: mircsicz on May 22, 2024, 01:33:08 AM
I was just hit by this after upgrading to 24.1.7


2024-05-21T19:29:59-04:00 Warning dpinger send_interval 1000ms loss_interval 4000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 0ms loss_alarm 0% alarm_hold 10000ms dest_addr 8.8.4.4 bind_addr 100.99.yy.xx identifier "WAN_SL_DHCP "
2024-05-21T19:29:59-04:00 Warning dpinger send_interval 1000ms loss_interval 4000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 0ms loss_alarm 0% alarm_hold 10000ms dest_addr 8.8.8.8 bind_addr 192.168.1.64 identifier "WAN_MX_DHCP "
2024-05-21T19:29:59-04:00 Warning dpinger exiting on signal 15
2024-05-21T19:29:59-04:00 Warning dpinger exiting on signal 15
2024-05-21T19:29:59-04:00 Warning dpinger exiting on signal 15
2024-05-21T19:13:59-04:00 Warning dpinger send_interval 1000ms loss_interval 4000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 0ms loss_alarm 0% alarm_hold 10000ms dest_addr 1.1.1.1 bind_addr 100.99.yy.xx identifier "WAN_SL_DHCP "
2024-05-21T19:13:59-04:00 Warning dpinger exiting on signal 15
2024-05-21T19:13:57-04:00 Warning dpinger WAN_SL_DHCP 1.1.1.1: sendto error: 22
2024-05-21T19:13:56-04:00 Warning dpinger WAN_SL_DHCP 1.1.1.1: sendto error: 22


I've set the Starlink GW as a far GW for now...

Also, there's another similar post.

EDIT: Setting it as a far GW doesn't help at all! :-(

Bump on this: updated to 24.1.8 and the problem still happens.
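
Side note on those log entries: the number after sendto error: is a FreeBSD errno value; 22 is EINVAL (invalid argument) and 65 is EHOSTUNREACH (no route to host). A quick way to confirm that on the firewall itself, as a sketch, assuming python3 is present (OPNsense ships with it):

# decode the errno numbers reported after "sendto error:"
python3 -c 'import errno, os; [print(n, errno.errorcode[n], "-", os.strerror(n)) for n in (22, 65)]'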
#3
Can confirm this.

I was on 23.7.12-5 with a dual WAN configuration. Since I use OPNsense as the core of my enterprise network, I take some tests very seriously; one of them is testing the failover behaviour in depth. I have two different ISPs, both via cable modem.
I know that disconnecting the coax cable from the cable modem can drive the Sense boxes crazy when they are configured for failover, but none of this happened on the 23.7.x series.

Since upgrading to 24.1.5_3 some of those behaviours came back. This includes the following (a couple of quick checks are sketched right after this list):

Interface with a public IP address but marked as down; no response when trying to ping the monitor IP.
Lots of sendto error: 65 entries in the gateway log for the gateway marked as down.
Suddenly high ping, followed by sendto error: 65.
Sometimes, when unplugging and re-plugging the coax cable from a cable modem, it takes OPNsense a very long time to mark the gateway as up again, and then it starts to flap.
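
Error 65 is EHOSTUNREACH (no route to host), so two quick checks that mirror what dpinger does are worth running. Rough sketch with placeholder addresses (replace 203.0.113.10 with the WAN interface address and 1.1.1.1 with your monitor IP):

# is the host route for the monitor IP (and the default route) still pointing at the right gateway?
netstat -rn | grep -E '^default|^1\.1\.1\.1'
# probe the monitor IP the way dpinger does, sourced from the WAN address
ping -S 203.0.113.10 -c 3 1.1.1.1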

Some logs:

2024-05-03T12:12:20-03:00 Warning dpinger FIBERTEL_DHCP 1.1.1.1: sendto error: 65
2024-05-03T12:05:44-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: loss -> down RTT: 10.9 ms RTTd: 2.8 ms Loss: 30.0 %)
2024-05-03T12:05:34-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: none -> loss RTT: 10.8 ms RTTd: 2.6 ms Loss: 12.0 %)
2024-05-03T07:28:59-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: loss -> none RTT: 10.6 ms RTTd: 3.0 ms Loss: 3.0 %)
2024-05-03T07:28:49-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: down -> loss RTT: 10.7 ms RTTd: 3.3 ms Loss: 20.0 %)
2024-05-03T07:21:42-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: loss -> down RTT: 11.1 ms RTTd: 1.9 ms Loss: 32.0 %)
2024-05-03T07:21:31-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: none -> loss RTT: 11.3 ms RTTd: 2.2 ms Loss: 12.0 %)
2024-05-03T06:41:05-03:00 Notice dpinger ALERT: TELECENTRO_DHCP (Addr: 1.0.0.1 Alarm: loss -> none RTT: 16.3 ms RTTd: 4.9 ms Loss: 10.0 %)
2024-05-03T06:40:20-03:00 Notice dpinger ALERT: TELECENTRO_DHCP (Addr: 1.0.0.1 Alarm: none -> loss RTT: 15.4 ms RTTd: 6.6 ms Loss: 12.0 %)
2024-05-03T03:32:54-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: loss -> none RTT: 10.6 ms RTTd: 3.0 ms Loss: 3.0 %)
2024-05-03T03:32:44-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: down -> loss RTT: 10.8 ms RTTd: 3.3 ms Loss: 20.0 %)
2024-05-03T03:28:40-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: loss -> down RTT: 13.8 ms RTTd: 22.9 ms Loss: 30.0 %)
2024-05-03T03:28:30-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: none -> loss RTT: 13.1 ms RTTd: 20.5 ms Loss: 12.0 %)
2024-05-03T02:18:13-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: loss -> none RTT: 10.7 ms RTTd: 3.0 ms Loss: 3.0 %)
2024-05-03T02:18:03-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: down -> loss RTT: 10.7 ms RTTd: 3.3 ms Loss: 20.0 %)
2024-05-03T02:06:51-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: loss -> down RTT: 11.4 ms RTTd: 2.9 ms Loss: 32.0 %)
2024-05-03T02:06:41-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: none -> loss RTT: 11.2 ms RTTd: 2.6 ms Loss: 12.0 %)
2024-05-03T00:17:12-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: loss -> none RTT: 11.9 ms RTTd: 4.0 ms Loss: 3.0 %)
2024-05-03T00:17:02-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: down -> loss RTT: 11.9 ms RTTd: 4.3 ms Loss: 20.0 %)
2024-05-03T00:12:29-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: loss -> down RTT: 10.6 ms RTTd: 1.4 ms Loss: 30.0 %)
2024-05-03T00:12:18-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: none -> loss RTT: 10.6 ms RTTd: 1.2 ms Loss: 12.0 %)
2024-05-02T23:16:49-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: loss -> none RTT: 10.4 ms RTTd: 0.8 ms Loss: 3.0 %)
2024-05-02T23:16:38-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: down -> loss RTT: 10.4 ms RTTd: 0.8 ms Loss: 20.0 %)
2024-05-02T23:11:37-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: loss -> down RTT: 10.5 ms RTTd: 0.8 ms Loss: 32.0 %)
2024-05-02T23:11:27-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: none -> loss RTT: 10.7 ms RTTd: 1.0 ms Loss: 12.0 %)
2024-05-02T23:01:44-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: delay -> none RTT: 10.4 ms RTTd: 1.4 ms Loss: 0.0 %)
2024-05-02T23:01:33-03:00 Notice dpinger ALERT: FIBERTEL_DHCP (Addr: 1.1.1.1 Alarm: down -> delay RTT: 498.6 ms RTTd: 1574.7 ms Loss: 3.0 %)
2024-05-02T23:00:40-03:00 Warning dpinger FIBERTEL_DHCP 1.1.1.1: sendto error: 65
#4
First, let me thank you for this amazing enterprise-grade solution. I migrated to OPNsense 6 months ago from a pfSense box that had been running in production for 5 years. Even though that meant reconfiguring from scratch, your product gave me back the feeling of a team that actively works on updates and listens to the community, something pfSense has lost over the past years.

I use DNS blocklists, VLANs, gateway failover, NAT, firewalling, and an OpenVPN server for road warriors and site-to-site links. Everything worked just fine in 23.1.

23.1 was working so well that I was afraid to install the past updates; yesterday I finally decided to upgrade to 23.7.

Everything worked except for the site-to-site tunnels I have for remote offices: I could ping from them to me, but not from me to them. I checked the tunnel configuration and everything looked fine, but under Client Specific Overrides everything was gone.

If you experience this, you will need to re-add those configurations.
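
If you want to sanity-check that the overrides are actually being generated again after re-adding them, you can look at the files the OpenVPN service writes out. Rough sketch only; the /var/etc/openvpn-csc path is an assumption and may differ between versions:

# list the generated client-specific override files (path is an assumption, adjust if needed)
ls -lR /var/etc/openvpn-csc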
#5
23.1 Legacy Series / Re: Failover problem 23.1
June 10, 2023, 08:04:00 AM
Well, it seems that no one can give any feedback on this. In fact, I got a reply from a Reddit user who gave me some useful information: this is a known issue that persists across versions.

To anyone looking for feedback or a way to work around this: you may need a custom script to list the states and then kill them all, or just the states that are stuck.

At the moment I'm listing the states with "pfctl -s state -vv"; this prints not only the states but also the unique ID corresponding to each one (see the "id:" tag).
Then I take the ID of the stuck state and kill it with "pfctl -k id -k ID", where ID is that value.
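
Putting that together, a minimal sketch of the workaround (the grep target and the state ID are placeholders; take the real id: value from the -vv listing):

# list all states with their unique IDs (the "id:" tag in the verbose output)
pfctl -s states -vv
# narrow it down to one address, keeping the detail lines that carry the id
pfctl -s states -vv | grep -A2 '8.8.8.8'
# kill one specific state by its ID (replace with the real value from the listing)
pfctl -k id -k 0123456789abcdef
# or, more drastic, flush the whole state table
pfctl -F states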

See:
https://forum.opnsense.org/index.php?topic=10385.0
https://github.com/opnsense/core/issues/4652
https://forum.opnsense.org/index.php?topic=31985.0
https://man.freebsd.org/cgi/man.cgi?query=pfctl&sektion=8




#6
23.1 Legacy Series / Failover problem 23.1
June 09, 2023, 04:56:59 AM
Greetings everyone,

I have an issue with failover configuration. Basically, I have followed the steps outlined in this link: https://docs.opnsense.org/manual/how-tos/multiwan.html

It's worth noting that the only parameters I have configured are the ones described in that link. Additionally, the installation is new, and I'm setting up the router from scratch.

At the moment, everything is working fine. Even the DNS rule is functioning correctly. I have tested by hot-disconnecting ISP1 or ISP2, and the failover is working as expected.

The problem arose when I tried to test the failover behavior by pinging both 8.8.8.8 and 8.8.4.4 from a PC continuously. Let's assume that everything was going through ISP1. When I disconnected the ISP1 link, the failover switched to ISP2, and I was able to confirm it by accessing a webpage. However, the pings stopped responding and never resumed. The only way to make them work again was to cancel the infinite ping command, wait a few seconds, and then run it again. Only then did I receive responses.

This behavior is the same if I perform the failover in reverse.
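
For what it's worth, while the ping is stalled it is worth looking at the state table; the assumption is that the old ICMP state stays pinned to the ISP1 path and is never re-evaluated after the failover. A rough sketch (replace 8.8.8.8 with the address you are pinging):

# show the ICMP states for the test target; the translated source shows which WAN they are bound to
pfctl -s states | grep icmp | grep '8.8.8.8'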

I understand that this should not be the expected behavior, because if I have client software on the LAN that connects to an internet server and uses some form of keep-alive, those connections would never be able to re-establish unless the software were manually restarted. For example, SIP phones that use a cloud-based PBX.

Has anyone experienced this? Am I doing something wrong or missing some configuration?

Thank you very much in advance.