[isolated: see #91] PPPoE reconnect loop

Started by schnipp, February 11, 2018, 02:46:04 PM

May 06, 2018, 01:05:54 PM #91 Last Edit: May 06, 2018, 01:08:02 PM by schnipp
I succeeded in isolating the fault :)

The problem is a timing issue. I have described the details in the GitHub bug tracker (https://github.com/opnsense/core/issues/2267#issuecomment-386871115).

As a dirty workaround, I moved the ppp-linkup script to execute in the background so that it no longer blocks the mpd5 daemon process. To do this, rename the script "/usr/local/sbin/ppp-linkup" to "/usr/local/sbin/ppp-linkup2" and create a new ppp-linkup script in the same directory with the following content:


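#! /bin/sh

# Start the original script (renamed to ppp-linkup2) in the background
# so that it does not block the mpd5 daemon process.
nohup /usr/local/sbin/ppp-linkup2 "${1}" "${2}" "${3}" "${4}" "${5}" "${6}" "${7}" "${8}" "${9}" "${10}" &
exit 0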


Warning: The workaround is not well tested and could have side effects due to unknown dependencies on system state and process synchronization.
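
To illustrate why backgrounding helps, here is a minimal, self-contained sketch that can be run in a scratch directory. The names "slow_task" and "wrapper" are hypothetical stand-ins for ppp-linkup2 and the new ppp-linkup; the 2-second sleep and the redirect to /dev/null are assumptions made for the demo and are not part of the workaround above.

```sh
#!/bin/sh
# Demo only: nothing here touches /usr/local/sbin.
workdir=$(mktemp -d)

# Hypothetical stand-in for the long-running ppp-linkup2 script.
cat > "$workdir/slow_task" <<'EOF'
#!/bin/sh
sleep 2
echo "done: $*" > "$(dirname "$0")/result"
EOF

# Hypothetical stand-in for the wrapper: start the task in the
# background and return to the caller immediately.
cat > "$workdir/wrapper" <<'EOF'
#!/bin/sh
nohup "$(dirname "$0")/slow_task" "$@" >/dev/null 2>&1 &
exit 0
EOF
chmod +x "$workdir/slow_task" "$workdir/wrapper"

# Measure how long the caller is blocked by the wrapper.
start=$(date +%s)
"$workdir/wrapper" pppoe0 inet
elapsed=$(( $(date +%s) - start ))
echo "wrapper returned after ${elapsed}s; slow_task is still running"
```

The caller gets control back before slow_task has finished, which is the effect the workaround relies on so that mpd5 is not held up. The flip side is the one named in the warning: the caller can no longer assume the task has completed when the wrapper returns.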
OPNsense 24.7.11_2-amd64

Not sure if we should talk here or in the GitHub issue... but only fair to follow up here as well.

Did your workaround work better than the test patch proposed on GitHub, or are they virtually the same? It looks like there is an underlying issue as well.


Cheers,
Franco

As not everybody reads GitHub, we can continue here as well :-)

Currently, my workaround works better than the patch proposed on GitHub. The patch seems to have problems with setting the default route (and/or some services?) in case of a reconnect (without reboot).

But generally, the patch takes the right approach by moving long-running tasks into the background.


Best regards,
  schnipp


Hi schnipp,

Okay, thanks, this is funky indeed. Let me sleep on it.


Cheers,
Franco

Today I rebooted my firewall and was offline .. again a reconnect loop. I got an IPv6 address but no IPv4 address. Rebooted again, nothing changed. I edited some interfaces to force a reconnect, nothing changed. After some time I saw in Assignments that vlan7 was again assigned to WAN instead of pppoe.
I changed it and then it worked again ... strange.

At one point I got the following in my ppps.log

http://dpaste.com/1JJY3JD

Quote from: mimugmail on July 12, 2018, 08:43:07 AM
Today I rebooted my firewall and was offline .. again a reconnect loop. I got an IPv6 address but no IPv4 address. Rebooted again, nothing changed.
[...]
At one point I got the following in my ppps.log

It is also a reconnect loop, but a different issue.
It looks like an invalid configuration, however it happened.

Unfortunately, the planned bugfix has been moved to milestone 19.1. So every update overwrites my workaround (the modifications to the /usr/local/sbin/ppp-linkup script). Is there a neat way to prevent this?

I was under the impression https://github.com/opnsense/core/issues/2267#issuecomment-387167501 said that we do not have a workable solution just yet.

I don't think it's a good idea to push a fix into a release that we haven't fully understood yet. I'll be back in September to look into it. In any case, please keep prodding. Most work we do is prioritised and ordered by the amount of help and discussion from reporters. If your updates are missed, please prod again.


Cheers,
Franco

Franco, you're totally right. Please don't misunderstand me.

I was only looking for a hint on how to keep the WAN connection running if I update and then forget to re-apply the workaround. I'll write a short script which applies the workaround after rebooting the machine.
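
Such a re-apply script could look roughly like the sketch below. It is an illustration and not the script schnipp actually wrote: the function name apply_workaround and the "--apply" switch are hypothetical, the check for an existing wrapper (grepping for "ppp-linkup2") is an assumption, and "$@" is used in place of the ten explicit positional parameters from the original wrapper. How to hook it into the boot process is left open.

```sh
#!/bin/sh
# Sketch: re-create the ppp-linkup wrapper if an update restored the
# original script. All names other than the two script paths from the
# thread are hypothetical.

apply_workaround() {
    dir="$1"
    linkup="$dir/ppp-linkup"

    [ -f "$linkup" ] || { echo "no ppp-linkup in $dir" >&2; return 1; }

    # Assumption: only the wrapper mentions ppp-linkup2, so finding that
    # string means the workaround is already in place.
    if grep -q "ppp-linkup2" "$linkup"; then
        return 0
    fi

    # Move the (freshly updated) original aside, replacing any stale copy.
    mv -f "$linkup" "$dir/ppp-linkup2" || return 1

    # Write the wrapper. $dir is expanded now; \$@ stays literal so the
    # wrapper forwards whatever arguments it is called with.
    cat > "$linkup" <<EOF
#!/bin/sh
nohup "$dir/ppp-linkup2" "\$@" &
exit 0
EOF
    chmod 755 "$linkup"
}

# Only act when explicitly asked to, e.g.: script --apply /usr/local/sbin
if [ "${1:-}" = "--apply" ]; then
    apply_workaround "${2:-/usr/local/sbin}"
fi
```

Running it twice is harmless because the second run sees the wrapper and returns early. It is worth trying against a copy of the scripts in a scratch directory before pointing it at /usr/local/sbin.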

January 29, 2019, 06:16:25 PM #100 Last Edit: January 30, 2019, 06:09:23 PM by schnipp
OPNsense 19.1 will be released soon. Thank you to all who made this happen. I was really happy that the problem with PPPoE reconnect loops would be solved in this version. Unfortunately, the bugfix has been postponed again.

Without a bugfix, OPNsense stops working after every update, and it is difficult to patch the system by hand because every reconnect attempt triggers the state reset feature, which cleans up the NAT tables. This feature was introduced to keep SIP communication working after an ISP-initiated IP address change.

Is there a chance to get the problem solved before release 19.7?

[Edit]
The topic is still relevant for the upcoming release OPNsense 19.1, so further discussion will move to this forum (see here). Please answer there.

The last ticket update on https://github.com/opnsense/core/issues/2267 was on 26 Jun 2018. As stated elsewhere, please complain early *and* often.

My work queue, which is 100% self-funded and handled after a 40-hour day job entirely away from OPNsense, is maxed out either way. I can only prioritise according to user feedback and progress.


Cheers,
Franco