ppp LTE connection going down after certain MB of outgoing traffic, ICMP broken

Started by ktk, December 02, 2019, 07:53:49 PM

I'm seeing some super weird behavior with my PC Engines APU and its LTE-based ppp WAN connection (Sierra Wireless MC7430 Qualcomm Snapdragon X7 LTE-A):

The connection itself works just fine: it comes up quickly after a reboot and I get an IPv4 address (no v6, and the v4 address is a private IP). I can resolve names and do TCP/UDP just fine.

When I run a speed test, download is fine, not as fast as my Wifi router but OK. When I upload, the connection is slow and drops to zero very quickly. After that my link is in nirvana mode for a while: it is still shown as the ppp0 interface, but I can't do anything with it anymore. While monitoring the network I figured out that it generally happens after around 2.5MB of outgoing traffic.

In ppps.log I see this the moment the link stalls: https://pastebin.com/tA6kystN

system.log:

Dec  3 01:28:12 bidul opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 'ppp0'
Dec  3 01:28:12 bidul opnsense: /usr/local/etc/rc.newwanip: On (IP address: 10.131.180.114) (interface: WAN[wan]) (real interface: ppp0).
Dec  3 01:28:12 bidul opnsense: plugins_configure hosts ()
Dec  3 01:28:12 bidul opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'wan'
Dec  3 01:28:12 bidul opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Dec  3 01:28:12 bidul opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv4 default route to 10.64.64.0
Dec  3 01:28:12 bidul opnsense: /usr/local/etc/rc.newwanip: ROUTING: keeping current default gateway '10.64.64.0'
Dec  3 01:28:12 bidul opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to wan
Dec  3 01:28:12 bidul opnsense: /usr/local/etc/rc.newwanip: ROUTING: skipping IPv6 default route
Dec  3 01:28:12 bidul opnsense: plugins_configure monitor ()
Dec  3 01:28:12 bidul opnsense: /usr/local/etc/rc.newwanip: The WAN_DHCP6 monitor address is empty, skipping.
Dec  3 01:28:12 bidul opnsense: /usr/local/etc/rc.newwanip: The WAN_PPP monitor address is empty, skipping.

The link seems to come back after a while but I can very easily trigger it again. I am quite convinced that it's related to how much traffic I *upload*, not to what I download. After around 2MB of outgoing traffic it goes down again, quickly tested it by scp-ing a file to a server.

With large downloads it happens at some point as well, but I'm quite sure that's when I hit the ~2MB in ACKs or the like.
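A rough back-of-the-envelope check of the ACK theory, as a small Python sketch. The numbers are assumptions, not measurements from this link: 1460-byte TCP segments, 52-byte pure ACKs (IP and TCP headers with timestamps), and one ACK for every two segments received.

```python
def download_bytes_for_ack_budget(ack_budget_bytes, mss=1460, ack_size=52,
                                  segments_per_ack=2):
    """Estimate how many downloaded bytes produce `ack_budget_bytes` of pure ACKs.

    All defaults are assumed typical values, not taken from the affected link.
    """
    acks = ack_budget_bytes / ack_size          # number of ACK packets in the budget
    return acks * segments_per_ack * mss        # payload bytes those ACKs acknowledge


budget = 2 * 1024 * 1024                        # the ~2MB of outgoing traffic
estimate = download_bytes_for_ack_budget(budget)
print(f"~{estimate / 1024 / 1024:.0f} MiB downloaded per 2 MiB of ACKs")
```

Under these assumptions a download would stall somewhere on the order of 100 MB, so comparing that against the size at which large downloads actually break would be a quick sanity check of the theory.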

The interesting thing is that there seems to be something related to ICMP as well. Sometimes I can ping hosts on the Internet, sometimes not. But even if I can after a fresh reboot (and that's not always the case, for whatever reason), I definitely can't ping anything *after* the link has gone down for the first time. I see outgoing echo requests on the ppp interface but never a reply.

I'm a bit lost here, this hardware used to work fine before. I recently started from scratch as it was not used for about a year and the only difference IMO is that I'm on a more recent opnsense release. I think the last one was not hardenedBSD yet, could that be a reason? If so, do we still have images I can get pre-hardened so I could test?

FWIW I've also updated the firmware on my LTE card, it was almost 3 years old. But same behavior with the latest release. LTE Signal is in general very good, MIMO antenna attached to it. All values are in Good or Excellent range (tx & rx).


Update: I tested the same setup on pfSense as well, same behavior. I'm starting to suspect that this is the u3g driver acting up.


Hi,
some LTE providers just block ICMP traffic, and they put you in a private network so they can connect more customers behind one public IP. This is called carrier-grade NAT (CGN).
I suggest disabling Gateway Monitoring in the gateway menu, so your gateway won't be marked as down after a few failed pings.

Thanks for the tip, however this seems disabled by default on PPP according to what I see in System->Gateways.

However, I see this in mpd_wan.conf:

set link keep-alive 10 60

This seems related to the

LCP: no reply to 1 echo request

messages I see in the log before the link goes down. Setting this to 0 should disable the check according to the mpd docs.
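To make the two numbers concrete, here is a small Python sketch that spells out what such a line means. It follows my reading of the mpd documentation for `set link keep-alive <seconds> <max>`, so treat the wording as an assumption; `describe_keep_alive` is a made-up helper, and `0 0` is just the value I would try for testing.

```python
def describe_keep_alive(line):
    """Explain an mpd 'set link keep-alive <seconds> <max>' line (hypothetical helper)."""
    parts = line.split()
    if parts[:3] != ["set", "link", "keep-alive"] or len(parts) != 5:
        raise ValueError(f"not a keep-alive line: {line!r}")
    seconds, max_seconds = int(parts[3]), int(parts[4])
    if seconds == 0:
        # Per the mpd docs, a zero interval turns the LCP echo check off.
        return "LCP echo disabled"
    return (f"LCP echo request after {seconds}s of quiet, "
            f"link dropped after {max_seconds}s without a reply")


print(describe_keep_alive("set link keep-alive 10 60"))  # the generated setting
print(describe_keep_alive("set link keep-alive 0 0"))    # assumed test value
```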

However, I can't seem to override this as the file gets recreated on every restart.

Sending SIGHUP does not help either; mpd seems to get killed that way, not reconfigured.

Does anyone have an idea how I can override that for testing? I can't find it in the UI, so I guess it is hard-coded.

Latest update: It has been stable for 2 days now and I do not see the LCP errors anymore. Upload speed is much better, too.

Strange thing is that I did not change anything IMO, maybe provider glitch.