OPNsense Forum

English Forums => General Discussion => Topic started by: schnipp on May 25, 2018, 04:42:23 pm

Title: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on May 25, 2018, 04:42:23 pm
Hi all,

I have problems with the builtin SIP client of my Fritzbox in conjunction with dynamic IP changes triggered by my ISP after 24 hrs. There are already a lot of similar discussions in this forum, but none of them covers all aspects. The Fritzbox works in IP client mode and gets a private IP address from OPNsense, which itself controls the PPPoE connection to the ISP.

In IP client mode the Fritzbox behind the firewall is not aware of upcoming IP changes. After such a change has occurred, no incoming SIP connections are possible anymore. This is reasonable because the ISP's SIP registrar knows only the previously registered IP address, which is now outdated.

Normally the SIP client (Fritzbox) must perform a new registration with the SIP registrar. Unfortunately, this does not happen because the client has no information about the IP address change. By the way, the Fritzbox does not allow the user to manually set the registration expiration time. Running siproxd in proxy mode is no solution either.

Does anybody have an idea how to solve this issue?

Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on May 27, 2018, 04:16:46 pm
I did some more investigation and tracked this issue down. The Fritzbox behaves correctly and tries to initiate a new SIP registration with the registrar after the public IP has changed. Wireshark showed a lot of SIP registration requests sent by the Fritzbox which are not answered by the registrar.

A closer look at the packet capture file of the WAN interface revealed multiple source IP addresses used as the sender address of the outgoing packets. I identified these addresses as orphans from previous WAN connections (before the IP change).

The cause is stale entries in the NAPT table after the WAN IP has changed. Outdated but still existing entries translate the private source IP of outgoing packets to an outdated public source IP address. These packets will not be forwarded by the ISP due to the invalid source IP.

So it seems to be a bug in OPNsense. Manually flushing the state table after the public IP has changed solves the issue and lets the Fritzbox successfully re-register its SIP accounts. I will open a new entry in the issue tracker on GitHub.
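
To check for such leftovers before flushing anything, the state table can be filtered for the old address. The following is only a sketch: the function name states_using_ip is made up, the sample lines merely imitate the general shape of pfctl -ss output (an assumption), and all addresses are documentation ranges. On a live system the input would come from pfctl -ss instead of the sample function.

```shell
#!/bin/sh
# Sketch: print every state line in which a given IP appears as one of the
# addresses (source, translated source, or destination).
# states_using_ip is a hypothetical helper, not part of OPNsense.
states_using_ip() {
    awk -v ip="$1" '
    {
        for (i = 1; i <= NF; i++) {
            addr = $i
            gsub(/[()]/, "", addr)      # drop parentheses around an address
            sub(/:[0-9]+$/, "", addr)   # drop the trailing :port
            if (addr == ip) { print; next }
        }
    }'
}

# Sample input (assumed format, documentation addresses only).
sample_states() {
    cat <<'EOF'
all udp 203.0.113.5:5060 (192.168.1.20:5060) -> 198.51.100.10:5060 MULTIPLE:MULTIPLE
all udp 203.0.113.50:5060 (192.168.1.20:5060) -> 198.51.100.10:5060 MULTIPLE:MULTIPLE
all tcp 192.168.1.30:51000 -> 192.168.2.40:22 ESTABLISHED:ESTABLISHED
EOF
}

# Old WAN IP 203.0.113.5: only the first sample line is reported.
sample_states | states_using_ip 203.0.113.5
```

The comparison is done on whole addresses, so 203.0.113.5 does not accidentally match 203.0.113.50.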
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: marjohn56 on May 27, 2018, 05:35:28 pm
Have you tried enabling the Kill States option in the gateway monitoring in Firewall->Settings->Advanced?
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on May 27, 2018, 07:23:26 pm
Quote
Have you tried enabling the Kill States option in the gateway monitoring in Firewall->Settings->Advanced?

Thanks for the hint, I have tested it, but without success.

In my eyes killing the NAPT table states based on gateway monitoring would not make any sense due to possible race conditions when cleaning the NAPT table.

The gateway monitoring and the renewal of the client's IP address do not have a causal dependency. In detail, the process of renewing a client's IP address could be fast enough that the monitoring process running in parallel does not recognize any interruption (because the gateway address does not necessarily change). Furthermore, after states have been killed due to a non-responding gateway, new entries with the outdated IP address could be introduced into the NAPT state table if the WAN interface still has the old IP address (which is not valid anymore).

So, it is important to do NAPT state killing after the WAN interface is cleaned up (old IP address removed).
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: marjohn56 on May 28, 2018, 08:22:13 am
OK, well it was worth a try. So what you really need is that when dhclient ( I'm assuming that this is IPv4 ) gets a new IP you want the states cleared at that point?
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: franco on May 28, 2018, 11:15:31 am
You mean like someone requested 10 days ago and already finished for 18.1.9 this week?

https://github.com/opnsense/core/issues/2414

:D
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: marjohn56 on May 28, 2018, 11:18:37 am
Yes, exactly that. 8)
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: franco on May 28, 2018, 11:20:17 am
What are the odds, huh...


Cheers,
Franco
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on May 28, 2018, 06:36:31 pm
The github post looks like the same issue, so I do not need to open a new item on the issue tracker.
And a patch is already available 8). I will test it the next weekend.

I had a short look at the code snippet and asked myself if there is a need for a configuration option. In general, if a WAN interface gets a new IP and the old IP gets outdated, related entries in the NAT (and firewall) state table should always be removed.
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: franco on May 29, 2018, 08:17:11 am
This would pose a huge issue for Multi-WAN, especially when you are load-balancing. ;)

I checked other solutions' code, and while they flush the NAT table for said IP, the connection tracker is not NAT, so that's why this is needed alongside... NAT comes after connection tracking, so clearing NAT states does not help...


Cheers,
Franco
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on May 29, 2018, 03:48:39 pm
Quote
This would pose a huge issue for Multi-WAN, especially when you are load-balancing.

In my eyes it depends.

If a network link goes down and the ISP assigns you a new IP address (as a replacement) when the link comes up again, the old IP address becomes invalid. In case a link of a multilink connection goes down the assigned IP address is in general still valid.

Only the entries related to an invalid IP address have to be removed from the NAPT's and firewall's state tables. All other entries should remain. If you trigger a full cleaning of state tables, connections which are still alive will break (e.g. internally routed connection among network interfaces).
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: franco on May 29, 2018, 06:35:57 pm
Connections are tracked via internal IP / External IP, not via NAT'ed IP. There is no way to easily separate the state table.


Cheers,
Franco
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on May 29, 2018, 07:47:25 pm
The following command should work, shouldn't it?

pfctl -k <old_ip> 
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: franco on May 30, 2018, 09:21:52 pm
Hi schnipp,

Yes and no. What is old_ip? The old WAN IP? The connection is initiated by the client, the firewall tracks (internal IP, external IP) for the state, then hits NAT. The return packet hits NAT and is translated to internal IP where the state is found.

So you can:

Kill the state of the firewall's own connections
Kill the state of the internal IP's connections
Kill the state of the external IP's connections

But you can't:

Kill the state of the NAT IP and at the same time flush the state table for unrelated entries internal IP and external IP.
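
For reference, the kill variants listed above map onto pfctl -k roughly as follows. This is a dry-run sketch: kill_states is a hypothetical wrapper that only prints the command unless DRY_RUN=0 is set, and all addresses are placeholders.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints the pfctl command instead of running it
# unless DRY_RUN=0 is set in the environment.
kill_states() {
    if [ "$#" -eq 1 ]; then
        set -- /sbin/pfctl -k "$1"            # states originating from host/network $1
    else
        set -- /sbin/pfctl -k "$1" -k "$2"    # states from $1 to $2
    fi
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

kill_states 192.168.1.20                # an internal client (placeholder address)
kill_states 0.0.0.0/0 198.51.100.10     # anything going to one external host
```

A single -k selects by originating host or network; a second -k narrows the selection to states going from the first to the second.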


Cheers,
Franco
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on May 31, 2018, 07:45:12 pm
By old_ip I mean the previous WAN IP assigned by my ISP. Thanks for explaining the order in which firewall and NAT work together. In this case there is no need to remove states from the firewall, only the related ones from the NAT table.

My first workaround used pfctl -F states which was too much. But I'll try the patch. Maybe, both of us mean the same  :)



Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: franco on May 31, 2018, 08:06:08 pm
pfSense does it the way you describe always, but despite this there is still the override that was introduced in 18.1.9 so I think there is a problem with only killing NAT states or it simply does not solve the issue for persistent connections.


Cheers,
Franco
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on June 03, 2018, 07:52:40 pm
I tested the patch; clearing the states works well. But now, internal connections between interfaces will also be reset. So, in case the WAN connection breaks (e.g. DSL signal loss, 24h reconnect, …), a reset of internal connections occurs (e.g. remote backups etc. will fail).
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on August 16, 2018, 06:55:28 pm
Quote
[...] But now, internal connections between interfaces will also be reset.

When I updated OPNsense (v18.7) a few days ago, I got trapped in the PPPoE reconnect loop again (https://github.com/opnsense/core/issues/2267) because I forgot to reinstall my workaround. This time it was not possible to install the workaround, because the reconnect loop triggered killing the NAT states, which regularly dropped my internal SSH session, too  :'(
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: karl047 on August 16, 2018, 09:58:37 pm
@schnipp: exactly... that was my problem too, with the PPPoE reconnect loop that I also had with 18.7.1_3.
I reported earlier that the PPPoE reconnection was stable only on 18.1.9, where it worked fine without any problem. I tried it over a couple of days, more than 10 times a day. With the later updates of 18.1 or with 18.7 it doesn't work anymore.

For your Fritzbox after the change of the dynamic IP on WAN interface, have you tried the option:
"Reset all states when a dynamic IP address changes" in Firewall -> Settings -> Advanced ? the last option there?
You can simply check it and leave the option "Disable State Killing on Gateway Failure" checked. That was the solution to this issue for me, and my Fritz keeps working fine after the IP address changes (I mean of course the IP address assigned by my ISP).
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on August 20, 2018, 10:00:59 pm
@karl047: I think there is a misunderstanding. The issue with the reconnect loop triggered state killing every 30 seconds (which also dropped LAN connections like SSH to Opnsense console). So it was hard to apply my workaround for the reconnect loop issue.

My wish is to limit state killing only to WAN connections. If I understand correctly, this will be a lot of work in scripting etc.
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: glasi on October 09, 2021, 08:04:29 pm
Sorry for pulling this old thread out again.

Historically, I had enabled the setting "Reset all states when a dynamic IP address changes" to avoid any stale states which would lead to problems with my VoIP setup.

I completely missed that I don't need this setting any longer: since OPNsense 21.1, WAN IP address changes are detected by the rc.newwanip script, and states of outdated WAN IPs are removed from the state table.

The beauty with the code changes in rc.newwanip script is that LAN connections now keep alive on WAN IP address changes while state entries originating from the old WAN IP will still be killed.

However, I am asking myself if the respective code snippet in rc.newwanip should be amended by mwexec('/sbin/pfctl -k 0.0.0.0/0 -k ' . $cacheip); to ensure that all state entries destined for the old WAN IP are killed as well. The code snippet would then look as follows:

Code: [Select]
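// Runs only when a cached IP exists, the new IP differs from it,
// and the ip_change_kill_states setting is not set.
if (is_ipaddr($cacheip) && $ip != $cacheip && !isset($config['system']['ip_change_kill_states'])) {
        log_error("IP address change detected, killing states of old ip $cacheip");
        // existing call: kill states originating from the old WAN IP
        mwexec('/sbin/pfctl -k ' . $cacheip);
        // proposed addition: kill states from any source to the old WAN IP
        mwexec('/sbin/pfctl -k 0.0.0.0/0 -k ' . $cacheip);
}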


During my testing I've experienced one unsightly issue. States won't be killed when the files with the cached IP addresses have been deleted from /var/db. Now one might wonder if that can happen. Unfortunately, the answer is yes. The files are deleted once the pppoe interface is removed (maybe due to pulling out the WAN network cable or when clicking pppoe disconnect in the GUI).

For that reason I would like to suggest that the cache files remain untouched when the pppoe interface is removed.
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on October 10, 2021, 11:49:58 am
@glasi:
Thanks for your investigation. My first tests also show that the following two commands are sufficient during a dynamic IP change on the WAN interface:

Code: [Select]
mwexec('/sbin/pfctl -k ' . $cacheip);
mwexec('/sbin/pfctl -k 0.0.0.0/0 -k ' . $cacheip);

Opnsense should proceed with the following work flow in case a dynamic IP change occurs on the WAN interface, no matter what reason triggered the change (e.g. loss of link, ppp link down event sent by ISP, rejected renewal of DHCP lease etc.)


Draft:
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: Fright on October 10, 2021, 05:57:18 pm
@glasi @schnipp hi!
Quote
mwexec('/sbin/pfctl -k 0.0.0.0/0 -k ' . $cacheip);
imho although it seems logical to me in terms of freeing up space in the states table, from a practical point of view: what are the chances that such a packet will appear on the interface after dynamic ip change?)

Quote
Opnsense should proceed with the following work flow in case a dynamic IP change
hmm. may I ask for a comment?
3.1. ip address is explicitly specified in the rules?
3.2. static routes on dynamic interface?
3.3. can you give an example please?
3.4. isn't that happening now? (mwexec('/sbin/pfctl -k ' . $cacheip);)
the problem is that the 'pfctl -k' can kill the state by the source or target ip. but not by mapped address
so actually we have to parse the 'pfctl -ss' output and kill states by id?
or run some custom script with "pfctl -k internal_client_ip -k target_server_ip"?
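
a rough sketch of the second variant, for illustration only. assumptions: NAT'ed entries in the pfctl -ss output look like "iface proto translated:port (internal:port) -> external:port state", kill_cmds_for_old_ip is a made-up name, the addresses are documentation ranges, and the script only prints the commands it would run. it is untested against a live box, and whether pfctl matches the translated entries this way is exactly the open question here.

```shell
#!/bin/sh
# Sketch: for every NAT state still translated to the old WAN IP, print a
# pfctl command that kills states from the internal client to the remote host.
# Assumed line shape: IFACE PROTO NAT:PORT (LAN:PORT) -> REMOTE:PORT STATE
kill_cmds_for_old_ip() {
    awk -v old="$1" '
    $4 ~ /^\(.*\)$/ && $5 == "->" {
        nat = $3; sub(/:[0-9]+$/, "", nat)
        lan = $4; gsub(/[()]/, "", lan); sub(/:[0-9]+$/, "", lan)
        rem = $6; sub(/:[0-9]+$/, "", rem)
        if (nat == old) print "/sbin/pfctl -k " lan " -k " rem
    }' | sort -u
}

# Sample input (assumed format, documentation addresses only).
sample_nat_states() {
    cat <<'EOF'
all udp 203.0.113.5:5060 (192.168.1.20:5060) -> 198.51.100.10:5060 MULTIPLE:MULTIPLE
all udp 203.0.113.5:7078 (192.168.1.20:7078) -> 198.51.100.10:30000 MULTIPLE:MULTIPLE
all tcp 203.0.113.77:40000 (192.168.1.30:40000) -> 198.51.100.99:443 ESTABLISHED:ESTABLISHED
all tcp 192.168.1.30:51000 -> 192.168.2.40:22 ESTABLISHED:ESTABLISHED
EOF
}

# Old WAN IP 203.0.113.5: both stale entries collapse into one command.
sample_nat_states | kill_cmds_for_old_ip 203.0.113.5
```

note that such a command would also hit every other state between the same two hosts, not just the stale one.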

imho it is worth remembering that we force OPNsense to do what it should not do.
if the application claims to be NAT-aware then it should take care of such situations
(for example, if the PBX maintains states with frequent keepalive\heartbeats, then there should also be configurable delays available (for example. "send keepalive packets every 'n' seconds. if the answer is not received then try 'k' times with 'm' sec interval, then keep silent for 'p' seconds and initiate new registration"))
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on October 10, 2021, 07:19:04 pm
Quote
@glasi @schnipp hi!
Quote
mwexec('/sbin/pfctl -k 0.0.0.0/0 -k ' . $cacheip);
imho although it seems logical to me in terms of freeing up space in the states table, from a practical point of view: what are the chances that such a packet will appear on the interface after dynamic ip change?)

In theory no packets with the old IPv4 address will arrive, but from practical perspective there is no global answer. But, you can treat this old IPv4 address like an additional (temporary) bogon.

Quote
[...] may I ask for a comment?
3.1. ip address is explicitly specified in the rules?
3.2. static routes on dynamic interface?
3.3. can you give an example please?
3.4. isn't that happening now? (mwexec('/sbin/pfctl -k ' . $cacheip);)
the problem is that the 'pfctl -k' can kill the state by the source or target ip. but not by mapped address
so actually we have to parse the 'pfctl -ss' output and kill states by id?
or run some custom script with "pfctl -k internal_client_ip -k target_server_ip"?

3.1: There can be firewall or forwarding rules on the WAN interface which integrate the dynamic IPv4 address which need to be updated.
3.2: Possibly yes. Static routes bound to the WAN interface internally need an update of the gateway IP address (OK, the latter one is dynamic  :) ).
3.3: This step is optional but shortens the time that local processes on the Opnsense run into a timeout.
3.4: Yes, it's happening now, but can be extended by the second command as glasi already mentioned  :). Killing the states identified by the old WAN IP address is sufficient.


Quote
imho it is worth remembering that we force OPNsense to do what it should not do.

I don't think so. SOHO DSL routers do the same.

Quote
if the application claims to be NAT-aware then it should take care of such situations
(for example, if the PBX maintains states with frequent keepalive\heartbeats, then there should also be configurable delays available (for example. "send keepalive packets every 'n' seconds. if the answer is not received then try 'k' times with 'm' sec interval, then keep silent for 'p' seconds and initiate new registration"))

NAT-awareness of applications does not matter because, as I already mentioned, the issue resides at OSI layers 3 and 4. Packets sent by the applications (including keep-alives at application level) reset the timeout counter of outdated NAPT entries, which then will never be deleted.
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: Fright on October 10, 2021, 07:57:44 pm
the discussion is becoming more theoretical, but I hope this is ok)
Quote
you can treat this old IPv4 address like an additional (temporary) bogon
do not agree. it is just an old address, new connections to which will not be allowed, and old ones will expire according to the timeout settings. in scenarios with high load, freeing up space in the states table is probably logical, but nothing more IMHO
Quote
3.1: There can be firewall or forwarding rules on the WAN interface which integrate the dynamic IPv4
but the rules contain parentheses for the interface address. so there is no need for additional actions (like reloading rules) to pick up a new address?
Quote
3.2: Possibly yes. Static routes bound to the WAN interface
spooky config )
Quote
3.3: This step is optional but shortens the time that local processes on the Opnsense run into a timeout
shouldn't the local process connect to the localhost?
Quote
I don't think so. SOHO DSL routers do the same.
in my opinion the same thing: routers devs have to workaround others issues because of:
Quote
the issue resides at OSI layer 3 and 4
and higher levels too. if app uses stateless proto then the flow control falls on the application.
(in tcp its somewhat simpler)
if the connection for some reason stops working, what is the point in continuing to persistently and quickly knock on the broken door? the application should provide different actions in response to events (especially since it is considered nat-aware and it is this awareness that creates problems)
but of course this is more a question of terminology.

imho there may be another workaround for this: make a very low timeout for the nat-rule and add a separate rdr-rule for incoming packets


Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on October 12, 2021, 06:18:09 pm
Quote
the discussion is becoming more theoretical, but I hope this is ok)

Ok, let's shorten the discussion and try yourself if you still don't believe me  :)

Preparation:

Testing: ('$' means execute the following command on the command line)

Result:
The DNS query to 8.8.8.8 times out, because it hits the outdated NAT state table entry and the request will be translated to the wrong (outdated) source IPv4 address. These packets will be dropped by the ISP. As long as the client fires DNS requests to 8.8.8.8 the existing NAT state table entry will NEVER expire, because the timer resets to its starting value on every DNS query.

After manually deleting the outdated NAT state table entry ($ pfctl -k 0.0.0.0/0 -k 8.8.8.8 ), DNS queries to 8.8.8.8 will be successfully answered :-)

For this reason, deleting the outdated NAT state table entries is essential after the WAN IP has changed.
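
The "never expires" part can be illustrated with a toy model. This is not pf code, just an idle timer that is reset by every packet; simulate_state and the numbers (60-second timeout, one query every 20 seconds, one hour of observation) are made up for illustration.

```shell
#!/bin/sh
# Toy model of a state entry with an idle timeout.
# Usage: simulate_state TIMEOUT INTERVAL DURATION   (all in seconds)
# INTERVAL=0 means the client sends nothing at all.
simulate_state() {
    timeout=$1
    interval=$2
    duration=$3
    last_seen=0
    t=0
    while [ "$t" -le "$duration" ]; do
        # The entry is dropped once it has been idle for TIMEOUT seconds.
        if [ $((t - last_seen)) -ge "$timeout" ]; then
            echo "expired at ${t}s"
            return 0
        fi
        # Every packet from the client resets the idle timer.
        if [ "$interval" -gt 0 ] && [ $((t % interval)) -eq 0 ]; then
            last_seen=$t
        fi
        t=$((t + 1))
    done
    echo "still alive after ${duration}s"
}

simulate_state 60 20 3600   # client keeps querying: still alive after 3600s
simulate_state 60 0 3600    # client stays silent:   expired at 60s
```

As long as the query interval is shorter than the timeout, the stale entry survives indefinitely, which matches the behaviour described above.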

Edit:
- I have added an excerpt of a wireshark packet capture
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: Fright on October 12, 2021, 07:21:24 pm
Quote
try yourself if you still don't believe me

I didn’t seem to say anywhere that I don’t believe you or that pf doesn’t work the way you say it  ;)
everything is exactly like that.

I said that pf works as it should (from my point of view). and the application should take into account the possible change of the external address (or other circumstances), especially if it is supposed to be nat-aware (and intentionally keeps the state alive)
the example with dns, by the way, is quite indicative. Any dns client allows to specify several DNS resolvers and this removes the need to solve the states "problem" on the firewall side (and this despite the fact that the dns client does not intentionally send requests to save the state).
so I am talking only about two things: the correct application should provide settings for working with a nat-device with a dynamic address IMHO and that there may be ways to solve the "state issues" without resetting (and i cannot test this assumption since my PBXs works on static addresses)
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: schnipp on October 18, 2021, 03:17:15 pm
@Fright: It looks like you haven't understood the demonstrated DNS example, and to my mind you are lacking basic TCP/IP stack and NAT knowledge. It has nothing to do with specifying multiple DNS resolvers or changing any of them. The discussion around SIP and DNS illustrates, by way of example, possible issues regarding any TCP/UDP communication in that context. As far as the states issue is concerned, internal devices do not need to be NAT aware.

It does not make sense to discuss this topic further. It's a fact that OPNsense has a bug in managing the NAT states. I'll raise a GitHub ticket so that this issue gets solved.
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: Fright on October 18, 2021, 07:41:59 pm
Quote
It does not make sense to discuss
now for sure
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: glasi on October 18, 2021, 09:01:17 pm
Quote
Testing: ('$' means execute the following command on the command line)
  • [...]

Result:
The DNS query to 8.8.8.8 times out, because it hits the outdated NAT state table entry and the request will be translated to the wrong (outdated) source IPv4 address. These packets will be dropped by the ISP. As long as the client fires DNS requests to 8.8.8.8 the existing NAT state table entry will NEVER expire, because the timer resets to its starting value on every DNS query.

After manually deleting the outdated NAT state table entry ($ pfctl -k 0.0.0.0/0 -k 8.8.8.8 ), DNS queries to 8.8.8.8 will be successfully answered :-)

For this reason, deleting the outdated NAT state table entries is essential after the WAN IP has changed.

Thanks for the example, which illustrates the problem well.

I agree that it basically affects all TCP / UDP communication.

IMHO, the state table should always be cleaned up for invalid entries when changing the IP address. OPNsense actually does it quite well. For 100% perfection, however, any entries that are referenced should really be removed from the state table.

I therefore suggest that we add the following line to the rc.newwanip script, as already mentioned:

Code: [Select]
mwexec('/sbin/pfctl -k 0.0.0.0/0 -k ' . $cacheip);
With this additional line we really don't break anything.
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: glasi on October 18, 2021, 09:04:12 pm
Quote
During my testing I've experienced one unsightly issue. States won't be killed when the files with the cached IP addresses have been deleted from /var/db. Now one might wonder if that can happen. Unfortunately, the answer is yes. The files are deleted once the pppoe interface is removed (maybe due to pulling out the WAN network cable or when clicking pppoe disconnect in the GUI).

For that reason I would like to suggest that the cache files remain untouched when the pppoe interface is removed.

Anyone having a clue which script or code snippet deletes the IP cache files when the (pppoe) interface is removed?
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: Fright on October 18, 2021, 10:05:51 pm
@glasi
in the end of interface_bring_down() function
https://github.com/opnsense/core/blob/dfe3932166f8bf0658964f588dc4713a0678aa1b/src/etc/inc/interfaces.inc#L981
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: glasi on October 19, 2021, 06:08:30 pm
Thanks. I'll have a look.
Title: Re: Fritzbox (IP-Client mode) SIP + NAT + dynamic IP Change
Post by: glasi on October 19, 2021, 09:02:32 pm
I dug a bit further.

States on the PPPoE interface won't be deleted by interface_bring_down() function for two reasons:


Houston, we have a problem, do we?

What I don't understand is why one would like to kill mpd5 process in this situation.