PBX loses SIP connection randomly. Restarting OPNsense restores the SIP connection

Started by voiping, October 25, 2023, 10:48:23 AM

As the title states, my PBX randomly loses its SIP connection. Restarting the SIP trunk does not help, and neither does restarting the PBX. Everything else in my network works, such as internet access and communication between devices.

The following output is what I get when the SIP connection fails:

-----  Last Packet Received for this User -----

----- Last Diagnostic information for this User -----
resFE_MITOSFW_NET_SOCK_CONNECTIONLOST: Connection lost

----- Current state -----
STUN: STUN Failure
Registration: init

----- Connection List:  -----
[0]: peerAddr=217.0.149.240:5060  TCP  proxy=***.primary.companyflex.de:0  type=Provider  Number of User(s)=1 
[1]: peerAddr=217.0.149.16:5060  TCP  proxy=***.primary.companyflex.de:0  type=Provider  Number of User(s)=1 
[2]: peerAddr=217.0.150.16:5060  TCP  proxy=***.primary.companyflex.de:0  type=Provider  Number of User(s)=1 

Local TCP-port: 0
Remote TCP-addr:


After restarting OPNsense (23.7.6), the PBX can establish a SIP connection again and calls can be made in both directions.

Currently there is only an outbound NAT rule for the PBX:

Interface  Source    Source Port  Destination  Destination Port  NAT Address  NAT Port  Static Port
WAN        PBX_Host  tcp/*        *            tcp/5060          WAN address  *         YES


So far I have tried multiple things.

I forwarded SIP and RTP to my PBX.

I have set an outbound NAT rule from the PBX to any destination with Static Port checked.

Interface  Source    Source Port  Destination  Destination Port  NAT Address  NAT Port  Static Port
WAN        PBX_Host  *            *            *                 WAN address  *         YES


I set the DNS servers on the PBX to the ones that are assigned on PPPoE registration.

So far, nothing has helped in the long run.

When checking the Live Logs on OPNsense, nothing is blocked and you can see that the NAT outbound rule is working for port 5060.

That sounds like an issue with a stale state. The next time it happens, instead of rebooting the OPNsense, try to look at "Firewall: Diagnostics: States" and check if there are open states that went stale. You can delete them individually, or on "Actions" "Reset state table".
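
As a rough illustration of the state check described above, here is a minimal Python sketch that filters a text dump of the state table for entries pointing at one provider address. It assumes you saved the state list to text (for example from `pfctl -s states` on the shell). The sample lines are hand-written in the general shape of a pf state listing, not captured output, and the function name `states_for_peer` and the local addresses are made up for the example. The real format can differ between versions.

```python
from typing import Iterable, List, Tuple


def states_for_peer(lines: Iterable[str], peer_ip: str, port: int = 5060) -> List[Tuple[str, str]]:
    """Return (line, state) pairs for state entries that mention peer_ip:port.

    Assumption: each entry is one line and its last whitespace-separated
    token is the connection state (e.g. ESTABLISHED:ESTABLISHED).
    """
    needle = f"{peer_ip}:{port}"
    found = []
    for line in lines:
        tokens = line.split()
        if needle in tokens:
            found.append((line.strip(), tokens[-1]))
    return found


# Hand-written sample in the assumed format (illustrative addresses only).
SAMPLE = [
    "all tcp 198.51.100.7:5060 (192.168.1.20:5060) -> 217.0.149.240:5060       ESTABLISHED:ESTABLISHED",
    "all tcp 198.51.100.7:5060 (192.168.1.20:5060) -> 217.0.149.16:5060       FIN_WAIT_2:FIN_WAIT_2",
    "all udp 192.168.1.20:123 -> 192.0.2.1:123       MULTIPLE:SINGLE",
]

for entry, state in states_for_peer(SAMPLE, "217.0.149.240"):
    print(state, "|", entry)
```

An entry that still shows as established while the PBX reports "Connection lost" would be a candidate for a stale state worth deleting individually.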

If that was the problem, you could tune the behavior of states in the firewall rules that allow the traffic of the Port Forward/Outbound NAT rules (For example faster timeouts) or change it globally in:

"Firewall: Settings: Advanced - Miscellaneous - Firewall Optimization"



Hardware:
DEC740

Quote from: Monviech on October 25, 2023, 11:28:58 AM
That sounds like an issue with a stale state. The next time it happens, instead of rebooting the OPNsense, try to look at "Firewall: Diagnostics: States" and check if there are open states that went stale. You can delete them individually, or on "Actions" "Reset state table".

I will check the state table next time! Thanks for pointing that out!

Quote from: Monviech on October 25, 2023, 11:28:58 AM
If that was the problem, you could tune the behavior of states in the firewall rules that allow the traffic of the Port Forward/Outbound NAT rules (For example faster timeouts) or change it globally in:

"Firewall: Settings: Advanced - Miscellaneous - Firewall Optimization"

I will try setting the Firewall Optimization next time to aggressive. Instead of changing it for the NAT rules, is it possible to set it for an entire IP address?

Thank you so much for your help!

You don't set it on the NAT rules. You set it on the firewall rules that allow the traffic. The firewall rules create the states (check the advanced features in one of your firewall rules; you will see options such as "state types" and "state timeout").

For TCP you could go down to a 10 minute state timeout for example.

If you set "Associated Filter Rule" to "Pass" in the NAT rules, you won't see any firewall rule you can edit. If you set it to "Add associated filter rule", you can't edit the connected firewall rule. You can set it to "None" and create your own firewall rule to allow the traffic of the NAT rule. Here's a packet flow diagram to help you:
https://forum.opnsense.org/index.php?topic=36326.0
Destination NAT (Port Forward) matches before PF filter rules, that means you have to allow the internal IP address as destination in a filter rule on your WAN interface.

Thank you so much for the clarification! I will give that a shot and report back.

Quote from: Monviech on October 25, 2023, 11:28:58 AM
That sounds like an issue with a stale state. The next time it happens, instead of rebooting the OPNsense, try to look at "Firewall: Diagnostics: States" and check if there are open states that went stale. You can delete them individually, or on "Actions" "Reset state table".

If that was the problem, you could tune the behavior of states in the firewall rules that allow the traffic of the Port Forward/Outbound NAT rules (For example faster timeouts) or change it globally in:

"Firewall: Settings: Advanced - Miscellaneous - Firewall Optimization"

So it happened again. I was able to receive a call, but there was no audio. I reset the state table and it worked again. Thank you for the tip!

I set the Firewall Optimization now to aggressive. I read in another thread that it should be set to conservative https://forum.opnsense.org/index.php?topic=3901.0

What would make more sense?

I only have an outbound NAT rule that makes the SIP port 5060 static. How would I set the TCP state timeout for the PBX? Should I create a NAT rule, or could I achieve the same with a regular firewall rule?

Thank you so much for your help! It has already helped a ton.

I think that apart from tuning the firewall to keep the connections open for a bit longer (which would correspond to "conservative"), you could also check whether your PBX has a setting for "keep-alive period" and set it to something low like 30 seconds.

The problem with SIP over UDP is that it is "connectionless". The outgoing UDP packets pass the firewall, but the opposite direction has to be kept open in order to allow signaling an incoming call. This can be done by telling the firewall to keep it open for a bit longer or by keeping the SIP connection alive by telling the PBX to poll more often (aka keep-alive). Some providers allow SIP over TCP, which does not have this problem.
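
To make the trade-off between keep-alive period and firewall state timeout concrete, here is a small Python sketch. It only encodes the rule of thumb that keep-alives must arrive before the state expires. The function name `state_survives`, the safety margin and the timeout values are assumptions for illustration, not OPNsense defaults.

```python
def state_survives(keepalive_s: float, state_timeout_s: float, margin: float = 0.5) -> bool:
    """Return True if keep-alives arrive comfortably before the state expires.

    margin=0.5 means the keep-alive interval may be at most half of the
    state timeout, so that a single lost packet does not let the state expire.
    """
    if keepalive_s <= 0 or state_timeout_s <= 0:
        raise ValueError("intervals must be positive")
    return keepalive_s <= state_timeout_s * margin


# Hypothetical state timeouts in seconds, for illustration only.
for timeout in (30, 60, 300):
    ok = state_survives(keepalive_s=30, state_timeout_s=timeout)
    print(f"keep-alive 30 s, state timeout {timeout} s -> survives: {ok}")
```

The same check works the other way round: if the PBX keep-alive cannot be changed, it tells you roughly how long the firewall has to keep the state.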
Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

Quote from: meyergru on November 10, 2023, 01:47:55 AM
I think that apart from tuning the firewall to keep the connections open for a bit longer (which would correspond to "conservative"), you could also check whether your PBX has a setting for "keep-alive period" and set it to something low like 30 seconds.

The problem with SIP over UDP is that it is "connectionless". The outgoing UDP packets pass the firewall, but the opposite direction has to be kept open in order to allow signaling an incoming call. This can be done by telling the firewall to keep it open for a bit longer or by keeping the SIP connection alive by telling the PBX to poll more often (aka keep-alive). Some providers allow SIP over TCP, which does not have this problem.

Thank you for your input!

Here is the configuration of our SIP trunk:
----- Configuration Data -----
provider name:           Telekom CompanyFlex SIP-Trunk
user name:               +49XXX
authorization user name: +49XXX@tel.t-online.de
domain name:             tel.t-online.de
transport protocol:      tcp
transport security:      Traditional
media security:          RTP only
proxy:                   tel.t-online.de:0
registrar:               tel.t-online.de:0
expiration time:         540
outbound proxy:          XXX.primary.companyflex.de:0
STUN:                    not used


We use SIP over TCP. When the outage happened, calls could be made, but neither side could hear anything. Wouldn't that be more of an RTP problem?

I would guess so. With TCP, your SIP signaling should always work. For that reason, I prefer to do a port forward for the RTP ports to my VoIP client.

RTP is strictly UDP, so you can either forward the known RTP ports for your device or have some kind of SIP proxy or firewall module that monitors the SIP signaling and opens the RTP ports automagically.
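
If you go the port forward route, it helps to know which RTP port the PBX actually announces. The following Python sketch reads the `m=audio` line from an SDP body (the part of a SIP INVITE or 200 OK that describes the media) and checks the port against a forwarded range. The SDP text, the addresses and the range 10000-20000 are made-up examples, and the helper names are hypothetical. Take the real values from your PBX documentation or a packet capture.

```python
from typing import List


def audio_ports(sdp: str) -> List[int]:
    """Extract the RTP port of every audio media line in an SDP body."""
    ports = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m=audio "):
            fields = line.split()
            # The second field is "<port>" or "<port>/<number of ports>".
            ports.append(int(fields[1].split("/")[0]))
    return ports


def unforwarded(ports: List[int], first: int, last: int) -> List[int]:
    """Return the ports that fall outside the forwarded range first..last."""
    return [p for p in ports if not first <= p <= last]


# Made-up SDP body for illustration.
SAMPLE_SDP = """v=0
o=- 0 0 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 8 0
a=rtpmap:8 PCMA/8000
"""

ports = audio_ports(SAMPLE_SDP)
missing = unforwarded(ports, first=10000, last=20000)  # hypothetical forward
print("announced RTP ports:", ports)
print("not covered by the port forward:", missing)
```

A port that shows up as not covered would match the symptom described in this thread: the call is set up, but no audio arrives.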