IPSEC traffic stalling after 20.7.1 upgrade

Started by Andreas_, September 01, 2020, 03:52:20 PM


October 07, 2020, 12:50:57 PM #31 Last Edit: October 07, 2020, 01:09:46 PM by fraenki
Quote from: glasi on September 29, 2020, 09:08:29 PM
IPsec settings...

Thanks for sharing your settings. Another mix of different settings, nothing that looks suspicious to me.
The only thing we have in common is Intel NICs. Does it make any difference if you disable "VLAN Hardware Filtering" (and reboot)?

It would be interesting to know the NIC brand/chipset of the other users...


Regards
- Frank


Hello,

This is my first post on this forum; I found this topic while searching for the described problem. I don't know the details of your hardware, but I run OPNsense on Intel hardware (scope7 1510*). If I can do something to help, just tell me. I am not familiar with OPNsense yet. I just started replacing one of our bintec routers.

Regards, Proctor


* https://www.landitec.com/products/open-source-appliance-solutions/scope7-open-source-appliances/scope7-1510-detail/

Enough people have reported this problem, so I've created a bug report:
https://github.com/opnsense/core/issues/4415

If you are able to contribute substantial information, please add it to the bug report. But all other discussions and comments should continue here in the forums :)

Quote from: proctor on October 13, 2020, 10:18:27 AM
[...] I run OPNsense on Intel hardware (scope7 1510). [...] I just started replacing one of our bintec routers.

Can you give a more detailed explanation? So far this error has only shown up between 20.7 and 20.7, and only on hardware. You talk about replacing one bintec.

October 15, 2020, 12:27:36 PM #36 Last Edit: October 15, 2020, 12:35:59 PM by proctor
Hello,

I used bintec routers for about 20 years, but the models I need are out of stock. So I started trying OPNsense as a replacement this year, which is why I don't have much experience with it.

I set up an OPNsense box with version 20.1.x in July and upgraded to version 20.7.x about 2 weeks ago. After that I had to struggle with broken tunnels.


  • 2 routed ESP tunnels to 2 different bintec boxes
  • there are output errors on the interfaces (max about 0.1%) with both versions
  • the tunnel with low traffic broke 2 times in the 2 weeks after upgrading
  • the tunnel with more traffic broke more than once a day after upgrading
  • the total amount of data transferred between breaks varies
  • after downgrading yesterday, no breaks so far (nearly 24 h)
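
For anyone who wants to put numbers on "how often does it break", here is a minimal sketch of how timestamped ping results could be turned into a list of stall intervals. Nothing in it comes from OPNsense itself: the function name `find_stalls`, the `min_failures` threshold, and the sample data are hypothetical, and it assumes you already log one (timestamp, success) pair per probe.

```python
from datetime import datetime, timedelta


def find_stalls(samples, min_failures=3):
    """Return (first_failure, last_failure) pairs for each run of failed probes.

    samples: iterable of (timestamp, ok) tuples in chronological order.
    min_failures: hypothetical threshold; shorter runs are treated as
    ordinary packet loss rather than a stalled tunnel.
    """
    stalls = []
    run_start = None
    run_last = None
    run_len = 0
    for ts, ok in samples:
        if not ok:
            # Extend the current run of failures (or start a new one).
            if run_start is None:
                run_start = ts
            run_last = ts
            run_len += 1
        else:
            # A successful probe ends the run; keep it if it was long enough.
            if run_start is not None and run_len >= min_failures:
                stalls.append((run_start, run_last))
            run_start, run_last, run_len = None, None, 0
    # The log may end while the tunnel is still stalled.
    if run_start is not None and run_len >= min_failures:
        stalls.append((run_start, run_last))
    return stalls


if __name__ == "__main__":
    # Made-up sample data: one probe every 10 seconds.
    base = datetime(2020, 10, 15, 12, 0, 0)
    results = [True, True, False, False, False, True, False, True]
    samples = [(base + timedelta(seconds=10 * i), ok) for i, ok in enumerate(results)]
    for start, end in find_stalls(samples):
        print("stall from", start, "to", end)
```

The single failed probe near the end of the sample data is ignored on purpose, since one lost ping does not mean the tunnel is broken.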

Configuration attached.

Regards,
Proctor


Quote from: proctor on October 13, 2020, 10:18:27 AM
[...] I run OPNsense on Intel hardware (scope7 1510). [...]

The guys from Landitec sent me a 1510 and I added it to my VPN mesh, pushed 8GB without a problem.
Is there a chance to add your live system to my lab? It would just be ping from LAN IP to LAN IP for testing.
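
A LAN IP to LAN IP test like this could be scripted roughly as follows. This is only a sketch: it assumes the FreeBSD `ping` shipped with OPNsense, where `-S` sets the source address and `-c` the number of echo requests (Linux `ping` uses `-I` for the source instead). The helper names and the example addresses are hypothetical.

```python
import subprocess


def build_ping_command(source_ip, target_ip, count=3):
    """Build a FreeBSD-style ping command that sources from the local LAN IP."""
    # -S: source address, so the packets match the tunnel's traffic selectors
    # -c: number of echo requests to send
    return ["ping", "-S", source_ip, "-c", str(count), target_ip]


def tunnel_reachable(source_ip, target_ip, count=3):
    """Return True if ping exits with status 0, i.e. at least one reply came back."""
    result = subprocess.run(
        build_ping_command(source_ip, target_ip, count),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical addresses: local LAN IP and the remote firewall's LAN IP.
    print(tunnel_reachable("192.168.1.1", "192.168.2.1"))
```

Sourcing the ping from the LAN address matters for policy-based tunnels, because traffic from the WAN address would not match the tunnel's networks.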

https://github.com/opnsense/core/issues/4415#issuecomment-712056820

I downgraded the live system a couple of days ago (no issues since). I am going to set up an additional device for testing and reproducing the issue myself. So maybe you could use this one too (it should be ready no later than tomorrow).

I think I was able to reproduce the issue in my test setup with two OPNsense devices (version 20.7.3). The culprit seems to be a VLAN on the LAN interface. Without a VLAN I could not reproduce the stalling. @mimugmail - if it helps, I could give you access to my testing environment (or connect it to yours).

Just as I wrote the last reply, the tunnel stalled. So please forget the VLAN part. But I still have a setup that reproduces at least my issue.

- tunnel seems ok in status overview
- gateway monitoring shows offline
- no traffic (ping) through the tunnel
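
The three symptoms above can be combined into a simple check that tells a stalled tunnel apart from one that is plainly down. This is just a sketch of the decision logic: the class, the field names, and the state labels are made up for illustration, and how you obtain each value (status page, gateway monitor, ping) is left open.

```python
from dataclasses import dataclass


@dataclass
class TunnelObservation:
    sa_established: bool  # status overview shows the tunnel as up
    gateway_online: bool  # gateway monitoring result
    ping_ok: bool         # LAN-to-LAN ping through the tunnel


def classify(obs):
    """Map one observation to a hypothetical state label."""
    if not obs.sa_established:
        return "down"
    if not obs.gateway_online and not obs.ping_ok:
        # Tunnel looks fine in the overview but passes no traffic:
        # this is the stalling described in this thread.
        return "stalled"
    if obs.gateway_online and obs.ping_ok:
        return "ok"
    # Monitoring and ping disagree; needs a closer look.
    return "mixed"


if __name__ == "__main__":
    print(classify(TunnelObservation(sa_established=True, gateway_online=False, ping_ok=False)))
```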


Quote from: proctor on October 23, 2020, 09:05:58 AM
Without a vlan I could not reproduce the stalling.

Oh, that's interesting.
@mimugmail, are you using VLANs in your test setup?

Quote from: proctor on October 23, 2020, 09:15:35 AM
So please forget the vlan part.

No problems with vlan in my setup...


                                                                        LACP
+----------------------+    IPsec          +----------------------+     Trunk         +----------------------+
| OPNsense 20.7.4      |    Tunnel         | OPNsense 20.7.4      |     VLAN          | Cisco SG250-18       |
| Intel Atom C3558     |-------------------| Intel Atom C3558     |-------------------| Switch               |
| 8 GB RAM             |    IPv4           | 8 GB RAM             |     2x 1 Gb/s     |                      |
+----------------------+    policy based   +----------------------+                   +----------------------+
                                                                                           /          \
                                                                                          /            \
                                                                                  1 Gb/s /              \ 1 Gb/s
                                                                                        /                \
                                                                                       /                  \
                                                                     +----------------------+      +----------------------+
                                                                     | File server          |      | Client               |
                                                                     | VLAN 10              |      | VLAN 70              |
                                                                     |                      |      |                      |
                                                                     +----------------------+      +----------------------+


Even VLAN on the WAN connection does not seem to be a deal breaker. My provider requires VLAN 7 on the link interface for the WAN PPPoE connection.