OPNsense Forum » Archive » 21.7 Legacy Series » Opnsense Vmotion hang
Topic: Opnsense Vmotion hang (Read 3241 times)
ThyOnlySandman (Jr. Member, Posts: 85, Karma: 4)
Opnsense Vmotion hang
« on: December 20, 2021, 11:39:40 am »
I'm having OPNsense crash / hang following two vMotions.
I have OPNsense on a 2-host ESXi 7 cluster. A switch's VLANs interconnect the Internet modem with the ESXi hosts' OPNsense WAN links, plus the ESXi hosts' OPNsense LAN links.
I can vMotion a single time to the other host and all is well. However, when I vMotion back to the original host, OPNsense hangs after the migration completes. Internet drops. Pings to the LAN interface show high latency and drops. The WebGUI stops responding and I cannot log in via the console; it just hangs after the password.
The moment I go to reboot it via the vSphere web console it responds again. So I did a bit of troubleshooting. I believe I've got it narrowed down to the IPsec strongSwan service as the culprit. If I stop the service I'm able to vMotion back and forth without issue. Once I start it I can vMotion one time, but after the second vMotion it hangs.
Any ideas why strongSwan is causing this behavior?
bartjsmit (Hero Member, Posts: 2018, Karma: 194)
Re: Opnsense Vmotion hang
« Reply #1 on: December 20, 2021, 11:50:19 am »
Is the cluster homogeneous? I.e. are the hosts of identical spec?
Are you using a shared datastore and if so, which type (SAN, iSCSI, NFS)?
Do you get OPNsense console warnings about network or storage latency?
Bart...
ThyOnlySandman (Jr. Member, Posts: 85, Karma: 4)
Re: Opnsense Vmotion hang
« Reply #2 on: December 20, 2021, 12:04:47 pm »
Yes, identical hosts. The OPNsense NICs are different.
The OPNsense VM has VMXNET3.
vSAN 2-node, 10 Gbps direct connect. vMotion is also 10 Gbps via a switch trunk link.
No, I don't see any logging, but it does hang, so maybe it does. I've tried to use the console to review logs but it won't let me log in.
I've just vMotioned about 10 times back and forth. It's working perfectly with the IPsec VPNs off. So it seems to me the network + VMware are good to go.
Something about the ARP change or ? is upsetting strongSwan / OPNsense to the point where only a reboot seems to resolve it. That's after host1 to host2 to host1.
Edit: It doesn't necessarily require a full reboot. That's just what I've been doing to get it functional again, as there is no console, SSH, or HTTPS. As soon as I trigger the guest restart via the VMware web console, OPNsense immediately begins to respond (although it is then restarting). That's why I decided to identify which service was the cause of the hang, as it was clear it was one of the first services stopped in the reboot sequence.
When it's hung I've attempted to restart the strongSwan service via the WebGUI, but it just hangs and never refreshes or responds.
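Because the console is not usable during the hang, reachability could be logged from a separate machine on the LAN so the start of the hang can be lined up against the vMotion task times. Below is a minimal sketch in Python, not something taken from this setup: the address `192.168.1.1`, the 50 ms "slow" threshold, and all function names are placeholders, and it assumes a Unix-style `ping` that accepts `-c`.

```python
import re
import subprocess
import time
from datetime import datetime

# Matches the round-trip time in a ping reply line, e.g. "time=0.412 ms".
RTT_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtt_ms(ping_output):
    """Return the round-trip time in ms from ping output, or None if no reply."""
    match = RTT_RE.search(ping_output)
    return float(match.group(1)) if match else None


def classify(rtt_ms, slow_ms=50.0):
    """Label one probe: DROP (no reply), SLOW (above threshold), or OK."""
    if rtt_ms is None:
        return "DROP"
    return "SLOW" if rtt_ms > slow_ms else "OK"


def probe(host, timeout_s=3.0):
    """Send a single echo request; treat a stuck ping as a dropped probe."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],  # -c 1: one echo request (Unix-style ping)
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None
    return parse_rtt_ms(result.stdout)


def monitor(host, count, interval_s=1.0):
    """Print one timestamped status line per probe, `count` times."""
    for _ in range(count):
        stamp = datetime.now().isoformat(timespec="seconds")
        print(f"{stamp} {host} {classify(probe(host))}")
        time.sleep(interval_s)

# Example (hypothetical LAN address): monitor("192.168.1.1", count=600)
```

Running `monitor` against the LAN interface address while doing the host1 to host2 to host1 sequence would give timestamps for when the latency and drops begin, with and without the strongSwan service running.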
« Last Edit: December 20, 2021, 12:27:29 pm by ThyOnlySandman »
Patrick M. Hausen (Hero Member, Posts: 6826, Karma: 573)
Re: Opnsense Vmotion hang
« Reply #3 on: December 20, 2021, 12:27:54 pm »
I'd try switching to E1000 network emulation first. There is strong evidence, e.g. on the TrueNAS forum, that VMXNET3 is not the best choice for FreeBSD guests.
ThyOnlySandman (Jr. Member, Posts: 85, Karma: 4)
Re: Opnsense Vmotion hang
« Reply #4 on: December 20, 2021, 12:49:33 pm »
Yeah, I may give E1000 a go. But I'd definitely clone this install for an isolated test. I've not had the greatest experience with ESXi / OPNsense NIC numbering when changing VMware adapters. It messes up the config.
My OPNsense has this "mod" done to the VMX so that all my VMXNET3 adapters are in the proper order. See post 2:
https://forum.opnsense.org/index.php?topic=19585.msg91046#msg91046
Still, given that the issue appears directly related to the strongSwan service running, I'm not certain it's a driver issue. If it's not running there are no hangs at all. But maybe it's the combination of strongSwan + VMXNET3.
ThyOnlySandman (Jr. Member, Posts: 85, Karma: 4)
Re: Opnsense Vmotion hang
« Reply #5 on: December 20, 2021, 01:18:19 pm »
Interesting. After 20+ vMotions with strongSwan stopped it finally did encounter an Internet hang. But this one was different, as OPNsense was still responsive in the WebGUI and there was no latency on the LAN interface either.
Stopping Suricata got the Internet responding. I restarted Suricata and vMotioned back and forth again without issue. So my issue may be spanning more than just strongSwan, but strongSwan definitely seems to be a part of the hard hang. I'm also running Sensei with the native netmap driver.