OPNsense Forum

Archive => 20.7 Legacy Series => Topic started by: EHRETic on November 05, 2020, 09:47:06 pm

Title: 20.7.4 vMotion breaks network connectivity
Post by: EHRETic on November 05, 2020, 09:47:06 pm
Hi there,

I don't know where to start, but since I upgraded from 20.1 to 20.7, I seem to have more and more strange issues (I had some Unbound issues too and had to disable DNSSEC temporarily).

Usually, my FW is under a "should run on this host" rule so that my SIEM can capture the traffic. Today, I had to do maintenance on some ESXi hosts and I vMotioned the FW a few times... it crashed twice! ??? None of the other VMs lost network connectivity.

I have to mention that this NEVER happened with previous versions; I could vMotion dozens of times.

I couldn't have noticed the issue earlier because I hadn't had to move the VM since I upgraded.

Where do we start troubleshooting this? ;)
Title: Re: 20.7.4 vMotion breaks network connectivity
Post by: EHRETic on November 05, 2020, 10:07:09 pm
Some more info:
- vSphere 7.0U1
- VMXNET3 cards on VDS
- multiple VLANs & interfaces
- no IPS activated
Title: Re: 20.7.4 vMotion breaks network connectivity
Post by: mimugmail on November 06, 2020, 05:44:47 am
Console output during move/crash would be a good start. :)
Title: Re: 20.7.4 vMotion breaks network connectivity
Post by: EHRETic on November 06, 2020, 02:35:15 pm
Quote from: mimugmail on November 06, 2020, 05:44:47 am
Console output during move/crash would be a good start. :)

Well, that's the thing: there is no "crash", only a loss of network connectivity on the WAN interface, I presume.
When it happened, I could still access the console web page, but WAN_GW was not reachable (I can't recall the amount of packet loss, but it was high).

The VM is configured as follows:
2 NICs: one WAN, one LAN with multiple VLANs/interfaces

I'll try to reproduce the issue tonight when my wife and I are not working! ::)

Any advice on logging that properly?
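
One possible way to capture what happens during the move is to ping the WAN gateway once per second from the firewall and write a timestamped line per probe, so the outage window can be lined up with the vMotion task times. The following is only a minimal sketch: it assumes FreeBSD's `ping` (where `-t` is the overall timeout in seconds), and the log path and function names are made up for illustration.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=1):
    """Send one ICMP echo request; return True if a reply came back.

    Assumption: FreeBSD ping, where -t is the overall timeout in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-t", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def format_line(when, host, ok):
    """Build one log line: ISO timestamp, target, and probe result."""
    state = "ok" if ok else "LOST"
    return f"{when.isoformat(timespec='seconds')} {host} {state}"


def loss_percent(results):
    """Percentage of failed probes in a list of booleans."""
    if not results:
        return 0.0
    return 100.0 * results.count(False) / len(results)


def monitor(host, count, interval_s=1.0, logfile="/tmp/vmotion_ping.log"):
    """Probe `host` `count` times, append each result to `logfile`
    (a hypothetical path), and return the observed loss percentage."""
    results = []
    with open(logfile, "a") as fh:
        for _ in range(count):
            ok = ping_once(host)
            results.append(ok)
            fh.write(format_line(datetime.now(), host, ok) + "\n")
            fh.flush()  # keep the log usable even if the run is interrupted
            time.sleep(interval_s)
    return loss_percent(results)
```

Calling something like `monitor("192.0.2.1", count=300)` just before starting the vMotion would cover about five minutes; `192.0.2.1` is a documentation placeholder and would need to be replaced by the real WAN_GW address.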
Title: Re: 20.7.4 vMotion breaks network connectivity
Post by: chemlud on November 06, 2020, 02:42:44 pm
I have one install of 20.7 where I hand out, via DHCP, the OPNsense at the other end of a tunnel as an additional DNS server (both have Unbound with DNSSEC and DNS-over-TLS (port 853) configured), as DNS has been unreliable on this box since 20.7.

I assume the ISP for this box interferes with DNSSEC/DNS-over-TLS...
Title: Re: 20.7.4 vMotion breaks network connectivity
Post by: EHRETic on November 08, 2020, 11:50:44 am
Quote from: chemlud on November 06, 2020, 02:42:44 pm
I have one install of 20.7 where I hand out, via DHCP, the OPNsense at the other end of a tunnel as an additional DNS server (both have Unbound with DNSSEC and DNS-over-TLS (port 853) configured), as DNS has been unreliable on this box since 20.7.

I assume the ISP for this box interferes with DNSSEC/DNS-over-TLS...

We might have something here... I have 2 OPNsense boxes with an IPsec tunnel between them. Both have their own local Internet/DNS breakout, but I have a feeling that sometimes something is not working properly in DNS resolution.

I hate putting in something else as a workaround, but this might be my excuse to try out Pi-hole, though I'm not sure yet whether it can forward DNS resolution in a secure way... ;D
Title: Re: 20.7.4 vMotion breaks network connectivity
Post by: maurotb on March 04, 2021, 12:22:28 pm
I have the same problem.

Version
OPNsense 21.1.2-amd64
FreeBSD 12.1-RELEASE-p13-HBSD

On vSphere 7.

After a vMotion I lose network connectivity.
To get the network working again I need to do one of the following:

- reboot OPNsense, or
- run ifconfig vmxX down and ifconfig vmxX up on all interfaces, or
- do another vMotion back to the original server (note: the vSwitch and vSphere network are OK; other VMs vMotion correctly)

I tried replacing vmxnet3 with e1000e: same problem.
No logs or errors in the console or dmesg.
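
The interface down/up workaround could be scripted roughly as below. This is only a sketch under assumptions: it relies on FreeBSD's `ifconfig -l` to list interface names, matches only names of the form `vmx` plus a number (so VLAN child interfaces are left alone), and defaults to a dry run that prints the commands instead of executing them. The function names are hypothetical.

```python
import re
import subprocess


def vmx_interfaces(ifconfig_l_output):
    """Pick the vmxN names out of the space-separated `ifconfig -l` output."""
    return [
        name
        for name in ifconfig_l_output.split()
        if re.fullmatch(r"vmx\d+", name)
    ]


def bounce_commands(interfaces):
    """Return the down/up command pairs for each interface, in order."""
    commands = []
    for name in interfaces:
        commands.append(["ifconfig", name, "down"])
        commands.append(["ifconfig", name, "up"])
    return commands


def bounce_all(dry_run=True):
    """List interfaces on this host and bounce every vmxN one.

    With dry_run=True (the default) the commands are only printed.
    """
    listing = subprocess.run(
        ["ifconfig", "-l"], capture_output=True, text=True, check=True
    ).stdout
    commands = bounce_commands(vmx_interfaces(listing))
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands
```

Running `bounce_all()` first shows what would be executed; `bounce_all(dry_run=False)` needs root and will briefly interrupt traffic on each interface, so it is best run from the VM console rather than over the network.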

Title: Re: 20.7.4 vMotion breaks network connectivity
Post by: maurotb on March 05, 2021, 05:19:56 pm
After spending some time on it, I have resolved my vMotion problem.
On my Huawei L3 switch, I had to configure:
mac-address update arp
undo arp anti-attack entry-check enable
Title: Re: 20.7.4 vMotion breaks network connectivity
Post by: EHRETic on March 18, 2021, 10:24:47 am
Hi,

On my side, everything was solved when I changed my hosts' NICs to 10 Gb/s; it never crashed again.
Configuration-wise, I didn't change a thing, but OPNsense has also been updated a few times since then.

Good that you could fix it!  ;)