OPNsense Forum

Archive => 19.7 Legacy Series => Topic started by: unam on November 29, 2019, 02:54:34 pm

Title: Network outage randomly - Need help to investigate
Post by: unam on November 29, 2019, 02:54:34 pm
Hi,

I have used OPNsense for a few years now and I really like it!

I run it as a virtual machine on Proxmox (KVM) with 2 vCPUs, 2 GB of RAM and a 10 GB disk.

On this VM, I have 4 virtual interfaces, each with a dedicated MAC address that is routed on the hosting provider's network (OVH).
These interfaces are dedicated to HAProxy, which delivers web services, and to 3 OpenVPN servers.

On the LAN side, I have multiple VLANs on the same interface. Each VLAN is a /30 subnet containing one virtual server and an OPNsense IP address that acts as its gateway.

It had been running without a reboot for the last 4 months. Then, last week, our services suddenly became unavailable and we had to stop and restart the firewall.

Today there was another outage. I tried rebooting the virtual machine directly, without success: our services became available for 10 seconds, then the firewall stopped responding.

To troubleshoot, I checked the ARP table and found that every local IP had the same MAC address.
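
For anyone who wants to automate this check, here is a minimal sketch that flags any MAC address answering for more than one IP. It assumes the text comes from `arp -an` on the firewall, in the usual FreeBSD layout (`? (ip) at mac on iface ...`). The function name and the sample addresses are made up for illustration. Entries marked `permanent` are skipped, on the assumption that they are the firewall's own addresses, which legitimately share a MAC when several VLANs sit on one parent interface.

```python
import re
from collections import defaultdict

# Matches lines such as: ? (10.0.0.2) at aa:bb:cc:dd:ee:ff on vtnet1 ...
# Entries shown as "(incomplete)" have no MAC and are ignored by the pattern.
ARP_LINE = re.compile(r"\((?P<ip>[0-9.]+)\) at (?P<mac>[0-9a-fA-F:]{17})\b")


def shared_macs(arp_output, skip_permanent=True):
    """Return {mac: [ips]} for every MAC that appears for more than one IP."""
    by_mac = defaultdict(list)
    for line in arp_output.splitlines():
        if skip_permanent and " permanent" in line:
            continue  # assumed to be the firewall's own interface addresses
        match = ARP_LINE.search(line)
        if match:
            by_mac[match.group("mac").lower()].append(match.group("ip"))
    return {mac: ips for mac, ips in by_mac.items() if len(ips) > 1}


# Made-up sample data, not taken from the affected firewall.
SAMPLE = """\
? (10.0.0.1) at 02:00:00:00:00:01 on vtnet1 permanent [vlan]
? (10.0.0.2) at aa:bb:cc:dd:ee:ff on vtnet1 expires in 1100 seconds [vlan]
? (10.0.0.6) at aa:bb:cc:dd:ee:ff on vtnet1 expires in 900 seconds [vlan]
? (10.0.0.10) at 11:22:33:44:55:66 on vtnet1 expires in 800 seconds [vlan]
"""

if __name__ == "__main__":
    for mac, ips in shared_macs(SAMPLE).items():
        print(mac, "is shared by", ", ".join(ips))
```

Running this periodically and alerting on a non-empty result would give an early warning before the next outage, although it does not explain why the entries collapse onto one MAC.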

I then stopped the VM and started it again (cold boot) and, miraculously, everything seems to be working again. I checked the ARP table and every local IP now has its own MAC address.

I think the ARP table was full and everything was dropped. The reboot did not flush the table, maybe because the table is reloaded directly on reboot?

Does anyone have any kind of solution, or a suggestion on how to investigate? I do not really know how to troubleshoot this problem quickly before it happens again.

Thanks for your reply.

Regards,
Title: Re: Network outage randomly - Need help to investigate
Post by: banym on November 29, 2019, 03:09:11 pm
I don't think this is related to OPNsense.

You would need someone to debug Proxmox, your switch environment and the VMs.

If you have support, try asking the Proxmox guys; they do a very good job.
Title: Re: Network outage randomly - Need help to investigate
Post by: unam on November 29, 2019, 03:23:36 pm
Yup, on your advice I just checked my ovs-vswitchd.log history and found that last week I got:

Code:
2019-11-21T20:52:15.544Z|00788|netdev_linux|WARN|veth104i0: removing policing failed: No such device
2019-11-21T20:52:15.544Z|00789|ofproto|WARN|vmbr1: cannot get STP status on nonexistent port 33
2019-11-21T20:52:15.544Z|00790|ofproto|WARN|vmbr1: cannot get RSTP status on nonexistent port 33
2019-11-21T20:52:15.546Z|00791|bridge|WARN|could not open network device veth104i0 (No such device)
2019-11-21T20:52:20.155Z|00792|bridge|WARN|could not open network device veth103i0 (No such device)
2019-11-21T20:53:20.214Z|00793|bridge|WARN|could not open network device veth102i0 (No such device)
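
To see at a glance which devices these warnings concern, a small script can tally them. This is only a sketch: it assumes every log line has the pipe-separated layout seen above (timestamp, sequence number, module, level, message) and recognises only the two "No such device" message shapes from this excerpt. The function names are made up for illustration.

```python
import re
from collections import Counter

# Either "<dev>: ... No such device" or "... network device <dev> (No such device)"
DEVICE = re.compile(
    r"^(?P<a>\S+): .*No such device$"
    r"|network device (?P<b>\S+) \(No such device\)"
)


def parse_line(line):
    """Split one log line into (timestamp, module, level, message), or None."""
    parts = line.strip().split("|", 4)
    if len(parts) != 5:
        return None
    timestamp, _seq, module, level, message = parts
    return timestamp, module, level, message


def missing_devices(lines):
    """Count WARN lines that report a missing device, keyed by device name."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _timestamp, _module, level, message = parsed
        if level != "WARN":
            continue
        match = DEVICE.search(message)
        if match:
            counts[match.group("a") or match.group("b")] += 1
    return counts


# The log lines quoted in this post.
LOG = """\
2019-11-21T20:52:15.544Z|00788|netdev_linux|WARN|veth104i0: removing policing failed: No such device
2019-11-21T20:52:15.544Z|00789|ofproto|WARN|vmbr1: cannot get STP status on nonexistent port 33
2019-11-21T20:52:15.544Z|00790|ofproto|WARN|vmbr1: cannot get RSTP status on nonexistent port 33
2019-11-21T20:52:15.546Z|00791|bridge|WARN|could not open network device veth104i0 (No such device)
2019-11-21T20:52:20.155Z|00792|bridge|WARN|could not open network device veth103i0 (No such device)
2019-11-21T20:53:20.214Z|00793|bridge|WARN|could not open network device veth102i0 (No such device)
"""

if __name__ == "__main__":
    for device, count in missing_devices(LOG.splitlines()).most_common():
        print(device, count)
```

Pointing the same function at the full log file, and comparing the timestamps with the outage times, would show whether these warnings actually line up with the outages.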

I will keep investigating in that direction.

Thanks for your quick reply!

Regards,
Title: Re: Network outage randomly - Need help to investigate
Post by: banym on November 29, 2019, 03:25:46 pm
You're welcome.
If it is related to OPNsense, it would be nice to keep us updated here.

Good luck with your debugging.