Network ports unresponsive at random intervals requiring a restart in VM

Started by Alexxxey, February 06, 2025, 11:36:19 AM

Hello,

I'm wondering if someone has had a similar issue and can point me to a solution. I am relatively new to OPNsense and have been running an OPNsense-based router at home on fiber for a few months, in a QEMU VM on Debian, with no issues. I'm now trying to replicate the setup at a relative's house but am running into problems.
The system is based on an AMD Ryzen 5 3600 with 32GB RAM and an Intel X550-T2 10GbE NIC. The ISP provides a 2.5G symmetrical link with their own ONU (which connects directly to OPNsense), with PPPoE authentication.
Installing OPNsense directly onto the NVMe drive works without issues. However, I would like to run Debian on the same system so I can also host Docker containers. When I install OPNsense in a VM on that Debian host using virt-manager, the system works for about half an hour, but then all network activity stops.
I am unable to log into the GUI; the only thing I can do is attach a monitor and keyboard to the host PC and use the virt-manager console to log into the VM and look at the logs. So the VM itself is still running, it is just the network activity that stops. Resetting the VM brings everything back to normal, but the issue recurs after some time.
I pass the X550 NIC ports directly to the VM so it has full control over them. I tried virtual ports but did not get above 1G speed. The fiber here is 2.5G symmetrical.
Things I've tried, all of which resulted in the same issue:
- A different X550-T2 card
- Both Q35 and i440fx machine types
- Increasing RAM to 16GB, disk space to 25GB and processor count to 8
- Another, Intel-based PC, which resulted in exactly the same issue!
- Enabling and disabling hardware offloading
- Setting MSI-X to 0, as I've seen suggested elsewhere, but this causes the VM not to boot, with a storage error
- SR-IOV on the X550 ports

I have checked the system log (/var/log/system/latest.log). Where else should I look? Bear in mind that once this happens I can only access the filesystem via the console, not the GUI. The system log shows these entries right after the network goes down:
ppp-linkdown: executing on pppoe0 for inet
ppp-linkdown: executing on pppoe0 for inet6

4 minutes before this (while the internet is still working for another few minutes after):
cannot forward src fexx:2::xxxx:xxxx:xxx:xxxx, dst xxxx:xxxx:xxxx:xxx:interace:xxxx:x:xxxx, nxt 6, rcvif ix0, outif pppoe0
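
Since the interesting entry appeared a few minutes before the link-down messages, it can help to pull out the lines that precede the first `ppp-linkdown` entry instead of scrolling through the log on the console. Below is a minimal sketch, assuming a Python 3 interpreter is available inside the VM and that the log is plain text at the path mentioned above; the function name and the window size are made up for illustration.

```python
from collections import deque


def lines_before_first_match(lines, marker, window=20):
    """Return (context, hit): up to `window` lines seen before the first
    line containing `marker`, and that matching line itself.
    Returns ([], None) if the marker never appears."""
    recent = deque(maxlen=window)  # keeps only the last `window` lines
    for line in lines:
        if marker in line:
            return list(recent), line
        recent.append(line)
    return [], None


def scan_log(path, marker="ppp-linkdown", window=20):
    """Read a text log and print the lines leading up to the first match."""
    with open(path, "r", errors="replace") as handle:
        stripped = (line.rstrip("\n") for line in handle)
        context, hit = lines_before_first_match(stripped, marker, window)
    if hit is None:
        print(f"no line containing {marker!r} found in {path}")
        return
    for line in context:
        print(line)
    print(">>> " + hit)
```

From the console this could be run as `scan_log("/var/log/system/latest.log")`. It only narrows down where to look in the log; it does not diagnose anything by itself.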

I am completely stuck here, especially since I managed to reproduce the issue on two separate PCs (one AMD and one Intel), and since installing OPNsense directly onto the NVMe drive works without issues while running it inside a VM on the same hardware causes these random network outages.

Sorry to bother everyone. In case someone comes across this in the future: the issue was that a fresh install of the host OS (Debian) enables suspend on idle. It looks like this was disabling the ports for the VM. Even after the host came out of suspend, the ports stayed disabled, which is why it was not obvious to me where the issue was.
With suspend disabled, everything is working now.
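
For anyone checking their own host: the post does not say how suspend was disabled, but one common approach on a systemd-based Debian install is to mask the sleep targets with `systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target`. The sketch below is only an illustration of how one might verify that state from the host; it assumes `systemctl` is on the PATH, and the helper names are made up. Desktop environments may have their own idle-suspend setting as well, which this does not cover.

```python
import subprocess

# Targets that the masking command above would affect.
SLEEP_TARGETS = [
    "sleep.target",
    "suspend.target",
    "hibernate.target",
    "hybrid-sleep.target",
]


def unmasked_targets(states):
    """Given {target: state}, return the targets that are not masked."""
    return [target for target, state in states.items() if state != "masked"]


def query_states(targets=SLEEP_TARGETS):
    """Ask systemd for each target's state via `systemctl is-enabled`."""
    states = {}
    for target in targets:
        result = subprocess.run(
            ["systemctl", "is-enabled", target],
            capture_output=True,
            text=True,
            check=False,  # a non-zero exit status is expected for masked units
        )
        states[target] = result.stdout.strip() or "unknown"
    return states


def report():
    """Print which sleep targets could still be activated on this host."""
    remaining = unmasked_targets(query_states())
    if remaining:
        print("not masked:", ", ".join(remaining))
    else:
        print("all sleep targets are masked")
```

Calling `report()` on the Debian host (not inside the OPNsense VM) lists any sleep target that is still available.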