OPNsense as VM in Xen: Network interfaces down after Debian dom0 updates

Started by astronaut, June 11, 2022, 10:11:00 PM

Given the review status of the patch https://reviews.freebsd.org/D33876 and the lack of a FreeBSD changelog entry, I am not surprised that 22.7 has the same issue.

In the meantime, I started working on building OPNsense with the patch included to find out whether it solves our issue. However, with little experience with build tools and even less with FreeBSD, I did not get far. (I now have a running FreeBSD installation with XTerm, though. Yay. :)) My first successful build still had the network issue, but I am not entirely sure I did everything right. Has anybody else been more successful than me?

Hi astronaut,

Any news on this topic on your side with Debian? I still can't boot my Ubuntu 20.04 server with the newest kernel, because OPNsense then has no connectivity.

Regards Torsten

Hi Torsten,

No news from my side. I haven't gotten around to doing more tests with compiling a kernel with the integrated patch yet.

My first successful OPNsense build _should_ have the patch included, but it didn't solve the networking issue. Because I don't have much experience in configuring and building kernels, even less so in FreeBSD, I am not sure whether I made a mistake while compiling OPNsense, whether there is some setting I need to toggle to use the new xenback driver, or whether the patch doesn't solve this issue after all. I can of course share the newly created .iso file for testing if anybody is interested.

My next step would be to reduce complexity and test whether a vanilla FreeBSD DomU has the same or a very similar networking issue. Then I would compile a patched kernel for vanilla FreeBSD to find out if the patch solves the networking issue there. If it does, I would continue towards finding out why the patch doesn't seem to work in OPNsense. But my time is limited...

Any suggestions are welcome. :-)

Good news! I just tested my OPNsense build with patched kernel again on my test DomU, and now the network issue is gone! I can ping other networks, Dom0 bridge and DomU interfaces are up. So it seems to be confirmed that the patch solves the network issue, at least in my setup (Debian 11.5 with kernel 5.10.0-19-amd64 as Dom0, OPNsense 22.7 as DomU).

I don't know whether my first test was not done properly or whether something else (Dom0 kernel?) changed in the meantime...? Anyway, I will do more tests in the next couple of days on my actual firewall DomU (when my family is sleeping :-) and report what I learn from these tests. I can also make my patched 13.1 kernel available if somebody is interested in testing it. Please let me know.
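For anyone repeating this kind of test, here is a minimal sketch of how the "interfaces are up" check could be scripted on a Linux Dom0. It reads the operstate file that Linux exposes under /sys/class/net for each interface. The interface names (xenbr0, vif1.0, vif1.1) are hypothetical placeholders and not taken from my setup, so substitute your own.

```python
from pathlib import Path


def interface_states(names, sysfs_root="/sys/class/net"):
    """Return {interface name: operstate string} for the given interfaces.

    Interfaces without a readable operstate file are reported as "missing".
    """
    states = {}
    for name in names:
        state_file = Path(sysfs_root) / name / "operstate"
        try:
            states[name] = state_file.read_text().strip()
        except OSError:
            states[name] = "missing"
    return states


def down_interfaces(names, sysfs_root="/sys/class/net"):
    """Return the interfaces whose operstate is anything other than "up"."""
    states = interface_states(names, sysfs_root)
    return [name for name, state in states.items() if state != "up"]


if __name__ == "__main__":
    # Hypothetical names: a Dom0 bridge and two DomU backend interfaces.
    for name, state in interface_states(["xenbr0", "vif1.0", "vif1.1"]).items():
        print(f"{name}: {state}")
```

Some virtual devices report "unknown" rather than "up", so treat a non-"up" result as a hint to look closer, not as proof of the issue discussed here.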

Hi Astronaut,

Sounds good so far. I updated my OPNsense to 22.7.6, but the result is the same when booting with the newest Ubuntu kernel: no network.

I would be interested in your kernel and would test it under Ubuntu.

Kind regards
Torsten

More good news: I installed the newly compiled kernel today on my "productive" home firewall, and things turned out well. Dom0 is running Debian 11.5, kernel version is 5.10.0-19-amd64, so already a couple of updates after the last functional version 5.10.0-10-amd64. Network in OPNsense 22.7 DomU is up. No changes were required besides installing the patched kernel on a normal 22.7 system. I didn't do any thorough tests, though, so be cautious, there might still be hidden traps.

Here are the instructions on how to compile OPNsense: https://github.com/opnsense/tools

Here is the link to the kernel patch: https://reviews.freebsd.org/D33876

Note: I enabled the following two lines in the kernel config, but that's probably not a must:
options XENHVM
device xenpci

Here are the instructions on how to install the kernel file: https://docs.opnsense.org/development/how-tos/kernel_debugging.html

If you want to try my 22.7-kernel, send me a PM. Be aware, I am far from knowing precisely what I did. :-)

Of course, it would be nice if the patch could be pulled into an official FreeBSD version and hence OPNsense soon... Anybody out there who could support the review process?



Since 22.7.7 (at least that is the version where I noticed the change), this issue seems to have disappeared. I have the standard Debian Bullseye kernel 5.10.0-19-amd64 installed on Dom0, and OPNsense is running normally, all interfaces are up. No special kernel is needed anymore. Fingers crossed. :-)

Quote from: astronaut on December 03, 2022, 08:58:43 PM
Since 22.7.7 (at least that is the version where I noticed the change), this issue seems to have disappeared. I have the standard Debian Bullseye kernel 5.10.0-19-amd64 installed on Dom0, and OPNsense is running normally, all interfaces are up. No special kernel is needed anymore. Fingers crossed. :-)

I can't confirm that, at least not for Debian kernel linux-image-4.19.0-20-amd64, although I am on current OPNsense 22.7. The last working kernel version for me is linux-image-4.19.0-18-amd64.

Quote
I can't confirm that, at least not for Debian kernel linux-image-4.19.0-20-amd64, although I am on current OPNsense 22.7. The last working kernel version for me is linux-image-4.19.0-18-amd64.

I'm now at kernel 5.10.0-20-amd64 and OPNsense 22.7.11, and everything is working flawlessly. You state that you are on OPNsense 22.7. Perhaps upgrading OPNsense helps? As I've written, for me, it only started working again with later versions of OPNsense (e. g. 22.7.7).

If that doesn't help, perhaps trying a newer kernel (e. g. 5.x) is also worth a try?
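Since "22.7" can mean either the original release or any later point release, here is a small sketch of how to compare version strings numerically against 22.7.7, the first version where I noticed the fix. The helper names are made up for illustration, and the stripping of suffixes such as "-amd64" is an assumption about how the version string is displayed.

```python
def parse_version(text):
    """Turn a dotted version such as "22.7.11" into a tuple of ints.

    Anything after the first "-" or "_" (for example "-amd64") is ignored.
    """
    core = text.split("-")[0].split("_")[0]
    return tuple(int(part) for part in core.split("."))


def at_least(installed, minimum):
    """Return True if the installed version is the minimum version or newer."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    print(at_least("22.7.11-amd64", "22.7.7"))  # True
    print(at_least("22.7", "22.7.7"))           # False
```

Comparing the plain strings would get this wrong, because "22.7.11" sorts before "22.7.7" character by character.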

Quote from: astronaut on January 19, 2023, 09:00:10 PM
I'm now at kernel 5.10.0-20-amd64 and OPNsense 22.7.11, and everything is working flawlessly. You state that you are on OPNsense 22.7. Perhaps upgrading OPNsense helps? As I've written, for me, it only started working again with later versions of OPNsense (e. g. 22.7.7).

If that doesn't help, perhaps trying a newer kernel (e. g. 5.x) is also worth a try?

I'm also on the latest OPNsense, 22.7.11-amd64. During the next downtime I can try a more recent Debian kernel version.

Have you experienced a change between the 4.x and 5.x Debian kernel versions?
Besides updating the Debian kernel and OPNsense, have you changed any other settings?

Ever since I started virtualizing pfSense or OPNsense, I have needed to switch off tx checksumming on the host for every interface that belongs to OPNsense. For instance, if OPNsense has several virtual interfaces, then every time I start OPNsense I need to run ethtool -K ${int} tx off on the host for each of those interfaces. For the issue discussed here it makes no difference whether I disable tx checksumming or not, but it is one of the changes I need to keep in mind.
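The per-interface ethtool call can be wrapped in a small script so it is not forgotten after a restart. This is only a sketch: the interface names are hypothetical placeholders, it defaults to a dry run that just prints the commands, and actually running it requires root and ethtool on the host.

```python
import subprocess


def tx_off_commands(interfaces):
    """Build one "ethtool -K <interface> tx off" command per interface."""
    return [["ethtool", "-K", name, "tx", "off"] for name in interfaces]


def disable_tx_checksumming(interfaces, dry_run=True):
    """Print the commands (dry run) or execute them on the host.

    Returns the list of commands that were printed or executed.
    """
    commands = tx_off_commands(interfaces)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if ethtool fails.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical backend interface names of the OPNsense guest.
    disable_tx_checksumming(["vif1.0", "vif1.1"], dry_run=True)
```

Call it with dry_run=False once the printed commands look right for your host.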

Update:
I updated my virtualization host to kernel version 5.10.0-20-amd64 as well and can confirm that there are no more issues regarding "reconfiguring interface due to feature change".