[CALL FOR TESTING] Netmap generic mode queue stall fixes

Started by franco, January 27, 2023, 11:38:45 AM

Quote from: franco on March 07, 2023, 08:13:38 PM
@kintaroju: You can use an older kernel without any issue, but I'll prep a new one tomorrow. The bridge support for netmap was updated so I need to adjust the branch this is built on.

@Phiolin: Thanks for the update! The generic patch still seems to be in flux. I'm expecting a new version this week, but I'm not entirely sure that will happen given how challenging the stalls are at the moment.


Cheers,
Franco

Hi Franco,

I was going to downgrade the kernel today, but I noticed that the old kernel with your netmap addition was missing. I did see the new 23.1.2-netmap version and tried it. Unfortunately it produced different results, and I am now missing a netmap interface:

024.325300 [ 321] generic_netmap_register   Emulated adapter for igb1_vlan16 activated
024.645563 [ 321] generic_netmap_register   Emulated adapter for igb1_vlan10 activated
025.677618 [ 321] generic_netmap_register   Emulated adapter for igb1_vlan4 activated

There should also be one for vlan18. When I tried excluding vlan18 from the Zenarmor protected interfaces, it still didn't start up.

If you need anything from my end to help with the debugging, let me know.

I got heavily side-tracked since 23.1.2 came out... the current build with the latest FreeBSD review state is:

# opnsense-update -zkr 23.1.2-netmap
# opnsense-shell reboot

Notes:

1. kernel-23.1.2-netmap2-amd64.txz never existed. You mean kernel-23.1.1-netmap2-amd64.txz perhaps, which is obviously older than the current kernel-23.1.2-netmap-amd64.txz one.
2. The patch does not affect which interfaces land in netmap mode. That is solely GUI configuration.


Cheers,
Franco

Quote from: franco on March 09, 2023, 07:39:28 PM
I got heavily side-tracked since 23.1.2 came out... the current build with the latest FreeBSD review state is:

# opnsense-update -zkr 23.1.2-netmap
# opnsense-shell reboot

Notes:

1. kernel-23.1.2-netmap2-amd64.txz never existed. You mean kernel-23.1.1-netmap2-amd64.txz perhaps, which is obviously older than the current kernel-23.1.2-netmap-amd64.txz one.
2. The patch does not affect which interfaces land in netmap mode. That is solely GUI configuration.


Cheers,
Franco

After disabling VLAN hardware acceleration and switching from protecting individual VLANs to protecting just the main igb0 interface, Zenarmor is working again.

The only thing is that I now have lots of registered netmap devices:

024.325300 [ 321] generic_netmap_register   Emulated adapter for igb1_vlan16 activated
024.645563 [ 321] generic_netmap_register   Emulated adapter for igb1_vlan10 activated
025.677618 [ 321] generic_netmap_register   Emulated adapter for igb1_vlan4 activated
087.352077 [ 321] generic_netmap_register   Emulated adapter for igb1_vlan4 activated
542.812852 [ 321] generic_netmap_register   Emulated adapter for igb1_vlan16 activated
544.525316 [ 321] generic_netmap_register   Emulated adapter for igb1_vlan4 activated
620.148165 [ 321] generic_netmap_register   Emulated adapter for igb1_vlan10 activated

What would be the way to clean up the above entries?
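For anyone comparing output like the above: each line carries its own timestamp, so the repeats appear to be separate activation messages for the same interface rather than additional devices. A minimal sketch for summarizing them per interface follows. The function name count_netmap_activations is made up for illustration, and it assumes the message format shown in this thread.

```shell
# Hypothetical helper: count "Emulated adapter for <ifname> activated"
# messages per interface. Reads kernel log text on stdin.
count_netmap_activations() {
    awk '/generic_netmap_register/ && /Emulated adapter for/ {
        # The interface name is the word right after "for".
        for (i = 1; i < NF; i++) if ($i == "for") count[$(i + 1)]++
    }
    END { for (name in count) print count[name], name }' | LC_ALL=C sort -k2
}
```

Usage would be something like `dmesg | count_netmap_activations`. Fed the seven lines above, it should report 2 for igb1_vlan10, 2 for igb1_vlan16 and 3 for igb1_vlan4.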

I believe this requires a reboot for Zenarmor to cope.


Cheers,
Franco

Quote from: franco on March 09, 2023, 08:39:00 PM
I believe this requires a reboot for Zenarmor to cope.


Cheers,
Franco

Hi Franco,

I've tried power cycling the system a few times and that didn't seem to help, any other ideas?

Contact their support? I don't have much to go on.


Cheers,
Franco

Quote from: franco on March 09, 2023, 10:29:18 PM
Contact their support? I don't have much to go on.


Cheers,
Franco

Sounds good, I'll do that. Thanks again for the hard work; at least my Zenarmor/Suricata setup is basically functional again.

@franco, I had a quick question: when do you think the netmap kernel fix will be introduced to OPNsense 23.x? Just curious, thanks!

It has been accepted and should be included in the FreeBSD tree by the end of last week I hope.

Once it's there we will also release it in 23.1.x.


Cheers,
Franco

Hi Franco,

Just installed the latest 23.1.5, and it seems to be awesome, no more weird netmap issues so far, thanks for the awesome work!

I'm not sure whether the netmap kernel is now part of 23.1.5, as it is not mentioned in the changelog, so I'd assume that's still in the queue?

For me, all my Zenarmor issues remain.
In native mode with the igb Intel driver, Zenarmor will either stall or crash after 1-2 days, breaking all my inter-VLAN connections until Zenarmor is restarted.
In emulated mode, with or without the new netmap kernel, I'm seeing ever-increasing mbuf usage until it tops out at 100% and practically everything just stops working.

So Zenarmor is currently more or less unusable. It was rock solid for me a year ago, and I'm not sure what has changed to make it the major issue for my network connectivity. Of course I have long-running support cases open with them, but those aren't really moving forward in either direction.
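For anyone wanting to watch for the same symptom: a rough sketch for tracking mbuf usage over time is to pull the "current" figure out of `netstat -m`. The helper name mbufs_in_use is made up for illustration, and it assumes the usual FreeBSD output line of the form "N/N/N mbufs in use (current/cache/total)".

```shell
# Hypothetical helper: print the "current" mbuf count from `netstat -m`
# output supplied on stdin (assumes the FreeBSD output format).
mbufs_in_use() {
    awk '/ mbufs in use/ {
        split($1, parts, "/")   # $1 looks like current/cache/total
        print parts[1]
        exit
    }'
}
```

Running `netstat -m | mbufs_in_use` at intervals and logging the result should show whether the count keeps climbing or levels off.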

The patch (including another mbuf leak fix) has now gone into the FreeBSD source tree. I've built a version based on this patch:

# opnsense-update -zkr 23.1.5-netmap

It hasn't been released yet but will be released in either 23.1.6 or 23.1.7 depending on which will do a base/kernel patch round.


Cheers,
Franco

Thanks Franco. :)
I'll give that a try in emulated mode tomorrow to see if that fixes my mbuf issue.

I no longer see the mbuf leak with this version. I'll keep running it to see if there are any further issues.

Nice to hear. The current state from the FreeBSD source tree has been merged into the stable branch. Almost there. :)


Cheers,
Franco