OPNsense Forum

English Forums => 24.7, 24.10 Legacy Series => Topic started by: newsense on October 13, 2024, 12:34:39 AM

Title: PSA: Test kernel with Intel fixes is available for testing
Post by: newsense on October 13, 2024, 12:34:39 AM
For those interested, Franco built a test kernel based on a sizable import of Intel-related fixes that recently made it into FreeBSD. You can review the new commits on GitHub in the opnsense/src repository.

To install and verify you're on the new kernel:

# opnsense-update -zkr 24.7.6-intel

Fetching kernel-24.7.6-intel-amd64.txz: ..... done
!!!!!!!!!!!! ATTENTION !!!!!!!!!!!!!!!
! A critical upgrade is in progress. !
! Please do not turn off the system. !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Installing kernel-24.7.6-intel-amd64.txz... done
Please reboot.

# opnsense-shell reboot

root@OPNsense:~ # uname -a
FreeBSD OPNsense.home.arpa 14.1-RELEASE-p5 FreeBSD 14.1-RELEASE-p5 stable/24.7-n267903-51187ea746d SMP amd64
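
If you want to script the check instead of eyeballing it, here is a minimal sketch that pulls the abbreviated commit hash out of a `uname -a` style string. The helper name kernel_commit is made up for this example, and it assumes the version string keeps the "-n<count>-<hash>" form shown above.

```shell
# Hypothetical helper: print the abbreviated git hash embedded in a
# `uname -a` style string. Assumes the "-n<count>-<hash>" form seen above.
kernel_commit() {
    printf '%s\n' "$1" | sed -n 's/.*-n[0-9]*-\([0-9a-f]\{7,\}\).*/\1/p'
}

# Usage on a live system: kernel_commit "$(uname -a)"
kernel_commit "FreeBSD OPNsense.home.arpa 14.1-RELEASE-p5 FreeBSD 14.1-RELEASE-p5 stable/24.7-n267903-51187ea746d SMP amd64"
```

The printed hash can then be compared against the commits in opnsense/src.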



In case of issues you can always boot the old kernel from the boot menu. Snapshots are also available on ZFS - although in this case it may be a bit overkill.


I've been running this kernel for over 24h without issues on 3 FWs with igc NICs and one APU with igb.

Any feedback welcome, and potential issues will have a chance to be fixed before these patches make it into 24.7.7+
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: madj42 on October 16, 2024, 06:44:12 PM
No issues after applying this.  Will run for a few days and report any issues.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Slashing on October 17, 2024, 09:15:36 AM
Similarly, this has worked for several days without problems on this machine: https://bsd-hardware.info/?probe=d408973732
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on October 17, 2024, 10:49:12 AM
Thanks for the early testing! Currently busy with the business release so I haven't been all over this yet.

Can you all also give reference to the Intel drivers you are using? I think this touches every driver so some reference might be good.
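
A rough way to collect that reference, as a sketch only: on FreeBSD the interface name is the driver name plus a unit number, so the Intel drivers in use can be guessed from the interface list. The helper name intel_drivers is made up here, and the driver list in the case statement is an assumption that may not be complete.

```shell
# Hypothetical helper: given a whitespace-separated interface list,
# print the distinct Intel driver names found in it.
# The set of driver names below is an assumption, not an exhaustive list.
intel_drivers() {
    for ifname in $1; do
        # Strip the trailing unit number: igb0 -> igb
        drv=$(printf '%s\n' "$ifname" | sed 's/[0-9]*$//')
        case "$drv" in
            em|igb|igc|ix|ixv|ixl|iavf|ice) printf '%s\n' "$drv" ;;
        esac
    done | sort -u | tr '\n' ' ' | sed 's/ $//'
}

# Usage on a live system: intel_drivers "$(ifconfig -l)"
```

VLAN, loopback and other virtual interfaces are simply ignored because their names do not match any of the listed drivers.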


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: nghappiness on October 18, 2024, 01:17:24 PM
What specific Intel NIC issues does this kernel fix?

I upgraded to it. Other than the WAN interface no longer getting a public IPv6 address (/128), so far so good.

Using the ixl driver with an Intel X710.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on October 18, 2024, 01:30:50 PM
Here is the abbreviated commit list off git:

6fbe7e4dd14 e1000: Re-add AIM
ca95baff951 netpfil tests: run in parallel
5c0dbb73dd5 iflib: Make iflib_stop() static
7751551389f if_vlan: handle VID conflicts
51187ea746d ixgbe: add reset count field to HW struct
1f9500080f6 ixgbe: increase VF reset timeout
f33b5451eaa ixgbe: add missing QV defines
767371038e7 ixgbe: fix PHY ID for X550
6cd77c97666 ixgbe: fix misleading indentation in ixgbe_phy
4e9b87df887 ixgbe: update ixgbe_phy with ix-3.3.38 changes
1a6466a912a ixgbe: improve Atom C3000 SWFW semaphore acq
33d5bb2e927 ixgbe: remove circular dependency in ixgbe_mbx.h
654abbb71f5 ixgbe: increase DCB BW calculation for MTU
81fe0f394a4 ixgbe: fix fw_recovery_mode callout
eeb0c0b11e6 ixgbe: remove unused function prototypes
3e83bfba1e9 ixgbe: Remove Atom C3000 HIC FW access
c087e7b2eef iflib: Correct indentation according to style(9)
ff29eb0cc4f iflib: Fix compiler warnings
43c6e9c9f1b iflib: Prefer C99's __func__ over GCC's __FUNCTION__
3e0e7b69522 iflib: Many style fixes
47574d67a50 ixgbe: prevent PBA read over eeprom word size
1c2f9ecb475 ixgbe: use primary and block terminology
6885c77664f ixgbe: improve function comments
64d62bb4433 ixgbe: replace implicit fall-through comments
1f5c31409b6 ixgbe: Switch if_sriov read/write back to ixgbe_mbx APIs
00e3c3efac4 ixgbe: update if_sriov with ix-3.3.38 changes
5e3a969e037 ixgbe: update if_sriov to use the new mailbox apis
7ab90af94b8 ixgbe: fix compilation for VF
8ae4b93f5c9 ixgbe: update ixgbe_mbx with ix-3.3.38 changes
aef03eaa67f ixgbe: introduce new mailbox API
be5e1c89e36 ixgbe: correct register names to match datasheet
792d5e06e80 ixgbe: rename VF message type macros
73405b8ffb0 ixv: fix x550 VF link speed reported
1b4494eb64a ixgbe: update if_ix and ixgbe api with ix-3.3.38 changes
b2a392c64f5 ixgbe: update if_bypass to ix-3.3.38
90a1709624f e1000: remove NEEDGIANT from a couple sysctls
f6ce5c29b9c e1000: drop NEEDGIANT from em_sysctl_debug_info use
7ea6e25ab12 e1000: drop NEEDGIANT on em_sysctl_reg_handler uses
0ae5fc2a34c e1000: fix link power down
72d2c5f8472 e1000: Update igb driver version to 2.5.28-fbsd
8e7ea373c8f igc: Add NVM/firmware prints and sysctl
7941470ba5c e1000: Clean up ITR/EITR in preparation for AIM
653e6f0ac29 e1000: Delay safe_pause switch until SI_SUB_CLOCKS
9457b504321 e1000: Add sysctl for igb(4) DMA Coalesce
1af69a3af35 e1000: Handle igb EEE sysctl
38de8de9e60 e1000: Add sysctls for some missing MAC stats
7ceca538aee iflib: Simplify iflib_legacy_setup
82ec23b1ccc e1000: Clean up legacy absolute and packet timers
3ce46d49d19 igc: Remove non-existent legacy absolute and packet timers
f36281617b9 netmap: Make the memory ops function pointer table const
719c7eb93b3 iflib: Use if_alloc_dev() to allocate the ifnet
246cf4bd41b if_enc(4): Make enc_add_hhooks() void


I've uploaded a second one with a couple more changes (the top ones in the list).

# opnsense-update -zkr 24.7.6-intel2


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: nghappiness on October 18, 2024, 03:58:44 PM
Franco,

Thanks for the list of fixes..  That is a lot..
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: newsense on October 20, 2024, 09:15:54 AM
All good here on 24.7.6-intel2 kernel, same firewalls as in the original post.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Seimus on October 20, 2024, 10:18:31 AM
Beautiful!

New & Franco thx so much.

I will test this out today (on N100 DDR5 unit) once I have a free slot and report back.

Regards,
S.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: dinguz on October 20, 2024, 03:46:59 PM
Seems to work fine here, hardware is I211/igb https://bsd-hardware.info/?probe=85f998d2ab
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: yeraycito on October 20, 2024, 06:07:48 PM
Quote from: franco on October 18, 2024, 01:30:50 PM

As far as I can see, most of the changes are related to ixgbe (10 Gigabit), so I wonder what benefits they bring to other interfaces.
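
One rough way to check that impression is to tally the commit list by its subsystem prefix. This is only a sketch: the helper name tally_prefixes is made up, and it assumes each line has the "<hash> <prefix>: <subject>" shape used in Franco's list.

```shell
# Hypothetical helper: read "<hash> <prefix>: <subject>" lines on stdin
# and print "<count> <prefix>", most frequent first.
tally_prefixes() {
    awk '{ p = $2; sub(/:$/, "", p); n[p]++ }
         END { for (p in n) print n[p], p }' | sort -k1,1nr -k2,2
}

# Small sample taken from the list; feed the full list the same way.
printf '%s\n' \
    '6fbe7e4dd14 e1000: Re-add AIM' \
    '51187ea746d ixgbe: add reset count field to HW struct' \
    '1f9500080f6 ixgbe: increase VF reset timeout' \
    '8e7ea373c8f igc: Add NVM/firmware prints and sysctl' | tally_prefixes
```

Note that the e1000 prefix covers em and igb, and iflib changes affect every iflib-based driver, so a low count for one prefix does not mean a driver is untouched.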
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Seimus on October 20, 2024, 11:52:30 PM
Went with the intel2 kernel upgrade.
Installation went without problems and the device booted without problems.

All looks so far good on my Prod device:
N100   - i226-V | Crucial 16G  4800 DDR5 | S 980 500G

Regards,
S.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on October 21, 2024, 09:09:20 AM
Thanks for testing so far. Looks quite ok for the moment, but for extra testing we will skip kernel updates in 24.7.7 this week so this batch will likely land in 243.7.8.


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: nghappiness on October 23, 2024, 01:28:04 AM
Quote from: franco on October 21, 2024, 09:09:20 AM
Thanks for testing so far. Looks quite ok for the moment, but for extra testing we will skip kernel updates in 24.7.7 this week so this batch will likely land in 243.7.8.


219 more years for the kernel update.   =)     I have time (j/k).

Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on October 23, 2024, 08:02:08 AM
Hehe  :)
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: nghappiness on October 24, 2024, 12:57:43 AM
Quote from: franco on October 21, 2024, 09:09:20 AM
Thanks for testing so far. Looks quite ok for the moment, but for extra testing we will skip kernel updates in 24.7.7 this week so this batch will likely land in 243.7.8.


Cheers,
Franco

Upgraded to 24.7.7, will I get into trouble to install the kernel patch on top of 24.7.7?
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: newsense on October 24, 2024, 01:58:21 AM
No trouble at all, 24.7.7 has no new kernel.


For those already on the 24.7.6-intel2 kernel there are two options:

a) Install 24.7.7 which will revert your kernel to 24.7.6

b) Temporarily lock the kernel in the GUI until 24.7.8 is available, so you can upgrade to 24.7.7 without issues. Don't forget to unlock before upgrading to 24.7.8 though
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: newsense on October 24, 2024, 02:33:35 PM
New kernel

Quote from: 42bf31316a551230f13e247dba5cb11e8e1b06e8
igc: Add AIM

igc is derived from igb and has never had an AIM implementation. The
same algorithm from e1000 is appropriate here.

Upon more detailed study of the Linux driver which has a newer AIM
implementation, it finally became clear to me this is actually a
holdoff timer and not an interrupt limit as it is conventionally
(statically) programmed and displayed as an interrupt rate. The data
sheets also make this somewhat clear.

Thus, AIM accomplishes two beneficial things for a wide variety of
workloads[1]:

1. At low throughput/packet rates, it will significantly lower latency
(by counter-intuitively "increasing" the interrupt rate.. better
thought of as decreasing the holdoff timer because you will modulate
down before coming anywhere near these interrupt rates).
2. At bulk data rates, it is tuned to achieve a lower interrupt rate
(by increasing the holdoff timer) than the current static 8000/s. This
decreases processing overhead and yields more headroom for other work
such as packet filters or userland.

For a single NIC this might be worth a few sys% on common CPUs, but may
be meaningful when multiplied such as if_lagg, if_bridge and forwarding
setups.

The AIM algorithm was re-introduced from the older igb or out of tree
driver, and then modernized with permission to use Intel code from other
drivers.

[1]: http://iommu.com/datasheets/ethernet/controllers-nics/intel/e1000/gbe-controllers-interrupt-moderation-appl-note.pdf

https://github.com/opnsense/src/commit/42bf31316a551230f13e247dba5cb11e8e1b06e8


# opnsense-update -zkr 24.7.6-intel3
# opnsense-shell reboot
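
To put the "holdoff timer" reading from the commit message into numbers: a static rate of 8000 interrupts per second is the same as waiting at most 1/8000 s between interrupts. A tiny sketch of that arithmetic follows; the helper name holdoff_us is made up, and this is not how the driver itself programs the register.

```shell
# Hypothetical helper: convert an interrupt rate (per second) into the
# equivalent holdoff interval in microseconds (integer division).
holdoff_us() {
    echo $((1000000 / $1))
}

holdoff_us 8000   # prints 125
```

So a higher "interrupt rate" setting just means a shorter holdoff and lower latency, while a lower rate means a longer holdoff and less interrupt overhead under bulk load.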


Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Seimus on October 24, 2024, 03:30:26 PM
Quote from: newsense on October 24, 2024, 02:33:35 PM

Alright this looks juicy!

New, were you already able to see the benefits on igc-based NICs such as the i225 and i226?

Regards,
S.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: newsense on October 24, 2024, 03:53:11 PM
Things are rather quiet for now on the two FW I got to reboot, a bit hard to tell since they're only up for ~90 minutes
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Seimus on October 24, 2024, 04:03:54 PM
Still thanks for the reply.

I am currently unable to install/reboot my unit. But once I can I will go for that kernel.

Personally my setup uses LAGGs so I am looking forward to this.

Regards,
S.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: nghappiness on October 25, 2024, 12:38:19 AM
Quote from: newsense on October 24, 2024, 01:58:21 AM
No trouble at all, 24.7.7 has no new kernel.


For those already on the 24.7.6-intel2 kernel there are two options:

a) Install 24.7.7 which will revert your kernel to 24.7.6

b) Temporary lock the kernel in the GUI until 24.7.8 is available, so you can upgrade to 24.7.7 without issues. Don't forget to unlock before upgrading to 24.7.8 though

Thanks! Applied the intel3 kernel on top of 24.7.7. So far so good. Intel X710-DA2.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: REB00T on October 25, 2024, 07:31:35 PM
I seem to be having issues with the intel3 kernel with PPPoE and DHCPv6 on WAN. Very similar to the problem discussed a little while ago with dhcp6c. Reverting to the intel2 kernel and restarting seems to consistently fix the issue. To re-iterate, WAN receives an IPv6 address but not a prefix. It seems weird since I use an igb interface, and most of the changes regarding that were already present in intel2. Happy to provide more info if needed.

Edit: Clicking apply in the interface settings seems to work, it's just after boot up that it is not picked up. Will try to narrow it down more.

Edit 2: Seems like an ISP issue, just something to keep an eye on.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: newsense on November 04, 2024, 02:24:05 PM
New kernel today


# opnsense-update -zkr 24.7.6-intel4
# opnsense-shell reboot
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on November 04, 2024, 02:31:43 PM
intel4 will probably be the state of the final 24.7.8 later this week. All testing is welcome.


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: staticznld on November 04, 2024, 03:19:17 PM
Running Intel4 after running Intel3 for 1.5 weeks.
No issues so far!
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: newsense on November 04, 2024, 07:51:00 PM
Same here, no issues with intel3.

Intel4 running on 3/4 FWs already and all is fine.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on November 05, 2024, 11:54:31 AM
As per special request the "intel5" kernel contains temporary test patches for e1000/igb:

https://reviews.freebsd.org/D47336
https://reviews.freebsd.org/D47337

# opnsense-update -zkr 24.7.6-intel5


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: jclendineng on November 05, 2024, 11:33:06 PM
intel5 running, no issues that I can see so far. Mellanox (igb)
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: nghappiness on November 06, 2024, 02:06:09 PM
Quote from: franco on November 05, 2024, 11:54:31 AM
# opnsense-update -zkr 24.7.6-intel5

installed intel5 running with Intel x710 ixl driver.   
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: yeraycito on November 06, 2024, 03:12:16 PM
Opnsense 24.7.8

o src: assorted FreeBSD stable patches for Intel ixgbe, igb, igc and e1000 drivers

Is this the latest version of the kernel?
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on November 06, 2024, 03:19:54 PM
It's equivalent to intel4 test.


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: jclendineng on November 06, 2024, 04:22:41 PM
I locked intel5 prior to upgrading so I wouldn't lose the kernel
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: newsense on November 07, 2024, 03:12:42 PM
Intel6 kernel available, this time with ice driver updates.

# opnsense-update -zkr 24.7.8-intel6
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on November 12, 2024, 02:55:23 PM
Apologies for invading this thread: Once on 24.7.8, is there a way to revert to the previous Intel drivers for testing purposes? Since .8 I've been having CARP issues (interfaces going up and down randomly: https://forum.opnsense.org/index.php?topic=43934.0) and I'd like to narrow down the cause of the issue. This is a Protectli FW4B with Intel i211. Thanks!
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on November 12, 2024, 04:01:57 PM
Normal kernels are selected without -z and their respective OPNsense release version so in this case 24.7.6:

# opnsense-update -kr 24.7.6
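
To keep the two forms apart, here is a small sketch that only prints the command to run, based on what was posted in this thread (test kernels with -zkr, normal release kernels with -kr). The helper name kernel_update_cmd is made up and nothing is executed.

```shell
# Hypothetical helper: print (not run) the opnsense-update invocation
# for a test kernel or a normal release kernel, as used in this thread.
kernel_update_cmd() {
    case "$1" in
        test)    printf 'opnsense-update -zkr %s\n' "$2" ;;
        release) printf 'opnsense-update -kr %s\n' "$2" ;;
        *)       echo "usage: kernel_update_cmd test|release VERSION" >&2
                 return 1 ;;
    esac
}

kernel_update_cmd test 24.7.8-intel6
kernel_update_cmd release 24.7.6
```

Either way a reboot is needed afterwards for the selected kernel to become active.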


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on November 12, 2024, 04:10:11 PM
Thank you -- worked like a charm. Now that I know how to revert, I may as well also try the intel6 kernel for good measure. Thanks!
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on November 12, 2024, 04:36:14 PM
Ok, if it's part of the kernel changes in 24.7.8 I'm afraid we will have to look into it more closely as these are queued up for FreeBSD 14.2 as well.


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: newsense on November 14, 2024, 08:01:16 PM
If using the ice driver this kernel has RSS fixes for you.

opnsense-update -zkr 24.7.8-ice

Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on November 14, 2024, 08:16:23 PM
We haven't tested the RSS bit yet. Only do this if you have time to kill. ;)


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: nghappiness on November 15, 2024, 01:14:25 PM
Quote from: franco on November 14, 2024, 08:16:23 PM
We haven't tested the RSS bit yet. Only do this if you have time to kill. ;)

Sure, but I am using the x710 card, not sure if I am helping at all..
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on November 15, 2024, 01:21:16 PM
Yes, you can safely skip this one.


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on November 15, 2024, 04:00:10 PM
Quote from: franco on November 12, 2024, 04:36:14 PM
Ok, if it's part of the kernel changes in 24.7.8 I'm afraid we will have to look into it more closely as these are queued up for FreeBSD 14.2 as well.

So I tried intel6 on both my primary and backup box, and I am seeing the following errors periodically in my DHCP logs:

2024-11-15T08:19:46-05:00 Informational dhcpd Server starting service.
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_lan: I move from communications-interrupted to startup
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_opt1: I move from communications-interrupted to startup
2024-11-15T08:19:46-05:00 Informational dhcpd Sending on Socket/fallback/fallback-net
2024-11-15T08:19:46-05:00 Informational dhcpd Sending on BPF/igb1/00:e0:67:2f:04:39/192.168.5.0/24
2024-11-15T08:19:46-05:00 Informational dhcpd Sending updates to dhcp_lan.
2024-11-15T08:19:46-05:00 Informational dhcpd balanced pool 29121b317180 192.168.5.0/24 total 154 free 74 backup 74 lts 0 max-misbal 22
2024-11-15T08:19:46-05:00 Informational dhcpd balancing pool 29121b317180 192.168.5.0/24 total 154 free 79 backup 69 lts 5 max-own (+/-)15
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_lan: Both servers normal
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_lan: I move from communications-interrupted to normal
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_lan: peer moves from normal to normal
2024-11-15T08:19:46-05:00 Informational dhcpd balanced pool 29121b317240 192.168.9.0/24 total 105 free 52 backup 48 lts 2 max-misbal 15
2024-11-15T08:19:46-05:00 Informational dhcpd balancing pool 29121b317240 192.168.9.0/24 total 105 free 52 backup 48 lts 2 max-own (+/-)10
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_opt1: Both servers normal
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_opt1: I move from communications-interrupted to normal
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_opt1: peer moves from normal to normal
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_lan: I move from startup to communications-interrupted
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_opt1: I move from startup to communications-interrupted
2024-11-15T08:19:46-05:00 Informational dhcpd Listening on BPF/igb1/00:e0:67:2f:04:39/192.168.5.0/24
2024-11-15T08:19:46-05:00 Informational dhcpd Sending on BPF/igb2/00:e0:67:2f:04:3a/192.168.9.0/24
2024-11-15T08:19:46-05:00 Informational dhcpd Listening on BPF/igb2/00:e0:67:2f:04:3a/192.168.9.0/24
2024-11-15T08:19:46-05:00 Informational dhcpd Wrote 205 leases to leases file.
2024-11-15T08:19:46-05:00 Informational dhcpd Wrote 0 new dynamic host decls to leases file.
2024-11-15T08:19:46-05:00 Informational dhcpd Wrote 0 deleted host decls to leases file.
2024-11-15T08:19:46-05:00 Informational dhcpd Wrote 0 class decls to leases file.
2024-11-15T08:19:46-05:00 Informational dhcpd For info, please visit https://www.isc.org/software/dhcp/
2024-11-15T08:19:46-05:00 Informational dhcpd All rights reserved.
2024-11-15T08:19:46-05:00 Informational dhcpd Copyright 2004-2022 Internet Systems Consortium.
2024-11-15T08:19:46-05:00 Informational dhcpd Internet Systems Consortium DHCP Server 4.4.3-P1
2024-11-15T08:19:46-05:00 Informational dhcpd PID file: /var/run/dhcpd.pid
2024-11-15T08:19:46-05:00 Informational dhcpd Database file: /var/db/dhcpd.leases
2024-11-15T08:19:46-05:00 Informational dhcpd Config file: /etc/dhcpd.conf
2024-11-15T08:19:46-05:00 Informational dhcpd For info, please visit https://www.isc.org/software/dhcp/
2024-11-15T08:19:46-05:00 Informational dhcpd All rights reserved.
2024-11-15T08:19:46-05:00 Informational dhcpd Copyright 2004-2022 Internet Systems Consortium.
2024-11-15T08:19:46-05:00 Informational dhcpd Internet Systems Consortium DHCP Server 4.4.3-P1
2024-11-15T08:19:45-05:00 Informational dhcpd failover peer dhcp_opt1: I move from normal to communications-interrupted
2024-11-15T08:19:45-05:00 Informational dhcpd peer dhcp_opt1: disconnected
2024-11-15T08:19:45-05:00 Informational dhcpd failover peer dhcp_lan: I move from normal to communications-interrupted
2024-11-15T08:19:45-05:00 Informational dhcpd peer dhcp_lan: disconnected
2024-11-15T08:19:45-05:00 Informational dhcpd failover peer dhcp_opt1: Both servers normal
2024-11-15T08:19:45-05:00 Informational dhcpd failover peer dhcp_opt1: peer moves from communications-interrupted to normal
2024-11-15T08:19:45-05:00 Informational dhcpd balanced pool 3fa62b117240 192.168.9.0/24 total 105 free 52 backup 48 lts 2 max-misbal 15
2024-11-15T08:19:45-05:00 Informational dhcpd balancing pool 3fa62b117240 192.168.9.0/24 total 105 free 52 backup 48 lts 2 max-own (+/-)10
2024-11-15T08:19:45-05:00 Informational dhcpd failover peer dhcp_opt1: I move from communications-interrupted to normal
2024-11-15T08:19:45-05:00 Informational dhcpd failover peer dhcp_opt1: peer moves from normal to communications-interrupted
2024-11-15T08:19:40-05:00 Informational dhcpd failover peer dhcp_opt1: I move from normal to communications-interrupted
2024-11-15T08:19:40-05:00 Informational dhcpd peer dhcp_opt1: disconnected
2024-11-15T08:19:40-05:00 Error dhcpd timeout waiting for failover peer dhcp_opt1


Around the same time, I am seeing this in the general log:

2024-11-15T08:20:33-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "MASTER" for vhid 4
2024-11-15T08:19:48-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (preempting a slower master)
2024-11-15T08:19:45-05:00 Notice kernel <6>carp: demoted by -240 to 0 (pfsync bulk done)
2024-11-15T08:19:45-05:00 Notice kernel <6>carp: demoted by 240 to 240 (pfsync bulk start)
2024-11-15T08:19:45-05:00 Notice kernel <6>carp: 4@igb2: INIT -> BACKUP (initialization complete)
2024-11-15T08:19:45-05:00 Notice kernel <6>carp: 4@igb2: MASTER -> INIT (hardware interface up)
2024-11-15T08:19:44-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (master timed out)
2024-11-15T08:19:42-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "BACKUP" for vhid 4
2024-11-15T08:19:40-05:00 Notice kernel <6>carp: demoted by -240 to 0 (interface up)
2024-11-15T08:19:40-05:00 Notice kernel <6>carp: 4@igb2: INIT -> BACKUP (initialization complete)


I then reverted the kernel of my backup box to the 24.7.6 one, and I have not seen these issues for the past three days. My primary is still on intel6. Can't say 100% that the above issues are related to the new drivers, I'm merely reporting what I see.
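
For anyone comparing kernels the same way, a rough sketch for counting CARP state changes in an exported general log. The helper name carp_transitions is made up, and the pattern assumes the "carp: <vhid>@<if>: FROM -> TO (...)" kernel message format shown above.

```shell
# Hypothetical helper: read log lines on stdin and print
# "<count> FROM -> TO" for every kind of CARP state transition.
carp_transitions() {
    sed -n 's/.*carp: [0-9]*@[a-z0-9]*: \([A-Z]*\) -> \([A-Z]*\) .*/\1 -> \2/p' |
        sort | uniq -c | awk '{ print $1, $2, $3, $4 }'
}

# Sample lines taken from the log above.
printf '%s\n' \
    '2024-11-15T08:19:48-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (preempting a slower master)' \
    '2024-11-15T08:19:45-05:00 Notice kernel <6>carp: demoted by -240 to 0 (pfsync bulk done)' \
    '2024-11-15T08:19:45-05:00 Notice kernel <6>carp: 4@igb2: INIT -> BACKUP (initialization complete)' \
    '2024-11-15T08:19:45-05:00 Notice kernel <6>carp: 4@igb2: MASTER -> INIT (hardware interface up)' \
    '2024-11-15T08:19:44-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (master timed out)' | carp_transitions
```

Running it over the same time window on each kernel gives a simple number to compare instead of scrolling through the log.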

P.S.: Both boxes are identical twins (Protectli FW4B) running coreboot. Hardware details below (hw-probe):


HOST:
arch:amd64
boot_mode:EFI
cores:4
dual_boot:0
dual_boot_win:0
filesystem:zfs
freebsd_release:14.1
freebsd_version:14.1-RELEASE-p6
kernel:14.1-RELEASE-p5
lang:C.UTF-8
microarch:Silvermont
model:FW4B
nics:4
part_scheme:GPT
probe_ver:1.6
ram_total:4194304
ram_used:274432
sockets:1
space_total:21
space_used:2
system:opnsense-24.7.8
system_version:24.7.8
threads:1
type:desktop
vendor:Protectli
year:2022

DEVICES:
pci:8086-1539;02-00-00;works;network;igb;Intel Corporation;I211 Gigabit Network Connection
pci:8086-1539;02-00-00;works;network;igb;Intel Corporation;I211 Gigabit Network Connection
pci:8086-1539;02-00-00;works;network;igb;Intel Corporation;I211 Gigabit Network Connection
pci:8086-1539;02-00-00;works;network;igb;Intel Corporation;I211 Gigabit Network Connection
pci:8086-2280-8086-7270;06-00-00;detected;bridge;hostb;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series SoC Transaction Register
pci:8086-2284-8086-7270;04-03-00;detected;sound;hdac;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series High Definition Audio Controller
pci:8086-2292-8086-7270;0c-05-00;detected;smbus;ichsmb;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx SMBus Controller
pci:8086-229c-8086-7270;06-01-00;detected;bridge;isab;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series PCU
pci:8086-22a3-8086-7270;01-06-01;works;storage;ahci;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series SATA Controller
pci:8086-22b1-8086-7270;03-00-00;detected;graphics card;vgapci;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Integrated Graphics Controller
pci:8086-22b5-8086-7270;0c-03-30;detected;usb controller;xhci;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series USB xHCI Controller
pci:8086-22c8-8086-7270;06-04-00;detected;bridge;pcib;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series PCI Express Port
pci:8086-22ca-8086-7270;06-04-00;detected;bridge;pcib;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series PCI Express Port
pci:8086-22cc-8086-7270;06-04-00;detected;bridge;pcib;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series PCI Express Port
pci:8086-22ce-8086-7270;06-04-00;works;bridge;pcib;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series PCI Express Port
usb:1d6b-0003;09-00-00;detected;hub;uhub;BSD;XHCI root HUB
0;;detected;;;;
bios:coreboot-v4-12-0-8-10-25-2022;;works;bios;;coreboot;BIOS v4.12.0.8 10/25/2022
board:protectli-fw4b-1-0;;works;motherboard;;Protectli;Motherboard FW4B 1.0
cpu:intel-6-76-4-celeron-j3160;;works;cpu;;Intel;Celeron CPU J3160 @ 1.60GHz
cpu:intel-6-76-4-celeron-j3160;;works;cpu;;Intel;Celeron CPU J3160 @ 1.60GHz
cpu:intel-6-76-4-celeron-j3160;;works;cpu;;Intel;Celeron CPU J3160 @ 1.60GHz
cpu:intel-6-76-4-celeron-j3160;;works;cpu;;Intel;Celeron CPU J3160 @ 1.60GHz
ide:transcend-ts32gmts400-serial-1d60f49bfd09be47dc06ee30ef0346ff;;works;disk;ada, ahcich;Transcend;TS32GMTS400 32GB
mem:ram-module-4gb-1600-serial-aaec96992031920b10ead184d63872b5;;works;memory;;;RAM Module 4GB 1600MT/s
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 10, 2024, 02:33:30 PM
I updated both boxes to OPNsense 24.7.10_2 and unfortunately the problem I described above is back again. At this point, I am blaming it on the new Intel driver changes in the kernel :)
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 10, 2024, 02:56:10 PM
@Grossartig Are you willing to test a kernel with a few igb(4) reverts to get to the bottom of it?


Thanks,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 10, 2024, 03:05:37 PM
Absolutely!
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 10, 2024, 04:02:23 PM
Let's start with this one:

# opnsense-update -zkr 24.7.6_11

This is before a "benign" igb driver update was carried out.

I've marked "24.7.6" good and "24.7.8" bad for git-bisect.


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 10, 2024, 04:28:58 PM
Done on backup box. I'll let it "soak in" for a bit before I do the same on the primary.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 10, 2024, 04:57:14 PM
Installed kernel 24.7.6_11 on both primary and backup now. So far all stable and no CARP related issues. Let me know how long you want me to keep an eye on it before proceeding with additional tests. Thank you!
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 10, 2024, 05:01:39 PM
Let me build the next one as a lucky guess and upload it for you to try at your earliest convenience. For reference we're looking at

https://github.com/opnsense/src/commit/72d2c5f8472e91e40a1184d48fe9af8ba73fc58a



Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 10, 2024, 05:07:03 PM
# opnsense-update -zkr 24.7.6_13
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 10, 2024, 05:52:35 PM
I updated both boxes to kernel 24.7.6_13 about half an hour ago and all stable so far.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 10, 2024, 06:14:43 PM
Ok on to https://github.com/opnsense/src/commit/3b3f36757d2f

# opnsense-update -zkr 24.7.6_69
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 10, 2024, 06:26:51 PM
I was still on _13 and just got the following in the general logs of the primary box:


2024-12-10T12:24:33-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "MASTER" for vhid 4
2024-12-10T12:24:31-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "BACKUP" for vhid 4
2024-12-10T12:24:29-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "INIT" for vhid 4
2024-12-10T12:24:27-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "MASTER" for vhid 4
2024-12-10T12:24:22-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (preempting a slower master)
2024-12-10T12:24:19-05:00 Notice kernel <6>carp: demoted by -240 to 0 (pfsync bulk done)
2024-12-10T12:24:19-05:00 Notice kernel <6>carp: demoted by 240 to 240 (pfsync bulk start)
2024-12-10T12:24:19-05:00 Notice kernel <6>carp: 4@igb2: INIT -> BACKUP (initialization complete)
2024-12-10T12:24:19-05:00 Notice kernel <6>carp: 4@igb2: MASTER -> INIT (hardware interface up)
2024-12-10T12:24:15-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "BACKUP" for vhid 4
2024-12-10T12:24:08-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (preempting a slower master)
2024-12-10T12:24:05-05:00 Notice kernel <6>carp: demoted by -240 to 0 (interface up)
2024-12-10T12:24:05-05:00 Notice kernel <6>carp: 4@igb2: INIT -> BACKUP (initialization complete)
2024-12-10T12:23:56-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "INIT" for vhid 4
2024-12-10T12:23:55-05:00 Notice kernel <6>carp: demoted by 240 to 240 (interface down)
2024-12-10T12:23:55-05:00 Notice kernel <6>carp: 4@igb2: MASTER -> INIT (hardware interface down)
2024-12-10T12:18:03-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "MASTER" for vhid 4
2024-12-10T12:18:01-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "BACKUP" for vhid 4
2024-12-10T12:17:58-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "INIT" for vhid 4
2024-12-10T12:17:56-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "MASTER" for vhid 4
2024-12-10T12:17:42-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (master timed out)
2024-12-10T12:17:39-05:00 Notice kernel <6>carp: demoted by -240 to 0 (pfsync bulk done)
2024-12-10T12:17:39-05:00 Notice kernel <6>carp: demoted by 240 to 240 (pfsync bulk start)
2024-12-10T12:17:39-05:00 Notice kernel <6>carp: 4@igb2: INIT -> BACKUP (initialization complete)
2024-12-10T12:17:39-05:00 Notice kernel <6>carp: 4@igb2: MASTER -> INIT (hardware interface up)
2024-12-10T12:17:37-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (master timed out)
2024-12-10T12:17:36-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "BACKUP" for vhid 4
2024-12-10T12:17:34-05:00 Notice kernel <6>carp: demoted by -240 to 0 (interface up)
2024-12-10T12:17:34-05:00 Notice kernel <6>carp: 4@igb2: INIT -> BACKUP (initialization complete)
2024-12-10T12:17:27-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "INIT" for vhid 4
2024-12-10T12:17:25-05:00 Notice kernel <6>carp: demoted by 240 to 240 (interface down)
2024-12-10T12:17:25-05:00 Notice kernel <6>carp: 4@igb2: MASTER -> INIT (hardware interface down)


Edit: just happened again:


2024-12-10T12:28:44-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "MASTER" for vhid 4
2024-12-10T12:28:41-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "BACKUP" for vhid 4
2024-12-10T12:28:39-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "INIT" for vhid 4
2024-12-10T12:28:37-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "MASTER" for vhid 4
2024-12-10T12:28:33-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (master timed out)
2024-12-10T12:28:30-05:00 Notice kernel <6>carp: demoted by -240 to 0 (pfsync bulk done)
2024-12-10T12:28:30-05:00 Notice kernel <6>carp: demoted by 240 to 240 (pfsync bulk start)
2024-12-10T12:28:30-05:00 Notice kernel <6>carp: 4@igb2: INIT -> BACKUP (initialization complete)
2024-12-10T12:28:30-05:00 Notice kernel <6>carp: 4@igb2: MASTER -> INIT (hardware interface up)
2024-12-10T12:28:26-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "BACKUP" for vhid 4
2024-12-10T12:28:19-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (master timed out)
2024-12-10T12:28:17-05:00 Notice kernel <6>carp: demoted by -240 to 0 (interface up)
2024-12-10T12:28:17-05:00 Notice kernel <6>carp: 4@igb2: INIT -> BACKUP (initialization complete)
2024-12-10T12:28:07-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "INIT" for vhid 4
2024-12-10T12:28:06-05:00 Notice kernel <6>carp: demoted by 240 to 240 (interface down)
2024-12-10T12:28:06-05:00 Notice kernel <6>carp: 4@igb2: MASTER -> INIT (hardware interface down)


Edit 2: And more:


2024-12-10T12:32:53-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : openvpn_refresh_crls(1))
2024-12-10T12:32:52-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : core_trust_crl(1))
2024-12-10T12:32:52-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (1)
2024-12-10T12:32:51-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "MASTER" for vhid 4
2024-12-10T12:32:51-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : openvpn_refresh_crls(1))
2024-12-10T12:32:50-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : core_trust_crl(1))
2024-12-10T12:32:50-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (1)
2024-12-10T12:32:49-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "BACKUP" for vhid 4
2024-12-10T12:32:49-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : openvpn_refresh_crls(1))
2024-12-10T12:32:48-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : core_trust_crl(1))
2024-12-10T12:32:48-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (1)
2024-12-10T12:32:47-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "INIT" for vhid 4
2024-12-10T12:32:47-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : openvpn_refresh_crls(1))
2024-12-10T12:32:46-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : core_trust_crl(1))
2024-12-10T12:32:46-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (1)
2024-12-10T12:32:45-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "MASTER" for vhid 4
2024-12-10T12:32:44-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure newwanip_map:rfc2136 (,opt1)
2024-12-10T12:32:40-05:00 Notice kernel <6>arp: 192.168.9.1 moved from 00:00:5e:00:01:04 to 00:e0:67:2f:04:6a on igb2
2024-12-10T12:32:40-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (preempting a slower master)
2024-12-10T12:32:40-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure newwanip_map (execute task : wireguard_sync())
2024-12-10T12:32:40-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure newwanip_map (execute task : webgui_configure_do(,wan))
2024-12-10T12:32:40-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure newwanip_map (execute task : vxlan_configure_do())
2024-12-10T12:32:40-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure newwanip_map (execute task : unbound_configure_do(,wan))
2024-12-10T12:32:40-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure newwanip_map (execute task : openssh_configure_do(,wan))
2024-12-10T12:32:40-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure newwanip_map (execute task : opendns_configure_do())
2024-12-10T12:32:40-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure newwanip_map (execute task : ntpd_configure_do())
2024-12-10T12:32:40-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure newwanip_map (execute task : dnsmasq_configure_do())
2024-12-10T12:32:39-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure newwanip_map (execute task : dhcrelay_configure_if(,wan,inet))
2024-12-10T12:32:39-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure newwanip_map (,wan,inet)
2024-12-10T12:32:39-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure newwanip (,wan)
2024-12-10T12:32:39-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure vpn (,wan)
2024-12-10T12:32:39-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure dns (execute task : unbound_configure_do())
2024-12-10T12:32:39-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure dns (execute task : dnsmasq_configure_do())
2024-12-10T12:32:39-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure dns ()
2024-12-10T12:32:38-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure vpn_map (execute task : wireguard_configure_do())
2024-12-10T12:32:38-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure dhcp (execute task : dhcpd_dhcp_configure())
2024-12-10T12:32:38-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure dhcp ()
2024-12-10T12:32:38-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure ipsec (execute task : ipsec_configure_do(,opt1))
2024-12-10T12:32:38-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure ipsec (,opt1)
2024-12-10T12:32:38-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure monitor (execute task : dpinger_configure_do(,[]))
2024-12-10T12:32:38-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure monitor (,[])
2024-12-10T12:32:38-05:00 Notice opnsense /usr/local/etc/rc.linkup: ROUTING: keeping inet default route to XXX.XX.XXX.X
2024-12-10T12:32:38-05:00 Notice opnsense /usr/local/etc/rc.linkup: ROUTING: configuring inet default gateway on wan
2024-12-10T12:32:38-05:00 Notice kernel <6>carp: demoted by -240 to 0 (pfsync bulk done)
2024-12-10T12:32:37-05:00 Notice opnsense /usr/local/etc/rc.linkup: ROUTING: entering configure using opt1
2024-12-10T12:32:37-05:00 Notice kernel <6>carp: demoted by 240 to 240 (pfsync bulk start)
2024-12-10T12:32:37-05:00 Notice kernel <6>carp: 4@igb2: INIT -> BACKUP (initialization complete)
2024-12-10T12:32:37-05:00 Notice kernel <6>igb2: promiscuous mode enabled
2024-12-10T12:32:37-05:00 Notice kernel <6>igb2: promiscuous mode disabled
2024-12-10T12:32:37-05:00 Notice kernel <6>carp: 4@igb2: MASTER -> INIT (hardware interface up)
2024-12-10T12:32:37-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : openvpn_refresh_crls(1))
2024-12-10T12:32:36-05:00 Notice opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for opt1(igb2)
2024-12-10T12:32:35-05:00 Warning opnsense /usr/local/etc/rc.newwanip: Interface '' (ovpns1) is disabled or empty, nothing to do.
2024-12-10T12:32:35-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : core_trust_crl(1))
2024-12-10T12:32:35-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (1)
2024-12-10T12:32:33-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "BACKUP" for vhid 4
2024-12-10T12:32:33-05:00 Notice opnsense /usr/local/etc/rc.newwanip: OpenVPN server 1 instance started on PID 70045.
2024-12-10T12:32:33-05:00 Notice kernel <6>ovpns1: link state changed to UP
2024-12-10T12:32:32-05:00 Notice kernel <6>ovpns1: link state changed to DOWN
2024-12-10T12:32:32-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure newwanip_map:rfc2136 (,wan)
2024-12-10T12:32:32-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure vpn_map (execute task : openvpn_configure_do(,wan))
2024-12-10T12:32:32-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure vpn_map (execute task : ipsec_configure_do(,wan))
2024-12-10T12:32:32-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure vpn_map (,wan,inet)
2024-12-10T12:32:26-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure monitor (execute task : dpinger_configure_do(,[WAN_DHCP]))
2024-12-10T12:32:26-05:00 Notice opnsense /usr/local/etc/rc.newwanip: plugins_configure monitor (,[WAN_DHCP])
2024-12-10T12:32:26-05:00 Notice opnsense /usr/local/etc/rc.newwanip: ROUTING: keeping inet default route to XXX.XX.XXX.X
2024-12-10T12:32:26-05:00 Notice opnsense /usr/local/etc/rc.newwanip: ROUTING: configuring inet default gateway on wan
2024-12-10T12:32:26-05:00 Notice opnsense /usr/local/etc/rc.newwanip: ROUTING: entering configure using wan
2024-12-10T12:32:26-05:00 Notice opnsense /usr/local/etc/rc.newwanip: IP renewal starting (new: XXX.XX.XXX.X, old: XXX.XX.XXX.X, interface: wan, device: igb0, force: yes)
2024-12-10T12:32:25-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure dns (execute task : unbound_configure_do())
2024-12-10T12:32:25-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure dns (execute task : dnsmasq_configure_do())
2024-12-10T12:32:25-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure dns ()
2024-12-10T12:32:25-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (master timed out)
2024-12-10T12:32:24-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure dhcp (execute task : dhcpd_dhcp_configure())
2024-12-10T12:32:24-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure dhcp ()
2024-12-10T12:32:24-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure ipsec (execute task : ipsec_configure_do(,wan))
2024-12-10T12:32:24-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure ipsec (,wan)
2024-12-10T12:32:24-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure monitor (execute task : dpinger_configure_do(,[WAN_DHCP]))
2024-12-10T12:32:24-05:00 Notice opnsense /usr/local/etc/rc.linkup: plugins_configure monitor (,[WAN_DHCP])
2024-12-10T12:32:24-05:00 Notice opnsense /usr/local/etc/rc.linkup: ROUTING: setting inet default route to XXX.XX.XXX.X
2024-12-10T12:32:24-05:00 Notice opnsense /usr/local/etc/rc.linkup: ROUTING: configuring inet default gateway on wan
2024-12-10T12:32:24-05:00 Notice opnsense /usr/local/etc/rc.linkup: ROUTING: entering configure using wan
2024-12-10T12:32:23-05:00 Notice dhclient dhclient-script: Creating resolv.conf
2024-12-10T12:32:23-05:00 Notice dhclient dhclient-script: New Routers (igb0): XXX.XX.XXX.X
2024-12-10T12:32:23-05:00 Notice dhclient dhclient-script: New Broadcast Address (igb0): XXX.XX.XXX.XXX
2024-12-10T12:32:23-05:00 Notice dhclient dhclient-script: New Subnet Mask (igb0): 255.255.255.0
2024-12-10T12:32:23-05:00 Notice dhclient dhclient-script: New IP Address (igb0): XXX.XX.XXX.X
2024-12-10T12:32:23-05:00 Notice dhclient dhclient-script: Reason REBOOT on igb0 executing
2024-12-10T12:32:23-05:00 Notice dhclient dhclient-script: Reason PREINIT on igb0 executing
2024-12-10T12:32:22-05:00 Notice kernel <6>igb2: link state changed to UP
2024-12-10T12:32:22-05:00 Notice kernel <6>carp: demoted by -240 to 0 (interface up)
2024-12-10T12:32:22-05:00 Notice kernel <6>carp: 4@igb2: INIT -> BACKUP (initialization complete)
2024-12-10T12:32:21-05:00 Notice opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet attached event for wan(igb0)
2024-12-10T12:32:21-05:00 Notice kernel <6>igb0: link state changed to UP
2024-12-10T12:32:15-05:00 Notice opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for opt1(igb2)
2024-12-10T12:32:15-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : openvpn_refresh_crls(1))
2024-12-10T12:32:14-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (execute task : core_trust_crl(1))
2024-12-10T12:32:14-05:00 Notice opnsense /usr/local/sbin/pluginctl: plugins_configure crl (1)
2024-12-10T12:32:13-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "INIT" for vhid 4
2024-12-10T12:32:13-05:00 Critical dhclient exiting.
2024-12-10T12:32:13-05:00 Error dhclient connection closed
2024-12-10T12:32:12-05:00 Notice opnsense /usr/local/etc/rc.linkup: DEVD: Ethernet detached event for wan(igb0)
2024-12-10T12:32:11-05:00 Notice kernel <6>igb2: link state changed to DOWN
2024-12-10T12:32:11-05:00 Notice kernel <6>carp: demoted by 240 to 240 (interface down)
2024-12-10T12:32:11-05:00 Notice kernel <6>carp: 4@igb2: MASTER -> INIT (hardware interface down)
2024-12-10T12:32:11-05:00 Notice kernel <6>igb0: link state changed to DOWN
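
For anyone comparing kernels the same way, it may help to count the link-loss demotions instead of eyeballing the log. The sketch below is only an illustration: the function name count_carp_link_drops is made up, and it assumes the kernel messages look exactly like the lines pasted above. Feed it whatever log file or export you have on stdin.

```shell
#!/bin/sh
# Count how often a CARP vhid left MASTER because the NIC reported link down.
# Reads log lines on stdin and prints a single number.
# (Hypothetical helper, not part of OPNsense.)
count_carp_link_drops() {
    grep -c 'carp: .*MASTER -> INIT (hardware interface down)'
}

# Demo: two matching lines and one unrelated line taken from the log above.
count_carp_link_drops <<'EOF'
2024-12-10T12:23:55-05:00 Notice kernel <6>carp: 4@igb2: MASTER -> INIT (hardware interface down)
2024-12-10T12:24:22-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (preempting a slower master)
2024-12-10T12:17:25-05:00 Notice kernel <6>carp: 4@igb2: MASTER -> INIT (hardware interface down)
EOF
```

A count that stays at zero over a soak period on one kernel and climbs on another is a more comparable signal than "looked stable".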
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 10, 2024, 06:44:46 PM
Ok, go back to _11 and see if that fixes it. I think this is progress :)


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 10, 2024, 07:34:28 PM
Hmm, I went back to _11 and after a while started seeing the same. No rhyme or reason why it suddenly starts but didn't show up earlier. Thoughts?
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 10, 2024, 09:04:04 PM
Let's go back further?

# opnsense-update -zkr 24.7.6_7

But if this fails you should double-check 24.7.6 so we know there is a good baseline to work with. Otherwise it could be anything (else). :)


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 10, 2024, 10:57:30 PM
Thank you Franco, applied _7 to both boxes just now and letting it soak in. Will report back in a bit.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 11, 2024, 04:00:19 AM
24.7.6_7 has been stable on both boxes for the last few hours without any CARP error messages in the logs. So perhaps the issue is upwards of _7?
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 11, 2024, 08:18:56 AM
The window between _7 and _11 is quite narrow... that's good :) let's try _9:

# opnsense-update -zkr 24.7.6_9


Cheers,
Franco
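
The choice of _9 here is plain bisection: take the build halfway between the last known good one and the first known bad one. A minimal sketch of that arithmetic, assuming the test kernel suffixes in this window are consecutive integers (the thread only shows that for _7 to _11) and using a made-up helper name next_build:

```shell
#!/bin/sh
# Suggest the next test build between a known-good and a known-bad suffix.
# $1 = last good build number, $2 = first bad build number.
# (Hypothetical helper for illustration only.)
next_build() {
    good=$1
    bad=$2
    if [ $((bad - good)) -le 1 ]; then
        # Nothing left in between: the bad end is the first failing build.
        echo "first bad build: _$bad"
        return 0
    fi
    echo "_$(( (good + bad) / 2 ))"
}

next_build 7 11   # prints _9
next_build 7 9    # prints _8
next_build 7 8    # prints first bad build: _8
```

Each answer halves the window, so a range of four builds needs at most two more test kernels.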
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 11, 2024, 02:39:25 PM
Thank you -- applied to both boxes half an hour ago and so far stable, but will run it for a bit longer this time. Do you want to give me _10 already so I can switch to it at my leisure a bit later? :)

EDIT: Nooo, the CARP issues just happened again. So I guess I need _8? :)
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 11, 2024, 05:10:45 PM
Here it is:

# opnsense-update -zkr 24.7.6_8

Thanks a lot for doing this BTW!
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 11, 2024, 10:08:14 PM
Thanks Franco, the pleasure is all mine. Who knows, may still just be an issue only affecting me, so I appreciate you taking time out of your busy calendar with this.

I applied _8 to both boxes and saw the same CARP issues in the logs. I'll revert back to _7 to confirm that it is gone. Then I'll probably switch forward to _8 to confirm it truly starts with that build.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 12, 2024, 04:53:10 AM
So I've gone back and forth between _07 and _08 a couple of times and I can only see the issue appearing on _08, and it seems gone when I go back to _07. Smoking gun or coincidence? I'll let it remain on _07 overnight and will see if there are any issues present in the logs in the morning. Thanks!
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 12, 2024, 09:40:25 AM
Hmm, so it's the following commit:

https://github.com/opnsense/src/commit/1af69a3af3540f9f

which also seems to do something with igb...

Which Intel card is this again? Do you have EEE sysctl set?

# sysctl hw.em.eee_setting
hw.em.eee_setting: 1

Allegedly "1" means off, which is / should be the default. Turning EEE on ("0") could potentially introduce instability for CARP / link activity.


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 12, 2024, 02:04:42 PM
It's the I211 network chip. See here (https://forum.opnsense.org/index.php?topic=43372.msg219509#msg219509) for the complete details of my system (at bottom of that post).

Output of the EEE setting is 1 on both boxes:

root@OPNsense:~ # sysctl hw.em.eee_setting
hw.em.eee_setting: 1

root@OPNsense-Backup:~ # sysctl hw.em.eee_setting
hw.em.eee_setting: 1

Still all stable btw after leaving it on _7 overnight.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 12, 2024, 02:53:56 PM
Just for fun would you mind flipping the setting to 0?

# sysctl hw.em.eee_setting=0

I don't have a strong objection to backing this out for now, but I don't quite understand what is happening yet.


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 12, 2024, 03:03:19 PM
It appears I have to set it in loader.conf?

# sysctl hw.em.eee_setting=0
sysctl: oid 'hw.em.eee_setting' is a read only tunable
sysctl: Tunable values are set in /boot/loader.conf

It's not present in that file yet -- should I just add hw.em.eee_setting="0" to the bottom of it?
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Patrick M. Hausen on December 12, 2024, 03:05:25 PM
Rather add it in the UI.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 12, 2024, 03:10:41 PM
Quote from: Patrick M. Hausen on December 12, 2024, 03:05:25 PM
Rather add it in the UI.

Good call. Made the tunable changes, brought kernel back up to _8 (where the "issue" starts) and am rebooting both boxes now.
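
If you want to double-check what a loader.conf-style file ends up carrying for the tunable, something like the following can read it back. This is just a sketch: tunable_value is a made-up helper, the sample file is a throwaway, and it assumes simple name="value" lines. On a live box, the sysctl hw.em.eee_setting command used earlier in the thread shows the value actually in effect after the reboot.

```shell
#!/bin/sh
# Print the last value assigned to a tunable in a loader.conf-style file.
# $1 = file to read, $2 = tunable name.
# (Hypothetical helper, not an OPNsense tool.)
tunable_value() {
    awk -F= -v name="$2" '
        $1 == name { v = $2; gsub(/"/, "", v); found = 1 }
        END { if (found) print v }
    ' "$1"
}

# Demo against a temporary file rather than the real /boot/loader.conf.
tmp=$(mktemp)
printf '%s\n' 'kern.hz="1000"' 'hw.em.eee_setting="0"' > "$tmp"
tunable_value "$tmp" hw.em.eee_setting   # prints 0
rm -f "$tmp"
```

The helper prints nothing when the tunable is absent, which matches the situation described above where the line was not yet in the file.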
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 12, 2024, 04:11:56 PM
It's been almost an hour and the issue has so far not re-appeared. I'll leave it for another hour on _8 with the EEE tunable set to 0, and then I want to bring the kernel back up to the current production version to see if it continues to be stable with this tunable in the 0 position. Unless you want me to do any other testing along the way :)
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 12, 2024, 04:28:37 PM
That test plan sounds good. I need to look at the Intel vendor driver to make sense of this. If the setting is reversed then there is a bug now in the driver. Pretty weird.


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 12, 2024, 06:27:48 PM
So I'm about three hours into having gone from kernel 24.7.6_8 back up to the current 24.7.10. The EEE tunable is still set to 0. And no issues so far (knock on wood). Will keep it like this and monitor over the remainder of the day.

I think it's plausible that someone flipped the meaning of the tunable. 0 for Energy Efficient Ethernet OFF. 1 for EEE ON. Which would also be more intuitive.

Perhaps this also explains why the issue always took some time to materialize -- the energy efficiency mode only kicked in after some timeout? Just speculating of course.

EDIT: Several hours later, still all fine. The tunable fixed it!
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 13, 2024, 02:50:51 PM
Hmm, I checked the Intel driver. It seems correct, BUT:

From what I can tell, EEE was enabled by default before; now it is forced to disabled.

Which means this is the right fix, but the issue is that igb and e1000 are different, and now the e1000 default of disabled will be impossible to change for igb since they share the same value, which defaults to off.


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 13, 2024, 02:57:38 PM
So if I understand you correctly, depending on network chipset, end users (like me) will have to set this tunable specifically according to their chipset.

Or OPNsense (or BSD?) could detect the chipset and set the tunable accordingly? But perhaps it's best to leave that to end users.

Also, is this maybe only impacting people using CARP (like I was)? If so, probably just a small percentage of users impacted.
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 13, 2024, 03:25:09 PM
I think it's a bigger deal than it looks on the surface. IMO the best course of action would be to revert the commit for the time being.

Assuming you're seeing the issue others will see it too for sure.

I have to think about it. I also mailed the author of the commit.


Cheers,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 13, 2024, 03:35:40 PM
Excellent. This was a fun experience, thank you for your personal attention on this matter! :)
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: franco on December 13, 2024, 04:00:06 PM
I sent you a private message in case you didn't see. Need more debug info.


Thanks,
Franco
Title: Re: PSA: Test kernel with Intel fixes is available for testing
Post by: Grossartig on December 13, 2024, 04:25:46 PM
Sent you two private messages just now