PSA: Test kernel with Intel fixes is available for testing

Started by newsense, October 13, 2024, 12:34:39 AM

OPNsense 24.7.8

o src: assorted FreeBSD stable patches for Intel ixgbe, igb, igc and e1000 drivers

Is this the latest version of the kernel?


I locked intel5 prior to upgrading so I wouldn't lose the kernel.

Intel6 kernel available, this time with ice driver updates.

# opnsense-update -zkr 24.7.8-intel6

Apologies for invading this thread: once on 24.7.8, is there a way to revert to the previous Intel drivers for testing purposes? Since .8 I've been having CARP issues (interfaces going up and down randomly, see https://forum.opnsense.org/index.php?topic=43934.0) and I'd like to narrow down the cause. This is a Protectli FW4B with Intel i211. Thanks!

Normal kernels are selected without -z, using their respective OPNsense release version, so in this case 24.7.6:

# opnsense-update -kr 24.7.6


Cheers,
Franco
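
To double-check that a kernel switch like the one above has actually taken effect, one option is to compare the installed and running kernel versions. This is only a minimal sketch: it assumes freebsd-version(1) is available on the box, and it only helps when the two kernels differ in FreeBSD patch level (a test kernel built from the same patch level would look identical). The function name kernel_status and its optional arguments are made up for illustration.

```shell
#!/bin/sh
# Report whether the running kernel matches the installed one.
# Both arguments are optional and exist only to make the check easy
# to try out; by default the values come from freebsd-version(1):
#   -k  version of the installed kernel
#   -r  version of the running kernel
kernel_status() {
    installed=${1:-$(freebsd-version -k)}
    running=${2:-$(freebsd-version -r)}
    if [ "$installed" = "$running" ]; then
        echo "running kernel $running matches installed kernel"
    else
        echo "reboot pending: running $running, installed $installed"
    fi
}
```

Calling kernel_status with no arguments after opnsense-update and again after the reboot should show the change, provided the patch levels differ.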

Thank you -- worked like a charm. Now that I know how to revert, I may as well also try the intel6 kernel for good measure. Thanks!

Ok, if it's part of the kernel changes in 24.7.8 I'm afraid we will have to look into it more closely as these are queued up for FreeBSD 14.2 as well.


Cheers,
Franco

If using the ice driver this kernel has RSS fixes for you.

# opnsense-update -zkr 24.7.8-ice


We haven't tested the RSS bit yet. Only do this if you have time to kill. ;)


Cheers,
Franco

Quote from: franco on November 14, 2024, 08:16:23 PM
We haven't tested the RSS bit yet. Only do this if you have time to kill. ;)

Sure, but I am using the X710 card, so I'm not sure I would be helping at all.


November 15, 2024, 04:00:10 PM #42 Last Edit: November 18, 2024, 07:03:20 PM by Grossartig
Quote from: franco on November 12, 2024, 04:36:14 PM
Ok, if it's part of the kernel changes in 24.7.8 I'm afraid we will have to look into it more closely as these are queued up for FreeBSD 14.2 as well.

So I tried intel6 on both my primary and backup box, and I am seeing the following errors periodically in my DHCP logs:

2024-11-15T08:19:46-05:00 Informational dhcpd Server starting service.
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_lan: I move from communications-interrupted to startup
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_opt1: I move from communications-interrupted to startup
2024-11-15T08:19:46-05:00 Informational dhcpd Sending on Socket/fallback/fallback-net
2024-11-15T08:19:46-05:00 Informational dhcpd Sending on BPF/igb1/00:e0:67:2f:04:39/192.168.5.0/24
2024-11-15T08:19:46-05:00 Informational dhcpd Sending updates to dhcp_lan.
2024-11-15T08:19:46-05:00 Informational dhcpd balanced pool 29121b317180 192.168.5.0/24 total 154 free 74 backup 74 lts 0 max-misbal 22
2024-11-15T08:19:46-05:00 Informational dhcpd balancing pool 29121b317180 192.168.5.0/24 total 154 free 79 backup 69 lts 5 max-own (+/-)15
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_lan: Both servers normal
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_lan: I move from communications-interrupted to normal
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_lan: peer moves from normal to normal
2024-11-15T08:19:46-05:00 Informational dhcpd balanced pool 29121b317240 192.168.9.0/24 total 105 free 52 backup 48 lts 2 max-misbal 15
2024-11-15T08:19:46-05:00 Informational dhcpd balancing pool 29121b317240 192.168.9.0/24 total 105 free 52 backup 48 lts 2 max-own (+/-)10
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_opt1: Both servers normal
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_opt1: I move from communications-interrupted to normal
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_opt1: peer moves from normal to normal
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_lan: I move from startup to communications-interrupted
2024-11-15T08:19:46-05:00 Informational dhcpd failover peer dhcp_opt1: I move from startup to communications-interrupted
2024-11-15T08:19:46-05:00 Informational dhcpd Listening on BPF/igb1/00:e0:67:2f:04:39/192.168.5.0/24
2024-11-15T08:19:46-05:00 Informational dhcpd Sending on BPF/igb2/00:e0:67:2f:04:3a/192.168.9.0/24
2024-11-15T08:19:46-05:00 Informational dhcpd Listening on BPF/igb2/00:e0:67:2f:04:3a/192.168.9.0/24
2024-11-15T08:19:46-05:00 Informational dhcpd Wrote 205 leases to leases file.
2024-11-15T08:19:46-05:00 Informational dhcpd Wrote 0 new dynamic host decls to leases file.
2024-11-15T08:19:46-05:00 Informational dhcpd Wrote 0 deleted host decls to leases file.
2024-11-15T08:19:46-05:00 Informational dhcpd Wrote 0 class decls to leases file.
2024-11-15T08:19:46-05:00 Informational dhcpd For info, please visit https://www.isc.org/software/dhcp/
2024-11-15T08:19:46-05:00 Informational dhcpd All rights reserved.
2024-11-15T08:19:46-05:00 Informational dhcpd Copyright 2004-2022 Internet Systems Consortium.
2024-11-15T08:19:46-05:00 Informational dhcpd Internet Systems Consortium DHCP Server 4.4.3-P1
2024-11-15T08:19:46-05:00 Informational dhcpd PID file: /var/run/dhcpd.pid
2024-11-15T08:19:46-05:00 Informational dhcpd Database file: /var/db/dhcpd.leases
2024-11-15T08:19:46-05:00 Informational dhcpd Config file: /etc/dhcpd.conf
2024-11-15T08:19:46-05:00 Informational dhcpd For info, please visit https://www.isc.org/software/dhcp/
2024-11-15T08:19:46-05:00 Informational dhcpd All rights reserved.
2024-11-15T08:19:46-05:00 Informational dhcpd Copyright 2004-2022 Internet Systems Consortium.
2024-11-15T08:19:46-05:00 Informational dhcpd Internet Systems Consortium DHCP Server 4.4.3-P1
2024-11-15T08:19:45-05:00 Informational dhcpd failover peer dhcp_opt1: I move from normal to communications-interrupted
2024-11-15T08:19:45-05:00 Informational dhcpd peer dhcp_opt1: disconnected
2024-11-15T08:19:45-05:00 Informational dhcpd failover peer dhcp_lan: I move from normal to communications-interrupted
2024-11-15T08:19:45-05:00 Informational dhcpd peer dhcp_lan: disconnected
2024-11-15T08:19:45-05:00 Informational dhcpd failover peer dhcp_opt1: Both servers normal
2024-11-15T08:19:45-05:00 Informational dhcpd failover peer dhcp_opt1: peer moves from communications-interrupted to normal
2024-11-15T08:19:45-05:00 Informational dhcpd balanced pool 3fa62b117240 192.168.9.0/24 total 105 free 52 backup 48 lts 2 max-misbal 15
2024-11-15T08:19:45-05:00 Informational dhcpd balancing pool 3fa62b117240 192.168.9.0/24 total 105 free 52 backup 48 lts 2 max-own (+/-)10
2024-11-15T08:19:45-05:00 Informational dhcpd failover peer dhcp_opt1: I move from communications-interrupted to normal
2024-11-15T08:19:45-05:00 Informational dhcpd failover peer dhcp_opt1: peer moves from normal to communications-interrupted
2024-11-15T08:19:40-05:00 Informational dhcpd failover peer dhcp_opt1: I move from normal to communications-interrupted
2024-11-15T08:19:40-05:00 Informational dhcpd peer dhcp_opt1: disconnected
2024-11-15T08:19:40-05:00 Error dhcpd timeout waiting for failover peer dhcp_opt1


Around the same time, I am seeing this in the general log:

2024-11-15T08:20:33-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "MASTER" for vhid 4
2024-11-15T08:19:48-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (preempting a slower master)
2024-11-15T08:19:45-05:00 Notice kernel <6>carp: demoted by -240 to 0 (pfsync bulk done)
2024-11-15T08:19:45-05:00 Notice kernel <6>carp: demoted by 240 to 240 (pfsync bulk start)
2024-11-15T08:19:45-05:00 Notice kernel <6>carp: 4@igb2: INIT -> BACKUP (initialization complete)
2024-11-15T08:19:45-05:00 Notice kernel <6>carp: 4@igb2: MASTER -> INIT (hardware interface up)
2024-11-15T08:19:44-05:00 Notice kernel <6>carp: 4@igb2: BACKUP -> MASTER (master timed out)
2024-11-15T08:19:42-05:00 Notice opnsense /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "Virtual Guest IP (192.168.9.1) (4@igb2)" has resumed the state "BACKUP" for vhid 4
2024-11-15T08:19:40-05:00 Notice kernel <6>carp: demoted by -240 to 0 (interface up)
2024-11-15T08:19:40-05:00 Notice kernel <6>carp: 4@igb2: INIT -> BACKUP (initialization complete)


I then reverted the kernel of my backup box to the 24.7.6 one, and I have not seen these issues for the past three days. My primary is still on intel6. I can't say 100% that the above issues are related to the new drivers; I'm merely reporting what I see.
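
For anyone wanting to compare kernels the same way, a rough sketch for pulling only the CARP state transitions out of a general log export might look like the following. It assumes the line layout shown in the excerpt above (timestamp, severity, "kernel", then the carp message); the function names and the file name general.log are hypothetical.

```shell
#!/bin/sh
# Print one line per CARP state transition found on stdin:
#   <timestamp> <vhid@interface> <old state> -> <new state>
# Demotion messages and non-kernel lines are skipped.
carp_transitions() {
    awk '$3 == "kernel" && $4 ~ /carp:$/ && $7 == "->" {
        sub(/:$/, "", $5)          # drop the trailing colon from "4@igb2:"
        print $1, $5, $6, "->", $8
    }'
}

# Count how many of those transitions ended in MASTER.
count_master() {
    carp_transitions | awk '$5 == "MASTER" { n++ } END { print n + 0 }'
}

# Example (general.log is a hypothetical export of the general log):
#   carp_transitions < general.log
#   count_master < general.log
```

Running this against logs from both boxes over the same period gives a quick count of how often each kernel flips to MASTER, which is easier to compare than scrolling through the raw log.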

P.S.: Both boxes are identical twins (Protectli FW4B) running coreboot. Hardware details below (hw-probe):


HOST:
arch:amd64
boot_mode:EFI
cores:4
dual_boot:0
dual_boot_win:0
filesystem:zfs
freebsd_release:14.1
freebsd_version:14.1-RELEASE-p6
kernel:14.1-RELEASE-p5
lang:C.UTF-8
microarch:Silvermont
model:FW4B
nics:4
part_scheme:GPT
probe_ver:1.6
ram_total:4194304
ram_used:274432
sockets:1
space_total:21
space_used:2
system:opnsense-24.7.8
system_version:24.7.8
threads:1
type:desktop
vendor:Protectli
year:2022

DEVICES:
pci:8086-1539;02-00-00;works;network;igb;Intel Corporation;I211 Gigabit Network Connection
pci:8086-1539;02-00-00;works;network;igb;Intel Corporation;I211 Gigabit Network Connection
pci:8086-1539;02-00-00;works;network;igb;Intel Corporation;I211 Gigabit Network Connection
pci:8086-1539;02-00-00;works;network;igb;Intel Corporation;I211 Gigabit Network Connection
pci:8086-2280-8086-7270;06-00-00;detected;bridge;hostb;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series SoC Transaction Register
pci:8086-2284-8086-7270;04-03-00;detected;sound;hdac;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series High Definition Audio Controller
pci:8086-2292-8086-7270;0c-05-00;detected;smbus;ichsmb;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx SMBus Controller
pci:8086-229c-8086-7270;06-01-00;detected;bridge;isab;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series PCU
pci:8086-22a3-8086-7270;01-06-01;works;storage;ahci;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series SATA Controller
pci:8086-22b1-8086-7270;03-00-00;detected;graphics card;vgapci;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Integrated Graphics Controller
pci:8086-22b5-8086-7270;0c-03-30;detected;usb controller;xhci;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series USB xHCI Controller
pci:8086-22c8-8086-7270;06-04-00;detected;bridge;pcib;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series PCI Express Port
pci:8086-22ca-8086-7270;06-04-00;detected;bridge;pcib;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series PCI Express Port
pci:8086-22cc-8086-7270;06-04-00;detected;bridge;pcib;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series PCI Express Port
pci:8086-22ce-8086-7270;06-04-00;works;bridge;pcib;Intel Corporation;Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series PCI Express Port
usb:1d6b-0003;09-00-00;detected;hub;uhub;BSD;XHCI root HUB
0;;detected;;;;
bios:coreboot-v4-12-0-8-10-25-2022;;works;bios;;coreboot;BIOS v4.12.0.8 10/25/2022
board:protectli-fw4b-1-0;;works;motherboard;;Protectli;Motherboard FW4B 1.0
cpu:intel-6-76-4-celeron-j3160;;works;cpu;;Intel;Celeron CPU J3160 @ 1.60GHz
cpu:intel-6-76-4-celeron-j3160;;works;cpu;;Intel;Celeron CPU J3160 @ 1.60GHz
cpu:intel-6-76-4-celeron-j3160;;works;cpu;;Intel;Celeron CPU J3160 @ 1.60GHz
cpu:intel-6-76-4-celeron-j3160;;works;cpu;;Intel;Celeron CPU J3160 @ 1.60GHz
ide:transcend-ts32gmts400-serial-1d60f49bfd09be47dc06ee30ef0346ff;;works;disk;ada, ahcich;Transcend;TS32GMTS400 32GB
mem:ram-module-4gb-1600-serial-aaec96992031920b10ead184d63872b5;;works;memory;;;RAM Module 4GB 1600MT/s

I updated both boxes to OPNsense 24.7.10_2 and unfortunately the problem I described above is back again. At this point, I am blaming it on the new Intel driver changes in the kernel :)

@Grossartig Are you willing to test a kernel with a few igb(4) reverts to get to the bottom of it?


Thanks,
Franco