Messages - felipe0123

#1
26.1 Series / Re: 26.1.3 and Intel X710 (ixl0)
March 29, 2026, 03:29:30 AM
Protectli hasn't been very helpful so far. They state they have sold more than 4000 VP2440 units and that I'm the first person to report the issue, and they haven't replied to my follow-up.

Since the script fixed the issue for me, I'm adding it here in case someone else faces the same issue: https://github.com/galmeida/opnsense-protectli-vp2440
#2
26.1 Series / Re: 26.1.3 and Intel X710 (ixl0)
March 26, 2026, 04:14:31 AM
I will try to flip traffic to x710 again to make some tests and report back here.

UPDATE:
1. Flipped traffic to the X710 and pushed 1.5Gb/s of traffic with iperf; it crashed pretty quickly, in under 3 minutes, likely under 2.
2. I noticed that pciconf -lce pci0:1:0:0 reported ASPM as enabled, despite aspm being set to 0 with sysctl.
3. Forced ASPM off using pciconf.
4. Pushed 3.5Gb/s of sustained traffic for around 10 minutes, no issue so far.

I'm optimistic that this may be the issue. I'm running coreboot and it doesn't have an option to disable ASPM at that level; maybe that can be done when running the AMI BIOS, in which case no pciconf workaround would be needed. I will contact Protectli support.
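
A minimal sketch of steps 2 and 3, for anyone who wants to check their own device. It is not necessarily what was run here or what any published script does. It assumes that `pciconf -lc` prints the link line in the form `ASPM <current>(<supported>)`, and that ASPM control is bits 1:0 of the PCIe Link Control register, which sits 0x10 past the PCI-Express capability offset. The function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of pciconf.

# Read `pciconf -lc <device>` output on stdin and print the current ASPM
# state (e.g. "disabled", "L1", "L0s/L1").
# Assumes the link line contains "ASPM <current>(<supported>)".
aspm_state() {
    sed -n 's/.*ASPM \([^(]*\)(.*/\1/p'
}

# Read `pciconf -lc <device>` output on stdin and print the offset of the
# PCI-Express capability, e.g. "a0" from "cap 10[a0] = PCI-Express ...".
pcie_cap_offset() {
    sed -n 's/.*cap 10\[\([0-9a-fA-F]*\)] = PCI-Express.*/\1/p'
}

# Take a 16-bit Link Control value in hex without prefix (the form printed
# by `pciconf -r -h`) and print it with the ASPM control bits (1:0) cleared.
clear_aspm_bits() {
    printf '%04x\n' $(( 0x$1 & ~0x3 ))
}

# Possible use on FreeBSD (assumption: pci0:1:0:0 is the NIC in question):
#   pciconf -lc pci0:1:0:0 | aspm_state
#   off=$(pciconf -lc pci0:1:0:0 | pcie_cap_offset)
#   reg=$(printf '0x%x' $(( 0x$off + 0x10 )))   # Link Control register
#   new=$(clear_aspm_bits "$(pciconf -r -h pci0:1:0:0 "$reg")")
#   pciconf -w -h pci0:1:0:0 "$reg" "0x$new"
```

Writing to PCI configuration space does not survive a reboot, so a change like this would have to be reapplied at every boot.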



#3
26.1 Series / Re: 26.1.3 and Intel X710 (ixl0)
March 25, 2026, 03:23:40 AM
@OPNenthu

Quote from: OPNenthu on March 06, 2026, 07:17:25 AMThis must be the VP2440. Did you recently install coreboot v0.9.1-rc3 (the one that fixes the i226-v ASPM issue)?  I wonder if that firmware maybe introduced a new issue.

FYI, since I started experiencing the issue with the X710, I moved my LAN connection to the I226. Once I did that I started to experience a different issue: from time to time one of the RX queues would just stop processing traffic. I tried forcing a single queue, and the issue happened with the single queue as well. So I decided to apply coreboot v0.9.1-rc3.

It's too early to be completely sure, but once I applied the update, Ierrs on both igc0 and igc1 dropped to zero (igc0, the WAN, used to show Ierrs > 0 almost immediately after boot). As a test I had two internal hosts concurrently running speedtest-cli in a loop, and Ierrs on WAN is still zero. LAN is at 103, but on deeper inspection that seems to be due to queue saturation, probably because I'm still running with a single queue. I will revert that once I'm more confident the coreboot update helped.

Although this is not related to the original X710 issue I posted about, I'm posting it here so people facing issues with the Protectli VP2440 and its I226 interfaces are aware that the update helps. Before the update I had hw.pci.enable_aspm set to zero, but that workaround clearly wasn't enough.
#4
26.1 Series / Re: 26.1.3 and Intel X710 (ixl0)
March 09, 2026, 03:00:28 PM
Quote from: lechterpolntrien on March 07, 2026, 05:18:06 AMI've been struggling with intermittent instability on my VP2440 since I got it, but I'm still on 25.7. I currently have a 25 day uptime, which is the longest uptime I've had since I got it - I'm sure it will fall over tonight now that I've thought about it.


I was considering a 26.1 upgrade to help with these problems...

Out of curiosity, what kind of issues do you see? VP2440 was never really stable for me until the last 25 release + all NIC firmware updates. 26 made it unstable again.
#5
26.1 Series / Re: 26.1.3 and Intel X710 (ixl0)
March 06, 2026, 04:16:46 PM
It's the VP2440, but I did not update coreboot (ever) or NIC firmware (in the last few months). Other than some RX errors on the I226 connected to the ONT, it has been running smoothly for a long while. The issue started immediately after I upgraded to OPNsense 26.1.3. (And of course I forgot to take a ZFS snapshot before upgrading and after moving to the new firewall rules interface.)
#6
26.1 Series / Re: 26.1.3 and Intel X710 (ixl0)
March 06, 2026, 07:02:10 AM
Thanks, I've seen that. It doesn't seem relevant: that looks like an issue with virtualized servers, which I'm not using. I'm running on a Protectli.
#7
26.1 Series / Re: 26.1.3 and Intel X710 (ixl0)
March 05, 2026, 07:15:26 PM
> Are you using other ports on the card?

device has 2x x710 + 2x i226. Only one of each was in use.


> what do you have set under "Interfaces: Settings -> Network Interfaces"

"disable ... offload" checkboxes all checked. vlan hardware filtering set to disabled too.
#8
26.1 Series / [SOLVED] 26.1.3 and Intel X710 (ixl0)
March 05, 2026, 05:54:19 PM
After upgrading to 26.1.3 I'm experiencing issues with the Intel X710 interface. I wonder if anyone has seen something similar or has any advice.

Less than 1 minute after upgrading to 26.1.3, OPNsense lost all connectivity. It's as if the network stack crashes completely: no traffic arrives at any of its interfaces (I226 and X710). Rebooting fixes it for 1 minute or so. About half the time it requires power cycling, as the soft reboot won't complete the shutdown.

Console shows the following message a few times during shutdown:

ixl0: ixl_del_hw_filters: i40e_aq_remove_macvlan status I40E_ERR_ADMIN_QUEUE_FULL, error OK


That made me suspect the ixl0 interface, so I unplugged it and the issue went away immediately. Has anyone experienced something similar? Any advice?

[1] ixl0: <Intel(R) Ethernet Controller X710 for 10GbE SFP+ - 2.3.3-k> mem 0x82000000-0x827fffff,0x83000000-0x83007fff at device 0.0 on pci1
[1] ixl0: fw 9.156.79020 api 1.15 nvm 9.56 etid 800100fb oem 0.0.0
[1] ixl0: PF-ID[0]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
[1] ixl0: Using 1024 TX descriptors and 1024 RX descriptors
[1] ixl0: Using 4 RX queues 4 TX queues
[1] ixl0: Using MSI-X interrupts with 5 vectors
[1] ixl0: Ethernet address: 64:62:66:xx:xx:xx
[1] ixl0: Allocating 4 queues for PF LAN VSI; 4 queues active
[1] ixl0: PCI Express Bus: Speed 8.0GT/s Width x4
[1] ixl0: SR-IOV ready
[1] ixl0: netmap queues/slots: TX 4/1024, RX 4/1024


UPDATE: solved, see https://forum.opnsense.org/index.php?topic=51174.msg263902#msg263902

 
#9
Issue: Lots of RX errors / missed packets when the NIC is connected to Arris ONT Calix 711GE ONT; connecting it to a switch or computer gives no errors. Swapping igc0/igc1 makes no difference. So many packets are missed that it frequently looks like DNS issues, due to missed SYN-ACKs and long delays connecting to websites.


# sysctl dev.igc.1.mac_stats.missed_packets
dev.igc.1.mac_stats.missed_packets: 15467

# netstat -I igc1 | head -n 2
Name    Mtu Network         Address                                           Ipkts  Ierrs  Idrop    Opkts  Oerrs   Coll
igc1   1500 <Link#4>        64:62:66:XX:XX:XX                               1298933  15597      0   629757      0      0
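
To put counters like these in proportion, here is a small sketch that prints input errors per 100 received packets. It assumes the default `netstat -I <ifname>` column layout shown above, with Ipkts in column 5 and Ierrs in column 6 on the `<Link#N>` row. `ierr_rate` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -I <ifname>` output on stdin and print
# "<ifname> <Ierrs as a percentage of Ipkts>" for the link-layer row only.
# Assumes column 3 is "<Link#N>", column 5 is Ipkts and column 6 is Ierrs.
ierr_rate() {
    awk '$3 ~ /^<Link#/ { printf "%s %.2f\n", $1, ($5 > 0 ? 100 * $6 / $5 : 0) }'
}

# Possible use:
#   netstat -I igc1 | ierr_rate
```

For the sample above this works out to roughly 1.2 input errors per 100 received packets.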



Device info:

# dmesg | grep '\[1\] igc'
[1] igc0: <Intel(R) Ethernet Controller I226-V> mem 0x80400000-0x804fffff,0x80600000-0x80603fff at device 0.0 on pci2
[1] igc0: EEPROM V2.17-0 eTrack 0x80000303
[1] igc0: Using 1024 TX descriptors and 1024 RX descriptors
[1] igc0: Using 4 RX queues 4 TX queues
[1] igc0: Using MSI-X interrupts with 5 vectors
[1] igc0: Ethernet address: 64:62:66:XX:XX:XX
[1] igc0: netmap queues/slots: TX 4/1024, RX 4/1024
[1] igc1: <Intel(R) Ethernet Controller I226-V> mem 0x80700000-0x807fffff,0x80900000-0x80903fff at device 0.0 on pci4
[1] igc1: EEPROM V2.17-0 eTrack 0x80000303
[1] igc1: Using 1024 TX descriptors and 1024 RX descriptors
[1] igc1: Using 4 RX queues 4 TX queues
[1] igc1: Using MSI-X interrupts with 5 vectors
[1] igc1: Ethernet address: 64:62:66:XX:XX:XY
[1] igc1: netmap queues/slots: TX 4/1024, RX 4/1024

# pciconf -lV | grep igc
igc0@pci0:2:0:0:    class=0x020000 rev=0x04 hdr=0x00 vendor=0x8086 device=0x125c subvendor=0x8086 subdevice=0x0000
igc1@pci0:4:0:0:    class=0x020000 rev=0x04 hdr=0x00 vendor=0x8086 device=0x125c subvendor=0x8086 subdevice=0x0000

You will notice RSS is enabled; I just enabled it for testing. It made no difference.


UPDATE: Inspired by another user's post, I picked up an old EdgeRouter, configured it as a dumb switch and put it between OPNsense and the ONT. RX errors dropped to zero.
UPDATE2: Updated one of the interfaces from V2.17-0 to V2.32-0 (2M). It did not fix the issue. Moved the connection back to the 2.17 interface, with the dumb switch.
UPDATE3: After a couple of days, family complained about transient connection issues. I can't see whether there are RX errors on the dumb switch interface, and there are none on the OPNsense interface, but there are lots of duplicate ACKs just like before.
UPDATE4: It turned out the issues were caused by ASPM. I'm running OPNsense on a Protectli VP2440 with coreboot, and the interfaces would still have ASPM enabled even with aspm set to 0 in sysctl. Upgrading coreboot to 0.9.1-rc3 fixed all issues and the device is very stable now.