OPNsense Forum

English Forums => 25.7 Series => Topic started by: OPNenthu on August 04, 2025, 08:35:26 PM

Title: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: OPNenthu on August 04, 2025, 08:35:26 PM
I came across this mailing list thread while searching online about FreeBSD instabilities with N100, as many have been reporting upgrade issues.  I'm not sure if this is related to the problematic microcode updates.

https://lists.freebsd.org/archives/freebsd-current/2025-January/006984.html

ChatGPT (for what it's worth) describes the issues like this:

Quote:
2. PCID / Cache Corruption Bug

    The N100 has a known CPU erratum: INVLPG instruction with PCID enabled fails to flush TLB entries, causing data corruption on UFS file systems (sometimes panics or inode mangling) [ref] (https://lists.freebsd.org/archives/freebsd-current/2025-January/006984.html)

    The workaround: add

    vm.pmap.pcid_enabled=0 

    to loader.conf, ideally tested in production. Users report stability regained after disabling PCID [ref] (https://forums.freebsd.org/threads/unable-to-start-x-on-12th-gen-alder-lake-n100-with-failed-to-load-module-i915kms-while-kldstat-shows-loaded.92162/)

3. UFS Filesystem Instability

    Severe issues such as inode corruption, filesystem panics, or UFS failure have been seen repeatedly when PCID remains enabled and UFS is used [ref] (https://lists.freebsd.org/archives/freebsd-current/2025-January/006984.html)

    ZFS appears to avoid these issues entirely.

Quote:
⚠️ Why Might You Want to Disable It?

Some CPUs (including Intel N100/Alder Lake-N) exhibit hardware bugs when PCID is used. Specifically:

    A known CPU erratum causes INVLPG (used to invalidate specific TLB entries) to fail when PCID is active.

    This can result in stale or corrupted memory mappings, leading to:

        Filesystem corruption (especially UFS)

        Kernel panics

        Data loss

        Subtle stability problems

Disabling PCID (vm.pmap.pcid_enabled=0) avoids using the broken logic path.
🧪 Who Should Set It?

If you're using:

    Intel N100 or other Alder Lake-N CPUs

    UFS as a filesystem

    FreeBSD 13.x or 14.x

👉 You should absolutely set vm.pmap.pcid_enabled=0 to ensure stability.

Seemed a little concerning and I thought I'd bring it up here for more technical insight.

I'm not affected personally as I don't have an N100 at this time.
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: MoeJoe on August 04, 2025, 09:51:55 PM
I've had an N100 running in my workshop for approx. 1 1/2 years without any issues, using UFS on a local 128 GB NVMe. So I can't confirm this issue.
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: marka2k on August 05, 2025, 01:50:05 AM
I had similar issues (data corruption) when installing 25.7. Installing the os-cpu-microcode-intel plugin resolved the issue; it was not installed when I experienced the errors. I hooked up a monitor and noticed cluster errors during boot after the upgrade (boot loop). I reverted to the previous version, installed the microcode plugin, and then upgraded. It has been running for a week now with several reboots and no issues.
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: BrandyWine on August 05, 2025, 07:24:47 AM
Maybe just use ZFS?
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: franco on August 05, 2025, 07:36:12 AM
Since the issue is sort of elusive on the CPU level, chances are this affects stability in other ways than ZFS in particular (or any FS generally), so I think the recommendation for the tunable is something to consider for all relevant installs:

vm.pmap.pcid_enabled=0

I've also come to believe that moving away from our previous defaults hw.ibrs_disable=0 and vm.pmap.pti=1 back to FreeBSD's defaults (1 and 0 respectively) may cause some of the currently seen instabilities. Feel free to double check by setting these again on 25.7 and up:

hw.ibrs_disable=0
vm.pmap.pti=1

A number of people complained about OPNsense being slower than FreeBSD which was because of these security settings. From the looks of it now it has traded stability for speed on the lower end Intel CPUs for the most part.

https://docs.opnsense.org/troubleshooting/hardening.html#spectre-and-meltdown
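
To double check what a box is actually running, something along these lines could compare the live values with the ones above. This is only a sketch: the function name `check_tunables` and the expected-value list are made up for illustration, and it assumes input in the `name: value` format that `sysctl` prints.

```shell
#!/bin/sh
# Sketch: compare "name: value" lines (as printed by sysctl) against the
# values suggested in this thread. check_tunables is a hypothetical helper.
check_tunables() {
    # Expected values taken from the post above.
    expected="vm.pmap.pcid_enabled=0 hw.ibrs_disable=0 vm.pmap.pti=1"
    input=$(cat)
    status=0
    for pair in $expected; do
        name=${pair%%=*}
        want=${pair#*=}
        # Pick the value for this exact tunable name from the input.
        have=$(printf '%s\n' "$input" | awk -F': ' -v n="$name" '$1 == n { print $2 }')
        if [ "$have" = "$want" ]; then
            echo "ok $name=$have"
        else
            echo "MISMATCH $name: have '${have}', want '${want}'"
            status=1
        fi
    done
    return $status
}

# Usage on the firewall itself (needs FreeBSD's sysctl):
#   sysctl vm.pmap.pcid_enabled hw.ibrs_disable vm.pmap.pti | check_tunables
```

The function returns non-zero if any of the three values differs, so it can also be used in a script.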


Cheers,
Franco
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: OPNenthu on August 05, 2025, 08:57:25 PM
Quote from: franco on August 05, 2025, 07:36:12 AM
hw.ibrs_disable=0
vm.pmap.pti=1

I have an observation, though unrelated to the main topic.

These two tunables you mention appear in the OPNsense UI and are co-located in the default sorting (convenient for screenshotting) but they don't appear to have any values or defaults:

tunables.png

Corresponding 'sysctl':

root@firewall:~ # sysctl -a | grep -E 'vm.pmap.pcid_enabled|vm.pmap.pti|hw.ibrs_disable'
vm.pmap.pti: 0
vm.pmap.pcid_enabled: 0
hw.ibrs_disable: 1

I trust the sysctl output, just wondering why the OPNsense tunables list is that way?  I get that the tunables list isn't complete and the system may support additional ones not listed in OPNsense, but if they're listed they should have values, I think (?).

----

Regarding Meltdown/Spectre, looks like a big can of worms trying to determine which particular CPU is or isn't vulnerable as some models vary depending even on the particular stepping. Uff.

Since the important one for data corruption is already disabled as recommended, I'm going to leave these alone.  I'm not seeing any issues with performance or stability right now and I've kept up with UEFI and microcode updates.  Probably these OS mitigations are more critical on virtualized environments where other things can be running, but I could be wrong (and wish to be corrected, as always).
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: franco on August 05, 2025, 09:20:04 PM
Quote from: OPNenthu on August 05, 2025, 08:57:25 PM
root@firewall:~ # sysctl -a | grep -E 'vm.pmap.pcid_enabled|vm.pmap.pti|hw.ibrs_disable'
vm.pmap.pti: 0
vm.pmap.pcid_enabled: 0
hw.ibrs_disable: 1

I trust the sysctl output, just wondering why the OPNsense tunables list is that way?  I get that the tunables list isn't complete and the system may support additional ones not listed in OPNsense, but if they're listed they should have values, I think (?).

We removed the explicit tunables from the config.xml, but we also removed the default values for them in order to go back to FreeBSD defaults. If you have these tunables in config.xml but not set to "0" or "1" (meaning they are currently the empty string ""), they will use the system default now since we don't provide another default.  Looking at your data, that is the expected output on 25.7 when nothing else was specified.
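
Put differently, the behaviour boils down to a simple fallback. The sketch below is purely illustrative: `effective_value` is a made-up helper, not how OPNsense implements it, and the default of 0 for vm.pmap.pti is simply the value reported in this thread.

```shell
#!/bin/sh
# Illustration only: effective_value is a hypothetical helper, not OPNsense code.
# $1 = value stored for the tunable in config.xml (may be the empty string)
# $2 = system default for that tunable
effective_value() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"   # an explicit "0" or "1" wins
    else
        printf '%s\n' "$2"   # empty string: fall back to the system default
    fi
}

# vm.pmap.pti left empty in config.xml, system default 0
effective_value "" 0
# vm.pmap.pti explicitly set to 1 in config.xml, system default 0
effective_value 1 0
```

The first call yields the system default, the second the explicitly configured value.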


Cheers,
Franco
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: OPNenthu on August 05, 2025, 09:35:55 PM
Clear now, thanks!
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: marka2k on August 06, 2025, 01:27:20 AM
Quote from: BrandyWine on August 05, 2025, 07:24:47 AM
Maybe just use ZFS?

Took a chance: backed up my config, did a fresh install using ZFS, and restored. Running fine. Thank you for pointing me in that direction; it was easier than I thought.
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: cibomato on August 06, 2025, 12:18:59 PM
Hi Franco,

Quote from: franco on August 05, 2025, 07:36:12 AM
Since the issue is sort of elusive on the CPU level, chances are this affects stability in other ways than ZFS in particular (or any FS generally), so I think the recommendation for the tunable is something to consider for all relevant installs:

vm.pmap.pcid_enabled=0

I've also come to believe that moving away from our previous defaults hw.ibrs_disable=0 and vm.pmap.pti=1 back to FreeBSD's defaults (1 and 0 respectively) may cause some of the currently seen instabilities. Feel free to double check by setting these again on 25.7 and up:

hw.ibrs_disable=0
vm.pmap.pti=1


Sorry for hijacking this thread, I don't know how to delete this post...
I've added vm.pmap.pcid_enabled=0 and corrected vm.pmap.pti=1 (was 0), but it still won't upgrade to 25.7.1_1.
I also tried to install the intel-microcode-plugin (which I hadn't installed yet) but it claims that it'd need upgrade to 25.7.1_1 first, which doesn't work...
Trying to upgrade fails with:

Checking integrity...Assertion failed: (strcmp(uid, p->uid) != 0), function pkg_conflicts_check_local_path, file pkg_jobs_conflicts.c, line 315.
Child process pid=62294 terminated abnormally: Abort trap
Starting web GUI...done.
***DONE***

Any more ideas?

Thanks and best regards,
Jochen

Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: franco on August 06, 2025, 12:44:49 PM
Drop to the console and do

# pkg install os-cpu-microcode-intel

and reboot to activate...

# opnsense-shell reboot

Then try the update again.


Cheers,
Franco
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: cibomato on August 07, 2025, 11:33:53 AM
Installed the microcode plugin and configured it both ways in parallel (/boot/loader.conf and /etc/rc.conf), but still no upgrade possible. The UI said the microcode plugin was misconfigured, so I removed it.
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: franco on August 07, 2025, 12:19:12 PM
opnsense-bootstrap would be the last resort before a clean reinstall, but not knowing what is wrong there's not much more guidance to give and things would just continue to deteriorate for unknown reasons (like hardware failures).


Cheers,
Franco
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: davidfi01 on August 11, 2025, 03:23:46 PM
FWIW, I migrated to a new made-in-China N200 box with six Intel I226 NICs for 25.7.  Using the default migration, I was having multiple random shutdowns per day. Per Franco, I reverted the tunables above to:

sysctl -a | grep -E 'vm.pmap.pcid_enabled|vm.pmap.pti|hw.ibrs_disable'

vm.pmap.pti: 1
vm.pmap.pcid_enabled: 0
hw.ibrs_disable: 0

and the box has been up and running without problems for 4 days now.

Don't want to take a victory lap just yet. Will post back once I hit the 1 week mark.  Also note that I use cron to reboot the box every night.

D
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: lmester on August 12, 2025, 06:41:42 AM
Quote from: OPNenthu on August 04, 2025, 08:35:26 PM
I came across this mailing list thread while searching online about FreeBSD instabilities with N100, as many have been reporting upgrade issues.  I'm not sure if this is related to the problematic microcode updates.

https://lists.freebsd.org/archives/freebsd-current/2025-January/006984.html

Thank you!

I have an N100 system. I recently upgraded to 25.7. The system crashed during the first boot after the upgrade. I saw that there were file system errors during the boot. After a re-install it appeared to be running OK. While reading the forum to try and solve some other problems I had during the upgrade I found this thread. I have now added vm.pmap.pcid_enabled=0 to the tunables. Even though it seems to be running fine I assume that there could still have been some file system corruption. Do you think I should re-install 25.7? If so, how would I do this so that the vm.pmap.pcid_enabled=0 setting is in place before the first boot? Sorry this may be simple but I'm not very good with Linux.
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: OPNenthu on August 12, 2025, 12:25:58 PM
Quote from: lmester on August 12, 2025, 06:41:42 AM
Even though it seems to be running fine I assume that there could still have been some file system corruption. Do you think I should re-install 25.7? If so, how would I do this so that the vm.pmap.pcid_enabled=0 setting is in place before the first boot?

There's a chance that your disk has gone bad, which is something I see often on the forums.  Try to install the 'os-smart' plugin and run a S.M.A.R.T check to see about your disk health.  That plugin provides a simple status widget that you can add to the Lobby screen as well.  Probably not worth reinstalling if it's running well now and passing the Health audit (System->Firmware->Status->Run an audit->Health).

As for tunables during installation, you can set them temporarily from the boot menu:

https://forum.opnsense.org/index.php?topic=47494.msg239887#msg239887

You'll need console access: serial or VGA.
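
As a rough sketch of what that looks like (assuming the standard FreeBSD boot menu, where the option to escape to the loader prompt is shown on screen), you'd enter something like this at the loader prompt. It only applies to that one boot, so the value still needs to be made persistent afterwards.

```
set vm.pmap.pcid_enabled=0
boot
```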
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: lmester on August 12, 2025, 03:46:44 PM
Quote from: OPNenthu on August 12, 2025, 12:25:58 PM
There's a chance that your disk has gone bad, which is something I see often on the forums.  Try to install the 'os-smart' plugin and run a S.M.A.R.T check to see about your disk health.  That plugin provides a simple status widget that you can add to the Lobby screen as well.  Probably not worth reinstalling if it's running well now and passing the Health audit (System->Firmware->Status->Run an audit->Health).

As for tunables during installation, you can set them temporarily from the boot menu:

https://forum.opnsense.org/index.php?topic=47494.msg239887#msg239887

You'll need console access: serial or VGA.

I think the system is corrupted. I'm seeing errors in the health audit. Can't install os-smart. Getting errors when I try to install updates.




############### Health audit errors ##############

***GOT REQUEST TO AUDIT HEALTH***
Currently running OPNsense 25.7 (amd64) at Tue Aug 12 09:20:05 EDT 2025
>>> Root file system: /dev/gpt/rootfs
>>> Check installed kernel version
Version 25.7 is correct.
>>> Check for missing or altered kernel files
No problems detected.
>>> Check installed base version
Version 25.7 is correct.
>>> Check for missing or altered base files
No problems detected.
>>> Check installed repositories
OPNsense (Priority: 11)
>>> Check installed plugins
os-nut 1.9
>>> Check locked packages
No locks found.
>>> Check for missing package dependencies
Checking all packages: .......... done
>>> Check for missing or altered package files
Checking all packages: ....
nspr-4.37: checksum mismatch for /usr/local/bin/nspr-config
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_aix32.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_aix64.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_darwin.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_freebsd.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_hpux32.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_hpux64.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_linux.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_netbsd.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_nto.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_openbsd.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_qnx.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_riscos.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_solaris.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/md/_win95.cfg
nspr-4.37: checksum mismatch for /usr/local/include/nspr/pratom.h
nspr-4.37: checksum mismatch for /usr/local/include/nspr/prinit.h
nspr-4.37: checksum mismatch for /usr/local/lib/libnspr4.a
nspr-4.37: checksum mismatch for /usr/local/lib/libnspr4.so
nspr-4.37: checksum mismatch for /usr/local/lib/libplc4.so
nspr-4.37: checksum mismatch for /usr/local/lib/libplds4.so
nspr-4.37: checksum mismatch for /usr/local/libdata/pkgconfig/nspr.pc
nspr-4.37: missing file /usr/local/share/licenses/nspr-4.37/LICENSE
nspr-4.37: missing file /usr/local/share/licenses/nspr-4.37/MPL20
nspr-4.37: missing file /usr/local/share/licenses/nspr-4.37/catalog.mk
Checking all packages.....
py311-certifi-2025.7.14: missing file /usr/local/lib/python3.11/site-packages/certifi-2025.7.14.dist-info/LICENSE
py311-certifi-2025.7.14: missing file /usr/local/lib/python3.11/site-packages/certifi-2025.7.14.dist-info/METADATA
py311-certifi-2025.7.14: missing file /usr/local/lib/python3.11/site-packages/certifi-2025.7.14.dist-info/RECORD
py311-certifi-2025.7.14: missing file /usr/local/lib/python3.11/site-packages/certifi-2025.7.14.dist-info/WHEEL
py311-certifi-2025.7.14: missing file /usr/local/lib/python3.11/site-packages/certifi-2025.7.14.dist-info/top_level.txt
py311-certifi-2025.7.14: checksum mismatch for /usr/local/lib/python3.11/site-packages/certifi/__init__.py
py311-certifi-2025.7.14: checksum mismatch for /usr/local/lib/python3.11/site-packages/certifi/__main__.py
py311-certifi-2025.7.14: checksum mismatch for /usr/local/lib/python3.11/site-packages/certifi/__pycache__/__init__.cpython-311.opt-1.pyc
py311-certifi-2025.7.14: checksum mismatch for /usr/local/lib/python3.11/site-packages/certifi/__pycache__/__init__.cpython-311.pyc
py311-certifi-2025.7.14: checksum mismatch for /usr/local/lib/python3.11/site-packages/certifi/__pycache__/__main__.cpython-311.opt-1.pyc
py311-certifi-2025.7.14: checksum mismatch for /usr/local/lib/python3.11/site-packages/certifi/__pycache__/__main__.cpython-311.pyc
py311-certifi-2025.7.14: checksum mismatch for /usr/local/lib/python3.11/site-packages/certifi/__pycache__/core.cpython-311.opt-1.pyc
py311-certifi-2025.7.14: checksum mismatch for /usr/local/lib/python3.11/site-packages/certifi/__pycache__/core.cpython-311.pyc
py311-certifi-2025.7.14: checksum mismatch for /usr/local/lib/python3.11/site-packages/certifi/cacert.pem
py311-certifi-2025.7.14: checksum mismatch for /usr/local/lib/python3.11/site-packages/certifi/core.py
py311-certifi-2025.7.14: missing file /usr/local/share/licenses/py311-certifi-2025.7.14/LICENSE
py311-certifi-2025.7.14: missing file /usr/local/share/licenses/py311-certifi-2025.7.14/MPL20
py311-certifi-2025.7.14: missing file /usr/local/share/licenses/py311-certifi-2025.7.14/catalog.mk
Checking all packages.....
py311-typing-extensions-4.14.1: checksum mismatch for /usr/local/lib/python3.11/site-packages/__pycache__/typing_extensions.cpython-311.opt-1.pyc
py311-typing-extensions-4.14.1: checksum mismatch for /usr/local/lib/python3.11/site-packages/__pycache__/typing_extensions.cpython-311.pyc
py311-typing-extensions-4.14.1: missing file /usr/local/lib/python3.11/site-packages/typing_extensions-4.14.1.dist-info/METADATA
py311-typing-extensions-4.14.1: missing file /usr/local/lib/python3.11/site-packages/typing_extensions-4.14.1.dist-info/RECORD
py311-typing-extensions-4.14.1: missing file /usr/local/lib/python3.11/site-packages/typing_extensions-4.14.1.dist-info/WHEEL
py311-typing-extensions-4.14.1: missing file /usr/local/lib/python3.11/site-packages/typing_extensions-4.14.1.dist-info/licenses/LICENSE
py311-typing-extensions-4.14.1: checksum mismatch for /usr/local/lib/python3.11/site-packages/typing_extensions.py
Checking all packages..... done
>>> Check for core packages consistency
Core package "opnsense" at 25.7 has 68 dependencies to check.
Checking packages: .......................
opnsense-25.7 version mismatch, expected 25.7.1_1
Checking packages: ...........................
py311-duckdb-1.3.1_1 version mismatch, expected 1.3.2
Checking packages: ..............
sudo-1.9.17p1 version mismatch, expected 1.9.17p2
Checking packages: ..
syslog-ng-4.8.2_3 version mismatch, expected 4.8.2_4
Checking packages: ... done
***DONE***


#########  os-smart install fails #########

***GOT REQUEST TO INSTALL***
Currently running OPNsense 25.7 (amd64) at Tue Aug 12 09:24:10 EDT 2025
Installation out of date. The update to opnsense-25.7.1_1 is required.
***DONE***

######## Firmware update fails ###########

***GOT REQUEST TO UPDATE***
Currently running OPNsense 25.7 (amd64) at Tue Aug 12 00:54:33 EDT 2025
Updating OPNsense repository catalogue...
OPNsense repository is up to date.
All repositories are up to date.
Updating OPNsense repository catalogue...
OPNsense repository is up to date.
All repositories are up to date.
Checking for upgrades (11 candidates): .......... done
Processing candidates (11 candidates): .......... done
The following 11 package(s) will be affected (of 0 checked):

Installed packages to be UPGRADED:
boost-libs: 1.88.0_1 -> 1.88.0_2
curl: 8.14.1 -> 8.15.0
ivykis: 0.43.2 -> 0.43.2_1
jq: 1.8.0 -> 1.8.1
libucl: 0.9.2_1 -> 0.9.2_2
nss: 3.113.1_1 -> 3.114
opnsense: 25.7 -> 25.7.1_1
py311-duckdb: 1.3.1_1 -> 1.3.2
sudo: 1.9.17p1 -> 1.9.17p2
syslog-ng: 4.8.2_3 -> 4.8.2_4
webp: 1.5.0 -> 1.6.0

Number of packages to be upgraded: 11

36 MiB to be downloaded.
[1/11] Fetching boost-libs-1.88.0_2.pkg: .......... done
[2/11] Fetching nss-3.114.pkg: .......... done
[3/11] Fetching jq-1.8.1.pkg: .......... done
[4/11] Fetching syslog-ng-4.8.2_4.pkg: .......... done
[5/11] Fetching webp-1.6.0.pkg: .......... done
[6/11] Fetching ivykis-0.43.2_1.pkg: .......... done
[7/11] Fetching curl-8.15.0.pkg: .......... done
[8/11] Fetching libucl-0.9.2_2.pkg: .......... done
[9/11] Fetching opnsense-25.7.1_1.pkg: .......... done
[10/11] Fetching py311-duckdb-1.3.2.pkg: .......... done
[11/11] Fetching sudo-1.9.17p2.pkg: .......... done
Checking integrity...Assertion failed: (strcmp(uid, p->uid) != 0), function pkg_conflicts_check_local_path, file pkg_jobs_conflicts.c, line 315.
Child process pid=26045 terminated abnormally: Abort trap
Starting web GUI...done.
***DONE***



Quote:
As for tunables during installation, you can set them temporarily from the boot menu:

https://forum.opnsense.org/index.php?topic=47494.msg239887#msg239887

I assume that I'll need to do this for all future updates unless the developers change the default for this tunable. Is this correct? This will make future updates painful :-(



Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: franco on August 12, 2025, 03:49:33 PM
Adding a persistent value to System: Settings: Tunables is all you need. This principle has not changed and if someone already had persistent values the update would not have changed them either.


Cheers,
Franco
Title: Re: Intel Alder Lake / N100 instability in FreeBSD and data corruption with UFS
Post by: Lucid1010 on August 14, 2025, 03:16:58 PM
sysctl -a | grep -E 'vm.pmap.pcid_enabled|vm.pmap.pti|hw.ibrs_disable'
vm.pmap.pti: 1
vm.pmap.pcid_enabled: 0
hw.ibrs_disable: 0


pkg clean -a

I still encounter an error when trying to update. Is it possible to upgrade without doing a clean install and restore config backup?

```
Checking integrity...Assertion failed: (strcmp(uid, p->uid) != 0), function pkg_conflicts_check_local_path, file pkg_jobs_conflicts.c, line 315.
Child process pid=37800 terminated abnormally: Abort trap
Starting web GUI...done.
***DONE***
```