OPNsense Forum

Archive => 19.7 Legacy Series => Topic started by: JdeFalconr on September 20, 2019, 03:27:49 pm

Title: System Reboots Itself!
Post by: JdeFalconr on September 20, 2019, 03:27:49 pm
Thanks in advance for your help. Newly-built system that looks to be rebooting itself randomly. I'm really not sure how to troubleshoot this. I don't see much in logs but one of the problem/crash reports comes up every time after this happens. Below is what at first looks to me like the possible cause. Help!!!

EDIT: From what I see there are some known issues with Apollo Lake-based chipsets. Do I need to do anything to get OPNsense working reliably with those, or does the current software release (19.7) already incorporate those fixes?

Code:
(KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36
FreeBSD 11.2-RELEASE-p14-HBSD  07680caafe9(stable/19.7) amd64
OPNsense 19.7.4_1 2da6de42b
Plugins os-dyndns-1.17 os-upnp-1.3
Time Fri, 20 Sep 2019 06:22:57 -0700
OpenSSL 1.0.2s  28 May 2019
PHP 7.2.22
dmesg.boot:
arp: 172.20.0.150 moved from 38:8b:59:24:9e:13 to 3c:28:6d:31:e1:7b on igb1


Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address = 0x100000000
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8300b260
stack pointer         = 0x28:0xfffffe01da5537c0
frame pointer         = 0x28:0xfffffe01da553810
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (swi4: clock (0))

My hardware:
ASRock J3455B-ITX (Intel J3455-based)
2x4GB DDR3-1600
120GB SATA SSD
PicoPSU 90W
Intel 82576-based 2x1Gb NIC

EDIT: A bit more info from a more recent crash. I just wiped my SSD, reinstalled and then restored my prior config. It crashed again but got a bit more in the logs:

Code:
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x7f00000000
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff83006260
stack pointer         = 0x28:0xfffffe01da5537c0
frame pointer         = 0x28:0xfffffe01da553810
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (swi4: clock (0))
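
For anyone comparing the two reports above, here is a minimal Python sketch that pulls the "name = value" fields out of each trap message and lists which ones match. Only a few lines from each report are reproduced, and the helper names (parse_trap, compare, ip_address) are made up for illustration; this is not an OPNsense or FreeBSD tool.

```python
import re

# Selected lines copied from the two crash reports above.
CRASH_1 = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address = 0x100000000
instruction pointer = 0x20:0xffffffff8300b260
current process = 12 (swi4: clock (0))
"""

CRASH_2 = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x7f00000000
instruction pointer = 0x20:0xffffffff83006260
current process = 12 (swi4: clock (0))
"""

# A lowercase field name, an equals sign, then the rest of the line.
FIELD = re.compile(r"^\s*([a-z ]+?)\s*=\s*(.+?)\s*$")


def parse_trap(text):
    """Return a dict of 'name = value' fields from a trap report."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def compare(a, b):
    """Return (same, different) lists of field names shared by both reports."""
    same = sorted(k for k in a if k in b and a[k] == b[k])
    different = sorted(k for k in a if k in b and a[k] != b[k])
    return same, different


def ip_address(fields):
    """Convert '0x20:0xffff...' to the integer after the segment selector."""
    return int(fields["instruction pointer"].split(":")[1], 16)


first, second = parse_trap(CRASH_1), parse_trap(CRASH_2)
same, different = compare(first, second)
print("same:", same)
print("different:", different)
print("instruction pointer delta:", hex(ip_address(first) - ip_address(second)))
```

On these excerpts the only matching field is the current process (swi4: clock). The two instruction pointers are 0x5000 apart and share the same low 12 bits (0x260). That narrows down where to look but does not by itself say whether the cause is hardware or software.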
Title: Re: System Reboots Itself!
Post by: JdeFalconr on September 21, 2019, 04:39:11 am
Anyone? ::bump::
If you need more info I'm happy to provide it. I see historically there have been several threads about "Fatal trap 12: page fault" but most seem to do with not being able to boot which is not my situation.
Title: Re: System Reboots Itself!
Post by: bartjsmit on September 21, 2019, 09:30:37 am
Have you soak tested your hardware? https://www.stresslinux.org/sl/

Bart...
Title: Re: System Reboots Itself!
Post by: JdeFalconr on September 21, 2019, 06:12:18 pm
Thanks for writing. I'd love to try using that program but I can't get my computer to recognize the drive with the bootable image on it after writing. Is that some sort of whole-system testing suite? Or alternatively would something like Prime95 accomplish what you're suggesting?
Title: Re: System Reboots Itself!
Post by: bartjsmit on September 21, 2019, 08:09:02 pm
Yes, stresslinux has a host of tools to simulate a workload on your hardware and test all components.

How did you write the image to the USB drive? The best tool is dd but some other options are discussed in the docs: https://www.stresslinux.org/sl/wiki/Documentation#HowtoimageUSBflashdrives

If you can't get stresslinux to work, you should at least test your RAM with memtest: https://www.memtest86.com/

Prime95 will only test your CPU. You are not likely to learn much about your RAM and disk.

Bart...
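
Whichever tool writes the image, it can help to confirm that the drive really holds what the image file contains before blaming the BIOS. A minimal Python sketch, assuming the image was written to the start of the drive and that the device can be read like an ordinary file (usually true on Linux with root privileges; other platforms may need sector-aligned reads). The function names and the paths in the comment are placeholders, not part of stresslinux or Memtest.

```python
import hashlib
import os


def sha256_of_prefix(path, length, chunk_size=1024 * 1024):
    """Hash the first `length` bytes of a file or raw device."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as handle:
        while remaining > 0:
            chunk = handle.read(min(chunk_size, remaining))
            if not chunk:
                break  # source is shorter than the requested length
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def image_matches_device(image_path, device_path):
    """True if the device starts with exactly the bytes of the image."""
    size = os.path.getsize(image_path)
    return sha256_of_prefix(image_path, size) == sha256_of_prefix(device_path, size)


# Example call (hypothetical paths, adjust to your system):
# image_matches_device("stresslinux.img", "/dev/sdX")
```

A mismatch points at the write step. A match means the image is on the drive and the boot problem lies elsewhere, for example in the firmware boot mode.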
Title: Re: System Reboots Itself!
Post by: JdeFalconr on September 21, 2019, 09:20:23 pm
Bart, I appreciate the help. For whatever reason I couldn't get the computer to recognize a bootable stresslinux flash drive. I used Rufus as well as Imagewriter, but to no avail. I tried the same with Memtest, and while the drive was recognized, upon selecting it for boot I was immediately taken back to the boot options screen. I'll have to fiddle with settings a bit to see if I can get this working. Alternatively, I have an extra SSD that I can try writing Memtest or StressLinux to. I'll report back here when I have something.

Also of note is that I've thus far tried just about every settings change I can find related to Apollo Lake chipsets, including setting hint.hpet.0.clock="0" in /boot/loader.conf.local. From what I'm reading, my guess is that OPNsense 19.7 supports Apollo Lake and the J3455 chipset without a bunch of contortions and workarounds...but any verification I can get would be great.
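
When several workarounds pile up, it is easy to lose track of which tunables are actually set. Below is a minimal Python sketch that parses name="value" lines in the loader.conf style and merges files in order. It assumes later files override earlier ones and it ignores anything after a "#", so it is a simplification of the real loader rules. The function names and the sample text are illustrative only, apart from the hint.hpet.0.clock line mentioned above.

```python
def parse_loader_conf(text):
    """Parse name="value" lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # simplification: drop comments
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip('"')
    return settings


def effective_settings(*texts):
    """Merge several files in order; later files win (an assumption)."""
    merged = {}
    for text in texts:
        merged.update(parse_loader_conf(text))
    return merged


# Illustrative contents; only the hpet line comes from this thread.
local_conf = 'hint.hpet.0.clock="0"\n'
print(effective_settings(local_conf).get("hint.hpet.0.clock"))
```

To use it on a real box, read /boot/loader.conf and /boot/loader.conf.local into strings and pass them in that order.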

Also to try and rule out my SSD I want to run off the live installer via flash drive for a bit and see if the crash reproduces.

Thanks for the continued help.
Title: Re: System Reboots Itself!
Post by: packet loss on September 22, 2019, 03:39:07 am
A Google search did not seem to indicate there was a fix incorporated into FreeBSD for Apollo Lake processors. I didn't search long. You might want to try some basic modifications such as the ones provided in the following link:

https://www.reddit.com/r/PFSENSE/comments/8166u3/is_apollo_lake_24_stability_okay_yet/
Title: Re: System Reboots Itself!
Post by: JdeFalconr on September 22, 2019, 04:27:43 am
Thanks azdps! Doesn't that thread itself say the bug is fixed? From there:

Quote
omber
Update for ya'll looking at this in Q3 of 2018.

This is caused by a kernel bug in FreeBSD 11.1 branch on which pfSense 2.4 branch is based.
You can run legacy pfSense 2.3 release (based on FreeBSD 10.3) without issues. Package installation is not supported, however, so things like OpenVPN Client Export are not possible.
According to FreeBSD 11.2 release notes this should be fixed. See Section 5.2 Kernel Bug Fixes https://www.freebsd.org/releases/11.2R/relnotes.html. Upcoming pfSense 2.4.4 should be based on FreeBSD 11.2.
Title: Re: System Reboots Itself!
Post by: packet loss on September 22, 2019, 08:00:11 am
Try installing OPNsense again and configuring it manually instead of importing your config. I saw a case where someone kept having boot problems after importing their configuration file, and the issue was resolved by configuring manually. Some of the FreeBSD commits don't necessarily fix all issues.

I ran into an issue where OPNsense would freeze on one of my firewall appliances. A BIOS update fixed the issue. I had another issue as well where it would not boot. The BIOS on one of my boxes had a strange operating system setting with, from what I recall, three choices: Windows, Android and something else. I had to set it to Android, otherwise OPNsense would not boot.

The information I'm providing is just a set of suggestions. Good luck.
Title: Re: System Reboots Itself!
Post by: bartjsmit on September 22, 2019, 09:58:48 am
If you can only get FreeBSD to boot, you can try the ports for the testing tools:

https://www.freshports.org/sysutils/memtest/
https://www.freshports.org/sysutils/stress/

I've not had any personal experience with installing these ports, but there will be plenty on this forum who have ;)

Bart...
Title: Re: System Reboots Itself!
Post by: JdeFalconr on September 23, 2019, 05:03:51 am
Update for you all. I ran Memtest and right out of the gate was presented with a bevy of errors. I then re-ran it testing each of the two RAM sticks individually and both came back clean. I then tested both sticks at once again (possibly swapping their positions...I don't recall) and came back again with no errors. Per the documentation for Memtest that's not entirely unexpected with dual-channel RAM. I then ran the system through Linpack and it passed all the CPU tests. I've had it up and looking quite stable for well over 24 hours now so I'm cautiously optimistic.

I've rebooted it and I'm giving the system another 24 hours to think about what it's done. If it can pass that test I want to play with removing some of the workarounds that I entered for the Apollo Lake chipset just to see what happens. Or who knows...at that point I may be so fed up with working on this I'll run some performance tests and call it good.

Big thanks to everyone for your help and advice. I don't think I would have gotten here nearly as quickly or without far more effort without your help.
Title: Re: System Reboots Itself!
Post by: sporkman on September 24, 2019, 04:54:01 am
Quote from: JdeFalconr on September 20, 2019, 03:27:49 pm
Thanks in advance for your help. Newly-built system that looks to be rebooting itself randomly. I'm really not sure how to troubleshoot this. I don't see much in logs but one of the problem/crash reports comes up every time after this happens.

Interesting. On a Core2Duo box I've had nightly panics since the last update. Always happens shortly after 3:00 a.m., which on stock FreeBSD is when a bunch of daily cron jobs run.

It's odd because some prior releases had random panics and I just assumed it was either the particular kernel + whatever patches these folks use or some combo of under-tested modules (for me, turning off IDS in one of the old releases seemed to stop the panics). Prior to that this was a pfSense box and there were no panics there, so I'm not really thinking it's hardware. I sometimes wonder if it's the HardenedBSD stuff - it's always in the logs, and in the old days of OpenBSD I remember lots of their security precautions/checks could end up causing stability issues on quirky systems they'd not tested with.

Trying to find a time when I can go without internet for the many hours memtest86 consumes. :)
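
One way to check the "shortly after 3:00 a.m." pattern before giving up a night to memtest86 is to count how many panic times fall just after the daily jobs start. Here is a minimal Python sketch, assuming the stock FreeBSD schedule where "periodic daily" starts at 03:01 (OPNsense may schedule things differently). The timestamps are made up purely for illustration and the function name is hypothetical.

```python
from datetime import datetime, time, timedelta

# Stock FreeBSD /etc/crontab starts "periodic daily" at 03:01 (assumed here).
PERIODIC_DAILY = time(3, 1)


def near_periodic_daily(timestamps, window_minutes=60):
    """Return the timestamps within window_minutes after 03:01 on the same day."""
    hits = []
    for ts in timestamps:
        start = datetime.combine(ts.date(), PERIODIC_DAILY)
        if start <= ts <= start + timedelta(minutes=window_minutes):
            hits.append(ts)
    return hits


# Hypothetical panic times, NOT taken from any real log.
panics = [
    datetime(2019, 9, 22, 3, 7),
    datetime(2019, 9, 23, 3, 12),
    datetime(2019, 9, 23, 15, 40),
]

hits = near_periodic_daily(panics)
print(f"{len(hits)} of {len(panics)} panics within an hour of periodic daily")
```

If nearly every panic lands in that window, temporarily disabling individual daily jobs is a cheaper first experiment than a full memory test.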