OPNsense Forum

Archive => 20.1 Legacy Series => Topic started by: adiz0r on March 22, 2020, 11:42:04 AM

Title: Recurring random crashes with 20.1.x on a PC Engines APU2
Post by: adiz0r on March 22, 2020, 11:42:04 AM
Hello all,

I have an annoying problem with my router: it randomly crashes from time to time. Sometimes it can go for almost a month without problems, sometimes it reboots after a couple of days. All 20.1.x versions showed this behaviour.

There is no information at all in the local log files. I set up remote logging to another APU2 running Linux, but to no avail: nothing usable showed up there, either. Interestingly, the bootup kernel messages were not logged on the loghost, either, but I might need to tune something for that.

I don't think it's caused by overheating, as the CPU temp graphs show constant temperatures between 60-62°C. Of course it could still be another hardware problem. There's a 16GB mSATA card installed, the internet uplink is provided by a USB LTE stick in PPP mode and all 3 igbX interfaces are in use. The BIOS is fairly recent, 4.11.0.2.

Another somewhat disturbing thing is that these crashes are not visible in lastlog. Perhaps it's a BSD thingie (I'm mostly used to Linux and Solaris from my jobs and am fairly new to BSDs). The last crash happened this morning at around 8:10 CET - between the 2 topmost lines in last's output.


root       pts/0    A.B.C.D           Sun Mar 22 10:20   still logged in
root       pts/0    A.B.C.D           Sat Mar 21 22:22 - 22:24  (00:01)
root       pts/0    A.B.C.D           Sat Mar 21 06:45 - 06:51  (00:06)
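
Since the crash leaves no trace in the listing above, one way to narrow down when the box went down is a heartbeat file: a small job writes the current time to disk at regular intervals, and after a reboot the last value written gives a lower bound for the crash time. The following is only a minimal sketch under assumptions: the function names and the heartbeat path are hypothetical, and it assumes the script is run periodically (for example from cron) and once early at boot.

```python
import os
import time


def write_heartbeat(path, now=None):
    """Store the current Unix time in `path`, forcing it to disk."""
    now = time.time() if now is None else now
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        fh.write(f"{int(now)}\n")
        fh.flush()
        os.fsync(fh.fileno())  # make sure the data survives a sudden reset
    os.replace(tmp, path)      # atomic swap, so the file is never half-written


def read_heartbeat(path):
    """Return the last stored Unix time, or None if there is none yet."""
    try:
        with open(path) as fh:
            return int(fh.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def crash_window(path, boot_time):
    """Return (last_heartbeat, boot_time): the crash happened in between."""
    last = read_heartbeat(path)
    if last is None:
        return None
    return (last, int(boot_time))
```

Called at boot before the first new heartbeat is written, `crash_window("/var/db/heartbeat", time.time())` (hypothetical path) would return the interval in which the router went down. The shorter the heartbeat interval, the tighter that window.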


I know it's pretty much trying to catch a black cat in a dark room at this point, but perhaps others also experienced this. Does anyone perhaps have any ideas where I can start to look for more clues?

I also thought about setting up logging for the serial console, as I have a FreeNAS box close to the router which I use for console access with a USB-serial converter.
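
A serial console logger along those lines could be sketched as below. It only copies lines from an already opened console stream into a log file and prefixes each with a UTC timestamp, so that a panic message printed right before a reset can be matched to a time. This is a rough sketch, not a tested setup: the function names are hypothetical, the device path is an assumption (on a FreeBSD-based box a USB-serial converter typically shows up as something like /dev/cuaU0), and the port speed has to be set separately to whatever the APU2 console uses.

```python
import time


def log_console(stream, out, clock=time.time):
    """Copy lines from `stream` (bytes) to `out` (text), adding UTC timestamps.

    Returns the number of lines written once the stream reports end-of-file.
    """
    count = 0
    for raw in iter(stream.readline, b""):
        # Console output may contain stray bytes, so never fail on decoding.
        text = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(clock()))
        out.write(f"{stamp} {text}\n")
        out.flush()  # keep the log current in case the logger itself is killed
        count += 1
    return count


def run(device, logfile):
    """Open the serial device and append everything it prints to `logfile`.

    `device` and `logfile` are supplied by the caller; nothing is assumed here
    about baud rate or line settings, which must be configured beforehand.
    """
    with open(device, "rb", buffering=0) as port, open(logfile, "a") as log:
        return log_console(port, log)
```

Something like `run("/dev/cuaU0", "apu2-console.log")` would then be left running on the machine that holds the USB-serial converter; both arguments are placeholders.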

Unfortunately I cannot trigger this behaviour and I couldn't correlate it to other events, either. This morning it happened while my family was still sleeping :) , so there was basically no traffic and load on the router.

Any help is much appreciated.

Gabor
Title: Re: Recurring random crashes with 20.1.x on a PC Engines APU2
Post by: adiz0r on May 04, 2020, 11:49:57 AM
Update, for the record, in case someone else has similar problems: I flipped the 2 SSDs between my APU2C2 and APU2C4, now the latter is the router running OPNsense. It's been stable since then.

Perhaps it's an issue with the USB subsystem of the boards, since my ZTE MFE831 USB LTE modem has crashed several times since the swap. It is basically a screenless Linux/Android data-only mobile phone which can share its cellular connection via USB.

Interestingly, the previously used APU2C2 has been rock solid since then with Alpine Linux. I ran Linux kernel compilations for several days using 5-6 threads to load all cores and the memory subsystem, and it took the load without breaking a sweat.