OPNsense Forum

Archive => 17.7 Legacy Series => Topic started by: Julien on September 01, 2017, 11:38:19 am

Title: OPNsense hardware keeps crashing
Post by: Julien on September 01, 2017, 11:38:19 am
Dear all,
Starting 2 months ago, one of our hardware firewalls keeps crashing; every two weeks we have to call our datacenter to reboot the firewall manually (power cycle).
We recently got new hardware, which ran fine for just over 12 days, and then the same issue happened again.
Can someone please advise what the cause could be?
The system log is:
Code:
Sep 1 09:35:03
configd.py: [5e5e4268-ed4c-4507-aa7e-eed305844a82] request pfctl byte/packet counters
Sep 1 09:34:57
configd.py: [97b717e7-5bd3-4342-9ffd-709d5b350a1a] request pfctl byte/packet counters
Sep 1 09:34:51
configd.py: [56c05ec9-3d3e-40b1-8d97-d3d4335d34df] request pfctl byte/packet counters
Sep 1 09:34:45
configd.py: [531f7824-d83a-4d15-a0ef-e2d4b8b5718e] request pfctl byte/packet counters
Sep 1 09:34:39
configd.py: [543d6d0b-ffd4-469a-9200-1dc4db383a8a] request pfctl byte/packet counters
Sep 1 09:20:32
kernel:
Sep 1 09:20:30
kernel: OK
Sep 1 09:20:30
sshlockout[56078]: sshlockout/webConfigurator v3.0 starting up
Sep 1 09:20:30
configd.py: [9c8aeb7f-7a72-498b-bb3a-014b7b443107] restarting cron
Sep 1 09:20:25
kernel: done.
Sep 1 09:20:23
root: /etc/rc.d/hostid: WARNING: hostid: unable to figure out a UUID from DMI data, generating a new one
Sep 1 09:20:23
configd.py: generate template container OPNsense/Syslog
Sep 1 09:20:22
kernel: done.
Sep 1 09:20:22
configd.py: [13006258-a01b-4c8e-90d6-a16da3e21f86] generate template OPNsense/Syslog
Sep 1 09:20:19
configd.py: generate template container OPNsense/WebGui
Sep 1 09:20:19
configd.py: generate template container OPNsense/Syslog
Sep 1 09:20:19
configd.py: generate template container OPNsense/Sample/sub2
Sep 1 09:20:19
configd.py: generate template container OPNsense/Sample/sub1
Sep 1 09:20:19
configd.py: generate template container OPNsense/Sample
Sep 1 09:20:17
configd.py: generate template container OPNsense/Proxy
Sep 1 09:20:16
configd.py: generate template container OPNsense/Netflow
Sep 1 09:20:16
configd.py: generate template container OPNsense/Macros
Sep 1 09:20:15
configd.py: generate template container OPNsense/IPFW
Sep 1 09:20:14
configd.py: generate template container OPNsense/IDS
Sep 1 09:20:12
configd.py: generate template container OPNsense/HAProxy
Sep 1 09:20:12
configd.py: generate template container OPNsense/Cron
Sep 1 09:20:11
configd.py: generate template container OPNsense/Captiveportal
Sep 1 09:20:11
configd.py: generate template container OPNsense/Auth
Sep 1 09:20:10
kernel: .done.
Sep 1 09:20:10
configd.py: [5a922f60-441d-4902-8680-1aae8fed6d6c] generate template *
Sep 1 09:20:09
kernel: done.
Sep 1 09:20:09
UNKNOWN[72110]: Process 68024 died: No such process; trying to remove PID file. (/var/run/radvd.pid)
Sep 1 09:20:08
configd.py: [4f4ff1ba-eaea-40b0-8030-d61ec3012427] Linkup starting igb0
Sep 1 09:20:08
kernel: igb0: link state changed to UP
Sep 1 09:20:08
configd.py: [1ea78395-0c46-4aa4-a5fa-40d12d5d1af7] Linkup starting igb1
Sep 1 09:20:08
kernel: igb1: link state changed to UP
Sep 1 09:20:08
kernel:
Sep 1 09:20:06
kernel: done.
Sep 1 09:20:06
kernel: done.
Sep 1 09:20:06
opnsense: /usr/local/etc/rc.bootup: ROUTING: setting IPv4 default route to 5.200.3.3
Sep 1 09:20:06
lighttpd[55240]: (log.c.217) server started
Sep 1 09:20:06
configd.py: generate template container OPNsense/WebGui
Sep 1 09:20:05
configd.py: [23526e5a-f89b-430f-849b-fa8d5df35f34] generate template OPNsense/WebGui
Sep 1 09:20:05
kernel: done.
Sep 1 09:20:05
opnsense: /usr/local/etc/rc.bootup: Adding static route for monitor 8.8.8.8 via 5.200.3.3
Sep 1 09:20:05
kernel: .done.
Sep 1 09:20:05
opnsense: /usr/local/etc/rc.bootup: Removing static route for monitor 8.8.8.8 via 5.200.3.3
Sep 1 09:20:05
kernel: ...
Sep 1 09:20:04
kernel: pflog0: promiscuous mode enabled
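
The log above is listed newest first, with each timestamp on its own line and the message on the next. Below is a minimal Python sketch, assuming that two-line layout, English month abbreviations, and the year 2017 (the log lines carry no year), that puts the entries into chronological order and reports the longest silent gap. The names `parse_entries` and `largest_gap` are made up for illustration and are not part of OPNsense. Note that this excerpt only begins at the reboot, so the crash itself would show up only in a longer export that reaches further back.

```python
from datetime import datetime

# Three entries copied from the log above, in the same newest-first layout.
SAMPLE = """\
Sep 1 09:34:39
configd.py: [543d6d0b-ffd4-469a-9200-1dc4db383a8a] request pfctl byte/packet counters
Sep 1 09:20:32
kernel:
Sep 1 09:20:30
kernel: OK
"""


def parse_entries(text, year=2017):
    """Pair each timestamp line with the message line after it, oldest first."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    entries = []
    stamp = None
    for line in lines:
        try:
            # The year is an assumption; it is prepended so parsing is unambiguous.
            stamp = datetime.strptime(f"{year} {line}", "%Y %b %d %H:%M:%S")
        except ValueError:
            # Not a timestamp, so treat it as the message for the pending timestamp.
            if stamp is not None:
                entries.append((stamp, line))
                stamp = None
    # Reverse first so entries sharing a second keep their real order
    # after the stable sort.
    entries.reverse()
    entries.sort(key=lambda entry: entry[0])
    return entries


def largest_gap(entries):
    """Return (gap, earlier_entry, later_entry) for the longest silence, or None."""
    pairs = list(zip(entries, entries[1:]))
    if not pairs:
        return None
    earlier, later = max(pairs, key=lambda pair: pair[1][0] - pair[0][0])
    return later[0] - earlier[0], earlier, later


if __name__ == "__main__":
    ordered = parse_entries(SAMPLE)
    for stamp, message in ordered:
        print(stamp.isoformat(sep=" "), message)
    print("longest gap:", largest_gap(ordered)[0])
```

On a log that covers the hours before a crash, a long gap that ends at the first boot messages is a rough indication of when the machine stopped logging.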
Title: Re: OPNsense hardware keeps crashing
Post by: Julien on September 01, 2017, 01:04:34 pm
I hope someone can advise why this keeps happening.
The same hardware that was crashing previously now has pfSense on it and is running fine without issues.
The hardware is Supermicro.

Thank you
Title: Re: OPNsense hardware keeps crashing
Post by: phoenix on September 01, 2017, 02:31:37 pm
I'd guess that some details of the hardware specification (inc RAM & NICs) and the version of OPNsense wouldn't go amiss.
Title: Re: OPNsense hardware keeps crashing
Post by: Julien on September 02, 2017, 12:29:42 am
Hi Bill,
The RAM is 4 GB and the CPU is an Intel(R) Pentium(R) CPU N3700 @ 1.60GHz (4 cores).
The version is OPNsense 17.7.1_2-amd64.
igb0             0c:c4:7a:7f:83:c4 Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k
igb1             0c:c4:7a:7f:83:c5 Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k
igb2             0c:c4:7a:7f:83:c6 Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k
igb3             0c:c4:7a:7f:83:c7 Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k


Title: Re: OPNsense hardware keeps crashing
Post by: Julien on September 02, 2017, 10:06:18 pm
I hope someone here can give me a clue as to whether it's a hardware or a software issue.
Today we had to drive 180 km to reboot the firewall because the datacentre staff do not work in the afternoon.
When we arrived we could not access the firewall on the LAN or ping it.
The USB ports were not responding to USB devices either.
The only solution was to reboot the firewall to get things working again.
The only thing we noticed is that the time was wrong, off by 2 hours.
Thank you
Title: Re: OPNsense hardware keeps crashing
Post by: bartjsmit on September 03, 2017, 09:49:05 am
Hi Julien,

If you have more than one identical firewall under change control, I would think that the chances of a software or configuration issue affecting just one are minimal.

My first instinct would be to replace the faulty unit wholesale, which from the hardware description would not cost significantly more than you driving around for hours.

Either bin the old unit or boot up Stress Linux https://www.stresslinux.org/sl/

If that finds any issues, it will also make a good acceptance test for new units from your supplier.

Bart...
Title: Re: OPNsense hardware keeps crashing
Post by: JDtheHutt on September 03, 2017, 10:15:28 am
Hi there. I recently moved to 17.7 having used pfSense for ages. I am also using Supermicro hardware, specifically the X10SBA. It ran fine under pfSense, but under OPNsense it is seeing constant crashes and reboots. They seem erratic: sometimes during boot, sometimes just after it, sometimes after running for hours to a day or two. Eventually the repeated crashes seem to result in corruption of the system, and it then won't even boot. I have tried a fresh install with a basic config but there was no change. Reinstalling pfSense sees no such issue occurring.

http://www.supermicro.com/products/motherboard/celeron/x10/x10sba.cfm

Unfortunately, BT also performed as stellar a service as ever and cancelled my internet a week earlier than they were supposed to, before my new provider takes over. I have no means to access the net other than through my mobile, but I don't get any signal at home and have to walk at least one or two streets away to get online. I will post some error logs as soon as I have net again. I was trying to write out the relevant error log warnings manually, but at the moment I can't get it to stay online long enough to even log in and get them.
Title: Re: OPNsense hardware keeps crashing
Post by: gothbert on September 03, 2017, 11:28:01 am
Hi,

I use a Supermicro X11SBA-LN4F with a 500 GB SSD. Absolutely stable.

Power supply is a Meanwell 12V 60W desktop power supply. Case is a 1 HU supermicro chassis.

Did you check the cooling/case temperature?

Kind regards
Boris

Title: Re: OPNsense hardware keeps crashing
Post by: Julien on September 03, 2017, 07:22:45 pm
This had been working fine for over 4 months until we updated to 17.7.
New, different hardware has been ordered; hopefully we won't have an issue with this one.
I've read somewhere that it could be an issue with the Intel 1000 PCI NIC?
And that the boot.con needs to be adjusted somehow?
Bartje: this is the second unit (similar to the first one) that has crashed too. I am sure it's not a hardware issue but a driver-related issue. With pfSense it is now working in the lab.

We have ordered new hardware and hopefully it is the right one this time:
CPU: Intel® i5-3317U, dual core, 4 threads (1.8GHz)
Chipset: Intel® HM65 Express Chipset
Memory: SO-DIMM DDR3, 1333MHz, 8 GB
HDD/SSD: Samsung EVO 850 120 GB SSD
Ethernet: 6 x Intel® 82583V Gigabit Ethernet

Please advise on the hardware.
Title: Re: OPNsense hardware keeps crashing
Post by: JDtheHutt on September 03, 2017, 08:03:50 pm
Quote
Hi,

I use a Supermicro X11SBA-LN4F with a 500 GB SSD. Absolutely stable.

Power supply is a Meanwell 12V 60W desktop power supply. Case is a 1 HU supermicro chassis.

Did you check the cooling/case temperature?

Kind regards
Boris

The X10SBA is fanless and doesn't have any temperature issues. Same as the other poster, I am stable with the latest pfsense. As soon as I don't need to leave my home to get net I will post the error logs.  I'd like to be using OPNsense instead, so want to find a solution for this.
Title: Re: OPNsense hardware keeps crashing
Post by: JDtheHutt on September 05, 2017, 02:57:17 am
Seems I found my issue. I'm on 17.7 still and just found this thread which matches my errors and behaviour of my system. Hadn't seen that thread till just now, the pain of navigating via mobile standing in the street to get signal! I really should change provider.

https://forum.opnsense.org/index.php?topic=5697.0
Title: Re: OPNsense hardware keeps crashing
Post by: Julien on October 09, 2017, 02:55:42 am
We have new hardware and everything is working fine;
the issue was hardware related.
Title: Re: OPNsense hardware keeps crashing
Post by: bringha on October 09, 2017, 09:31:32 am
Hi all,

There is a thread on the pfSense forum, running for two years now, about the X11SBA-LN4F mainboard:

https://forum.pfsense.org/index.php?topic=98230.0

Some background is also described there (see the posts from user 'engineer').

I was affected by this too and saw precisely the same issue
(see also here: https://forum.opnsense.org/index.php?topic=5063.0).

After a few weeks of stable operation, I got a watchdog timer reset on at least one NIC (mostly WAN) and then the entire machine crashed. There was no debug output; with luck, a watchdog reset message was SOMETIMES thrown to the console (only visible in debug mode).

I RMA'd the board to Supermicro (it was HW Rev. 1.1) and got a new one with HW Rev. 1.2 and a BIOS update to 1.0c.

Since then, the entire machine has been absolutely stable; I have an uptime of 180 days by now and no crashes at all.

I have since bought a second board, same revision, same BIOS, and had the same experience ....

So it could be worthwhile to check which HW revision and which BIOS version you have (assuming from your description that you are running this board, tbc). Perhaps an RMA/update might also help you.

Br br
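
The revision check suggested above can be sketched in Python. The known-good values (HW Rev. 1.2, BIOS 1.0c) are simply the ones reported as stable in this thread, not an official Supermicro statement. The sketch assumes the FreeBSD `kenv` utility and its `smbios.planar.version` and `smbios.bios.version` variables are available and populated on the box; if they are not, read the values from the BIOS setup screen or the board silkscreen and pass them to `check` by hand. All function names are hypothetical.

```python
import re
import shutil
import subprocess

# Values reported as stable in this thread (an assumption, not a vendor statement).
KNOWN_GOOD = {"hw_rev": "1.2", "bios": "1.0c"}


def version_key(version):
    """Turn '1.0c' into a sortable key; leading zeros are ignored ('1.02' == '1.2')."""
    tokens = re.findall(r"\d+|[a-z]+", version.lower())
    # Tag numbers and letters differently so they are never compared directly.
    return [(0, int(token)) if token.isdigit() else (1, token) for token in tokens]


def is_at_least(found, wanted):
    """True if the found version is the wanted version or newer."""
    return version_key(found) >= version_key(wanted)


def check(hw_rev, bios):
    """Compare a board revision and BIOS version against the thread's values."""
    return {
        "hw_rev_ok": is_at_least(hw_rev, KNOWN_GOOD["hw_rev"]),
        "bios_ok": is_at_least(bios, KNOWN_GOOD["bios"]),
    }


def read_kenv(name):
    """Read one kernel environment variable via kenv, or None if unavailable."""
    if shutil.which("kenv") is None:
        return None
    result = subprocess.run(["kenv", name], capture_output=True, text=True)
    return result.stdout.strip() or None


if __name__ == "__main__":
    hw_rev = read_kenv("smbios.planar.version")
    bios = read_kenv("smbios.bios.version")
    if hw_rev is None or bios is None:
        print("Could not read SMBIOS values; check the BIOS setup screen instead.")
    else:
        print(hw_rev, bios, check(hw_rev, bios))
```

For example, `check("1.1", "1.0b")` flags both values as older than the ones that were stable here, which would match the board that was sent back for RMA.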