syslog-ng crashed after 20.1.1 -> 20.1.2 upgrade

Started by xsfpo, March 06, 2020, 06:11:45 PM

I recently upgraded my OPNsense from 20.1.1 to 20.1.2 and ended up with a completely unresponsive host: no ping, no SSH connection. I found this in the log:


2020-03-06T19:29:31 syslog-ng[69376]: syslog-ng starting up; version='3.25.1'
2020-03-06T19:29:29 kernel: -> pid: 88979 ppid: 1 p_pax: 0xa50<SEGVGUARD,ASLR,NOSHLIBRANDOM,NODISALLOWMAP32BIT>
2020-03-06T19:29:29 kernel: [HBSD SEGVGUARD] [syslog-ng (88979)] Preventing execution due to repeated segfaults.
2020-03-06T19:29:29 kernel: [HBSD SEGVGUARD] [syslog-ng (88979)] Preventing execution due to repeated segfaults.
2020-03-06T19:29:28 kernel: -> pid: 73968 ppid: 88979 p_pax: 0xa50<SEGVGUARD,ASLR,NOSHLIBRANDOM,NODISALLOWMAP32BIT>
2020-03-06T19:29:28 kernel: [HBSD SEGVGUARD] [syslog-ng (73968)] Suspending execution for 600 seconds after 5 crashes.
2020-03-06T19:29:28 kernel: pid 73968 (syslog-ng), uid 0: exited on signal 6 (core dumped)
2020-03-06T19:29:27 kernel: pid 20038 (syslog-ng), uid 0: exited on signal 6 (core dumped)
2020-03-06T19:29:26 kernel: pid 70069 (syslog-ng), uid 0: exited on signal 6 (core dumped)
2020-03-06T19:29:25 kernel: pid 55160 (syslog-ng), uid 0: exited on signal 6 (core dumped)
2020-03-06T19:29:23 kernel: pid 37176 (syslog-ng), uid 0: exited on signal 6 (core dumped)


Upgrade time is around 19:28-19:29.
After several soft restarts (via the power button) and after disabling the Sensei plugin, the system works OK now.
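
For anyone who wants to check their own logs for the same pattern, here is a minimal sketch in Python. It assumes the kernel lines look exactly like the ones in the excerpt above; the regex and the function name `find_signal_exits` are my own and not part of OPNsense or syslog-ng.

```python
import re
from collections import Counter

# Kernel lines copied from the log excerpt above (newest first, as posted).
LOG = """\
2020-03-06T19:29:28 kernel: [HBSD SEGVGUARD] [syslog-ng (73968)] Suspending execution for 600 seconds after 5 crashes.
2020-03-06T19:29:28 kernel: pid 73968 (syslog-ng), uid 0: exited on signal 6 (core dumped)
2020-03-06T19:29:27 kernel: pid 20038 (syslog-ng), uid 0: exited on signal 6 (core dumped)
2020-03-06T19:29:26 kernel: pid 70069 (syslog-ng), uid 0: exited on signal 6 (core dumped)
2020-03-06T19:29:25 kernel: pid 55160 (syslog-ng), uid 0: exited on signal 6 (core dumped)
2020-03-06T19:29:23 kernel: pid 37176 (syslog-ng), uid 0: exited on signal 6 (core dumped)
"""

# Assumed line format, inferred only from the excerpt above.
EXIT_RE = re.compile(
    r"^(?P<ts>\S+) kernel: pid (?P<pid>\d+) \((?P<name>[^)]+)\), "
    r"uid \d+: exited on signal (?P<sig>\d+)"
)


def find_signal_exits(log_text):
    """Return (timestamp, pid, process name, signal) for each 'exited on signal' line."""
    exits = []
    for line in log_text.splitlines():
        m = EXIT_RE.match(line)
        if m:
            exits.append((m["ts"], int(m["pid"]), m["name"], int(m["sig"])))
    return exits


exits = find_signal_exits(LOG)

# Count how many times each process name was killed by a signal.
per_process = Counter(name for _, _, name, _ in exits)
print(per_process)
```

On the excerpt this finds five syslog-ng exits on signal 6 (SIGABRT) between 19:29:23 and 19:29:28, which lines up with the "after 5 crashes" SEGVGUARD message.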

I have this problem, along with several other severe ones. I cannot view logs, etc. I am now on the third reboot, and it is still totally broken. The only thing that works is pinging outside from Interfaces->Diagnostics. Help!

I had something similar when I upgraded from 20.1.1 this morning: I lost all internet, but it came back when I restored a saved config.

20.1.2 has been good all day since.

Mine never came back. I did a fresh install, and then restored configs. Took about an hour and a half, but everything works again. Will wait to try upgrade to 20.1.2 again. I don't have a couple of hours to waste on that.

We have had a few unsubstantiated reports; what was missing was detail on the complexity of the install in terms of plugin use, Sensei, and further customisation. So far, syslog-ng can segfault on shutdown (programming style vs. HBSD security, it seems), but that would never lock up the system. What is especially suspicious is the timing between two minor updates, 20.1.1 -> 20.1.2. Happy for further feedback.


Cheers,
Franco

Quite a basic install here, System:Firmware:Plugins shows only
os-dyndns (installed) 1.20
os-smart (installed) 2.1

All still good since my report earlier in this thread


It seems to me that the real cause of the system lock-up was Sensei, which depends on syslog-ng. When syslog-ng (for some reason) can't start -> Sensei waits for the syslog-ng daemon -> syslog-ng is still down -> Sensei keeps waiting -> infinite loop.
So if I revert syslog-ng to the previous version (3.24), will it help? And could that affect other packages which use syslog-ng?

I don't think so. The issue is present in any Syslog-ng we've had so far.

The error should not be persistent and a reboot will unlock Syslog-ng, but maybe the firmware upgrade won't finish which can be problematic. We'll see if we can add Syslog-ng to a whitelist for the HBSD guard feature to make this more reliable.


Cheers,
Franco

I wouldn't expect a syslog-ng crash to affect Sensei, but I will have a look.

We are aware of a netmap(4) race condition which sometimes causes a network interface to stop responding if an application (like Suricata/Sensei) opens that interface in netmap mode while traffic is passing through it (i.e. > 1-2 Mbps).

In this case, since the upgrade required a reboot, the above condition might have come into play.