OPNsense Forum

English Forums => Development and Code Review => Topic started by: MasterXBKC on January 10, 2018, 12:51:47 am

Title: Possible Bug, PHP-CGI Crash
Post by: MasterXBKC on January 10, 2018, 12:51:47 am
So at first I thought it was my code, or else a change that came down in PHP 7.1, but now I'm not so sure.

I've begun seeing a lot of 503 errors where the web admin becomes unavailable, and remains so until you use option 11 to restart the services.

I've also found a way to reproduce it.

With my pfmonitor check-in agent installed on the device, it runs fine if I run it from the SSH shell. But if any other process is using php or php-cgi at the same time, or if the runs come too quickly, it crashes the php-cgi background processes that the web admin uses.

To reproduce the issue, all I have to do is run my PHP script in rapid succession from SSH using either of:
php pfmonitor.checkinopn.php
or
php-cgi pfmonitor.checkinopn.php

Up+Enter a few times and the web interface dies, and the php-cgi background processes all disappear from ps aux.

All my script does is read some files and post the contents to an external URL using PHP curl. At this point I had commented out all the other functions.

Running it once works fine. Running it and then immediately again a few times, or while OPNsense itself or the web interface is also doing something, and bang, it crashes the php-cgi processes.

Like I said, I thought it was my code at first, but now I don't think so.
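
For anyone who wants to try the reproduction without hammering Up+Enter, here is a rough sketch of the rapid-succession runs described above. The variable names (SCRIPT, RUNS, PHP_BIN) and the two helper functions are made up for illustration and are not part of OPNsense or pfmonitor. It only automates the manual steps and prints how many php-cgi processes are left after each run.

```shell
#!/bin/sh
# Sketch only: SCRIPT, RUNS and PHP_BIN are illustrative names, not OPNsense tooling.
SCRIPT="${SCRIPT:-pfmonitor.checkinopn.php}"
RUNS="${RUNS:-7}"
PHP_BIN="${PHP_BIN:-php-cgi}"

# Count running php-cgi processes. The [p] bracket keeps grep from matching
# its own command line; "|| true" hides grep's non-zero exit on zero matches.
count_php_cgi() {
    ps ax | grep -c '[p]hp-cgi' || true
}

# Run the script RUNS times back to back and report the count after each run.
burst() {
    i=1
    while [ "$i" -le "$RUNS" ]; do
        # Script output and exit status are ignored; only the process count matters here.
        "$PHP_BIN" "$SCRIPT" >/dev/null 2>&1 || true
        echo "run $i: $(count_php_cgi) php-cgi processes"
        i=$((i + 1))
    done
}

burst
```

If the count drops to 0 partway through, that would match the symptom above. Setting PHP_BIN=php should exercise the other variant.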
Title: Re: Possible Bug, PHP-CGI Crash
Post by: franco on January 10, 2018, 08:33:08 am
Running too many scripts at once may starve the lighttpd children pool. Never heard of this from a user report though, but then uncontrollably executing a lot of processes can do a lot of things to a system and I hope we can avoid doing so.

Then again, none of this should matter for a CLI script that should not meddle with CGI at all?

Grasping at straws here, forgive me. :)


Cheers,
Franco
Title: Re: Possible Bug, PHP-CGI Crash
Post by: marjohn56 on January 10, 2018, 03:46:17 pm
I've seen it too. A restart of services brings it back, but as yet I cannot seem to trigger it at will. Everything continues to work, just the GUI has toddled off.
Title: Re: Possible Bug, PHP-CGI Crash
Post by: MasterXBKC on January 10, 2018, 06:45:29 pm
I can recreate it quite easily.

If I run my script with php, it only happens rarely when I'm running it manually at the shell, but if I run it 6-7 times rapidly from the shell using php-cgi, it happens almost every time.

The problem is that my script runs once every 60 seconds just to poll data, and even that seems to hose it up after 5 to 30 minutes. I think this has only happened since the jump to 17.7.11; it could have been in the prior build too, but I skipped that one on most devices.


I did find a possible reason for this, but I'm not sure. I found old documentation saying that if lighty ends up waiting for a response from PHP and does not get one back quickly enough, it can just hang, because it expects either a reply that PHP is busy or the finished results. In the absence of either, it goes out to lunch indefinitely.

I believe pfSense encountered this issue as well with their nginx/php-fpm setup. I'm not sure how they mitigated it, or if they did at all, but it doesn't seem to happen anymore.
Title: Re: Possible Bug, PHP-CGI Crash
Post by: marjohn56 on January 10, 2018, 06:50:07 pm
Hmm...

Well, having come from the dark side, I remember a lot of issues with a 502 error that used to occur when pfBlockerNG was installed. One of my testers could trigger it without pfBlockerNG too, and he came up with a fix... let me see if I can find it, it might be related.

Title: Re: Possible Bug, PHP-CGI Crash
Post by: marjohn56 on January 10, 2018, 06:57:48 pm
OK, found it, but Franco would need to take a look and see if it applies to OPNsense. I'm finding my way around, but it's a big system. :)

@franco

Take a look at this and see if it can be applied. Chris, one of my testers, did this for pf. It fixed a lot of issues and certainly fixed his.

https://github.com/marjohn56/pfsense/blob/2c131b10b25db593331048d4f2b28fbf9bf5662e/src/etc/rc.php_ini_setup
Title: Re: Possible Bug, PHP-CGI Crash
Post by: mircsicz on January 10, 2018, 07:40:26 pm
@Franco

If you wanna have access to a bunch of machines suffering from the issue just let me know... ;-)

BTW: I'm the one who triggered the whole topic, as I've just started using pfMon with some of my machines.

Greetz
Mircsicz
Title: Re: Possible Bug, PHP-CGI Crash
Post by: MasterXBKC on January 10, 2018, 07:53:21 pm
This might or might not be related, but when I recently installed a new OPNsense from ISO on VMware, I saw that during bootup it said:
Less than 512mb of ram detected, not enabling opcache,

But this machine had 4 or 8 GB of ram....
Title: Re: Possible Bug, PHP-CGI Crash
Post by: franco on January 10, 2018, 11:20:40 pm
Less is more. I think if we wrap that in configd to serialise it shouldn't happen...
Title: Re: Possible Bug, PHP-CGI Crash
Post by: MasterXBKC on January 14, 2018, 09:22:40 pm
Quote from: franco on January 10, 2018, 11:20:40 pm
Less is more. I think if we wrap that in configd to serialise it shouldn't happen...

Is this something to expect in the next release?
Title: Re: Possible Bug, PHP-CGI Crash
Post by: franco on January 15, 2018, 03:52:19 pm
It's something we need to change in your plugin.

https://docs.opnsense.org/development/backend/configd.html


Cheers,
Franco
Title: Re: Possible Bug, PHP-CGI Crash
Post by: MasterXBKC on January 15, 2018, 05:13:34 pm
Quote from: franco on January 15, 2018, 03:52:19 pm
It's something we need to change in your plugin.

https://docs.opnsense.org/development/backend/configd.html


Cheers,
Franco

This was occurring even just running it on the shell, but if you think that will fix it....
Title: Re: Possible Bug, PHP-CGI Crash
Post by: franco on January 15, 2018, 06:33:20 pm
What's the purpose of running it repeatedly other than triggering the bug? It simply needs a funnel to not waste time and system resources. A service can't be started repeatedly if it's not built to interlace on the work chunks it computes. :)


Cheers,
Franco
Title: Re: Possible Bug, PHP-CGI Crash
Post by: MasterXBKC on January 15, 2018, 08:05:47 pm
It is meant to be a cron job: it periodically sends the data in and retrieves any user commands to run.
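
One way to keep a cron-driven check-in from overlapping with itself, in the spirit of the "funnel" mentioned above, is a simple lock around the run. This is only a sketch under assumptions: LOCK_DIR and run_exclusive are hypothetical names, and it is not the configd approach Franco suggested, just a plain shell stand-in that skips a run while the previous one is still going.

```shell
#!/bin/sh
# Sketch only: LOCK_DIR and run_exclusive are hypothetical names for illustration.
LOCK_DIR="${LOCK_DIR:-/tmp/pfmonitor-checkin.lock}"

# Run the given command only if no other instance holds the lock.
# mkdir is atomic, so two overlapping runs cannot both create the directory.
run_exclusive() {
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "previous run still active, skipping"
        return 0
    fi
    status=0
    "$@" || status=$?
    # Release the lock so the next scheduled run can proceed.
    rmdir "$LOCK_DIR"
    return "$status"
}

# When invoked with arguments, treat them as the command to wrap.
if [ "$#" -gt 0 ]; then
    run_exclusive "$@"
fi
```

Saved under some name of your choosing, it would be called from cron with the real command as its arguments, e.g. php pfmonitor.checkinopn.php. A run that gets killed leaves the lock directory behind, so a real version would need stale-lock handling. FreeBSD's lockf(1) may be a better fit there.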
Title: Re: Possible Bug, PHP-CGI Crash
Post by: mircsicz on January 27, 2018, 12:40:24 pm
Hi Guys,

What's the status of this issue? I'd like to move on with... ;-)
Title: Re: Possible Bug, PHP-CGI Crash
Post by: mircsicz on February 06, 2018, 08:14:37 am
Just upgraded four of my machines to 18.1.1 and it seems the problem persists...
Title: Re: Possible Bug, PHP-CGI Crash
Post by: MasterXBKC on February 07, 2018, 03:44:05 am
Franco, I think I might have found a possible cause for this. All the symptoms seem to match up, except that this thread is using nginx instead of lighty.

https://forum.nginx.org/read.php?2,265576,265586

I wonder whether disabling opcache for the CLI, per this thread, would help:
https://stackoverflow.com/questions/21556437/disable-opcache-temporarily
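
To test that idea without touching php.ini, something like the following might do. It is a sketch only: PHP_BIN and run_without_opcache are placeholder names, and whether turning opcache off actually avoids the crash is unverified. It relies on PHP's -d switch, which sets an ini value for a single invocation, and on the opcache.enable_cli directive.

```shell
#!/bin/sh
# Sketch only: PHP_BIN and run_without_opcache are placeholder names.
PHP_BIN="${PHP_BIN:-php}"

# Run a PHP script with opcache switched off for this one CLI invocation.
# -d sets an ini value for the process; opcache.enable_cli is the CLI toggle.
run_without_opcache() {
    "$PHP_BIN" -d opcache.enable_cli=0 "$@"
}
```

Usage would be run_without_opcache pfmonitor.checkinopn.php. Note that opcache.enable_cli only covers the CLI binary. For php-cgi the relevant directive would be opcache.enable instead.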
Title: Re: Possible Bug, PHP-CGI Crash
Post by: MasterXBKC on February 08, 2018, 07:35:28 pm
See screenshots. When I run the PHP file, all of the php-cgi processes die without any explanation.

Sometimes it doesn't happen until the 4th or 5th time it runs, sometimes on the first time.
Title: Re: Possible Bug, PHP-CGI Crash
Post by: MasterXBKC on February 08, 2018, 07:40:29 pm
Here you see after running that single command, all the php-cgi processes simultaneously die.

And now the web config only gives 502 errors.
Title: Re: Possible Bug, PHP-CGI Crash
Post by: MasterXBKC on February 08, 2018, 08:35:18 pm
OK guys, so I believe I have my code fixed up, and it's not bringing down the php-cgi processes anymore. I'm not sure why it was happening, but after changing a single line that was used to end the script, it's now working.

So now all I need to do is get it built into a plugin, and it's all done.