
19.7 Legacy Series / Unresponsive every 2 weeks, fixed

« on: January 02, 2020, 04:38:01 pm »

This is a problem that started fairly recently following an upgrade. I've resolved it (with a workaround, not a fix as such), so I'm sharing it with the community.

I'm running 19.7.8-amd64 on a Decisio appliance.

Recently it started becoming unresponsive every 2 weeks or so unless we proactively rebooted it. The firewall would fail to respond to all direct connections to it, including DNS, HTTPS, SSH and ping, but would continue to pass traffic between networks as normal (so long as no DNS lookup was required). That pointed to a resource leak as the root cause.

Looking at /var/log/system.log following such a failure event there were a lot of messages:

kernel: swap_page_getswapspace(): failed
kernel: swap_page_getswapspace(): failed
kernel: swap_page_getswapspace(): failed
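To see when these messages start piling up, the log can be tallied per day. This is a minimal sketch, not part of the original troubleshooting: the function name `count_swap_failures` is hypothetical, the pattern accepts both the spelling quoted above and a `swap_pager_` spelling, and it assumes each log line begins with a classic syslog timestamp such as "Jan  2 03:10:01".

```python
import re
from collections import Counter

# Accepts the message as quoted in this post and a "swap_pager_" variant (assumption).
SWAP_FAIL = re.compile(r"swap_pager?_getswapspace\(\d*\): failed")


def count_swap_failures(lines):
    """Return a Counter mapping 'Mon DD' to the number of swap failure messages."""
    per_day = Counter()
    for line in lines:
        hits = len(SWAP_FAIL.findall(line))
        if hits:
            # Assumed line format: "Mon DD HH:MM:SS host kernel: ..."
            day = " ".join(line.split()[:2])
            per_day[day] += hits
    return per_day
```

On the firewall this could be used as `count_swap_failures(open("/var/log/system.log", errors="replace"))`; a count that climbs in the days before each lock-up would be consistent with a slow leak.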

Clearly the OS was running out of memory. Further monitoring of memory and processes in Reporting > Health > System showed a fairly constant growth in the number of processes, as well as a steady increase in network latency.

Running top from the console showed a lot of 'pinger' processes running under the 'squid' user account. I stopped the squid service and killed off all the 'pinger' processes, and normal, reliable service appears to have resumed.
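For anyone wanting to watch for the same build-up without sitting in top, here is a rough sketch that counts processes by owner and command name. The helper names `count_processes` and `snapshot` are hypothetical, and it assumes `ps -axo user,pid,comm` prints a header row followed by one process per line with the bare command name in the last column.

```python
import subprocess


def count_processes(ps_output, user, command):
    """Count lines of ps output whose owner and command name both match."""
    count = 0
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)  # user, pid, command
        if len(fields) == 3 and fields[0] == user and fields[2].strip() == command:
            count += 1
    return count


def snapshot():
    """Return the current process list as text (assumed ps column layout)."""
    result = subprocess.run(
        ["ps", "-axo", "user,pid,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `count_processes(snapshot(), "squid", "pinger")` periodically and logging the result would show whether the number of 'pinger' processes keeps growing between reboots.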

Messages - johnw
