Messages - johnw

#1
19.7 Legacy Series / Unresponsive every 2 weeks, fixed
January 02, 2020, 04:38:01 PM
This problem started fairly recently, following an upgrade. I've resolved it (with a workaround rather than a proper fix), so I'm sharing it with the community.

I'm running 19.7.8-amd64 on a Decisio appliance.

Recently it started becoming unresponsive every 2 weeks or so unless we proactively rebooted it. The firewall would fail to respond to all direct connections to it, including DNS, HTTPS, SSH and ping, but would continue to pass traffic between networks as normal (so long as no DNS lookup was required). This pointed to a resource leak as the root cause.

After one such failure, /var/log/system.log contained a lot of messages like these:

kernel: swap_page_getswapspace(): failed
kernel: swap_page_getswapspace(): failed
kernel: swap_page_getswapspace(): failed
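
For anyone who wants to check their own log for the same symptom, here is a minimal Python sketch that counts these kernel messages. The function name is hypothetical, and matching on the substrings "getswapspace" and "failed" is an assumption based only on the lines quoted above; the exact wording may differ between releases.

```python
def count_swap_failures(log_lines):
    """Count kernel log lines that report a failed swap-space allocation.

    Assumption: a matching line mentions "kernel", "getswapspace" and
    "failed", as in the excerpt quoted in this post.
    """
    return sum(
        1
        for line in log_lines
        if "kernel" in line and "getswapspace" in line and "failed" in line
    )


def count_swap_failures_in_file(path="/var/log/system.log"):
    """Read the log at `path` (the file named in this post) and count matches."""
    with open(path, errors="replace") as handle:
        return count_swap_failures(handle)
```

Calling count_swap_failures_in_file() on the firewall gives a single number that can be compared from day to day.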


Clearly the OS was running out of memory. Further monitoring of memory and processes in Reporting > Health > System showed a fairly constant growth in the number of processes, as well as a steady increase in network latency, also in Reporting > Health > System.

Running top from the console showed a lot of 'pinger' processes running under the 'squid' user account. I stopped the squid service and killed off all the 'pinger' processes, and normal, reliable service appears to have resumed.
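
To keep an eye on whether the 'pinger' processes pile up again, a small script can count them instead of reading top by eye. This is only a sketch: the function names are hypothetical, and it assumes the system's ps accepts `-axo user,comm` and prints one header line followed by one "user command" pair per process.

```python
import subprocess
from collections import Counter


def count_processes(ps_output):
    """Return a Counter keyed by (user, command) from `ps -axo user,comm` text."""
    counts = Counter()
    # Skip the header line, then split each row into user and command.
    for line in ps_output.splitlines()[1:]:
        parts = line.split(None, 1)
        if len(parts) == 2:
            counts[(parts[0], parts[1].strip())] += 1
    return counts


def pinger_count(ps_output, user="squid", command="pinger"):
    """Number of `command` processes owned by `user` (0 if there are none)."""
    return count_processes(ps_output)[(user, command)]


def read_ps_output():
    """Run ps and return its text output (assumes ps supports these options)."""
    result = subprocess.run(
        ["ps", "-axo", "user,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Running pinger_count(read_ps_output()) from cron and logging the result would show whether the count keeps growing between reboots.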