[Bf-committers] Restarted dev/wiki.b.o and a brief login issue on wiki

Dan McGrath danmcgrath.ca at gmail.com
Sat Jun 28 23:02:02 CEST 2014


Hey,

Yet again today it appears that the FastCGI process for the wiki died.
Last week when it happened I decided to monitor the open files/sockets in
the hope of catching it. Today I was at least able to verify some resource
exhaustion (despite us having high maximums):

  http://www.pasteall.org/pic/73262

From the chart above, you can probably guess when everything ground to a
halt :)
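
For anyone wanting to do similar monitoring, here is a minimal sketch of
how open files and sockets could be sampled against their limits. It is
not the tooling that produced the chart above. It assumes the FreeBSD
sysctl names kern.openfiles, kern.maxfiles, kern.ipc.numopensockets and
kern.ipc.maxsockets; the function names are made up for illustration.

```python
import subprocess

# Assumed FreeBSD sysctl names: current usage counter -> matching limit.
LIMITS = {
    "kern.openfiles": "kern.maxfiles",
    "kern.ipc.numopensockets": "kern.ipc.maxsockets",
}


def parse_sysctl(text):
    """Parse 'name: value' lines as printed by sysctl into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            values[name.strip()] = int(value.strip())
    return values


def usage(values):
    """Return {counter: fraction of its limit in use} for the known pairs."""
    return {
        cur: values[cur] / values[lim]
        for cur, lim in LIMITS.items()
        if cur in values and values.get(lim)
    }


def sample():
    """Run sysctl once (FreeBSD only) and return the usage fractions."""
    names = list(LIMITS) + list(LIMITS.values())
    out = subprocess.run(
        ["sysctl"] + names, capture_output=True, text=True, check=True
    ).stdout
    return usage(parse_sysctl(out))
```

Calling sample() from cron or a loop and logging the result with a
timestamp would give a rough time series; a fraction creeping towards 1.0
is the kind of exhaustion described here.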

Still, it would appear that too many open files and sockets were at least
the culprit. As for the wiki login issue, it seems that when I restarted
lighttpd on the wiki jail the first time, it left some processes around.
The second restart then appeared to have mucked with wiki logins somehow,
and a third restart of the service didn't fix it, so a server reboot was
tried (which worked). Sorry to those who were editing wiki pages during
this time.

As for trying to solve the issue, I noticed that our 4k mbufs in FreeBSD
were a little close to the maximum, so I attempted to increase that limit
as well. Our services never really had to fight each other much while they
were on separate servers, but with a bunch of things now running on one
machine (wiki+phab etc.), it seems that we have reached the point where we
really need to start tuning things for the load (CPU is still only 5-10%
avg.).
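
As a rough illustration of the mbuf check, the sketch below parses the
"current/cache/total/max" lines that FreeBSD's netstat -m prints and flags
pools that are close to their maximum. The 0.8 threshold and the function
names are assumptions of mine. The limit for the 4k pool is normally the
kern.ipc.nmbjumbop tunable, though which setting was actually changed on
the server is not stated here.

```python
import re

# Matches netstat -m lines that report a maximum, for example:
#   "0/44/44/12800 4k (page size) jumbo clusters in use (current/cache/total/max)"
POOL_LINE = re.compile(
    r"^(\d+)/(\d+)/(\d+)/(\d+) (.+?) in use \(current/cache/total/max\)$"
)


def parse_pools(netstat_m_output):
    """Return {pool name: {'current', 'cache', 'total', 'max'}} as ints."""
    pools = {}
    for line in netstat_m_output.splitlines():
        match = POOL_LINE.match(line.strip())
        if match:
            current, cache, total, maximum = (int(n) for n in match.groups()[:4])
            pools[match.group(5)] = {
                "current": current,
                "cache": cache,
                "total": total,
                "max": maximum,
            }
    return pools


def near_limit(pools, threshold=0.8):
    """List pools whose total allocation is at or above threshold * max."""
    return [
        name
        for name, pool in pools.items()
        if pool["max"] and pool["total"] / pool["max"] >= threshold
    ]
```

Feeding it the text from "netstat -m" and getting back a non-empty list
from near_limit() would be the cue to look at raising the matching limit.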

Hopefully the changes today are all that is needed. Sorry (again!) for the
disruption. o/


Dan

