[Bf-committers] developer.blender.org maintenance/outage

Dan McGrath danmcgrath.ca at gmail.com
Wed Jun 19 13:35:16 CEST 2019


Hi,

It would appear that the speedup that we had from the previous tweaks,
which were great, also caused the crawling effect on the site to amplify.
This, however, has the negative effect of bringing back an old TODO/BUG in
the MySQL setup that causes it to slowly bleed memory into swap (despite
having `--memlock`). It is unknown to me why this happens, but just recently
it caused that server to put itself into swap again.

I had to reboot the server and tweak some settings for MySQL to try to curb
its usage a bit, so we will see how it goes. I would like to disable swap
entirely, but I fear that the system will end up killing some critical
service in an OOM killer fashion while nobody is around. In the meantime,
we poked the higher powers about some much needed RAM for this server! If
you can't make SQL behave, then make its working set fit in RAM, damnit! :D
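
As a rough illustration of how a swap bleed like this could be watched per
process, here is a minimal sketch. It assumes a Linux host, where
/proc/<pid>/status exposes a VmSwap field. The helper names and the sample
text are made up for illustration; this is not what runs on the server.

```python
import re


def vm_swap_kib(status_text):
    """Return the VmSwap value (in kB as reported) from status text, or None."""
    # /proc/<pid>/status reports swap usage as e.g. "VmSwap:    1048576 kB".
    match = re.search(r"^VmSwap:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vm_swap_kib(pid):
    """Read VmSwap for a live process (Linux only; pid is supplied by the caller)."""
    with open(f"/proc/{pid}/status") as handle:
        return vm_swap_kib(handle.read())


# Hypothetical sample in the shape of /proc/<pid>/status, not real server data.
SAMPLE = "Name:\tmysqld\nVmRSS:\t 8123456 kB\nVmSwap:\t 1048576 kB\n"

if __name__ == "__main__":
    print(vm_swap_kib(SAMPLE))
```

Logging that number for the MySQL process at intervals would show whether
the tweaked settings actually slow the growth.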


Cheers,

Dan

On Tue, Jun 18, 2019 at 11:04 PM Dan McGrath <danmcgrath.ca at gmail.com>
wrote:

> Hi,
>
> Just giving you all an update on the issue from earlier today on the
> developer.blender.org slow downs and outages.
>
> First of all, the reports of our assimilation into the B.ORG collective
> have been greatly exaggerated! :D As I am not one for writing big fancy
> professional reports, I will try to keep it short and to the point.
>
> Yesterday, Phabricator started to experience slowdowns, which were hard to
> properly look into, as I was already busy prepping the night before to
> replace a server in the data center, which only slowed things down more. A
> quick look into the issue showed that the hard drives were being exhausted
> with writes. Looking into it a bit more, it seemed that when people visit
> the site, the site invokes `git log` on the commits so that they can be
> rendered and displayed to the user. The actual problem would appear to be
> that these files go to a directory on disk (synchronously?), which created
> the write IOPS starvation that we saw.
>
> As a workaround, I have changed the ZFS `sync` setting on this dataset to
> `disabled`, which appears to have relaxed the storm a bit. The directory
> these uploads go to is a double hashed directory (./AA/AA/, ./AB/AA/,
> ./AC/AA/, etc.) which totals about 64k directories (OH MY GOD....), so even
> doing a `find ./` takes 20 minutes on these systems. We can try to
> experiment with putting those files on their own dataset in ZFS, with a
> tuned recordsize and other properties, but this may not help as much as an
> SSD and more RAM.
>
> For now, I will leave the sync in the disabled state so that Phabricator
> isn't bogged down. The problem is that the server it's on, with its
> current setup, can't just have its drives replaced unless the new drives
> are exactly the same size as, or bigger than, the 2TB HDDs (without a
> reinstall), and 2TB SSDs aren't exactly ideal on that old clunker of a box!
> Worse, moving stuff off there is tricky as it also holds some of our Bacula
> storage, which has nowhere to go without moving a lot of stuff around and
> maybe adding more hard drives to Proxmox, which takes time to set up.
>
> Anyway, those are all details for me and the crew to bang out. We will try
> to keep an eye on things. Sorry for the delays in your bug reports!
>
>
> Cheers,
>
> Dan McGrath
>
> On Tue, Jun 18, 2019 at 3:35 AM Dan McGrath <danmcgrath.ca at gmail.com>
> wrote:
>
>> Hi,
>>
>> It seems that a few hours ago developer.blender.org became horribly
>> slow and unusable. While the exact cause is still to be determined, the
>> HTTP logs were tossing an excessive number of errors about unsafe strings.
>>
>> Sergey is en route to the data center for some planned maintenance
>> (replace a server), but has already queued up some git commits to help
>> address some of the issues with the PHP errors, and plans to poke at it
>> some more once we get things sorted out.
>>
>> Sorry for the inconvenience!
>>
>>
>> Cheers,
>>
>> Dan McGrath
>>
>

