[Bf-committers] Seemingly hugepage-related performance issues

Jonas Wielicki j.wielicki at sotecware.net
Thu Nov 22 13:46:22 CET 2012


Hi all,

First off: I ran long tests (i.e. baking) with blender 2.59 when I
first experienced this issue, and did a short check to verify that it
is still present in blender 2.64a. For full system specs see [3].
[sidenote: I'm still using blender 2.59 because that's what my linux
distribution (fedora 16) is shipping]
I was pointed to this mailing list after asking in the devel irc
where to discuss this problem.

Description of symptoms
-----------------------

I've been baking a simulation on a rather decent PC for a few hours
now. As long as I keep memory use in the middle range of the available
physical memory, everything is fine. However, things start to go wrong
when I go to the upper range, i.e. blender using 80% or more of the
physical memory (which should be fine, as the remainder is hardly
used). In that case, other (memory-using) applications often stall
without any swapping involved.

I've observed blender using htop (it's like top, just more awesome) and
did some research on the kernel thread involved, khugepaged. When the
stalls happen, blender, the other stalling application and khugepaged
are using most of the CPU (with blender and the stalling application
using 90%--99% of each core and khugepaged totalling about 8%).
Now, using CPU isn't unusual for blender, but it's spending all of
that time (100%) in kernelspace instead of userspace, which is
obviously not desired.
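
The user/kernel split that htop displays can also be read from
/proc/<pid>/stat. The following is only a minimal Python sketch, not what
htop itself does; the helper names parse_cpu_ticks and kernel_share are
made up for this illustration. It relies on utime and stime being fields
14 and 15 of that file, as documented in proc(5), and it covers the whole
process rather than single threads.

```python
import os


def parse_cpu_ticks(stat_text):
    """Return (utime, stime) in clock ticks from /proc/<pid>/stat text."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or parentheses, so split after the last ')'.
    rest = stat_text.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]), int(rest[12])


def kernel_share(before, after):
    """Fraction of the CPU time between two samples spent in the kernel."""
    user = after[0] - before[0]
    system = after[1] - before[1]
    total = user + system
    return system / total if total else 0.0


if __name__ == "__main__" and os.path.exists("/proc/self/stat"):
    # Demo on the interpreter itself; replace "self" with blender's pid.
    with open("/proc/self/stat") as handle:
        utime, stime = parse_cpu_ticks(handle.read())
    print(f"user ticks: {utime}, kernel ticks: {stime}")
```

Taking two samples a few seconds apart and passing them to kernel_share
should give a value close to 1.0 during the stalls described above.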

khugepaged is related to a linux kernel feature called Transparent
Hugepage Support, about which more information is available here [1].
It seems to boil down to trying to keep the memory of applications that
use lots of it as contiguous as possible.

Apparently, this involves some memory compaction and moving around of
pages, which I am able to observe using

    watch "grep '^compact_' /proc/vmstat"

In particular, compact_fail and compact_pages_moved increase heavily
relative to their absolute values (the counters are explained in [1]).
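
For anyone who would rather see growth per interval than raw totals, here
is a minimal Python sketch of the same observation. It assumes only that
/proc/vmstat consists of "name value" lines; the helper names
parse_compaction_counters, counter_deltas and read_vmstat are made up for
this illustration, and which compact_* counters exist depends on the
kernel version.

```python
import os
import time


def parse_compaction_counters(vmstat_text):
    """Return {name: value} for every compact_* line of /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("compact_"):
            counters[parts[0]] = int(parts[1])
    return counters


def counter_deltas(before, after):
    """Return how much each counter grew between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}


def read_vmstat(path="/proc/vmstat"):
    """Read the current contents of /proc/vmstat."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__" and os.path.exists("/proc/vmstat"):
    # Two snapshots one second apart, then print the growth of each counter.
    first = parse_compaction_counters(read_vmstat())
    time.sleep(1.0)
    second = parse_compaction_counters(read_vmstat())
    for name, delta in sorted(counter_deltas(first, second).items()):
        print(f"{name}: +{delta} in the last second")
```

Run while a bake is in progress, large per-second deltas for compact_fail
would match what the watch command shows.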

Suggested diagnosis
-------------------

In theory, compaction should be fine and after a few minutes, everything
should even out -- the application doing heavy calculations involving
lots of memory gets its contiguous pages and can crunch the numbers happily.

However, things start to go wrong if the application alternately
releases and allocates large blocks of memory, especially if the time
between allocation and deallocation is a lot smaller than the time the
compaction needs to converge (which may be the case with a complex
smoke simulation in blender 2.5). This possibly only happens on a
desktop system that sees average use in the meantime, so the first
question is whether that case is actually of interest to the blender
project. See the message [2] for some reference that this might be
relevant.

Indicators that the diagnosis may be correct
--------------------------------------------

Now, blender does exactly that. For the 256-division smoke sim with
2 subdivisions of high-resolution noise (and some 32k emitter particles
involved), blender allocates all of the simulation memory at the
beginning of each frame and frees it again at the end of that frame.
With hugepaging, this makes blender stall for some time during the
allocation. Other applications trying to allocate larger blocks of
memory (firefox, pdf viewers) are also pulled into the vortex and
stall for some time, though often for a shorter time than blender.
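
The allocation pattern can be imitated outside of blender with a few
lines of Python. This is only a rough sketch of the pattern described
above, not blender's actual allocation code: the function name
bake_frames, the 4096-byte page size and the sizes in the usage note are
assumptions, and whether a given run triggers compaction depends on the
allocator and on how fragmented memory is at that moment.

```python
import time

PAGE_SIZE = 4096  # assumed base page size; x86_64 linux normally uses this


def bake_frames(frame_count, bytes_per_frame):
    """Allocate, touch and free one large buffer per frame.

    Returns the time in seconds that each allocate-and-touch step took.
    """
    timings = []
    for _ in range(frame_count):
        start = time.perf_counter()
        buffer = bytearray(bytes_per_frame)
        # Write one byte per page so every page is certainly backed by
        # physical memory and not just reserved address space.
        for offset in range(0, bytes_per_frame, PAGE_SIZE):
            buffer[offset] = 1
        timings.append(time.perf_counter() - start)
        # Drop the buffer before the next frame, like the per-frame free.
        del buffer
    return timings
```

Calling e.g. bake_frames(100, 6 * 1024**3) on the 8 GB machine from [3]
would roughly correspond to the 80% situation; strongly varying timings
between frames would hint at the stalls.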

Observing /proc/$pid/stack of the blender threads points to the
compaction routines in the kernel too (try_to_compact_pages is in the
callstack actually).
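
Checking every thread by hand gets tedious, so here is a minimal Python
sketch that scans all of them. It assumes the per-thread stacks are
readable under /proc/<pid>/task/<tid>/stack (this usually needs root);
the helper names stack_mentions and threads_in_compaction are made up for
this illustration.

```python
import glob


def stack_mentions(stack_text, symbol="try_to_compact_pages"):
    """True if any frame of a kernel stack dump names the given symbol."""
    return any(symbol in line for line in stack_text.splitlines())


def threads_in_compaction(pid):
    """Return the thread ids of pid whose kernel stack shows compaction."""
    hits = []
    for path in glob.glob(f"/proc/{pid}/task/*/stack"):
        try:
            with open(path) as handle:
                text = handle.read()
        except OSError:
            # Reading usually needs root; threads can also exit mid-scan.
            continue
        if stack_mentions(text):
            # Path layout: /proc/<pid>/task/<tid>/stack
            hits.append(int(path.split("/")[4]))
    return sorted(hits)
```

Called with blender's pid during a stall, a non-empty result would
indicate that those threads are waiting inside the compaction code.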

The specific behaviour of stalling at the start of a frame is _not_
observed when turning off transparent hugepage support (echo never >
/sys/kernel/mm/transparent_hugepage/enabled before starting blender),
_but_ the system starts swapping, possibly because no contiguous memory
is available for blender.
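
When comparing runs it helps to record which mode was active. The sysfs
file lists all modes and puts the active one in square brackets, e.g.
"[always] madvise never". A minimal Python sketch for reading it follows;
the helper name active_thp_mode is made up for this illustration.

```python
import os
import re

THP_ENABLED = "/sys/kernel/mm/transparent_hugepage/enabled"


def active_thp_mode(enabled_text):
    """Return the bracketed (active) mode, or None if none is marked."""
    match = re.search(r"\[(\w+)\]", enabled_text)
    return match.group(1) if match else None


if __name__ == "__main__" and os.path.exists(THP_ENABLED):
    with open(THP_ENABLED) as handle:
        print("transparent hugepages:", active_thp_mode(handle.read()))
```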


Because this is, as far as I can tell, expected behaviour in the linux
kernel (inferred from the discussion of the patch at [2]; the patch
itself is afaik not related to the problem, but the discussion is
enlightening about the purpose and the effects of hugepaging), I decided
to go ahead and report this to blender, as it seems this could be fixed
by changing the memory use behaviour of blender.

I'm not sure what further information I can share with you. If you need
any additional information, please just ask. I tried to limit myself to
a description of the symptoms and a diagnosis inferred from what I've
learnt about hugepaging in the last few days.

best regards & looking forward to your replies,
Jonas

   [1]: http://www.mjmwired.net/kernel/Documentation/vm/transhuge.txt
   [2]: http://article.gmane.org/gmane.linux.kernel.mm/70032
   [3]: System specification (possibly relevant parts):
        blender: 2.59, 2.63a, 2.64a
        linux: 3.6.6-1.fc16.x86_64
        graphics (hopefully not relevant): nvidia proprietary 304.60
        memory (for reference): 8 GB

