[Bf-committers] Blender and OpenCL

François T. francoistarlier at gmail.com
Mon Aug 30 10:16:09 CEST 2010


Here are a few math node examples, if you need some :)
http://tinyurl.com/327hgjw



2010/8/29 Vilem Novak <pildanovak at post.cz>

> Hello, maybe it would make sense to focus on the performance-heavy nodes?
> In my experience the heavy ones are the quality blurs (especially
> Defocus), UV remap, and bilateral blur.
> With these the advantage would be visible even with the bus problems.
> With regards,
> Vilem Novak
>
> > ------------ Original message ------------
> > From: Jeroen Bakker <j.bakker at atmind.nl>
> > Subject: Re: [Bf-committers] Blender and OpenCL
> > Date: 29.8.2010 11:19:15
> > ----------------------------------------
> > Hi Lukas,
> >
> > Your explanation is a good one. It had not occurred to me to write it
> > down that way. The issue with memory during compositing is the way the
> > node editor works. When you change a node value (like the degree of a
> > rotate node), only the rotate node and all nodes that depend on it are
> > recalculated. The input image is not recalculated; it is still in
> > memory. This is a good optimization at editing time, since you only
> > need to re-evaluate part of the node system, but in complex node
> > systems I think it will not work for OpenCL because of the memory it
> > needs.
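(A minimal sketch of the partial re-evaluation described above, in Python for brevity rather than Blender's C. `Node`, `mark_dirty`, `evaluate` and the `runs` counter are made-up names for illustration, not Blender API, and the "images" are just short lists.)

```python
class Node:
    """Hypothetical compositor node; not Blender's real data structure."""

    def __init__(self, name, func, inputs=()):
        self.name = name
        self.func = func            # computes the output from the input values
        self.inputs = list(inputs)
        self.users = []             # nodes that read this node's output
        self.cache = None           # result kept in memory between edits
        self.dirty = True
        self.runs = 0               # how often this node was recalculated
        for node in self.inputs:
            node.users.append(self)

    def mark_dirty(self):
        """Invalidate this node and every node that depends on it."""
        self.dirty = True
        for user in self.users:
            user.mark_dirty()

    def evaluate(self):
        """Recalculate only if invalidated; otherwise reuse the cache."""
        if self.dirty:
            args = [node.evaluate() for node in self.inputs]
            self.cache = self.func(*args)
            self.dirty = False
            self.runs += 1
        return self.cache


image = Node("image", lambda: [1.0, 2.0, 3.0])
rotate = Node("rotate", lambda px: px[::-1], [image])
viewer = Node("viewer", lambda px: [2 * p for p in px], [rotate])

viewer.evaluate()        # first full evaluation
rotate.mark_dirty()      # the user changed a value on the rotate node
result = viewer.evaluate()
```

After the second evaluation the image node has run once while rotate and viewer have run twice. The price is that every `cache` stays resident between edits, which is the memory cost discussed above.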
> >
> > I am looking for a solution that works well both during editing
> > (reducing the feedback time for the end user) and during rendering
> > (overall performance of the system), but I haven't found a good one yet.
> >
> > At the moment I am evaluating two options:
> > a. One OpenCL kernel/program is generated and executed per viewer
> > and compositor node.
> > b. A program and kernel are created per node, and evaluation is done
> > as in the current situation.
> >
> > A question back: have you seen any speed-up? On my system (a
> > three-year-old dual-core laptop, 2 cores at 2000 MHz, with 16 NVIDIA
> > cores at 400 MHz and an 800 MHz bus) I was not able to see big
> > differences. I think that a desktop system with a faster bus and more,
> > and more powerful, GPU cores would get much better performance.
> >
> > Regards,
> > Jeroen
> >
> > On 08/28/2010 09:40 PM, Lukas Tönne wrote:
> > > I have tried out your patch, nice work :)
> > >
> > > Here are some more thoughts on how to process data in the node tree. I
> > > hope I'm not getting too verbose or telling you guys obvious stuff ;)
> > >
> > > Basically, when talking about data in the tree I see two different
> > > types of dependency:
> > > 1. Inter-node dependency ("vertical"):
> > > A node can only be executed (be it for a single pixel or the whole
> > > image) when all its inputs are done. This dependency _always_ exists
> > > in node trees to a certain degree.
> > > 2. Inter-element dependency ("horizontal"):
> > > An element (pixel, sample, particle, vertex, etc.) depends on the
> > > state of other elements (neighbouring pixels, particles in a certain
> > > radius, connected vertices).
> > >
> > > Vertical dependency does not depend on the tree type, but only on the
> > > connectivity of the nodes (complexity of the tree). Here's a made-up
> > > example with strong connectivity in the middle part:
> > > http://www.pasteall.org/pic/5405
> > >
> > > Horizontal (inter-element dependency) on the other hand chiefly
> > > depends on the type of tree you're looking at:
> > > * Shader- and texture trees have _no_ horizontal dependency at all,
> > > the color of a material or texture sample does not depend on other
> > > samples. This is why shader trees can be evaluated per sample and do
> > > not need to store large amounts of data.
> > > * Compositor trees are the other extreme: while some nodes, such as
> > > Mix, operate per-pixel, others, like Blur and Defocus, depend heavily
> > > on neighbouring pixels or even on _all_ other pixels of the input
> > > image.
> > > * Particles are not as extreme as compo trees (fewer neighbours to take
> > > into account), but they lack the inherent ordering of image pixels and
> > > need kd trees for finding neighbours.
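(To make the "horizontal" distinction concrete, here is a small sketch on a single row of pixels. `mix` and `box_blur` are simplified stand-ins written for this mail, not the actual Mix and Blur node code.)

```python
def mix(a, b, factor):
    """Per-pixel: output[i] depends only on a[i] and b[i]."""
    return [(1.0 - factor) * x + factor * y for x, y in zip(a, b)]


def box_blur(row, radius):
    """Neighbourhood op: output[i] reads up to 2 * radius + 1 input pixels."""
    out = []
    for i in range(len(row)):
        lo = max(0, i - radius)
        hi = min(len(row), i + radius + 1)
        window = row[lo:hi]
        out.append(sum(window) / len(window))
    return out


row = [0.0, 0.0, 9.0, 0.0, 0.0]
mixed = mix(row, [1.0] * 5, 0.5)
blurred = box_blur(row, 1)
```

`mix` could be evaluated one sample at a time without storing anything, while `box_blur` needs its whole input window available before it can produce a single output pixel.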
> > >
> > > One relatively simple thing one could probably do to decrease memory
> > > usage is to remove data that is not needed any more (I am not sure
> > > whether the current compositor already does something like this; if
> > > so, just skip this section). As soon as all nodes that use a certain
> > > socket as input have been processed, that socket's data can be freed
> > > from memory. This of course only works as long as connectivity is
> > > relatively low and node relations are "local". In the example above
> > > the result of the Blur node would have to be kept in memory until all
> > > the Mix nodes are finished, whereas the initial renderlayer node could
> > > free its buffer right after Blur is done. If memory usage gets
> > > dangerously high, it might even be an option to bite the bullet,
> > > discard intermediate results that are only used very late in the tree,
> > > and recalculate them later.
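(The freeing idea can be sketched by counting consumers per buffer. This is an illustration under assumptions: `run_tree` is a made-up helper, the four-node tree is invented and is not the tree from the linked picture, and the evaluation order is simply given rather than computed.)

```python
def run_tree(order, inputs_of, compute):
    """Evaluate nodes in a given topological order and free each buffer
    as soon as its last consumer has run.
    Returns (result of the last node, peak number of live buffers)."""
    # Count how many times each node's output is consumed.
    remaining = {name: 0 for name in order}
    for name in order:
        for src in inputs_of[name]:
            remaining[src] += 1

    buffers = {}
    peak = 0
    for name in order:
        args = [buffers[src] for src in inputs_of[name]]
        buffers[name] = compute[name](*args)
        peak = max(peak, len(buffers))
        for src in inputs_of[name]:
            remaining[src] -= 1
            if remaining[src] == 0:
                del buffers[src]    # no later node reads this buffer
    return buffers[order[-1]], peak


order = ["render", "blur", "mix1", "mix2"]
inputs_of = {
    "render": [],
    "blur": ["render"],
    "mix1": ["blur"],
    "mix2": ["blur", "mix1"],
}
compute = {
    "render": lambda: [1.0, 2.0],
    "blur": lambda a: [0.5 * x for x in a],
    "mix1": lambda b: [x + 1.0 for x in b],
    "mix2": lambda b, m: [x + y for x, y in zip(b, m)],
}
result, peak = run_tree(order, inputs_of, compute)
```

Here the render buffer is dropped right after blur has run, and the blur buffer survives until the last mix, so at most three of the four buffers are alive at once.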
> > >
> > > Another improvement I currently use in the simulation trees is
> > > splitting the large data blocks into smaller parts ("batches"). This
> > > has the advantage of making better use of available processing power,
> > > especially when some nodes need significantly more time than others.
> > > In the compositor nodes one thread processes the full image for one
> > > node at a time, which can leave threads idly waiting for the result
> > > of another (iirc Brecht recently coded internal multithreading for
> > > the especially heavy Defocus node though). At the same time, staying
> > > with one node for a range of elements, instead of processing them
> > > one by one, avoids the overhead of switching between nodes. Afaik this
> > > is basically the same concept as OpenCL's "work groups", but I have to
> > > read up on that again.
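(A rough sketch of the batching idea using Python's standard thread pool. `make_batches` and `run_node_batched` are hypothetical names, the node is assumed to be a pure per-element function, and Python threads will not give a real speed-up for CPU-bound work; the point is only how the work is split.)

```python
from concurrent.futures import ThreadPoolExecutor


def make_batches(count, batch_size):
    """Split range(count) into (start, end) pairs of at most batch_size."""
    return [(start, min(start + batch_size, count))
            for start in range(0, count, batch_size)]


def run_node_batched(func, data, batch_size, workers=4):
    """Apply a per-element function batch by batch on a thread pool."""
    def work(bounds):
        start, end = bounds
        # One task stays with the same node for a whole range of elements.
        return [func(x) for x in data[start:end]]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields the results in the order of the batches.
        parts = pool.map(work, make_batches(len(data), batch_size))
        return [y for part in parts for y in part]


batches = make_batches(10, 4)
doubled = run_node_batched(lambda x: 2 * x, list(range(10)), batch_size=4)
```

The batch size is the knob here: small batches keep all workers busy when node costs differ, large batches reduce the per-batch overhead.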
> > >
> > > Cheers
> > > Lukas
> > >
> > > On Tue, Aug 24, 2010 at 7:18 PM, Jeroen Bakker <j.bakker at atmind.nl> wrote:
> > >
> > >> Hi all
> > >>
> > >> I have been experimenting with OpenCL and am planning a basic
> > >> framework to support it in Blender.
> > >>
> > >> The main features are:
> > >>   * OpenCL is disabled by default, CPU fall-back must ALWAYS be
> > >> available. OpenCL can be enabled with a command-line parameter.
> > >>   * Compiler directive to completely disable OpenCL in Blender.
> > >>   * Basic implementation to access and use GPU-devices
> > >>   * I am not targeting the blender-render, but other time-consuming
> > >> processes (fluids, node systems etc)
> > >>
> > >> I think this matches the basic blender principles:
> > >>   * can work on standard home PCs
> > >>   * installing Blender is just unzipping a zip
> > >>
> > >> Are other people also busy with this subject?
> > >>
> > >> Best regards,
> > >> Jeroen
> > >>
> > >> http://wiki.blender.org/index.php/User_talk:Jbakker
> > >> _______________________________________________
> > >> Bf-committers mailing list
> > >> Bf-committers at blender.org
> > >> http://lists.blender.org/mailman/listinfo/bf-committers



-- 
____________________
François Tarlier
www.francois-tarlier.com
www.linkedin.com/in/francoistarlier

