[Bf-committers] Blender and OpenCL

Thu Sep 2 10:52:13 CEST 2010

Hi Jeroen,

You have a cl_int pointer as a parameter, so you should dereference it
when assigning the error code:

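cl_program BKE_opencl_build_source(cl_context context, const char *source,
                                   cl_int *resultcode)
{
	cl_program result;

	/* The last argument is already a cl_int *, so pass the pointer through. */
	result = clCreateProgramWithSource(context, 1, &source, NULL, resultcode);
	if (CL_SUCCESS == *resultcode) {
		/* clBuildProgram returns the status; store it through the pointer. */
		*resultcode = clBuildProgram(result, 0, NULL, NULL, NULL, NULL);
	}
	return result;
}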

Afaik the cl_*** types are just typedef'd pointers anyway, but that
could be implementation-dependent (using NVIDIA stuff here), so with
the pointers you should be on the safe side.

I'm not sure if that could have caused the trouble though. For
reasonable filter sizes it worked perfectly for me, but I got a total
(!) freeze when trying a 2k render with 50x50 filtering. Could it be
that you need to actually bail out when there's an error? Atm the code
mostly just prints the error message and continues with creating the
image buffers etc.
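
To illustrate what I mean by bailing out, here is a minimal sketch. It
does not use the real OpenCL calls, so it compiles without the OpenCL
headers: status_t, create_buffer and run_filter are made-up names that
are not in the patch, and the error codes are arbitrary. The point is
only that the first failing step jumps to the cleanup and nothing after
it runs.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for an OpenCL status code; 0 plays the role of CL_SUCCESS. */
typedef int status_t;
#define STATUS_OK 0

/* Hypothetical setup step; force_fail simulates a driver-side error. */
static status_t create_buffer(float **buffer, size_t count, int force_fail)
{
	if (force_fail) {
		*buffer = NULL;
		return -5; /* arbitrary non-zero error code */
	}
	*buffer = calloc(count, sizeof(float));
	return *buffer ? STATUS_OK : -6;
}

/* Returns STATUS_OK or the first error; never continues past a failed step. */
static status_t run_filter(size_t count, int fail_input, int fail_output,
                           int *kernel_ran)
{
	float *input = NULL, *output = NULL;
	status_t err;

	assert(kernel_ran != NULL);
	*kernel_ran = 0;

	err = create_buffer(&input, count, fail_input);
	if (err != STATUS_OK) {
		fprintf(stderr, "input buffer failed: %d\n", err);
		goto cleanup;
	}
	err = create_buffer(&output, count, fail_output);
	if (err != STATUS_OK) {
		fprintf(stderr, "output buffer failed: %d\n", err);
		goto cleanup;
	}
	*kernel_ran = 1; /* the kernel launch would go here */

cleanup:
	free(output); /* free(NULL) is a no-op */
	free(input);
	return err;
}
```

With this shape the caller gets the error code back and can fall back to
the CPU path instead of running a kernel on buffers that were never
created.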

2010/9/1 Jeroen Bakker <j.bakker at atmind.nl>:
> Hi,
>
> I have created an OpenCL implementation of the bokeh blur. I got a
> speedup (2 times faster) on my old hardware, but also some stability
> issues. I think that some of the issues are caused by my old NVidia +
> OpenCL drivers, but I would really like to know how other hardware
> setups will do.
>
> I get a random out-of-resources error that I cannot influence from code,
> and the UI freezes during long calculations.
>
> The current patch can be found at
> http://sicg.atmind.nl/media/patches/patch-opencl-bokeh.txt
>
> Regards,
> Jeroen
>
>
> On 08/29/2010 01:10 PM, Vilem Novak wrote:
>> Hello, maybe focusing on the performance-heavier nodes would make sense?
>> In my experience the rather performance-heavy ones are quality blurs
>> (especially defocus), UV remap, and bilateral blur.
>> With these the advantages would be visible even with the bus problems.
>> With regards,
>> Vilem Novak
>>
>>
>>> ------------ Original message ------------
>>> From: Jeroen Bakker<j.bakker at atmind.nl>
>>> Subject: Re: [Bf-committers] Blender and OpenCL
>>> Date: 29.8.2010 11:19:15
>>> ----------------------------------------
>>> Hi Lukas,
>>>
>>> Your explanation is a good one. It didn't occur to me to write it down
>>> that way. The issue with memory during compositing is the way the node
>>> editor works. When changing a node value (like degree) only the rotate
>>> node and all dependent nodes are recalculated. The input image is not
>>> recalculated; it is still in memory. This is a good optimization during
>>> editing, since you only need to re-evaluate a part of the node system,
>>> but in complex node systems I think this will not work for OpenCL due
>>> to the memory it needs.
>>>
>>> I am looking for a solution that works well both during editing
>>> (decreasing the feedback time for the end user) and during rendering
>>> (overall performance of the system), but I haven't found a good one yet.
>>>
>>> At the moment I am evaluating 2 things:
>>> a. per viewer and compositor node, one OpenCL kernel/program is
>>> generated and executed.
>>> b. per node, a program and kernel are created, and evaluation is done
>>> as in the current situation.
>>>
>>> A question back: have you seen any speed-up? On my system (a
>>> three-year-old dual core 2 at 2000 MHz laptop with 16 NVIDIA cores at
>>> 400 MHz and a bus of 800 MHz) I was not able to see big differences. I
>>> think that a desktop system with a faster bus and more, and more
>>> powerful, GPU cores would get much better performance.
>>>
>>> Regards,
>>> Jeroen
>>>
>>> On 08/28/2010 09:40 PM, Lukas Tönne wrote:
>>>
>>>> I have tried out your patch, nice work :)
>>>>
>>>> Here are some more thoughts on how to process data in the node tree. I
>>>> hope I'm not getting too verbose or telling you guys obvious stuff ;)
>>>>
>>>> Basically when talking about data in the tree I see two different
>>>> types of dependency:
>>>> 1. Inter-node dependency ("vertical"):
>>>> A node can only be executed (be it for a single pixel or the whole
>>>> image) when all its inputs are done. This dependency _always_ exists
>>>> in node trees to a certain degree.
>>>> 2. Inter-element dependency ("horizontal"):
>>>> An element (pixel, sample, particle, vertex, etc.) depends on the
>>>> state of other elements (neighbouring pixels, particles in a certain
>>>> radius, connected vertices).
>>>>
>>>> Vertical dependency does not depend on the tree type, but only on the
>>>> connectivity of the nodes (complexity of the tree). Here's a made-up
>>>> example with strong connectivity in the middle part:
>>>> http://www.pasteall.org/pic/5405
>>>>
>>>> Horizontal (inter-element) dependency on the other hand chiefly
>>>> depends on the type of tree you're looking at:
>>>> * Shader and texture trees have _no_ horizontal dependency at all:
>>>> the color of a material or texture sample does not depend on other
>>>> samples. This is why shader trees can be evaluated per sample and do
>>>> not need to store large amounts of data.
>>>> * Compositor trees are the other extreme: while some nodes, such as
>>>> Mix, operate per pixel, others like Blur and Defocus heavily depend on
>>>> neighbouring or even _all_ other pixels of the input images.
>>>> * Particles are not as extreme as compo trees (fewer neighbours to take
>>>> into account), but they lack the inherent ordering of image pixels and
>>>> need kd trees for finding neighbours.
>>>>
>>>> One relatively simple thing one could probably do to decrease memory
>>>> usage is removing data that is not needed any more (I am not sure if
>>>> the current compositor does something like this already; if so, just
>>>> skip this section). As soon as all nodes that use a certain socket
>>>> for input have been processed, that socket's data can be freed from
>>>> memory. This of course only works as long as connectivity is
>>>> relatively low and node relations are "local". In the example above
>>>> the result of the Blur node would have to be kept in memory until all
>>>> the Mix nodes are finished, whereas the initial renderlayer node could
>>>> free its buffer right after Blur is done. If memory usage gets
>>>> dangerously high, it might even be an option to bite the bullet,
>>>> discard intermediate results that are used very late in the tree, and
>>>> recalculate them later.
>>>>
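
A rough sketch of the "free a socket's data once all its users are done"
idea from the paragraph above, using a plain use count. OutputSocket,
socket_alloc and socket_release are hypothetical names for illustration
only, not existing Blender API, and the sketch assumes the number of
consuming nodes is known before execution starts.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical output socket: owns a buffer plus a count of
 * consuming nodes that have not been executed yet. */
typedef struct OutputSocket {
	float *data;
	int pending_users;
} OutputSocket;

/* Allocates the buffer and records how many nodes read from this socket.
 * Returns 1 on success, 0 if the allocation failed. */
static int socket_alloc(OutputSocket *sock, size_t count, int users)
{
	sock->data = calloc(count, sizeof(float));
	sock->pending_users = users;
	return sock->data != NULL;
}

/* Called once by each consuming node after it has executed.
 * The last consumer frees the buffer. */
static void socket_release(OutputSocket *sock)
{
	assert(sock->pending_users > 0);
	if (--sock->pending_users == 0) {
		free(sock->data);
		sock->data = NULL;
	}
}
```

In the made-up example tree, the Blur output would be allocated with one
user per connected Mix node, so its buffer survives until the last Mix
node has called socket_release.
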
>>>> Another improvement I currently use in the simulation trees is
>>>> splitting the large data blocks into smaller parts ("batches"). This
>>>> has the advantage of making better use of available processing power,
>>>> especially when some nodes need significantly more time than others.
>>>> In the compositor nodes one thread processes the full image for one
>>>> node at a time, which can lead to threads idly waiting for the result
>>>> of one other (iirc Brecht recently coded internal multithreading for
>>>> the especially heavy Defocus node though). At the same time, staying
>>>> with one node for a range of elements instead of processing them
>>>> one by one avoids the overhead of switching between nodes. Afaik this
>>>> is basically the same concept as OpenCL's "work groups", but I have to
>>>> read up on that again.
>>>>
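
The batching described above could look roughly like this. It is a
single-threaded sketch under assumed names (NodeExec, run_in_batches and
node_double are invented for illustration); in a real setup each batch
call could be handed to a worker thread instead of run in a loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-node callback: processes elements [start, start + count). */
typedef void (*NodeExec)(float *elements, size_t start, size_t count);

/* Runs one node over all elements in batches of at most batch_size.
 * Returns the number of batches that were executed. */
static size_t run_in_batches(NodeExec exec, float *elements, size_t total,
                             size_t batch_size)
{
	size_t batches = 0;

	assert(batch_size > 0);
	for (size_t start = 0; start < total; start += batch_size) {
		/* The last batch may be shorter than batch_size. */
		size_t remaining = total - start;
		size_t count = remaining < batch_size ? remaining : batch_size;

		exec(elements, start, count);
		batches++;
	}
	return batches;
}

/* Example node: doubles every element in its batch. */
static void node_double(float *elements, size_t start, size_t count)
{
	for (size_t i = start; i < start + count; i++)
		elements[i] *= 2.0f;
}
```

The node stays the same for a whole batch, so the per-element cost of
switching between nodes is paid once per batch rather than once per
element.
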
>>>> Cheers
>>>> Lukas
>>>>
>>>> On Tue, Aug 24, 2010 at 7:18 PM, Jeroen Bakker<j.bakker at atmind.nl> wrote:
>>>>
>>>>> Hi all
>>>>>
>>>>> I have been experimenting with OpenCL and am planning a basic framework
>>>>> to support it in Blender.
>>>>>
>>>>> Main features are:
>>>>>    * OpenCL is disabled by default; a CPU fall-back must ALWAYS be
>>>>> available. OpenCL can be enabled with a command-line parameter.
>>>>>    * A compiler directive to completely disable OpenCL in Blender.
>>>>>    * A basic implementation to access and use GPU devices.
>>>>>    * I am not targeting the Blender renderer, but other time-consuming
>>>>> processes (fluids, node systems etc.)
>>>>>
>>>>> I think this matches the basic Blender principles:
>>>>>    * can work on standard home PCs
>>>>>    * Blender installation is unzipping a zip
>>>>>
>>>>> Are other people also busy with this subject?
>>>>>
>>>>> Best regards,
>>>>> Jeroen
>>>>>
>>>>> http://wiki.blender.org/index.php/User_talk:Jbakker
>>>>> _______________________________________________
>>>>> Bf-committers mailing list
>>>>> Bf-committers at blender.org
>>>>> http://lists.blender.org/mailman/listinfo/bf-committers