[Bf-committers] Blender OpenCL compositor

Thu Jan 20 19:07:46 CET 2011

Hi Aurel,

I think that you mention most of the concerns I had when I started 
writing this proposal :)

In my opinion there are three things that need to be balanced:
maintainability of the source code, economics and performance.

Performance: as I see it, this comes down to less idling, on both the 
GPU and the CPU. Even though only a handful of nodes are really waiting 
for an OpenCL implementation, the GPU should not sit idle while the 
other nodes are calculated. In my approach I also don't intend to do a 
node-by-node calculation. The whole compositor system is transferred to 
OpenCL at calculation time, and code that is not needed for execution is 
removed (if checked). The new compositor system will also use the 
processor stack for most data transport; GPU memory or RAM is only 
needed to store and read tiles.
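
To illustrate the "code that is not needed is removed" step, here is a 
minimal sketch in Python. It is not the actual compositor code. It 
assumes that "not needed" means nodes the output does not depend on, 
which is only one possible reading, and the graph layout and the name 
prune_unused_nodes are made up for this example.

```python
from collections import deque


def prune_unused_nodes(inputs_of, output_node):
    """Return the set of nodes that the output actually depends on.

    inputs_of maps a node name to the list of node names feeding it
    (a hypothetical representation, chosen only for this sketch).
    """
    needed = {output_node}
    queue = deque([output_node])
    while queue:
        node = queue.popleft()
        for source in inputs_of.get(node, []):
            if source not in needed:
                needed.add(source)
                queue.append(source)
    return needed


# Hypothetical graph: "blur" is connected to an input but its result
# never reaches the composite output, so it can be dropped.
graph = {
    "composite": ["mix"],
    "mix": ["render_layer", "color_correct"],
    "color_correct": ["render_layer"],
    "blur": ["render_layer"],
    "render_layer": [],
}
needed = prune_unused_nodes(graph, "composite")
unused = sorted(set(graph) - needed)
```

Only the nodes in `needed` would then have to be turned into code for 
the device; everything in `unused` can be skipped before execution.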

Economics: for the same price you can get a GPU that is 20-50 times as 
fast as a processor. This is true for most hardware configurations.

Maintainability: in my opinion you should only write your code once, and 
this code should be able to run on both a normal CPU and OpenCL. I want 
implementing new nodes to be really easy. We have developers with 
different levels of experience, and I don't want only very experienced 
developers to be capable of making nodes. It should be possible for 
power users to add a new node (perhaps a personal principle, but I think 
it is important for Blender).
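
As a rough sketch of the "write once, run on CPU and OpenCL" idea, 
assume a node is described by a single per-pixel expression that is 
valid in both Python and OpenCL C. The names MIX_NODE, run_on_cpu and 
to_opencl_source are hypothetical, and the generated kernel source is 
only built as a string here; it is not compiled or run on a device.

```python
# Hypothetical node definition: one per-pixel expression, written once.
# The expression uses only syntax valid in both Python and OpenCL C.
MIX_NODE = {
    "name": "mix_node",
    "inputs": ["a", "b", "fac"],
    "expr": "a + (b - a) * fac",
}


def run_on_cpu(node, buffers):
    """CPU fallback: evaluate the expression once per pixel."""
    code = compile(node["expr"], node["name"], "eval")
    columns = [buffers[name] for name in node["inputs"]]
    result = []
    for values in zip(*columns):
        pixel = dict(zip(node["inputs"], values))
        # No builtins are exposed; only the pixel inputs are visible.
        result.append(eval(code, {"__builtins__": {}}, pixel))
    return result


def to_opencl_source(node):
    """Generate OpenCL C kernel source text from the same expression."""
    params = ", ".join(
        f"__global const float *{name}_buf" for name in node["inputs"]
    )
    loads = "\n".join(
        f"    float {name} = {name}_buf[i];" for name in node["inputs"]
    )
    return (
        f"__kernel void {node['name']}({params}, __global float *out)\n"
        "{\n"
        "    size_t i = get_global_id(0);\n"
        f"{loads}\n"
        f"    out[i] = {node['expr']};\n"
        "}\n"
    )


cpu_result = run_on_cpu(
    MIX_NODE, {"a": [0.0, 1.0], "b": [1.0, 1.0], "fac": [0.5, 0.25]}
)
kernel_source = to_opencl_source(MIX_NODE)
```

The point of the sketch is that a node author only writes the 
expression; both the CPU path and the OpenCL source are derived from it.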

Jeroen

On 01/20/2011 09:11 AM, Aurel W. wrote:
> I guess for renderfarms, we need at least CPU only support for the
> next couple of years, not doing compositing directly when rendering
> the shot could be annoying.
>
> I currently have some concerns when I think about how opencl 'could'
> be integrated into the compositor and what mistakes could be made.
> Though the overall concept of the compositor fits gpgpu very well, the
> existing design, architecture and implementation don't. So I hope
> the opencl code won't be bound too tightly to the current implementation.
>
> Many of the performance problems of the current compositor are not due
> to the fact that everything is done on the CPU; they exist because some
> nodes are implemented ridiculously inefficiently. If more performance
> is the target, this is also one main problem which has to be tackled.
>
> On the other hand, it's still important to be able to combine nodes
> implemented with CPU code and nodes implemented with opencl. The main
> issue here is that this capability shouldn't cause much overhead
> when there are only opencl nodes. It is very important to really do
> the entire evaluation of the compositing graph on the gpu, and not to
> just offload some computations. So buffers shouldn't be copied from
> main ram to vram and vice versa every time a node is executed. Also
> previews, outputs and view nodes should be converted to a framebuffer
> object and displayed directly rather than copied to main ram again.
> This would be really necessary to get full performance, but as I
> said, without much overhead while still being able to integrate CPU
> nodes and, in that case, do vram <-> ram copies.
>
> So to sum it up, if the target is to do reasonably fast compositing
> for 4k footage in the future, a lot of things have to be considered,
> not just implementing some opencl kernels to replace current nodes.
>
> aurel
>
> On 20 January 2011 04:45, Ian Johnson <enjalot at gmail.com> wrote:
>>> Hi Ian,
>>
>>> ----- Original Message -----
>>> because it supports multiple device architectures, a code optimized
>>> for the GPU won't run fast on the CPU.
>>
>>> I thought you could write kernels optimised for various architectures, and
>>> choose the best one at run time. So each node could have one kernel for the
>>> GPU and one for the CPU. But an OpenCL GPU kernel will at least run on the
>>> CPU, even if it does so sub-optimally; and vice versa.
>>
>> Yes ideally one would write different kernels optimized for different
>> architectures, and this is the goal of OpenCL. The main issue is when you
>> have a hammer everything looks like a nail, so we must be careful not to think
>> OpenCL is a magic bullet, but rather a really nice tool for some situations.
>> The most dramatic speedups will be had at first with highly data parallel
>> algorithms which can be moved to the GPU, with slower but still accelerated
>> CPU versions taking advantage of multiple cores.
>>
>>
>>>> Then there is the question of users having the hardware to even run
>>> it, necessitating a CPU only fall-back.
>>
>>> Do you mean two entirely separate codes? Or could we have one
>>> implementation that uses OpenCL, but with a CPU kernel (also written in
>>> OpenCL) to fall back on? That would seem ideal, since you can develop just
>>> one kernel to start with, and add architecture-specific kernels at a later
>>> time.
>>
>>> Is the problem that there are no free OpenCL libraries (e.g. for use
>>> without a GPU)?
>>
>> I do mean two entirely separate codes, at least for a while. Keep in mind
>> that just about everything is already implemented on the CPU, with some
>> multiprocessor support from OpenMP. So if we want to accelerate some feature
>> it doesn't make sense to just throw away the existing code; just switch to
>> OpenCL if it's available. This is especially true since OpenCL
>> implementations are not yet ubiquitous (they are free, from NVIDIA, ATI,
>> Intel and Apple to name a few) so we don't want to disadvantage any users
>> who don't have it yet.
>>
>> In the future, if and when OpenCL is everywhere it would make sense to just
>> code a CPU kernel and a GPU kernel to switch between (or whatever kind of
>> kernel you make for a CPU+GPU chip like NVIDIA's project Denver, ATI's
>> Fusion or Intel's Sandy Bridge). Until then we should provide a solid
>> infrastructure for acceleration but not throw out the baby with the bath
>> water.
>>
>>
>>> Cheers,
>> Alex
>>
>>
>> --
>> Ian Johnson
>> http://enja.org
>> _______________________________________________
>> Bf-committers mailing list
>> Bf-committers at blender.org
>> http://lists.blender.org/mailman/listinfo/bf-committers
>>

-- 

Met vriendelijke groet,

Jeroen Bakker

At Mind BV

Telefoon: 06 50 611 262
E-mail: j.bakker at atmind.nl