[Bf-committers] Proposal: Blender OpenCL compositor

Aurel W. aurel.w at gmail.com
Sat Jan 22 10:11:21 CET 2011


Hi Jeroen,

> Please help me to determine the case when a whole output image is
> needed. IMO input is readonly and output is writeonly. I don't see the
> need atm to support whole output images in a 'per output pixel'
> approach. And every 'per input pixel' approach can be written by a 'per
> output pixel' approach. In the current nodes the two approaches are mixed.
The problem with the concept of pixel-to-pixel operations is also that
it tends to be implemented with a lot of overhead, like having three
frames on the call stack just to add two pixels, and this for every
pixel in the buffer. It is really nasty. This is why even adding
buffers together is rather inefficient at the moment. Another example
would be the filter node, with its pixel_processors for convolution.
If you really think about low-level efficiency, down to the level of
single instructions, a lot could be done better at the moment.
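Just to illustrate the kind of overhead I mean, here is a rough sketch
(not code from the current compositor, all names are made up):

  /* Sketch only: per-pixel callback chain vs. one flat loop. */
  #include <cstddef>

  typedef float (*pixel_op)(float, float);

  static float add_px(float a, float b) { return a + b; }

  /* per-pixel approach: an indirect call (and its stack frames) per pixel,
   * e.g. add_per_pixel(a, b, out, n, add_px) */
  void add_per_pixel(const float *a, const float *b, float *out,
                     size_t n, pixel_op op)
  {
      for (size_t i = 0; i < n; i++)
          out[i] = op(a[i], b[i]);
  }

  /* whole-buffer approach: one tight loop the compiler can vectorize */
  void add_buffers(const float *a, const float *b, float *out, size_t n)
  {
      for (size_t i = 0; i < n; i++)
          out[i] = a[i] + b[i];
  }

The second loop is roughly what adding two buffers could boil down to;
the first one pays an indirect call for every pixel, and in the current
design there are even more frames on the stack than that.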

I also realize that the argument "it would work with the current
compositor" is a strong one, but I have some problems with it. First
of all, I think a compositor should in principle be able to support
all image processing operations. I think it's a rather bad idea to be
stuck with a very limited architecture, one that already requires a
bunch of hacks to implement the functionality of current nodes, such
as those doing convolution.

Another problem I see with tiling is that you are doing spatial
partitioning and are therefore stuck in the spatial domain. But there
are a lot of possibilities in working in the gradient and frequency
domains, including speedups. You won't be able to convert a single
tile to the gradient domain, though, because you can't determine the
correct gradient along its borders. When you want to work in the
frequency domain you also run into issues with tiling, again because
of the spatial partitioning.
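
To make the border problem concrete, here is a rough sketch (again not
real compositor code, names and layout are made up) of a forward-
difference gradient computed on a single tile:

  /* Sketch: x-gradient of one tile. At x == tile_w - 1 the neighbour
   * pixel lives in the next tile, so the border gradient is unknown. */
  void gradient_x(const float *tile, float *gx, int tile_w, int tile_h)
  {
      for (int y = 0; y < tile_h; y++) {
          for (int x = 0; x < tile_w - 1; x++)
              gx[y * tile_w + x] =
                  tile[y * tile_w + x + 1] - tile[y * tile_w + x];
          gx[y * tile_w + tile_w - 1] = 0.0f; /* needs the neighbouring tile */
      }
  }

The last column of every row needs a pixel from the neighbouring tile,
so within one tile the correct gradient there simply can't be computed.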

But back to the simple issue of operations that need full buffer
access. I agree that this could still be done with tiling, because you
can simply compute all input tiles first and just access those when
computing one single output tile. Is that roughly how it is supposed
to work? At least the diagram in your document looks like this. Any
other workaround, like using overlapping tiles for the very special
case of a 3x3 kernel convolution, is just a hack and will prevent the
implementation of future nodes that have other non pixel-to-pixel
operations.
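
If I understand that scheme correctly, it would have to look roughly
like this sketch (not real code; the tile offsets and the box-blur
radius are made-up parameters): one output tile is computed while the
whole input is already available, so any input pixel can be read:

  /* Sketch: compute one output tile of a box blur, reading arbitrary
   * pixels of the full input image. */
  void blur_output_tile(const float *input, int w, int h,
                        float *tile, int tile_x0, int tile_y0,
                        int tile_w, int tile_h, int radius)
  {
      for (int ty = 0; ty < tile_h; ty++) {
          for (int tx = 0; tx < tile_w; tx++) {
              float acc = 0.0f;
              int count = 0;
              for (int dy = -radius; dy <= radius; dy++) {
                  for (int dx = -radius; dx <= radius; dx++) {
                      int x = tile_x0 + tx + dx;
                      int y = tile_y0 + ty + dy;
                      if (x >= 0 && x < w && y >= 0 && y < h) {
                          acc += input[y * w + x]; /* may read far outside this tile */
                          count++;
                      }
                  }
              }
              tile[ty * tile_w + tx] = count ? acc / (float)count : 0.0f;
          }
      }
  }

With a large radius the reads can land anywhere in the input, which is
exactly why all input tiles have to be finished (or the full buffer
kept around) before a single output tile can be produced.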

Such a future node could, for instance, be tone mapping. This is a
standard feature in Lux, for example, so I guess it's not that absurd
to include such features in Blender's compositor. And some
tone-mapping algorithms need to operate on the entire image.
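
As a rough sketch of what I mean (a simplified global operator in the
style of Reinhard, not a proposed implementation): the log-average
luminance is a whole-image statistic, so no output pixel can be
finished from one tile alone:

  /* Sketch: global tone mapping needs a full pass over the image
   * before any output pixel can be written. */
  #include <cmath>
  #include <cstddef>

  void tonemap_global(const float *lum_in, float *lum_out, size_t n, float key)
  {
      double log_sum = 0.0;
      for (size_t i = 0; i < n; i++)            /* pass 1: whole image */
          log_sum += std::log(1e-4 + lum_in[i]);
      float log_avg = (float)std::exp(log_sum / (double)n);

      for (size_t i = 0; i < n; i++) {          /* pass 2: per pixel */
          float scaled = key / log_avg * lum_in[i];
          lum_out[i] = scaled / (1.0f + scaled);
      }
  }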

In terms of memory usage, caching, etc., if we assume that only
reasonably sized buffers are used, let's say up to 64MB, I also don't
see strong benefits in using tiles rather than buffers that hold the
entire image. But maybe you need to be more specific about the caching
scheme you want to use here.
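
(To put 64MB in perspective: assuming full-float RGBA, a 1920x1080
buffer is 1920 * 1080 * 4 channels * 4 bytes, roughly 32MB, so even a
full-HD frame fits twice into that budget.)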

aurel
