[Bf-committers] Is there a parallel pipeline in blender?

Tue Aug 11 09:02:51 CEST 2009

No. I'm not familiar with OpenMP, and I currently can't find any "very
simple" way to implement this feature.

The main focus is on synchronization that guarantees data comes OUT in
the same order it went IN, with only one piece of data going in and one
coming out per call. The loop constructs in OpenMP either process the
whole data set out of order, or process only one piece of data at a time
in order (using the ordered construct).
Maybe the following table explains it better:

  call:       thread0:    thread1:    return(data_done, pipeline_stat):
  do(d[0])    f1(d[0])    none        null, running
  do(d[1])    f2(d[0])    f1(d[1])    d[0], running
  do(d[2])    f1(d[2])    f2(d[1])    d[1], running
  do(d[3])    f2(d[2])    f1(d[3])    d[2], running
  do(null)    none        f2(d[3])    d[3], empty

Note that the d[n] arrive as a stream, only one piece at a time.
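
To make the table concrete, here is a minimal sketch in Python (used only
for brevity; this is not Blender code and does not use BLI_thread). The
names OrderedPipeline, do, max_in_flight, f1 and f2 are hypothetical and
exist only for this illustration. It runs each item's f1 -> f2 chain on
its own thread and joins the threads in submission order, so results come
out in the order the data went in. It illustrates the ordering contract
only, not any speed-up.

```python
import threading
from collections import deque


class OrderedPipeline:
    """Hypothetical helper: each item's stage chain runs on its own thread,
    but results are handed back strictly in submission order."""

    def __init__(self, stages, max_in_flight=2):
        self._stages = stages                # e.g. [f1, f2]
        self._max_in_flight = max_in_flight  # items allowed to overlap
        self._pending = deque()              # (thread, result_box), oldest first

    def _run(self, item, box):
        # Apply every stage to one item, then store the finished result.
        for stage in self._stages:
            item = stage(item)
        box.append(item)

    def do(self, item=None):
        """Feed one item (None means "no more input").

        Returns (data_done, pipeline_stat) as in the table above."""
        if item is not None:
            box = []
            worker = threading.Thread(target=self._run, args=(item, box))
            worker.start()
            self._pending.append((worker, box))

        done = None
        flushing = item is None
        if self._pending and (flushing or len(self._pending) >= self._max_in_flight):
            # Always wait for the OLDEST item, so output order matches input order.
            worker, box = self._pending.popleft()
            worker.join()
            done = box[0]

        status = "running" if self._pending else "empty"
        return done, status


# Stand-ins for the real stage functions (assumptions for this example).
def f1(x):
    return x + 1


def f2(x):
    return x * 2


pipe = OrderedPipeline([f1, f2])
results = [pipe.do(d) for d in [0, 1, 2, 3]]
results.append(pipe.do(None))  # flush the last item
print(results)
# [(None, 'running'), (2, 'running'), (4, 'running'), (6, 'running'), (8, 'empty')]
```

With max_in_flight larger than 2, the caller would keep calling do(None)
until the status is "empty". Error handling inside a stage is left out of
this sketch.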

2009/8/11 Jonathan Merritt <merritt at unimelb.edu.au>:
> Ah ok... I guess you're talking more about a SIMD approach.  So,
> couldn't you implement your example very simply using the looping
> constructs in OpenMP?  See the spec here:
>     http://www.openmp.org/mp-documents/spec30.pdf
>
> Futures are very similar to lazy evaluation... as you said, they're
> really just a framework to allow a whole chain of operations to be
> evaluated lazily.
>
> Jonathan Merritt.
>
> On 11/08/2009, at 3:13 PM, Ruan Beihong wrote:
>
>> I've heard of Scala, and the "future" to me is just an explicit way to
>> do the lazy evaluation that exists in many functional programming
>> languages.
>> See http://en.wikipedia.org/wiki/Lazy_evaluation .
>> And I think I understand your idea. But laziness is a matter of
>> evaluation strategy. The actual execution can't avoid locks and thread
>> waits, unless someone knows how to avoid those. We could introduce lazy
>> evaluation into the composite nodes, but the feature I meant to talk
>> about has nothing to do with it. This feature focuses on the execution
>> of the actual work. From the outside, it is simply a loop executing a
>> set of chained functions wrapped in one "pipeline". It looks the same
>> as the chained functions running in one thread, but they actually run
>> in more than one thread.
>> Consider 4 pieces of data, d[0:3], that need processing by functions
>> f1 and f2.
>> The serial version:
>> thread0:
>> f1(d[0])
>> f2(d[0])
>> f1(d[1])
>> f2(d[1])
>> f1(d[2])
>> f2(d[2])
>> f1(d[3])
>> f2(d[3])
>> The pipelined version:
>> thread0:   thread1:
>> f1(d[0])
>> f2(d[0])   f1(d[1])
>> f1(d[2])   f2(d[1])
>> f2(d[2])   f1(d[3])
>>            f2(d[3])
>>
>> 2009/8/11 Jonathan Merritt <merritt at unimelb.edu.au>:
>>> Couldn't you simply use a "future" model for results of nodes:
>>>     http://en.wikipedia.org/wiki/Futures_and_promises
>>> In abstract terms, the "future" construct is like a place-holder for
>>> a result that will eventually be computed, but is not necessarily
>>> available just yet.  When the result of the "future" is demanded
>>> absolutely, execution on other threads can wait until the "future"'s
>>> result is available.  However, by chaining "futures" together, you
>>> can often avoid having to set many explicit points at which your
>>> "start" and "stop" events occur.  That approach can reduce thread
>>> waits in systems like a network of nodes.
>>>
>>> I know this sounds very abstract and complicated, and I can't point
>>> you to any examples in C.  It's an approach that works like a dream
>>> in Scala though... even I can manage it! :-)
>>>
>>> Jonathan Merritt.
>>>
>>> On 11/08/2009, at 12:37 PM, Ruan Beihong wrote:
>>>> Hi,
>>>> I use the CPU pipeline as an example of how this should work.
>>>> This feature is not very generic, but it does solve problems where
>>>> data can be processed in parallel but processing must start and
>>>> stop in order. Many operations could benefit from this feature, for
>>>> example the composite nodes: each frame is independent and the
>>>> operation in each composite node is purely functional, i.e.
>>>> thread-safe, but they must be done in order of frames and composite
>>>> nodes (from input to output). With this feature, more than one
>>>> frame can be composited in the pipeline at once, but they are
>>>> returned in order.
>>>>
>>>> BLI_thread may serve as the underlying layer for this feature.
>>>>
>>>> 2009/8/11 Brecht Van Lommel <brecht at blender.org>:
>>>>> Hi,
>>>>>
>>>>> As mentioned, Blender already does multithreading in various
>>>>> places. Some using the BLI_thread functions, others using OpenMP.
>>>>> I'm not sure why you are talking about CPU pipelining, to me it
>>>>> seems what you are proposing is standard multithreading, so it may
>>>>> be better to talk in that terminology. There's more places in
>>>>> Blender that could benefit from multithreading, and they can share
>>>>> some code, but basically need to be handled on a case by case
>>>>> basis. What you're proposing is very generic, so it's unclear to me
>>>>> what it is really about.
>>>>>
>>>>> Brecht.
>>>>>
>>>>> On Fri, 2009-08-07 at 09:14 +0800, Ruan Beihong wrote:
>>>>>> Thanks for the reply.
>>>>>> Maybe I didn't put it right. I mean the execution of the tasks in
>>>>>> the pipeline goes in a similar way to the pipeline in a CPU, which
>>>>>> always starts tasks in order and ends them in order, but in
>>>>>> between the tasks run in parallel. This feature can be put to use
>>>>>> when a series of tasks must be applied in order to more than one
>>>>>> piece of data.
>>>>>> I don't know much about OpenMP, but it seems OpenMP doesn't
>>>>>> provide this feature directly, does it?
>>>>>>
>>>>>> 2009/8/7 GSR <gsr.b3d at infernal-iceberg.com>:
>>>>>>> Hi,
>>>>>>> ruanbeihong at gmail.com (2009-08-06 at 2311.39 +0800):
>>>>>>>> Hi there,
>>>>>>>> I wonder if there is a parallel pipeline in blender. I mean a
>>>>>>>> pipeline like the one in a CPU, which increases IPC
>>>>>>>> (instructions per cycle).
>>>>>>>
>>>>>>> Sorry, but I am unable to see how instructions per cycle matter
>>>>>>> for this. Maybe you mean Symmetric Multi Processing?
>>>>>>>
>>>>>>>> I'm considering implementing that feature.
>>>>>>>
>>>>>>> Some parts of code are using OpenMP to accommodate systems with
>>>>>>> more than one logical CPU, libelbeem for example.
>>>>>>>
>>>>>>> GSR
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Bf-committers mailing list
>>>>>>> Bf-committers at blender.org
>>>>>>> http://lists.blender.org/mailman/listinfo/bf-committers
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> James Ruan
>>>
>>
>>
>>
>> --
>> James Ruan
>

-- 
James Ruan