[Bf-committers] Is there a parallel pipeline in blender?

Wed Aug 12 01:53:00 CEST 2009

Ruan Beihong wrote:
> No. I'm not familiar with OpenMP, and I currently can't find any "very
> simple" way to implement this feature.
>
> The main focus is on the synchronization that guarantees data comes OUT
> in the same order it went IN, with only one piece of data going in and
> coming out per call. The loop constructs in OpenMP either process the
> whole data set out of order, or process only one piece of data at a time
> in order (using the ordered construct).
> Maybe the following graph tells more:
>
>   1 call:           thread0:        thread1:        return(data_done, pipeline_stat):
>   2 do(d[0])        f1(d[0])        none            null, running
>   3 do(d[1])        f2(d[0])        f1(d[1])        d[0], running
>   4 do(d[2])        f1(d[2])        f2(d[1])        d[1], running
>   5 do(d[3])        f2(d[2])        f1(d[3])        d[2], running
>   6 do(null)        none            f2(d[3])        d[3], empty
>
> Note that the data d[n] arrive as a stream, only one piece at a time.
>
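[Editor's sketch] The call sequence in the graph above could look roughly like this in Python. This is only an illustration under stated assumptions: the names `Pipeline`, `do` and `close` are hypothetical, nothing here is Blender or BLI_thread API, and each item is assumed to run through all stages on one worker thread, as the thread0/thread1 columns suggest.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


class Pipeline:
    """Hypothetical streaming pipeline: one item in, at most one item out per call."""

    def __init__(self, stages, workers=2):
        self._stages = stages                      # e.g. [f1, f2]
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = deque()                    # futures, oldest first

    def _run(self, item):
        # Apply the chained functions to one item on a worker thread.
        for stage in self._stages:
            item = stage(item)
        return item

    def do(self, item=None):
        """Feed one item (or None to drain); return (data_done, pipeline_stat)."""
        if item is not None:
            self._pending.append(self._pool.submit(self._run, item))
            if len(self._pending) == 1:
                # First item after an empty pipeline: nothing to hand back yet.
                return None, "running"
        if not self._pending:
            return None, "empty"
        # Block until the oldest item is finished, so output order equals input order.
        done = self._pending.popleft().result()
        return done, ("running" if self._pending else "empty")

    def close(self):
        self._pool.shutdown()
```

Calling `do(d[0])`, `do(d[1])`, ... and finally `do(None)` reproduces rows 2 to 6 of the graph: each call returns the result of the previous item, and the last call drains the pipeline.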
> 2009/8/11 Jonathan Merritt <merritt at unimelb.edu.au>:
>   
>> Ah ok... I guess you're talking more about a SIMD approach.  So,
>> couldn't you implement your example very simply using the looping
>> constructs in OpenMP?  See the spec here:
>>     http://www.openmp.org/mp-documents/spec30.pdf
>>
>> Futures are very similar to lazy evaluation... as you said, they're
>> really just a framework to allow a whole chain of operations to be
>> evaluated lazily.
>>
>> Jonathan Merritt.
>>
>> On 11/08/2009, at 3:13 PM, Ruan Beihong wrote:
>>
>>     
>>> I've heard of Scala, and a "future" to me is just an explicit way to
>>> do the lazy evaluation that exists in many functional programming
>>> languages.
>>> See http://en.wikipedia.org/wiki/Lazy_evaluation .
>>> I think I understand your idea, but laziness is an evaluation
>>> strategy. The actual execution still can't avoid locks and thread
>>> waits, unless someone knows how to avoid those. We could introduce lazy
>>> evaluation into the composite nodes, but the feature I mean to talk
>>> about has nothing to do with it. This feature focuses on the execution
>>> of the actual work. From the outside, it is simply a loop executing a
>>> set of chained functions wrapped in one "pipeline"; it looks the same
>>> as if the chained functions were running in one thread, but they run
>>> in more than one thread.
>>> Think about 4 pieces of data d[0:3] that need processing by functions
>>> f1, f2:
>>> The serial version:
>>> thread0:
>>> f1(d[0])
>>> f2(d[0])
>>> f1(d[1])
>>> f2(d[1])
>>> f1(d[2])
>>> f2(d[2])
>>> f1(d[3])
>>> f2(d[3])
>>> The pipelined version:
>>> thread0:   thread1:
>>> f1(d[0])
>>> f2(d[0])   f1(d[1])
>>> f1(d[2])   f2(d[1])
>>> f2(d[2])   f1(d[3])
>>>            f2(d[3])
>>>
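[Editor's sketch] The serial and pipelined schedules above can be compared with a small Python example. The bodies of `f1` and `f2` are placeholders invented for illustration, and the two-worker pool is an assumption standing in for thread0 and thread1; which worker picks up which item is not guaranteed to match the listing.

```python
from concurrent.futures import ThreadPoolExecutor


def f1(x):
    return x + 1      # placeholder for the first chained function


def f2(x):
    return x * 2      # placeholder for the second chained function


def run_serial(data):
    # One thread: f1 then f2 on each piece of data, one after another.
    return [f2(f1(d)) for d in data]


def run_pipelined(data, workers=2):
    # Each worker takes the next piece of data through f1 and f2;
    # Executor.map yields results in input order regardless of which
    # worker finishes first.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda d: f2(f1(d)), data))
```

Both functions return the same list in the same order; only the scheduling differs. In CPython the threads overlap only while a stage releases the interpreter lock (I/O or native code), so this shows the ordering guarantee rather than a speed-up.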
>>> 2009/8/11 Jonathan Merritt <merritt at unimelb.edu.au>:
>>>       
>>>> Couldn't you simply use a "future" model for results of nodes:
>>>>     http://en.wikipedia.org/wiki/Futures_and_promises
>>>> In abstract terms, the "future" construct is like a place-holder for a
>>>> result that will eventually be computed, but is not necessarily
>>>> available just yet.  When the result of the "future" is demanded
>>>> absolutely, execution on other threads can wait until the "future"'s
>>>> result is available.  However, by chaining "futures" together, you can
>>>> often avoid having to set many explicit points at which your "start"
>>>> and "stop" events occur.  That approach can reduce thread waits in
>>>> systems like a network of nodes.
>>>>
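[Editor's sketch] Chaining futures, as described above, might look like this with Python's standard `concurrent.futures` module. The node functions `load` and `blur` and the helper `then` are hypothetical names made up for the example; the standard library has no built-in `then`, so the helper simply submits a task that waits on the earlier future.

```python
from concurrent.futures import ThreadPoolExecutor


def load(frame):
    return frame * 10        # hypothetical input node


def blur(image):
    return image + 1         # hypothetical processing node


def then(pool, future, func):
    """Return a new future holding func(result of `future`)."""
    # The dependent task blocks a worker while it waits, so the pool
    # needs at least one free worker to run the task it depends on.
    return pool.submit(lambda: func(future.result()))


def evaluate(frame):
    with ThreadPoolExecutor(max_workers=2) as pool:
        loaded = pool.submit(load, frame)     # place-holder for the first result
        blurred = then(pool, loaded, blur)    # chained without an explicit wait here
        return blurred.result()               # the only point where we demand a value
```

The caller waits once, at the end of the chain, instead of synchronizing after every node.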
>>>> I know this sounds very abstract and complicated, and I can't point
>>>> you to any examples in C.  It's an approach that works like a dream in
>>>> Scala though... even I can manage it! :-)
>>>>
>>>> Jonathan Merritt.
>>>>
>>>> On 11/08/2009, at 12:37 PM, Ruan Beihong wrote:
>>>>         
>>>>> Hi,
>>>>> I use the CPU pipeline as an example of how this should work.
>>>>> This feature is not very generic, but it does solve problems where
>>>>> data can be processed in parallel but processing must start and stop
>>>>> in order. Many operations could benefit from this feature, for
>>>>> example the composite nodes: each frame is independent and the
>>>>> operation in each composite node is purely functional, i.e.
>>>>> thread-safe, but they must be done in order of frame and of composite
>>>>> node (from input to output).
>>>>> With this feature, more than one frame can be composited in the
>>>>> pipeline at once, but they are returned in order.
>>>>>
>>>>> BLI_thread may serve as the underlying layer of this feature.
>>>>>
>>>>> 2009/8/11 Brecht Van Lommel <brecht at blender.org>:
>>>>>           
>>>>>> Hi,
>>>>>>
>>>>>> As mentioned, Blender already does multithreading in various places.
>>>>>> Some using the BLI_thread functions, others using OpenMP. I'm not sure
>>>>>> why you are talking about CPU pipelining, to me it seems what you are
>>>>>> proposing is standard multithreading, so it may be better to talk in
>>>>>> that terminology. There's more places in Blender that could benefit
>>>>>> from multithreading, and they can share some code, but basically need
>>>>>> to be handled on a case by case basis. What you're proposing is very
>>>>>> generic, so it's unclear to me what it is really about.
>>>>>>
>>>>>> Brecht.
>>>>>>
>>>>>> On Fri, 2009-08-07 at 09:14 +0800, Ruan Beihong wrote:
>>>>>>             
>>>>>>> Thanks for the reply.
>>>>>>> Maybe I didn't put it right. I mean that the execution of tasks in
>>>>>>> the pipeline goes in a similar way to the pipeline in a CPU, which
>>>>>>> always starts tasks in order and ends them in order, but in between
>>>>>>> the tasks run in parallel. This feature can be put to use when a
>>>>>>> series of tasks is applied in order to more than one piece of data.
>>>>>>> I don't know much about OpenMP, but it seems OpenMP doesn't provide
>>>>>>> this feature directly, does it?
>>>>>>>
>>>>>>> 2009/8/7 GSR <gsr.b3d at infernal-iceberg.com>:
>>>>>>>               
>>>>>>>> Hi,
>>>>>>>> ruanbeihong at gmail.com (2009-08-06 at 2311.39 +0800):
>>>>>>>>                 
>>>>>>>>> Hi there,
>>>>>>>>> I wonder if there is a parallel pipeline in blender. I mean a
>>>>>>>>> pipeline like that in a CPU, which increases IPC (instructions
>>>>>>>>> per cycle).
>>>>>>>>>                   
>>>>>>>> Sorry, but I am unable to see how instructions per cycle matter for
>>>>>>>> this. Maybe you mean Symmetric Multi Processing?
>>>>>>>>
>>>>>>>>                 
>>>>>>>>> I'm considering implementing that feature.
>>>>>>>>>                   
>>>>>>>> Some parts of the code are using OpenMP to accommodate systems with
>>>>>>>> more than one logical CPU, libelbeem for example.
>>>>>>>>
>>>>>>>> GSR
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Bf-committers mailing list
>>>>>>>> Bf-committers at blender.org
>>>>>>>> http://lists.blender.org/mailman/listinfo/bf-committers
>>>>> --
>>>>> James Ruan
>
Usually I am only watching, sitting in my rocking chair. /* no heated
blanket yet */
Now I feel I have to add my 0.02 Euro.

As Brecht said:
We have an API for spawning threads and collecting results, and I would
really like to stick to it.
However, some 'watchdog' functionality to pull down badly balanced
threads would be nice, for debugging purposes .. me smirks at Brecht  :)
That said, I really think that anticipating (estimating) the process load
.. and how to spread it is pretty much in the developer's brain, rather
than having some voodoo doing it. So I am very old-fashioned, voting
against OpenMP .. and any other automatic multi-processor magic ..
unless proven to be optimal.
The promises made sound like: "We have ground and slaves (CPUs), so beat
'em to get it going"
rather than
"We have ground .. people living on it .. let's find a clever way to get
it growing .."
Well ..
take it with 2 grains of salt.
BM