[Bf-committers] Data Nodes: Extending the New Depsgraph Design

Lukas Tönne lukas.toenne at gmail.com
Fri Sep 18 10:54:45 CEST 2015

argh ... never write long emails in the browser ...

On Fri, Sep 18, 2015 at 10:54 AM, Lukas Tönne <lukas.toenne at gmail.com> wrote:

> Hi Joshua,
> Sergey pointed me at the EvaluationContext concept yesterday. I tend to
> see this as a "memory manager", to say it in my own words: It handles the
> creation and destruction of data buffers, while the actual depsgraph
> provides lifetime information (when is a data buffer needed, perhaps
> scheduling can even be optimized for memory use).
> I would perhaps keep the types of data and operations separate: rather
> than have a matching context for each operation, have a dedicated set of
> data types (transform, mesh, pose, ...) that are used by various
> operations. That also leaves the possibility for operations to use multiple
> instances of data, e.g. a modifier combining several meshes. An
> operation-based context type might dictate data structure unnecessarily.
> The subgraph concept still seems a bit fuzzy to me. I would suggest
> starting by defining what features we want to see implemented, rather than
> presenting the implementation first and then explaining what it can be used
> for.
> - I mentioned evaluation for different purposes in the original mail (like
> render vs simulation). Some of these situations
> On Fri, Sep 18, 2015 at 2:53 AM, Joshua Leung <aligorith at gmail.com> wrote:
>> Hi Lukas,
>> Many of the points you've raised here are indeed things I've thought about
>> at some stage, and mostly agree with.
>> Here's the scheme I was originally thinking of (but which we didn't/haven't
>> implemented yet - for timing/project management reasons, to get an initial
>> version working first). Warning: LONG TEXT AHEAD!
>> Regards,
>> Joshua
>> -----------------------------------------------------------
>> == 1) Data Ready Nodes ==
>> Similar to your data nodes, we have some explicit nodes in the graph which
>> signal when certain geometry blobs are "ready to use". Render engines and
>> the viewport drawing code could register handlers (perhaps as graph nodes,
>> or otherwise as plain callbacks) which get called when these "ready to use"
>> nodes are reached. The idea is that as soon as some data becomes available,
>> downstream users can pull that geometry and start working with it.
>> This push-based scheme is partly motivated by the need to have some way to
>> keep memory usage down in geometry-heavy scenes, and particularly for the
>> duplicators/particle instancing cases. By having a way of allowing render
>> engines to take each piece of geometry as it becomes available, we can
>> minimise the amount of geometry that needs to be concurrently stored as
>> both Blender-side data and render-converted data. That is, instead of
>> having to convert the entire scene database to tessellated geometry that
>> the render engine then takes and converts into its own internal format
>> (i.e. we now have 2x the data), we only have the Blender-tessellated stuff
>> around for as long as it takes for the render engine to convert it into
>> whatever format it needs.
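Illustrative sketch (not from the original thread) of the push-based "ready to use" scheme described above. All names here (DataReadyNode, ToyRenderEngine, evaluate) are hypothetical and are not Blender API; the point is only that each blob is handed to its consumers and then dropped before the next one is produced.

```python
from typing import Callable, Dict, List

# Hypothetical handler signature: (node name, evaluated geometry).
ReadyHandler = Callable[[str, list], None]


class DataReadyNode:
    """Graph node signalling that one geometry blob is ready to use."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.handlers: List[ReadyHandler] = []

    def register(self, handler: ReadyHandler) -> None:
        self.handlers.append(handler)

    def reached(self, geometry: list) -> None:
        # Push the freshly evaluated geometry to every registered consumer.
        for handler in self.handlers:
            handler(self.name, geometry)


class ToyRenderEngine:
    """Converts each blob into its own format as soon as it arrives."""

    def __init__(self) -> None:
        self.converted: Dict[str, tuple] = {}

    def on_ready(self, name: str, geometry: list) -> None:
        self.converted[name] = tuple(geometry)


def evaluate(nodes: List[DataReadyNode], tessellate: Callable[[str], list]) -> None:
    """Evaluate nodes one at a time, dropping each blob after hand-off."""
    for node in nodes:
        blob = tessellate(node.name)  # evaluation-side copy exists from here...
        node.reached(blob)            # ...consumers convert it...
        del blob                      # ...and it is released before the next node.


engine = ToyRenderEngine()
nodes = [DataReadyNode("Cube"), DataReadyNode("Suzanne")]
for node in nodes:
    node.register(engine.on_ready)
evaluate(nodes, lambda name: [name, "verts"])
```

In this toy version at most one evaluation-side blob is alive at a time, which is the memory argument made above; a real scheduler would of course evaluate several nodes concurrently.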
>> == 2) Registering Interest + Pruning Eval Paths ==
>> Prior to evaluating the depsgraph, whoever is requesting data to be
>> evaluated via the depsgraph must register their interest for that data.
>> This can be done in the form of registering callbacks for the relevant data
>> nodes (as mentioned above). An example of this is the viewport requesting
>> geometry and/or material colours for visible objects.
>> We then "filter" the graph (or do some other processing) so that only the
>> paths which lead to these data nodes are included in the graph to execute.
>> For example, if we're only interested in the transforms for two specific
>> bones (e.g. for plotting their motion paths), we only need to deal with the
>> subgraph with paths between any "dirty" nodes and the outputs required.
>> (In the original depsgraph prototypes I was doing, this stuff was basically
>> the whole "querying and filtering" API, which was intended to take the
>> original "full" depsgraph you have in the scene, and extract out the set of
>> nodes + relations needed for some specific evaluation problem - like
>> iteratively evaluating a rig to solve the nasty
>> posespace-to-transformchannels inverse problem. The filtered graph would
>> then just contain a reduced set of nodes + relations, and could be set off
>> to work in a background thread on a separate copy of the data; since it
>> doesn't contain unnecessary stuff, we save time trying to prevent
>> unnecessary nodes from evaluating.)
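Illustrative sketch (not from the original thread) of the filtering step: keep only nodes that are both downstream of a "dirty" node and upstream of a requested output. The function names and the node names in the example are made up for illustration.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def reachable(start: Iterable[str], edges: Dict[str, List[str]]) -> Set[str]:
    """Breadth-first search: all nodes reachable from `start` via `edges`."""
    seen = set(start)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def filter_graph(
    relations: List[Tuple[str, str]],
    dirty: Iterable[str],
    requested: Iterable[str],
) -> Tuple[Set[str], List[Tuple[str, str]]]:
    """Return the nodes and relations lying on dirty -> requested paths."""
    forward: Dict[str, List[str]] = {}
    backward: Dict[str, List[str]] = {}
    for src, dst in relations:
        forward.setdefault(src, []).append(dst)
        backward.setdefault(dst, []).append(src)
    affected = reachable(dirty, forward)     # what the change can influence
    needed = reachable(requested, backward)  # what the outputs depend on
    keep = affected & needed
    kept = [(s, d) for s, d in relations if s in keep and d in keep]
    return keep, kept


relations = [
    ("time", "anim"),
    ("anim", "bone_a"),
    ("bone_a", "bone_b"),
    ("anim", "bone_c"),
    ("bone_c", "mesh_deform"),
    ("lamp_settings", "lamp"),
]
keep, kept = filter_graph(relations, dirty={"time"}, requested={"bone_b"})
```

Here only the chain time -> anim -> bone_a -> bone_b survives; bone_c, mesh_deform and the lamp nodes are never scheduled.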
>> Where does this registration of interest occur? Well, one thing that we've
>> learned is that the depsgraph nodes themselves shouldn't be used to store
>> data state stuff like this; so that means...
>> == 3) Evaluation Contexts => Data Stores ==
>> Disclaimer: IIRC, this seems to be one of the parts where Sergey and I have
>> slightly different ideas about what should happen. It is also one of the
>> more invasive/work intensive paths forward.
>> If you've had a look at the evaluation callbacks recently, you'll have
>> noticed that we have a thing called "EvaluationContext" (or a similar name)
>> which gets passed around. It was introduced as a way to help the various
>> update processes distinguish between viewport drawing, preview rendering,
>> and proper rendering. Currently, it also gets passed out to all
>> OperationDepsNode callbacks as the first arg, laying the foundations for
>> the inevitable (since it will be needed in some form for those).
>> The idea is to extend the usage of this thing, and make it a full-blown
>> "data store". All data access/management during evaluation passes through
>> it:
>>  A) Every time some evaluation process needs some piece of data, it asks
>> the EvaluationContext to give it to them. If necessary, we can distinguish
>> between requests for "reading" results, and for "writing" results.  -->
>> Sergey's "Copy on Write" stuff could fit in here seamlessly, as we'd be
>> effectively mediating data access.
>>  B) By making evaluation operations work on the data provided by the
>> EvaluationContext instead of the DNA copies directly, we solve many of the
>> concurrency/instancing problems we have now, as we can simply use a
>> separate evaluation context for each use case (e.g. one for render, one for
>> viewport, one for baking). Of course, we now have to make a special
>> exception (or maybe not? it could work via standard interest registration)
>> here so that viewport evaluation results get flushed back down to the DNA,
>> so that all existing tools continue to work the way they have.
>>  C) Every time you evaluate the depsgraph, you pass it an Evaluation
>> Context object. When creating the evaluation context, you specify what the
>> evaluation is for (e.g. render, viewport, baking, background-calculations,
>> etc.), register interest in results to get out of the evaluation (e.g.
>> geometry for objects 1,2,3 and the main character's rig), as well as any
>> callbacks to execute when certain data becomes available (e.g.
>> mesh/curve/surface geometry, materials, etc.). In response, the evaluation
>> context will create the relevant DataContexts in preparation/anticipation
>> for their availability once evaluation has completed.
>> So, what are these "DataContexts"? What is stored in the EvaluationContext,
>> and what about intermediate results?
>>   * To support the granular operations we're now performing, we need
>> somewhere to store intermediate results used between operations. Usually,
>> these intermediate results require a bunch of "related products". We
>> encapsulate each clump of such things as a "DataContext".
>>   * Examples of "DataContexts" are ParametersContext, TransformContext,
>> GeometryContext, PoseContext, etc. Basically, for each Component type/node
>> in the depsgraph, we have a corresponding DataContext.
>>   For example, the TransformContext would store the current 4x4 matrix
>> ("x-form" to use the Dreamworks terminology), along with the constraint
>> stack eval stuff - bConstraintOb, etc.
>>   Another example is the GeometryContext, which would store the DerivedMesh,
>> DispList, and/or Path data.
>>   * One special class of DataContexts are the TimeSource contexts. These
>> store the current frame + subframe => time float/double value. Again, each
>> TimeSource context corresponds to one TimeSource node. Therefore, the
>> operation on a TimeSourceDepsNode is that it either sets/updates the time
>> stored in the timesource (for the primary), or that it computes the time it
>> stores (e.g. for the secondary ones used for doing time offset animation,
>> etc.)
>>   * DataContexts usually get created/initialised during the "init"
>> operations for each Component. The "ready" or "done" operation for each
>> component therefore triggers the "ready to use" callbacks that may have
>> been registered against that data. Those callbacks will then receive a
>> pointer to the relevant DataContext containing the data they requested to
>> be evaluated.
>>   * If necessary, more than one data context may be created per node (i.e.
>> to handle intermediate results that need to be used by two different
>> forking evaluation paths). The exact details of how that would work would
>> require a bit more thinking...
>>   * All DataContexts are stored in the EvaluationContext, and are retrieved
>> by using some kind of "data access key". This would probably be similar to
>> the things we use now for finding nodes in the depsgraph, but will probably
>> need additional info to handle separate instances of duplis or other
>> dynamically generated stuff.
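Illustrative sketch (not from the original thread) of an EvaluationContext acting as a data store. It only borrows the names EvaluationContext and TransformContext from the text above; the key layout, the method names and the string constants are assumptions made for the example, not the actual Blender structures.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical access key: (ID name, component type, dupli instance or None).
DataKey = Tuple[str, str, Optional[int]]


def identity4() -> List[List[float]]:
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


@dataclass
class TransformContext:
    """One DataContext: the intermediate results of a transform component."""
    matrix: List[List[float]] = field(default_factory=identity4)


class EvaluationContext:
    """Owns every DataContext created during one evaluation."""

    def __init__(self, purpose: str) -> None:
        self.purpose = purpose  # e.g. "render", "viewport", "baking"
        self._store: Dict[DataKey, object] = {}
        self._ready: Dict[DataKey, List[Callable[[object], None]]] = {}

    def register_interest(self, key: DataKey, callback: Callable[[object], None]) -> None:
        self._ready.setdefault(key, []).append(callback)

    def init_component(self, key: DataKey, data: object) -> object:
        # Would be called by a component's "init" operation.
        self._store[key] = data
        return data

    def get(self, key: DataKey) -> object:
        return self._store[key]

    def component_done(self, key: DataKey) -> None:
        # Would be called by the "done" operation; fires ready callbacks.
        for callback in self._ready.get(key, []):
            callback(self._store[key])


key = ("Cube", "TRANSFORM", None)
viewport = EvaluationContext("viewport")
render = EvaluationContext("render")
seen = []
render.register_interest(key, lambda data: seen.append(data.matrix[0][3]))

# The same key resolves to independent data in each evaluation context.
for context, x in ((viewport, 1.0), (render, 5.0)):
    xform = context.init_component(key, TransformContext())
    xform.matrix[0][3] = x
    context.component_done(key)
```

Because operations never touch shared DNA data in this sketch, the viewport and render evaluations cannot overwrite each other's results, which is the concurrency argument made in point B above.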
>> == 4) Subgraphs and Evaluation Contexts ==
>> The general idea is that each subgraph will get its own evaluation context.
>> A subgraph is typically something like a background set, or maybe a group
>> (which doesn't interact with the rest of the world).
>> It is also possible to just have subgraphs evaluated in the same evaluation
>> context, just that the access keys would need extra qualifications to
>> ensure that we're accessing the copies of the data for the subgraphs and
>> not the main graph.
>> On Fri, Sep 18, 2015 at 2:18 AM, Lukas Tönne <lukas.toenne at gmail.com>
>> wrote:
>> > I want to argue for adding a new concept to the new depsgraph design,
>> > which I tentatively call "Data Nodes".
>> >
>> > The new dependency graph currently supports most of the old depsgraph's
>> > functionality and has fixed some issues relating to pose animation, but
>> > has not yet started to actually add finer detailed operations or actual
>> > new functionality. In fact, some important parts are still missing (e.g.
>> > pruning of hidden object evaluation), and some features have been ported
>> > from the old implementation in the same awkward fashion (layers,
>> > uber-eval nodes).
>> >
>> > Note for clarity: I will use "dependency graph" to mean the NEW
>> > dependency graph, and prefix it with "old" when referring to the old
>> > implementation.
>> >
>> > A logical expectation would be to handle modifier evaluations and later
>> > nodes through scheduled operations in the depsgraph. This would mean
>> > that a node graph (aka node "tree") is translated into a set of
>> > interdependent operations, which are scheduled for threaded execution
>> > and can run in parallel where they don't depend on each other. There are
>> > provisions in the depsgraph for individual modifier operations, which
>> > are currently a simple chain - but these are just stubs and do not yet
>> > create actual data.
>> >
>> > A typical case with nodes is that a lot of the nodes in a graph are not
>> > actually used for the final result. Users often create whole branches of
>> > nodes that are either muted or not connected to some output node, i.e.
>> > they should not be evaluated. The current definition of "dependency"
>> > makes it difficult to implement pruning in a consistent way. One might
>> > consider actually rebuilding the whole dependency graph every time the
>> > set of active nodes is changed, and leaving out unused nodes completely.
>> > Information on visibility and such would then not need to be stored in
>> > nodes at all, since any node in the graph is also used.
>> >
>> > However, the depsgraph is supposed to be a permanent description of the
>> > scene relations that is not rebuilt very often. More importantly: the
>> > depsgraph schedules operations for a whole range of situations, like
>> > - property changes
>> > - layer and object visibility changes
>> > - time changes
>> > - incremental simulation steps
>> > - render database updates
>> >
>> > All of these situations can have different "used" data (an object may be
>> > invisible during renders, simulation steps can apply only to rigid
>> > bodies, etc.). Building a specialized depsgraph for each of them is not
>> > very desirable.
>> >
>> > The set of necessary operations to be scheduled is not just defined by
>> > "what has changed", but also by "what is needed/visible". Note how both
>> > of these labels are written in terms of _data_ rather than operations.
>> > In the dependency graph all data is the result of a single operation, so
>> > we can just as well use "operation" nodes to represent a piece of data
>> > that this operation writes to, like object transformations, a
>> > DerivedMesh, render results. The trouble with the strict operation nodes
>> > in the depsgraph is that no explicit "end points" for data exist, which
>> > tie data to its final purpose (like the derivedFinal mesh in objects).
>> > In consequence, backtracking used-ness of operations based on final
>> > visibility is not possible without fully rebuilding the depsgraph.
>> >
>> > Besides lacking such "data nodes", the current way of
>> > forward-propagating ("flushing") evaluation tags through the dependency
>> > nodes also needs to be augmented by a backward-propagated "used node"
>> > set. Currently, a node is always scheduled if any of its parent nodes is
>> > scheduled (i.e. some input data has changed), but if a child node isn't
>> > actually used the parents will still be scheduled. This wasn't a problem
>> > in the old depsgraph, because of the coarse nodes which could store a
>> > coherent visibility flag for each individual node. With the finer
>> > resolution in the new depsgraph and the expected differentiation in
>> > evaluation cases the problem becomes more apparent.
>> >
>> > Finally, the addition of explicit data nodes could solve a big design
>> > problem with generated data in Blender: All the current depsgraph
>> > operations use _implicit_ references to data in the DNA for storing
>> > runtime results (mostly overriding obmat and the derivedFinal mesh
>> > result in Object). Data nodes could help manage transient runtime data.
>> > The operation scheduling state can be used to manage the lifetime of
>> > such data efficiently, to avoid overhead from keeping unnecessary
>> > buffers allocated. Multiple variants of objects (branching operations)
>> > are possible if data is not glued to DNA instances.
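Illustrative sketch (not from the original thread) of using scheduling state to manage buffer lifetime: each result is freed as soon as its last consumer has run, while results with no consumers (the "end points") are kept. The function name and the operation names in the example are hypothetical.

```python
from typing import Dict, List, Tuple


def run_with_lifetimes(
    order: List[str], inputs: Dict[str, List[str]]
) -> Tuple[int, List[str], List[str]]:
    """Run operations in `order`, freeing each result after its last use.

    order  -- operation names in a valid (topologically sorted) schedule
    inputs -- maps an operation to the operations whose results it reads
    Returns (peak live buffers, names freed in order, names still live).
    """
    # Count how many scheduled consumers each result still has.
    pending = {op: 0 for op in order}
    for op in order:
        for src in inputs.get(op, []):
            pending[src] += 1

    live: Dict[str, str] = {}
    freed: List[str] = []
    peak = 0
    for op in order:
        live[op] = f"result of {op}"  # stand-in for a real data buffer
        peak = max(peak, len(live))
        for src in inputs.get(op, []):
            pending[src] -= 1
            if pending[src] == 0:     # last consumer has finished
                del live[src]
                freed.append(src)
    return peak, freed, sorted(live)


order = ["base_mesh", "subsurf", "armature", "final"]
inputs = {
    "subsurf": ["base_mesh"],
    "armature": ["subsurf"],
    "final": ["armature"],
}
peak, freed, remaining = run_with_lifetimes(order, inputs)
```

For this four-step modifier chain only two buffers are ever alive at once, and only the end-point result remains after evaluation.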
>> > _______________________________________________
>> > Bf-committers mailing list
>> > Bf-committers at blender.org
>> > http://lists.blender.org/mailman/listinfo/bf-committers
>> >