[Bf-committers] Data Nodes: Extending the New Depsgraph Design

Fri Sep 18 10:54:16 CEST 2015

Hi Joshua,

Sergey pointed me at the EvaluationContext concept yesterday. In my own
words, I tend to see this as a "memory manager": it handles the creation
and destruction of data buffers, while the actual depsgraph provides
lifetime information (when a data buffer is needed; scheduling could
perhaps even be optimized for memory use).
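
As a rough illustration of the "memory manager" reading, here is a minimal
Python sketch. All names (BufferStore, consumer_counts, release) are
hypothetical and not part of any existing Blender API; the only assumption
is that the depsgraph can tell the store how many operations will read each
buffer.

```python
class BufferStore:
    """Hypothetical 'memory manager': owns buffers and frees unused ones."""

    def __init__(self, consumer_counts):
        # consumer_counts maps buffer name -> number of operations that will
        # read it. This stands in for lifetime info from the depsgraph.
        self._remaining = dict(consumer_counts)
        self._buffers = {}

    def create(self, name, data):
        self._buffers[name] = data

    def read(self, name):
        return self._buffers[name]

    def release(self, name):
        # Called by a consumer once it no longer needs the buffer.
        self._remaining[name] -= 1
        if self._remaining[name] == 0:
            del self._buffers[name]

    def live_buffers(self):
        return sorted(self._buffers)


store = BufferStore({"mesh": 2})       # two operations will read "mesh"
store.create("mesh", [0.0, 1.0, 2.0])
store.release("mesh")                  # first consumer done
after_first = store.live_buffers()     # buffer is still alive here
store.release("mesh")                  # last consumer done
after_second = store.live_buffers()    # buffer has been freed
```

The store never decides on its own when data is needed; it only reacts to
the counts it was given, which is the split of responsibilities suggested
above.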

I would perhaps keep the types of data and operations separate: rather than
have a matching context for each operation, have a dedicated set of data
types (transform, mesh, pose, ...) that are used by various operations.
That also leaves the possibility for operations to use multiple instances
of data, e.g. a modifier combining several meshes. An operation-based
context type might dictate data structure unnecessarily.
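
To make the separation concrete, here is a small Python sketch in which
data types are defined independently of the operations that use them. The
names MeshData, TransformData, apply_transform and join_meshes are invented
for illustration only; real mesh and transform data would of course be much
richer.

```python
from dataclasses import dataclass, field


@dataclass
class MeshData:
    # Stand-in for real mesh data: just a list of (x, y, z) tuples.
    vertices: list = field(default_factory=list)


@dataclass
class TransformData:
    # Stand-in for a full 4x4 matrix: a translation offset only.
    offset: tuple = (0.0, 0.0, 0.0)


def apply_transform(mesh, xform):
    """Operation using one MeshData and one TransformData."""
    ox, oy, oz = xform.offset
    return MeshData([(x + ox, y + oy, z + oz) for x, y, z in mesh.vertices])


def join_meshes(*meshes):
    """Operation using several instances of the same data type."""
    joined = MeshData()
    for mesh in meshes:
        joined.vertices.extend(mesh.vertices)
    return joined


mesh_a = MeshData([(0.0, 0.0, 0.0)])
mesh_b = MeshData([(5.0, 5.0, 5.0)])
moved_a = apply_transform(mesh_a, TransformData((1.0, 0.0, 0.0)))
combined = join_meshes(moved_a, mesh_b)
```

Neither operation owns a context type of its own; both simply declare which
data types they read and produce.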

The subgraph concept still seems a bit fuzzy to me. I would suggest
starting by defining what features we want to see implemented, rather than
presenting the implementation first and then explaining what it can be used
for.
- I mentioned evaluation for different purposes in the original mail (like
render vs simulation). Some of these situations

On Fri, Sep 18, 2015 at 2:53 AM, Joshua Leung <aligorith at gmail.com> wrote:

> Hi Lukas,
>
> Many of the points you've raised here are indeed things I've thought about
> at some stage, and mostly agree with.
>
> Here's the scheme I was originally thinking of (but which we didn't/haven't
> implemented yet - for timing/project management reasons, to get an initial
> version working first).   Warning: LONG TEXT AHEAD!
>
> Regards,
> Joshua
>
> -----------------------------------------------------------
>
>
> == 1) Data Ready Nodes ==
> Similar to your data nodes, we have some explicit nodes in the graph which
> signal when certain geometry blobs are "ready to use". Render engines and
> the viewport drawing code could register handlers (perhaps as graph nodes,
> or otherwise as plain callbacks) which get called when these "ready to use"
> nodes are reached. The idea is that as soon as some data becomes available,
> downstream users can pull that geometry and start working with it.
>
> This push-based scheme is partly motivated by the need to have some way to
> keep memory usage down in geometry-heavy scenes, and particularly for the
> duplicators/particle instancing cases. By having a way of allowing render
> engines to take each piece of geometry as it becomes available, we can
> minimise the amount of geometry that needs to be concurrently stored as
> both Blender-side data and render-converted data. That is, instead of
> having to convert the entire scene database to tessellated geometry that the
> render engine then takes and converts into its own internal format (i.e.
> we now have 2x the data), we only have the Blender-tessellated stuff around
> for as long as it takes for the render engine to convert it into whatever
> format it needs.
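
A minimal Python sketch of the push-based "data ready" scheme described
above, under the assumption that handlers are plain callbacks. The names
DataReadyNode, ToyRenderEngine and evaluate_scene are hypothetical, and the
"tessellation" is a placeholder.

```python
class DataReadyNode:
    """Hypothetical graph node signalling that a geometry blob is ready."""

    def __init__(self, name):
        self.name = name
        self.handlers = []

    def register(self, handler):
        self.handlers.append(handler)

    def signal(self, geometry):
        # Push the freshly evaluated geometry to every registered user.
        for handler in self.handlers:
            handler(self.name, geometry)


class ToyRenderEngine:
    """Converts each piece of geometry as soon as it becomes available."""

    def __init__(self):
        self.converted = {}

    def on_geometry_ready(self, name, geometry):
        # Keep only the engine-side copy, in the engine's own format.
        self.converted[name] = tuple(geometry)


def evaluate_scene(nodes):
    for node in nodes:
        geometry = [node.name, "tessellated"]  # placeholder evaluation
        node.signal(geometry)
        # The Blender-side copy goes out of scope before the next object
        # is evaluated, so both copies never exist for the whole scene.
        del geometry


engine = ToyRenderEngine()
nodes = [DataReadyNode("Cube"), DataReadyNode("Suzanne")]
for node in nodes:
    node.register(engine.on_geometry_ready)
evaluate_scene(nodes)
```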
>
>
> == 2) Registering Interest + Pruning Eval Paths ==
> Prior to evaluating the depsgraph, whoever is requesting data to be
> evaluated via the depsgraph must register their interest for that data.
> This can be done in the form of registering callbacks for the relevant data
> nodes (as mentioned above). An example of this is the viewport requesting
> geometry and/or material colours for visible objects.
>
> We then "filter" the graph (or do some other processing) so that only the
> paths which lead to these data nodes are included in the graph to execute.
> For example, if we're only interested in the transforms for two specific
> bones (e.g. for plotting their motion paths), we only need to deal with the
> subgraph with paths between any "dirty" nodes and the outputs required.
>
> (In the original depsgraph prototypes I was doing, this stuff was basically
> the whole "querying and filtering" API, which was intended to take the
> original "full" depsgraph you have in the scene, and extract out the set of
> nodes + relations needed for some specific evaluation problem - like
> iteratively evaluating a rig to solve the nasty
> posespace-to-transformchannels inverse problem. The filtered graph would
> then just contain a reduced set of nodes + relations, and could be set off
> to work in a background thread on a separate copy of the data; since it
> doesn't contain unnecessary stuff, we save time trying to prevent
> unnecessary nodes from evaluating.)
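
The filtering step can be sketched as the intersection of two reachability
sets: everything downstream of the "dirty" nodes, and everything upstream of
the requested outputs. This is only one plausible way to do it, written in
Python with a made-up toy graph; the node names and the functions reachable
and prune are assumptions for illustration.

```python
def reachable(start, edges):
    """Return all nodes reachable from `start` (inclusive) via `edges`."""
    seen = set(start)
    stack = list(start)
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def prune(forward_edges, dirty, requested):
    """Keep only nodes on a path from a dirty node to a requested output."""
    # Build the reversed graph so we can walk from outputs to their inputs.
    backward_edges = {}
    for src, dsts in forward_edges.items():
        for dst in dsts:
            backward_edges.setdefault(dst, []).append(src)
    affected = reachable(dirty, forward_edges)      # what has changed
    needed = reachable(requested, backward_edges)   # what is needed
    return affected & needed


# Toy graph: an edge a -> b means "b depends on a".
graph = {
    "time": ["anim"],
    "anim": ["bone_a", "bone_b", "bone_c"],
    "bone_a": ["mesh_deform"],
    "bone_c": ["prop_follow"],
}
to_evaluate = prune(graph, dirty={"time"}, requested={"bone_a", "bone_b"})
```

With only the two bone transforms requested, bone_c, mesh_deform and
prop_follow drop out of the set to evaluate.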
>
> Where does this registration of interest occur? Well, one thing that we've
> learned is that the depsgraph nodes themselves shouldn't be used to store
> data state stuff like this; so that means...
>
>
> == 3) Evaluation Contexts => Data Stores ==
> Disclaimer: IIRC, this seems to be one of the parts where Sergey and I have
> slightly different ideas about what should happen. It is also one of the
> more invasive/work intensive paths forward.
>
> If you've had a look at the evaluation callbacks recently, you'll have
> noticed that we have a thing called "EvaluationContext" (or a similar name)
> which gets passed around. It was introduced as a way to help the various
> update processes distinguish between viewport drawing, preview rendering,
> and proper rendering. Currently, it also gets passed out to all
> OperationDepsNode callbacks as the first arg, laying the foundations for
> the inevitable (since it will be needed in some form for those).
>
> The idea is to extend the usage of this thing, and make it a full-blown
> "data store". All data access/management during evaluation passes through
> it:
>  A) Every time some evaluation process needs some piece of data, it asks
> the EvaluationContext to give it to them. If necessary, we can distinguish
> between requests for "reading" results, and for "writing" results.  -->
> Sergey's "Copy on Write" stuff could fit in here seamlessly, as we'd be
> effectively mediating data access.
>
>  B) By making evaluation operations work on the data provided by the
> EvaluationContext instead of the DNA copies directly, we solve many of the
> concurrency/instancing problems we have now, as we can simply use a
> separate evaluation context for each use case (e.g. one for render, one for
> viewport, one for baking). Of course, we now have to make a special
> exception (or maybe not? it could work via standard interest registration)
> here so that viewport evaluation results get flushed back down to the DNA,
> so that all existing tools continue to work the way they have.
>
>  C) Every time you evaluate the depsgraph, you pass it an Evaluation
> Context object. When creating the evaluation context, you specify what the
> evaluation is for (e.g. render, viewport, baking, background-calculations,
> etc.), register interest in results to get out of the evaluation (e.g.
> geometry for objects 1,2,3 and the main character's rig), as well as any
> callbacks to execute when certain data becomes available (e.g.
> mesh/curve/surface geometry, materials, etc.). In response, the evaluation
> context will create the relevant DataContexts in preparation/anticipation
> for their availability once evaluation has completed.
>
> So, what are these "DataContexts"? What is stored in the EvaluationContext,
> and what about intermediate results?
>   * To support the granular operations we're now performing, we need
> somewhere to store intermediate results used between operations. Usually,
> these intermediate results require a bunch of "related products". We
> encapsulate each clump of such things as a "DataContext".
>
>   * Examples of "DataContexts" are ParametersContext, TransformContext,
> GeometryContext, PoseContext, etc. Basically, for each Component type/node
> in the depsgraph, we have a corresponding DataContext.
>
>   For example, the TransformContext would store the current 4x4 matrix
> ("x-form" to use the Dreamworks terminology), along with the constraint
> stack eval stuff - bConstraintOb, etc.
>
>  Another example is the GeometryContext, which would store the DerivedMesh,
> DispList, and/or Path data.
>
>  * One special class of DataContexts are the TimeSource contexts. These
> store the current frame + subframe => time float/double value. Again, each
> TimeSource context corresponds to one TimeSource node. Therefore, the
> operation on a TimeSourceDepsNode is that it either sets/updates the time
> stored in the timesource (for the primary), or that it computes the time it
> stores (e.g. for the secondary ones used for doing time offset animation,
> etc.)
>
>   * DataContexts usually get created/initialised during the "init"
> operations for each Component. The "ready" or "done" operation for each
> component therefore triggers the "ready to use" callbacks that may have
> been registered against that data. Those callbacks will then receive a
> pointer to the relevant DataContext containing the data they requested to
> be evaluated.
>
>   * If necessary, more than one data context may be created per node (i.e.
> to handle intermediate results that need to be used by two different
> forking evaluation paths). The exact details of how that would work would
> require a bit more thinking...
>
>   * All DataContexts are stored in the EvaluationContext, and are retrieved
> by using some kind of "data access key". This would probably be similar to
> the things we use now for finding nodes in the depsgraph, but will probably
> need additional info to handle separate instances of duplis or other
> dynamically generated stuff.
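
A minimal Python sketch of the data-store idea, assuming a simple tuple-like
access key. The classes DataKey, DataContext and EvaluationContext below are
illustrative stand-ins, not the existing Blender structures, and the key
fields (ID name, component, instance index) are guesses at what such a key
might need.

```python
from dataclasses import dataclass, field
from typing import NamedTuple


class DataKey(NamedTuple):
    id_name: str       # which datablock, e.g. "Cube"
    component: str     # which component, e.g. "GEOMETRY"
    instance: int = 0  # extra qualifier for duplis and generated instances


@dataclass
class DataContext:
    key: DataKey
    payload: dict = field(default_factory=dict)


class EvaluationContext:
    def __init__(self, purpose):
        self.purpose = purpose  # e.g. "render", "viewport", "baking"
        self._contexts = {}
        self._callbacks = {}

    def register_interest(self, key, callback=None):
        # Create the DataContext up front, in anticipation of evaluation.
        self._contexts.setdefault(key, DataContext(key))
        if callback is not None:
            self._callbacks.setdefault(key, []).append(callback)

    def get(self, key):
        return self._contexts[key]

    def mark_ready(self, key):
        # What a component's "ready"/"done" operation would trigger.
        for callback in self._callbacks.get(key, ()):
            callback(self._contexts[key])


received = []
ctx = EvaluationContext("render")
key = DataKey("Cube", "GEOMETRY")
ctx.register_interest(key, received.append)
ctx.get(key).payload["mesh"] = "evaluated mesh"  # written by an operation
ctx.mark_ready(key)
```

Using a separate EvaluationContext per purpose keeps render and viewport
results apart, and the instance field hints at how duplis could get distinct
entries under the same ID.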
>
>
> == 4) Subgraphs and Evaluation Contexts ==
> The general idea is that each subgraph will get its own evaluation context.
> A subgraph is typically something like a background set, or maybe a group
> (which doesn't interact with the rest of the world).
>
> It is also possible to just have subgraphs evaluated in the same evaluation
> context, just that the access keys would need extra qualifications to
> ensure that we're accessing the copies of the data for the subgraphs and
> not the main graph.
>
>
>
> On Fri, Sep 18, 2015 at 2:18 AM, Lukas Tönne <lukas.toenne at gmail.com>
> wrote:
>
> > I want to argue for adding a new concept to the new depsgraph design, which
> > I tentatively call "Data Nodes".
> >
> > The new dependency graph currently supports most of the old depsgraph's
> > functionality and has fixed some issues relating to pose animation, but has
> > not yet started to actually add finer detailed operations or actual new
> > functionality. In fact, some important parts are still missing (e.g.
> > pruning of hidden object evaluation), and some features have been ported
> > from the old implementation in the same awkward fashion (layers, uber-eval
> > nodes).
> >
> > Note for clarity: I will use "dependency graph" to mean the NEW dependency
> > graph, and prefix it with "old" when referring to the old implementation.
> >
> > A logical expectation would be to handle modifier evaluations and later
> > nodes through scheduled operations in the depsgraph. This would mean that a
> > node graph (aka node "tree") is translated into a set of interdependent
> > operations, which are scheduled for threaded execution and can run in
> > parallel where they don't depend on each other. There are provisions in the
> > depsgraph for individual modifier operations, which are currently a simple
> > chain - but these are just stubs and do not yet create actual data.
> >
> > A typical case with nodes is that a lot of the nodes in a graph are not
> > actually used for the final result. Users often create whole branches of
> > nodes that are either muted or not connected to some output node, i.e. they
> > should not be evaluated. The current definition of "dependency" makes it
> > difficult to implement pruning in a consistent way. One might consider
> > actually rebuilding the whole dependency graph every time the set of active
> > nodes is changed and leaving out unused nodes completely. Information on
> > visibility and such would then not need to be stored in nodes at all, since
> > any node in the graph is also used.
> >
> > However, the depsgraph is supposed to be a permanent description of the
> > scene relations that is not rebuilt very often. More importantly: the
> > depsgraph schedules operations for a whole range of situations, like
> > - property changes
> > - layer and object visibility changes
> > - time changes
> > - incremental simulation steps
> > - render database updates
> >
> > All of these situations can have different "used" data (an object may be
> > invisible during renders, simulation steps can apply only to rigid bodies,
> > etc.). Building a specialized depsgraph for each of them is not very
> > desirable.
> >
> > The set of necessary operations to be scheduled is not just defined by
> > "what has changed", but also by "what is needed/visible". Note how both of
> > these labels are written in terms of _data_ rather than operations. In the
> > dependency graph all data is the result of a single operation, so we can
> > just as well use "operation" nodes to represent a piece of data that this
> > operation writes to, like object transformations, a DerivedMesh, render
> > results. The trouble with the strict operation nodes in the depsgraph is
> > that no explicit "end points" for data exist, which tie data to its final
> > purpose (like the derivedFinal mesh in objects). In consequence,
> > backtracking used-ness of operations based on final visibility is not
> > possible without fully rebuilding the depsgraph.
> >
> > Besides lacking such "data nodes", the current way of forward-propagating
> > ("flushing") evaluation tags through the dependency nodes also needs to be
> > augmented by a backward-propagated "used node" set. Currently, a node is
> > always scheduled if any of its parent nodes is scheduled (i.e. some input
> > data has changed), but if a child node isn't actually used the parents will
> > still be scheduled. This wasn't a problem in the old depsgraph, because of
> > the coarse nodes which could store a coherent visibility flag for each
> > individual node. With the finer resolution in the new depsgraph and the
> > expected differentiation in evaluation cases the problem becomes more
> > apparent.
> >
> > Finally, the addition of explicit data nodes could solve a big design
> > problem with generated data in Blender: All the current depsgraph
> > operations use _implicit_ references to data in the DNA for storing runtime
> > results (mostly overriding obmat and the derivedFinal mesh result in
> > Object). Data nodes could help manage transient runtime data. The operation
> > scheduling state can be used to manage the lifetime of such data
> > efficiently, to avoid overhead from keeping unnecessary buffers allocated.
> > Multiple variants of objects (branching operations) are possible if data is
> > not glued to DNA instances.
> > _______________________________________________
> > Bf-committers mailing list
> > Bf-committers at blender.org
> > http://lists.blender.org/mailman/listinfo/bf-committers