[Bf-committers] Data Nodes: Extending the New Depsgraph Design

Fri Sep 18 02:53:42 CEST 2015

Hi Lukas,

Many of the points you've raised here are indeed things I've thought about
at some stage, and mostly agree with.

Here's the scheme I was originally thinking of (but which we haven't
implemented yet, for timing/project management reasons: the priority was to
get an initial version working first).   Warning: LONG TEXT AHEAD!

Regards,
Joshua

-----------------------------------------------------------

== 1) Data Ready Nodes ==
Similar to your data nodes, we have some explicit nodes in the graph which
signal when certain geometry blobs are "ready to use". Render engines and
the viewport drawing code could register handlers (perhaps as graph nodes,
or otherwise as plain callbacks) which get called when these "ready to use"
nodes are reached. The idea is that as soon as some data becomes available,
downstream users can pull that geometry and start working with it.

This push-based scheme is partly motivated by the need to have some way to
keep memory usage down in geometry-heavy scenes, and particularly for the
duplicators/particle instancing cases. By having a way of allowing render
engines to take each piece of geometry as it becomes available, we can
minimise the amount of geometry that needs to be concurrently stored as
both Blender-side data and render-converted data. That is, instead of
having to convert the entire scene database to tessellated geometry that the
render engine then takes and converts into its own internal format (i.e.
we now have 2x the data), we only have the Blender-tessellated stuff around
for as long as it takes for the render engine to convert it into whatever
format it needs.
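
A minimal sketch of the push-based idea in Python, purely for illustration.
This is not Blender code: ReadyNotifier, ToyRenderEngine and evaluate_scene
are hypothetical names, and a list of tuples stands in for tessellated
geometry. The point it tries to show is that the Blender-side copy of each
object only lives until the registered handlers have converted it.

```python
class ReadyNotifier:
    """Hypothetical registry of 'ready to use' handlers, keyed by data name."""

    def __init__(self):
        self._handlers = {}

    def register(self, key, handler):
        self._handlers.setdefault(key, []).append(handler)

    def signal_ready(self, key, geometry):
        # Push the freshly evaluated geometry to every interested consumer.
        for handler in self._handlers.get(key, ()):
            handler(key, geometry)


class ToyRenderEngine:
    """Stand-in for a render engine that keeps geometry in its own format."""

    def __init__(self):
        self.database = {}

    def on_geometry_ready(self, key, geometry):
        # "Convert" to the engine's internal format (here: an immutable tuple).
        self.database[key] = tuple(geometry)


def evaluate_scene(objects, notifier):
    """Evaluate objects one at a time, handing each one off as it is ready."""
    for name, vertex_count in objects:
        tessellated = [(0.0, 0.0, 0.0)] * vertex_count  # Blender-side data
        notifier.signal_ready(name, tessellated)
        del tessellated  # consumers are done; drop the Blender-side copy


engine = ToyRenderEngine()
notifier = ReadyNotifier()
notifier.register("Cube", engine.on_geometry_ready)
evaluate_scene([("Cube", 8), ("Lamp", 0)], notifier)
```

Only "Cube" ends up in the engine's database here, since nobody registered a
handler for "Lamp".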

== 2) Registering Interest + Pruning Eval Paths ==
Prior to evaluating the depsgraph, whoever is requesting data to be
evaluated via the depsgraph must register their interest for that data.
This can be done in the form of registering callbacks for the relevant data
nodes (as mentioned above). An example of this is the viewport requesting
geometry and/or material colours for visible objects.

We then "filter" the graph (or do some other processing) so that only the
paths which lead to these data nodes are included in the graph to execute.
For example, if we're only interested in the transforms for two specific
bones (e.g. for plotting their motion paths), we only need to deal with the
subgraph with paths between any "dirty" nodes and the outputs required.
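
One way to read this filtering step, sketched in Python under the assumption
that the graph is just a list of (source, destination) relations: keep the
nodes that are both downstream of something dirty and upstream of a
requested output. The function names and the node names in the example are
made up for illustration.

```python
def reachable(starts, edges):
    """Return every node reachable from `starts` by following `edges`."""
    seen = set(starts)
    stack = list(starts)
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def filter_graph(relations, dirty, outputs):
    """Keep only nodes lying on a path from a dirty node to a requested output."""
    forward, backward = {}, {}
    for src, dst in relations:
        forward.setdefault(src, []).append(dst)
        backward.setdefault(dst, []).append(src)
    downstream_of_dirty = reachable(dirty, forward)
    upstream_of_outputs = reachable(outputs, backward)
    keep = downstream_of_dirty & upstream_of_outputs
    kept_relations = [(s, d) for s, d in relations if s in keep and d in keep]
    return keep, kept_relations


relations = [
    ("time", "anim"),
    ("anim", "bone_a"),
    ("anim", "bone_b"),
    ("bone_a", "bone_c"),
    ("bone_b", "mesh_deform"),
    ("lamp_color", "lamp"),
]
nodes, kept = filter_graph(relations, dirty={"time"}, outputs={"bone_a", "bone_b"})
```

With "time" dirty and only the two bones requested, bone_c, mesh_deform and
the lamp nodes drop out of the subgraph to execute.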

(In the original depsgraph prototypes I was doing, this stuff was basically
the whole "querying and filtering" API, which was intended to take the
original "full" depsgraph you have in the scene, and extract out the set of
nodes + relations needed for some specific evaluation problem - like
iteratively evaluating a rig to solve the nasty
posespace-to-transformchannels inverse problem. The filtered graph would
then just contain a reduced set of nodes + relations, and could be set off
to work in a background thread on a separate copy of the data; since it
doesn't contain unnecessary stuff, we save time trying to prevent
unnecessary nodes from evaluating.)

Where does this registration of interest occur? Well, one thing that we've
learned is that the depsgraph nodes themselves shouldn't be used to store
data state stuff like this; so that means...

== 3) Evaluation Contexts => Data Stores ==
Disclaimer: IIRC, this seems to be one of the parts where Sergey and I have
slightly different ideas about what should happen. It is also one of the
more invasive/work intensive paths forward.

If you've had a look at the evaluation callbacks recently, you'll have
noticed that we have a thing called "EvaluationContext" (or a similar name)
which gets passed around. It was introduced as a way to help the various
update processes distinguish between viewport drawing, preview rendering,
and proper rendering. Currently, it also gets passed out to all
OperationDepsNode callbacks as the first arg, laying the foundations for
the inevitable (since it will be needed in some form for those).

The idea is to extend the usage of this thing, and make it a full-blown
"data store". All data access/management during evaluation passes through
it:
 A) Every time some evaluation process needs some piece of data, it asks
the EvaluationContext for it. If necessary, we can distinguish between
requests for "reading" results, and for "writing" results. Sergey's
"Copy on Write" stuff could fit in here seamlessly, as we'd be
effectively mediating data access.

 B) By making evaluation operations work on the data provided by the
EvaluationContext instead of the DNA copies directly, we solve many of the
concurrency/instancing problems we have now, as we can simply use a
separate evaluation context for each use case (e.g. one for render, one for
viewport, one for baking). Of course, we now have to make a special
exception (or maybe not? it could work via standard interest registration)
here so that viewport evaluation results get flushed back down to the DNA,
so that all existing tools continue to work the way they have.

 C) Every time you evaluate the depsgraph, you pass it an Evaluation
Context object. When creating the evaluation context, you specify what the
evaluation is for (e.g. render, viewport, baking, background-calculations,
etc.), register interest in results to get out of the evaluation (e.g.
geometry for objects 1,2,3 and the main character's rig), as well as any
callbacks to execute when certain data becomes available (e.g.
mesh/curve/surface geometry, materials, etc.). In response, the evaluation
context will create the relevant DataContexts in preparation/anticipation
for their availability once evaluation has completed.
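
To make A) to C) a bit more concrete, here is a rough Python sketch. It is
only an illustration under assumed names: this EvaluationContext class, its
register_interest/read/write/mark_ready methods, the evaluate() helper and
the "subdiv_levels" value are all hypothetical and do not correspond to
existing Blender API. Plain dicts stand in for the DataContexts.

```python
class EvaluationContext:
    """Hypothetical data store; all evaluation reads/writes go through it."""

    def __init__(self, purpose):
        self.purpose = purpose      # e.g. "render", "viewport", "baking"
        self.data_contexts = {}     # key -> dict standing in for a DataContext
        self.callbacks = {}         # key -> list of "ready to use" callbacks

    def register_interest(self, key, callback=None):
        # Create the placeholder up front, in anticipation of the results.
        self.data_contexts.setdefault(key, {})
        if callback is not None:
            self.callbacks.setdefault(key, []).append(callback)

    def write(self, key, name, value):
        self.data_contexts.setdefault(key, {})[name] = value

    def read(self, key, name):
        return self.data_contexts[key][name]

    def mark_ready(self, key):
        for callback in self.callbacks.get(key, ()):
            callback(key, self.data_contexts[key])


GEOMETRY_KEY = ("Cube", "geometry")


def eval_geometry(ctx):
    # The operation never touches DNA; it only talks to the context it is given.
    levels = 3 if ctx.purpose == "render" else 1
    ctx.write(GEOMETRY_KEY, "subdiv_levels", levels)


def evaluate(operations, ctx):
    """Run (key, operation) pairs in order, signalling readiness after each."""
    for key, operation in operations:
        operation(ctx)
        ctx.mark_ready(key)


viewport = EvaluationContext("viewport")
render = EvaluationContext("render")
viewport.register_interest(GEOMETRY_KEY)
render.register_interest(GEOMETRY_KEY)
evaluate([(GEOMETRY_KEY, eval_geometry)], viewport)
evaluate([(GEOMETRY_KEY, eval_geometry)], render)
```

The same operation is run twice, once per context, and the two results never
collide because each context owns its own copy of the data.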

So, what are these "DataContexts"? What is stored in the EvaluationContext,
and what about intermediate results?
  * To support the granular operations we're now performing, we need
somewhere to store intermediate results used between operations. Usually,
these intermediate results require a bunch of "related products". We
encapsulate each clump of such things as a "DataContext".

  * Examples of "DataContexts" are ParametersContext, TransformContext,
GeometryContext, PoseContext, etc. Basically, for each Component type/node
in the depsgraph, we have a corresponding DataContext.

  For example, the TransformContext would store the current 4x4 matrix
("x-form" to use the Dreamworks terminology), along with the constraint
stack eval stuff - bConstraintOb, etc.

  Another example is the GeometryContext, which would store the DerivedMesh,
DispList, and/or Path data.

  * One special class of DataContexts are the TimeSource contexts. These
store the current frame + subframe => time float/double value. Again, each
TimeSource context corresponds to one TimeSource node. The operation on a
TimeSourceDepsNode therefore either sets/updates the time stored in the
TimeSource context (for the primary one), or computes the time it stores
(e.g. for the secondary ones used for doing time offset animation, etc.)

  * DataContexts usually get created/initialised during the "init"
operations for each Component. The "ready" or "done" operation for each
component therefore triggers the "ready to use" callbacks that may have
been registered against that data. Those callbacks will then receive a
pointer to the relevant DataContext containing the data they requested to
be evaluated.

  * If necessary, more than one data context may be created per node (i.e.
to handle intermediate results that need to be used by two different
forking evaluation paths). The exact details of how that would work would
require a bit more thinking...

  * All DataContexts are stored in the EvaluationContext, and are retrieved
by using some kind of "data access key". This would probably be similar to
the things we use now for finding nodes in the depsgraph, but will probably
need additional info to handle separate instances of duplis or other
dynamically generated stuff.
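
A small Python sketch of how such keyed DataContexts might look. Everything
here is an assumption made for illustration: the DataKey fields (including
the "instance" tuple used to tell dupli instances apart), the DataStore
class, and the eval_offset_time() operation for a secondary time source are
hypothetical, not existing Blender structures.

```python
import math
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass(frozen=True)
class DataKey:
    """Hypothetical access key: ID name + component (+ dupli instance path)."""
    id_name: str
    component: str
    instance: Optional[Tuple[int, ...]] = None


def identity_4x4():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


@dataclass
class TransformContext:
    matrix: list = field(default_factory=identity_4x4)


@dataclass
class TimeSourceContext:
    frame: int = 0
    subframe: float = 0.0

    @property
    def time(self):
        return self.frame + self.subframe


class DataStore:
    """Stand-in for the part of the EvaluationContext that owns DataContexts."""

    def __init__(self):
        self._contexts = {}

    def init(self, key, data_context):
        # Would be called from a component's "init" operation.
        self._contexts[key] = data_context
        return data_context

    def get(self, key):
        return self._contexts[key]


def eval_offset_time(store, primary_key, secondary_key, offset):
    """Secondary time source: compute its time from the primary plus an offset."""
    shifted = store.get(primary_key).time + offset
    secondary = store.get(secondary_key)
    secondary.frame = math.floor(shifted)
    secondary.subframe = shifted - secondary.frame
```

Two dupli instances of the same object would then get separate
TransformContexts simply by differing in the "instance" part of the key,
e.g. DataKey("Leaf", "transform", (0,)) and DataKey("Leaf", "transform", (1,)).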

== 4) Subgraphs and Evaluation Contexts ==
The general idea is that each subgraph will get its own evaluation context.
A subgraph is typically something like a background set, or maybe a group
(which doesn't interact with the rest of the world).

It is also possible to have subgraphs evaluated in the same evaluation
context as the main graph. In that case, the access keys would need extra
qualifications to ensure that we're accessing the copies of the data for
the subgraphs and not the main graph.
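
The two options can be contrasted with a tiny Python sketch. Plain dicts
stand in for evaluation contexts, and make_key() with its optional
"subgraph" qualifier is a hypothetical helper, not an existing API.

```python
def make_key(id_name, component, subgraph=None):
    """Build an access key, optionally qualified by the owning subgraph."""
    if subgraph is None:
        return (id_name, component)
    return (subgraph, id_name, component)


# Option 1: one evaluation context per subgraph, so plain keys are enough.
contexts = {"main": {}, "background_set": {}}
contexts["main"][make_key("Tree", "transform")] = "main result"
contexts["background_set"][make_key("Tree", "transform")] = "background result"

# Option 2: one shared context, so keys carry the subgraph as a qualifier.
shared = {}
shared[make_key("Tree", "transform", subgraph="main")] = "main result"
shared[make_key("Tree", "transform", subgraph="background_set")] = "background result"
```

Either way, the "Tree" evaluated for the background set never overwrites the
one evaluated for the main graph.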

On Fri, Sep 18, 2015 at 2:18 AM, Lukas Tönne <lukas.toenne at gmail.com> wrote:

> I want to argue for adding a new concept to the new depsgraph design, which
> i tentatively call "Data Nodes".
>
> The new dependency graph currently supports most of the old depsgraph's
> functionality and has fixed some issues relating to pose animation, but has
> not yet started to actually add finer detailed operations or actual new
> functionality. In fact, some important parts are still missing (e.g.
> pruning of hidden object evaluation), and some features have been ported
> from the old implementation in the same awkward fashion (layers, uber-eval
> nodes).
>
> Note for clarity: I will use "dependency graph" to mean the NEW dependency
> graph, and prefix it with "old" when referring to the old implementation.
>
> A logical expectation would be to handle modifier evaluations and later
> nodes through scheduled operations in the depsgraph. This would mean that a
> node graph (aka node "tree") is translated into a set of interdependent
> operations, which are scheduled for threaded execution and can run in
> parallel where they don't depend on each other. There are provisions in the
> depsgraph for individual modifier operations, which are currently a simple
> chain - but these are just stubs and do not yet create actual data.
>
> A typical case with nodes is that a lot of the nodes in a graph are not
> actually used for the final result. Users often create whole branches of
> nodes that are either muted or not connected to some output node, i.e. they
> should not be evaluated. The current definition of "dependency" makes it
> difficult to implement pruning in a consistent way. One might consider to
> actually rebuild the whole dependency graph every time the set of active
> nodes is changed and completely leave away unused nodes. Information on
> visibility and such would not need to be stored in nodes at all, any node
> in the graph is also used.
>
> However, the depsgraph is supposed to be a permanent description of the
> scene relations that is not rebuilt very often. More importantly: the
> depsgraph schedules operations for a whole range of situations, like
> - property changes
> - layer and object visibility changes
> - time changes
> - incremental simulation steps
> - render database updates
>
> All of these situations can have different "used" data (an object may be
> invisible during renders, simulation steps can apply only to rigid bodies,
> etc.). Building a specialized depsgraph for each of them is not very
> desirable.
>
> The set of necessary operations to be scheduled is not just defined by
> "what has changed", but also by "what is needed/visible". Note how both of
> these labels are written in terms of _data_ rather than operations. In the
> dependency graph all data is the result of a single operation, so we can
> just as well use "operation" nodes to represent a piece of data that this
> operation writes to, like object transformations, a DerivedMesh, render
> results. The trouble with the strict operation nodes in the depsgraph is
> that no explicit "end points" for data exist, which tie data to its final
> purpose (like the derivedFinal mesh in objects). In consequence,
> backtracking used-ness of operations based on final visibility is not
> possible without fully rebuilding the depsgraph.
>
> Beside lacking such "data nodes", the current way of forward-propagating
> ("flushing") evaluation tags through the dependency nodes also needs to be
> augmented by a backward-propagated "used node" set. Currently, a node is
> always scheduled if any of its parent nodes is scheduled (i.e. some input
> data has changed), but if a child node isn't actually used the parents will
> still be scheduled. This wasn't a problem in the old depsgraph, because of
> the coarse nodes which could store a coherent visibility flag for each
> individual node. With the finer resolution in the new depsgraph and the
> expected differentiation in evaluation cases the problem becomes more
> apparent.
>
> Finally, the addition of explicit data nodes could solve a big design
> problem with generated data in Blender: All the current depsgraph
> operations use _implicit_ references to data in the DNA for storing runtime
> results (mostly overriding obmat and the derivedFinal mesh result in
> Object). Data nodes could help manage transient runtime data. The operation
> scheduling state can be used to manage the lifetime of such data
> efficiently, to avoid overhead from keeping unnecessary buffers allocated.
> Multiple variants of objects (branching operations) are possible if data is
> not glued to DNA instances.
> _______________________________________________
> Bf-committers mailing list
> Bf-committers at blender.org
> http://lists.blender.org/mailman/listinfo/bf-committers
>