[Bf-committers] Data Nodes: Extending the New Depsgraph Design

Lukas Tönne lukas.toenne at gmail.com
Thu Sep 17 16:18:13 CEST 2015

I want to argue for adding a new concept to the new depsgraph design, which
I tentatively call "Data Nodes".

The new dependency graph currently supports most of the old depsgraph's
functionality and has fixed some issues relating to pose animation, but it
has not yet started to add finer-grained operations or genuinely new
functionality. In fact, some important parts are still missing (e.g.
pruning of hidden-object evaluation), and some features have been ported
from the old implementation in the same awkward fashion (layers, uber-eval
operations).

Note for clarity: I will use "dependency graph" to mean the NEW dependency
graph, and prefix it with "old" when referring to the old implementation.

A logical expectation would be to handle modifier evaluation, and later
node systems in general, through scheduled operations in the depsgraph.
This would mean that a node graph (aka node "tree") is translated into a
set of interdependent operations, which are scheduled for threaded
execution and can run in parallel where they don't depend on each other.
The depsgraph already has provisions for individual modifier operations,
currently a simple chain - but these are just stubs and do not yet create
actual data.
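To make the idea concrete, here is a minimal sketch (not actual Blender code; all names are hypothetical) of translating interdependent operations into an execution order. A real scheduler would hand ready operations to a thread pool so that independent operations run in parallel; this serial version uses Kahn's algorithm to show the dependency logic only:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <vector>

// Hypothetical operation node: a callback plus explicit dependencies.
struct Operation {
    std::string name;
    std::vector<size_t> depends_on;  // indices of parent operations
    std::function<void()> exec;
};

// Kahn's algorithm: repeatedly run operations whose parents are all done.
// In a threaded scheduler, every operation in the "ready" set at any
// moment could be executed concurrently.
inline std::vector<std::string> run_operations(std::vector<Operation> &ops) {
    std::vector<size_t> pending(ops.size());
    std::vector<std::vector<size_t>> children(ops.size());
    std::queue<size_t> ready;
    for (size_t i = 0; i < ops.size(); i++) {
        pending[i] = ops[i].depends_on.size();
        for (size_t p : ops[i].depends_on) children[p].push_back(i);
        if (pending[i] == 0) ready.push(i);
    }
    std::vector<std::string> order;
    while (!ready.empty()) {
        size_t i = ready.front();
        ready.pop();
        if (ops[i].exec) ops[i].exec();
        order.push_back(ops[i].name);
        for (size_t c : children[i])
            if (--pending[c] == 0) ready.push(c);
    }
    return order;
}
```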

A typical case with nodes is that a lot of the nodes in a graph are not
actually used for the final result. Users often create whole branches of
nodes that are either muted or not connected to some output node, i.e. they
should not be evaluated. The current definition of "dependency" makes it
difficult to implement such pruning consistently. One might consider
rebuilding the whole dependency graph every time the set of active nodes
changes, leaving unused nodes out entirely. Information on visibility and
the like would then not need to be stored in nodes at all, since any node
present in the graph would, by construction, also be used.
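The pruning in question amounts to a backward reachability walk: only nodes reachable from an output along input links contribute to the result. A minimal sketch (hypothetical types, not Blender's API) of marking the used set without rebuilding anything:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node: input links plus a mute flag. Muted branches and
// nodes not connected to any output are simply never reached.
struct GraphNode {
    std::vector<size_t> inputs;  // indices of nodes this node reads from
    bool muted = false;
};

// Walk backwards from the output nodes; everything visited is "used".
inline std::vector<bool> mark_used(const std::vector<GraphNode> &nodes,
                                   const std::vector<size_t> &outputs) {
    std::vector<bool> used(nodes.size(), false);
    std::vector<size_t> stack(outputs.begin(), outputs.end());
    while (!stack.empty()) {
        size_t i = stack.back();
        stack.pop_back();
        if (used[i] || nodes[i].muted) continue;
        used[i] = true;
        for (size_t in : nodes[i].inputs) stack.push_back(in);
    }
    return used;
}
```

This treats a muted node as cutting off its whole upstream branch, which is a simplification; an implementation would more likely pass data through muted nodes.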

However, the depsgraph is supposed to be a permanent description of the
scene relations that is not rebuilt very often. More importantly: the
depsgraph schedules operations for a whole range of situations, like
- property changes
- layer and object visibility changes
- time changes
- incremental simulation steps
- render database updates

All of these situations can have different "used" data (an object may be
invisible during renders, simulation steps can apply only to rigid bodies,
etc.). Building a specialized depsgraph for each of these cases is not
practical.

The set of necessary operations to be scheduled is not just defined by
"what has changed", but also by "what is needed/visible". Note how both of
these labels are written in terms of _data_ rather than operations. In the
dependency graph all data is the result of a single operation, so we can
just as well use "operation" nodes to represent a piece of data that this
operation writes to, like object transformations, a DerivedMesh, render
results. The trouble with strict operation nodes in the depsgraph is
that no explicit "end points" for data exist to tie data to its final
purpose (like the derivedFinal mesh in objects). As a consequence,
tracing the used-ness of operations backwards from final visibility is
not possible without fully rebuilding the depsgraph.

Besides lacking such "data nodes", the current way of forward-propagating
("flushing") evaluation tags through the dependency nodes also needs to be
augmented by a backward-propagated "used node" set. Currently, a node is
always scheduled if any of its parent nodes is scheduled (i.e. some input
data has changed), but if a child node isn't actually used the parents will
still be scheduled. This wasn't a problem in the old depsgraph, because
its coarse nodes could each store a coherent visibility flag. With the
finer resolution of the new depsgraph and the expected differentiation of
evaluation cases, the problem becomes more pressing.
Finally, the addition of explicit data nodes could solve a big design
problem with generated data in Blender: All the current depsgraph
operations use _implicit_ references to data in the DNA for storing runtime
results (mostly overriding obmat and the derivedFinal mesh result in
Object). Data nodes could help manage transient runtime data. The operation
scheduling state can be used to manage the lifetime of such data
efficiently, to avoid overhead from keeping unnecessary buffers allocated.
Multiple variants of objects (branching operations) are possible if data is
not glued to DNA instances.
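One way such lifetime management could look (a hypothetical sketch under the assumptions above, not an existing Blender mechanism): the data node owns its result buffer and counts the pending consumer operations known from the schedule, freeing the buffer as soon as the last reader is done rather than keeping a derivedFinal-style result attached to DNA for the whole frame:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical data node owning a transient runtime result. The number of
// pending readers comes from the operation schedule, so the buffer can be
// released eagerly once every consumer has run.
struct DataNode {
    std::unique_ptr<std::vector<float>> buffer;  // transient result data
    int pending_readers = 0;

    // Called by the producing operation.
    void write(std::vector<float> data, int num_readers) {
        buffer = std::make_unique<std::vector<float>>(std::move(data));
        pending_readers = num_readers;
    }
    // Called by each consumer operation when it has finished reading.
    void release() {
        if (--pending_readers <= 0) buffer.reset();  // free eagerly
    }
    bool alive() const { return buffer != nullptr; }
};
```

Because the result is not glued to a DNA instance, nothing prevents two branching operations from producing two independent DataNode variants of the same object.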
