[Bf-cycles] Cycles contributor meeting notes

Brecht Van Lommel brechtvanlommel at pandora.be
Sat Nov 5 02:04:30 CET 2016

Hi all,

Here are my notes from the Cycles contributor meeting at the Blender
Conference. Feel free to add or correct things that I missed or got
wrong.

We created a roadmap, which I've added in the wiki here. If there are
other things being worked on we can still add them there; this is
mostly based on what we discussed at the meeting.



Generally not a lot of complaints about the contribution process.

Code reviews can be a bottleneck, especially since it has been mostly
Sergey and Brecht doing them. It would be great if other developers
helped out with code reviews more.

We don't want to drop quality standards though; we'd still like to stay
relatively strict and not accept half-finished features.

Code documentation and the wiki are lacking, we'll try to update them
in the coming weeks. Stefan Werner reports that lack of documentation
was not a real problem for him in integration. This may not be true
for everyone though.


We could use more regression tests, but also automated performance
tests. First step would be to make a script that lets developers do
more automated benchmarking locally.
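
As a rough illustration of that first step, a local benchmarking
script could start out as small as the sketch below. It is not an
existing Cycles tool: the names time_command and summarize are made up
for this example, and the command being timed is a placeholder that
would be replaced by an actual render invocation.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Run cmd `repeats` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True makes a failed render abort the benchmark instead of
        # silently producing a misleadingly short timing.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of timings to the numbers worth comparing between builds."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    # Placeholder command (assumption): swap in the real render command line.
    cmd = [sys.executable, "-c", "pass"]
    print(summarize(time_command(cmd)))
```

Running the same script against two builds and comparing the median
would already give a first, if crude, regression check.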

We would also like to have this running every night, posting stats and
graphs so we can see regressions and progression. Also would like to
run this on as many different GPUs and operating systems as possible
to catch GPU issues sooner.

This would help speed up code reviews, since a lot of that now is
performance testing.

The Blender Institute would be the logical place to put servers for
this, but the issue is that this effectively means Sergey would have
to set up and maintain the hardware, and he's busy enough already. If
we could find a way to let someone else do this that would be great.


The Poser integration is using this already, with some modifications;
no one else seems to be using it in production. They might be able to
contribute back changes to Cycles master to make this more complete.
Would like to be able to write out scenes exported from host
application, and then read them back, in a way that's mostly

The file format itself doesn't matter too much, we might switch to
something other than XML if needed. But as long as the Cycles API can
read/write it, it's not a big deal.

There is no immediate need for this in Blender, but for exchanging
between applications and standalone this would be interesting.


A file format for exchanging materials would be useful. Some
possibilities are:

* Cycles XML: covers all data we need with no conversions, easiest if
we only need it for Cycles.
* NVidia MDL: already mature and comes with a material library, but no
open source implementation.
* MaterialX: will be open source, but is still being developed.


Tangent Animation and Gottfried mention artistic control being
important, being able to do things in a renderer that you can't do in

Light linking is one part of that, we will review the patch and get it
into master. The patch needs to be updated for Blender 2.78 first,
however.

An even more flexible solution would be ray tagging, though it is
incompatible with bidirectional path tracing. We probably want to
avoid adding tricks that are incompatible with that, but can analyze
each specific use case. Often there may exist a different solution
that is compatible.


We need a more flexible AOV render API in Blender, for Cycles and
other external renderers. Once that exists, implementing light groups
becomes a lot easier. Light groups are a feature wanted by Tangent
Animation.

Lukas has done initial work on both the AOV render API in Blender and
light groups. There are still some design issues to work out for the
AOV render API.


We think this will only get more important as time goes on. Especially
for still images it would now already be useful.

However for animation, it's not really solving problems that the
Blender Institute or Tangent Animation have now. There are more
important optimizations that can still be done; render times are
already long enough for typical Blender project budgets.

It would be good to incrementally work towards this, refactor some
code like light or BSDFs to be compatible, maybe do some testing with
a prototype. But unlikely to be a big priority for developers present
at the meeting now.


This is an important target for Tangent Animation. In the Poser
integration there already exists an alternate texture system backend
that uses OpenImageIO and includes some SVM modifications. We could
cooperate to get this finished and into Cycles master.

Motivation: users have very high resolution textures and don't want to
have to worry about scaling them down, with a texture cache that's

The OpenImageIO texture system only works on the CPU, so a GPU
implementation would come later.

For optimal texture cache performance, we need good ray
differentials, and they seem to be somewhat broken now.


UDIM seems to be the most requested. It is already implemented in
OpenImageIO, so we would get this for free if we use the OpenImageIO
texture system (though again CPU only).
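
For readers unfamiliar with UDIM, the numbering convention itself is
simple: tiles are numbered from 1001, ten tiles per row in U. The
sketch below only illustrates that convention; the function names and
the example file name are made up here, and the <UDIM> placeholder is
the commonly used file name token, not something specific to Cycles.

```python
import math


def udim_tile(u, v):
    """Return the UDIM tile number containing UV coordinate (u, v)."""
    # UDIM lays out at most 10 tiles per row in U; V is unbounded upwards.
    if not (0.0 <= u < 10.0) or v < 0.0:
        raise ValueError("UV coordinate is outside the UDIM tile range")
    return 1001 + math.floor(u) + 10 * math.floor(v)


def udim_filename(pattern, u, v):
    """Substitute the tile number for the <UDIM> token in a file name pattern."""
    return pattern.replace("<UDIM>", str(udim_tile(u, v)))


if __name__ == "__main__":
    # Hypothetical texture name, used only for illustration.
    print(udim_filename("wood.<UDIM>.tx", 1.5, 1.5))
```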

For Blender this also requires changes to the UV editor and viewport
drawing; the Cycles side changes would be smaller.

Ptex seems to be not as popular anymore, maybe UDIM won as the standard?


Still a lot of work to do to make this part of Cycles production
ready. Rendering OpenVDB would be great for rendering simulations from
e.g. Houdini.

In terms of performance and noise there are also still many
optimizations possible, not much work has been done here so far. There
are also some artifacts to solve still, precision issues with
non-watertight intersections, ...


For production there is a real need to be able to analyze performance.
Lots of things we could do here:
* Memory usage stats
* Performance counters
* Time based sampling (but how to scale it for a long running render?)
* HTML reports or display in Blender
* In principle can all be done with lower overhead compared to total
render time so it can be always on
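
On the open question of scaling time based sampling for long renders,
one possible approach (not something decided at the meeting) is to
keep a fixed size sample buffer and halve its resolution whenever it
fills up. The sketch below is only meant to show the idea; the class
name DecimatingSampler and its interface are hypothetical.

```python
class DecimatingSampler:
    """Keep at most `capacity` evenly spaced samples of a value over time.

    tick() is called once per sampling interval. When the buffer is full,
    every other sample is dropped and the sampling stride doubles, so memory
    use stays constant no matter how long the render runs.
    """

    def __init__(self, capacity=8):
        if capacity < 2:
            raise ValueError("capacity must be at least 2")
        self.capacity = capacity
        self.stride = 1  # record one sample every `stride` ticks
        self.ticks = 0
        self.samples = []  # list of (tick index, value)

    def tick(self, value):
        if self.ticks % self.stride == 0:
            if len(self.samples) == self.capacity:
                # Buffer full: halve the resolution, double the stride.
                self.samples = self.samples[::2]
                self.stride *= 2
            # Re-check, since the stride may just have changed.
            if self.ticks % self.stride == 0:
                self.samples.append((self.ticks, value))
        self.ticks += 1


if __name__ == "__main__":
    sampler = DecimatingSampler(capacity=4)
    for i in range(17):
        sampler.tick(i)  # the value could be e.g. memory usage
    print(sampler.stride, sampler.samples)
```

The trade-off is that early parts of a long render end up with the
same coarse resolution as the rest, which seems acceptable for a
report covering the whole render.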


Being worked on by Mai. Now already working on the CPU as well. It's a
bit slower than megakernel, probably due to extra memory access.

Besides improving AMD GPU rendering, the split kernel also makes it
possible to implement some interesting things on all devices:
wavefront path tracing, ray reordering, packet tracing, different ray
tracing kernels, better cache coherence for micropolygons or textures,
...

These can make up for the extra overhead from the split kernel
compared to the megakernel, and improve performance overall.


It is difficult for Cycles to keep up with all the optimizations that
Intel, AMD and NVidia are doing in their raytracing libraries. So far
we have been maintaining our own BVH and ray tracing code and
integrating code from other libraries.

But we are behind in motion blur performance, taking advantage of wide
SIMD on CPUs, performance on AMD cards, and possibly other areas.

A big reason for that is that we support CPU, CUDA, OpenCL (and maybe
Metal in the future?), and there exists no library that covers all
those targets. The split kernel will make dropping in external ray
tracing libraries a bit easier. Stefan may investigate this for the
Poser integration.

It is still unclear whether this is something we could do for Blender
Cycles, due to maintenance and complexity issues; we might also just
try to copy code. But it would be interesting to try nonetheless, to
measure performance differences and see how well it fits.


How much time do we need to spend getting features to work on the GPU
versus adding more features on the CPU?

Generally we want all features to work on all devices, but we
shouldn't block features just because they are ready for the CPU but
have no GPU implementation yet. It is better to accept them anyway and
incrementally support more features on the GPU.


It appears unlikely that Apple will upgrade their OpenCL version in
the near future, and AMD can't upgrade it on their own. Moving forward
this will only get more and more outdated.

RadeonRays may add a Metal backend. We may need to do the same for
Cycles eventually, though no concrete plans here yet. Metal compute
also has some limitations still, but that seems to be improving.


Philipp Slusallek mentioned AnyDSL, which his research group is working
on. It provides an abstraction for CPU / GPU, where high level code
can compile down to highly optimized code for each device, even
outperforming hand tuned code:

They also have a wavefront path tracer implemented using this system.
We might be able to cooperate in one way or another. They might be
able to test their research by integrating it into Cycles and get some
better feedback about performance in real world scenes, we may be able
to adopt it or use ideas from it.

This is still quite speculative, and it is unclear if this research
project is something we could rely on being maintained long enough to
depend on for Cycles, but it's an interesting technology that I wish I
had when I started writing Cycles.


Adding a shader requires changes to too many files and code is
duplicated for SVM, OSL, GLSL. There is some code reorganization we
could do to reduce the number of files to modify. However to solve the
duplication entirely we'd need some major macro magic or a custom
intermediate compiler or code generator that abstracts the different

It is unclear how to tackle this in the near term, maybe some
incremental improvements are possible.


The Blender viewport project also has some implications for Cycles,
Dalai will make a proposal for that.
