[Bf-cycles] Micropolygon Displacement in Cycles

Nathan Vegdahl cessen at cessen.com
Fri Jun 26 03:58:00 CEST 2015


Hi Mai,

> There are variations on an LRU cache that work better with multi-threading
> which I'd like to try.

Good point.  I didn't go much beyond a basic LRU cache design.
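
To sketch the kind of variation you might mean (a hypothetical Python
sketch, not anything from Cycles or Psychopath), one common approach is
lock striping: split the cache into independently locked shards so
threads rarely contend on the same lock:

```python
from collections import OrderedDict
from threading import Lock

class ShardedLRUCache:
    """Lock-striped LRU: entries are hashed into shards, each with its
    own lock, so threads only contend when they touch the same shard."""

    def __init__(self, capacity, num_shards=16):
        self.per_shard = max(1, capacity // num_shards)
        self.shards = [OrderedDict() for _ in range(num_shards)]
        self.locks = [Lock() for _ in range(num_shards)]

    def get(self, key, compute):
        i = hash(key) % len(self.shards)
        with self.locks[i]:
            shard = self.shards[i]
            if key in shard:
                shard.move_to_end(key)   # mark as most recently used
                return shard[key]
        # Run the (potentially expensive) computation outside the lock
        # so other threads hitting this shard aren't blocked meanwhile.
        value = compute(key)
        with self.locks[i]:
            shard = self.shards[i]
            shard[key] = value
            while len(shard) > self.per_shard:
                shard.popitem(last=False)  # evict least recently used
        return value
```

The tradeoff is that two threads can occasionally compute the same
entry twice, which is usually acceptable for a geometry cache.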

> Also, I think most users of cycles will probably have
> a high enough memory-to-thread ratio to make per-thread cache feasible.
> Better scalability and GPU will probably need some architectural changes,
> but most systems using CPU should be fine without them.

I guess that's my main concern as far as caching goes: Cycles is
designed for both CPU and GPU, and IMO it makes sense to implement
things in a way that will eventually allow for a good GPU
implementation as well.  It would be a shame if displacements were
permanently a CPU-only feature.  Then again, maybe geometry caching
wouldn't be as unfriendly to GPUs as I'm imagining--I'm far from an
expert in that space.  It just seems like it would invite cache
thrashing, though, among other things.

> Large displacements are indeed a problem. Even REYES renderers suffer from
> this. I think this can only be addressed by informing artists of the problem
> with careful documentation.

Reyes can slow down a fair bit with large displacements, yes.  But for ray
tracing it's an even bigger issue because it ruins the acceleration
structure.  If someone has an already-tessellated mesh, and they apply
even just moderately large displacements to it, the rendering will
seem to freeze.  I know this from experience, because it happened in
my earlier attempts at displacement mapping in Psychopath.

If the maximum displacement is 5x the average triangle size of the
model, you're looking at something in the neighborhood of a 100x slow
down when testing rays against that model (if I'm doing my
back-of-the-napkin math properly).  You have to push the displacement
bounds out 5x in each direction, so the leaf bounding boxes of the BVH
become 10x bigger in each dimension, which means on average they're
each going to overlap with 100 triangles even at the leaf nodes, which
means the BVH is useless for ruling out those 100 triangles and they
_all_ have to be tested against.  And 5x the average triangle size is
not unreasonable for a finely tessellated mesh.
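
That back-of-the-napkin estimate, spelled out as a tiny sketch (this
is just a toy model of the reasoning above, not a measurement):

```python
def leaf_overlap_estimate(displacement_scale):
    """Rough model from the paragraph above: padding bounds out by
    +/- displacement_scale (in units of average triangle size) makes
    each leaf box ~2*scale bigger per dimension, and the number of
    triangles overlapping a box grows with its cross-sectional area."""
    growth = 2 * displacement_scale   # pushed out in both directions
    return growth ** 2                # overlaps grow ~quadratically

print(leaf_overlap_estimate(5))  # -> 100 triangles per leaf box
```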

I hope I'm explaining things well...

I recommend reading the paper "Ray-tracing Procedural Displacement
Shaders" by Heidrich et al. for a possible solution to this problem
(although they don't directly apply their approach to BVHs, it could
be done if enough information is stored at the BVH nodes).
Unfortunately, OSL doesn't do interval/affine arithmetic, so OSL
displacements couldn't be supported for such a method (unless such
math is added to OSL, which would be awesome!).
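
The core trick of interval arithmetic, in a toy Python sketch (this is
not OSL, and the `displacement` function below is a made-up example
shader): evaluating the shader once over a whole parameter interval
yields conservative displacement bounds for the corresponding patch.

```python
class Interval:
    """Minimal interval arithmetic: each operation returns bounds that
    are guaranteed to contain every possible pointwise result."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))

def displacement(u):
    # Made-up displacement shader: d(u) = u*u + u
    return u * u + u

# One evaluation over u in [0, 1] bounds the shader on the whole patch:
bounds = displacement(Interval(0.0, 1.0))
print(bounds.lo, bounds.hi)  # -> 0.0 2.0
```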

In any case, this is honestly the biggest hurdle to dynamic
displacements, IMO.  And it's one of the primary things I'm trying to
tackle in Psychopath.  It's not an easy problem.

> I still feel like it would be worthwhile to at least investigate the screen
> adaptive approach further.

Oh, absolutely!  Don't get me wrong, I would love to see this.  Again,
this is one of the main things I'm trying to do with Psychopath.  I
absolutely agree that it has large benefits if you can get it working
well.  I'm just trying to point out that it is a hard (and I think
open?) problem.  But for Cycles, which I tend to think of as a really
great implementation of current best-practices, I don't know if an
experimental approach to displacements is the best way forward--at
least for a first implementation.

But I don't mean to discourage you from trying.  I would love to see
what you come up with!  I'm just thinking from the perspective of
"What is most likely to get high-quality, reliable displacements into
stable Cycles releases in the near-term."  What you're talking about
is more along the lines of R&D.

> It has been used successfully in production environments

Micropolygon ray tracing has been used, yes.  Using it to render
curved surfaces doesn't have the same problems as displacements,
because the geometry bounds aren't affected.  The extent to which that
technique has been used with _displacements_ in production is, I
imagine, somewhat less.  And in a Reyes context the models to be
displaced are likely to be built out of larger curved surface
primitives rather than pre-tessellated triangle soups, which means
that the practical size of displacements is also larger.  I think in a
Reyes context you also don't generally handle instancing in the same
way you want to with ray tracing, which simplifies some things as
well.

> As for pre-tessellation, my biggest concern is memory usage. I don’t see an
> easy way to keep it down while also having pixel sized detail—although I may
> be overestimating the problem. Maybe there is some compression technique
> that could be used?

I think there are a few things that can be done here:

1. Given a primitive like a triangle, you know what its subdivided
topology is going to be if you subdivide it uniformly, so you don't
need to store additional connectivity information, you can just store
the vertices in a known order.  That's certainly not a huge space
saver, but it helps some.

2. Adaptive tessellation based on what the camera projection is going
to be.  This is pretty straightforward, and in many cases with models
that are already highly tessellated you might not even end up having
to do any additional tessellation.

3. Adaptive tessellation based on displacement shape: e.g.
tessellating more finely in areas where the displacement has high
curvature.  This one is not so trivial, but I think it can likely be
done well for many common displacement patterns.
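
To illustrate point 1, here's a hypothetical sketch: with uniform
subdivision, triangle indices never need to be stored, because they
can be regenerated on the fly from the known row-major vertex
ordering of the subdivided triangle.

```python
def uniform_subdiv_vertex_count(levels):
    """Uniformly subdividing a triangle `levels` times produces a
    triangular grid with a predictable number of vertices."""
    edge = 2 ** levels + 1          # vertices along one edge
    return edge * (edge + 1) // 2   # triangular number

def implicit_triangles(levels):
    """Rebuild connectivity from the implicit ordering alone, with no
    stored index buffer.  Vertices are assumed laid out row by row,
    with row r holding (edge - r) vertices."""
    edge = 2 ** levels + 1

    def idx(r, c):
        # Rows 0..r-1 hold edge, edge-1, ... vertices respectively.
        return r * edge - r * (r - 1) // 2 + c

    tris = []
    for r in range(edge - 1):
        row_len = edge - r
        for c in range(row_len - 1):
            tris.append((idx(r, c), idx(r, c + 1), idx(r + 1, c)))
            if c < row_len - 2:  # "upside-down" triangle between rows
                tris.append((idx(r, c + 1), idx(r + 1, c + 1), idx(r + 1, c)))
    return tris

# One level of subdivision: 6 vertices, 4 triangles, zero stored indices.
print(uniform_subdiv_vertex_count(1), len(implicit_triangles(1)))  # -> 6 4
```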

One thing to keep in mind is that there are basically two different
ways you can think of displacements:
1. As a way of representing geometric information.
2. As a modeling tool.

The pre-tessellation approach is more along the lines of #2.
Displacement mapping becomes a convenient modeling tool for artists,
albeit one that only kicks in at render-time, and that exploits
information that only the renderer knows for efficiency.  In this view
of the world, it's still the artist's responsibility to make sure they
"model" a scene that can fit into RAM.

> You mention
> tunable parameters... could you be more specific?

- Whether to tessellate uniformly or for screen-space size.
- Given either of the above, how finely to tessellate it in that
context (e.g. how many subdivisions or how many pixels a triangle
should roughly project to).
- A quality slider of some kind for adaptive tessellation based on
displacement shape (e.g. max angle between adjacent polygons).
- For any adaptive approach, a maximum subdivision level (to prevent
accidental geometry explosions).

In general, you want to keep the parameters as few and as
easy-to-understand as possible.  The above are just off the top of my
head, and might not be ideal.
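
For concreteness, such a parameter set might be grouped like this (a
hypothetical sketch; the names are invented for illustration and are
not actual Cycles settings):

```python
from dataclasses import dataclass

@dataclass
class TessellationSettings:
    """Invented grouping of the knobs listed above."""
    mode: str = "screen_space"         # or "uniform"
    uniform_subdivisions: int = 2      # used when mode == "uniform"
    pixels_per_triangle: float = 1.0   # target projected triangle size
    max_angle_degrees: float = 15.0    # shape-adaptive quality threshold
    max_subdivisions: int = 8          # hard cap against accidental blowup
```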

A final thought:

One of your primary concerns here is memory usage, yes?  I don't think
that's such an issue, however.  Consider a 4K render.  That's roughly
8.8 million pixels.  And let's say a triangle (accounting for BVH
overhead) takes up maybe 96 bytes.  You could have a single triangle
for each pixel of a 4K render for around 850MB of RAM.  And that's a
4K render.  For 2K renders it would only be around 210MB.  Granted, with
actual scenes there will be depth complexity, and not only the things
on screen need to be tessellated.  But nevertheless, these are numbers
that seem quite reasonable for modern computers.  I think even scenes
with a lot of displaced geometry could be handled well with
pre-tessellation on a decent computer with several GB of ram.
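
The arithmetic above, spelled out (assuming one triangle per pixel at
a rough 96 bytes each, including BVH overhead):

```python
def displaced_mesh_bytes(width, height, bytes_per_triangle=96):
    """One displaced triangle per pixel, with a rough per-triangle
    cost that includes BVH overhead."""
    return width * height * bytes_per_triangle

# 4K DCI (4096x2160) is ~8.8 million pixels:
print(displaced_mesh_bytes(4096, 2160) / 1e6)  # -> 849.34656 (~850 MB)
```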

I strongly suspect that displacements are less of a memory issue than
e.g. hair rendering (though I don't have any actual data to back that
up).

--Nathan

On Wed, Jun 17, 2015 at 8:27 PM, Mai Lavelle <mai.lavelle at gmail.com> wrote:
> Hi Nathan,
>
>
> Thank you for your reply.
>
>
> On Mon, May 4, 2015 at 7:59 PM, Nathan Vegdahl <cessen at cessen.com> wrote:
>>
>> Hi Mai,
>> I have some experience with displacement mapping in a ray tracer due to my
>> work on Psychopath.  So perhaps I can lend some insight.
>
> Psychopath is pretty interesting, love reading blogs like yours. Your
> insight is much appreciated :D
>>
>> There are a few problems with the geometry cache approach.  The first is
>> that it works very poorly with multi-threading.  With multi-threading you
>> either have to have a lock of some kind protecting the cache from data
>> races, or you have to have a separate cache per thread.  Neither of these
>> options scale well to very many cores (and certainly not the GPU!).
>
> There are variations on an LRU cache that work better with multi-threading
> which I'd like to try. Also, I think most users of cycles will probably have
> a high enough memory-to-thread ratio to make per-thread cache feasible.
> Better scalability and GPU will probably need some architectural changes,
> but most systems using CPU should be fine without them.
>>
>> The second problem is that it has a very steep performance cliff when the
>> cache starts thrashing, which isn't as unlikely as you might think.  Dicing
>> rates based on ray differentials helps with that (as per the Pixar paper),
>> but it's not a silver bullet, especially since even diffuse ray
>> differentials often need to be narrow to capture certain shadowing effects
>> properly (this was something I didn't expect when implementing adaptive
>> dicing  in Psychopath).
>
> This was disappointing to me. After reading your email and having a closer
> look at the math I did some experiments with mipmapping to see the impact
> differentials would have. If the differentials are too wide there is a
> blurring effect on indirect light (PDF of BSDF doesn't match PDF of sampled
> region, thus over/under valuing leading to blur). I don't think however that
> this blurring will be objectionable in all cases, and see no reason why some
> sort of quality setting can’t be added for this. There are also other things
> to try. I’d like to investigate having a maximum trace depth for displaced
> geometry which would help with cache thrashing.
>>
>> The third is that large dynamic displacements wreak havoc with BVH
>> quality, to the point that rendering slows to a crawl.  The problem is that
>> if you don't have the already-displaced geometry up front when building the
>> BVH, then you have to add padding to the geometry bounding boxes to account
>> for the displacement.  That's fine if the displacement is small compared to
>> the base geometry, but if it's large compared to the base geometry then it
>> makes all of the triangles' bounds significantly overlap with each other.
>
> Large displacements are indeed a problem. Even REYES renderers suffer from
> this. I think this can only be addressed by informing artists of the problem
> with careful documentation.
>>
>> None of these problems are insurmountable, I think.  But I suspect the
>> solutions involve large architectural shifts, and may require compromises in
>> other areas.
>>
>> In the end, I think pre-tessellation probably makes the most sense for
>> Cycles.  It can be done adaptively, with tunable parameters.  It has
>> drawbacks, sure.  But it's a proven technique in other production renderers,
>> and I think it fits best with Cycles' design.
>>
>> --Nathan
>
> I still feel like it would be worthwhile to at least investigate the screen
> adaptive approach further. It has been used successfully in production
> environments and the differentials it uses are also useful for texture
> caching. It definitely has problems, but I think if it can be gotten to work
> it has some nice benefits.
>
>
> As for pre-tessellation, my biggest concern is memory usage. I don’t see an
> easy way to keep it down while also having pixel sized detail—although I may
> be overestimating the problem. Maybe there is some compression technique
> that could be used? The best I can think of is to reduce the dice rate by
> the distance from the camera, but there are situations where this won't work so
> well. Pre-tessellation is far simpler to implement and I already have some
> of it done, so I will finish it up first and see how it does. You mention
> tunable parameters... could you be more specific? Maybe you have thought of
> something I haven't?
>
>
> Best regards,
>
> Mai
>
>
>
>
>>
>> On Apr 29, 2015 2:33 PM, "Mai Lavelle" <mai.lavelle at gmail.com> wrote:
>>>
>>> Hi Ton, thank you for the reply.
>>>
>>>
>>> Sorry for not going into detail. Getting this to work is unlikely to be
>>> easy, there's a lot to consider...
>>>
>>>
>>> The ideal would be to have screen adaptive tessellation, where artists
>>> set a size in pixels for micropolygons. The renderer would then use ray
>>> differentials to calculate needed tessellation rate to match that size as
>>> best as possible with respect to visual impact on
>>> screen/reflections/refractions/etc. Some sort of cache would be used to
>>> store tessellated geometry. Would likely start off limited to CPU, due to the
>>> trickiness of needing a dynamic cache. There are other techniques for screen
>>> adaptive tessellation but I don't think that they would perform as well.
>>> Screen adaptive tessellation would probably be the most desired by artists
>>> for its ease of use and the high level of detail it can provide, but its
>>> also the most difficult to implement. Details of this method can get quite
>>> involved...
>>>
>>>
>>> A much simpler approach is to pretessellate all geometry and send the
>>> entire tessellated mesh to the kernel. Major problem with this is it
>>> requires an enormous amount of memory to achieve detail. Bump mapping in
>>> addition to displacement helps a bit, but artifacts appear in the silhouette
>>> and tangent vectors which causes weirdness in specular highlights. Possible
>>> way to mitigate memory issues with pretessellation is to use frustum
>>> adaptive tessellation, but this probably isn't that useful due to shadows and
>>> reflections of objects outside of the frustum having under-tessellated
>>> silhouettes. It could be problematic for animation as well due to the possibility
>>> of popping as objects cross the frustum boundary. Instances also pose a
>>> challenge here. In any case it might be a good idea to have a memory limit
>>> for pre-tessellation since it's difficult to predict, but this could lead to other
>>> usability problems...
>>>
>>>
>>> Examples of memory usage for pretessellation with current BVH and mesh
>>> storage:
>>>
>>>  - default cube with dice rate of 0.001 takes ~7.5GB
>>>
>>>  - for a quad that spans a 1920x1080 screen tessellated to one micropoly
>>> per pixel the memory consumption is ~700MB
>>>
>>>
>>> Screen adaptive tessellation btw would allow for a user configurable
>>> cache size where memory could potentially be traded for speed and vice
>>> versa.
>>>
>>>
>>> I mention pretessellation mostly for its simplicity. Even though it has big
>>> issues with memory it might still be useful for lower detail? Would be nice
>>> to hear thoughts on which approach would be better. Thoughts about having
>>> both methods available for use, or if there are other usable techniques
>>> would also be great.
>>>
>>>
>>> Regardless of method chosen for tessellation there are some things that
>>> need work in general for displacement to be useful; the Diagsplit
>>> implementation needs some improvement, and texture coordinates pose a
>>> problem. Artists will likely need both texture coords generated from the
>>> base mesh and the displaced mesh for some shading techniques. One idea is to
>>> add a “Displaced” check box to the texture coord node, but I’m not sure if
>>> other nodes would need similar consideration or if there's a better way to
>>> approach the problem.
>>>
>>>
>>> If needed I could write a more detailed proposal. I would very much like
>>> to help make this happen, and look forward to your response.
>>>
>>>
>>> Mai
>>>
>>>
>>> On Sun, Apr 26, 2015 at 12:05 PM, Sergey Sharybin <sergey.vfx at gmail.com>
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> We actually had a brief discussion in IRC a while ago. To summarize:
>>>>
>>>> - There was a discussion in the list a while ago:
>>>> http://lists.blender.org/pipermail/bf-cycles/2014-January/001744.html which
>>>> basically concludes choosing approach for micropolygon displacement should
>>>> be done with care
>>>> - There's a patch from Mai in the tracker which optimizes memory
>>>> storage for subdivided meshes which I'll try to review in the next few days
>>>> https://developer.blender.org/D1242
>>>>
>>>>
>>>> On Sun, Apr 26, 2015 at 8:46 PM, Ton Roosendaal <ton at blender.org> wrote:
>>>>>
>>>>> Hi Mai,
>>>>>
>>>>> Sorry for not getting any reaction sooner. People are busy I guess :
>>>>> Could you be a bit more precise, and explain in more detail how you
>>>>> want this to work?
>>>>>
>>>>> Or, show a paper or some images on a website.
>>>>>
>>>>> -Ton-
>>>>>
>>>>> --------------------------------------------------------
>>>>> Ton Roosendaal  -  ton at blender.org   -   www.blender.org
>>>>> Chairman Blender Foundation - Producer Blender Institute
>>>>> Entrepotdok 57A  -  1018AD Amsterdam  -  The Netherlands
>>>>>
>>>>>
>>>>>
>>>>> On 29 Mar, 2015, at 21:38, Mai wrote:
>>>>>
>>>>> > Hello,
>>>>> >
>>>>> > I had a nice conversation earlier in irc with Marco G, who suggested
>>>>> > I ask here. I'm very interested in helping out with getting micropolygon
>>>>> > displacement working in Cycles. What are the current plans for this and what
>>>>> > could I do to help? I've been poking around a bit at the code but don't want
>>>>> > to get too involved without asking about it first.
>>>>> >
>>>>> > Thank you,
>>>>> > Mai Lavelle
>>>>> > _______________________________________________
>>>>> > Bf-cycles mailing list
>>>>> > Bf-cycles at blender.org
>>>>> > http://lists.blender.org/mailman/listinfo/bf-cycles
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> With best regards, Sergey Sharybin
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>
>

