[Soc-2018-dev] Changes needed for CUDA to handle Sparse Grids

Fri Jun 8 11:35:46 CEST 2018

You're right, if we want voxels to be fully interpolated with a single
tex3D() call we need 10x10x10. If you want to implement this then go ahead.

We could decide at runtime if dense or sparse tiles are most efficient, and
in the future figure out ways to make the sparse storage more memory
efficient by avoiding the padding between tiles that are adjacent.
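
A rough sketch of what that runtime decision could look like, comparing only voxel counts. The function names are hypothetical (nothing here is existing Cycles code), and it assumes cubic tiles with the same pad on all six sides:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers, not existing Cycles code.

// Voxels stored by a dense grid of the given resolution.
std::size_t dense_voxels(std::size_t res_x, std::size_t res_y, std::size_t res_z)
{
  return res_x * res_y * res_z;
}

// Voxels stored by a sparse grid whose active tiles each carry `pad`
// extra voxels on all six sides.
std::size_t sparse_voxels(std::size_t active_tiles, std::size_t tile_size, std::size_t pad)
{
  const std::size_t padded = tile_size + 2 * pad;
  return active_tiles * padded * padded * padded;
}

// True when the padded sparse layout stores fewer voxels than the dense one.
bool prefer_sparse(std::size_t res_x, std::size_t res_y, std::size_t res_z,
                   std::size_t active_tiles, std::size_t tile_size, std::size_t pad)
{
  assert(tile_size > 0);
  return sparse_voxels(active_tiles, tile_size, pad) < dense_voxels(res_x, res_y, res_z);
}
```

With 8x8x8 tiles and a 1-voxel pad on a grid whose resolution is a multiple of the tile size, this only favors sparse storage when fewer than about 51% of the tiles are active (512/1000). It ignores the tile lookup table and the per-sample coordinate conversion, so a real heuristic would probably want some margin.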

I don't think still images vs. animation makes any difference in this case.

On Thu, Jun 7, 2018 at 3:18 PM Geraldine Chua <chua.gsk at gmail.com> wrote:

> Hi Brecht,
>
> I assumed we would need a 9x9x9 grid because otherwise, how would the
> voxels on the other 5 faces of the tile be interpolated? Thinking about it
> more, it seems like we would actually need a 10x10x10 grid in that case,
> which is almost 2x the size of an 8x8x8 grid. Looking at the GVDB slides
> though, it looks like that is basically the method used by GVDB as well.
>
> I added a padding function to the grid generator and tried out placing a 1
> voxel pad on all 6 sides of the tile. I got the following memory usages:
>
> Density
> Original: 768.00K
> 10x10x10 tiling: 681.19K
> ~11.3% memory reduction
>
> Color
> Original: 3.00M
> 10x10x10 tiling: 2.66M
> ~11.3% memory reduction
>
> So we would need to weigh the slight memory savings versus the extra
> calculation needed to convert coordinates from the dense to the sparse
> coordinate system. Difficulty of implementation shouldn't be an issue since
> I have already successfully modified the sparse grid functions to work with
> padding.
>
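
For reference, the dense-to-sparse coordinate conversion being weighed here could look roughly like the sketch below. It assumes the long, thin layout discussed in this thread (active tiles placed side by side along x), a per-tile table `tile_slots` that holds each tile's position in the sparse grid or -1 for inactive tiles, and the same pad on all sides. `Int3`, `tile_slots` and `dense_to_sparse` are illustrative names, not the actual implementation.

```cpp
#include <cassert>
#include <vector>

struct Int3 {
  int x, y, z;
};

// Hypothetical sketch: map a dense voxel coordinate to its coordinate in a
// padded sparse grid of size (N * P) x P x P, where P = tile_size + 2 * pad
// and N is the number of active tiles.
// Returns false if the voxel lies in an inactive tile.
bool dense_to_sparse(Int3 voxel, Int3 tile_counts, const std::vector<int> &tile_slots,
                     int tile_size, int pad, Int3 *out)
{
  assert(tile_size > 0 && pad >= 0);

  // Which tile of the dense grid contains the voxel.
  const int tx = voxel.x / tile_size;
  const int ty = voxel.y / tile_size;
  const int tz = voxel.z / tile_size;
  const int tile_index = tx + tile_counts.x * (ty + tile_counts.y * tz);

  const int slot = tile_slots[tile_index];
  if (slot < 0) {
    return false;  // inactive tile, nothing stored
  }

  // Offset inside the tile, shifted past the pad; tiles are stacked along x.
  const int padded = tile_size + 2 * pad;
  out->x = slot * padded + pad + voxel.x % tile_size;
  out->y = pad + voxel.y % tile_size;
  out->z = pad + voxel.z % tile_size;
  return true;
}
```

The extra cost per lookup is three integer divisions, three modulos and one table read; with a power-of-two tile size the divisions and modulos reduce to shifts and masks.
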
> Also, as a disclaimer, I have been testing only by rendering still images,
> so I'm not sure what considerations need to be made for animations.
>
> Best regards,
> Geraldine
>
>
> On Thu, Jun 7, 2018 at 1:29 AM, Brecht Van Lommel <
> brechtvanlommel at gmail.com> wrote:
>
>> I was thinking we could use an 8x8x9 grid, which wouldn't waste as much
>> memory. Unfortunately it seems there is a CUDA texture size limit of
>> 16384x16384x16384, and we'd hit that quite quickly if we use only one
>> dimension.
>>
>> I'm not sure what the best solution is then, since 1.5x more memory usage
>> is a lot. There are interesting ideas here, but it gets quite complicated:
>>
>> https://developer.nvidia.com/sites/default/files/akamai/designworks/docs/GVDB_TechnicalTalk_Siggraph2016.pdf
>>
>> What I suggest is to just store 8x8x8 grids and not take advantage of the
>> linear interpolation of tex3D() for now, to avoid getting sidetracked too
>> much. So for sparse grids, that means replacing each tex3D() with 8 voxel
>> lookups using closest interpolation and blending them manually. You can
>> still do all the changes to remove SparseTile.
>>
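
A minimal C++ sketch of that 8-lookup fallback, assuming a hypothetical `fetch` callback that returns a single voxel by integer index (a closest lookup) and deals with out-of-range indices itself, for example by clamping. The half-voxel offset follows the convention CUDA documents for linear filtering with unnormalized coordinates; the function name is made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical sketch: trilinear interpolation built from 8 single-voxel
// lookups instead of one hardware-filtered tex3D() call.
float trilinear_from_8_lookups(const std::function<float(int, int, int)> &fetch,
                               float x, float y, float z)
{
  // Voxel centers sit at integer + 0.5, so shift by half a voxel first.
  const float fx = x - 0.5f, fy = y - 0.5f, fz = z - 0.5f;
  const int ix = static_cast<int>(std::floor(fx));
  const int iy = static_cast<int>(std::floor(fy));
  const int iz = static_cast<int>(std::floor(fz));

  // Fractional position between the lower and upper neighbor on each axis.
  const float tx = fx - ix, ty = fy - iy, tz = fz - iz;
  assert(tx >= 0.0f && ty >= 0.0f && tz >= 0.0f);

  float result = 0.0f;
  for (int dz = 0; dz < 2; dz++) {
    for (int dy = 0; dy < 2; dy++) {
      for (int dx = 0; dx < 2; dx++) {
        const float weight = (dx ? tx : 1.0f - tx) *
                             (dy ? ty : 1.0f - ty) *
                             (dz ? tz : 1.0f - tz);
        result += weight * fetch(ix + dx, iy + dy, iz + dz);
      }
    }
  }
  return result;
}
```

For a sparse grid, `fetch` would be the place to translate each of the 8 neighbor indices into the sparse layout, which is why no padding is needed with this approach.
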
>> Does that make sense?
>>
>> Regards,
>> Brecht.
>>
>>
>>
>> On Wed, Jun 6, 2018 at 7:04 PM, Geraldine Chua <chua.gsk at gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> So this mailing list seems to be more for the weekly reports, but I
>>> think it's easier to talk about planning in email, especially since it's
>>> difficult to align schedules and I think we can have more detailed
>>> discussions this way. Please don't hesitate to comment if there are any
>>> glaring flaws or things I need to take into account :P
>>>
>>> First of all, after Brecht told me about how CUDA does
>>> interpolation, I do not think the current approach of having a dedicated
>>> SparseTile struct is viable anymore. The sparse grid will need to be
>>> changed to a flat array of floats/float4s/etc. Changing this and all
>>> dependent code shouldn't be difficult or take too long, and will actually
>>> make some parts of the code much cleaner.
>>>
>>> So now dense grids are transformed from some X x Y x Z grid to a long,
>>> thin N x TILE_SIZE x TILE_SIZE sparse grid, where N is the number of active
>>> tiles * TILE_SIZE. Because of this configuration, neighbor tiles in the
>>> sparse grid are not necessarily neighbor tiles in the real image, so there
>>> must be a way of getting the real voxel neighbors since they are needed for
>>> interpolating at the boundaries of a tile.
>>>
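
To make the layout concrete, here is a hedged sketch of turning a dense float grid into the long, thin N x TILE_SIZE x TILE_SIZE array described above, without padding. It assumes the resolution is a multiple of the tile size and that a tile counts as active when any of its voxels exceeds a threshold; `SparseGrid`, `flat_index` and `make_sparse` are hypothetical names, not the project's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container; the names are illustrative only.
struct SparseGrid {
  std::vector<float> voxels;    // (num_active * T) x T x T, x varies fastest
  std::vector<int> tile_slots;  // per dense tile: sparse slot, or -1 if inactive
  int num_active = 0;
};

// Index into a flat array of width `w` and height `h`, x varying fastest.
static std::size_t flat_index(int x, int y, int z, int w, int h)
{
  return static_cast<std::size_t>(x) +
         static_cast<std::size_t>(w) * (static_cast<std::size_t>(y) +
                                        static_cast<std::size_t>(h) * z);
}

SparseGrid make_sparse(const std::vector<float> &dense, int res_x, int res_y, int res_z,
                       int T, float threshold)
{
  assert(T > 0 && res_x % T == 0 && res_y % T == 0 && res_z % T == 0);
  assert(dense.size() == static_cast<std::size_t>(res_x) * res_y * res_z);

  const int tiles_x = res_x / T, tiles_y = res_y / T, tiles_z = res_z / T;
  const int num_tiles = tiles_x * tiles_y * tiles_z;
  SparseGrid grid;
  grid.tile_slots.assign(num_tiles, -1);

  // Pass 1: give every tile containing a voxel above the threshold a slot.
  for (int t = 0; t < num_tiles; t++) {
    const int tx = t % tiles_x, ty = (t / tiles_x) % tiles_y, tz = t / (tiles_x * tiles_y);
    bool active = false;
    for (int i = 0; i < T * T * T && !active; i++) {
      const int x = tx * T + i % T, y = ty * T + (i / T) % T, z = tz * T + i / (T * T);
      active = dense[flat_index(x, y, z, res_x, res_y)] > threshold;
    }
    if (active) {
      grid.tile_slots[t] = grid.num_active++;
    }
  }

  // Pass 2: copy the active tiles side by side along x.
  const int width = grid.num_active * T;
  grid.voxels.assign(static_cast<std::size_t>(width) * T * T, 0.0f);
  for (int t = 0; t < num_tiles; t++) {
    const int slot = grid.tile_slots[t];
    if (slot < 0) {
      continue;
    }
    const int tx = t % tiles_x, ty = (t / tiles_x) % tiles_y, tz = t / (tiles_x * tiles_y);
    for (int i = 0; i < T * T * T; i++) {
      const int lx = i % T, ly = (i / T) % T, lz = i / (T * T);
      grid.voxels[flat_index(slot * T + lx, ly, lz, width, T)] =
          dense[flat_index(tx * T + lx, ty * T + ly, tz * T + lz, res_x, res_y)];
    }
  }
  return grid;
}
```

Two tiles that end up next to each other along x in `voxels` can come from anywhere in the dense grid, which is exactly why interpolating across a tile boundary needs the real neighbors from somewhere else.
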
>>> The way suggested by Brecht (or at least my interpretation of his
>>> suggestion, I might be wrong) is to give every tile a pad containing its
>>> real neighbors. Thus, each "tile", instead of being an 8x8x8 grid for
>>> example, is now a 9x9x9 grid or larger, depending on how wide a sample the
>>> interpolation requires. There is quite a lot of voxel duplication using
>>> this method though, and it will have a large impact on the memory usage,
>>> since for example, just a 9x9x9 grid is already around 1.5x larger than an
>>> 8x8x8 grid.
>>>
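
The overhead figures mentioned in this thread can be checked with a few lines of arithmetic. The helper below is purely illustrative: it takes a cubic tile size and the number of extra voxels added along each axis, and returns how many times larger the padded tile is.

```cpp
#include <cassert>

// Hypothetical helper: size of a padded tile relative to the unpadded
// tile_size^3 tile, given the extra voxels added along each axis.
double padded_ratio(int tile_size, int extra_x, int extra_y, int extra_z)
{
  assert(tile_size > 0);
  const double base = static_cast<double>(tile_size) * tile_size * tile_size;
  const double padded = static_cast<double>(tile_size + extra_x) *
                        (tile_size + extra_y) * (tile_size + extra_z);
  return padded / base;
}
```

For 8x8x8 tiles, `padded_ratio(8, 1, 1, 1)` is 729/512 (about 1.42x, the 9x9x9 case), `padded_ratio(8, 2, 2, 2)` is 1000/512 (about 1.95x, the 10x10x10 case), and `padded_ratio(8, 0, 0, 1)` is 576/512 (1.125x, the 8x8x9 case).
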
>>> If any of this doesn't seem right or there is a better approach or there
>>> are other functions I need to take into account that will require sparse
>>> lookup support, I would really appreciate your comments :)
>>>
>>> Best regards,
>>> Geraldine
>>>
>>>
>>> --
>>> Soc-2018-dev mailing list
>>> Soc-2018-dev at blender.org
>>> https://lists.blender.org/mailman/listinfo/soc-2018-dev
>>>
>>>