[Bf-cycles] Non-Progressive integrator on GPU
Brecht Van Lommel
brechtvanlommel at pandora.be
Sun May 19 06:23:30 CEST 2013
On Sun, May 19, 2013 at 1:50 AM, Thomas Dinges <blender at dingto.org> wrote:
> Hi Brecht,
> I looked into enabling the Non-Progressive integrator on GPU and want to
> share my findings.
> As far as I can tell there is 1 problem (maybe 2).
> 1) CUDA does not know memset(), called from within
> shader_merge_closures() in kernel_shaders.h.
> I could not find a direct alternative, but it seems there are
> workarounds for it.
We don't need memmove() there, it can just be replaced with a loop:
    for(int k = 0; k < size; k++)
        scj[k] = scj[k+1];
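As a standalone illustration of that loop, here is a minimal sketch written as plain C++ so it can be compiled and checked on the host. The `Closure` struct and the `remove_closure_at()` helper are hypothetical stand-ins, not the actual kernel code; only the inner loop is taken from the suggestion above, and `size` is assumed to be the number of elements after the removed slot.

```cpp
#include <cassert>

// Hypothetical stand-in for the kernel's closure struct (the real one has more fields).
struct Closure {
    int type;
    float weight;
};

// Hypothetical helper: remove the element at index j from arr[0..num) by
// shifting the tail down one slot with a plain loop instead of memmove().
// Returns the new element count.
static int remove_closure_at(Closure *arr, int num, int j)
{
    assert(j >= 0 && j < num);

    Closure *scj = &arr[j];
    int size = num - (j + 1);  // assumed meaning: elements after index j

    // Copying from low to high index is safe for this overlapping shift,
    // because slot k+1 is always read before it is overwritten.
    for (int k = 0; k < size; k++)
        scj[k] = scj[k + 1];

    return num - 1;
}
```

Calling `remove_closure_at(a, 4, 1)` on an array whose `type` fields are 10, 20, 30, 40 leaves 10, 30, 40 in the first three slots and returns 3.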
> (2) With Non Progressive integrator enabled, the CUDA compiler takes a
> lot of memory. I had to disable __HAIR__ in order to keep my RAM alive,
> but even then it took 4.5 GB (just the compiler process, peak) to
> compile the sm_21 kernel.
To reduce memory you could try replacing __device with
__device_noinline for some of the big functions called from the
non-progressive integrator code. That might reduce performance for the
progressive integrator, or it might not; it needs testing.
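A rough sketch of what that change looks like, assuming qualifier macros along the lines of the kernel's own. The `example_*` names below are hypothetical and chosen so they do not clash with the real `__device` / `__device_noinline` macros; the function bodies are placeholders. The idea is that a large function called from several places is emitted once instead of being inlined at every call site. Whether this actually lowers compiler memory use or costs render performance is untested here.

```cpp
#include <cassert>

// Hypothetical qualifier macros, for illustration only.
#if defined(__CUDACC__)
#  define example_device          __host__ __device__ __forceinline__
#  define example_device_noinline __host__ __device__ __noinline__
#elif defined(__GNUC__)
#  define example_device          static inline
#  define example_device_noinline static __attribute__((noinline))
#else
#  define example_device          static inline
#  define example_device_noinline static
#endif

// Placeholder for a big function shared by both integrators.
// Marked noinline so its body is not duplicated into each caller.
example_device_noinline float example_eval_big(float x)
{
    return x * x + 1.0f;
}

// Progressive path: one evaluation per call.
example_device float example_progressive(float x)
{
    return example_eval_big(x);
}

// Non-progressive path: averages several evaluations of the same function.
example_device float example_non_progressive(float x, int num_samples)
{
    assert(num_samples > 0);

    float sum = 0.0f;
    for (int i = 0; i < num_samples; i++)
        sum += example_eval_big(x);
    return sum / num_samples;
}
```

Both paths return the same values as before; only the inlining decision for `example_eval_big()` changes, so any difference would show up in compile time, compiler memory and kernel speed rather than in results.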
Still, I'm not sure we want to have a kernel that pushes against memory
limits again. We should keep it manageable so that things don't break
with every feature added.