[Bf-cycles] Non-Progressive integrator on GPU
blender at dingto.org
Sun May 19 20:08:52 CEST 2013
I don't think it would be that much, more like 10 or so, but hard to say. :)
I tried building it with Toolkit 5.0 as well, and it uses much less RAM,
so I really hope the upcoming Toolkit will handle the code well & give
us the same performance as Toolkit 4.2.
On 19.05.2013 19:57, Matthew Heimlich wrote:
> If anyone's interested in a non-prog hair GPU build I could probably
> provide one, unless you think the build would take more than 24GB of
> RAM. If people express interest I'll go ahead and do it.
> On Sun, May 19, 2013 at 9:10 AM, Thomas Dinges <blender at dingto.org> wrote:
>> Hi Brecht,
>> thank you for the memmove alternative, works fine. :)
>> I agree with you that this is nothing for Trunk; we can re-evaluate this
>> when we switch to a new Toolkit.
>> Toolkit 5.0 already works better with our big kernel, but brings a
>> slowdown. Fingers crossed for 5.x, hope nvidia will release that soon.
>> In the meantime, everyone interested in GPU Non-Progressive rendering
>> can use this:
>> Patch: http://blender.dingto.org/patches/non_progressive_gpu.diff
>> Build (Windows x64):
>> This still comes with disabled Hair support on the GPU, so basically
>> Blender 2.67 feature set, just with Non-Progressive on the GPU.
>> Best regards,
>> On 19.05.2013 06:23, Brecht Van Lommel wrote:
>>> On Sun, May 19, 2013 at 1:50 AM, Thomas Dinges <blender at dingto.org> wrote:
>>>> Hi Brecht,
>>>> I looked into enabling the Non-Progressive integrator on GPU and want to
>>>> share my findings.
>>>> As far as I can tell there is one problem (maybe two).
>>>> 1) CUDA does not know memmove(), called from within
>>>> shader_merge_closures() in kernel_shaders.h.
>>>> I could not find a direct alternative, but it seems there are
>>>> workarounds for it.
>>> We don't need memmove; it can just be replaced with:
>>>     for(int k = 0; k < size; k++)
>>>         scj[k] = scj[k+1];
>>>> 2) With the Non-Progressive integrator enabled, the CUDA compiler takes a
>>>> lot of memory. I had to disable __HAIR__ in order to keep my RAM alive,
>>>> but even then it took 4.5 GB (just the compiler process, peak) to
>>>> compile the sm_21 kernel.
>>> To reduce memory you could try replacing __device with
>>> __device_noinline for some big functions called from the
>>> non-progressive integrator code. It might reduce performance for the
>>> progressive integrator but might also not, needs testing.
>>> I'm still not sure we want to have a kernel that pushes against memory
>>> limits again; we should keep it manageable so that things don't break
>>> with every feature added.
>>> Bf-cycles mailing list
>>> Bf-cycles at blender.org
>> Thomas Dinges
>> Blender Developer, Artist and Musician