[Bf-committers] massive cuda speed improvements with Cuda 5.0/5.5
Jürgen Herrmann
shadowrom at me.com
Mon Jun 3 21:41:21 CEST 2013
Hi Brecht,
you're welcome ;)
I think I'll have to include these compiler flag optimizations in the VS2012 builds; otherwise they will be significantly slower than builds made with other compilers :(
Cycles has two problems when it comes to Windows/VS2012 builds:
1. It is much slower on the CPU.
2. It is slower with CUDA if we don't use the optimization flags.
As long as MinGW OpenMP isn't fixed, we have to stick to VS2008/CUDA 4.2 or find a decent solution for VS2012/CUDA 5.5.
As Blender seems to be used mostly by Windows users, this isn't optimal.
/Jürgen
On 03.06.2013 at 21:20, Brecht Van Lommel <brechtvanlommel at pandora.be> wrote:
> Thanks for testing. I've also been doing some experimenting with
> compile flags and other things here. So far it seems I can make my
> 650M render a few percent faster compared to CUDA 4.2, but 460 GT
> is still considerably slower with the BMW scene (2m30s with 5.5
> compared to 2m01s with 4.2), and 580 GTX had a similar difference. It
> seems you are testing with a 6xx card so that makes sense.
>
> Patch attached for those who want to test this with 5.0/5.5.
>
> On Mon, Jun 3, 2013 at 8:46 PM, Jürgen Herrmann <shadowrom at me.com> wrote:
>> Hi there,
>>
>> I did some tests with CUDA 5.0 and 5.5 today and changed the nvcc
>> optimization flags for cycles_kernel_cuda.
>>
>> I found out the following:
>>
>> - "--opencc-options" is deprecated for sm_20 and up and should be
>> removed from the compiler options
>>
>> - Passing "-O3" and "--use_fast_math" as nvcc options brings a massive
>> speedup on my system (more below)
>>
>> - We shouldn’t complain about new CUDA toolsets being slow; we
>> should find a solution, as we can’t use old software forever…
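
To make the flag change in the list above concrete, here is a minimal Python sketch that assembles an nvcc command line with and without the two optimization flags. The helper name nvcc_command, the file names kernel.cu and kernel.cubin, and the sm_30 target are assumptions for illustration only; they are not taken from the Blender build scripts. The options themselves (-arch, --cubin, -o, -O3, --use_fast_math) are regular nvcc options.

```python
import shlex


def nvcc_command(source, output, arch="sm_30", optimized=True):
    """Build an nvcc argument list (hypothetical helper, not Blender's build code)."""
    # Note: no --opencc-options here, since it is reported as deprecated for sm_20 and up.
    cmd = ["nvcc", f"-arch={arch}", "--cubin", source, "-o", output]
    if optimized:
        # The two flags reported in this thread to give the speedup.
        cmd += ["-O3", "--use_fast_math"]
    return cmd


if __name__ == "__main__":
    # Print both variants so they can be compared or pasted into a shell.
    print(shlex.join(nvcc_command("kernel.cu", "kernel.cubin", optimized=False)))
    print(shlex.join(nvcc_command("kernel.cu", "kernel.cubin", optimized=True)))
```

Building the kernel once with each variant and rendering the same scene is enough to reproduce the kind of comparison described in this message.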
>>
>> On to the speedups:
>>
>> Example 1:
>>
>> System: i7-3820 @ 3.60GHz, GeForce GTX 660
>>
>> Blender (cycles_kernel_cuda) compiled with standard settings:
>>
>> Mike_pan file took 02:06.60 to render
>>
>> Blender (cycles_kernel_cuda) compiled with -O3 --use_fast_math:
>>
>> Mike_pan file took 01:39.93 to render
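
For reference, the two render times above work out to roughly 21% less render time. The short Python sketch below shows the arithmetic; it assumes the second time is meant as 1 min 39.93 s, and the helper name to_seconds is only illustrative.

```python
def to_seconds(stamp):
    """Convert an 'MM:SS.ss' render time to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


baseline = to_seconds("02:06.60")   # standard settings
optimized = to_seconds("01:39.93")  # -O3 --use_fast_math (assumed reading of the second time)

saved = baseline - optimized          # seconds saved per render
reduction = saved / baseline * 100    # percent less render time
speedup = baseline / optimized        # throughput ratio

print(f"saved {saved:.2f} s, {reduction:.1f}% less render time, {speedup:.2f}x speedup")
```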
>>
>> There is no visible difference in the render results:
>>
>> Image1: http://www.pasteall.org/pic/52757
>>
>> Image2: http://www.pasteall.org/pic/52758
>>
>> I bet there’s more potential in there.
>>
>> /Jürgen
>>
>> _______________________________________________
>> Bf-committers mailing list
>> Bf-committers at blender.org
>> http://lists.blender.org/mailman/listinfo/bf-committers
> <cuda_5.5_tests.txt>