[Bf-committers] massive cuda speed improvements with Cuda 5.0/5.5

Jürgen Herrmann shadowrom at me.com
Mon Jun 3 21:41:21 CEST 2013


Hi Brecht,

you're welcome ;) 
I think I'll have to include these compiler flag optimizations in the VS2012 builds; otherwise they will be significantly slower than builds from other compilers :(
Cycles has two problems when it comes to Windows/VS2012 builds:
1. It is much slower on CPU.
2. It is slower with CUDA if we don't use the optimization flags.
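
To make the flag change concrete, here is a minimal sketch in Python of how the nvcc options could be assembled. It is only an illustration: `nvcc_flags` and its arguments are hypothetical names, not part of the Blender build files, and the real change would go into the build system. It drops `--opencc-options` for sm_20 and up (where it is deprecated) and appends `-O3` and `--use_fast_math`.

```python
def nvcc_flags(sm_version, base_flags):
    """Return a list of nvcc flags for a compute capability such as 20 or 30.

    Hypothetical helper for illustration; base_flags is whatever option
    list the build currently passes to nvcc.
    """
    flags = []
    skip_next = False
    for flag in base_flags:
        if skip_next:
            # This argument was the value of a dropped --opencc-options.
            skip_next = False
            continue
        if sm_version >= 20 and flag.startswith("--opencc-options"):
            # Deprecated for sm_20 and up: drop the option, and drop its
            # value too when it is passed as a separate argument.
            skip_next = "=" not in flag
            continue
        flags.append(flag)

    # Optimization flags from the test; added once, never duplicated.
    for extra in ("-O3", "--use_fast_math"):
        if extra not in flags:
            flags.append(extra)

    return ["-arch=sm_%d" % sm_version] + flags
```

For example, `nvcc_flags(30, ["--opencc-options", "<value>", "-m64"])` returns `["-arch=sm_30", "-m64", "-O3", "--use_fast_math"]`, while the same call with `13` keeps the `--opencc-options` pair.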

As long as MinGW OpenMP isn't fixed, we have to stick to VS2008/CUDA 4.2 or find a decent solution for VS2012/CUDA 5.5.

As Blender seems to be used mostly by Windows users, this isn't optimal.
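
For scale, the two timings from the test quoted below (02:06.60 with standard settings and 01:39.93 with the optimization flags, read as minutes:seconds.hundredths) can be compared with a few lines of Python. The helper name `to_seconds` is made up for this sketch.

```python
def to_seconds(stamp):
    """Convert a 'MM:SS.ss' render time string to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


baseline = to_seconds("02:06.60")   # standard settings
optimized = to_seconds("01:39.93")  # -O3 --use_fast_math

saved = baseline - optimized               # seconds saved per render
speedup = baseline / optimized             # how many times faster
reduction_pct = 100.0 * saved / baseline   # percent less render time

print("saved %.2f s, %.1f%% less time, %.2fx speedup"
      % (saved, reduction_pct, speedup))
```

That is roughly 21% less render time, or a speedup of about 1.27x, on that one scene and card; other scenes and GPUs may behave differently.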

/Jürgen

On 03.06.2013 at 21:20, Brecht Van Lommel <brechtvanlommel at pandora.be> wrote:

> Thanks for testing. I've also been doing some experimenting with
> compile flags and other things here. So far it seems I can make my
> 650M render a few percent faster compared to CUDA 4.2, but 460 GT
> is still considerably slower with the BMW scene (2m30s with 5.5
> compared to 2m01s with 4.2), and 580 GTX had a similar difference. It
> seems you are testing with a 6xx card so that makes sense.
> 
> Patch attached for those who want to test this with 5.0/5.5.
> 
> On Mon, Jun 3, 2013 at 8:46 PM, Jürgen Herrmann <shadowrom at me.com> wrote:
>> Hi there,
>> 
>> 
>> 
>> I did some tests with cuda 5.0 and 5.5 today and changed the nvcc
>> optimization flags for cycles_kernel_cuda.
>> 
>> 
>> 
>> I found out the following:
>> 
>> 
>> 
>> - "--opencc-options" is deprecated for sm_20 and up and should be
>>   removed from the compiler options.
>> 
>> - Passing "-O3" and "--use_fast_math" as nvcc options brings a massive
>>   speedup on my system (more below).
>> 
>> - We shouldn't complain about new CUDA toolsets being slow; we
>>   should find a solution, as we can't use old software forever...
>> 
>> 
>> 
>> To the speedups:
>> 
>> 
>> 
>> Example 1:
>> 
>> System: i7-3820 @ 3.60GHz, GeForce GTX 660
>> 
>> 
>> 
>> Blender (cycles_cuda_kernel) compiled with standard settings:
>> 
>> Mike_pan file took 02:06.60 to render
>> 
>> 
>> 
>> Blender (cycles_cuda_kernel) compiled with -O3 --use_fast_math:
>> 
>> Mike_pan file took 01:39.93 to render
>> 
>> 
>> 
>> There is no visible difference in the render results:
>> 
>> 
>> 
>> Image1: http://www.pasteall.org/pic/52757
>> 
>> Image2: http://www.pasteall.org/pic/52758
>> 
>> 
>> 
>> I bet there's more potential in there.
>> 
>> 
>> 
>> /Jürgen
>> 
>> _______________________________________________
>> Bf-committers mailing list
>> Bf-committers at blender.org
>> http://lists.blender.org/mailman/listinfo/bf-committers
> <cuda_5.5_tests.txt>

