[Bf-blender-cvs] [e26eb9c] master: Cycles: reduce CUDA stack memory access for Maxwell and up, increasing max registers.

Brecht Van Lommel noreply at git.blender.org
Sun Jun 19 20:36:55 CEST 2016


Commit: e26eb9c93bdeae0b52153a7fcf37bceebffd6304
Author: Brecht Van Lommel
Date:   Sun Jun 19 18:06:22 2016 +0200
Branches: master
https://developer.blender.org/rBe26eb9c93bdeae0b52153a7fcf37bceebffd6304

Cycles: reduce CUDA stack memory access for Maxwell and up, increasing max registers.

For non-branched path tracing on a GTX 960 with CUDA 7.5, this slightly reduces stack
usage, but the main gain is render time: 8% faster on BMW, 5% on pabellon, 13% on classroom.

===================================================================

M	intern/cycles/kernel/kernels/cuda/kernel.cu

===================================================================

diff --git a/intern/cycles/kernel/kernels/cuda/kernel.cu b/intern/cycles/kernel/kernels/cuda/kernel.cu
index 37fae54..eb2b6ea 100644
--- a/intern/cycles/kernel/kernels/cuda/kernel.cu
+++ b/intern/cycles/kernel/kernels/cuda/kernel.cu
@@ -77,8 +77,8 @@
 #  define CUDA_KERNEL_MAX_REGISTERS 63
 #  define CUDA_KERNEL_BRANCHED_MAX_REGISTERS 63
 
-/* 5.0, 5.2 and 5.3 */
-#elif __CUDA_ARCH__ == 500 || __CUDA_ARCH__ == 520 || __CUDA_ARCH__ == 530
+/* 5.0, 5.2, 5.3, 6.0, 6.1 */
+#elif __CUDA_ARCH__ >= 500
 #  define CUDA_MULTIPRESSOR_MAX_REGISTERS 65536
 #  define CUDA_MULTIPROCESSOR_MAX_BLOCKS 32
 #  define CUDA_BLOCK_MAX_THREADS 1024
@@ -86,7 +86,7 @@
 
 /* tunable parameters */
 #  define CUDA_THREADS_BLOCK_WIDTH 16
-#  define CUDA_KERNEL_MAX_REGISTERS 40
+#  define CUDA_KERNEL_MAX_REGISTERS 48
 #  define CUDA_KERNEL_BRANCHED_MAX_REGISTERS 63
 
 /* unknown architecture */
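
The trade-off behind raising CUDA_KERNEL_MAX_REGISTERS from 40 to 48 can be sketched with a little arithmetic. The following is a minimal host-side illustration, not code from the commit: the helper names are hypothetical, and it assumes the cap is the number of registers each thread actually uses and that the register file is the only limiting resource (it ignores shared memory, the resident-thread limit and register allocation granularity). The constants 65536 and 16 are taken from the diff above.

```cpp
// Hypothetical helper names; only the two constants below come from the diff.
constexpr int kMultiprocessorMaxRegisters = 65536;  // CUDA_MULTIPRESSOR_MAX_REGISTERS
constexpr int kBlockWidth = 16;                     // CUDA_THREADS_BLOCK_WIDTH
constexpr int kThreadsPerBlock = kBlockWidth * kBlockWidth;  // assumes a 16x16 block

// Number of whole blocks whose registers fit on one multiprocessor when
// every thread uses regs_per_thread registers (integer division rounds down).
constexpr int blocks_by_registers(int regs_per_thread) {
    return kMultiprocessorMaxRegisters / (kThreadsPerBlock * regs_per_thread);
}

// Resident threads per multiprocessor implied by that block count.
constexpr int threads_by_registers(int regs_per_thread) {
    return blocks_by_registers(regs_per_thread) * kThreadsPerBlock;
}

// Old cap: 65536 / (256 * 40) = 6.4, so 6 blocks and 1536 threads.
static_assert(blocks_by_registers(40) == 6, "old register cap");
static_assert(threads_by_registers(40) == 1536, "old register cap");

// New cap: 65536 / (256 * 48) = 5.33, so 5 blocks and 1280 threads.
static_assert(blocks_by_registers(48) == 5, "new register cap");
static_assert(threads_by_registers(48) == 1280, "new register cap");
```

Under these assumptions the higher cap leaves room for one fewer resident block per multiprocessor, in exchange for fewer values spilled from registers to stack (local) memory. The timings in the commit message suggest the reduced spilling outweighs the lower occupancy on the tested card.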



