[Bf-blender-cvs] [7a8629def86] temp-cycles-denoising: Cycles Denoising: Allocate more shared memory for the CUDA reconstruction kernel

Lukas Stockner noreply at git.blender.org
Fri Apr 14 00:57:23 CEST 2017


Commit: 7a8629def864ea7c9ebe2f8416ba668ac93e5aa5
Author: Lukas Stockner
Date:   Thu Mar 30 00:21:22 2017 +0200
Branches: temp-cycles-denoising
https://developer.blender.org/rB7a8629def864ea7c9ebe2f8416ba668ac93e5aa5

Cycles Denoising: Allocate more shared memory for the CUDA reconstruction kernel

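With the L1-preferring cache configuration, the small shared memory size limited the reconstruction kernel to a single resident block per multiprocessor, which slowed down CUDA denoising due to poor occupancy. Preferring shared memory for this kernel makes room for more resident blocks.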
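
As a rough illustration of the occupancy argument, the sketch below computes how many blocks fit on one multiprocessor when shared memory is the only limit. It assumes a device with a configurable 64 KB L1/shared split (compute capability 2.x/3.x), where CU_FUNC_CACHE_PREFER_L1 yields 16 KB of shared memory and CU_FUNC_CACHE_PREFER_SHARED yields 48 KB. The per-block figure of 12 KB and the helper name max_resident_blocks are hypothetical; the commit does not state the kernel's actual shared memory use.

```cpp
#include <cassert>
#include <cstddef>

// Shared memory available per multiprocessor under each cache preference,
// assuming a device with a configurable 64 KB L1/shared split.
const std::size_t SHARED_PREFER_L1     = 16 * 1024;
const std::size_t SHARED_PREFER_SHARED = 48 * 1024;

// Hypothetical per-block shared memory use. The real figure for
// cuNLMConstructGramian is not given in the commit.
const std::size_t ASSUMED_SHARED_PER_BLOCK = 12 * 1024;

// Upper bound on resident blocks per multiprocessor imposed by shared
// memory alone. Register and thread-count limits are ignored here.
inline std::size_t max_resident_blocks(std::size_t shared_per_sm,
                                       std::size_t shared_per_block)
{
	assert(shared_per_block > 0);
	return shared_per_sm / shared_per_block;
}
```

Under these assumptions, max_resident_blocks(SHARED_PREFER_L1, ASSUMED_SHARED_PER_BLOCK) gives 1 and max_resident_blocks(SHARED_PREFER_SHARED, ASSUMED_SHARED_PER_BLOCK) gives 4. The cache configuration is only a preference, and it has no effect on devices where the L1 and shared memory sizes are fixed.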

===================================================================

M	intern/cycles/device/device_cuda.cpp

===================================================================

diff --git a/intern/cycles/device/device_cuda.cpp b/intern/cycles/device/device_cuda.cpp
index 5bfaacb2d29..88cb3085a29 100644
--- a/intern/cycles/device/device_cuda.cpp
+++ b/intern/cycles/device/device_cuda.cpp
@@ -1056,7 +1056,7 @@ public:
 		cuda_assert(cuFuncSetCacheConfig(cuNLMCalcDifference,   CU_FUNC_CACHE_PREFER_L1));
 		cuda_assert(cuFuncSetCacheConfig(cuNLMBlur,             CU_FUNC_CACHE_PREFER_L1));
 		cuda_assert(cuFuncSetCacheConfig(cuNLMCalcWeight,       CU_FUNC_CACHE_PREFER_L1));
-		cuda_assert(cuFuncSetCacheConfig(cuNLMConstructGramian, CU_FUNC_CACHE_PREFER_L1));
+		cuda_assert(cuFuncSetCacheConfig(cuNLMConstructGramian, CU_FUNC_CACHE_PREFER_SHARED));
 		cuda_assert(cuFuncSetCacheConfig(cuFinalize,            CU_FUNC_CACHE_PREFER_L1));
 
 		CUDA_GET_BLOCKSIZE(cuNLMCalcDifference,



