[Bf-blender-cvs] [7a8629def86] temp-cycles-denoising: Cycles Denoising: Allocate more shared memory for the CUDA reconstruction kernel
Lukas Stockner
noreply at git.blender.org
Fri Apr 14 00:57:23 CEST 2017
Commit: 7a8629def864ea7c9ebe2f8416ba668ac93e5aa5
Author: Lukas Stockner
Date: Thu Mar 30 00:21:22 2017 +0200
Branches: temp-cycles-denoising
https://developer.blender.org/rB7a8629def864ea7c9ebe2f8416ba668ac93e5aa5
Cycles Denoising: Allocate more shared memory for the CUDA reconstruction kernel
The small shared memory size limited execution to a single block, which slowed down CUDA denoising because of poor occupancy.
===================================================================
M intern/cycles/device/device_cuda.cpp
===================================================================
diff --git a/intern/cycles/device/device_cuda.cpp b/intern/cycles/device/device_cuda.cpp
index 5bfaacb2d29..88cb3085a29 100644
--- a/intern/cycles/device/device_cuda.cpp
+++ b/intern/cycles/device/device_cuda.cpp
@@ -1056,7 +1056,7 @@ public:
cuda_assert(cuFuncSetCacheConfig(cuNLMCalcDifference, CU_FUNC_CACHE_PREFER_L1));
cuda_assert(cuFuncSetCacheConfig(cuNLMBlur, CU_FUNC_CACHE_PREFER_L1));
cuda_assert(cuFuncSetCacheConfig(cuNLMCalcWeight, CU_FUNC_CACHE_PREFER_L1));
- cuda_assert(cuFuncSetCacheConfig(cuNLMConstructGramian, CU_FUNC_CACHE_PREFER_L1));
+ cuda_assert(cuFuncSetCacheConfig(cuNLMConstructGramian, CU_FUNC_CACHE_PREFER_SHARED));
cuda_assert(cuFuncSetCacheConfig(cuFinalize, CU_FUNC_CACHE_PREFER_L1));
CUDA_GET_BLOCKSIZE(cuNLMCalcDifference,