[Bf-blender-cvs] [25df3ca] soc-2016-cycles_denoising: Cycles: Fix wrong stride in buffer accesses when the tile size plus overscan wasn't a multiple of 4

Lukas Stockner noreply at git.blender.org
Sun Aug 21 06:18:10 CEST 2016


Commit: 25df3ca1475ebc477ebd9aee4c7626c7195ba0df
Author: Lukas Stockner
Date:   Wed Aug 17 11:33:51 2016 +0200
Branches: soc-2016-cycles_denoising
https://developer.blender.org/rB25df3ca1475ebc477ebd9aee4c7626c7195ba0df

Cycles: Fix wrong stride in buffer accesses when the tile size plus overscan wasn't a multiple of 4

===================================================================

M	intern/cycles/device/device_cuda.cpp

===================================================================

diff --git a/intern/cycles/device/device_cuda.cpp b/intern/cycles/device/device_cuda.cpp
index 0131334..6ddf001 100644
--- a/intern/cycles/device/device_cuda.cpp
+++ b/intern/cycles/device/device_cuda.cpp
@@ -856,8 +856,9 @@ public:
 		int yblocks = (rtile.h + ythreads - 1)/ythreads;
 
 		CUdeviceptr d_denoise_buffer;
-		cuda_assert(cuMemAlloc(&d_denoise_buffer, 22*rtile.w*rtile.h*sizeof(float)));
-		int pass_stride = rtile.w*rtile.h;
+		int w = align_up(rtile.w, 4);
+		int pass_stride = w*rtile.h;
+		cuda_assert(cuMemAlloc(&d_denoise_buffer, 22*pass_stride*sizeof(float)));
 #define CUDA_PTR_ADD(ptr, x) ((CUdeviceptr) (((float*) (ptr)) + (x)))
 
 		/* ==== Step 1: Prefilter general features. ==== */
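
The arithmetic behind the hunk above can be sketched in isolation. This is a minimal illustration, not the Cycles implementation: round_up_to_multiple() is a hypothetical stand-in for the align_up() call in the patch, assumed here to round up to the next multiple of a power-of-two alignment, and padded_pass_stride() and denoise_buffer_bytes() are names invented for this example. The pass count of 22 and the alignment of 4 are taken from the diff.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the align_up() call in the patch (assumption):
// rounds `value` up to the next multiple of `alignment`, which must be a
// positive power of two.
inline int round_up_to_multiple(int value, int alignment)
{
	assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
	return (value + alignment - 1) & ~(alignment - 1);
}

// Number of floats from the start of one pass to the start of the next,
// with the row width padded to a multiple of 4 as in the patched code.
inline int padded_pass_stride(int tile_w, int tile_h)
{
	return round_up_to_multiple(tile_w, 4) * tile_h;
}

// Bytes needed for `num_passes` passes laid out with the padded stride.
inline std::size_t denoise_buffer_bytes(int tile_w, int tile_h, int num_passes)
{
	return (std::size_t)num_passes * padded_pass_stride(tile_w, tile_h) * sizeof(float);
}
```

For a tile that is 30 wide and 20 high, the old code would use a pass stride of 30*20 = 600 floats, while the padded version gives 32*20 = 640. With 22 passes of 4-byte floats, that is 56320 bytes instead of 52800, so an allocation sized from the unpadded width would be too small for accesses that assume the padded stride.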



