[Bf-blender-cvs] [25df3ca] soc-2016-cycles_denoising: Cycles: Fix wrong stride in buffer accesses when the tile size plus overscan wasn't a multiple of 4
Lukas Stockner
noreply at git.blender.org
Sun Aug 21 06:18:10 CEST 2016
Commit: 25df3ca1475ebc477ebd9aee4c7626c7195ba0df
Author: Lukas Stockner
Date: Wed Aug 17 11:33:51 2016 +0200
Branches: soc-2016-cycles_denoising
https://developer.blender.org/rB25df3ca1475ebc477ebd9aee4c7626c7195ba0df
Cycles: Fix wrong stride in buffer accesses when the tile size plus overscan wasn't a multiple of 4
===================================================================
M intern/cycles/device/device_cuda.cpp
===================================================================
diff --git a/intern/cycles/device/device_cuda.cpp b/intern/cycles/device/device_cuda.cpp
index 0131334..6ddf001 100644
--- a/intern/cycles/device/device_cuda.cpp
+++ b/intern/cycles/device/device_cuda.cpp
@@ -856,8 +856,9 @@ public:
int yblocks = (rtile.h + ythreads - 1)/ythreads;
CUdeviceptr d_denoise_buffer;
- cuda_assert(cuMemAlloc(&d_denoise_buffer, 22*rtile.w*rtile.h*sizeof(float)));
- int pass_stride = rtile.w*rtile.h;
+ int w = align_up(rtile.w, 4);
+ int pass_stride = w*rtile.h;
+ cuda_assert(cuMemAlloc(&d_denoise_buffer, 22*pass_stride*sizeof(float)));
#define CUDA_PTR_ADD(ptr, x) ((CUdeviceptr) (((float*) (ptr)) + (x)))
/* ==== Step 1: Prefilter general features. ==== */
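The stride arithmetic in the hunk above can be sketched in isolation. This is a minimal illustration, not the Cycles code: the align_up() below is a stand-in that assumes round-up-to-a-power-of-two semantics, and padded_pass_stride() and denoise_buffer_bytes() are hypothetical helper names introduced only for this example. The factor 22 and the alignment of 4 are taken from the diff.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the align_up() call in the diff. Assumed semantics: round
// `value` up to the next multiple of `alignment`, which must be a power of two.
inline int align_up(int value, int alignment)
{
    assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: number of floats in one pass when every row is
// padded to a multiple of 4, mirroring `w*rtile.h` in the new code.
inline int padded_pass_stride(int tile_w, int tile_h)
{
    return align_up(tile_w, 4) * tile_h;
}

// Hypothetical helper: bytes to allocate for the 22 passes, mirroring
// `22*pass_stride*sizeof(float)` in the new code.
inline std::size_t denoise_buffer_bytes(int tile_w, int tile_h)
{
    return std::size_t(22) * padded_pass_stride(tile_w, tile_h) * sizeof(float);
}
```

For a tile that is 30 pixels wide, the padded row width is 32, so each pass occupies 32*h floats rather than 30*h. Presumably the denoising kernels index the buffer with the padded width, in which case the old unpadded stride made consecutive passes overlap and the allocation too small; deriving both the stride and the allocation size from the same padded value keeps them consistent.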