[Bf-blender-cvs] [a241fc44456] cycles-x: Cleanup: clarify code comments, use same logic for future GPU device

Wed Aug 25 19:25:03 CEST 2021

Commit: a241fc44456a519ff7f255216128ac69c2dbe359
Author: Brecht Van Lommel
Date:   Wed Aug 25 17:37:31 2021 +0200
Branches: cycles-x
https://developer.blender.org/rBa241fc44456a519ff7f255216128ac69c2dbe359

Cleanup: clarify code comments, use same logic for future GPU device

===================================================================

M	intern/cycles/kernel/kernel_work_stealing.h

===================================================================

diff --git a/intern/cycles/kernel/kernel_work_stealing.h b/intern/cycles/kernel/kernel_work_stealing.h
index 2c68b55e207..fab0915c38e 100644
--- a/intern/cycles/kernel/kernel_work_stealing.h
+++ b/intern/cycles/kernel/kernel_work_stealing.h
@@ -29,15 +29,18 @@ ccl_device_inline void get_work_pixel(ccl_global const KernelWorkTile *tile,
                                       ccl_private uint *y,
                                       ccl_private uint *sample)
 {
-#ifdef __KERNEL_CUDA__
-  /* Keeping threads for the same pixel together improves performance on CUDA. */
-  uint sample_offset = global_work_index % tile->num_samples;
-  uint pixel_offset = global_work_index / tile->num_samples;
-#else  /* __KERNEL_CUDA__ */
+#if 0
+  /* Keep threads for the same sample together. */
   uint tile_pixels = tile->w * tile->h;
   uint sample_offset = global_work_index / tile_pixels;
   uint pixel_offset = global_work_index - sample_offset * tile_pixels;
-#endif /* __KERNEL_CUDA__ */
+#else
+  /* Keeping threads for the same pixel together.
+   * Appears to improve performance by a few % on CUDA and OptiX. */
+  uint sample_offset = global_work_index % tile->num_samples;
+  uint pixel_offset = global_work_index / tile->num_samples;
+#endif
+
   uint y_offset = pixel_offset / tile->w;
   uint x_offset = pixel_offset - y_offset * tile->w;
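
The two branches of the #if above differ only in how a flat work index is split into a sample and a pixel. The following is a minimal standalone sketch of both mappings, not the Cycles code itself: WorkTile, WorkPixel and the two function names are hypothetical stand-ins, and only the relative offsets are computed (any tile origin or start-sample adjustment the real function performs is left out).

```c
#include <stdint.h>

/* Hypothetical stand-in for KernelWorkTile, reduced to the fields
 * that the diff reads. */
typedef struct WorkTile {
  uint32_t w, h;        /* tile size in pixels */
  uint32_t num_samples; /* samples per pixel in this tile */
} WorkTile;

/* Hypothetical result type: offsets relative to the tile origin. */
typedef struct WorkPixel {
  uint32_t x_offset, y_offset, sample_offset;
} WorkPixel;

/* Split a linear pixel offset into row and column, as in the diff. */
static WorkPixel split_pixel(const WorkTile *tile, uint32_t pixel_offset, uint32_t sample_offset)
{
  WorkPixel p;
  p.y_offset = pixel_offset / tile->w;
  p.x_offset = pixel_offset - p.y_offset * tile->w;
  p.sample_offset = sample_offset;
  return p;
}

/* Disabled "#if 0" branch: consecutive work indices cover all pixels
 * of one sample before moving on to the next sample. */
static WorkPixel work_pixel_sample_major(const WorkTile *tile, uint32_t global_work_index)
{
  uint32_t tile_pixels = tile->w * tile->h;
  uint32_t sample_offset = global_work_index / tile_pixels;
  uint32_t pixel_offset = global_work_index - sample_offset * tile_pixels;
  return split_pixel(tile, pixel_offset, sample_offset);
}

/* Active "#else" branch: consecutive work indices cover all samples
 * of one pixel before moving on to the next pixel. */
static WorkPixel work_pixel_pixel_major(const WorkTile *tile, uint32_t global_work_index)
{
  uint32_t sample_offset = global_work_index % tile->num_samples;
  uint32_t pixel_offset = global_work_index / tile->num_samples;
  return split_pixel(tile, pixel_offset, sample_offset);
}
```

For a 4x2 tile with 3 samples, work index 7 maps to pixel (2, 0), sample 1 in the pixel-major layout, and to pixel (3, 1), sample 0 in the sample-major layout. In the pixel-major layout, neighbouring work indices (and so, presumably, neighbouring GPU threads) share a pixel, which is the property the new comment credits for the small speedup on CUDA and OptiX.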