[Bf-blender-cvs] [a241fc44456] cycles-x: Cleanup: clarify code comments, use same logic for future GPU device
Brecht Van Lommel
noreply at git.blender.org
Wed Aug 25 19:25:03 CEST 2021
Commit: a241fc44456a519ff7f255216128ac69c2dbe359
Author: Brecht Van Lommel
Date: Wed Aug 25 17:37:31 2021 +0200
Branches: cycles-x
https://developer.blender.org/rBa241fc44456a519ff7f255216128ac69c2dbe359
Cleanup: clarify code comments, use same logic for future GPU device
===================================================================
M intern/cycles/kernel/kernel_work_stealing.h
===================================================================
diff --git a/intern/cycles/kernel/kernel_work_stealing.h b/intern/cycles/kernel/kernel_work_stealing.h
index 2c68b55e207..fab0915c38e 100644
--- a/intern/cycles/kernel/kernel_work_stealing.h
+++ b/intern/cycles/kernel/kernel_work_stealing.h
@@ -29,15 +29,18 @@ ccl_device_inline void get_work_pixel(ccl_global const KernelWorkTile *tile,
ccl_private uint *y,
ccl_private uint *sample)
{
-#ifdef __KERNEL_CUDA__
- /* Keeping threads for the same pixel together improves performance on CUDA. */
- uint sample_offset = global_work_index % tile->num_samples;
- uint pixel_offset = global_work_index / tile->num_samples;
-#else /* __KERNEL_CUDA__ */
+#if 0
+ /* Keep threads for the same sample together. */
uint tile_pixels = tile->w * tile->h;
uint sample_offset = global_work_index / tile_pixels;
uint pixel_offset = global_work_index - sample_offset * tile_pixels;
-#endif /* __KERNEL_CUDA__ */
+#else
+ /* Keeping threads for the same pixel together.
+ * Appears to improve performance by a few % on CUDA and OptiX. */
+ uint sample_offset = global_work_index % tile->num_samples;
+ uint pixel_offset = global_work_index / tile->num_samples;
+#endif
+
uint y_offset = pixel_offset / tile->w;
uint x_offset = pixel_offset - y_offset * tile->w;
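For reference, the two index mappings in the hunk above can be compared outside the kernel with a minimal standalone sketch. WorkTile, WorkPixel and the three function names are hypothetical stand-ins, not Cycles identifiers. The final step that adds tile->x, tile->y and tile->start_sample is an assumption, since that part of get_work_pixel is outside the diff context.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for KernelWorkTile; only the fields used here. */
typedef struct WorkTile {
  uint32_t x, y, w, h;
  uint32_t start_sample, num_samples;
} WorkTile;

typedef struct WorkPixel {
  uint32_t x, y, sample;
} WorkPixel;

/* Shared tail of the mapping. Adding the tile origin and start sample is an
 * assumption; the diff only shows the x/y offset computation. */
static WorkPixel offsets_to_pixel(const WorkTile *tile,
                                  uint32_t pixel_offset,
                                  uint32_t sample_offset)
{
  const uint32_t y_offset = pixel_offset / tile->w;
  const uint32_t x_offset = pixel_offset - y_offset * tile->w;
  WorkPixel p = {tile->x + x_offset, tile->y + y_offset, tile->start_sample + sample_offset};
  return p;
}

/* Disabled "#if 0" branch: consecutive work indices cover all pixels of one
 * sample before moving on to the next sample. */
static WorkPixel work_pixel_sample_major(const WorkTile *tile, uint32_t global_work_index)
{
  assert(tile->w > 0 && tile->h > 0);
  const uint32_t tile_pixels = tile->w * tile->h;
  const uint32_t sample_offset = global_work_index / tile_pixels;
  const uint32_t pixel_offset = global_work_index - sample_offset * tile_pixels;
  return offsets_to_pixel(tile, pixel_offset, sample_offset);
}

/* Active "#else" branch: consecutive work indices cover all samples of one
 * pixel before moving on to the next pixel. */
static WorkPixel work_pixel_pixel_major(const WorkTile *tile, uint32_t global_work_index)
{
  assert(tile->w > 0 && tile->num_samples > 0);
  const uint32_t sample_offset = global_work_index % tile->num_samples;
  const uint32_t pixel_offset = global_work_index / tile->num_samples;
  return offsets_to_pixel(tile, pixel_offset, sample_offset);
}
```

With a 4x2 tile and 3 samples, work indices 0, 1 and 2 map to the same pixel under the pixel-major variant and to three neighbouring pixels of sample 0 under the sample-major variant. Both variants visit the same set of (pixel, sample) pairs, only in a different order.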