[Bf-blender-cvs] [c4e42d70a49] master: Task scheduler: Add minimum number of iterations per thread in parallel range

Sergey Sharybin noreply at git.blender.org
Tue Jan 9 16:11:52 CET 2018


Commit: c4e42d70a4949352f1233574cfc2da30c097439d
Author: Sergey Sharybin
Date:   Mon Jan 8 12:08:18 2018 +0100
Branches: master
https://developer.blender.org/rBc4e42d70a4949352f1233574cfc2da30c097439d

Task scheduler: Add minimum number of iterations per thread in parallel range

The idea is to support the following: allow running a parallel for over a small
range, each iteration of which takes a lot of compute power, while limiting
that range to a subset of threads.

For example, on a machine with 44 threads we can occupy 4 threads to handle a
range of 64 elements, 16 elements per thread, where each block of 16 elements
is very complex to compute.
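
The arithmetic behind this example can be sketched in isolation. The snippet below is only an illustration that mirrors the static-scheduling expressions from the task.c hunk of this commit; it is not the scheduler itself. The names max_int, min_int, RangeSplit and split_static_range are hypothetical and do not exist in Blender, and treating the initial task count as equal to the 44 threads of the example is an assumption that the diff does not confirm.

```c
#include <assert.h>

/* Hypothetical stand-ins for the max_ii()/min_ii() helpers used in the patch. */
static int max_int(int a, int b) { return a > b ? a : b; }
static int min_int(int a, int b) { return a < b ? a : b; }

/* Hypothetical result type for this sketch, not part of Blender. */
typedef struct RangeSplit {
	int chunk_size;  /* Iterations handed to one task. */
	int num_tasks;   /* Tasks that actually receive work. */
} RangeSplit;

/* Mirrors the static-scheduling arithmetic of the patch:
 * the chunk is never smaller than min_iter_per_thread, and the task count
 * is capped by how many such chunks fit into [start, stop).
 *
 * Assumption: num_tasks is the task count the scheduler starts with; here it
 * is taken to be the number of threads. */
static RangeSplit split_static_range(int start, int stop,
                                     int num_tasks,
                                     int min_iter_per_thread)
{
	RangeSplit split;
	/* Guards for this sketch only: a non-positive minimum could produce
	 * a zero chunk size and a division by zero below. */
	assert(stop > start);
	assert(num_tasks > 0);
	assert(min_iter_per_thread > 0);
	split.chunk_size = max_int(min_iter_per_thread,
	                           (stop - start) / num_tasks);
	split.num_tasks = min_int(num_tasks,
	                          (stop - start) / split.chunk_size);
	return split;
}
```

Under these assumptions, split_static_range(0, 64, 44, 16) gives a chunk size of 16 and 4 tasks, matching the example above, while a minimum of 1 (the default set by this commit) gives 44 tasks of 1 iteration each.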

The intent is to use this setting instead of the global use_threading flag,
which is based only on the size of the array. Proper use of the new setting
will improve threadability.

This commit only contains internal task scheduler changes; the setting is not
yet used by any area.

===================================================================

M	source/blender/blenlib/BLI_task.h
M	source/blender/blenlib/intern/task.c

===================================================================

diff --git a/source/blender/blenlib/BLI_task.h b/source/blender/blenlib/BLI_task.h
index b4c374d3fe7..0f78d2f4361 100644
--- a/source/blender/blenlib/BLI_task.h
+++ b/source/blender/blenlib/BLI_task.h
@@ -167,6 +167,17 @@ typedef struct ParallelRangeSettings {
 	 * processed.
 	 */
 	TaskParallelRangeFuncFinalize func_finalize;
+	/* Minimum allowed number of range iterations to be handled by a single
+	 * thread. This makes it possible to:
+	 * - Reduce the amount of threading overhead.
+	 * - Partially occupy the thread pool with ranges which are computationally
+	 *   expensive, but which are smaller than the number of available threads.
+	 *   For example, it's possible to multi-thread a [0 .. 64] range across 4
+	 *   threads which will be doing 16 iterations each.
+	 * This is a preferred way to tell the scheduler when to start threading,
+	 * rather than having a global use_threading switch based on just range size.
+	 */
+	int min_iter_per_thread;
 } ParallelRangeSettings;
 
 BLI_INLINE void BLI_parallel_range_settings_defaults(
@@ -203,6 +214,11 @@ BLI_INLINE void BLI_parallel_range_settings_defaults(
 	memset(settings, 0, sizeof(*settings));
 	settings->use_threading = true;
 	settings->scheduling_mode = TASK_SCHEDULING_STATIC;
+	/* NOTE: The current value mimics the old behavior, but it's not ideal by
+	 * any means. Would be cool to find a common value which will work well
+	 * enough for both static and dynamic scheduling.
+	 */
+	settings->min_iter_per_thread = 1;
 }
 
 #ifdef __cplusplus
diff --git a/source/blender/blenlib/intern/task.c b/source/blender/blenlib/intern/task.c
index f2a14aa9363..ba600be870b 100644
--- a/source/blender/blenlib/intern/task.c
+++ b/source/blender/blenlib/intern/task.c
@@ -1113,15 +1113,22 @@ void BLI_task_parallel_range(int start, int stop,
 	state.iter = start;
 	switch (settings->scheduling_mode) {
 		case TASK_SCHEDULING_STATIC:
-			state.chunk_size = max_ii(1, (stop - start) / (num_tasks));
+			state.chunk_size = max_ii(
+			        settings->min_iter_per_thread,
+			        (stop - start) / (num_tasks));
 			break;
 		case TASK_SCHEDULING_DYNAMIC:
+			/* TODO(sergey): Make it configurable from min_iter_per_thread. */
 			state.chunk_size = 32;
 			break;
 	}
 
 	num_tasks = min_ii(num_tasks, (stop - start) / state.chunk_size);
 
+	/* TODO(sergey): If number of tasks happened to be 1, use single threaded
+	 * path.
+	 */
+
 	/* NOTE: This way we are adding a memory barrier and ensure all worker
 	 * threads can read and modify the value, without any locks. */
 	atomic_fetch_and_add_int32(&state.iter, 0);


