[Bf-committers] openmp expert needed

Lars Krueger lars_e_krueger at gmx.de
Sat Apr 10 12:44:57 CEST 2010


-------- Original Message --------
> Date: Fri, 9 Apr 2010 19:02:19 -0800
> From: Tom M <letterrip at gmail.com>
> To: bf-blender developers <bf-committers at blender.org>
> Subject: [Bf-committers] openmp expert needed

> any openmp expert out there who might be willing to look at our sculpt.c
> file?
> 
> Reports in the forum suggest that we are actually getting a slow down
> for many users with openmp enabled for sculpt so perhaps something is
> being done wrong?

Just from looking at the code and your description, without any profiling, this sounds like parallelisation that is too fine-grained, so that the thread-switch overhead outweighs the actual work.

In revision 28112 of ./source/blender/editors/sculpt_paint/sculpt.c, in the for loop starting at line 776, you might try to eliminate the lock like this: partition the nodes into n groups for n threads (omp_get_num_threads) and compute the sums of (out, nout) per thread. Then compute the sum of those per-thread sums. This should remove most of the overhead. You have to do this by hand, because the reduction clause cannot do this with functions, only with operators.
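
The per-thread sums could be sketched roughly as follows. This is not the code from sculpt.c: SketchNode, SketchPartial, sketch_add_node() and sketch_sum_nodes() are hypothetical stand-ins, and SKETCH_MAX_THREADS is an assumed upper bound. The sketch calls omp_get_max_threads() before the parallel region, because omp_get_num_threads() returns 1 when called outside of one.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define SKETCH_MAX_THREADS 64 /* assumed upper bound, illustration only */

/* Hypothetical stand-in for the per-node data handled in sculpt.c. */
typedef struct SketchNode {
    float a[3]; /* contribution to out */
    float b[3]; /* contribution to nout */
} SketchNode;

/* One slot of partial sums per thread. */
typedef struct SketchPartial {
    float out[3];
    float nout[3];
} SketchPartial;

/* Hypothetical per-node work, done in a function rather than by an operator. */
static void sketch_add_node(const SketchNode *node, SketchPartial *acc)
{
    int k;
    for (k = 0; k < 3; k++) {
        acc->out[k] += node->a[k];
        acc->nout[k] += node->b[k];
    }
}

static void sketch_sum_nodes(const SketchNode *nodes, int totnode,
                             float out[3], float nout[3])
{
    static const SketchPartial zero = {{0.0f, 0.0f, 0.0f}, {0.0f, 0.0f, 0.0f}};
    SketchPartial partial[SKETCH_MAX_THREADS];
    int nthreads = 1; /* stays 1 when compiled without OpenMP */
    int t, k;

#ifdef _OPENMP
    nthreads = omp_get_max_threads();
    if (nthreads > SKETCH_MAX_THREADS) nthreads = SKETCH_MAX_THREADS;
#endif
    for (t = 0; t < nthreads; t++) partial[t] = zero;

    #pragma omp parallel num_threads(nthreads)
    {
        SketchPartial local = zero; /* thread-private accumulator, no lock */
        int tid = 0;
        int n;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        assert(tid >= 0 && tid < nthreads);

        #pragma omp for
        for (n = 0; n < totnode; n++) {
            sketch_add_node(&nodes[n], &local);
        }
        partial[tid] = local; /* each thread writes only its own slot */
    }

    /* Sum of sums, computed serially after the parallel region. */
    for (k = 0; k < 3; k++) {
        out[k] = 0.0f;
        nout[k] = 0.0f;
    }
    for (t = 0; t < nthreads; t++) {
        for (k = 0; k < 3; k++) {
            out[k] += partial[t].out[k];
            nout[k] += partial[t].nout[k];
        }
    }
}
```

Accumulating into a local variable and copying it to the shared array once per thread also avoids several threads writing to neighbouring array slots inside the loop. Note that the order of the floating-point additions differs from the serial loop, so results may differ in the last bits.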

If openmp is not available, you can simply set the number of threads to 1 in order to avoid the #ifdef mess.
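
One way to keep the #ifdef in a single place might look like the sketch below. The helper names sketch_max_threads(), sketch_thread_id() and sketch_chunk() are made up for illustration and do not exist in Blender. Only _OPENMP, omp_get_max_threads() and omp_get_thread_num() come from OpenMP itself.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: number of threads to partition the work for.
 * Without OpenMP this is simply 1. */
static int sketch_max_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Hypothetical helper: index of the calling thread.
 * Without OpenMP there is only thread 0. */
static int sketch_thread_id(void)
{
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

/* Hypothetical helper: splits [0, total) into nthreads contiguous groups.
 * Group t covers the half-open range [*begin, *end). */
static void sketch_chunk(int total, int nthreads, int t, int *begin, int *end)
{
    assert(nthreads > 0 && t >= 0 && t < nthreads);
    *begin = (int)((long long)total * t / nthreads);
    *end = (int)((long long)total * (t + 1) / nthreads);
}
```

The rest of the code can then call these helpers unconditionally. With one thread, sketch_chunk() returns the whole range, so the partitioned loop degenerates to the plain serial loop.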

The remaining loops look ok for large enough totnode. Keep in mind that you trade the time to set up the threads against the time you gain by parallel processing. If your total processing time (without omp) is small (a few milliseconds), you might not gain anything from omp. My experience is with larger loops (seconds of total CPU time) only, so I'm not sure where the break-even point is.
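
If the break-even point turns out to matter, the if clause of the parallel construct can switch threading off for small loops. A rough sketch, where sketch_scale() is a made-up example loop and SKETCH_MIN_NODES is an assumed threshold that would have to be found by measuring:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MIN_NODES 1000 /* assumed threshold, must be measured */

/* Hypothetical loop body: multiplies every value by factor.
 * The loop is only run in parallel when totnode exceeds the threshold;
 * otherwise the if clause makes it run on a single thread. */
static void sketch_scale(float *values, int totnode, float factor)
{
    int n;
    assert(values != NULL || totnode == 0);

    #pragma omp parallel for if(totnode > SKETCH_MIN_NODES)
    for (n = 0; n < totnode; n++) {
        values[n] *= factor;
    }
}
```

Compiled without OpenMP, the pragma is ignored and the loop runs serially for every totnode.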

My 2 cents,
-- 
Dr. Lars Krueger



