[Bf-committers] OMP to BLI_task: About threading and lockfree operations
montagne29 at wanadoo.fr
Tue Jan 26 21:57:34 CET 2016
So, while working on switching from OMP to BLI_task, today I hit a
rather hairy issue in pbvh_update_normals() (in BKE's pbvh.c).
Current OMP-based code of that function executes in (roughly) 28ms.
However, reading it, it's obvious it's lacking a `#pragma omp critical`
around the main part of the second OMP loop, since an affected MVert may
be used by several nodes, and hence can be written concurrently here.
If I add that critical section, the code takes over 100ms to run!
Changing to BLI_task parallelized code, with the same correct protection
(using a spinlock), I can get about 70ms. Using a gcc atomic op
(__sync_fetch_and_and), since atomically checking and clearing the
ME_VERT_PBVH_UPDATE flag is enough to prevent concurrency here,
it can go down to 20ms.
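
For illustration, here is a minimal sketch of the atomic "test and clear" idea using the GCC builtin `__sync_fetch_and_and`. It is not the actual pbvh.c code: the flag value `EXAMPLE_VERT_UPDATE_FLAG` and the helper `vert_flag_test_and_clear()` are hypothetical names, and the flag is assumed to be an 8-bit unsigned variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit, for illustration only (not Blender's real value). */
#define EXAMPLE_VERT_UPDATE_FLAG ((unsigned char)(1 << 1))

/* Atomically clear `mask` in `*flag` and report whether it was set before.
 * If several threads race on the same flag, only one of them sees `true`,
 * so only that thread goes on to process the vertex. */
static bool vert_flag_test_and_clear(unsigned char *flag, unsigned char mask)
{
    assert(flag != NULL);
    /* GCC builtin: performs `*flag &= ~mask` atomically, returns the old value. */
    unsigned char prev = __sync_fetch_and_and(flag, (unsigned char)~mask);
    return (prev & mask) != 0;
}
```

A caller would wrap the per-vertex work in `if (vert_flag_test_and_clear(&flag, EXAMPLE_VERT_UPDATE_FLAG)) { ... }`, so the check and the clear happen as a single step and no lock is needed around them.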
Sad part of the story: because MVert->flag is an 8-bit var, there is no
clean way to do the same under OSX. We can only hack around
OSAtomicTestAndClear, but that's far from nice and clean (e.g. we need to
retrieve the bit number from the bitmask).
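
To make the "bit number from bitmask" step concrete, here is a small portable sketch. The helper names are hypothetical. The second helper assumes the bit numbering described in the OSX atomic(3) man page, where bit `n` refers to `(0x80 >> (n & 7))` of the addressed byte; that convention should be double-checked against the man page before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Index (0..7, counted from the least significant bit) of the single
 * bit set in `mask`. Hypothetical helper, for illustration only. */
static uint32_t mask_to_lsb_index(uint8_t mask)
{
    /* Exactly one bit must be set. */
    assert(mask != 0 && (mask & (mask - 1)) == 0);
    uint32_t n = 0;
    while ((mask >>= 1) != 0) {
        n++;
    }
    return n;
}

/* Bit number to pass to an API that counts bits from the most significant
 * bit of a byte (assumed here to be the OSAtomicTestAndClear convention). */
static uint32_t mask_to_msb_first_index(uint8_t mask)
{
    return 7u - mask_to_lsb_index(mask);
}
```

This only computes the index; it does not call any OSX API itself, which is why it stays portable.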
Now comes the funny question: since the probability of thread
concurrency here is very low (most vertices are used by only one bvhnode,
and the 'fragile' part of the code is *very* short, two bitflag
operations), do we want to care about absolute safety of computed normals?
The missing 'critical section' in the second loop never hurt us so far, it seems...
Need your thoughts here, before I lose more time on this hell. ;)