[Bf-blender-cvs] [af411d9] master: Cycles: Implement SSE-optimized path of util_max_axis()
Sergey Sharybin
noreply at git.blender.org
Tue Oct 25 15:34:27 CEST 2016
Commit: af411d918e68b487155309f5c1e29bb50924b69a
Author: Sergey Sharybin
Date: Tue Oct 25 13:54:17 2016 +0200
Branches: master
https://developer.blender.org/rBaf411d918e68b487155309f5c1e29bb50924b69a
Cycles: Implement SSE-optimized path of util_max_axis()
The idea here is to avoid if statements which could cause wrong
branch prediction.
Gives a bit of measurable speedup up to ~1%. Still nice :)
Inspired by Maxym Dmytrychenko, thanks!
===================================================================
M intern/cycles/util/util_math.h
===================================================================
diff --git a/intern/cycles/util/util_math.h b/intern/cycles/util/util_math.h
index b9594f7..57cad39 100644
--- a/intern/cycles/util/util_math.h
+++ b/intern/cycles/util/util_math.h
@@ -1629,6 +1629,14 @@ ccl_device_inline float2 map_to_sphere(const float3 co)
ccl_device_inline int util_max_axis(float3 vec)
{
+#ifdef __KERNEL_SSE__
+ __m128 a = shuffle<0,0,1,1>(vec.m128);
+ __m128 b = shuffle<1,2,2,1>(vec.m128);
+ __m128 c = _mm_cmpgt_ps(a, b);
+ int mask = _mm_movemask_ps(c) & 0x7;
+ static const char tab[8] = {2, 2, 2, 0, 1, 2, 1, 0};
+ return tab[mask];
+#else
if(vec.x > vec.y) {
if(vec.x > vec.z)
return 0;
@@ -1641,6 +1649,7 @@ ccl_device_inline int util_max_axis(float3 vec)
else
return 2;
}
+#endif
}
CCL_NAMESPACE_END
More information about the Bf-blender-cvs
mailing list