[Bf-blender-cvs] [cd6129d] master: Cycles: Workaround dead-slow expf() on 64bit linux
Sergey Sharybin
noreply at git.blender.org
Mon Oct 6 12:39:23 CEST 2014
Commit: cd6129d1ff6142c153a99917aa794b668e3b7dd2
Author: Sergey Sharybin
Date: Mon Oct 6 13:43:23 2014 +0600
Branches: master
https://developer.blender.org/rBcd6129d1ff6142c153a99917aa794b668e3b7dd2
Cycles: Workaround dead-slow expf() on 64bit linux
The single precision exponent function, expf(), on 64bit linux tends to be an
order of magnitude slower than the double precision version, even with the
single<->double precision conversion included.
Some feedback on the mailing lists suggests that logf() is slow as well, but
I have not confirmed this here in the studio yet.
Depending on the shader setup, this gives a ~3% speedup with the secret agent
shot and up to around 15% with the bmw scene here.
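To check the claim above on a particular machine, a small timing harness along
the following lines could be used. This is only a minimal sketch and not part of
the commit: the names expf_direct, expf_via_double, make_inputs and time_seconds
are hypothetical helpers invented for illustration, and the wrapper is written
as an inline function rather than the macro used in the patch. Results will
depend on the libm version, compiler flags and hardware.

```cpp
#include <chrono>
#include <cstddef>
#include <math.h>
#include <vector>

/* Hypothetical helper: plain single precision call, as used before the patch. */
inline float expf_direct(float x)
{
	return expf(x);
}

/* Hypothetical helper: same idea as the workaround in this commit, going
 * through the double precision exp() and converting the result back. */
inline float expf_via_double(float x)
{
	return (float)exp((double)x);
}

/* Hypothetical helper: n evenly spaced inputs in [-10, 10). */
inline std::vector<float> make_inputs(size_t n)
{
	std::vector<float> values(n);
	for(size_t i = 0; i < n; i++) {
		values[i] = -10.0f + 20.0f * (float)i / (float)n;
	}
	return values;
}

/* Hypothetical helper: apply func to every input and return the elapsed wall
 * time in seconds. The sum is written to *sink so the compiler can not drop
 * the loop as dead code. */
template<typename Func>
double time_seconds(Func func, const std::vector<float>& inputs, float *sink)
{
	std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
	float sum = 0.0f;
	for(size_t i = 0; i < inputs.size(); i++) {
		sum += func(inputs[i]);
	}
	std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
	*sink = sum;
	return std::chrono::duration<double>(end - start).count();
}
```

Calling time_seconds(expf_direct, inputs, &sink) and
time_seconds(expf_via_double, inputs, &sink) on the same input vector, with a
few million elements and optimizations enabled, gives two numbers to compare.
An inline function is used here only because it evaluates its argument once and
is easy to pass around; the patch itself uses a macro.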
===================================================================
M intern/cycles/device/device_cpu.cpp
M intern/cycles/kernel/kernel_compat_cpu.h
===================================================================
diff --git a/intern/cycles/device/device_cpu.cpp b/intern/cycles/device/device_cpu.cpp
index 4623764..c9b8a5b 100644
--- a/intern/cycles/device/device_cpu.cpp
+++ b/intern/cycles/device/device_cpu.cpp
@@ -17,6 +17,11 @@
#include <stdlib.h>
#include <string.h>
+/* So ImathMath is included before our kernel_cpu_compat. */
+#ifdef WITH_OSL
+# include <OSL/oslexec.h>
+#endif
+
#include "device.h"
#include "device_intern.h"
diff --git a/intern/cycles/kernel/kernel_compat_cpu.h b/intern/cycles/kernel/kernel_compat_cpu.h
index c2aab93..2553184 100644
--- a/intern/cycles/kernel/kernel_compat_cpu.h
+++ b/intern/cycles/kernel/kernel_compat_cpu.h
@@ -25,6 +25,13 @@
#include "util_half.h"
#include "util_types.h"
+/* On 64bit linux single precision exponent is really slow comparing to the
+ * double precision version, even with float<->double conversion involved.
+ */
+#if !defined(__KERNEL_GPU__) && defined(__linux__) && defined(__x86_64__)
+# define expf(x) ((float)exp((double)x))
+#endif
+
CCL_NAMESPACE_BEGIN
/* Assertions inside the kernel only work for the CPU device, so we wrap it in