[Bf-blender-cvs] [c4d4db39dc2] master: Functions: enable more gcc optimizations for multi-functions

Jacques Lucke noreply at git.blender.org
Sat Jan 7 20:23:33 CET 2023


Commit: c4d4db39dc2e38fb9db50eb89caf9759e79da7e4
Author: Jacques Lucke
Date:   Sat Jan 7 20:23:20 2023 +0100
Branches: master
https://developer.blender.org/rBc4d4db39dc2e38fb9db50eb89caf9759e79da7e4

Functions: enable more gcc optimizations for multi-functions

This mainly helps GCC catch up with Clang in terms of field evaluation
performance in some cases. In some cases this patch can speedup
field evaluation 2-3x (e.g. when there are many float math nodes).
See D16942 for a more detailed benchmark.

===================================================================

M	source/blender/functions/FN_multi_function_builder.hh

===================================================================

diff --git a/source/blender/functions/FN_multi_function_builder.hh b/source/blender/functions/FN_multi_function_builder.hh
index 997f52f4066..20b28658932 100644
--- a/source/blender/functions/FN_multi_function_builder.hh
+++ b/source/blender/functions/FN_multi_function_builder.hh
@@ -124,13 +124,19 @@ namespace detail {
  * instead of a `VArray<int>`).
  */
 template<typename MaskT, typename... Args, typename... ParamTags, size_t... I, typename ElementFn>
-void execute_array(TypeSequence<ParamTags...> /*param_tags*/,
-                   std::index_sequence<I...> /*indices*/,
-                   ElementFn element_fn,
-                   MaskT mask,
-                   /* Use restrict to tell the compiler that pointer inputs do not alias each
-                    * other. This is important for some compiler optimizations. */
-                   Args &&__restrict... args)
+/* Perform additional optimizations on this loop because it is a very hot loop. For example, the
+ * math node in geometry nodes is processed here.  */
+#if (defined(__GNUC__) && !defined(__clang__))
+[[gnu::optimize("-funroll-loops")]] [[gnu::optimize("O3")]]
+#endif
+void execute_array(
+    TypeSequence<ParamTags...> /*param_tags*/,
+    std::index_sequence<I...> /*indices*/,
+    ElementFn element_fn,
+    MaskT mask,
+    /* Use restrict to tell the compiler that pointer inputs do not alias each
+     * other. This is important for some compiler optimizations. */
+    Args &&__restrict... args)
 {
   for (const int64_t i : mask) {
     element_fn([&]() -> decltype(auto) {



More information about the Bf-blender-cvs mailing list