[Bf-blender-cvs] [c1a962b77e0] cycles_oneapi: Cycles: Add support for rendering on Intel GPUs using oneAPI.

Nikita Sirgienko noreply at git.blender.org
Thu Mar 31 14:59:13 CEST 2022


Commit: c1a962b77e0284ff21fbc80f06ad667ddce70e55
Author: Nikita Sirgienko
Date:   Thu Mar 31 14:47:31 2022 +0200
Branches: cycles_oneapi
https://developer.blender.org/rBc1a962b77e0284ff21fbc80f06ad667ddce70e55

Cycles: Add support for rendering on Intel GPUs using oneAPI.

This patch adds a new Cycles device with similar functionality to the existing GPU devices. Kernel compilation and runtime interaction happen via the Intel® oneAPI Base Toolkit.
The kernel itself and all calls to the SYCL API are encapsulated into a separate dynamic library, as this allows compiling the kernel with a different compiler (Intel® oneAPI DPC++/C++ Compiler) and using a different C++ ABI than the rest of Blender. It also allows Blender to launch safely on systems without oneAPI dependencies installed.
This implementation has been tested on Tiger Lake and Alder Lake integrated GPUs as well as Intel® Arc™ graphics pre-production silicon, which is our current focus. Compilation time for targets prior to Intel® Arc™ is known to be unexpectedly long (currently around an hour) and is being worked on.
A recent driver is needed at runtime: 101.1660 or newer on Windows (https://www.intel.com/content/www/us/en/download/19344/intel-graphics-windows-dch-drivers.html), xxxx on Linux (https://dgpu-docs.intel.com/installation-guides/index.html).
The necessary tools for compilation can be downloaded from https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html
In addition, Intel® oneAPI Level Zero is used for low-level device queries:
oneAPI Level Zero Loader v1.7.15 (oneapi-src/level-zero on github.com)

Being based on the open SYCL standard, this implementation could also be extended to run on other compatible stacks in the future.
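
The optional-library approach described above (Blender still launches when the oneAPI dependencies are absent) can be illustrated with a minimal sketch. This is not the code from this patch: the helper name `load_optional_library` and the library file names are hypothetical, and the sketch only shows the general pattern of probing for a dynamic library at runtime and disabling the feature if it cannot be loaded.

```python
import ctypes
from typing import Iterable, Optional


def load_optional_library(candidate_names: Iterable[str]) -> Optional[ctypes.CDLL]:
    """Return the first library that loads, or None if none can be loaded."""
    for name in candidate_names:
        try:
            return ctypes.CDLL(name)
        except OSError:
            # The library, or one of its own dependencies, is missing:
            # try the next candidate instead of failing.
            continue
    return None


# Hypothetical file names, for illustration only.
kernel_lib = load_optional_library(
    ["cycles_kernel_oneapi.dll", "libcycles_kernel_oneapi.so"]
)

# The device type is only offered when the library could actually be loaded.
has_oneapi = kernel_lib is not None
```

With this pattern, a missing runtime results in the device simply not being listed rather than in a startup failure.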

Maniphest Tasks: T96840

Differential Revision: https://developer.blender.org/D14480

===================================================================

M	CMakeLists.txt
A	build_files/cmake/Modules/FindLevelZero.cmake
A	build_files/cmake/Modules/FindSYCL.cmake
M	build_files/cmake/config/blender_release.cmake
M	build_files/cmake/platform/platform_win32.cmake
M	intern/cycles/CMakeLists.txt
M	intern/cycles/blender/addon/properties.py
M	intern/cycles/blender/addon/ui.py
M	intern/cycles/blender/device.cpp
M	intern/cycles/blender/python.cpp
M	intern/cycles/cmake/external_libs.cmake
M	intern/cycles/device/CMakeLists.txt
M	intern/cycles/device/device.cpp
M	intern/cycles/device/device.h
A	intern/cycles/device/oneapi/device.cpp
A	intern/cycles/device/oneapi/device.h
A	intern/cycles/device/oneapi/device_impl.cpp
A	intern/cycles/device/oneapi/device_impl.h
A	intern/cycles/device/oneapi/queue.cpp
A	intern/cycles/device/oneapi/queue.h
A	intern/cycles/device/oneapi/sycl.h
A	intern/cycles/device/oneapi/util.cpp
A	intern/cycles/device/oneapi/util.h
M	intern/cycles/integrator/path_trace.cpp
M	intern/cycles/kernel/CMakeLists.txt
M	intern/cycles/kernel/device/gpu/kernel.h
M	intern/cycles/kernel/device/gpu/parallel_active_index.h
A	intern/cycles/kernel/device/oneapi/CMakeLists.txt
A	intern/cycles/kernel/device/oneapi/compat.h
A	intern/cycles/kernel/device/oneapi/context_begin.h
A	intern/cycles/kernel/device/oneapi/context_end.h
A	intern/cycles/kernel/device/oneapi/device_id.h
A	intern/cycles/kernel/device/oneapi/dll_interface_template.h
A	intern/cycles/kernel/device/oneapi/globals.h
A	intern/cycles/kernel/device/oneapi/image.h
A	intern/cycles/kernel/device/oneapi/kernel.cpp
A	intern/cycles/kernel/device/oneapi/kernel.h
A	intern/cycles/kernel/device/oneapi/kernel_templates.h
A	intern/cycles/kernel/device/oneapi/vs2019_aot_config_file.props.cmake
A	intern/cycles/kernel/device/oneapi/vs2019_config_file.props.cmake
M	intern/cycles/kernel/types.h
M	intern/cycles/scene/scene.cpp
M	intern/cycles/util/atomic.h
M	intern/cycles/util/half.h
M	intern/cycles/util/math.h
M	intern/cycles/util/types_float2.h
M	intern/cycles/util/types_float2_impl.h
M	intern/cycles/util/types_float3.h
M	intern/cycles/util/types_float3_impl.h
M	intern/cycles/util/types_float4.h
M	intern/cycles/util/types_float4_impl.h
M	intern/cycles/util/types_float8.h
M	intern/cycles/util/types_float8_impl.h
M	intern/cycles/util/types_int2.h
M	intern/cycles/util/types_int2_impl.h
M	intern/cycles/util/types_int3.h
M	intern/cycles/util/types_int3_impl.h
M	intern/cycles/util/types_int4.h
M	intern/cycles/util/types_int4_impl.h
M	intern/cycles/util/types_uchar2.h
M	intern/cycles/util/types_uchar2_impl.h
M	intern/cycles/util/types_uchar3.h
M	intern/cycles/util/types_uchar3_impl.h
M	intern/cycles/util/types_uchar4.h
M	intern/cycles/util/types_uchar4_impl.h
M	intern/cycles/util/types_uint2.h
M	intern/cycles/util/types_uint2_impl.h
M	intern/cycles/util/types_uint3.h
M	intern/cycles/util/types_uint3_impl.h
M	intern/cycles/util/types_uint4.h
M	intern/cycles/util/types_uint4_impl.h
M	intern/cycles/util/types_ushort4.h
M	source/blender/gpu/opengl/gl_shader.cc
M	source/creator/CMakeLists.txt

===================================================================

diff --git a/CMakeLists.txt b/CMakeLists.txt
index ca457ab6b37..963e72339ab 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -456,6 +456,21 @@ if(APPLE)
   option(WITH_CYCLES_DEVICE_METAL       "Enable Cycles Apple Metal compute support" ON)
 endif()
 
+# oneAPI
+if(NOT APPLE)
+  option(WITH_CYCLES_DEVICE_ONEAPI "Enable Cycles oneAPI compute support" OFF)
+  option(WITH_CYCLES_ONEAPI_BINARIES "Enable Ahead-Of-Time compilation for Cycles oneAPI device" OFF)
+  option(WITH_CYCLES_ONEAPI_SYCL_HOST_ENABLED "Enable use of SYCL host (CPU) device execution by oneAPI implementation. This option is for debugging purposes and impacts GPU execution." OFF)
+
+  SET (CYCLES_ONEAPI_SYCL_TARGET GPUTarget CACHE STRING "oneAPI offload target to build binaries for")
+  # List of available options can be found here:
+  # https://www.intel.com/content/www/us/en/develop/documentation/oneapi-dpcpp-cpp-compiler-dev-guide-and-reference/top/compilation/ahead-of-time-compilation.html
+  # Right now the public Intel drivers don't support this option, because the corresponding HW hasn't been released yet
+  SET (CYCLES_ONEAPI_AOT_TARGETS "dg2" CACHE STRING "oneAPI GPU architectures to build binaries for")
+
+  mark_as_advanced(CYCLES_ONEAPI_SYCL_TARGET)
+endif()
+
 # Draw Manager
 option(WITH_DRAW_DEBUG "Add extra debug capabilities to Draw Manager" OFF)
 mark_as_advanced(WITH_DRAW_DEBUG)
diff --git a/build_files/cmake/Modules/FindLevelZero.cmake b/build_files/cmake/Modules/FindLevelZero.cmake
new file mode 100644
index 00000000000..a60d8ba9978
--- /dev/null
+++ b/build_files/cmake/Modules/FindLevelZero.cmake
@@ -0,0 +1,56 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2021-2022 Intel Corporation
+
+# - Find Level Zero library
+# Find Level Zero headers and libraries needed by oneAPI implementation
+# This module defines
+#  LEVEL_ZERO_LIBRARY, libraries to link against in order to use L0.
+#  LEVEL_ZERO_INCLUDE_DIR, directories where L0 headers can be found.
+#  LEVEL_ZERO_ROOT_DIR, The base directory to search for L0 files.
+#                 This can also be an environment variable.
+#  LEVEL_ZERO_FOUND, If false, then don't try to use L0.
+
+IF(NOT LEVEL_ZERO_ROOT_DIR AND NOT $ENV{LEVEL_ZERO_ROOT_DIR} STREQUAL "")
+  SET(LEVEL_ZERO_ROOT_DIR $ENV{LEVEL_ZERO_ROOT_DIR})
+ENDIF()
+
+SET(_level_zero_search_dirs
+  ${LEVEL_ZERO_ROOT_DIR}
+  /usr/lib
+  /usr/local/lib
+)
+
+FIND_LIBRARY(_LEVEL_ZERO_LIBRARY
+  NAMES
+    ze_loader
+  HINTS
+    ${_level_zero_search_dirs}
+  PATH_SUFFIXES
+    lib64 lib
+)
+
+FIND_PATH(_LEVEL_ZERO_INCLUDE_DIR
+  NAMES
+    level_zero/ze_api.h
+  HINTS
+    ${_level_zero_search_dirs}
+  PATH_SUFFIXES
+    include
+)
+
+INCLUDE(FindPackageHandleStandardArgs)
+
+FIND_PACKAGE_HANDLE_STANDARD_ARGS(LevelZero DEFAULT_MSG _LEVEL_ZERO_LIBRARY _LEVEL_ZERO_INCLUDE_DIR)
+
+IF(LevelZero_FOUND)
+  SET(LEVEL_ZERO_LIBRARY ${_LEVEL_ZERO_LIBRARY})
+  SET(LEVEL_ZERO_INCLUDE_DIR ${_LEVEL_ZERO_INCLUDE_DIR})
+  SET(LEVEL_ZERO_FOUND TRUE)
+ELSE()
+  SET(LEVEL_ZERO_FOUND FALSE)
+ENDIF()
+
+MARK_AS_ADVANCED(
+  LEVEL_ZERO_LIBRARY
+  LEVEL_ZERO_INCLUDE_DIR
+)
diff --git a/build_files/cmake/Modules/FindSYCL.cmake b/build_files/cmake/Modules/FindSYCL.cmake
new file mode 100644
index 00000000000..0b42da8cf4f
--- /dev/null
+++ b/build_files/cmake/Modules/FindSYCL.cmake
@@ -0,0 +1,75 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2021-2022 Intel Corporation
+
+# - Find SYCL library
+# Find the native SYCL header and libraries needed by oneAPI implementation
+# This module defines
+#  SYCL_DPCPP_COMPILER, compiler from oneAPI toolkit, which will be used for compilation of SYCL code
+#  SYCL_LIBRARY, libraries to link against in order to use SYCL.
+#  SYCL_INCLUDE_DIR, directories where SYCL headers can be found
+#  SYCL_ROOT_DIR, The base directory to search for SYCL files.
+#                 This can also be an environment variable.
+#  SYCL_FOUND, If false, then don't try to use SYCL.
+
+IF(NOT SYCL_ROOT_DIR AND NOT $ENV{SYCL_ROOT_DIR} STREQUAL "")
+  SET(SYCL_ROOT_DIR $ENV{SYCL_ROOT_DIR})
+ENDIF()
+
+SET(_sycl_search_dirs
+  ${SYCL_ROOT_DIR}
+  /usr/lib
+  /usr/local/lib
+  /opt/intel/oneapi/compiler/latest/linux/
+  C:/Program\ Files\ \(x86\)/Intel/oneAPI/compiler/latest/windows
+)
+
+FIND_PROGRAM(SYCL_DPCPP_COMPILER
+  NAMES
+    dpcpp
+  HINTS
+    ${_sycl_search_dirs}
+  PATH_SUFFIXES
+    bin
+)
+
+FIND_LIBRARY(_SYCL_LIBRARY
+  NAMES
+    sycl
+  HINTS
+    ${_sycl_search_dirs}
+  PATH_SUFFIXES
+    lib64 lib
+)
+
+FIND_PATH(_SYCL_INCLUDE_DIR
+  NAMES
+    CL/sycl.hpp
+  HINTS
+    ${_sycl_search_dirs}
+  PATH_SUFFIXES
+    include
+    include/sycl
+)
+
+INCLUDE(FindPackageHandleStandardArgs)
+
+FIND_PACKAGE_HANDLE_STANDARD_ARGS(SYCL DEFAULT_MSG _SYCL_LIBRARY _SYCL_INCLUDE_DIR)
+
+IF(SYCL_FOUND)
+  SET(SYCL_LIBRARY ${_SYCL_LIBRARY})
+
+  get_filename_component(_SYCL_INCLUDE_PARENT_DIR ${_SYCL_INCLUDE_DIR} DIRECTORY)
+
+  SET(SYCL_INCLUDE_DIR ${_SYCL_INCLUDE_DIR} ${_SYCL_INCLUDE_PARENT_DIR})
+ELSE()
+  SET(SYCL_FOUND FALSE)
+ENDIF()
+
+MARK_AS_ADVANCED(
+  SYCL_LIBRARY
+  SYCL_INCLUDE_DIR
+  SYCL_DPCPP_COMPILER
+  _SYCL_INCLUDE_DIR
+  _SYCL_INCLUDE_PARENT_DIR
+  _SYCL_LIBRARY
+)
diff --git a/build_files/cmake/config/blender_release.cmake b/build_files/cmake/config/blender_release.cmake
index 4e96975bd90..a2e2a24f000 100644
--- a/build_files/cmake/config/blender_release.cmake
+++ b/build_files/cmake/config/blender_release.cmake
@@ -85,4 +85,5 @@ if(NOT APPLE)
   set(WITH_CYCLES_CUDA_BINARIES   ON  CACHE BOOL "" FORCE)
   set(WITH_CYCLES_CUBIN_COMPILER  OFF CACHE BOOL "" FORCE)
   set(WITH_CYCLES_HIP_BINARIES    ON  CACHE BOOL "" FORCE)
+  set(WITH_CYCLES_ONEAPI_BINARIES OFF  CACHE BOOL "" FORCE)
 endif()
diff --git a/build_files/cmake/platform/platform_win32.cmake b/build_files/cmake/platform/platform_win32.cmake
index 8ae38e03fb1..8bb89f47235 100644
--- a/build_files/cmake/platform/platform_win32.cmake
+++ b/build_files/cmake/platform/platform_win32.cmake
@@ -877,3 +877,5 @@ endif()
 
 set(ZSTD_INCLUDE_DIRS ${LIBDIR}/zstd/include)
 set(ZSTD_LIBRARIES ${LIBDIR}/zstd/lib/zstd_static.lib)
+
+set(LEVEL_ZERO_ROOT_DIR ${LIBDIR}/level_zero)
\ No newline at end of file
diff --git a/intern/cycles/CMakeLists.txt b/intern/cycles/CMakeLists.txt
index 1cc3dccf426..044f86ffdfd 100644
--- a/intern/cycles/CMakeLists.txt
+++ b/intern/cycles/CMakeLists.txt
@@ -257,6 +257,10 @@ if(WITH_CYCLES_DEVICE_OPTIX)
   endif()
 endif()
 
+if (WITH_CYCLES_DEVICE_ONEAPI)
+  add_definitions(-DWITH_ONEAPI)
+endif()
+
 if(WITH_CYCLES_EMBREE)
   add_definitions(-DWITH_EMBREE)
   add_definitions(-DEMBREE_STATIC_LIB)
diff --git a/intern/cycles/blender/addon/properties.py b/intern/cycles/blender/addon/properties.py
index 24cc5735c96..4f88018354b 100644
--- a/intern/cycles/blender/addon/properties.py
+++ b/intern/cycles/blender/addon/properties.py
@@ -102,7 +102,8 @@ enum_device_type = (
     ('CUDA', "CUDA", "CUDA", 1),
     ('OPTIX', "OptiX", "OptiX", 3),
     ('HIP', "HIP", "HIP", 4),
-    ('METAL', "Metal", "Metal", 5)
+    ('METAL', "Metal", "Metal", 5),
+    ('ONEAPI', "oneAPI", "oneAPI", 6)
 )
 
 enum_texture_limit = (
@@ -1333,7 +1334,8 @@ class CyclesPreferences(bpy.types.AddonPreferences):
 
     def get_device_types(self, context):
         import _cycles
-        has_cuda, has_optix, has_hip, has_metal = _cycles.get_device_types()
+        has_cuda, has_optix, has_hip, has_metal, has_oneapi = _cycles.get_device_types()
+
         list = [('NONE', "None", "Don't use compute device", 0)]
         if has_cuda:
             list.append(('CUDA', "CUDA", "Use CUDA for GPU acceleration", 1))
@@ -1343,6 +1345,8 @@ class CyclesPreferences(bpy.types.AddonPreferences):
             list.append(('HIP', "HIP", "Use HIP for GPU acceleration", 4))
         if has_metal:
             list.append(('METAL', "Metal", "Use Metal for GPU acceleration", 5))
+        if has_oneapi:
+            list.append(('ONEAPI', "oneAPI", "Use oneAPI for GPU acceleration", 6))
 
         return list
 
@@ -1374,7 +1378,7 @@ class CyclesPreferences(bpy.types.AddonPreferences):
 
     def update_device_entries(self, device_list):
         for device in device_list:
-            if not device[1] in {'CUDA', 'OPTIX', 'CPU', 'HIP', 'METAL'}:
+            if not device[1] in {'CUDA', 'OPTIX', 'CPU', 'HIP', 'METAL', 'ONEAPI'}:
                 continue
             # Try to find existing Device entry
             entry = self.find_existing_device_entry(device)
@@ -1418,7 +1422,7 @@ class CyclesPreferences(bpy.types.AddonPreferences):
         import _cycles
         # Ensure `self.devices` is not re-allocated when the second call to
         # get_devices_for_type is made, freeing items from the first list.
-        for device_type in ('CUDA', 'OPTIX', 'HIP', 'METAL'):
+        for device_type in ('CUDA', 'OPTIX', 'HIP', 'METAL', 'ONEAPI'):
             self.update_device_entries(_cycles.available_devices(device_type))
 
     # Deprecated: use refresh_devices instead.
@@ -1483,6 +1487,13 @@ class CyclesPreferences(bpy.types.AddonPreferences):
                 col.label(text="Requires discrete AMD GPU with Vega architecture", icon='BLANK1')
                 if sys.platform[:3] == "win":
                     col.label(text="and AMD Radeon Pro 21.Q4 driver or newer", icon='BLANK1')
+            elif device_type == 'ONEAPI':
+                import sys
+                col.label(text="Requires Intel GPU with Xe architecture", icon='BLANK1')
+                if sys.platform.startswith("win"):
+                    col.label(text="and Windows driver version 101.1660 or newer", icon='BLANK1')
+                elif sys.platform.startswith("linux"):
+                    col.label(text="and Linux driver version xx.xx.20066 or newer", icon='BLANK1')
             elif device_type == 'METAL':
                 col.label(text="Requires Apple Silicon with macOS 12.2 or newer", icon='BLANK1')
                 col.label(text="or AMD with macOS 12.3 or newer", icon='BLANK1')
diff --git a/intern/cycles/blender/addon/ui.py b/intern/cycles/blender/addon/ui.py
index 1f50f3da7ae..1cabebc161c 100644
--- a/intern/cycles/blender/addon/ui.py
+++ b/intern/cycles/blender/addon/ui.py
@@ -106,6 +106,11 @@ def use_optix(context):
 


@@ Diff output truncated at 10240 characters. @@


