[Bf-blender-cvs] [a0f269f682d] master: Cycles: Kernel address space changes for MSL

Michael Jones noreply at git.blender.org
Thu Oct 14 17:23:00 CEST 2021


Commit: a0f269f682dab848afc80cd322d04a0c4a815cae
Author: Michael Jones
Date:   Thu Oct 14 13:53:40 2021 +0100
Branches: master
https://developer.blender.org/rBa0f269f682dab848afc80cd322d04a0c4a815cae

Cycles: Kernel address space changes for MSL

This is the first of a sequence of changes to support compiling Cycles kernels as MSL (Metal Shading Language) in preparation for a Metal GPU device implementation.

MSL requires that all pointer types be declared with explicit address space attributes (device, thread, etc.). There is already precedent for this in Cycles' address space macros (ccl_global, ccl_private, etc.), so the first step of MSL enablement is to apply these consistently. Line for line, this is the largest change required to enable MSL. Applying it first will simplify future patches, with the side benefit of making pointer declarations more descriptive.

The vast majority of deltas in this patch fall into one of two cases:

- Ensuring ccl_private is specified for thread-local pointer types
- Ensuring ccl_global is specified for device-wide pointer types
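
The two cases above can be sketched on a single function signature. This is a minimal illustration rather than code from the patch: the struct definitions are simplified stand-ins, intersect_sketch is a hypothetical name, and the macros are assumed to expand to nothing on a host C++ compiler.

```cpp
#include <cassert>

// Assumption: on a host C++ compiler the address space macros are empty.
#define ccl_global
#define ccl_private

// Simplified stand-ins for the real Cycles types (illustration only).
struct KernelGlobals {
  int num_objects;
};
struct Ray {
  float t;
};
struct Intersection {
  float t;
  int object;
};

// kg points at device-wide data, so it is marked ccl_global.
// ray and isect point at per-thread data, so they are marked ccl_private.
static bool intersect_sketch(ccl_global const KernelGlobals *kg,
                             ccl_private const Ray *ray,
                             ccl_private Intersection *isect)
{
  if (kg->num_objects == 0 || !(ray->t > 0.0f)) {
    return false;
  }
  isect->t = ray->t;
  isect->object = 0;
  return true;
}
```

The function body is a placeholder. The point is only which qualifier goes on which kind of pointer.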

Additionally, the ccl_addr_space qualifier can be removed. Prior to Cycles X, ccl_addr_space was used as a context-dependent address space qualifier, but now it is either redundant (e.g. in struct typedefs), or can be replaced by ccl_global in the case of pointer types. Associated function variants (e.g. lcg_step_float_addrspace) are also redundant.
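
As a rough sketch of why the _addrspace variants become redundant: once the state pointer carries one explicit qualifier, a single function is enough. The macro definitions and the function body below are illustrative assumptions for a host compiler, not the exact Cycles implementation.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: empty/host definitions so the sketch compiles as plain C++.
#define ccl_private
#define ccl_device_inline static inline

// One variant with an explicit address space on the state pointer.
// A separate lcg_step_float_addrspace(ccl_addr_space ...) copy is not needed.
ccl_device_inline float lcg_step_float(ccl_private uint32_t *rng)
{
  // Linear congruential step; unsigned arithmetic wraps modulo 2^32.
  *rng = 1103515245u * (*rng) + 12345u;
  // Scale the 32-bit state into the unit interval.
  return (float)*rng * (1.0f / 4294967296.0f);
}
```

Callers pass the address of a thread-local state variable, for example lcg_step_float(&state).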

In cases where address space qualifiers are chained with "const", this patch places the address space qualifier first. The rationale for this is that the choice of address space is likely to have the greater impact on runtime performance and overall architecture.
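
A small sketch of the ordering convention, using a stand-in Ray type, a hypothetical ray_t helper, and an empty host definition of ccl_private (all assumptions for illustration):

```cpp
#include <cassert>

// Assumption: empty on a host C++ compiler.
#define ccl_private

// Simplified stand-in for the Cycles Ray type.
struct Ray {
  float t;
};

// Ordering used by this patch: address space qualifier first, then const.
static float ray_t(ccl_private const Ray *ray)
{
  return ray->t;
}

// Ordering this patch avoids (shown as a comment only):
//   static float ray_t(const ccl_private Ray *ray);
```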

The final part of this patch is the addition of a metal/compat.h header. This is partially complete and will be extended in future patches, paving the way for the full Metal implementation.
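
The contents of metal/compat.h are not visible in the truncated diff below, so the following is only a guess at the general shape of such a mapping. It assumes the macros map to the MSL device and thread address spaces when compiled as Metal (detected here via __METAL_VERSION__) and to nothing elsewhere. The accumulate function is a hypothetical usage example.

```cpp
#include <cassert>

#if defined(__METAL_VERSION__)
// Assumed mapping when compiled as MSL.
#  define ccl_global device
#  define ccl_private thread
#else
// Host fallback so the sketch compiles as ordinary C++.
#  define ccl_global
#  define ccl_private
#endif

// Hypothetical helper: adds a thread-local value into a device-wide buffer.
static void accumulate(ccl_global float *buffer, int index, ccl_private const float *value)
{
  buffer[index] += *value;
}
```

With a mapping along these lines, the same kernel source can carry address space information for MSL while other backends are free to ignore it.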

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D12864

===================================================================

M	intern/cycles/kernel/bvh/bvh.h
M	intern/cycles/kernel/bvh/bvh_local.h
M	intern/cycles/kernel/bvh/bvh_nodes.h
M	intern/cycles/kernel/bvh/bvh_shadow_all.h
M	intern/cycles/kernel/bvh/bvh_traversal.h
M	intern/cycles/kernel/bvh/bvh_util.h
M	intern/cycles/kernel/bvh/bvh_volume.h
M	intern/cycles/kernel/bvh/bvh_volume_all.h
M	intern/cycles/kernel/closure/alloc.h
M	intern/cycles/kernel/closure/bsdf.h
M	intern/cycles/kernel/closure/bsdf_ashikhmin_shirley.h
M	intern/cycles/kernel/closure/bsdf_ashikhmin_velvet.h
M	intern/cycles/kernel/closure/bsdf_diffuse.h
M	intern/cycles/kernel/closure/bsdf_diffuse_ramp.h
M	intern/cycles/kernel/closure/bsdf_hair.h
M	intern/cycles/kernel/closure/bsdf_hair_principled.h
M	intern/cycles/kernel/closure/bsdf_microfacet.h
M	intern/cycles/kernel/closure/bsdf_microfacet_multi.h
M	intern/cycles/kernel/closure/bsdf_microfacet_multi_impl.h
M	intern/cycles/kernel/closure/bsdf_oren_nayar.h
M	intern/cycles/kernel/closure/bsdf_phong_ramp.h
M	intern/cycles/kernel/closure/bsdf_principled_diffuse.h
M	intern/cycles/kernel/closure/bsdf_principled_sheen.h
M	intern/cycles/kernel/closure/bsdf_reflection.h
M	intern/cycles/kernel/closure/bsdf_refraction.h
M	intern/cycles/kernel/closure/bsdf_toon.h
M	intern/cycles/kernel/closure/bsdf_transparent.h
M	intern/cycles/kernel/closure/bsdf_util.h
M	intern/cycles/kernel/closure/bssrdf.h
M	intern/cycles/kernel/closure/emissive.h
M	intern/cycles/kernel/closure/volume.h
M	intern/cycles/kernel/device/cpu/compat.h
M	intern/cycles/kernel/device/cuda/compat.h
M	intern/cycles/kernel/device/hip/compat.h
A	intern/cycles/kernel/device/metal/compat.h
M	intern/cycles/kernel/device/optix/compat.h
M	intern/cycles/kernel/geom/geom_attribute.h
M	intern/cycles/kernel/geom/geom_curve.h
M	intern/cycles/kernel/geom/geom_curve_intersect.h
M	intern/cycles/kernel/geom/geom_motion_curve.h
M	intern/cycles/kernel/geom/geom_motion_triangle.h
M	intern/cycles/kernel/geom/geom_motion_triangle_intersect.h
M	intern/cycles/kernel/geom/geom_motion_triangle_shader.h
M	intern/cycles/kernel/geom/geom_object.h
M	intern/cycles/kernel/geom/geom_patch.h
M	intern/cycles/kernel/geom/geom_primitive.h
M	intern/cycles/kernel/geom/geom_shader_data.h
M	intern/cycles/kernel/geom/geom_subd_triangle.h
M	intern/cycles/kernel/geom/geom_triangle.h
M	intern/cycles/kernel/geom/geom_triangle_intersect.h
M	intern/cycles/kernel/geom/geom_volume.h
M	intern/cycles/kernel/integrator/integrator_init_from_bake.h
M	intern/cycles/kernel/integrator/integrator_init_from_camera.h
M	intern/cycles/kernel/integrator/integrator_intersect_closest.h
M	intern/cycles/kernel/integrator/integrator_intersect_shadow.h
M	intern/cycles/kernel/integrator/integrator_intersect_volume_stack.h
M	intern/cycles/kernel/integrator/integrator_shade_background.h
M	intern/cycles/kernel/integrator/integrator_shade_light.h
M	intern/cycles/kernel/integrator/integrator_shade_shadow.h
M	intern/cycles/kernel/integrator/integrator_shade_surface.h
M	intern/cycles/kernel/integrator/integrator_shade_volume.h
M	intern/cycles/kernel/integrator/integrator_state.h
M	intern/cycles/kernel/integrator/integrator_state_util.h
M	intern/cycles/kernel/integrator/integrator_subsurface.h
M	intern/cycles/kernel/integrator/integrator_subsurface_disk.h
M	intern/cycles/kernel/integrator/integrator_subsurface_random_walk.h
M	intern/cycles/kernel/integrator/integrator_volume_stack.h
M	intern/cycles/kernel/kernel_accumulate.h
M	intern/cycles/kernel/kernel_adaptive_sampling.h
M	intern/cycles/kernel/kernel_bake.h
M	intern/cycles/kernel/kernel_camera.h
M	intern/cycles/kernel/kernel_color.h
M	intern/cycles/kernel/kernel_differential.h
M	intern/cycles/kernel/kernel_emission.h
M	intern/cycles/kernel/kernel_film.h
M	intern/cycles/kernel/kernel_id_passes.h
M	intern/cycles/kernel/kernel_jitter.h
M	intern/cycles/kernel/kernel_light.h
M	intern/cycles/kernel/kernel_light_background.h
M	intern/cycles/kernel/kernel_light_common.h
M	intern/cycles/kernel/kernel_lookup_table.h
M	intern/cycles/kernel/kernel_montecarlo.h
M	intern/cycles/kernel/kernel_passes.h
M	intern/cycles/kernel/kernel_path_state.h
M	intern/cycles/kernel/kernel_projection.h
M	intern/cycles/kernel/kernel_random.h
M	intern/cycles/kernel/kernel_shader.h
M	intern/cycles/kernel/kernel_types.h
M	intern/cycles/kernel/svm/svm.h
M	intern/cycles/kernel/svm/svm_ao.h
M	intern/cycles/kernel/svm/svm_aov.h
M	intern/cycles/kernel/svm/svm_attribute.h
M	intern/cycles/kernel/svm/svm_bevel.h
M	intern/cycles/kernel/svm/svm_blackbody.h
M	intern/cycles/kernel/svm/svm_brick.h
M	intern/cycles/kernel/svm/svm_brightness.h
M	intern/cycles/kernel/svm/svm_bump.h
M	intern/cycles/kernel/svm/svm_camera.h
M	intern/cycles/kernel/svm/svm_checker.h
M	intern/cycles/kernel/svm/svm_clamp.h
M	intern/cycles/kernel/svm/svm_closure.h
M	intern/cycles/kernel/svm/svm_convert.h
M	intern/cycles/kernel/svm/svm_displace.h
M	intern/cycles/kernel/svm/svm_fresnel.h
M	intern/cycles/kernel/svm/svm_gamma.h
M	intern/cycles/kernel/svm/svm_geometry.h
M	intern/cycles/kernel/svm/svm_gradient.h
M	intern/cycles/kernel/svm/svm_hsv.h
M	intern/cycles/kernel/svm/svm_ies.h
M	intern/cycles/kernel/svm/svm_image.h
M	intern/cycles/kernel/svm/svm_invert.h
M	intern/cycles/kernel/svm/svm_light_path.h
M	intern/cycles/kernel/svm/svm_magic.h
M	intern/cycles/kernel/svm/svm_map_range.h
M	intern/cycles/kernel/svm/svm_mapping.h
M	intern/cycles/kernel/svm/svm_math.h
M	intern/cycles/kernel/svm/svm_math_util.h
M	intern/cycles/kernel/svm/svm_mix.h
M	intern/cycles/kernel/svm/svm_musgrave.h
M	intern/cycles/kernel/svm/svm_noisetex.h
M	intern/cycles/kernel/svm/svm_normal.h
M	intern/cycles/kernel/svm/svm_ramp.h
M	intern/cycles/kernel/svm/svm_sepcomb_hsv.h
M	intern/cycles/kernel/svm/svm_sepcomb_vector.h
M	intern/cycles/kernel/svm/svm_sky.h
M	intern/cycles/kernel/svm/svm_tex_coord.h
M	intern/cycles/kernel/svm/svm_value.h
M	intern/cycles/kernel/svm/svm_vector_rotate.h
M	intern/cycles/kernel/svm/svm_vector_transform.h
M	intern/cycles/kernel/svm/svm_vertex_color.h
M	intern/cycles/kernel/svm/svm_voronoi.h
M	intern/cycles/kernel/svm/svm_voxel.h
M	intern/cycles/kernel/svm/svm_wave.h
M	intern/cycles/kernel/svm/svm_wavelength.h
M	intern/cycles/kernel/svm/svm_white_noise.h
M	intern/cycles/kernel/svm/svm_wireframe.h
M	intern/cycles/util/util_color.h
M	intern/cycles/util/util_half.h
M	intern/cycles/util/util_math.h
M	intern/cycles/util/util_math_fast.h
M	intern/cycles/util/util_math_float2.h
M	intern/cycles/util/util_math_float3.h
M	intern/cycles/util/util_math_float4.h
M	intern/cycles/util/util_math_intersect.h
M	intern/cycles/util/util_math_matrix.h
M	intern/cycles/util/util_projection.h
M	intern/cycles/util/util_rect.h
M	intern/cycles/util/util_transform.h

===================================================================

diff --git a/intern/cycles/kernel/bvh/bvh.h b/intern/cycles/kernel/bvh/bvh.h
index 0b44cc5db34..8f6dcd0adb9 100644
--- a/intern/cycles/kernel/bvh/bvh.h
+++ b/intern/cycles/kernel/bvh/bvh.h
@@ -139,7 +139,7 @@ CCL_NAMESPACE_BEGIN
 
 #endif /* __KERNEL_OPTIX__ */
 
-ccl_device_inline bool scene_intersect_valid(const Ray *ray)
+ccl_device_inline bool scene_intersect_valid(ccl_private const Ray *ray)
 {
   /* NOTE: Due to some vectorization code  non-finite origin point might
    * cause lots of false-positive intersections which will overflow traversal
@@ -154,10 +154,10 @@ ccl_device_inline bool scene_intersect_valid(const Ray *ray)
   return isfinite_safe(ray->P.x) && isfinite_safe(ray->D.x) && len_squared(ray->D) != 0.0f;
 }
 
-ccl_device_intersect bool scene_intersect(const KernelGlobals *kg,
-                                          const Ray *ray,
+ccl_device_intersect bool scene_intersect(ccl_global const KernelGlobals *kg,
+                                          ccl_private const Ray *ray,
                                           const uint visibility,
-                                          Intersection *isect)
+                                          ccl_private Intersection *isect)
 {
 #ifdef __KERNEL_OPTIX__
   uint p0 = 0;
@@ -248,11 +248,11 @@ ccl_device_intersect bool scene_intersect(const KernelGlobals *kg,
 }
 
 #ifdef __BVH_LOCAL__
-ccl_device_intersect bool scene_intersect_local(const KernelGlobals *kg,
-                                                const Ray *ray,
-                                                LocalIntersection *local_isect,
+ccl_device_intersect bool scene_intersect_local(ccl_global const KernelGlobals *kg,
+                                                ccl_private const Ray *ray,
+                                                ccl_private LocalIntersection *local_isect,
                                                 int local_object,
-                                                uint *lcg_state,
+                                                ccl_private uint *lcg_state,
                                                 int max_hits)
 {
 #  ifdef __KERNEL_OPTIX__
@@ -360,12 +360,12 @@ ccl_device_intersect bool scene_intersect_local(const KernelGlobals *kg,
 #endif
 
 #ifdef __SHADOW_RECORD_ALL__
-ccl_device_intersect bool scene_intersect_shadow_all(const KernelGlobals *kg,
-                                                     const Ray *ray,
-                                                     Intersection *isect,
+ccl_device_intersect bool scene_intersect_shadow_all(ccl_global const KernelGlobals *kg,
+                                                     ccl_private const Ray *ray,
+                                                     ccl_private Intersection *isect,
                                                      uint visibility,
                                                      uint max_hits,
-                                                     uint *num_hits)
+                                                     ccl_private uint *num_hits)
 {
 #  ifdef __KERNEL_OPTIX__
   uint p0 = ((uint64_t)isect) & 0xFFFFFFFF;
@@ -445,9 +445,9 @@ ccl_device_intersect bool scene_intersect_shadow_all(const KernelGlobals *kg,
 #endif /* __SHADOW_RECORD_ALL__ */
 
 #ifdef __VOLUME__
-ccl_device_intersect bool scene_intersect_volume(const KernelGlobals *kg,
-                                                 const Ray *ray,
-                                                 Intersection *isect,
+ccl_device_intersect bool scene_intersect_volume(ccl_global const KernelGlobals *kg,
+                                                 ccl_private const Ray *ray,
+                                                 ccl_private Intersection *isect,
                                                  const uint visibility)
 {
 #  ifdef __KERNEL_OPTIX__
@@ -507,9 +507,9 @@ ccl_device_intersect bool scene_intersect_volume(const KernelGlobals *kg,
 #endif /* __VOLUME__ */
 
 #ifdef __VOLUME_RECORD_ALL__
-ccl_device_intersect uint scene_intersect_volume_all(const KernelGlobals *kg,
-                                                     const Ray *ray,
-                                                     Intersection *isect,
+ccl_device_intersect uint scene_intersect_volume_all(ccl_global const KernelGlobals *kg,
+                                                     ccl_private const Ray *ray,
+                                                     ccl_private Intersection *isect,
                                                      const uint max_hits,
                                                      const uint visibility)
 {
diff --git a/intern/cycles/kernel/bvh/bvh_local.h b/intern/cycles/kernel/bvh/bvh_local.h
index 90b9f410b29..78ad4a34da9 100644
--- a/intern/cycles/kernel/bvh/bvh_local.h
+++ b/intern/cycles/kernel/bvh/bvh_local.h
@@ -36,11 +36,11 @@ ccl_device
 #else
 ccl_device_inline
 #endif
-    bool BVH_FUNCTION_FULL_NAME(BVH)(const KernelGlobals *kg,
-                                     const Ray *ray,
-                                     LocalIntersection *local_isect,
+    bool BVH_FUNCTION_FULL_NAME(BVH)(ccl_global const KernelGlobals *kg,
+                                     ccl_private const Ray *ray,
+                                     ccl_private LocalIntersection *local_isect,
                                      int local_object,
-                                     uint *lcg_state,
+                                     ccl_private uint *lcg_state,
                                      int max_hits)
 {
   /* todo:
@@ -196,11 +196,11 @@ ccl_device_inline
   return false;
 }
 
-ccl_device_inline bool BVH_FUNCTION_NAME(const KernelGlobals *kg,
-                                         const Ray *ray,
-                                         LocalIntersection *local_isect,
+ccl_device_inline bool BVH_FUNCTION_NAME(ccl_global const KernelGlobals *kg,
+                                         ccl_private const Ray *ray,
+                                         ccl_private LocalIntersection *local_isect,
                                          int local_object,
-                                         uint *lcg_state,
+                                         ccl_private uint *lcg_state,
                                          int max_hits)
 {
   return BVH_FUNCTION_FULL_NAME(BVH)(kg, ray, local_isect, local_object, lcg_state, max_hits);
diff --git a/intern/cycles/kernel/bvh/bvh_nodes.h b/intern/cycles/kernel/bvh/bvh_nodes.h
index 15cd0f22213..49b37f39671 100644
--- a/intern/cycles/kernel/bvh/bvh_nodes.h
+++ b/intern/cycles/kernel/bvh/bvh_nodes.h
@@ -16,7 +16,7 @@
 
 // TODO(sergey): Look into avoid use of full Transform and use 3x3 matrix and
 // 3-vector which might be faster.
-ccl_device_forceinline Transform bvh_unaligned_node_fetch_space(const KernelGlobals *kg,
+ccl_device_forceinline Transform bvh_unaligned_node_fetch_space(ccl_global const KernelGlobals *kg,
                                                                 int node_addr,
                                                                 int child)
 {
@@ -28,7 +28,7 @@ ccl_device_forceinline Transform bvh_unaligned_node_fetch_space(const KernelGlob
   return space;
 }
 
-ccl_device_forceinline int bvh_aligned_node_intersect(const KernelGlobals *kg,
+ccl_device_forceinline int bvh_aligned_node_intersect(ccl_global const KernelGlobals *kg,
                                                       const float3 P,
                                                       const float3 idir,
                                                       const float t,
@@ -76,7 +76,7 @@ ccl_device_forceinline int bvh_aligned_node_intersect(const KernelGlobals *kg,
 #endif
 }
 
-ccl_device_forceinline bool bvh_unaligned_node_intersect_child(const KernelGlobals *kg,
+ccl_device_forceinline bool bvh_unaligned_node_intersect_child(ccl_global const KernelGlobals *kg,
                                                                const float3 P,
                                                                const float3 dir,
                                                                const float t,
@@ -102,7 +102,7 @@ ccl_device_forceinline bool bvh_unaligned_node_intersect_child(const KernelGloba
   return tnear <= tfar;
 }
 
-ccl_device_forceinline int bvh_unaligned_node_intersect(const KernelGlobals *kg,
+ccl_device_forceinline int bvh_unaligned_node_intersect(ccl_global const KernelGlobals *kg,
                                                         const float3 P,
                                                         const float3 dir,
                                                         const float3 idir,
@@ -134,7 +134,7 @@ ccl_device_forceinline int bvh_unaligned_node_intersect(const KernelGlobals *kg,
   return mask;
 }
 
-ccl_device_forceinline int bvh_node_intersect(const KernelGlobals *kg,
+ccl_device_forceinline int bvh_node_intersect(ccl_global const KernelGlobals *kg,
                                               const float3 P,
                                               const float3 dir,
                                               const float3 idir,
diff --git a/intern/cycles/kernel/bvh/bvh_shadow_all.h b/intern/cycles/kernel/bvh/bvh_shadow_all.h
index 82c7c1a8a6c..c67c820edbc 100644
--- a/intern/cycles/kernel/bvh/bvh_shadow_all.h
+++ b/intern/cycles/kernel/bvh/bvh_shadow_all.h
@@ -36,12 +36,12 @@ ccl_device
 #else
 ccl_device_inline
 #endif
-    bool BVH_FUNCTION_FULL_NAME(BVH)(const KernelGlobals *kg,
-                                     const Ray *ray,
-                                     Intersection *isect_array,
+    bool BVH_FUNCTION_FULL_NAME(BVH)(ccl_global const KernelGlobals *kg,
+                                     ccl_private const Ray *ray,
+                                     ccl_private Intersection *isect_array,
                                      const uint visibility,
                                      const uint max_hits,
-                                     uint *num_hits)
+                                     ccl_private uint *num_hits)
 {
   /* todo:
    * - likely and unlikely for if() statements
@@ -71,7 +71,7 @@ ccl_device_inline
   float t_world_to_instance = 1.0f;

@@ Diff output truncated at 10240 characters. @@


