[Bf-blender-cvs] [caee36fcc12] cycles-x: Cycles X: Bring back tiles support

Sergey Sharybin noreply at git.blender.org
Tue Sep 7 11:09:48 CEST 2021


Commit: caee36fcc12704dad1ef820023474272a3b1ffca
Author: Sergey Sharybin
Date:   Fri Aug 13 15:02:47 2021 +0200
Branches: cycles-x
https://developer.blender.org/rBcaee36fcc12704dad1ef820023474272a3b1ffca

Cycles X: Bring back tiles support

The meaning of tiles has shifted from being a performance tweak to
a memory saving technique (avoid having full-frame buffers stored
in memory during rendering).
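
To make the memory argument concrete, here is a minimal sketch of how a
frame can be covered by fixed-size tiles and how the largest tile buffer
compares with a full-frame buffer. It is not the TileManager code from
this commit: the function names, the 8192x4320 frame, and the 16 float
channels per pixel are all assumptions chosen for illustration. Only the
2048 tile size comes from the default of the new `tile_size` property.

```python
def split_into_tiles(width, height, tile_size):
    """Return (x, y, w, h) rectangles that cover a width x height frame.

    Hypothetical helper, not part of Cycles. Tiles on the right and
    bottom edges are clipped to the frame.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    tiles = []
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            tiles.append((x, y,
                          min(tile_size, width - x),
                          min(tile_size, height - y)))
    return tiles


def buffer_bytes(width, height, num_channels, bytes_per_channel=4):
    """Size of a dense float buffer; the channel count is an assumption."""
    return width * height * num_channels * bytes_per_channel


# Assumed frame size and pass layout, for illustration only.
WIDTH, HEIGHT, CHANNELS = 8192, 4320, 16

tiles = split_into_tiles(WIDTH, HEIGHT, 2048)
full_frame = buffer_bytes(WIDTH, HEIGHT, CHANNELS)
largest_tile = max(buffer_bytes(w, h, CHANNELS) for _, _, w, h in tiles)

print(len(tiles), "tiles")
print("full frame: %.1f MiB" % (full_frame / 2**20))
print("largest tile: %.1f MiB" % (largest_tile / 2**20))
```

Under these assumptions only one tile-sized buffer needs to stay resident
while sampling, which is where the saving would come from.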

This is an initial implementation which brings some crucial building
blocks, such as:

- User interface.

  Tiling defaults to "Use Auto Tile". In the current version the
  "auto" part is not implemented, but the idea is to only use tiles
  when needed.

- Internal support for tile manager, render scheduler, path tracer.
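
The user-facing settings can also be set from a script. The sketch below
assumes the `use_auto_tile` and `tile_size` properties exactly as added
to `CyclesRenderSettings` in this commit (reached through `scene.cycles`,
as the new panel does). The `configure_tiles` helper and the stand-in
scene object are hypothetical and exist only so the snippet runs outside
Blender.

```python
from types import SimpleNamespace

# Limits copied from the IntProperty added in properties.py.
TILE_SIZE_MIN = 0
TILE_SIZE_MAX = 16384


def configure_tiles(scene, use_auto_tile=True, tile_size=2048):
    """Set the tiling options on scene.cycles (hypothetical helper).

    The tile size is clamped to the range declared by the property.
    """
    cscene = scene.cycles
    cscene.use_auto_tile = use_auto_tile
    cscene.tile_size = max(TILE_SIZE_MIN, min(TILE_SIZE_MAX, tile_size))
    return cscene


# Stand-in for a Blender scene so the example is self-contained.
fake_scene = SimpleNamespace(
    cycles=SimpleNamespace(use_auto_tile=False, tile_size=0))

configure_tiles(fake_scene, tile_size=100000)
print(fake_scene.cycles.use_auto_tile, fake_scene.cycles.tile_size)
```

Inside a build of this branch the same helper would presumably be called
as `configure_tiles(bpy.context.scene)`. Note that, per the change to
`blender_sync.cpp`, the settings are only read for background (final)
renders.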

The short-term plan is to replace Save Buffers with the new implementation.
In the slightly longer term it will also be used for resumable render.

Known limitations:

- Cancelling a render without adaptive sampling or a sample count pass
  replaces missing tiles with black. This is because the stored buffer
  is all 0-ed, and a zero alpha channel means a fully opaque pixel in
  the render buffers.

  It will be solved by storing per-tile meta-information about the
  number of samples (which is also required for resumable render).

- Denoising happens for both the tile and the final result. During
  rendering it is possible to see seams on tile borders.

  It will be solved by introducing the idea of tile overscan.

- Tiles are not highlighted.

  This requires changes in the highlight code on the Blender side. It
  will be worked on separately.

- Peak memory usage is not ideal. Need to somehow free up session
  memory before reading full-frame file. It will be worked on as a
  follow-up development.

  The render result drawing should be done via GPUDisplay, and the pass
  memory in RenderResult is to be lazily allocated. There are separate
  patches for that under review.

Limitations:

- Changing the display pass during rendering does not change the
  displayed pass of the finished tiles.

- Memory peak is still higher than with the Save Buffers option.

Those limitations are temporary and will be worked on next.

Differential Revision: https://developer.blender.org/D12309

===================================================================

M	intern/cycles/blender/addon/properties.py
M	intern/cycles/blender/addon/ui.py
M	intern/cycles/blender/blender_session.cpp
M	intern/cycles/blender/blender_sync.cpp
M	intern/cycles/integrator/path_trace.cpp
M	intern/cycles/integrator/path_trace.h
M	intern/cycles/integrator/render_scheduler.cpp
M	intern/cycles/integrator/render_scheduler.h
M	intern/cycles/render/film.cpp
M	intern/cycles/render/film.h
M	intern/cycles/render/session.cpp
M	intern/cycles/render/session.h
M	intern/cycles/render/tile.cpp
M	intern/cycles/render/tile.h
M	intern/cycles/util/util_system.cpp
M	intern/cycles/util/util_system.h

===================================================================

diff --git a/intern/cycles/blender/addon/properties.py b/intern/cycles/blender/addon/properties.py
index 9d9182a66cc..a72b0565756 100644
--- a/intern/cycles/blender/addon/properties.py
+++ b/intern/cycles/blender/addon/properties.py
@@ -736,6 +736,18 @@ class CyclesRenderSettings(bpy.types.PropertyGroup):
         min=0, max=1024,
     )
 
+    use_auto_tile: BoolProperty(
+        name="Auto Tiles",
+        description="Automatically split image into tiles",
+        default=True,
+    )
+    tile_size: IntProperty(
+        name="Tile Size",
+        default=2048,
+        description="",
+        min=0, max=16384,
+    )
+
     # Various fine-tuning debug flags
 
     def _devices_update_callback(self, context):
diff --git a/intern/cycles/blender/addon/ui.py b/intern/cycles/blender/addon/ui.py
index 5f0f62a1263..f1a16e96084 100644
--- a/intern/cycles/blender/addon/ui.py
+++ b/intern/cycles/blender/addon/ui.py
@@ -611,6 +611,25 @@ class CYCLES_RENDER_PT_performance_threads(CyclesButtonsPanel, Panel):
         sub.prop(rd, "threads")
 
 
+class CYCLES_RENDER_PT_performance_tiles(CyclesButtonsPanel, Panel):
+    bl_label = "Tiles"
+    bl_parent_id = "CYCLES_RENDER_PT_performance"
+
+    def draw(self, context):
+        layout = self.layout
+        layout.use_property_split = True
+        layout.use_property_decorate = False
+
+        scene = context.scene
+        cscene = scene.cycles
+
+        col = layout.column()
+        col.prop(cscene, "use_auto_tile")
+        sub = col.column()
+        sub.active = cscene.use_auto_tile
+        sub.prop(cscene, "tile_size")
+
+
 class CYCLES_RENDER_PT_performance_acceleration_structure(CyclesButtonsPanel, Panel):
     bl_label = "Acceleration Structure"
     bl_parent_id = "CYCLES_RENDER_PT_performance"
@@ -2086,6 +2105,7 @@ classes = (
     CYCLES_RENDER_PT_film_transparency,
     CYCLES_RENDER_PT_performance,
     CYCLES_RENDER_PT_performance_threads,
+    CYCLES_RENDER_PT_performance_tiles,
     CYCLES_RENDER_PT_performance_acceleration_structure,
     CYCLES_RENDER_PT_performance_final_render,
     CYCLES_RENDER_PT_performance_viewport,
diff --git a/intern/cycles/blender/blender_session.cpp b/intern/cycles/blender/blender_session.cpp
index 8490f4952ae..66932534482 100644
--- a/intern/cycles/blender/blender_session.cpp
+++ b/intern/cycles/blender/blender_session.cpp
@@ -164,10 +164,14 @@ void BlenderSession::create_session()
     session->set_gpu_display(make_unique<BlenderGPUDisplay>(b_engine, b_scene));
   }
 
-  /* TODO(sergey): Decide on what is to be communicated to the engine here. There is no tiled
-   * rendering for from visual point of view when render buffer fits big tile. But for huge
-   * render resolutions it might still be helpful to see which big tile is being sampled. */
-  /* b_engine.use_highlight_tiles(session_params.progressive_refine == false); */
+  /* Viewport and preview (as in, material preview) does not do tiled rendering, so can inform
+   * engine that no tracking of the tiles state is needed.
+   * The offline rendering will make a decision when tile is being written. The penalty of asking
+   * the engine to keep track of tiles state is minimal, so there is nothing to worry about here
+   * about possible single-tiled final render. */
+  if (!b_engine.is_preview() && !b_v3d) {
+    b_engine.use_highlight_tiles(true);
+  }
 
   update_resumable_tile_manager(session_params.samples);
 }
@@ -254,13 +258,6 @@ void BlenderSession::reset_session(BL::BlendData &b_data, BL::Depsgraph &b_depsg
       b_null_space_view3d, b_null_region_view3d, scene->camera, width, height);
   session->reset(buffer_params, session_params.samples);
 
-  /* TODO(sergey): Decice on what is to be communicated to the engine here. There is no tiled
-   * rendering for from visual point of view when render buffer fits big tile. But for huge
-   * render resolutions it might still be helpful to see which big tile is being sampled. */
-  /* TODO(sergey): If some logic is needed here, de-duplicate it with the constructor using some
-   * sort of utility function. */
-  /* b_engine.use_highlight_tiles(session_params.progressive_refine == false); */
-
   /* reset time */
   start_resize_time = 0.0;
 
@@ -312,8 +309,6 @@ void BlenderSession::read_render_tile()
   for (BL::RenderPass &b_pass : b_rlay.passes) {
     session->set_render_tile_pixels(b_pass.name(), b_pass.channels(), (float *)b_pass.rect());
   }
-
-  b_engine.end_result(b_rr, false, false, false);
 }
 
 void BlenderSession::write_render_tile()
@@ -529,7 +524,7 @@ void BlenderSession::render(BL::Depsgraph &b_depsgraph_)
   stamp_view_layer_metadata(scene, b_rlay_name);
 
   /* free result without merging */
-  b_engine.end_result(b_rr, true, true, false);
+  b_engine.end_result(b_rr, true, false, false);
 
   double total_time, render_time;
   session->progress.get_time(total_time, render_time);
diff --git a/intern/cycles/blender/blender_sync.cpp b/intern/cycles/blender/blender_sync.cpp
index a9d41f4b7d0..373e4237e3d 100644
--- a/intern/cycles/blender/blender_sync.cpp
+++ b/intern/cycles/blender/blender_sync.cpp
@@ -853,6 +853,14 @@ SessionParams BlenderSync::get_session_params(BL::RenderEngine &b_engine,
   params.use_profiling = params.device.has_profiling && !b_engine.is_preview() && background &&
                          BlenderSession::print_render_stats;
 
+  if (background) {
+    params.use_auto_tile = RNA_boolean_get(&cscene, "use_auto_tile");
+    params.tile_size = get_int(cscene, "tile_size");
+  }
+  else {
+    params.use_auto_tile = false;
+  }
+
   return params;
 }
 
diff --git a/intern/cycles/integrator/path_trace.cpp b/intern/cycles/integrator/path_trace.cpp
index f994190a3d5..5805fc045b7 100644
--- a/intern/cycles/integrator/path_trace.cpp
+++ b/intern/cycles/integrator/path_trace.cpp
@@ -23,6 +23,7 @@
 #include "render/gpu_display.h"
 #include "render/pass.h"
 #include "render/scene.h"
+#include "render/tile.h"
 #include "util/util_algorithm.h"
 #include "util/util_logging.h"
 #include "util/util_progress.h"
@@ -31,36 +32,25 @@
 
 CCL_NAMESPACE_BEGIN
 
-namespace {
+PathTrace::PathTrace(Device *device,
+                     Film *film,
+                     DeviceScene *device_scene,
+                     RenderScheduler &render_scheduler,
+                     TileManager &tile_manager)
+    : device_(device),
+      device_scene_(device_scene),
+      render_scheduler_(render_scheduler),
+      tile_manager_(tile_manager)
+{
+  DCHECK_NE(device_, nullptr);
 
-class TempCPURenderBuffers {
- public:
-  /* `device_template` is used to access stats and profiler. */
-  explicit TempCPURenderBuffers(Device *device_template)
   {
     vector<DeviceInfo> cpu_devices;
     device_cpu_info(cpu_devices);
 
-    device.reset(
-        device_cpu_create(cpu_devices[0], device_template->stats, device_template->profiler));
-
-    buffers = make_unique<RenderBuffers>(device.get());
+    cpu_device_.reset(device_cpu_create(cpu_devices[0], device->stats, device->profiler));
   }
 
-  unique_ptr<Device> device;
-  unique_ptr<RenderBuffers> buffers;
-};
-
-}  // namespace
-
-PathTrace::PathTrace(Device *device,
-                     Film *film,
-                     DeviceScene *device_scene,
-                     RenderScheduler &render_scheduler)
-    : device_(device), device_scene_(device_scene), render_scheduler_(render_scheduler)
-{
-  DCHECK_NE(device_, nullptr);
-
   /* Create path tracing work in advance, so that it can be reused by incremental sampling as much
    * as possible. */
   device_->foreach_device([&](Device *path_trace_device) {
@@ -129,8 +119,10 @@ void PathTrace::reset(const BufferParams &full_params, const BufferParams &big_t
   }
 
   render_state_.has_denoised_result = false;
+  render_state_.tile_written = false;
 
   did_draw_after_reset_ = false;
+  full_frame_buffers_ = nullptr;
 }
 
 void PathTrace::set_progress(Progress *progress)
@@ -196,13 +188,12 @@ void PathTrace::render_pipeline(RenderWork render_work)
     return;
   }
 
+  write_tile_buffer(render_work);
   update_display(render_work);
 
   progress_update_if_needed();
 
-  if (render_work.write_final_result) {
-    buffer_write();
-  }
+  process_full_buffer_from_disk(render_work);
 }
 
 void PathTrace::render_init_kernel_execution()
@@ -321,7 +312,7 @@ void PathTrace::init_render_buffers(const RenderWork &render_work)
       path_trace_work->zero_render_buffers();
     });
 
-    buffer_read();
+    tile_buffer_read();
   }
 }
 
@@ -448,7 +439,7 @@ void PathTrace::cryptomatte_postprocess(const RenderWork &render_work)
 
 void PathTrace::denoise(const RenderWork &render_work)
 {
-  if (!render_work.denoise) {
+  if (!render_work.tile.denoise) {
     return;
   }
 
@@ -527,11 +518,11 @@ void PathTrace::update_display(const RenderWork &render_work)
     /* TODO(sergey): Ideally the offline buffers update will be done using same API than the
      * viewport GPU display. Seems to be a matter of moving pixels update API to a more abstract
      * class and using it here instead of `GPUDisplay`. */
-    if (buffer_update_cb) {
+    if (tile_buffer_update_cb) {
       VLOG(3) << "Invoke buffer update callback.";
 
       const double start_time = time_dt();
-      buffer_update_cb();
+      tile_buffer_update_cb();
       render_scheduler_.report_display_update_time(render_work, time_dt() - start_time);
     }
     else {
@@ -615,19 +606,101 @@ void PathTrace::rebalance(const RenderWork &render_work)
     return;
   }
 
-  TempCPURenderBuffers big_tile_cpu_buffers(device_);
-  big_tile_cpu_buffers.buffers->reset(render_state_.effective_big_tile_params);
+  RenderBuffers big_tile_cpu_buffers(cpu_device_.get());
+  big_tile_cpu_buffers.reset(render_state_.effective_big_tile_params);
 
-  copy_to_render_buffers(big_tile_cpu_buffers.buffers.get());
+  copy_to_render_buffers(&big_tile_cpu_buffers);
 
   render_state_.need_reset_params = true;
   update_work_buffer_params_if_needed(render_work);
 
-  copy_from_render_buffers(big_tile_cpu_buffers.buffers.get());
+  copy_from_render_buffers(&big_tile_cpu_buffers);
 
   render_scheduler_.report_rebalance_time(render_work, time_dt() - start_time, true);
 }
 
+void PathTrace::write_tile_buffer(const RenderWork 

@@ Diff output truncated at 10240 characters. @@


