<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    That sounds promising. Feel free to submit a patch for this and we

    can take a look. :) <br>

    <br>

    <div class="moz-cite-prefix">On 17.05.2016 at 16:40, Stefan

      Werner wrote:<br>

    </div>

    <blockquote

      cite="mid:77CBFD48-0301-4674-9815-84254B987742@smithmicro.com"

      type="cite">

      <meta http-equiv="Content-Type" content="text/html;

        charset=windows-1252">


      <style><!--

/* Font Definitions */

@font-face

        {font-family:"Cambria Math";

        panose-1:2 4 5 3 5 4 6 3 2 4;}

@font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}

/* Style Definitions */

p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0cm;

        margin-bottom:.0001pt;

        font-size:12.0pt;

        font-family:"Times New Roman";}

a:link, span.MsoHyperlink

        {mso-style-priority:99;

        color:blue;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {mso-style-priority:99;

        color:purple;

        text-decoration:underline;}

span.EmailStyle17

        {mso-style-type:personal-reply;

        font-family:Calibri;

        color:windowtext;}

span.msoIns

        {mso-style-type:export-only;

        mso-style-name:"";

        text-decoration:underline;

        color:teal;}

.MsoChpDefault

        {mso-style-type:export-only;

        font-size:10.0pt;}

@page WordSection1

        {size:595.0pt 842.0pt;

        margin:72.0pt 72.0pt 72.0pt 72.0pt;}

div.WordSection1

        {page:WordSection1;}

--></style>

      <div class="WordSection1">

        <p class="MsoNormal"><span

            style="font-size:11.0pt;font-family:Calibri">The patch is

            surprisingly clean. It removes some of the #ifdef

            __SPLIT_KERNEL__ blocks and unifies CPU, OpenCL and CUDA a

            bit more. I didn’t run a speed benchmark, and I wouldn’t

            even make speed the top priority: right now, the

            problem we see in the field is that people are unable to use

            high-end gaming GPUs because the VRAM is so full of geometry

            and textures that the CUDA runtime doesn’t have room for

            kernel memory anymore. On my 1664-core M4000 card, I see a

            simple kernel launch already taking ~1600 MB of VRAM with

            almost empty scenes.<o:p></o:p></span></p>

        <p class="MsoNormal"><span

            style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>

        <p class="MsoNormal"><span

            style="font-size:11.0pt;font-family:Calibri">It looks to me

            like the CUDA compiler reserves room for every stack

            instance of ShaderData (or other structs) in advance, and

            that sharing that memory instead of instantiating it

            separately is an easy way to reduce VRAM requirements

            without changing the code much.<o:p></o:p></span></p>
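        <p class="MsoNormal"><span
            style="font-size:11.0pt;font-family:Calibri">A rough
            back-of-the-envelope sketch of why the per-thread stack frame
            matters. It assumes (this is a guess, not something confirmed
            in this thread) that the CUDA runtime reserves roughly stack
            bytes per thread times maximum resident threads per SM times
            number of SMs. The 128 cores per SM and 2048 resident threads
            per SM are Maxwell figures supplied here as assumptions, the
            stack frame sizes are the ptxas numbers from the original
            message quoted below, and the function name is made up for
            illustration.</span></p>
        <pre wrap="">
```python
# Rough estimate of the local (stack) memory a CUDA context may reserve.
# ASSUMPTION: reservation is approximately
#   stack bytes per thread * max resident threads per SM * number of SMs.
# This model is a guess for illustration, not documented runtime behaviour.

MIB = 1024 * 1024


def estimated_stack_reserve(stack_bytes_per_thread, sm_count, max_threads_per_sm):
    """Return the estimated reservation in bytes under the assumption above."""
    return stack_bytes_per_thread * sm_count * max_threads_per_sm


# Hardware figures. Only the core count comes from the thread; the rest
# are assumed Maxwell values.
CUDA_CORES = 1664            # "1664 core M4000" from the message
CORES_PER_SM = 128           # assumed: Maxwell has 128 cores per SM
MAX_THREADS_PER_SM = 2048    # assumed: Maxwell resident-thread limit per SM
SM_COUNT = CUDA_CORES // CORES_PER_SM

# Stack frame sizes reported by ptxas for kernel_cuda_branched_path_trace.
STACK_BEFORE = 68416
STACK_AFTER = 58976

before = estimated_stack_reserve(STACK_BEFORE, SM_COUNT, MAX_THREADS_PER_SM)
after = estimated_stack_reserve(STACK_AFTER, SM_COUNT, MAX_THREADS_PER_SM)

print(f"SMs:    {SM_COUNT}")
print(f"before: {before / MIB:.0f} MiB")
print(f"after:  {after / MIB:.0f} MiB")
print(f"saved:  {(before - after) / MIB:.0f} MiB")
```
</pre>
        <p class="MsoNormal"><span
            style="font-size:11.0pt;font-family:Calibri">Under these
            assumptions the estimate comes to about 1737 MiB before and
            1497 MiB after, roughly 240 MiB less, which is in the same
            range as the VRAM usage observed above. Treat it as an
            order-of-magnitude check, not a measurement.</span></p>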

        <p class="MsoNormal"><span

            style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>

        <p class="MsoNormal"><span

            style="font-size:11.0pt;font-family:Calibri">-Stefan<o:p></o:p></span></p>

        <p class="MsoNormal"><span

            style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>

        <div style="border:none;border-top:solid #B5C4DF

          1.0pt;padding:3.0pt 0cm 0cm 0cm">

          <p class="MsoNormal"><b><span

                style="font-family:Calibri;color:black">From: </span>

            </b><span style="font-family:Calibri;color:black"><a class="moz-txt-link-rfc2396E" href="mailto:bf-cycles-bounces@blender.org">&lt;bf-cycles-bounces@blender.org&gt;</a>

              on behalf of Sergey Sharybin <a class="moz-txt-link-rfc2396E" href="mailto:sergey.vfx@gmail.com">&lt;sergey.vfx@gmail.com&gt;</a><br>

              <b>Reply-To: </b>Discussion list to assist Cycles render

              engine developers <a class="moz-txt-link-rfc2396E" href="mailto:bf-cycles@blender.org">&lt;bf-cycles@blender.org&gt;</a><br>

              <b>Date: </b>Tuesday, May 17, 2016 at 9:20 AM<br>

              <b>To: </b>Discussion list to assist Cycles render engine

              developers <a class="moz-txt-link-rfc2396E" href="mailto:bf-cycles@blender.org">&lt;bf-cycles@blender.org&gt;</a><br>

              <b>Subject: </b>Re: [Bf-cycles] split kernel and CUDA<o:p></o:p></span></p>

        </div>

        <div>

          <p class="MsoNormal"><o:p> </o:p></p>

        </div>

        <blockquote style="border:none;border-left:solid #B5C4DF

          4.5pt;padding:0cm 0cm 0cm

          4.0pt;margin-left:3.75pt;margin-right:0cm"

          id="MAC_OUTLOOK_ATTRIBUTION_BLOCKQUOTE">

          <div>

            <div>

              <div>

                <div>

                  <p class="MsoNormal">hi,<o:p></o:p></p>

                </div>

                <div>

                  <p class="MsoNormal"><o:p> </o:p></p>

                </div>

                <p class="MsoNormal">Lukas Stocker was doing experiments

                  with a CUDA split kernel. With the current design of the

                  split, it actually took more VRAM, AFAIR.

                  Hopefully he'll read this mail and reply in more

                  detail.

                  <o:p></o:p></p>

                <div>

                  <p class="MsoNormal"><o:p> </o:p></p>

                </div>

                <div>

                  <p class="MsoNormal">It would be cool to see this front

                    moving forward, but I fear we'll have to step back

                    and reconsider some things about how the split kernel

                    works together with the regular one.<o:p></o:p></p>

                </div>

                <div>

                  <p class="MsoNormal"><o:p> </o:p></p>

                </div>

                <div>

                  <p class="MsoNormal">Those are interesting results on

                    the stack memory! I can see the number of spill loads

                    going up, though. Did you measure whether it causes a

                    noticeable render time slowdown? And how messy is the

                    patch, I wonder :)<o:p></o:p></p>

                </div>

              </div>

              <div>

                <p class="MsoNormal"><o:p> </o:p></p>

                <div>

                  <p class="MsoNormal">On Tue, May 17, 2016 at 8:47 AM,

                    Stefan Werner &lt;<a moz-do-not-send="true"

                      href="mailto:swerner@smithmicro.com"

                      target="_blank">swerner@smithmicro.com</a>&gt;

                    wrote:<o:p></o:p></p>

                  <blockquote style="border:none;border-left:solid

                    #CCCCCC 1.0pt;padding:0cm 0cm 0cm

                    6.0pt;margin-left:4.8pt;margin-right:0cm">

                    <p class="MsoNormal">Hi,<br>

                      <br>

                      Has anyone experimented with building a split

                      kernel for CUDA? It seems to me that this could

                      lift some of the limitations on Nvidia hardware,

                      such as the high memory requirements on cards with

                      many CUDA cores or the driver timeout. I just

                      tried out what happens when I take the shared

                      ShaderData (KernelGlobals.sd_input) from the split

                      kernel into the CUDA kernel, as opposed to

                      creating separate ShaderData structs on the stack,

                      and it looks like it has an impact:<br>

                      <br>

                      before:<br>

                      ptxas info    : Compiling entry function

                      'kernel_cuda_branched_path_trace' for 'sm_50'<br>

                      ptxas info    : Function properties for

                      kernel_cuda_branched_path_trace<br>

                          68416 bytes stack frame, 1188 bytes spill

                      stores, 3532 bytes spill loads<br>

                      <br>

                      after:<br>

                      ptxas info    : Compiling entry function

                      'kernel_cuda_branched_path_trace' for 'sm_50'<br>

                      ptxas info    : Function properties for

                      kernel_cuda_branched_path_trace<br>

                          58976 bytes stack frame, 1256 bytes spill

                      stores, 3676 bytes spill loads<br>

                      <br>

                      -Stefan<br>

                      <br>

                      _______________________________________________<br>

                      Bf-cycles mailing list<br>

                      <a moz-do-not-send="true"

                        href="mailto:Bf-cycles@blender.org">Bf-cycles@blender.org</a><br>

                      <a moz-do-not-send="true"

                        href="https://lists.blender.org/mailman/listinfo/bf-cycles"

                        target="_blank">https://lists.blender.org/mailman/listinfo/bf-cycles</a><o:p></o:p></p>

                  </blockquote>

                </div>

                <p class="MsoNormal"><br>

                  <br clear="all">

                  <o:p></o:p></p>

                <div>

                  <p class="MsoNormal"><o:p> </o:p></p>

                </div>

                <p class="MsoNormal">-- <o:p></o:p></p>

                <div>

                  <div>

                    <p class="MsoNormal"><span style="color:#666666">With

                        best regards, Sergey Sharybin</span><o:p></o:p></p>

                  </div>

                </div>

              </div>

            </div>

          </div>

        </blockquote>

      </div>


    </blockquote>

    <br>

  </body>

</html>