[Bf-cycles] osx + ati + cycles, looking for insights in debugging

Dalai Felinto dfelinto at gmail.com
Sat Dec 10 09:39:24 CET 2011


Hi Brecht,

Your latest patch [1] fixes the problem only partially. It no longer breaks
at 1280x1280 and lower, but at 2600x2600 it starts to crash. Funny fact:
with 2550x2550 or 2570x2570 I get a black bar on top [2].

Jens said that in Sig (lux and small lux) they are splitting the tasks as
well, so it's a good direction to go.
That still doesn't solve the problem of shaders not building, which Jens
reported to me. I wonder if it's already time to address that.
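
For reference, splitting one large render into smaller tasks could look
roughly like the sketch below. It assumes the split is done by rows with a
fixed maximum chunk height; RowChunk and split_rows are hypothetical names,
and the patch in [1] may well do it differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one sub-task: a band of image rows. */
typedef struct {
    int y;  /* first row of the band */
    int h;  /* number of rows in the band */
} RowChunk;

/* Split `height` rows into bands of at most `max_rows` rows each.
 * Returns the number of bands written to `out`, or -1 on bad input
 * or when `capacity` is too small. */
static int split_rows(int height, int max_rows, RowChunk *out, int capacity)
{
    int n = 0;
    if (height < 0 || max_rows <= 0 || out == NULL)
        return -1;
    for (int y = 0; y < height; y += max_rows) {
        if (n == capacity)
            return -1;
        out[n].y = y;
        out[n].h = (height - y < max_rows) ? (height - y) : max_rows;
        n++;
    }
    return n;
}

/* Sanity check: the bands must cover every row exactly once. */
static void check_cover(const RowChunk *chunks, int count, int height)
{
    int next = 0;
    for (int i = 0; i < count; i++) {
        assert(chunks[i].y == next);
        next += chunks[i].h;
    }
    assert(next == height);
}
```

Each band would then be enqueued as its own kernel launch, so no single
launch covers more rows than the size known to work.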

Cheers,
Dalai


[1] - http://www.pasteall.org/27291/diff
[2] - http://www.pasteall.org/pic/show.php?id=22390 as opposed to
http://www.pasteall.org/pic/show.php?id=22391

2011/12/8 Dalai Felinto <dfelinto at gmail.com>

> Hi,
> It seems that the viewport problem is related to ShaderData or
> ShaderClosure alignment or other problems.
> The problem is pretty much narrowed down to the following:
>
> (1) in kernel_path.h: lines 275 and 277 both cause the break unless they
> are commented out:
> - shader_setup_from_ray(kg, &sd, &isect, &ray);
> - shader_eval_surface(kg, &sd, rbsdf, state.flag);
>
> (2) in kernel_shader.h: shader_eval_surface() consists of only 2 lines of
> code (for opencl). Both lines (486 and 487) also cause the break unless
> they are commented out:
> - bsdf_diffuse_setup(sd, &sd->closure);
> - sd->closure.weight = make_float3(0.8f, 0.8f, 0.8f);
>
> Notes:
> --------
> 1) by "break" I mean it breaks the viewport or gives a pitch-black render
> 2) resolutions used for testing (Width x Height @ Tiles): 820x772 @ 820
> (breaks), 820x772 @ 818 (works)
> 3) by "both lines need to be commented" I mean that if either one is left
> uncommented it breaks.
> 4) I get the same results with or without the kernel_compat_cl.h patch for
> #define float3 float4
> 5) my test file - http://www.pasteall.org/blend/10191
> 6) example of "working" image when commenting the above mentioned lines:
> http://www.pasteall.org/pic/show.php?id=22245
>
> Best regards,
> Dalai
>
> -- getting close ;)
>
> 2011/12/7 Jens Verwiebe <info at jensverwiebe.de>
>
>> Hi
>> As a hint, I found these 2 write-ups:
>>
>> So about that float padding[2]; in the struct definition. This is
>> because of memory alignment in OpenCL. The best explanation I’ve seen so
>> far is by Andreas Stahl
>> <http://forums.amd.com/forum/messageview.cfm?catid=390&threadid=122209&forumid=9>,
>> which I will briefly summarize in relation to the struct above.
>> When interpreting a struct, OpenCL accesses the memory in blocks of 16
>> bytes, which is the same as 4 floats (each 4 bytes). So in our example, if
>> we did not have the padding, we would not be able to access our int because
>> OpenCL would have interpreted it as the 3rd float out of the first 16 bytes.
>> This can get even more complicated if you have an array of structs, because
>> then the size of your struct will need to be a multiple of 16, as
>> explained in the linked forum post.
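
The struct that the quoted text refers to is not included in this thread, so
the following is only a sketch of the idea using a made-up struct. PaddedItem,
its members and round_up_16 are hypothetical names; the point is just that
explicit padding brings the struct size to a multiple of 16 bytes, so every
element of an array of such structs starts on a 16-byte block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, not the one from the quoted article. */
typedef struct {
    float value;       /* bytes 0..3 */
    int   id;          /* bytes 4..7 */
    float padding[2];  /* bytes 8..15: explicit padding up to 16 bytes */
} PaddedItem;

/* Round a size in bytes up to the next multiple of 16. */
static size_t round_up_16(size_t size)
{
    return (size + 15u) & ~(size_t)15u;
}

/* Element i of an array starts at i * sizeof(PaddedItem); with a 16-byte
 * struct size that offset is always a multiple of 16. */
static size_t element_offset(size_t index)
{
    size_t offset = index * sizeof(PaddedItem);
    assert(offset % 16 == 0);
    return offset;
}
```

Without the padding member the struct would be 8 bytes, and every second
array element would begin in the middle of a 16-byte block.
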
>> The orig AMD posting:
>>
>>
>> http://forums.amd.com/forum/messageview.cfm?catid=390&threadid=122209&forumid=9
>> Conclusion: There are indeed places where the compiler does not take care
>> of alignment (bug?), so we must do it ourselves.
>>
>> Jens
>>
>>
>>
>>
>>
>>
>> On 07.12.2011 at 16:38, Brecht Van Lommel wrote:
>>
>> Hi,
>>
>> Maybe it would be possible to set up an SSH login on a computer so I
>> can try and narrow this down further? There has to be a pattern here,
>> but I'm not sure what to suggest you test. What I would do is
>> keep removing code until it works, and find out exactly which kind of
>> struct variable layout or operation is causing issues.
>>
>> On Wed, Dec 7, 2011 at 7:11 AM, Dalai Felinto <dfelinto at gmail.com> wrote:
>>
>> "Good news", I believe both problems may indeed be related, and caused by
>> alignment. At least the first one.
>>
>> If I change the alignment in kernel_type.h in simple ways [*] I get the Z
>> problem fixed, but the viewport broken in situations where it was working
>> before.
>>
>> If I go to camera fly mode the viewport is fine ('dirty' but fine) until I
>> stop, then I get the (2) problem. So what are the rules for alignment here?
>>
>> [*] http://www.pasteall.org/27124
>>
>>
>> The alignment issue I was thinking of is when passing structs between
>> the C++ and OpenCL. If they lay out variables in structs differently,
>> copying KernelData to the GPU would go wrong. So what I've tried to do
>> is align everything like we do in makesdna, and make all struct sizes
>> a multiple of 16 bytes.
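
One way to enforce the "multiple of 16 bytes" rule on the host side is a
compile-time check. This is a sketch assuming a C11 compiler;
SharedDataExample, CHECK_SHARED_STRUCT and padding_needed are hypothetical
names and are not taken from the Cycles sources.

```c
#include <assert.h>
#include <stddef.h>

#define DEVICE_STRUCT_ALIGN 16

/* Fails the build if a struct shared with the device is not a multiple
 * of 16 bytes in size. */
#define CHECK_SHARED_STRUCT(type) \
    _Static_assert(sizeof(type) % DEVICE_STRUCT_ALIGN == 0, \
                   #type " size must be a multiple of 16 bytes")

/* Hypothetical stand-in for a struct copied from C++ to OpenCL. */
typedef struct {
    float matrix[12];  /* 48 bytes */
    int   width;       /*  4 bytes */
    int   height;      /*  4 bytes */
    int   pad[2];      /*  8 bytes of explicit padding: 56 -> 64 */
} SharedDataExample;

CHECK_SHARED_STRUCT(SharedDataExample);

/* Bytes of padding needed to bring `size` up to a multiple of 16. */
static size_t padding_needed(size_t size)
{
    return (DEVICE_STRUCT_ALIGN - size % DEVICE_STRUCT_ALIGN)
           % DEVICE_STRUCT_ALIGN;
}
```

Removing the pad member makes the struct 56 bytes and the build fails at the
CHECK_SHARED_STRUCT line, instead of the mismatch showing up later as a
corrupted copy on the device.
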
>>
>> What is quite strange is that the struct where you added alignment
>> does not get passed between C++ and OpenCL, it stays purely in OpenCL,
>> and so I wouldn't expect this to be a problem. Maybe there's a
>> compiler bug related to nested structs?
>>
>> From the specification:
>>
>>
>> "The OpenCL compiler is responsible for aligning data items to the
>> appropriate alignment as required by the data type. For arguments to a
>> __kernel function declared to be a pointer to a data type, the OpenCL
>> compiler can assume that the pointee is always appropriately aligned
>> as required by the data type."
>>
>> So this means that in principle we shouldn't have to worry about
>> alignment of structs that stay on the GPU, and only for KernelData we
>> need to worry about alignment. But of course compiler bugs can make
>> this an issue.
>>
>> "For 3-component vector data types, the size of the data type is 4 *
>> sizeof(component). This means that a 3-component vector data type will
>> be aligned to a 4 * sizeof(component) boundary."
>>
>> This seems to indicate that differential3 should in principle be
>> aligned already. We can add padding in various structs, but according
>> to the spec adding float2 to differential3 shouldn't actually improve
>> alignment, so I'm not sure what the rule would be.
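
The size rule quoted from the spec can be imitated on the host to see what
layout it implies. This is a sketch assuming a C11 compiler; float3_like and
differential3_like are hypothetical host-side stand-ins, not the real OpenCL
or Cycles types.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side imitation of an OpenCL 3-component float vector:
 * three components, but 16-byte size and alignment. */
typedef struct {
    _Alignas(16) float x;
    float y;
    float z;
} float3_like;

/* Hypothetical mirror of a struct holding two such vectors. */
typedef struct {
    float3_like dx;
    float3_like dy;
} differential3_like;

/* True when a byte count is a whole number of 16-byte blocks. */
static int is_16_byte_multiple(size_t n)
{
    return n % 16 == 0;
}

/* Both members already start on 16-byte boundaries without extra padding. */
static void check_layout(void)
{
    assert(is_16_byte_multiple(offsetof(differential3_like, dx)));
    assert(is_16_byte_multiple(offsetof(differential3_like, dy)));
    assert(is_16_byte_multiple(sizeof(differential3_like)));
}
```

Under that rule the two-vector struct is 32 bytes with dy at offset 16, which
is why extra padding should not be needed if the compiler follows the spec.
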
>>
>> Regarding (2) viewport problem:
>>
>> ----------------------------------------------
>>
>> Does F12 rendering work (with resolutions that fail in the viewport) or is
>> it just viewport rendering? ...
>>
>>  http://www.pasteall.org/27116/diff
>>
>> The dumped image is as bad as the viewport one. A note: the image is not
>> 'broken' right away. While the light calculation is still dirty, the image
>> is correct. Only after a few moments (set_tile() resolution > 8 or so) the
>> viewport breaks.
>>
>>
>> Ok, so it's not an opengl issue.
>>
>> Another thing you could test is the workgroup_size in device_opencl.cpp in
>> path_trace and tonemap.
>>
>> workgroup_size is calculated as 16. If I change it to 1, 2, 4, 8, 16 it
>> produces the same result.
>> If I change it to 32 OpenCL throws an error at me (OpenCL error (-54):
>> Invalid work group size)
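
Error -54 is CL_INVALID_WORK_GROUP_SIZE. In OpenCL 1.x it is returned, among
other cases, when the local size exceeds what the device or kernel allows, or
when the global size is not evenly divisible by the local size. The sketch
below shows the usual host-side arithmetic for one dimension; round_up_global
and work_size_ok are hypothetical helpers, and max_local stands for the limit
that would normally be queried from the OpenCL runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Round a global work size up to the next multiple of the local size. */
static size_t round_up_global(size_t global, size_t local)
{
    assert(local > 0);
    return ((global + local - 1) / local) * local;
}

/* Returns 1 when the pair is acceptable under the two rules above. */
static int work_size_ok(size_t global, size_t local, size_t max_local)
{
    if (local == 0 || local > max_local)
        return 0;  /* local size outside the allowed range */
    if (global % local != 0)
        return 0;  /* global size not divisible by local size */
    return 1;
}
```

For example, a width of 820 with a local size of 16 would be rounded up to
832, and the kernel then has to skip the extra work items beyond 820.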
>>
>>
>> Ok, so workgroup size is probably not the issue.
>>
>> Brecht.
>> _______________________________________________
>> Bf-cycles mailing list
>> Bf-cycles at blender.org
>> http://lists.blender.org/mailman/listinfo/bf-cycles
>>
>>
>>  _____________________________________
>>
>> Jens Verwiebe
>> Allerskehre 44  -  22309 Hamburg
>>
>> Tel.: +49 40 68 78 50
>> mobil: +49 172 400 49 07
>> mailto: info at jensverwiebe.de
>> web:  http://www.jensverwiebe.de
>> _____________________________________
>>
>>
>>
>>
>