[Bf-cycles] osx + ati + cycles, looking for insights in debugging

Brecht Van Lommel brechtvanlommel at pandora.be
Sat Dec 10 15:28:16 CET 2011


Hi,

I set it to split into a fixed number of parts for now, but I will add code
so that it's based on doing blocks of max N pixels instead. That
should make it work. I couldn't find anything in the OpenCL spec
about global range limitations for execution, but it's a simple enough
fix.
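
To illustrate the block-based splitting: a minimal sketch, assuming the
image is cut into strips of whole rows and each strip is launched as its
own kernel range. RowBlock, split_into_blocks() and the pixel limit are
hypothetical names for illustration, not the code in device_opencl.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper type: a horizontal strip of rows that would be
// enqueued as one kernel launch.
struct RowBlock {
    int y;       // first row of the strip
    int height;  // number of rows in the strip
};

// Split a width x height image into strips of whole rows so that each
// strip covers at most max_pixels pixels. A strip always has at least
// one row, so a single row wider than max_pixels still exceeds the limit.
std::vector<RowBlock> split_into_blocks(int width, int height, int max_pixels)
{
    assert(max_pixels > 0);

    std::vector<RowBlock> blocks;
    if (width <= 0 || height <= 0)
        return blocks;

    const int rows_per_block = std::max(1, max_pixels / width);
    for (int y = 0; y < height; y += rows_per_block) {
        const int h = std::min(rows_per_block, height - y);
        blocks.push_back({y, h});
    }
    return blocks;
}
```

For a 2600x2600 render with an assumed limit of 1280*1280 pixels this
gives five strips: four of 630 rows and a final one of 80 rows.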

What exactly is the problem with shaders not building? Is it that only
the ambient occlusion render works? In kernel_types.h, under /* Kernel
Features */, you can see which features are enabled. In device_opencl.cpp,
more are enabled for NVidia drivers. For Apple drivers, more features can
be enabled too once they are verified to work.

Thanks,
Brecht.

On Sat, Dec 10, 2011 at 9:39 AM, Dalai Felinto <dfelinto at gmail.com> wrote:
> Hi Brecht,
>
> Your latest patch [1] fixes the problem only partially. Now it doesn't break
> at 1280x1280 and lower, but at 2600x2600 it starts to crash. Funny fact:
> with 2550x2550 or 2570x2570 I get a black bar on top [2].
>
> Jens said that in Sig (lux and small lux) they are splitting the tasks as
> well. So it's a good direction to go.
> That still doesn't solve the problem of shaders not building as Jens
> reported to me. I wonder if it's already time to address that.
>
> Cheers,
> Dalai
>
>
> [1] - http://www.pasteall.org/27291/diff
> [2] - http://www.pasteall.org/pic/show.php?id=22390 as opposed
> to http://www.pasteall.org/pic/show.php?id=22391
>
>
> 2011/12/8 Dalai Felinto <dfelinto at gmail.com>
>>
>> Hi,
>> It seems that the viewport problem is related to ShaderData or
>> ShaderClosure alignment or other problems.
>> The problem is pretty much narrowed down to the following:
>>
>> (1) in kernel_path.h : lines 275 and 277 both break it unless they are
>> commented out of the code:
>> - shader_setup_from_ray(kg, &sd, &isect, &ray);
>> - shader_eval_surface(kg, &sd, rbsdf, state.flag);
>>
>> (2) in kernel_shader.h : shader_eval_surface() consists of only 2 lines of
>> code (for opencl). Both lines (486 and 487) also break it unless they are
>> commented out:
>> - bsdf_diffuse_setup(sd, &sd->closure);
>> - sd->closure.weight = make_float3(0.8f, 0.8f, 0.8f);
>>
>> Notes:
>> --------
>> 1) by break I mean it "breaks" the viewport or gives a pitch black render
>> 2) resolutions used for testing (Width x Height @ Tiles): 820x772 @ 820
>> (breaks), 820x772 @ 818 (works)
>> 3) by "both lines need to be commented" I mean, if either one is left
>> uncommented it breaks.
>> 4) I get the same results with or without the kernel_compat_cl.h patch for
>> #define float3 float4
>> 5) my test file - http://www.pasteall.org/blend/10191
>> 6) example of "working" image when commenting the above mentioned lines:
>> http://www.pasteall.org/pic/show.php?id=22245
>>
>> Best regards,
>> Dalai
>>
>> -- getting close ;)
>>
>> 2011/12/7 Jens Verwiebe <info at jensverwiebe.de>
>>>
>>> Hi
>>> As a hint, I found these 2 write-ups:
>>>
>>> So about that float padding[2]; in the struct definition. This is because
>>> of memory alignment in OpenCL. The best explanation I’ve seen so far is
>>> by Andreas Stahl, which I will briefly summarize in relation to the struct
>>> above.
>>> When interpreting a struct, OpenCL accesses the memory in blocks of 16
>>> bytes, which is the same as 4 floats (each 4 bytes). So in our example, if we
>>> did not have the padding, we would not be able to access our int because
>>> OpenCL would have interpreted it as the 3rd float out of the first 16 bytes.
>>> This can get even more complicated if you have an array of structs, because
>>> then the size of your struct will need to be a multiple of 16, as
>>> explained in the linked forum post.
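
The padding rule summarized above can be sketched on the host side. This
is a minimal illustration, assuming 4-byte float and int; PaddedItem and
is_multiple_of_16() are hypothetical names, and the struct is neither the
one from the quoted article nor one from kernel_types.h.

```cpp
#include <cstddef>

// Hypothetical struct for illustration only.
struct PaddedItem {
    float position[2];  // 8 bytes
    float padding[2];   // fills up the first 16-byte block
    int id;             // starts the second 16-byte block
    int padding2[3];    // rounds the struct size up to 32 bytes
};

// Assumes 4-byte float and int, as on the usual desktop platforms.
static_assert(offsetof(PaddedItem, id) == 16,
              "id should start a new 16-byte block");
static_assert(sizeof(PaddedItem) % 16 == 0,
              "struct size should be a multiple of 16");

// True when every element of a 16-byte-aligned array of structs with
// this size also starts on a 16-byte boundary.
constexpr bool is_multiple_of_16(std::size_t size)
{
    return size % 16 == 0;
}
```

Checks like these fail at compile time if a later change to the struct
shifts a member out of its intended 16-byte block.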
>>> The orig AMD posting:
>>>
>>>
>>> http://forums.amd.com/forum/messageview.cfm?catid=390&threadid=122209&forumid=9
>>>
>>> Conclusion: There are indeed places where the compiler does not take care
>>> of alignment (bug?), so we must do it ourselves.
>>>
>>> Jens
>>>
>>> On 07.12.2011 at 16:38, Brecht Van Lommel wrote:
>>>
>>> Hi,
>>>
>>> Maybe it would be possible to set up an SSH login on a computer so I
>>> can try and narrow this down further? There has to be a pattern here,
>>> but I'm not sure what to suggest you test. What I would do is
>>> keep removing code until it works, and find out exactly which kind of
>>> struct variable layout or operation is causing issues.
>>>
>>> On Wed, Dec 7, 2011 at 7:11 AM, Dalai Felinto <dfelinto at gmail.com> wrote:
>>>
>>> "Good news", I believe both problems may indeed be related and about
>>> alignment. At least the first one.
>>>
>>> If I change the alignment in kernel_types.h in simple ways [*] I get the Z
>>> problem fixed, but the viewport broken in situations where it was working
>>> before.
>>>
>>> If I go to camera fly mode the viewport is fine ('dirty' but fine) until I
>>> stop, then I get the (2) problem. So what are the rules for alignment here?
>>>
>>>
>>> [*] http://www.pasteall.org/27124
>>>
>>>
>>> The alignment issue I was thinking of is when passing structs between
>>> C++ and OpenCL. If they lay out variables in structs differently,
>>> copying KernelData to the GPU would go wrong. So what I've tried to do
>>> is align everything like we do in makesdna, and make all struct sizes
>>> a multiple of 16 bytes.
>>>
>>> What is quite strange is that the struct where you added alignment
>>> does not get passed between C++ and OpenCL; it stays purely in OpenCL,
>>> so I wouldn't expect this to be a problem. Maybe there's a
>>> compiler bug related to nested structs?
>>>
>>> From the specification:
>>>
>>>
>>> "The OpenCL compiler is responsible for aligning data items to the
>>> appropriate alignment as required by the data type. For arguments to a
>>> __kernel function declared to be a pointer to a data type, the OpenCL
>>> compiler can assume that the pointee is always appropriately aligned
>>> as required by the data type."
>>>
>>> So this means that in principle we shouldn't have to worry about
>>> alignment of structs that stay on the GPU, and only for KernelData we
>>> need to worry about alignment. But of course compiler bugs can make
>>> this an issue.
>>>
>>> "For 3-component vector data types, the size of the data type is 4 *
>>> sizeof(component). This means that a 3-component vector data type will
>>> be aligned to a 4 * sizeof(component) boundary."
>>>
>>> This seems to indicate that differential3 should in principle be
>>> aligned already. We can add padding in various structs, but according
>>> to the spec adding float2 to differential3 shouldn't actually improve
>>> alignment, so I'm not sure what the rule would be.
>>>
>>> Regarding (2) viewport problem:
>>>
>>> ----------------------------------------------
>>>
>>> Does F12 rendering work (with resolutions that fail in the viewport) or is
>>> it just viewport rendering? ...
>>>
>>>  http://www.pasteall.org/27116/diff
>>>
>>> The dumped image is as bad as the viewport one. A note: the image is not
>>> 'broken' right away. While the light calculation is still dirty, the image
>>> is correct. Only after a few moments (set_tile() resolution > 8 or so) the
>>> viewport breaks.
>>>
>>>
>>> Ok, so it's not an OpenGL issue.
>>>
>>> Another thing you could test is the workgroup_size in device_opencl.cpp in
>>> path_trace and tonemap.
>>>
>>> workgroup_size is calculated as 16. If I change it to 1, 2, 4, 8, 16 it
>>> produces the same result.
>>> If I change it to 32 OpenCL throws an error at me (OpenCL error (-54):
>>> Invalid work group size)
>>>
>>>
>>> Ok, so workgroup size is probably not the issue.
>>>
>>> Brecht.
>>> _______________________________________________
>>> Bf-cycles mailing list
>>> Bf-cycles at blender.org
>>> http://lists.blender.org/mailman/listinfo/bf-cycles
>>>
>>>
>>> _____________________________________
>>>
>>> Jens Verwiebe
>>> Allerskehre 44  -  22309 Hamburg
>>>
>>> Tel.: +49 40 68 78 50
>>> mobil: +49 172 400 49 07
>>> mailto: info at jensverwiebe.de
>>> web:  http://www.jensverwiebe.de
>>> _____________________________________
>>>
>>>
>>>
>>
>
>
>
