[Bf-cycles] osx + ati + cycles, looking for insights in debugging
Brecht Van Lommel
brechtvanlommel at pandora.be
Wed Dec 7 16:38:15 CET 2011
Maybe it would be possible to set up an SSH login on a computer so I
can try to narrow this down further? There has to be a pattern here,
but I'm not sure what to suggest you test. What I would do is keep
removing code until it works, and find out exactly which kind of
struct variable layout or operation is causing the issue.
On Wed, Dec 7, 2011 at 7:11 AM, Dalai Felinto <dfelinto at gmail.com> wrote:
> "Good news", I believe both problems may indeed be related and of alignment.
> At least the first one.
> If I change the alignment in kernel_type.h in simple ways [*] I get the Z
> problem fixed, but the viewport broken in situations it was working before.
> If I go to camera fly mode the viewport is fine ('dirty' but fine) until I
> stop then I get the (2) problem. So what are the rules for alignment here?
> [*] http://www.pasteall.org/27124
The alignment issue I was thinking of is when passing structs between
C++ and OpenCL. If the two compilers lay out variables in structs
differently, copying KernelData to the GPU would go wrong. So what
I've tried to do is align everything like we do in makesdna, and make
all struct sizes a multiple of 16 bytes.
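To illustrate the kind of check this implies on the C++ side, here is a
minimal sketch. The struct ExampleKernelCamera and the helper
size_is_multiple_of_16 are hypothetical stand-ins, not the real
KernelData from Cycles, and the sketch assumes 4-byte float and int.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct shared between host C++ and an OpenCL kernel.
// Only 4-byte scalars are used, and explicit padding rounds the total
// size up to a multiple of 16 bytes.
struct ExampleKernelCamera {
    float matrix[12];   // 48 bytes
    float focal_length; //  4 bytes
    int   type;         //  4 bytes
    int   pad1, pad2;   //  8 bytes of explicit padding, 64 bytes total
};

// Compile-time checks: the build fails if the layout drifts.
static_assert(sizeof(ExampleKernelCamera) % 16 == 0,
              "struct size must be a multiple of 16 bytes");
static_assert(offsetof(ExampleKernelCamera, focal_length) == 48,
              "unexpected padding before focal_length");

// Runtime helper for sizes that are only known at run time.
bool size_is_multiple_of_16(std::size_t n)
{
    return n % 16 == 0;
}

// Example use before copying a struct to the device.
void check_before_upload()
{
    assert(size_is_multiple_of_16(sizeof(ExampleKernelCamera)));
}
```

This only verifies the host side. It cannot prove that the OpenCL
compiler picks the same offsets, which is what would need to be
checked separately on the device.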
What is quite strange is that the struct where you added alignment
does not get passed between C++ and OpenCL. It stays purely in OpenCL,
so I wouldn't expect this to be a problem. Maybe there's a
compiler bug related to nested structs?
From the specification:
"The OpenCL compiler is responsible for aligning data items to the
appropriate alignment as required by the data type. For arguments to a
__kernel function declared to be a pointer to a data type, the OpenCL
compiler can assume that the pointee is always appropriately aligned
as required by the data type."
So this means that in principle we shouldn't have to worry about
alignment of structs that stay on the GPU, and only for KernelData we
need to worry about alignment. But of course compiler bugs can make
this an issue.
"For 3-component vector data types, the size of the data type is 4 *
sizeof(component). This means that a 3-component vector data type will
be aligned to a 4 * sizeof(component) boundary."
This seems to indicate that differential3 should in principle be
aligned already. We can add padding in various structs, but according
to the spec adding float2 to differential3 shouldn't actually improve
alignment, so I'm not sure what the rule would be.
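As a rough host-side illustration of that rule, the sketch below
imitates an OpenCL 3-component float vector in C++. HostFloat3,
HostDifferential3 and padding_needed are hypothetical names, and the
two-vector shape of differential3 is an assumption made for the
example. It only shows what the spec says should happen, not what a
buggy compiler actually does.

```cpp
#include <cassert>
#include <cstddef>

// Imitation of an OpenCL float3: three floats, with size and alignment
// forced to 4 * sizeof(float) as the specification describes.
struct alignas(4 * sizeof(float)) HostFloat3 {
    float x, y, z;
};

// Assumed shape of differential3: two 3-component vectors.
struct HostDifferential3 {
    HostFloat3 dx;
    HostFloat3 dy;
};

static_assert(sizeof(HostFloat3) == 16, "float3 occupies 16 bytes");
static_assert(alignof(HostFloat3) == 16, "float3 is 16-byte aligned");
static_assert(sizeof(HostDifferential3) == 32, "two float3 make 32 bytes");
static_assert(offsetof(HostDifferential3, dy) == 16, "dy starts at 16");

// Bytes of padding required to round size up to a multiple of align.
constexpr std::size_t padding_needed(std::size_t size, std::size_t align)
{
    return (align - size % align) % align;
}

// Under the spec's rule the struct needs no extra padding at all.
static_assert(padding_needed(sizeof(HostDifferential3), 16) == 0,
              "no extra padding needed under the spec's rule");
```

If a compiler follows the spec, extra padding members change nothing
here, so padding that does change the result points at a compiler
that deviates from this layout.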
> Regarding (2) viewport problem:
> Does F12 rendering work (with resolutions that fail in the viewport) or is
> it just viewport rendering? ...
> The dumped image is as bad as the viewport one. A note: the image is not
> 'broken' right away. While the light calculation is still dirty, the image
> is correct. Only after a few (set_tile() resolution > 8 or so) moments the
> viewport breaks.
Ok, so it's not an OpenGL issue.
>> Another thing you could test is the workgroup_size in device_opencl.cpp in
>> path_trace and tonemap.
> workgroup_size is calculated as 16. If I change it to 1, 2, 4, 8, 16 it
> produces the same result.
> If I change it to 32 OpenCL throw an error on me (OpenCL error (-54):
> Invalid work group size)
Ok, so workgroup size is probably not the issue.