[Bf-cycles] osx + ati + cycles, looking for insights in debugging

Dalai Felinto dfelinto at gmail.com
Sun Dec 11 00:22:41 CET 2011


Hi,
I have good and bad news ...

On my computer, the kernel options that work are:
#define __SVM__
#define __TRANSPARENT_SHADOWS__

That holds only if the "define float3" patch is not applied; otherwise I get
an 'alpha' viewport. With the patch applied I also get this warning, though
I'm not sure it is related:
http://www.pasteall.org/27310

Regardless of this, we get all of these warnings:
http://www.pasteall.org/27311

Note: when I enable any of the other options (e.g. __EMISSION__), I get:
"Compiling OpenCL kernel ...
*[a few seconds later ...]*
Error getting function data from server"

Some options fail with this error after a few seconds; others take longer.

The minimal patch I have with SVM working is:
http://www.pasteall.org/27313/diff

A screenshot to prove I'm not lying:
http://www.pasteall.org/pic/22421
Now, if I "rip" the viewport apart, the new editor renders only 'alpha':
http://www.pasteall.org/pic/22422

Blender test file:
http://www.pasteall.org/blend/10223

--
Dalai

2011/12/10 Brecht Van Lommel <brechtvanlommel at pandora.be>

> Hi,
>
> I set it to split into a fixed number of parts now, but will add code
> so that it's based on doing blocks of max N pixels instead. That
> should make it work then. I couldn't find anything in the OpenCL spec
> about global range limitations for execution, but it is a simple enough
> fix.
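
A minimal sketch of the "blocks of max N pixels" idea described above, assuming the image is cut into horizontal bands of whole rows. The Block type and split_into_blocks() are hypothetical names for illustration and are not taken from the Cycles source.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper type: one horizontal band of the image.
struct Block {
    int y;       // first row of the band
    int height;  // number of rows in the band
};

// Split a width x height image into bands of whole rows so that no band
// holds more than max_pixels pixels. If a single row already exceeds
// max_pixels, each band falls back to exactly one row.
std::vector<Block> split_into_blocks(int width, int height, long long max_pixels)
{
    std::vector<Block> blocks;
    if (width <= 0 || height <= 0 || max_pixels <= 0)
        return blocks;

    // Rows that fit in the pixel budget, clamped to the range [1, height].
    long long rows = std::min<long long>(max_pixels / width, height);
    int rows_per_block = static_cast<int>(std::max<long long>(1, rows));

    for (int y = 0; y < height; y += rows_per_block)
        blocks.push_back({y, std::min(rows_per_block, height - y)});

    return blocks;
}
```

Each band can then be enqueued as its own kernel launch, so the global range per launch stays bounded regardless of the render resolution.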
>
> What exactly is the problem with shaders not building? Is it that it only
> does an ambient occlusion render? In kernel_types.h, under /* Kernel
> Features */, you can see which features are enabled. In device_opencl.cpp,
> more are enabled for NVidia drivers. For Apple drivers, more features can
> be enabled too once they are verified to work.
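
One way to test feature flags one at a time without editing headers is to pass them as -D options in the OpenCL build options string. The sketch below only assembles such a string; build_options() is a hypothetical helper, and whether device_opencl.cpp enables features this way is an assumption, not something stated in this thread.

```cpp
#include <string>
#include <vector>

// Turn a list of feature macro names into an OpenCL build options string,
// e.g. {"__SVM__", "__EMISSION__"} -> "-D__SVM__ -D__EMISSION__".
// The resulting string is meant for the options argument of clBuildProgram.
std::string build_options(const std::vector<std::string>& features)
{
    std::string opts;
    for (const std::string& feature : features) {
        if (!opts.empty())
            opts += ' ';
        opts += "-D" + feature;
    }
    return opts;
}
```

Starting from the known-good set and adding one macro per build makes it easier to see which feature triggers a compile failure.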
>
> Thanks,
> Brecht.
>
> On Sat, Dec 10, 2011 at 9:39 AM, Dalai Felinto <dfelinto at gmail.com> wrote:
> > Hi Brecht,
> >
> > Your latest patch [1] fixes the problem only partially. Now it doesn't
> > break at 1280x1280 and lower, but at 2600x2600 it starts to crash. Funny
> > fact: with 2550x2550 or 2570x2570 I get a black bar on top [2].
> >
> > Jens said that in Sig (lux and small lux) they are splitting the tasks as
> > well. So it's a good direction to go.
> > That still doesn't solve the problem of shaders not building as Jens
> > reported to me. I wonder if it's already time to address that.
> >
> > Cheers,
> > Dalai
> >
> >
> > [1] - http://www.pasteall.org/27291/diff
> > [2] - http://www.pasteall.org/pic/show.php?id=22390 as opposed
> > to http://www.pasteall.org/pic/show.php?id=22391
> >
> >
> > 2011/12/8 Dalai Felinto <dfelinto at gmail.com>
> >>
> >> Hi,
> >> It seems that the viewport problem is related to ShaderData or
> >> ShaderClosure alignment or other problems.
> >> The problem is pretty much narrowed down to the following:
> >>
> >> (1) in kernel_path.h: lines 275 and 277 both break it unless they are
> >> commented out:
> >> - shader_setup_from_ray(kg, &sd, &isect, &ray);
> >> - shader_eval_surface(kg, &sd, rbsdf, state.flag);
> >>
> >> (2) in kernel_shader.h: shader_eval_surface() consists of only 2 lines of
> >> code (for OpenCL). Both lines (486 and 487) also break it unless they are
> >> commented out:
> >> - bsdf_diffuse_setup(sd, &sd->closure);
> >> - sd->closure.weight = make_float3(0.8f, 0.8f, 0.8f);
> >>
> >> Notes:
> >> --------
> >> 1) by "break" I mean it breaks the viewport or gives a pitch-black render
> >> 2) resolutions used for testing (Width x Height @ Tiles): 820x772 @ 820
> >> (breaks), 820x772 @ 818 (works)
> >> 3) by "both lines need to be commented" I mean that if either one is left
> >> uncommented, it breaks.
> >> 4) I get the same results with or without the kernel_compat_cl.h patch
> >> for #define float3 float4
> >> 5) my test file - http://www.pasteall.org/blend/10191
> >> 6) example of "working" image when commenting the above mentioned lines:
> >> http://www.pasteall.org/pic/show.php?id=22245
> >>
> >> Best regards,
> >> Dalai
> >>
> >> -- getting close ;)
> >>
> >> 2011/12/7 Jens Verwiebe <info at jensverwiebe.de>
> >>>
> >>> Hi
> >>> As a hint, I found these two write-ups:
> >>>
> >>> So about that float padding[2]; in the struct definition. This is
> >>> because of memory alignment in OpenCL. The best explanation I’ve seen so
> >>> far is by AndreasStahl, which I will briefly summarize in relation to the
> >>> struct above.
> >>> When interpreting a struct, OpenCL accesses the memory in blocks of 16
> >>> bytes, which is the same as 4 floats (each 4 bytes). So in our example,
> >>> if we did not have the padding, we would not be able to access our int,
> >>> because OpenCL would have interpreted it as the 3rd float out of the
> >>> first 16 bytes. This can get even more complicated if you have an array
> >>> of structs, because then the size of your struct will need to be a
> >>> multiple of 16, as explained in the linked forum post.
> >>> The original AMD posting:
> >>>
> >>> http://forums.amd.com/forum/messageview.cfm?catid=390&threadid=122209&forumid=9
> >>>
> >>> Conclusion: There are indeed places where the compiler does not take
> >>> care of alignment (bug?), so we must do it ourselves.
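
The manual padding described above can be sketched on the host side as follows. Particle and padded_size() are hypothetical names for illustration (the struct from the quoted write-up is not shown in this thread), and the offsets assume 4-byte float and int.

```cpp
#include <cstddef>

// Hypothetical host-side struct; field names are illustrative only.
struct Particle {
    float position[4];  // bytes 0..15, fills the first 16-byte block
    float mass;         // offset 16
    int   id;           // offset 20
    float padding[2];   // bytes 24..31, rounds the size up to 32
};

// Round a byte count up to the next multiple of 16.
constexpr std::size_t padded_size(std::size_t bytes)
{
    return (bytes + 15) / 16 * 16;
}

// Compile-time checks, so a layout change fails the build instead of
// silently producing a broken render.
static_assert(offsetof(Particle, id) == 20, "unexpected offset of id");
static_assert(sizeof(Particle) % 16 == 0, "size must be a multiple of 16");
static_assert(sizeof(Particle) == padded_size(24), "padding does not match");
```

Without the padding member the struct would be 24 bytes, and the elements of an array of such structs would no longer start on 16-byte boundaries.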
> >>>
> >>> Jens
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On 07.12.2011 at 16:38, Brecht Van Lommel wrote:
> >>>
> >>> Hi,
> >>>
> >>> Maybe it would be possible to set up an SSH login on a computer so I
> >>> can try and narrow this down further? There has to be a pattern here,
> >>> but I'm not sure what to suggest you test. What I would do is keep
> >>> removing code until it works, and find out exactly which kind of
> >>> struct variable layout or operation is causing issues.
> >>>
> >>> On Wed, Dec 7, 2011 at 7:11 AM, Dalai Felinto <dfelinto at gmail.com> wrote:
> >>>
> >>> "Good news": I believe both problems may indeed be related, and both
> >>> may be alignment problems. At least the first one.
> >>>
> >>> If I change the alignment in kernel_types.h in simple ways [*], the Z
> >>> problem is fixed, but the viewport breaks in situations where it was
> >>> working before.
> >>>
> >>> If I go to camera fly mode the viewport is fine ('dirty' but fine) until
> >>> I stop; then I get problem (2). So what are the rules for alignment
> >>> here?
> >>>
> >>>
> >>> [*] http://www.pasteall.org/27124
> >>>
> >>>
> >>> The alignment issue I was thinking of is when passing structs between
> >>> C++ and OpenCL. If they lay out variables in structs differently,
> >>> copying KernelData to the GPU would go wrong. So what I've tried to do
> >>> is align everything like we do in makesdna, and make all struct sizes
> >>> a multiple of 16 bytes.
> >>>
> >>> What is quite strange, is that the struct where you added alignment
> >>> does not get passed between C++ and OpenCL, it stays purely in OpenCL,
> >>> and so I wouldn't expect this to be a problem. Maybe there's a
> >>> compiler bug related to nested structs?
> >>>
> >>> From the specification:
> >>>
> >>>
> >>> "The OpenCL compiler is responsible for aligning data items to the
> >>> appropriate alignment as required by the data type. For arguments to a
> >>> __kernel function declared to be a pointer to a data type, the OpenCL
> >>> compiler can assume that the pointee is always appropriately aligned
> >>> as required by the data type."
> >>>
> >>> So this means that in principle we shouldn't have to worry about
> >>> alignment of structs that stay on the GPU, and only for KernelData we
> >>> need to worry about alignment. But of course compiler bugs can make
> >>> this an issue.
> >>>
> >>> "For 3-component vector data types, the size of the data type is 4 *
> >>> sizeof(component). This means that a 3-component vector data type will
> >>> be aligned to a 4 * sizeof(component) boundary."
> >>>
> >>> This seems to indicate that differential3 should in principle be
> >>> aligned already. We can add padding in various structs, but according
> >>> to the spec adding float2 to differential3 shouldn't actually improve
> >>> alignment, so I'm not sure what the rule would be.
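
The quoted rule for 3-component vectors can be mirrored on the C++ side with alignas, as a rough sketch. HostFloat3 and HostDifferential3 are hypothetical host-side stand-ins; that differential3 holds exactly two float3 members is an assumption here, and the sizes assume a 4-byte float.

```cpp
#include <cstddef>

// Host-side stand-in for an OpenCL float3: three used components, but the
// size and alignment of four floats, as the quoted spec text describes.
struct alignas(16) HostFloat3 {
    float x, y, z;
};

// Hypothetical mirror of a struct holding two such vectors.
struct HostDifferential3 {
    HostFloat3 dx;
    HostFloat3 dy;
};

static_assert(sizeof(HostFloat3) == 4 * sizeof(float), "float3 should take 4 floats");
static_assert(alignof(HostFloat3) == 16, "float3 should be 16-byte aligned");
static_assert(sizeof(HostDifferential3) == 32, "two float3 members, no extra padding");
```

With this layout every member already starts on a 16-byte boundary, which matches the observation above that extra padding should not be needed according to the spec.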
> >>>
> >>> Regarding the (2) viewport problem:
> >>>
> >>> ----------------------------------------------
> >>>
> >>> Does F12 rendering work (with resolutions that fail in the viewport) or
> >>> is it just viewport rendering? ...
> >>>
> >>>  http://www.pasteall.org/27116/diff
> >>>
> >>> The dumped image is as bad as the viewport one. A note: the image is not
> >>> 'broken' right away. While the light calculation is still dirty, the
> >>> image is correct. Only after a few moments (set_tile() resolution > 8 or
> >>> so) does the viewport break.
> >>>
> >>>
> >>> Ok, so it's not an opengl issue.
> >>>
> >>> Another thing you could test is the workgroup_size in device_opencl.cpp
> >>> in path_trace and tonemap.
> >>>
> >>> workgroup_size is calculated as 16. If I change it to 1, 2, 4, 8 or 16 it
> >>> produces the same result.
> >>>
> >>> If I change it to 32, OpenCL throws an error (OpenCL error (-54):
> >>> Invalid work group size).
> >>>
> >>>
> >>> Ok, so workgroup size is probably not the issue.
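
Error -54 is CL_INVALID_WORK_GROUP_SIZE. Two common ways to avoid it are to keep the requested work-group size within the limit the driver reports for the kernel (for example via clGetKernelWorkGroupInfo with CL_KERNEL_WORK_GROUP_SIZE) and to make the global size a multiple of the work-group size. The sketch below shows only the arithmetic; round_up() and clamp_group_size() are hypothetical helpers, and whether either cause applies to the failure with 32 reported here is an assumption.

```cpp
#include <cstddef>

// Round a global work size up to the next multiple of the work-group size,
// so that the global size is evenly divisible by the local size.
std::size_t round_up(std::size_t global_size, std::size_t group_size)
{
    if (group_size == 0)
        return global_size;
    return (global_size + group_size - 1) / group_size * group_size;
}

// Clamp a requested work-group size to a limit reported by the driver.
// max_group_size stands in for a value queried at runtime.
std::size_t clamp_group_size(std::size_t requested, std::size_t max_group_size)
{
    return requested < max_group_size ? requested : max_group_size;
}
```

When the global size is rounded up, the kernel needs a bounds check so that the extra work items past the image edge do nothing.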
> >>>
> >>> Brecht.
> >>> _______________________________________________
> >>> Bf-cycles mailing list
> >>> Bf-cycles at blender.org
> >>> http://lists.blender.org/mailman/listinfo/bf-cycles
> >>>
> >>>
> >>> _____________________________________
> >>>
> >>> Jens Verwiebe
> >>> Allerskehre 44  -  22309 Hamburg
> >>>
> >>> Tel.: +49 40 68 78 50
> >>> mobil: +49 172 400 49 07
> >>> mailto: info at jensverwiebe.de
> >>> web:  http://www.jensverwiebe.de
> >>> _____________________________________