[Bf-committers] Viewport FX Design

Fri Jun 1 22:29:05 CEST 2012

On Fri, Jun 1, 2012 at 2:00 PM, Brecht Van Lommel
<brechtvanlommel at pandora.be> wrote:
> I just meant release / free the arrays, or whatever the API needs to
> do depending if it's using VBO's, vertex arrays, immediate/retained,

ok

>> Also, and most importantly, doing it the way you suggest requires that
>> the programmer re-specify colors and normals for every vertex instead
>> of relying on the cached copy to be there.  This is significantly
>> different than how OpenGL has worked.
>
> It's different, yes, but just a small modification I think.

> Regarding
> performance, note that can depend a lot on the GPU, drivers and the
> type of mesh used. If your GPU is significantly faster than the CPU,
> it matters more. At least on recent GPU's I tested basic viewport
> performance has not improved much because the CPU was the bottleneck.
> It just seems a bit overkill to have 7 conditionals per vertex when
> really all that needs to happen is just copying some data into an
> array.

gpu_vertex_copy definitely violates my principle that we want our
replacement code to be "dumb" :)

I have ideas to make it faster, but right now I've gotten most of my
speed ups from just reducing how many times OpenGL APIs are called.
gpuImmediate really should not be used for huge batches that eat up
CPU bandwidth.  That is why I'm willing to sacrifice some performance
for convenience.
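
To make the "cached copy" idea concrete, here is a rough sketch of a
staging area.  All names are made up for illustration (this is not the
actual gpuImmediate code), and the format is fixed to position, normal
and color.  The real copy function presumably also has to check which
attributes the current format enables, which I assume is where the
per-vertex conditionals come from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; not the real gpuImmediate API. */
#define SKETCH_MAX_VERTS 256

typedef struct SketchVertex {
    float co[3];
    float normal[3];
    unsigned char color[4];
} SketchVertex;

typedef struct SketchImmediate {
    SketchVertex current;                  /* cached "current" attributes */
    SketchVertex buffer[SKETCH_MAX_VERTS]; /* array handed to OpenGL later */
    size_t count;                          /* vertices emitted so far */
} SketchImmediate;

/* Only updates the cache; nothing is written to the array yet. */
static void sketch_color4ub(SketchImmediate *im, unsigned char r,
                            unsigned char g, unsigned char b, unsigned char a)
{
    im->current.color[0] = r;
    im->current.color[1] = g;
    im->current.color[2] = b;
    im->current.color[3] = a;
}

/* Only updates the cache, like sketch_color4ub. */
static void sketch_normal3f(SketchImmediate *im, float x, float y, float z)
{
    im->current.normal[0] = x;
    im->current.normal[1] = y;
    im->current.normal[2] = z;
}

/* Emitting a position copies the whole cached vertex into the array in
 * one pass, so color and normal carry over from earlier vertices. */
static void sketch_vertex3f(SketchImmediate *im, float x, float y, float z)
{
    assert(im->count < SKETCH_MAX_VERTS);
    im->current.co[0] = x;
    im->current.co[1] = y;
    im->current.co[2] = z;
    im->buffer[im->count++] = im->current;
}
```

With this, setting a color once and then emitting several vertices
gives every vertex that color, the same way glColor followed by several
glVertex calls does.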

> This does not seem like a good reason to me, if they are mismatched
> I'd rather have it throw me an error.

I do have a system set up already to allow us to add lots of run-time
checks that can be turned off (WITH_GPU_SAFETY).  I was thinking of
checking for this anyway.  It would allow the implementation to be
made more strict later if we need it to be.
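
As an illustration of the kind of check I mean, a minimal sketch
follows.  Only WITH_GPU_SAFETY is a real name; the macro, struct and
function below are hypothetical and are not how the checks are
actually written.

```c
#include <assert.h>

/* WITH_GPU_SAFETY is the build option mentioned above.  Everything
 * else in this block is a hypothetical illustration. */
#ifdef WITH_GPU_SAFETY
#  define SKETCH_GPU_CHECK(cond) assert(cond)
#else
#  define SKETCH_GPU_CHECK(cond) ((void)0)
#endif

typedef struct SketchState {
    int color_size;  /* components the locked format expects: 0, 3 or 4 */
    float color[4];  /* cached current color */
} SketchState;

/* With checks on, a call that does not match the locked format fires
 * an assertion.  With checks off, the value is simply cached. */
static void sketch_color3f(SketchState *st, float r, float g, float b)
{
    SKETCH_GPU_CHECK(st->color_size == 3);
    st->color[0] = r;
    st->color[1] = g;
    st->color[2] = b;
    st->color[3] = 1.0f;
}
```

Because the check compiles away when the option is off, it costs
nothing in release builds and can be tightened later without changing
the callers.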

>> 2) I was entertaining the idea of using write combined memory for the
>> buffer (NV_vertex_array_range), and such a buffer needs to be written
>> sequentially.
>
> This point I don't understand. But anyways, if the performance is not
> affected I don't care that much, and it can still be optimized
> afterwards if it turns out to be a bottleneck, would not affect the
> design really.

Just FYI, allocating a buffer in write-combined memory allows it to be
written very quickly because caching is turned off and special
hardware is used to combine writes into big chunks.  But it requires
the memory to be written sequentially.  The user does not know what
order to write things in, but gpu_copy_vertex does, so it writes
things sequentially.  Such a thing would still be important for CUDA,
but the only way to get this for vertex arrays is to use
NV_vertex_array_range.  I'm sure that the OpenGL driver also uses
write-combined memory internally depending on what usage you set for
your VBO.  This isn't something I'm likely to actually ever try so I
shouldn't have brought it up.

>> It is legal to nest these commands, but the format isn't changed.
>> This is another reason why I cache the current vertex instead of
>> copying it directly into the array, it allows for code with compatible
>> formats to be nested like this without worrying about firing an
>> assertion or having to add checks to the glCurrentX functions.
>
> I don't think nesting should be allowed? I imagine any code using
> these should keep them near the associated gpuBegin/gpuEnd to keep
> things from getting out of sync.

I believe nesting is cheap and useful.  I first used it to keep
BLF_draw easy to use because it locks itself automatically, but it can
also be locked explicitly for large blocks of many BLF_draw calls.  The
only way this really works is to keep track of a lock count; otherwise
we do not know when to unlock.  I could have made the code in blf a
special case and have it track if it had set up the vertex format or
not, but I felt it was probably going to be a common pattern.
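
Roughly, the lock count pattern looks like the sketch below.  The names
are hypothetical (this is not the actual blf or gpuImmediate code); the
counters exist only so the behavior can be observed.

```c
#include <assert.h>

/* Hypothetical sketch of nested locking with a lock count. */
typedef struct SketchLock {
    int lock_count;      /* how many locks are currently held */
    int setup_calls;     /* times the vertex format was actually set up */
    int teardown_calls;  /* times it was actually torn down */
    int draws;           /* number of draw calls issued */
} SketchLock;

static void sketch_lock(SketchLock *l)
{
    if (l->lock_count == 0) {
        l->setup_calls++;  /* outermost lock: set up the format here */
    }
    l->lock_count++;
}

static void sketch_unlock(SketchLock *l)
{
    assert(l->lock_count > 0);
    l->lock_count--;
    if (l->lock_count == 0) {
        l->teardown_calls++;  /* outermost unlock: release here */
    }
}

/* Locks itself, so callers do not have to; an outer lock held by the
 * caller turns this inner lock/unlock pair into a counter bump. */
static void sketch_draw_text(SketchLock *l)
{
    sketch_lock(l);
    l->draws++;  /* glyph quads would be emitted here */
    sketch_unlock(l);
}
```

A single sketch_draw_text call sets up and tears down once.  Wrapping
many calls in one explicit sketch_lock/sketch_unlock pair also sets up
and tears down only once for the whole block.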

>> If you want to use flags/enums, why not just use something like OpenGL
>> 1.2's interleaved array formats?  Again though, it isn't incredibly
>> important so long as it's easy to read and use.
>
> Mainly I want something that is more clear than the example usage that
> is in e.g. blenfont, found that difficult to read. With something like
> GPU_TEX_COORD_3F it's immediately clear that this matches
> gpuTexCoord3f. But if it's abbreviated that's also fine with me.

Oh, I probably need to replace the code in blf.c with the convenience function.

> Anyways, overall I agree with your plans then.
>
> Brecht.

And of course they are subject to change as actual experience
dictates.  I'll keep your suggestions in mind as I go, and if it
becomes apparent that I made a bad assumption I'll certainly backtrack
and change what I'm doing.

I'm certainly willing to try not having a "staging area" for vertices
and just directly copying them; it is just that I estimated it might
take 2 or 3 times longer to port Blender to use the new library if I
did.
