[Bf-committers] Viewport FX Design

Fri Jun 8 02:45:33 CEST 2012

Hi

Most (if not all) projection functions like glOrtho are preceded by
glLoadIdentity(). Therefore, I plan to drop the multiplication and just write
to the current matrix.
Also, it is easier to have gpuPopMatrix call gpuMatrixCommit() to
automatically restore the previous state (at least for now).
For now, gpuPopMatrix will also check that the matrix is committed before
exiting, so we won't have strange bugs.
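
A minimal sketch of the idea in plain C, with no OpenGL calls. Every name in
it (SketchMatrixStack, sketch_ortho, sketch_pop, the `committed` flag, the
`gpu` array standing in for driver state) is hypothetical and is not the
actual gpu API; it only illustrates an ortho that overwrites the current
matrix instead of multiplying, and a pop that commits and then checks.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_STACK_DEPTH 16 /* arbitrary depth, assumed for the sketch */

typedef struct SketchMatrixStack {
    float m[SKETCH_STACK_DEPTH][16]; /* column-major 4x4 matrices */
    int top;                         /* index of the current matrix */
    int committed;                   /* nonzero when `gpu` matches current */
    float gpu[16];                   /* stand-in for the state the GPU sees */
} SketchMatrixStack;

/* Send the current matrix to the (simulated) GPU state. */
static void sketch_commit(SketchMatrixStack *s)
{
    memcpy(s->gpu, s->m[s->top], sizeof(s->gpu));
    s->committed = 1;
}

static void sketch_init(SketchMatrixStack *s)
{
    static const float identity[16] = {1, 0, 0, 0,  0, 1, 0, 0,
                                       0, 0, 1, 0,  0, 0, 0, 1};
    s->top = 0;
    memcpy(s->m[0], identity, sizeof(identity));
    sketch_commit(s);
}

/* Duplicate the current matrix onto the top of the stack. */
static void sketch_push(SketchMatrixStack *s)
{
    assert(s->top + 1 < SKETCH_STACK_DEPTH);
    memcpy(s->m[s->top + 1], s->m[s->top], sizeof(s->m[0]));
    s->top++;
}

/* Overwrite the current matrix with an orthographic projection.
 * No multiplication: this assumes callers would have loaded the
 * identity first anyway. */
static void sketch_ortho(SketchMatrixStack *s, float l, float r,
                         float b, float t, float n, float f)
{
    float *m = s->m[s->top];
    memset(m, 0, 16 * sizeof(float));
    m[0] = 2.0f / (r - l);
    m[5] = 2.0f / (t - b);
    m[10] = -2.0f / (f - n);
    m[12] = -(r + l) / (r - l);
    m[13] = -(t + b) / (t - b);
    m[14] = -(f + n) / (f - n);
    m[15] = 1.0f;
    s->committed = 0; /* GPU state is stale until the next commit */
}

/* Pop, commit the restored matrix, and verify it before returning. */
static void sketch_pop(SketchMatrixStack *s)
{
    assert(s->top > 0);
    s->top--;
    sketch_commit(s);
    assert(s->committed);
}
```

In this sketch a caller would do sketch_push, sketch_ortho, draw, then
sketch_pop, and the previous matrix is back in effect without a separate
commit call.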

Best,
Alex

On Fri, Jun 1, 2012 at 4:29 PM, Jason Wilkins <jason.a.wilkins at gmail.com> wrote:

> On Fri, Jun 1, 2012 at 2:00 PM, Brecht Van Lommel
> <brechtvanlommel at pandora.be> wrote:
> > I just meant release / free the arrays, or whatever the API needs to
> > do depending if it's using VBO's, vertex arrays, immediate/retained,
>
> ok
>
> >> Also, and most importantly, doing it the way you suggest requires that
> >> the programmer re-specify colors and normals for every vertex instead
> >> of relying on the cached copy to be there.  This is significantly
> >> different than how OpenGL has worked.
> >
> > It's different, yes, but just a small modification, I think.
>
>
> > Regarding
> > performance, note that it can depend a lot on the GPU, drivers and the
> > type of mesh used. If your GPU is significantly faster than the CPU,
> > it matters more. At least on the recent GPUs I tested, basic viewport
> > performance has not improved much because the CPU was the bottleneck.
> > It just seems a bit overkill to have 7 conditionals per vertex when
> > really all that needs to happen is just copying some data into an
> > array.
>
> gpu_vertex_copy definitely violates my principle that we want our
> replacement code to be "dumb" :)
>
> I have ideas to make it faster, but right now I've gotten most of my
> speed ups from just reducing how many times OpenGL APIs are called.
> gpuImmediate really should not be used for huge batches that eat up
> CPU bandwidth.  That is why I'm willing to sacrifice some performance
> for convenience.
>
> > This does not seem like a good reason to me, if they are mismatched
> > I'd rather have it throw me an error.
>
> I do have a system set up already to allow us to add lots of run-time
> checks that can be turned off (WITH_GPU_SAFETY).  I was thinking of
> checking for this anyway.  It would allow the implementation to be
> made more strict later if we need it to be.
>
> >> 2) I was entertaining the idea of using write combined memory for the
> >> buffer (NV_vertex_array_range), and such a buffer needs to be written
> >> sequentially.
> >
> > This point I don't understand. But anyways, if the performance is not
> > affected I don't care that much, and it can still be optimized
> > afterwards if it turns out to be a bottleneck, would not affect the
> > design really.
>
> Just FYI, allocating a buffer in write-combined memory allows it to be
> written very quickly because caching is turned off and special
> hardware is used to combine writes into big chunks.  But it requires
> the memory to be written sequentially.  The user does not know what
> order to write things in, but gpu_copy_vertex does, so it writes
> things sequentially.  Such a thing would still be important for CUDA,
> but the only way to get this for vertex arrays is to use
> NV_vertex_array_range.  I'm sure that the OpenGL driver also uses
> write combined memory internally depending on what usage you set for
> your VBO.  This isn't something I'm likely to ever actually try, so I
> shouldn't have brought it up.
>
> >> It is legal to nest these commands, but the format isn't changed.
> >> This is another reason why I cache the current vertex instead of
> >> copying it directly into the array, it allows for code with compatible
> >> formats to be nested like this without worrying about firing an
> >> assertion or having to add checks to the glCurrentX functions.
> >
> > I don't think nesting should be allowed? I imagine any code using
> > these should keep them near the associated gpuBegin/gpuEnd to keep
> > things from getting out of sync.
>
> I believe nesting is cheap and useful. I first used it to keep
> BLF_draw easy to use because it locks itself automatically, but it can
> also be locked explicitly for large blocks of many BLF_draw.  The only
> way this really works is to keep track of a lock count, otherwise we
> do not know when to unlock.  I could have made the code in blf a
> special case and have it track if it had set up the vertex format or
> not, but I felt it was probably going to be a common pattern.
>
> >> If you want to use flags/enums, why not just use something like OpenGL
> >> 1.2's interleaved array formats?  Again though, it isn't incredibly
> >> important so long as it's easy to read and use.
> >
> > Mainly I want something that is more clear than the example usage that
> > is in e.g. blenfont, which I found difficult to read. With something like
> > GPU_TEX_COORD_3F it's immediately clear that this matches
> > gpuTexCoord3f. But if it's abbreviated that's also fine with me.
>
> Oh, I probably need to replace the code in blf.c with the convenience
> function.
>
> > Anyways, overall I agree with your plans then.
> >
> > Brecht.
>
>
> And of course they are subject to change as actual experience
> dictates.  I'll keep your suggestions in mind as I go and if it becomes
> apparent I made a bad assumption I'll certainly backtrack and change
> what I'm doing.
>
> I'm certainly willing to try not having a "staging area" for vertexes
> and just directly copying them, it is just that I estimated it might
> take 2 or 3 times longer to port Blender to use the new library if I
> did.
>
> > _______________________________________________
> > Bf-committers mailing list
> > Bf-committers at blender.org
> > http://lists.blender.org/mailman/listinfo/bf-committers