[Bf-committers] Viewport FX Design

Jason Wilkins jason.a.wilkins at gmail.com
Fri Jun 1 20:02:50 CEST 2012

On Fri, Jun 1, 2012 at 9:05 AM, Brecht Van Lommel
<brechtvanlommel at pandora.be> wrote:
> I may be saying again what you already wrote, but let me give my view
> on the immediate mode emulation stuff. What we need:
> I think the current immediate mode emulation is a good start, but we
> can unify it with vertex arrays and VBO's, and make it more than just
> immediate mode emulation. It already requires to specify up front
> which types of data are going to be passed, so it's already close in
> usage. The big remaining difference is that you don't need to specify
> up front how many vertices you're going to pass, and we can make it
> support both cases with a single API.

What you specify is a buffer size.  If the buffer fills up, it is
restarted, so the size is actually unlimited.  I just realized that
my restart code is a bit dumb and will drop a primitive, but that's
just a bug.
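
To illustrate the restart behaviour, here is a minimal sketch of a buffer that flushes when full without splitting a primitive across two batches. All names (ImmSketch, imm_sketch_*) are hypothetical and not the actual code in the branch. It assumes independent primitives (points, lines, triangles, quads); strips and fans would need to re-emit shared vertices after a restart, which this does not attempt.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is not the actual Blender GPU code. */
enum { IMM_MAX_VERTS = 8 }; /* tiny on purpose so the restart is easy to see */

typedef struct ImmSketch {
    float  verts[IMM_MAX_VERTS][3];
    size_t count;     /* vertices currently buffered */
    size_t usable;    /* capacity rounded down to whole primitives */
    size_t flushes;   /* number of batches sent so far */
    size_t submitted; /* total vertices handed off */
} ImmSketch;

/* verts_per_prim must be in 1..IMM_MAX_VERTS (3 for triangles, 2 for lines). */
void imm_sketch_begin(ImmSketch *im, size_t verts_per_prim)
{
    memset(im, 0, sizeof(*im));
    /* Only fill up to a multiple of the primitive size, so a restart
     * always happens on a primitive boundary and nothing is dropped. */
    im->usable = (IMM_MAX_VERTS / verts_per_prim) * verts_per_prim;
}

void imm_sketch_flush(ImmSketch *im)
{
    if (im->count == 0) {
        return;
    }
    /* A real implementation would issue the draw call here. */
    im->submitted += im->count;
    im->flushes++;
    im->count = 0;
}

void imm_sketch_vertex3f(ImmSketch *im, float x, float y, float z)
{
    if (im->count == im->usable) {
        imm_sketch_flush(im); /* buffer full: send it and restart */
    }
    float *v = im->verts[im->count++];
    v[0] = x;
    v[1] = y;
    v[2] = z;
}

void imm_sketch_end(ImmSketch *im)
{
    imm_sketch_flush(im); /* send whatever is left */
}
```

With 3 vertices per primitive the usable capacity here is 6, so feeding 9 vertices produces one restart in the middle and a final flush at the end.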

> It would be the API to pass primitives and their attributes to GL,
> which are either drawn immediately or retained. There would still be
> an immediate mode component to avoid continuously allocating or
> freeing memory, but most of it can be shared with retained mode.

This has definitely been my plan.  The same API will be used to fill
in retained buffers.

> Basically the API will always be, allocate an array for each needed
> attribute, fill up the arrays, and then give them back.

Give them back?  I'm not sure what you mean.  We must assume that if
the client code has a pointer to the buffer then that buffer cannot be

> Filling up these arrays can be wrapped in nice immediate-mode-ish
> functions, but for best performance, after function inlining it should
> just end up filling arrays and increasing a counter for each vertex
> (the current gpu_vector_copy function still has too much overhead).

I'm not convinced there is any real performance to be gained.
gpu_vector_copy was always less than 5-10% of cpu time in my profiles
and I was torturing it.  We are either filling up large buffers once
or small buffers many times.

Whatever I am doing, it is already much less overhead than the
original immediate mode calls it is meant to replace (for copying,
switching vertex arrays off and on is a different story).

Also, and most importantly, doing it the way you suggest requires that
the programmer re-specify colors and normals for every vertex instead
of relying on the cached copy to be there.  This is significantly
different than how OpenGL has worked.
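
A rough sketch of what "relying on the cached copy" means, under my own assumptions: the color and normal calls only update a cached "current" value, and each vertex call copies that cache into the output array. The names (CacheImmediate, cache_*) and the fixed C4/N3/V3 layout are hypothetical and only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names and layout; not the actual GPUimmediate structure. */
enum { CACHE_MAX_VERTS = 16 };

typedef struct CacheVertex {
    float color[4];
    float normal[3];
    float pos[3];
} CacheVertex;

typedef struct CacheImmediate {
    float       cur_color[4];  /* cached "current" color */
    float       cur_normal[3]; /* cached "current" normal */
    CacheVertex out[CACHE_MAX_VERTS];
    size_t      count;
} CacheImmediate;

/* Setting an attribute only touches the cache, as in classic OpenGL. */
void cache_color4f(CacheImmediate *im, float r, float g, float b, float a)
{
    im->cur_color[0] = r;
    im->cur_color[1] = g;
    im->cur_color[2] = b;
    im->cur_color[3] = a;
}

void cache_normal3f(CacheImmediate *im, float x, float y, float z)
{
    im->cur_normal[0] = x;
    im->cur_normal[1] = y;
    im->cur_normal[2] = z;
}

/* Emitting a vertex copies the cached attributes along with the position.
 * Returns 0 if the (fixed-size) sketch buffer is full, 1 otherwise. */
int cache_vertex3f(CacheImmediate *im, float x, float y, float z)
{
    if (im->count == CACHE_MAX_VERTS) {
        return 0;
    }
    CacheVertex *v = &im->out[im->count++];
    memcpy(v->color, im->cur_color, sizeof(v->color));
    memcpy(v->normal, im->cur_normal, sizeof(v->normal));
    v->pos[0] = x;
    v->pos[1] = y;
    v->pos[2] = z;
    return 1;
}
```

The caller sets a color once and then emits any number of vertices; every one of them picks up that color without it being specified again.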

I have two other justifications that could be thrown out the window.

1) It keeps the code simple and safe.  No checks that you used
TexCoord3 when you said there were only 2 texcoords.
2) I was entertaining the idea of using write combined memory for the
buffer (NV_vertex_array_range), and such a buffer needs to be written

> For cases where you don't specify the number of vertices up front, the
> arrays could have a reasonably big default size, and then every N
> vertices it could send out the data in a batch, that still would just
> be a single if() test per vertex.

That is how it works now: maxVertexCount is the size of the buffer to
build up before sending the geometry.

The retained mode interface would use the same GPUimmediate structure
(probably renamed to avoid confusion), but the maxVertexCount would be
a hard limit.
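
As a sketch of the difference between the two uses of maxVertexCount (hypothetical names, and only my assumption of how the two modes could share one code path): in immediate mode a full buffer is sent and restarted, while in retained mode a full buffer simply refuses more vertices.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; only the idea of one shared limit comes from the text. */
typedef enum { STREAM_IMMEDIATE, STREAM_RETAINED } StreamMode;

typedef struct StreamSketch {
    StreamMode mode;
    size_t     max_vertex_count; /* batch size (immediate) or hard limit (retained) */
    size_t     count;            /* vertices currently in the buffer */
    size_t     flushes;          /* batches sent so far (immediate mode only) */
} StreamSketch;

/* Returns false when the vertex cannot be accepted. */
bool stream_add_vertex(StreamSketch *s)
{
    if (s->max_vertex_count == 0) {
        return false; /* nothing can be stored in an empty buffer */
    }
    if (s->count == s->max_vertex_count) {
        if (s->mode == STREAM_RETAINED) {
            return false; /* hard limit: the retained buffer is full */
        }
        /* Immediate mode: send the batch and start over. */
        s->flushes++;
        s->count = 0;
    }
    s->count++; /* a real implementation would write the vertex data here */
    return true;
}
```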

> Specifying the needed attributes should be totally simple for simple
> cases, it could be a bitflag to specify the attributes you will use.
> For more complex cases like arbitrary GLSL attributes it can be more
> complicated of course, but that's the exception. We also do not need
> to support multiple texture coordinates, they're only used in game
> engine immediate mode drawing which can just be dropped.

Right now I have a set of functions that combine common vertex format
specifications with locking in the format.  Since there are so few
actual formats I figured I could just make them inline and not bother
with the logic to implement bit flags.  But it's obviously just a style

My utility functions have names like: gpuImmediateFormat_C4_N3_V3();

And one command to end it, gpuImmediateUnformat();

It is legal to nest these commands, but the format isn't changed.
This is another reason why I cache the current vertex instead of
copying it directly into the array, it allows for code with compatible
formats to be nested like this without worrying about firing an
assertion or having to add checks to the glCurrentX functions.
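
One way the nesting could work, sketched with a lock depth counter. The names (FmtSketch, fmt_*) are hypothetical stand-ins for the gpuImmediateFormat_*/gpuImmediateUnformat pair, and clearing the format on the outermost unlock is my assumption, not something stated above.

```c
#include <assert.h>

/* Hypothetical state; component counts of 0 mean "attribute not in format". */
typedef struct FmtSketch {
    unsigned color_size;
    unsigned normal_size;
    unsigned vertex_size;
    unsigned lock_depth; /* how many format calls are currently open */
} FmtSketch;

static void fmt_lock(FmtSketch *f, unsigned c, unsigned n, unsigned v)
{
    /* Only the outermost call actually sets the format; nested calls
     * just bump the depth and leave the locked format untouched. */
    if (f->lock_depth == 0) {
        f->color_size  = c;
        f->normal_size = n;
        f->vertex_size = v;
    }
    f->lock_depth++;
}

/* Stand-ins for the per-format utility functions. */
void fmt_lock_C4_N3_V3(FmtSketch *f) { fmt_lock(f, 4, 3, 3); }
void fmt_lock_V3(FmtSketch *f)       { fmt_lock(f, 0, 0, 3); }

void fmt_unlock(FmtSketch *f)
{
    assert(f->lock_depth > 0); /* unbalanced unlock is a programming error */
    f->lock_depth--;
    if (f->lock_depth == 0) {
        /* Assumption: the format is cleared once the outermost lock ends. */
        f->color_size = f->normal_size = f->vertex_size = 0;
    }
}
```

Because the inner code's color and normal calls only write to the cached current vertex, code written for a V3 format can run inside a C4_N3_V3 block and the outer format stays in effect.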

I'll reiterate that I think this flexibility and simplicity (and
safety) is advantageous and that my profiles do not compel me to
conclude that it is significantly slower in situations where it is
meant to be used (or even in places where you abuse it).

> So in most cases (almost everything except mesh drawing) it could look
> something like this.
> gpuBegin(..)
> ...
> gpuEnd(..)
> gpuImmediateEnd()

If you want to use flags/enums, why not just use something like OpenGL
1.2's interleaved array formats?  Again though, it isn't incredibly
important so long as it's easy to read and use.

> Brecht.
> _______________________________________________
> Bf-committers mailing list
> Bf-committers at blender.org
> http://lists.blender.org/mailman/listinfo/bf-committers
