External dependencies in the renderer?

Nathaniel_J_Fries · April 17, 2013, 10:30pm

Driedfruit wrote:

I see a couple options here:

Add another function for rendering the same texture multiple times:

Code:

int SDL_RenderCopyMulti(SDL_Renderer *, SDL_Texture *, int nTimes,
const SDL_Rect *);

Just my 2 cents, but this would be very lovely to have in any case.

Aye, this would be helpful for an old idea of mine as well.
And I just realized I botched that function definition: as declared, the call is not terribly useful, since it takes only a destination rect (ditto for the other one).

Code:

int SDL_RenderCopyMulti(SDL_Renderer *, SDL_Texture *, int, const SDL_Rect *, const SDL_Rect *);

------------------------
Nate Fries
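
As a rough illustration of the corrected signature, here is a minimal sketch of a software fallback that just loops over single copies. SDL_RenderCopyMulti is only a proposal in this thread, so everything here is hypothetical: Rect, Renderer, Texture, render_copy and render_copy_multi are stand-ins that let the sketch compile without SDL, and treating a NULL source array as "whole texture" is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types so the sketch compiles without SDL; in real code these
   would be SDL_Rect, SDL_Renderer and SDL_Texture. */
typedef struct { int x, y, w, h; } Rect;
typedef struct { int copies_issued; } Renderer;   /* hypothetical */
typedef struct { int id; } Texture;               /* hypothetical */

/* Stand-in for SDL_RenderCopy: it only counts the call. */
int render_copy(Renderer *r, Texture *t, const Rect *src, const Rect *dst)
{
    assert(r != NULL && t != NULL);
    (void)src;
    (void)dst;
    r->copies_issued++;
    return 0;
}

/* Hypothetical multi-copy entry point. This fallback simply loops; a
   hardware backend could instead build one vertex array from all the
   pairs and submit it in a single draw call. srcrects may be NULL,
   meaning "whole texture" for every copy (an assumption). */
int render_copy_multi(Renderer *r, Texture *t, int count,
                      const Rect *srcrects, const Rect *dstrects)
{
    if (r == NULL || t == NULL || count < 0)
        return -1;
    if (count > 0 && dstrects == NULL)
        return -1;
    for (int i = 0; i < count; i++) {
        const Rect *src = srcrects ? &srcrects[i] : NULL;
        if (render_copy(r, t, src, &dstrects[i]) != 0)
            return -1;
    }
    return 0;
}
```

The point of such an interface would be that the caller states up front that the copies share one texture, which leaves a backend free to batch them.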

Forest_Hale · April 18, 2013, 5:24am

Z buffering does not solve problems for blended transparency, only alpha test. Blended transparency still requires sorting back to front, which totally wrecks any texture batching optimizations.

On 04/16/2013 05:48 PM, Mason Wheeler wrote:

OK, since it’s apparently not clear from my original proposal, I wasn’t
talking about sending Z coordinates to OpenGL or Direct3D in any way.
I was talking about using them on the SDL side. You’d end up with a
certain number of layers, (most 2D games draw 4 or 5 distinct layers
IME,) and each layer would have its own Z number.

Each Z layer would have its own texture-to-coordinates multimap.
When it’s time to render everything, it looks like this (pseudocode):

for each multimap in layers:
    for each texture in multimap:
        CreateCoordArrays(multimap[texture])
        SelectTexture(texture)
        RenderArrays

It’s really that simple, in concept. Everything draws on top of what
it’s supposed to draw on top of. There’s no need to send Z ordering
to the GPU. There’s no atrociously slow one-API-call-per-render.
I’ve tested it. It works, and it’s about 3x faster than the current system
on large, complicated scenes.

There are only two real downsides: 1) it requires a multimap to work
properly, which we need a library for because libc provides neither a
multimap implementation nor the fundamental primitives needed to
build one (a map and a dynamic array).
And 2) SDL_RenderCopy does not currently have a Z parameter on
it, which is needed to make layering work correctly.

Mason
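
The layered grouping described in this message could be sketched in C roughly as follows. This is not Mason's implementation (which is in Delphi and not shown in the thread). All names are hypothetical, a linear search stands in for the hash lookup of a real multimap, the number of layers is fixed, and lower Z values are assumed to draw first.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_LAYERS 8   /* assumption: a small, fixed number of Z layers */

typedef struct { int x, y, w, h; } Rect;
typedef struct { Rect src, dst; } CopyCmd;

/* Every copy queued for one texture within one layer. */
typedef struct {
    int texture;               /* hypothetical texture id */
    CopyCmd *cmds;
    size_t count, capacity;
} Bucket;

typedef struct {
    Bucket *buckets;
    size_t count, capacity;
} Layer;

typedef struct { Layer layers[MAX_LAYERS]; } BatchQueue;

void queue_init(BatchQueue *q)
{
    for (int z = 0; z < MAX_LAYERS; z++) {
        q->layers[z].buckets = NULL;
        q->layers[z].count = q->layers[z].capacity = 0;
    }
}

/* What a batched SDL_RenderCopy might do: remember the rects, draw nothing. */
int queue_copy(BatchQueue *q, int z, int texture, Rect src, Rect dst)
{
    assert(q != NULL);
    if (z < 0 || z >= MAX_LAYERS) return -1;
    Layer *layer = &q->layers[z];

    /* Linear search stands in for the hash lookup of a real multimap. */
    Bucket *b = NULL;
    for (size_t i = 0; i < layer->count && !b; i++)
        if (layer->buckets[i].texture == texture) b = &layer->buckets[i];

    if (!b) {
        if (layer->count == layer->capacity) {
            size_t cap = layer->capacity ? layer->capacity * 2 : 4;
            Bucket *p = realloc(layer->buckets, cap * sizeof *p);
            if (!p) return -1;
            layer->buckets = p;
            layer->capacity = cap;
        }
        b = &layer->buckets[layer->count++];
        b->texture = texture;
        b->cmds = NULL;
        b->count = b->capacity = 0;
    }
    if (b->count == b->capacity) {
        size_t cap = b->capacity ? b->capacity * 2 : 8;
        CopyCmd *p = realloc(b->cmds, cap * sizeof *p);
        if (!p) return -1;
        b->cmds = p;
        b->capacity = cap;
    }
    b->cmds[b->count].src = src;
    b->cmds[b->count].dst = dst;
    b->count++;
    return 0;
}

/* What a batched SDL_RenderPresent might do: walk the layers in Z order,
   one draw call per (layer, texture) bucket. Returns the batch count. */
int queue_flush(BatchQueue *q)
{
    int batches = 0;
    for (int z = 0; z < MAX_LAYERS; z++) {
        Layer *layer = &q->layers[z];
        for (size_t i = 0; i < layer->count; i++) {
            /* Real code: build coordinate arrays from buckets[i].cmds,
               select buckets[i].texture, render the arrays. */
            batches++;
            free(layer->buckets[i].cmds);
        }
        free(layer->buckets);
        layer->buckets = NULL;
        layer->count = layer->capacity = 0;
    }
    return batches;
}
```

Four copies spread over two layers and three (layer, texture) pairs would come out of queue_flush as three batches rather than four draw calls.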

From: Sik the hedgehog <sik.the.hedgehog at gmail.com>
To: SDL Development List
Sent: Tuesday, April 16, 2013 4:37 PM
Subject: Re: [SDL] External dependencies in the renderer?

The problem is that I think the idea is to use a single batch for
everything… Again, I’m not sure at all that this kind of Z ordering
is reliable in that case. The problem is that the safest way is
sending one thing at a time, i.e. one draw call per SDL function,
which is the very thing we’re trying to avoid…

Also yeah, the Z range is why I said we could run out of them. On PCs
we have a 24-bit depth buffer, OK (though somebody could still attempt
to set 16-bit, and I guess in 2D this could make sense), but on mobile
I wonder how the Z range is handled (especially on deferred renderers
as opposed to standard rasterizer ones).

And yes, OpenGL numbers textures from 1 onwards (this is true for
all objects, really), but remember you can create gaps by deleting
textures, and OpenGL will attempt to fill those if I recall correctly
(I’m not sure about the details).

2013/4/16, John <john at leafygreengames.com>:

That sounds like ordering by Z to me, no?

The GLES device vendors advise against implementing your own depth sorting
because the GPU depth test does it much faster, more efficiently, can correctly
handle overlaps, and runs in parallel with the CPU.

Also, z is floating point in transformed view coordinates, which means there may
not be many duplicate z values to group by.

Have you measured the cost of switching the active texture unit? The number of
switches that will be saved by this optimization is easy to calculate: it’s
roughly the number of primitives minus the number of textures.
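
That estimate can be checked with a few lines of C. The sketch below assumes each primitive is identified only by a hypothetical integer texture id. It counts the binds needed in submission order and the binds needed after grouping; the difference reaches "primitives minus textures" in the worst case, where no two consecutive primitives share a texture.

```c
#include <assert.h>
#include <stddef.h>

/* Texture binds needed when primitives are drawn in the given order:
   one bind whenever the texture differs from the previous primitive's. */
size_t binds_in_order(const int *textures, size_t n)
{
    assert(textures != NULL || n == 0);
    size_t binds = 0;
    for (size_t i = 0; i < n; i++)
        if (i == 0 || textures[i] != textures[i - 1])
            binds++;
    return binds;
}

/* Texture binds needed after grouping by texture: one per distinct id.
   A quadratic scan is good enough for an illustration. */
size_t binds_grouped(const int *textures, size_t n)
{
    assert(textures != NULL || n == 0);
    size_t distinct = 0;
    for (size_t i = 0; i < n; i++) {
        size_t j = 0;
        while (j < i && textures[j] != textures[i])
            j++;
        if (j == i)   /* first time this id appears */
            distinct++;
    }
    return distinct;
}
```

For the sequence 1, 2, 1, 2, 1, 3 this gives six binds in order and three after grouping, a saving of three, which is six primitives minus three textures. For a sequence that is already grouped, such as 1, 1, 2, 2, nothing is saved.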

On 04/16/2013 12:41 PM, Mason Wheeler wrote:

It’s not “ordering by Z and texture” but “grouping by Z and texture”. Every
render with a Z of 1 will get sent before every render with a Z of 2, and so
on. That’s why I said you end up with an array of multimaps.

Mason

From: Sik the hedgehog <sik.the.hedgehog at gmail.com>
To: Mason Wheeler <masonwheeler at yahoo.com>; SDL Development List <sdl at lists.libsdl.org>
Sent: Monday, April 15, 2013 10:25 PM
Subject: Re: [SDL] External dependencies in the renderer?

Is there any guarantee in OpenGL at all that primitives are drawn in
the order they appear in the buffer (which would seem inefficient)?
Otherwise ordering by Z is pretty much eventually going to break in
the future.

2013/4/16, Mason Wheeler <masonwheeler at yahoo.com>:

Not exactly. The optimization assumes that the principal rendering
bottleneck is the overhead involved in sending scene data to the
graphics card, which assumption is borne out by testing data. It
intends to minimize the number of drawing calls by delaying primitives
and sending them in batches, ordered by Z and texture.

You avoid having to cache “the entire GL state” by the simple expedient
of flushing the to-do buffer if a call comes in that changes the GL state.
All you need to keep cached is the map of textures to arrays of coordinates.
And transparency works fine as long as you have a Z parameter to order
by. Things get drawn on top of each other in the prescribed order. I’ve
been using this for a while now. The system works.
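
The "flush on state change" rule could look something like the sketch below. It is only an illustration under stated assumptions: blend mode is the single piece of state tracked, queued draws are reduced to integer texture ids, and the names DeferredRenderer, dr_flush, dr_set_blend_mode and dr_copy are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 256

typedef struct {
    int blend_mode;          /* the one piece of render state tracked here */
    int pending[QUEUE_CAP];  /* queued texture ids, standing in for full draws */
    size_t pending_count;
    int flushes;             /* how many times the queue was submitted */
} DeferredRenderer;

/* Submit everything queued so far. An empty queue costs nothing. */
void dr_flush(DeferredRenderer *r)
{
    if (r->pending_count == 0)
        return;
    /* Real code would group r->pending by texture and submit the batches. */
    r->pending_count = 0;
    r->flushes++;
}

/* A state change forces a flush, because the queued draws were issued
   under the old state. Setting the same value again keeps batching. */
void dr_set_blend_mode(DeferredRenderer *r, int mode)
{
    if (mode == r->blend_mode)
        return;
    dr_flush(r);
    r->blend_mode = mode;
}

/* Queue one draw; flush first if the fixed-size queue is full. */
void dr_copy(DeferredRenderer *r, int texture)
{
    if (r->pending_count == QUEUE_CAP)
        dr_flush(r);
    assert(r->pending_count < QUEUE_CAP);
    r->pending[r->pending_count++] = texture;
}
```

With this rule nothing needs to be cached per primitive except the draw itself, since every queued draw is known to share the current state.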

From: John <john at leafygreengames.com>
To: sdl at lists.libsdl.org
Sent: Monday, April 15, 2013 8:36 PM
Subject: Re: [SDL] External dependencies in the renderer?

Ok, so the optimization assumes that a rendering bottleneck is the cost of
switching textures, and intends to minimize the number of texture switches by
delaying primitives, then re-ordering them by texture and Z.

I’ve seen this before. It can be done, but there are caveats. The biggest
challenge is you need to cache the entire GL state for each delayed primitive.
The implementation is effectively an “intermediate mode” layer unto itself. The
layer is a massive todo buffer with three phases: queue everything, analyze
(re-order) the queue, then execute the queue as a batch. If you don’t choose the
batch size wisely, it’s possible to lose any parallelism that you might have had
when GL calls were mixed in with scene graph calls. The second challenge is to
support transparency and other effects that depend on multiple passes in a
specific order, or that play games with the z-buffer (or other tests.)

On 04/15/2013 10:41 PM, Mason Wheeler wrote:

Here’s the basic idea.

The internals of SDL’s rendering API are atrocious, to put it bluntly. It does
everything in Immediate Mode, which modern versions of OpenGL and Direct3D have
moved away from because it’s so slow. GLES doesn’t even support Immediate Mode,
so if you look at SDL’s GLES renderer, it does the closest thing it can find to
Immediate Mode, sending one call to OpenGL every time someone calls
SDL_RenderCopy.

The way to do rendering fast is to keep the number of library calls to a
minimum, and pass as much data as possible all at once in an array. Of course,
that’s not the way people use SDL; they use SDL to draw a bunch of sprites, one
at a time. So to be fast, SDL has to keep track of the bookkeeping for them.

The way to do this is with a multimap, mapping textures to lists of drawing
coordinates. You turn SDL_RenderCopy into an operation that adds a pair of
rects to a texture’s mapped list, and SDL_RenderPresent into an operation that
iterates over the multimap and for each texture, builds two arrays of vertices
(one for screen coordinates and one for texture coordinates) as buffers and
passes them to the renderer all at once.

I’ve got a Delphi implementation that sped up my rendering significantly, about
3x faster than stock SDL rendering. With a multimap in C, I could port this
concept to the SDL internals.

The one tricky thing here, the concept that my renderer has that SDL doesn’t, is
Z-order. If you’re no longer deterministically drawing in the order in which
draw calls are received, but instead grouping them by texture, which are in turn
sorted by hash order (essentially random,) you need a Z-order parameter to make
sure the right things draw on top of the right things, and what you end up with
is an array of multimaps.

I know it probably sounds very complicated, but it’s only a few hundred lines of
code (plus the implementations of the hash and the dynamic array, because C
doesn’t have them built in) and it makes rendering much faster.

Mason
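
The "two arrays of vertices" step for one texture might be sketched like this. It is an assumption-laden illustration, not code from the Delphi implementation: each rectangle becomes two triangles of six vertices, screen coordinates stay in pixels, texture coordinates are normalised to 0..1 by the texture size, and Rect, emit_quad and build_arrays are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y, w, h; } Rect;

/* Write the two triangles (6 vertices, x/y interleaved, 12 floats) that
   cover one rectangle. scale_x and scale_y convert pixels to the target
   unit. */
static void emit_quad(float *out, const Rect *r, float scale_x, float scale_y)
{
    float x0 = r->x * scale_x, x1 = (r->x + r->w) * scale_x;
    float y0 = r->y * scale_y, y1 = (r->y + r->h) * scale_y;
    const float v[12] = { x0, y0,  x1, y0,  x0, y1,
                          x1, y0,  x1, y1,  x0, y1 };
    for (int i = 0; i < 12; i++)
        out[i] = v[i];
}

/* Fill positions[] (pixels) and texcoords[] (0..1) for `count` copies of
   one texture. Each output array must hold count * 12 floats.
   Returns the number of vertices to draw. */
size_t build_arrays(const Rect *src, const Rect *dst, size_t count,
                    int tex_w, int tex_h,
                    float *positions, float *texcoords)
{
    assert(tex_w > 0 && tex_h > 0);
    for (size_t i = 0; i < count; i++) {
        emit_quad(positions + i * 12, &dst[i], 1.0f, 1.0f);
        emit_quad(texcoords + i * 12, &src[i], 1.0f / tex_w, 1.0f / tex_h);
    }
    return count * 6;
}
```

The two arrays would then be handed to the backend in one call per texture instead of one call per sprite.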

From: Ryan C. Gordon <icculus at icculus.org>
To: SDL Development List <sdl at lists.libsdl.org>
Sent: Monday, April 15, 2013 6:20 PM
Subject: Re: [SDL] External dependencies in the renderer?

On 4/15/13 2:46 PM, Mason Wheeler wrote:

Does anyone (particularly Sam and Ryan) have any objections to pulling
an external library into SDL? Because I have an idea that could
significantly improve the performance of SDL’s 3d-accelerated rendering,
but it would require a multimap. Neither SDL nor the C standard library
has a multimap implementation, but I could build one with uthash and
utarray http://troydhanson.github.io/uthash/, which are both fairly
small and BSD-licensed.

I’d rather we have a simple hashtable implementation in SDL.

What’s the plan?

–ryan.

SDL mailing list
SDL at lists.libsdl.org
http://lists.libsdl.org/listinfo.cgi/sdl-libsdl.org

–
LordHavoc
Author of DarkPlaces Quake1 engine - LadyHavoc's DarkPlaces Quake Modification
Co-designer of Nexuiz - Nexuiz Classic – Alientrap
“War does not prove who is right, it proves who is left.” - Unknown
“Any sufficiently advanced technology is indistinguishable from a rigged demo.” - James Klass
“A game is a series of interesting choices.” - Sid Meier

Sik_the_hedgehog · April 18, 2013, 5:34am

But he insists it’s still a huge speed up.

But yeah, I was thinking, and it’s very likely a lot of programmers
will just leave blending turned on, and unless you keep track of all
the pixels in the texture or something you’ll have to assume
translucency conflicts can happen. The only “easy” workaround would be
that dirty rectangles-like suggestion from earlier.

Of course there’s also the question about how much overlap is between
each draw (i.e. how much you draw on top of what’s already drawn).
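
One way to make the overlap question concrete is a rectangle test: a draw can be moved earlier in the queue without changing the picture if it overlaps none of the draws it jumps over. The sketch below is only an assumption about how such a check might look; Rect, rects_overlap and can_hoist are hypothetical names, and it considers destination rectangles only.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { int x, y, w, h; } Rect;

/* True if the two rectangles share at least one pixel. Rectangles that
   merely touch along an edge do not overlap. */
bool rects_overlap(const Rect *a, const Rect *b)
{
    return a->x < b->x + b->w && b->x < a->x + a->w &&
           a->y < b->y + b->h && b->y < a->y + a->h;
}

/* True if moving draw `from` earlier, to position `to`, cannot change the
   result, because it overlaps none of the draws it would jump over. */
bool can_hoist(const Rect *dst, int to, int from)
{
    assert(0 <= to && to <= from);
    for (int i = to; i < from; i++)
        if (rects_overlap(&dst[i], &dst[from]))
            return false;
    return true;
}
```

The check is conservative: it says nothing about whether overlapping draws are actually translucent, so it can refuse reorderings that would in fact be harmless.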

Mason_Wheeler · April 18, 2013, 4:44pm

facepalm

Did you not read what I just wrote?

Did you seriously not read it at all? I just got through explaining that my
implementation DOES NOT USE Z-BUFFERING; that it does work by
sorting back to front, that I’ve been using it and timed it and it speeds
things up by a factor of 3 on large scenes?

And now you reply and say “no, you can’t use Z-buffering; you need to
sort back to front to avoid screwing up blended transparency, and you
can’t do that or it will wreck the performance gains”?!?
Seriously?

Mason

Nathaniel_J_Fries · April 18, 2013, 10:12pm

The terminology disagreement is due to the same term (“z”) being seen in two different contexts.

Z layering (or Z ordering) is a technique in 2D rendering used to separate layers. Any item with the same layer (“z”) can be rendered in any order, and the resulting graphical output would be the same (or at least close enough for the programmer’s needs). Most 2D games use this technique (either explicitly or simply by a proper ordering of draw operations) in order to prevent a ground tile from being rendered on top of the player and other such issues.

Z buffering is a technique in 3D rendering (better termed depth buffering) that allows the programmer to define the depth of objects, which is often used by hardware to cull the scene.

Translucency is an issue for depth buffering, since an opaque texture drawn behind a translucent texture will be culled.
It is not usually an issue for Z layering (unless this is implemented using the hardware’s depth buffer), since culling is not the purpose (render order is).

What Mason is suggesting is Z layering, and not Z buffering, which means that nothing is culled.

------------------------
Nate Fries
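
Why draw order matters for translucent sprites at all can be shown with one colour channel and the usual "source over" blend, result = src * alpha + dst * (1 - alpha). The function name blend_over is made up for this sketch, and the values are chosen so the arithmetic is exact in floating point.

```c
#include <assert.h>

/* Standard "source over" blend for one colour channel, values in 0..1. */
float blend_over(float dst, float src, float alpha)
{
    assert(alpha >= 0.0f && alpha <= 1.0f);
    return src * alpha + dst * (1.0f - alpha);
}
```

Take a black background (0.0), a white sprite A (1.0) and a black sprite B (0.0), both at 50% alpha. Drawing A and then B gives 0.5 after A and 0.25 after B. Drawing B and then A gives 0.0 after B and 0.5 after A. The two orders produce different pixels, which is why overlapping translucent draws have to keep their relative order, whether that order comes from layers or from the sequence of calls.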

Mason_Wheeler · April 18, 2013, 10:22pm

Yes, that’s correct.

Mason

Nathaniel_J_Fries · April 19, 2013, 12:56am

Despite understanding what you’re referring to, I don’t like the idea of having z-ordering in the SDL core library.

However, the problem you identified is fairly important. Although SDL is a generic solution, and generic solutions can never be as optimal as hand-tailored ones, by not providing a means of optimizing for the case of order-independent rendering of the same texture (which is quite common in 2D games, especially if spritesheets are used), SDL unnecessarily reduces framerates in OpenGL and Direct3D.

I would again refer to the notion of a simple function to perform this, along the lines of SDL_RenderCopyMulti (or SDL_RenderNCopy, etc).

------------------------
Nate Fries

Mason_Wheeler · April 19, 2013, 2:20am

The problem with that is that it forces the developer to do essentially the same thing I’m proposing, just on their end.

If you have a scene with a bunch of sprites in it, they’re most likely not ordered by texture, and certainly not grouped by texture. That’s not a natural way to set it up, and not something someone’s going to do unless they’re specifically trying to do what I’m trying to do here. Which means that at draw time, at some point, someone somewhere has to translate the list of what’s being drawn into some sort of structure that’s grouped by texture, such as a multimap.

As long as “group by texture” has to be done one way or another in order to get the performance benefits we’re talking about here, why force it to be outside of the API and require every developer to reinvent the wheel? That’s what libraries are for, isn’t it?

Mason

Sik_the_hedgehog · April 19, 2013, 3:17am

First of all, before we continue: does anything in the OpenGL or
Direct3D specs say that primitives are guaranteed to be rendered in
the same order as they’re sent? Because otherwise I really doubt
that ordering is going to work reliably. It may work on some systems
but break miserably on others.

(and that reminds me: we’ll need patches for all hardware-accelerated
renderers if we want to accept this method :P)

Mason_Wheeler · April 19, 2013, 5:53am

Back to this again?

Do you understand the difference between ordering and grouping?

Mason

John6 · April 19, 2013, 2:57pm

Yes. Otherwise it’d be impossible to composite anything reliably without
flushing after every primitive.

On 04/18/2013 11:17 PM, Sik the hedgehog wrote:

First of all, before we continue: does anywhere in the OpenGL or
Direct3D specs say that primitives are guaranteed to be rendered in
the same order as they’re sent?

John6 · April 19, 2013, 3:10pm

We all understand the difference. You have proposed to re-order primitives
according to their texture, and that is why we are discussing “ordering”.

On 04/19/2013 01:53 AM, Mason Wheeler wrote:

Back to this again?

Do you understand the difference between ordering and grouping?

Mason

Nathaniel_J_Fries · April 19, 2013, 4:00pm

Mason Wheeler wrote:

The problem with that is that it forces the developer to do essentially the same thing I’m proposing, just on their end.

If you have a scene with a bunch of sprites in it, they’re most likely not ordered by texture, and certainly not grouped by texture. That’s not a natural way to set it up, and not something someone’s going to do unless they’re specifically trying to do what I’m trying to do here. Which means that at draw time, at some point, someone somewhere has to translate the list of what’s being drawn into some sort of structure that’s grouped by texture–such as a multimap.

As long as “group by texture” has to be done one way or another in order to get the performance benefits we’re talking about here, why force it to be outside of the API and require every developer to reinvent the wheel? That’s what libraries are for, isn’t it?

The programmer would need to implement ordering on top of the library anyway in order to feed the graphics to SDL in the right order. And whatever APIs SDL could expose to access the underlying ordering structure, many programmers would probably find less preferable to the way they’ve always done it. So you have a situation of redundant ordering.

Grouping is what is actually needed, but grouping without considering order effectively negates order, so SDL must either group and order or do neither.

I proposed earlier a spritebatch mechanism for SDL which would do all this internally, but it was suggested that this be an extension library. However, to implement even that in a non-hackish manner, SDL would still need to provide an interface for rendering the same texture multiple times.

------------------------
Nate Fries

Forest_Hale · April 19, 2013, 10:44pm

I did read what you said, but my interpretation was that you wanted to blast all primitives out sequentially by texture without regard to Z layer, and then have the Z buffer hardware sort it out.
This interpretation was incorrect; I apologize.

Is there a reason to use a multimap rather than a radix sort? Presumably a sort key consisting of several bytes of state (Z layer, blendfunc, texture) would achieve the desired results.

Jared_Maddox · April 20, 2013, 2:24am

On Thu, 18 Apr 2013 19:20:36 -0700 (PDT), Mason Wheeler wrote:

The problem with that is that it forces the developer to do essentially the
same thing I’m proposing, just on their end.

It’s still perfectly fine. Any language that supports C’s qsort, C++'s
std::map, or any half-way similar functionality, will by definition
provide all of the primitives needed to actually implement a solution
to this. It’s not a big deal once they recognize that they need a
sorted data structure. If SDL provided a generic C-language tree
implementation then it would certainly be more convenient to everyone,
but that’s a minor thing.
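
A minimal sketch of the qsort route, assuming the packed sort key (Z layer, blend mode, texture) suggested earlier in the thread. The `DrawCmd` record, its field widths, and the function names are hypothetical and not part of SDL; a real renderer would have to map each `SDL_Texture *` to a small id first. Note that sorting by texture inside a layer reorders draws within that layer, which is only acceptable if the layer is the unit of ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical draw record: none of these names come from SDL. */
typedef struct {
    uint8_t  layer;     /* Z layer, 0 is drawn first (furthest back) */
    uint8_t  blend;     /* blend mode id */
    uint16_t texture;   /* texture id (assumed to fit in 16 bits) */
    uint32_t sequence;  /* submission order, used as a tie-breaker */
    int x, y, w, h;     /* destination rectangle */
} DrawCmd;

/* Pack the state into one integer so that comparing keys compares
 * the layer first, then the blend mode, then the texture. */
static uint32_t sort_key(const DrawCmd *c)
{
    return ((uint32_t)c->layer << 24) | ((uint32_t)c->blend << 16) | c->texture;
}

static int compare_cmds(const void *a, const void *b)
{
    const DrawCmd *ca = a, *cb = b;
    uint32_t ka = sort_key(ca), kb = sort_key(cb);
    if (ka != kb) return ka < kb ? -1 : 1;
    /* qsort is not guaranteed stable, so fall back to submission order. */
    if (ca->sequence != cb->sequence) return ca->sequence < cb->sequence ? -1 : 1;
    return 0;
}

/* Sort the commands in place and return the number of batches,
 * where a batch is a run of commands sharing the same key. */
static size_t sort_into_batches(DrawCmd *cmds, size_t count)
{
    assert(cmds != NULL || count == 0);
    if (count == 0) return 0;
    qsort(cmds, count, sizeof *cmds, compare_cmds);
    size_t batches = 1;
    for (size_t i = 1; i < count; i++)
        if (sort_key(&cmds[i]) != sort_key(&cmds[i - 1])) batches++;
    return batches;
}
```

After the sort, each run of equal keys could be handed to the backend as one draw; five commands using three distinct (layer, texture) pairs collapse into three batches.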

If you have a scene with a bunch of sprites in it, they’re most likely not
ordered by texture, and certainly not grouped by texture. That’s not a
natural way to set it up, and not something someone’s going to do unless
they’re specifically trying to do what I’m trying to do here. Which means
that at draw time, at some point, someone somewhere has to translate the
list of what’s being drawn into some sort of structure that’s grouped by
texture, such as a multimap.

I’ve written 3D code that does the job. Depending on the complexity of
your modelling, you can do this with C++’s standard containers in as
little as ~50 lines of code (and that’s a very rough ballpark estimate; I
use a lot of whitespace in my code).

As long as “group by texture” has to be done one way or another in order to
get the performance benefits we’re talking about here, why force it to be
outside of the API and require every developer to reinvent the wheel?
That’s what libraries are for, isn’t it?

You know, perhaps I’m confusing you with someone else, but I seem to
remember you wanting to rip OUT portions of SDL. Now you’re trying to
add in parts that the rest of us consider only partially appropriate,
DESPITE already having been told that it requires a forbidden API
change?

On Fri, 19 Apr 2013 09:00:09 -0700, Nathaniel J Fries wrote:

Grouping is what is actually needed, but grouping without considering order
effectively negates order, so SDL must either group and order or do
neither.

I proposed earlier a spritebatch mechanism for SDL which would do all this
internally, but it was suggested that this be an extension library; however
to even implement that in a non-hackish manner, SDL would still need to
provide an interface for rendering the same texture multiple times.

For that matter, if the “multi-render” function were added then that
would be enough for my “buffering-render” suggestion to be implemented
with an external library. It’s a really straightforward optimization,
and doesn’t need to break the API.
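
To pin down what a “multi-render” call would have to guarantee, here is a rough reference sketch. It assumes the proposed signature from earlier in the thread (a texture, a count, and parallel source and destination rect arrays) and defines the result as equivalent to that many single copies in order. `Rect`, `CopyFn`, `render_copy_multi`, and the stub backend are hypothetical stand-ins so the example compiles without SDL; a real implementation would presumably submit the whole array in one backend draw rather than loop.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y, w, h; } Rect;   /* stand-in for SDL_Rect */

/* Stand-in for a single-copy call in the style of SDL_RenderCopy:
 * returns 0 on success and a negative value on error. */
typedef int (*CopyFn)(void *renderer, void *texture,
                      const Rect *src, const Rect *dst);

/* Hypothetical multi-copy: same result as `count` single copies, in order.
 * A NULL `src` array means "use the whole texture" for every copy. */
static int render_copy_multi(CopyFn copy, void *renderer, void *texture,
                             int count, const Rect *src, const Rect *dst)
{
    assert(copy != NULL && count >= 0);
    assert(dst != NULL || count == 0);
    for (int i = 0; i < count; i++) {
        int rc = copy(renderer, texture, src ? &src[i] : NULL, &dst[i]);
        if (rc < 0) return rc;   /* stop at the first failure */
    }
    return 0;
}

/* Stub backend for trying the sketch out: it only records what it was asked to draw. */
typedef struct { int calls; int last_x; } StubRenderer;

static int stub_copy(void *renderer, void *texture,
                     const Rect *src, const Rect *dst)
{
    StubRenderer *r = renderer;
    (void)texture; (void)src;
    r->calls++;
    r->last_x = dst->x;
    return 0;
}
```

An external batching library could be written against this contract today using the loop, and would pick up the speedup if the call were later implemented as a single backend submission.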

Nathaniel_J_Fries · April 20, 2013, 2:32pm

Jared Maddox wrote:

Grouping is what is actually needed, but grouping without considering order
effectively negates order, so SDL must either group and order or do
neither.

I proposed earlier a spritebatch mechanism for SDL which would do all this
internally, but it was suggested that this be an extension library; however
to even implement that in a non-hackish manner, SDL would still need to
provide an interface for rendering the same texture multiple times.

For that matter, if the “multi-render” function were added then that
would be enough for my “buffering-render” suggestion to be implemented
with an external library. It’s a really straightforward optimization,
and doesn’t need to break the API.

Honestly, I’m not a fan of the spritebatch mechanism; I’d rather group them myself and call a multi-rendering function.
However, the spritebatch mechanism already has real-world uses (in fact, it is the simplest method for rendering 2D graphics in Microsoft’s XNA), whereas a simple multi-copy function does not, AFAIK.

------------------------
Nate Fries

Pallav_Nawani · April 22, 2013, 7:36am

If Mason wants to implement a portable, performance-improving
optimization in the SDL renderer pipeline,
I totally don’t see a problem with it.

Some may not want an external Hash Table implementation - okay, but I
don’t see why SDL shouldn’t be rendering stuff faster than it already is.
I don’t understand the opposition - at all.

If it doesn’t work - well, that’s what source control is for.

–
Pallav Nawani
Game Designer/CEO

Twitter: x.com
Facebook: Ironcode Gaming

Scott_Percival · April 22, 2013, 10:25am

There’s no opposition against the idea of faster draw calls. What you’re
seeing here is interest in finding the best way to implement the
optimisation, both for the developers who’ll use it and the SDL maintainers.

The renderer API was designed to use the unbatched painter’s algorithm
approach to blitting. As discussed, it’s non-trivial to cache a bunch of
these draw calls when there’s zero guarantee that the state will remain the
same between each one. The worst case is that you’ll end up breaking a pile
of software which relies on the expectation that blits will happen
immediately after calling SDL_RenderCopy. Hence the discussion about adding
a new batch rendering method alongside the old one.
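
One way to picture a separate batch path that still respects painter’s order is a small buffer that queues copies only while the texture stays the same and flushes as soon as it changes. The sketch below is an illustration under assumptions, not SDL code: `CopyBuffer`, `MultiCopyFn`, and the logging stub are hypothetical, and only the texture is tracked as state (blend mode, clip rect, and render target would need the same treatment).

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_CAPACITY 64

typedef struct { int x, y, w, h; } Rect;   /* stand-in for SDL_Rect */

/* Stand-in for a multi-copy backend call; no such function exists in SDL. */
typedef int (*MultiCopyFn)(void *userdata, const void *texture, int count,
                           const Rect *src, const Rect *dst);

typedef struct {
    MultiCopyFn multi_copy;
    void       *userdata;
    const void *texture;              /* texture of the pending batch */
    int         count;                /* number of queued copies */
    Rect        src[BATCH_CAPACITY];
    Rect        dst[BATCH_CAPACITY];
} CopyBuffer;

static void buffer_init(CopyBuffer *b, MultiCopyFn fn, void *userdata)
{
    assert(b != NULL && fn != NULL);
    b->multi_copy = fn;
    b->userdata = userdata;
    b->texture = NULL;
    b->count = 0;
}

/* Submit everything queued so far as one multi-copy call. */
static int buffer_flush(CopyBuffer *b)
{
    int rc = 0;
    if (b->count > 0)
        rc = b->multi_copy(b->userdata, b->texture, b->count, b->src, b->dst);
    b->count = 0;
    b->texture = NULL;
    return rc;
}

/* Queue one copy. The pending batch is flushed first when the texture
 * changes or the buffer is full, so the on-screen order never changes. */
static int buffer_copy(CopyBuffer *b, const void *texture, Rect src, Rect dst)
{
    int rc = 0;
    if (b->count > 0 && (texture != b->texture || b->count == BATCH_CAPACITY))
        rc = buffer_flush(b);
    b->texture = texture;
    b->src[b->count] = src;
    b->dst[b->count] = dst;
    b->count++;
    return rc;
}

/* Logging stub for trying the sketch out: counts flushes and rects. */
typedef struct { int calls; int rects; } FlushLog;

static int log_multi_copy(void *userdata, const void *texture, int count,
                          const Rect *src, const Rect *dst)
{
    FlushLog *log = userdata;
    (void)texture; (void)src; (void)dst;
    log->calls++;
    log->rects += count;
    return 0;
}
```

The trade-off is visible in the flush rule: a scene that alternates textures on every sprite gains nothing, which is why grouping or sorting by texture comes up at all. Code that relies on a blit having happened by the time the call returns would also need an explicit `buffer_flush` before reading back or presenting.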

The SDL 2.0 API has been frozen, and there is released software using this
API; now is exactly the wrong time to be cavalier about breaking things.