Optimisation strategy for background layers, i.e. large textures

Hi there -
quick question:
for large images (presuming they fit within a given renderer's max texture dimensions), would you get better performance
(a) from breaking the image down into screen-dimension-sized fragments and blitting those to the screen as needed?
or
(b) keeping the image as a single image, blitting the whole thing and letting the renderer sort out which areas are actually displayed?
or
(c) keeping the image as a single image, and doing your own cropped blit to the screen?

I'm guessing the answer is both renderer- and driver-dependent, and possibly device-dependent, but I'd like best guesses here.
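For concreteness, here is a minimal SDL2 sketch (in C) of what options (b) and (c) look like as draw calls. The names used (background, scroll_x, screen_w and so on) are illustrative assumptions, not anything from the posts below.

```c
#include <SDL.h>

/* Draw a large background texture, either by submitting the whole thing (b)
   or by cropping to the visible window ourselves (c). */
void draw_background(SDL_Renderer *renderer, SDL_Texture *background,
                     int scroll_x, int screen_w, int screen_h, int crop)
{
    int bg_w, bg_h;
    SDL_QueryTexture(background, NULL, NULL, &bg_w, &bg_h);

    if (!crop) {
        /* (b): whole texture, shifted by the scroll offset; the off-screen
           part is clipped by the renderer/driver. */
        SDL_Rect dst = { -scroll_x, 0, bg_w, bg_h };
        SDL_RenderCopy(renderer, background, NULL, &dst);
    } else {
        /* (c): application-side crop via a source rect, copying only the
           screen-sized window that is currently visible. */
        SDL_Rect src = { scroll_x, 0, screen_w, screen_h };
        SDL_Rect dst = { 0, 0, screen_w, screen_h };
        SDL_RenderCopy(renderer, background, &src, &dst);
    }
}
```

With no scaling involved, both calls end up sampling the same visible texels; the difference is only who computes the crop.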

Well, I remember glSDL (an implementation of the SDL 1.2 2D API over
OpenGL) performing a lot better with 256x256 textures for mysterious
reasons, despite all the glSDL tiling overhead - but that was about a
decade ago! I suspect those cards didn't actually support anything
above 256x256 in hardware, so it might just have been that glSDL's
tiling (trivial) was much faster than tiling in the GL driver (rather
complex)…
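For reference, option (a) style tiling is also straightforward to do yourself with SDL2. A rough sketch, assuming SDL 2.0.5+ (for SDL_CreateRGBSurfaceWithFormat), an already-loaded SDL_Surface, and 256x256 purely as an example tile size; error handling omitted:

```c
#include <SDL.h>

#define TILE 256  /* example tile size only */

/* Slice a large surface into TILE x TILE textures, row-major. Edge tiles may
   be smaller than TILE. Returns a heap-allocated array of cols*rows textures. */
SDL_Texture **make_tiles(SDL_Renderer *renderer, SDL_Surface *surface,
                         int *cols_out, int *rows_out)
{
    int cols = (surface->w + TILE - 1) / TILE;
    int rows = (surface->h + TILE - 1) / TILE;
    SDL_Texture **tiles = SDL_malloc(sizeof *tiles * cols * rows);

    /* Copy pixels verbatim instead of alpha-blending them into the pieces. */
    SDL_SetSurfaceBlendMode(surface, SDL_BLENDMODE_NONE);

    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            SDL_Rect src = { c * TILE, r * TILE, TILE, TILE };
            if (src.x + src.w > surface->w) src.w = surface->w - src.x;
            if (src.y + src.h > surface->h) src.h = surface->h - src.y;

            SDL_Surface *piece = SDL_CreateRGBSurfaceWithFormat(
                0, src.w, src.h, 32, SDL_PIXELFORMAT_RGBA32);
            SDL_BlitSurface(surface, &src, piece, NULL);
            tiles[r * cols + c] = SDL_CreateTextureFromSurface(renderer, piece);
            SDL_FreeSurface(piece);
        }
    }

    *cols_out = cols;
    *rows_out = rows;
    return tiles;
}
```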

More generally speaking, whether it's SDL clipping rectangles or
GL/D3D clipping polygons, those are trivial operations. No off-screen
pixels (or texels) are even looked at, so any attempt at eliminating
them at the application level is most likely just going to double the
(minor) clipping/culling overhead that's invariably in SDL and/or the
driver already.

Now, if you have something like hundreds or thousands of tiles or
sprites on a scrolling map, culling the off-screen ones as early as
possible in your engine can obviously be a big win, as you completely
avoid a lot of code and API call overhead. That's a different issue,
though.
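For illustration, a sketch of that kind of early culling for a plain fixed-size tile grid; everything here (cam_x, map, tiles and friends) is a hypothetical engine-side layout, not an SDL API:

```c
#include <SDL.h>

#define TILE_SIZE 64  /* example tile size only */

/* Draw only the tiles that can intersect the screen; off-screen tiles are
   skipped before any SDL call is made. `map` is a row-major grid of tile
   indices into `tiles`. */
void draw_visible_tiles(SDL_Renderer *renderer, SDL_Texture **tiles,
                        const int *map, int map_w, int map_h,
                        int cam_x, int cam_y, int screen_w, int screen_h)
{
    int first_col = cam_x / TILE_SIZE;
    int first_row = cam_y / TILE_SIZE;
    int last_col  = (cam_x + screen_w - 1) / TILE_SIZE;
    int last_row  = (cam_y + screen_h - 1) / TILE_SIZE;

    if (first_col < 0) first_col = 0;
    if (first_row < 0) first_row = 0;
    if (last_col >= map_w) last_col = map_w - 1;
    if (last_row >= map_h) last_row = map_h - 1;

    for (int row = first_row; row <= last_row; ++row) {
        for (int col = first_col; col <= last_col; ++col) {
            SDL_Rect dst = { col * TILE_SIZE - cam_x,
                             row * TILE_SIZE - cam_y,
                             TILE_SIZE, TILE_SIZE };
            SDL_RenderCopy(renderer, tiles[map[row * map_w + col]], NULL, &dst);
        }
    }
}
```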



//David Olofson - Consultant, Developer, Artist, Open Source Advocate
Games, examples, libraries, scripting, sound, music, graphics
http://consulting.olofson.net | http://olofsonarcade.com

I vote (b) if the image is on the card, i.e. in GPU memory, but I could be wrong. Profiles/demos don't lie for a given use case.
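In that spirit, a rough way to profile the two variants with SDL2's performance counters. This leans on the hypothetical draw_background() sketch earlier in the thread, and the renderer should be created without SDL_RENDERER_PRESENTVSYNC or the swap interval will dominate the numbers:

```c
#include <SDL.h>

/* Hypothetical helper from the earlier sketch: crop = 0 draws the whole
   texture (b), crop = 1 uses an application-side source rect (c). */
void draw_background(SDL_Renderer *renderer, SDL_Texture *background,
                     int scroll_x, int screen_w, int screen_h, int crop);

/* Render `frames` frames of one variant while scrolling across the
   background, and return the elapsed wall time in seconds. */
double time_variant(SDL_Renderer *renderer, SDL_Texture *background,
                    int screen_w, int screen_h, int crop, int frames)
{
    int bg_w, bg_h;
    SDL_QueryTexture(background, NULL, NULL, &bg_w, &bg_h);

    Uint64 start = SDL_GetPerformanceCounter();
    for (int i = 0; i < frames; ++i) {
        SDL_RenderClear(renderer);
        /* Scroll a few pixels per frame, wrapping before the right edge. */
        draw_background(renderer, background, (i * 4) % (bg_w - screen_w),
                        screen_w, screen_h, crop);
        SDL_RenderPresent(renderer);
    }
    Uint64 end = SDL_GetPerformanceCounter();
    return (double)(end - start) / (double)SDL_GetPerformanceFrequency();
}
```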




Okay, that's a couple of definitive votes for (b). @David: yes, about a decade ago 256x256 would have been a fairly common max video hardware texture size.

Let's try a second question. Assuming a max texture size of 8192x8192 (fairly common nowadays) and an 800x600 screen size, let's say we're holding a background texture in video memory at 8192x600. Scrolling along, scrolling along…
At any point, does it become more performant to SDL_RenderCopy using a source rect rather than just blitting the entire background?


Well, yes, but that’s a different issue. These cards (or at least, the
drivers) explicitly supported much larger textures. (1024x1024 IIRC.)
They just did it with a significant performance penalty, for no
obvious reason. :)


As for the second question: there may be a slight impact on the GPU
for accessing small chunks of texture memory over a large address
range, but I suspect this would just barely be measurable unless you
start rotating or scaling so that you're actually touching more memory.

Either way, it's not going to make a difference how you tell the GPU
about the operation. It still ends up reading only the texels it needs
to render the visible area into the frame buffer. Clipping is done at
a much higher level, regardless of which API and calls you use.


//David Olofson - Consultant, Developer, Artist, Open Source Advocate
Games, examples, libraries, scripting, sound, music, graphics
http://consulting.olofson.net | http://olofsonarcade.com

If you're referring to early 3Dfx Voodoo cards (before the Voodoo 4, I believe), they only had hardware to fetch texture data from textures up to 256x256. They also placed limits on the aspect ratio
of the texture (the most extreme supported was 8:1 or 1:8, so that means 256x32 or 32x256).

If you used larger texture sizes, they had to split the polygons in software along texture coordinate boundaries and address the texture data in memory specially (separating it into several separate
images), as the hardware simply lacked the bits in its texture fetch engine to do anything beyond 256x256.

They also had only a 16-bit depth buffer (though it was floating point), and used a clever 4x4 ordered dither pattern when rendering, with special logic in the raster scanout that would turn the
4x4-dithered 16-bit color back into 24-bit color as best it could. This would fall apart in certain cases of stacked transparent polygons, however.



LordHavoc
Author of DarkPlaces Quake1 engine - http://icculus.org/twilight/darkplaces
Co-designer of Nexuiz - http://alientrap.org/nexuiz
"War does not prove who is right, it proves who is left." - Unknown
"Any sufficiently advanced technology is indistinguishable from a rigged demo." - James Klass
"A game is a series of interesting choices." - Sid Meier


Oh hey LordHavoc, haven't talked to you since the early days doing Epsilon.
Thanks for the contribution; it was my suspicion that in David's case they were splitting it up in software.