Hi guys, thanks.
I guess the general rule is then “IF low level pixel manipulation in
software is insufficient THEN go for OpenGL shaders EVEN if it’s 2D art”…
right? =)
Yeah, that’s basically it.
Actually, with anything but ancient hardware or the occasional oddball shared
memory solution, you’re probably better off using OpenGL (or Direct3D, if you
really need maximum compatibility on that other OS) for presenting your
software “frame buffer”, even if you’re doing 100% software rendering. Even
uploading and blitting the whole screen every frame tends to be faster than
any tricks you can play with a 2D API these days. Well, unless you only need
to update a small fraction of the screen area every frame, of course.
It’s the busmaster DMA transfers from system memory to VRAM you need, and old
drivers for obsolete APIs usually don’t provide those. Some do (DGA and
DirectDraw, I think), but they still seem to be a lot slower than the 3D APIs
for some reason.
[…]
possibly quick enough that your time is better spent optimizing your 24/32
bpp code than supporting multiple pixel formats.
sure, but I meant that, even if I supported more formats, say for saving
space, it wouldn’t save space in the actual (V)RAM, since it would have to
be extracted anyway before you can mess with the pixels… thus it’s not of
much use anyway, as you say =) only perhaps for saving loading time? (in
case decompression is quicker than reading from disk, which it is in most
cases, right?)
That depends… You can reduce the bandwidth significantly by using a 16 bpp
framebuffer instead of a 24/32 bpp one, and that makes a big difference if
you’re using the CPU to push pixels into VRAM.
Obviously, it still makes a difference when using texture uploading with
OpenGL or Direct3D, though if the driver implements proper DMA transfers,
you’ll usually have all the bandwidth you need and then some anyway.
[…]
maybe it’s a better idea to just implement your software rendering as
pixel shaders under OpenGL or Direct3D?
Hmm… then the rule is, IF it becomes an issue, go for OpenGL (yes, I’m
prejudiced against DirectX lol),
Right; as far as I’m concerned, Direct3D is just a few hundred euros worth of
extra work for nothing, theoretically. It adds nothing but more code to
maintain.
However, massive forces are still pushing Direct3D, and over at the indiegamer
forums, successful developers are still warning about OpenGL on Windows, at
least for casual games. I’m still not sure what the current situation actually
is like, but considering the rather non-casual nature of the game I’m working
on right now, I suspect my time is better spent on other things. If you’re
into puzzle games and the like, you might come to a different conclusion.
[…]
but, as you mentioned shaders… how
precise are they? I mean, can I go to the level of the individual pixel,
or will it invariably mess with my logic, such as in interpolation, etc…
Pixel shaders operate at the framebuffer pixel level (anything else would be a
terrible waste of bandwidth), so you should have full control.
less importantly (just curious =) in case you know), can 2D only art
bypass the 3D only steps of the pipeline in modern graphic cards?
Yes and no. You can “blit” directly to the screen, but that’s an absolute last
resort. If it’s not properly optimized in the driver (DMA), it’ll be horribly
slow, and perhaps more importantly, it requires “hard sync” of the GPU and
CPU, which is a total waste of good cycles on both sides.
Basically: Don’t do that!
indicated by its pixelformat field. That’s all there is to it, pretty
much.
Thus I could create a line, efficiently, the ol’way by editing pixels
inside a surface and only THEN pasting that surface to the screen?
Theoretically, yes, but in the general case, that doesn’t exactly seem like
the most efficient way of doing it. Indeed, if you’re running massive particle
effects, they might be better off rendered as Wu-pixels into an RGBA OpenGL
texture - or why not an RGB texture with additive blending?
If you’re using the SDL 2D API, though, you’re probably better off doing all
rendering in a shadow surface. SDL 1.2 alpha blending is all software, and as
such, is a very bad idea to use directly in VRAM in the general case. (Reads
from VRAM tend to be many times slower than writes - and writes are pretty
slow already.)
(I don’t
remember if we have direct access to the screen surface in SDL)
You do, but see above, about reading VRAM in particular.
[…]
However, keep in mind that many of the methods and algorithms used for
"traditional" software rendering, and the related optimization tricks,
aren’t really up to date with how modern CPUs and computers work. And,
hardware accelerated rendering via high level APIs is a different beast
entirely - except for the pixel shader bit.
I thought we were comparing shader with high level APIs (:S)
Uhm, well… GPUs are another beast entirely yet again - massive arrays of
small cores running in parallel.
Anyway, I was rather thinking about old “standard” solutions like using look-
up tables to avoid expensive operations and that sort of stuff. The thing is,
in the past, multiplication, division, floating point math and various other
things were anything from “expensive” (10x slower than an addition or so) to
“uselessly expensive” (hundreds or thousands of cycles per calculation),
whereas these days, most of those operations are single cycle, while running
out of cache memory can cost tens or hundreds of cycles per access, making
LUTs viable only for really expensive operations.
All that said, the usual rules apply; optimize on the highest levels first,
reducing the work you actually have to do, and then benchmark and tune as
needed for your actual target platforms.

On Friday 05 November 2010, at 00.01.57, Marcos Marin wrote:
–
//David Olofson - Consultant, Developer, Artist, Open Source Advocate
.— Games, examples, libraries, scripting, sound, music, graphics —.
| http://olofson.net http://olofsonarcade.com http://kobodeluxe.com |
’---------------------------------------------------------------------’