Article:Animation in SDL

Animation in SDL:
http://linux.oreillynet.com/pub/a/linux/2003/05/15/sdl_anim.html?page=last#thread

What can I say, you put articles into the pipeline and eventually they
get published. I didn’t expect to get two out in one week. At least this
one is on the web where we can all see it.

	Bob Pendleton

+-----------------------------------+
+ Bob Pendleton: independent writer +
+ and programmer.                   +
+ email: Bob at Pendleton.com       +
+-----------------------------------+

I just found this over there this morning. However, I noticed a
reference to the source file but I did not see the link to the file
itself. Am I missing it?

Jason

----- Original Message -----

From: sdl-admin@libsdl.org [mailto:sdl-admin at libsdl.org] On Behalf Of
Bob Pendleton
Sent: Friday, May 16, 2003 2:08 PM
To: Gameprogrammer Mailing List; SDL Mailing List; ALG Mailing List
Subject: [SDL] Article:Animation in SDL


SDL mailing list
SDL at libsdl.org
http://www.libsdl.org/mailman/listinfo/sdl

Wow Bob, congratulations, that's rad! :)

----- Original Message -----

From: bob@pendleton.com (Bob Pendleton)
To: "Gameprogrammer Mailing List"; "SDL Mailing List"; "ALG Mailing List"
Sent: Friday, May 16, 2003 12:08 PM
Subject: [SDL] Article:Animation in SDL


I think you can make the 24bpp line drawing version a little less slow
by copying a short and a char instead of calling memcpy().

Lic. Gabriel Gambetta
ARTech - GeneXus Development Team
ggambett at artech.com.uy

----- Original Message -----

From: Bob Pendleton [mailto:bob@pendleton.com]
Sent: Friday, May 16, 2003 4:08 PM
To: Gameprogrammer Mailing List; SDL Mailing List; ALG Mailing List
Subject: [SDL] Article:Animation in SDL


I just found this over there this morning. However, I noticed a
reference to the source file but I did not see the link to the file
itself. Am I missing it?

No, it was missing. It is there now.

	Bob Pendleton

On Fri, 2003-05-16 at 14:17, Jason Brunson wrote:

Jason


I think you can make the 24bpp line drawing version a little less slow
by copying a short and a char instead of calling memcpy().

GCC, and pretty much any modern C/C++ compiler, expands memcpy() inline.
The code it generates is really good, especially when there is a
constant byte count. That is, there is no call to memcpy(), just three
byte stores in a row. Unless, of course, you turn off optimization.

The alternative that you suggest requires code for the case when the
pixel starts on an even byte address and code for when the pixel starts
on an odd byte address. You have to select the correct store pattern
either short-char or char-short. By the time you have done the test and
branched to the correct pattern you have lost any advantage over just
storing 3 chars in a row. The three stores can be pipelined while the
test and branch can cause a pipeline stall.

If I were really trying to optimize the line drawer I would have it
compute spans, not pixels, and make sure that the span filler was as
efficient as possible.

	Bob Pendleton

On Fri, 2003-05-16 at 14:32, Gabriel Gambetta wrote:


I think you can make the 24bpp line drawing version a little less slow
by copying a short and a char instead of calling memcpy().

GCC, and pretty much any modern C/C++ compiler, expands memcpy() inline.
The code it generates is really good, especially when there is a
constant byte count. That is, there is no call to memcpy(), just three
byte stores in a row. Unless, of course, you turn off optimization.

I didn’t know GCC did THAT much optimization!

Why not three char stores in a row, then?

The alternative that you suggest requires code for the case when the
pixel starts on an even byte address and code for when the pixel starts
on an odd byte address. You have to select the correct store pattern
either short-char or char-short. By the time you have done the test and
branched to the correct pattern you have lost any advantage over just
storing 3 chars in a row. The three stores can be pipelined while the
test and branch can cause a pipeline stall.

Either the optimized memcpy() does that checking too (the byte count might be constant but the destination address won’t be known at compile time), or it expands to three char stores. I think it’s clearer to store three chars :)

If I were really trying to optimize the line drawer I would have it
compute spans, not pixels, and make sure that the span filler was as
efficient as possible.

I know you were trying to make things clear, not to optimize, as it’s a very entry-level article. In that case, integer divisions and multiplies by powers of two would look clearer than shifts and I’d bet GCC optimizes that too :)

I suggested stores instead of memcpy() to make things slightly faster and to have homogeneous code in all the functions, which is more clear. I didn’t mean to offend you in any way. I’m really sorry if I sounded that way!

–Gabriel

I think you can make the 24bpp line drawing version a little less slow
by copying a short and a char instead of calling memcpy().

GCC, and pretty much any modern C/C++ compiler, expands memcpy() inline.
The code it generates is really good, especially when there is a
constant byte count. That is, there is no call to memcpy(), just three
byte stores in a row. Unless, of course, you turn off optimization.

I didn’t know GCC did THAT much optimization!

It was a surprise to me too. I have a graphics library that I originally
wrote in Pascal on DOS, converted to assembly language for the 68000,
then to a mixture of C and assembly for DOS, Win 3.1, and Win95 (before
the Game SDK, which became DirectX). I converted it to work with SDL about
a year ago, and at that time I went through and tested a LOT of stuff. I
found that many of the things I had done that were optimizations for
those old systems and compilers actually slowed down the code when
compiled for Linux/GCC/SDL.

Why not three char stores in a row, then?

Usually a char is stored in a byte so byte stores are equivalent to char
stores. Just different names.

The alternative that you suggest requires code for the case when the
pixel starts on an even byte address and code for when the pixel starts
on an odd byte address. You have to select the correct store pattern,
either short-char or char-short. By the time you have done the test and
branched to the correct pattern you have lost any advantage over just
storing 3 chars in a row. The three stores can be pipelined while the
test and branch can cause a pipeline stall.

Either the optimized memcpy() does that checking too (the byte count
might be constant but the destination address won’t be known at
compile time), or it expands to three char stores. I think it’s
clearer to store three chars :slight_smile:

You are right. Since a char is usually a byte we are actually saying the
same thing.

If I were really trying to optimize the line drawer I would have it
compute spans, not pixels, and make sure that the span filler was as
efficient as possible.

I know you were trying to make things clear, not to optimize, as it’s
a very entry-level article. In that case, integer divisions and
multiplies by powers of two would look clearer than shifts and I’d bet
GCC optimizes that too :slight_smile:

It does, but this is very old code, the original version was written in
’83. I only made as many changes as I needed to.

I suggested stores instead of memcpy() to make things slightly faster
and to have homogeneous code in all the functions, which is more clear.

The memcpy() was there because when I tested the code memcpy() was the
fastest way to do it. The shifts are there because it is very old code
and I didn’t make any more changes than I had to.

I didn’t mean to offend you in any way. I’m really sorry if I sounded
that way!

Don’t worry about it. I know you were not trying to offend me. You
happened to hit on a “hot button” topic. I have seen so many programmers
spend so much time doing “optimizations” that simply don’t matter. So
much of their lives wasted on premature optimization. It is a crime. And
yet, the programming “culture” rewards people for doing it. Sort of like
getting famous for how much time you spend drunk. I guess it feels good
while you are doing it. But, you are still throwing your life away.

As long as the code is fast enough for the application, it is fast
enough. You don’t optimize until you have an execution trace that tells
you where the time is really being spent.


	Bob Pendleton

On Fri, 2003-05-16 at 16:46, Gabriel Gambetta wrote:



Bob Pendleton wrote:

You happened to hit on a “hot button” topic. I have seen so many programmers
spend so much time doing “optimizations” that simply don’t matter. So
much of their lives wasted on premature optimization. It is a crime. And
yet, the programming “culture” rewards people for doing it. Sort of like
getting famous for how much time you spend drunk. I guess it feels good
while you are doing it. But, you are still throwing your life away.

As long as the code is fast enough for the application, it is fast
enough. You don’t optimize until you have an execution trace that tells
you where the time is really being spent.

This is a great lesson to learn. I started programming many years ago,
but never really made it my career, and only recently began looking at
C++. With machines now running at 2-3GHz instead of 8-25MHz (like when
I started learning C), it makes more sense to only look at optimization
when it’s needed.

Back then, there was no such thing as real-time code. Games didn’t look
at the clock of your computer and synchronize gameplay with real-time.
The programmers simply tried to make their programs run as fast as they
possibly could, which often required a LOT of optimization. I think the
compilers we use today make many of those optimizations automatically, and
the CPU’s run much much faster, with less need for other optimizations.

Thanks for the reminder! I’m still trying to get my head out from under
the old way of thinking, where you planned ahead to optimize everything you
possibly could, in order to save time fixing problems later. Now, you just
simply don’t need to spend that time until you learn that you need to look
at specific parts of your code.

-Chris

Hear, hear! :^) Personally, I prefer readable code that runs well enough
to unreadable code that may, or may not, be the most efficient.

CPU and memory are cheap.

(Egads! I’ve become the thing I hated, back when all I had was a 6502-based
Atari 1200XL with 64K of RAM! ;^) )

-bill!

On Fri, May 16, 2003 at 10:45:37PM -0500, Bob Pendleton wrote:

As long as the code is fast enough for the application, it is fast
enough.

[…]

This is a great lesson to learn. I started programming many years
ago, but never really made it my career, and only recently began
looking at C++. With machines now running at 2-3GHz instead of
8-25MHz (like when I started learning C), it makes more sense to
only look at optimation when it’s needed.

Yes indeed.

Back then, there was no such thing as real-time code. Games didn’t
look at the clock of your computer and synchronize gameplay with
real-time. The programmers simply tried to make their programs run
as fast as they possibly could, which often required a LOT of
optimization.

Here, I have to disagree.

Actually, many games from the 8- and 16-bit era were not just real time,
but hard real time applications. They didn’t have to look at the
RTC (if there was one), simply because their clock was the video
frame rate, which was either 50 or 60 Hz depending on version. (Games
had to be tuned once for PAL and once for NTSC, and then released as
two versions, or two-in-one.)

These games never missed a frame, because the code was fast enough
to meet the deadlines at all times, and because there was no stupid
"general purpose" OS to screw up scheduling all the time.

The fact that you usually had to optimize pretty much anything that
was called from the main loop was an effect of a different
graphics/logic CPU time balance. Graphics bandwidth was low thanks to
low resolutions, text generators and hardware sprites, but game logic
wasn’t that much simpler than in modern games. You had to control
those NPCs, even if it meant you had to code the whole AI in asm and
spend weeks optimizing it.

I think the compilers we use today make many of
those optimizations automatically, and the CPU’s run much much
faster, with less need for other optimizations.

I think it’s worth noting that compilers are able to do many of these
optimizations only because we write relatively plain and simple high
level code now. There isn’t much bit fiddling going on. No fixed
point calculations where you make use of the double word size
capabilities of MUL and DIV instructions. No CPU specific shortcuts
that you can’t even express in a mid or high level language.

The need for these tricks started going away when the 16 bit machines
arrived, and that made life easier, even if you were still coding
mostly or entirely in asm. This made it possible to use high level
languages for more than the highest level stuff, and thus, opened the
path towards writing real applications entirely in high level
languages.

Better optimization and faster CPUs are just further steps in the same
general direction; towards a higher bandwidth/complexity ratio.
Consider this example:

We have a 2D shooter game with full screen scrolling at 50
pixels/second. There’s one player ship and five enemies on the
screen. Now, let’s consider two implementations of this:

C64:
* 1 player ship (X, Y, h/w collision bit)
* 5 enemy ships (X, Y, h/w collision bits)
* H/w scrolling multicolor text (40x25; 3 bytes/char)

PC/VGA:
* 1 player ship (24*21 pixels ==> 504 bytes)
* 5 enemy ships (2520 bytes)
* S/w scrolling 256 colors (320x200 ==> 64000 bytes)

The C64 would have to shuffle around about 20 kB of data per second to
keep the graphics alive. (Hardware scrolling means you only have to
repaint the screen every 8th frame.) Meanwhile the PC would have to
deal with 3.35 MB/s to produce essentially the same display! Of
course, the PC has enough CPU power to do that.

Now, if both machines have about the same amount of spare CPU time
left after the graphics is done, it turns out that the PC has about
170 times the bandwidth of the C64. This does not strictly map to CPU
power, but it does suggest that implementing the AI isn’t exactly
going to be a matter of hacking fast code on the PC.

//David Olofson - Programmer, Composer, Open Source Advocate

.- The Return of Audiality! --------------------------------.
| Free/Open Source Audio Engine for use in Games or Studio. |
| RT and off-line synth. Scripting. Sample accurate timing. |
`-----------------------------------> http://audiality.org -’
http://olofson.net - http://www.reologica.se

On Saturday 17 May 2003 07.46, Chris Palmer wrote:

The only thing one can usefully do to help optimization is to check that
the algorithm used is optimal. Instruction-level optimization is boring,
and the CPU does it many times faster than the programmer.

On Saturday 17 May 2003 10:37, Bill Kendrick wrote:

On Fri, May 16, 2003 at 10:45:37PM -0500, Bob Pendleton wrote:

As long as the code is fast enough for the application, it is fast
enough.

Hear, hear! :^) Personally, I like readable code that runs well
enough than unreadable code that may, or may not, be the most
efficient.

The article was really interesting.

On Friday 16 May 2003 09:08 pm, Bob Pendleton wrote:



Paulo Pinto