Fast(er) scrolling

Charlie_Robson_at_sc · August 16, 1999, 5:18pm

two apologies in advance:

1, the length of this post: i don’t have a website to post this stuff, so
unfortunately i have to post it directly to the list.

2, if any of this is obvious and appears to be patronising: i don’t want to
exclude anyone

the ideas here were developed during the making of the game ‘War Of The Worlds’
for GT Interactive. this was a little while ago, and my brain lets me down a lot
of the time, but what i’m telling you here is to the best of my knowledge
correct! it isn’t a detailed how-to, i just want to expose some ideas. if i tell
you exactly how to do it (even if i could remember then you’re less likely
to grasp the ideas fully, and innovate something better for yourself! having
said that… it should be possible for me to distill some code to accompany this
post at some point, but please be patient! i am really busy right now.

ok, on with the show.

basic scrolling. imagine a case where our backbuffer is larger than the screen.
we can happily blit full screen-sized rectangles from the backbuffer to the
screen to effect simple scrolling without redrawing anything, as long as we
don’t go past the boundaries of the backbuffer, which would cause erroneous data
to be displayed, and probably crash the program.

if however, our blitter was a little smarter we could happily blit a
screen-sized area from the backbuffer if we ‘wrap’ when blitting… when the
backbuffer right-edge is reached, the blit continues from the left for the
remaining pixels. in practical terms, this can be achieved using four blits.

i’ll spend some time going over this, because this is the most important concept
presented here.

the scroll origin is at Ax,Ay. we start by blitting the backbuffer A to the
screen at 0,0. we then continue blitting each other region to the screen as
shown below, until all the screen has been blitted, constructing the image in
the screenbuffer shown here:

backbuffer: screen:
±---------±---------+ ±---------±---------+
|C |D | |A |B |
| | | | | |
| | | | | |
±---------±---------+ ±---------±---------+
|B |A | |C |D |
| | | | | |
| | | | | |
±---------±---------+ ±---------±---------+

without any further work, this method will work fine for a tiled background
which wraps. when, as in most real-world projects, the background is not a
continuous checkerboard pattern, we need to expand this concept a little
further.

make the backbuffer size an even multiple of _tilesize. the backbuffer must be
larger than the screen by at least one tile in width and height, providing a
border, or gutter.

A> backbuffer with gutter, no scroll±-------------------±+
|A | |
| | |
| | |
| | |
| | |
| | |
±-------------------+ |
±---------------------+

B> backbuffer with gutter, after big scroll
±--------±±---------+
|C | |D |
| | | |
| | | |
±--------±±---------+
±--------±±---------+
|B | |A |
| | | |
| | | |
±--------±±---------+

it should be clear now that we could scroll by an amount of _tilesize in any
direction without updating the backbuffer. diagram A above shows the backbuffer
in it’s simplest state: there is either no scroll or the scroll value is an even
multiple of the backbuffer size. a scroll value of up to _tilesize will allow us
to update the screen with one blit, showing from none to _tilesize pixels from
the gutter.

you have also probably guessed that when we scroll by more than this amount, or
have moved incrementally past this limit, that we need to fill in strips of
background in the backbuffer. how many strips is dependent upon the amount
scrolled.

a horizontal scroll of _tilesize*2 pixels will leave a backbuffer which looks
like this:

±–±-----------------+
|B |A |
| | |
| | |
| | |
| | |
| | |
±–±-----------------+
±–±-----------------+

we need to have filled in 2 strips of tile into the backbuffer to replace the
data which has scrolled out.

that was the easy bit. to make things go a bit faster, we must only update
portions of the backbuffer which have changed.

to do this, we will associate each tile, or group of tiles, with an entry in a
rectangular array known to as a damage map: <an object can be thought of as
damaging the backbuffer, requiring repairs>

when an object’s position is updated, two rectangular regions are created, the
region where the object was, and the region where the object is about to be put.
entries are made in the damage map relating to these regions which will mean
that both the background at the old position and the object at the new position
will be updated in the backbuffer in the next render cycle.

this is where the complications arise. updating the damage regions for an object
happens when it moves. let’s say it doesn’t move for 4 frames. when it does
move, it’s damagemap setting must reflect the position of the scroll when it was
last updated.

when we come to actually rendering the objects into the backbuffer, we may need
to render each object up to four times, dependent upon how many quadrants the
object appears in. as you can see below, if an object sits in the middle of the
screen and the backbuffer wraps at that point, we need to draw it multiple
times. the overhead is slight if clipping is efficient when drawing the object.

backbuffer: screen:
±---------±---------+ ±---------±---------+
|C #|# | |A |B |
| | | | | |
| | | | #|# |
±---------±---------+ ±---------±---------+
|B |A | |C #|# |
| | | | | |
|# | #| | | |
±---------±---------+ ±---------±---------+

the usual run task / draw it’s object method of coding obviously won’t work
here. the renderer needs to keep close tabs on it’s objects. it’s better to
create a drawlist into which structures representing the visual portion of an
object are registered. this way, the rendering side of the game may manipulate
the objects as it needs to, noting last positions, judging when the object has
moved etc, etc.

having a list of all objects which require rendering allows depth sorting and
other such necessary evils to be done easily. it also however, requires careful
management to allow speed optimisations. i usually sort all of the entries in my
list by y-coordinate, this then limits the range of objects which need to be
scanned at drawing time.

if the rendering side of the game runs in a seperate thread from the object
update, then many additional precautions over and above the usual thread
programming disciplines need to be taken when sharing graphic data. not least,
when objects are removed from the game, they cannot remove their graphic
representation until the renderer has finished with it. in WoW we used a
technique known as reference counting to ensure that no object was removed until
all other objects which were referencing it had finished with it.

for maximum speed, we also need to damagemap the backbuffer to the screen … but
i’ll leave that as an exercise for the reader

the eagle-eyed amongst will have spotted the fact that when scrolling, a full
backbuffer to screen copy will be required. this is life.

other optimisations can be applied to the damagemap: generally it is cheaper to
process one large region than many small ones. various algorithms can be applied
to the damagemap in order to reduce it’s complexity.

      *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

many may be concerned at this point that there is a lot of processing going on
here, and that surely, in order to increase the speed of code, you need to do
less work, and not more. well, i’m sure that in many cases this is true, but in
this instance, the speed benefits gained by using this technique far outweigh
any overheads incurred.

Charlie Robson
Sony Computer Entertainment Europe

IMPORTANT NEWS ANNOUNCEMENT FROM SCEE MIS

New scee.net email address will replace playstation.sony.com

This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
postmaster at scee.net

This footnote also confirms that this email message has been checked
for all known viruses.

SCEE 1999