Bob wrote:
Quote:
Yeah, Bob, I’m curious how often the other variants would actually be used?
That is a fair question. And, it has been long enough since I looked
at it that I can look at my code with fresh (or at least forgetful)
eyes.
Here is the list that nfries88 suggested:
AtomicIncrement
AtomicDecrement
AtomicExchange
AtomicCompareExchange
TestAndSet
The first thing you have to do when comparing this list is to kick out
the exchange operations because I found they were not widely
supported. Not that many architecture/OS combos seem to have an atomic
exchange. If you have atomicCompareExchange you don’t need testAndSet.
Because the exchange operations seem so rare I didn’t feel good about
including them in the list. Without an exchange you have to add an
atomic clear to clear values set by TestAndSet. That gives us this
list:
AtomicIncrement
AtomicDecrement
AtomicClear
TestAndSet
That is actually pretty close to what I was originally going to put in
SDL_atomic.h. [image: Smile]
Fair enough. I wasn’t aware that support for atomic exchange was a rarity,
but then again I mostly develop for Windows so what do I know about other
systems?
It came as a surprise to me too. It seems like the great basic atomic
operation. But, it was missing in a lot of places I looked. One reference I
found said that TestAndSet was found on most architectures, Atomic Exchange
was found on x86, and the swap and compare operations were from the Motorola
68000.
I first saw testAndSet on a UNIVAC 1108a built in the late '60s. The
difference between the 1108 and the 1108a was that the “a” model supported
multiple CPUs and multiple IOUs and added the testAndSet instruction to
support building mutexes. I first ran into atomic exchange on the 8086 which
came along at least 10 years later. The 8086 added a “lock” prefix byte that
turned a lot of instructions into atomic operations. But that was at least
10 years later.
You know, if I had to guess, I would guess that Intel had a patent on
something to do with atomic exchange so everyone else either used testAndSet
or invented new ones like atomic compare and exchange. OTOH, it could just
be that no one else wanted to spend that much silicon on adding more than
the minimum number of useful instructions. RISC has it good points.
Anyone really know?
Quote:
Here is the list of the basic ops from SDL_atomic.c. classes of systems
SDL_AtomicTestThenSet
SDL_AtomicClear
SDL_AtomicFetchThenIncrement
SDL_AtomicFetchThenDecrement
SDL_AtomicIncrementThenFetch
SDL_AtomicDecrementThenFetch
SDL_AtomicFetchThenAdd
SDL_AtomicFetchThenSubtract
SDL_AtomicAddThenFetch
SDL_AtomicSubtractThenFetch
SDL_AtomicLock
SDL_AtomicUnlock
There are two of each of these, one ending in 32 and one ending in 64
because I wanted to be able to do atomic operations on 32 and 64 bit
values. Without 64 bit support you can’t do atomic operations on
pointers or reference counts. I expect both sets to be used for a long
time. I’m focused on 64 bit machines, but even I want some code to run
on both 32 and 64 bit architectures. The original lists included
operations on 8 and 16 bit values. But, after doing a lot of research
I found that there aren’t many architecture/OS combos that support all
those. Even the 64 bit ops are not available on all 32 bit
architectures.
This seems like a pretty good list. 64-bit ops can be simulated on 32-bit
architectures not supporting them easily enough.
Why wouldn’t systems have all of those basic ops, though?
Economics and architecture. On word oriented architectures it is costly to
add hardware that allows byte oriented operations. That is why a lot of
machines don’t support atomic operations on 8 and 16 bit fields.
Then look at what you have to do to implement these instructions. Take a
look at testandset. The original form is very simple. It just grabs the
contents of and address, writes a 1 to it, and returns the original value to
the CPU. It is so simple you could build it into the memory hardware as a
variation of the read operation. On the old core memory machines testAndSet
originated on, reading memory erased what you read so all the memory
controllers would do a read followed by writing back the original value.
Making the write set the memory to 1 would be pretty simple. OTOH, exchange,
increment, swap and compare, and all the others require that the contents of
memory reach the CPU, but processed, and the be written back to memory. That
means the whole memory system of the machine must be locked for the use of a
single CPU during a full read/modify/write sequence. RISC machines didn’t
even have instructions like increment that require a single instruction to
perform a read/modify/write. They had load instructions, store instructions,
and modified values in registers. Pretty much everything out there that
isn’t an x86 architecture is derived for a RISC architecture.
Quote:
BTW, I considered using size_t instead of explicit 32 and 64 bit
values. Doing that would get rid of half of my list. AFAIK SDL doesn’t
have a type equivalent to size_t, but it does use size_t. What do
folks think about making the atomic operations take a size_t? Is there
any OS where size_t is not 64 bits when a pointer is 64 bits?
The C standard requires that size_t be the size of a pointer. That said,
I’m sure there’s platforms where it’s always defined as a 32-bit integer…
I would not be opposed to that. I would personally prefer explicit 32 and
64 bit operations, though.
Yeah, me too.
Quote:
Nfries88 didn’t say if his idea of atomic increment and decrement
operations include a fetch as part of the atomic operations. I assume
he did because those operations are pretty much useless with out an
included fetch.
[…]
Why both pre and post fetch versions? I almost didn’t do that.
Obviously, a fetch should be included. I didn’t mention it because, like
you said, they’d be pretty much useless without one.
I’ve also never seen system-defined functions for these that didn’t include
a fetch (although, as I said, my experience is mostly with Windows).
In fact, the Windows SDK doesn’t provide both pre-fetch and post-fetch
versions of their operations, and MSDN doesn’t state which one that the
operations use (so I’m not even certain).
They don’t provide pre-fetch versions of increment and decrement. But, MS
does provide a pre-fetch version of addition so you can implement them if
you want to.
I also can’t see many developers missing pre-fetch too much. But, I can’t
say it’s a bad idea to include them either.
OTOH, QNX only provides pre-fetch operations. And it does implement
atomic operations like add that do not return a value. That blew my poor
little mind until I realized that QNX runs on many different
architectures. I suspect they know more about this than I do.
I also love the way GCC handles atomic operations. They have taken the Intel
Itanium atomic ops and defined a function name for each of them. They made
those function into builtins in GCC. They then warn you that not all of
these operations will be available on all machines. If they are not
available on your target machine they will generate a function call to an
unimplemented function… That way you can do ahead and write a replacement
if you want to. But, they only provide atomic ops that can be mapped
directly to the hardware. Interesting approach…
Quote:
That brings us to the add and subtract operations. When working on
pointers, you rarely want to change them by just one. You usually want
to change them by the SIZEOF(some type). That means that if you want
to do atomic operations on pointers you want add and subtract, not
just increment and decrement. (Not that C automatically increments and
decrements by the SIZEOF the type of the pointer.) Increment and
decrement are fine for counters, but not for pointers. Ok, so why have
both addition subtraction and increment, decrement? Isn’t atomicAdd(1)
the same as increment? Yep, it is and so is atomicSubtract(-1). But,
it seems that increment and decrement are some times faster (no
argument to mess with) than add and subtract. Also, it seems that more
systems directly implement atomic increment and decrement than
directly implement atomic add and subtract. If add and subtract are
more likely to be slow and/or emulated, then you don’t want them to be
the only atomic arithmetic operations you provide. That is why I have
add and subtract as well as increment and decrement. Add and subtract
were added to support atomic operations on pointers.
I think that’s a good call. I’ve never actually done that type of thing
before, so it hadn’t come to mind when I made my list.
I could see its utility, though.
Quote:
I also included a spin lock. I did that because so many of the
algorithms I found were based on spin locks and I wanted to allow an
implementation of the library to provide the most efficient
implementation of a spin lock that was possible on that platform.
Systems have have and atomic exchange can use it to implement the spin
lock and testAndSet. Otherwise programmers all over the world would be
implementing their own spin locks using testAndSet that is implemented
using exchange. Thinking about wasting that many cycles and person
years hurt. Also, by adding spin lock to the library I was able to
keep the size of a lock variable set at 32 bits so there was no
possibility of forcing the world to use 64 bits for a 1 bit lock.
This was a bit unnecessary, and a big reason why I included CMPXCHG in the
list.
However, I can’t argue your reasons for adding it (I didn’t know that
systems implemented testAndSet over CMPXCHG).
Quote:
Currently each of the arithmetic operations comes in a pre and post
fetch version and in 32 and 64 bit versions for a total of 4 of each
operation. It looks to me like the two lists aren’t that different.
The key differences are the result of:
- the general lack of exchange operations,
- my decision to include spin locks in the library,
- my desire to do atomic operations on pointers,
- and support for 64 bit applications.
It seems like a good list. I do still think prefetch operations would not
be missed, though.
Sorry I couldn’t be more help with this, and made you waste all that time
typing.
I do not mind. I appreciate the fact that you took the time to look at what
I did and thought about it. Hey, once you brought up the question even Sam
realized he wanted a better explanation of what I did. Now I have it.
Not to mention you forced me to point out places in the design that I have
questions about.
Anybody else want to comment or make a suggestion? Please do it. Reducing
the size of the API is a good goal.
Bob PendletonOn Sat, Nov 21, 2009 at 12:33 AM, nfries88 wrote:
On Wed, Nov 18, 2009 at 11:38 PM, Sam Lantinga <> wrote:
SDL mailing list
SDL at lists.libsdl.org
http://lists.libsdl.org/listinfo.cgi/sdl-libsdl.org
–
±----------------------------------------------------------