SDL_WaitEvent in 1.3

I should have said I didn’t want to make it difficult for Sam or the
SDL community. Just to clarify things a bit…

On Thu, Jan 8, 2009 at 9:09 PM, Donny Viszneki <@Donny_Viszneki> wrote:

If I really put my weight into forking SDL for any reason, I’ll put a
good deal of effort into making sure it is nominally API compatible
with Sam’s, so even if I did copy glib code, it is not my intention to
make it difficult for Sam by fragmenting the API. He can catch up on
implementation at his own pace if it comes down to it!


http://codebad.com/

A gracious offer. Might I also add that glib’s event handling provides
quite an excellent API model and is probably also worth looking into.
glib supports multithreaded applications, multiple “event loops,” and
APIs for adding new “event sources” (which for all I know may boil
down to adding your own file descriptors to a collection that glib
select()s, but perhaps it’s more interesting and sophisticated than
that!)

You won’t know how interesting glib is until you look…

I am not a fan of depending on glib. That would make SDL on embedded
platforms difficult, for example when building with uclibc.

matt

I agree. I think we need Sam’s opinion! I’d be thrilled to work toward
a new interrupt/signal/select() driven processing subsystem for SDL.

I’d love to see something like this for SDL 1.3.

Here are the requirements that I can think of off the top of my head:

  • Support existing 1.2 API
  • Support posting custom SDL events from separate threads and waking as soon as possible to deliver them.
  • Support existing event sources (window events, multiple mice, joystick events, etc.)
  • Must work on UNIX, Windows, and Mac OS X.

I’m trying my hand at it, on Mac OS X, because it’s the one I know the
least, looks like it might be the most complicated, and it’s what’s
running on my fastest computer at home. :wink:

I think I figured out how to make the existing “nib-less” SDLmain.m
less hacky (which, by the way, does not mean it is incorrect, just
very clever and non-obvious, maybe overly so, which is the problem
here). Incidentally, I wouldn’t need it at all for my own idea,
which is to make a simple wrapper for NSRunLoop (see
http://developer.apple.com/documentation/Cocoa/Reference/Foundation/Classes/NSRunLoop_Class/Reference/Reference.html
for more info), and deliver (most) events as they are received,
without any queueing or threads.

That last item is why I’m not just going with fastevent right off the
bat. Just about every platform (I looked at some wacky ones like
Photon/QNX, but not Nintendo DS, I have to admit) has something along
the lines of select() that can be used to wait on all the kinds of
events you need for the SDL event sources, and even more. But those
mechanisms are platform-dependent, of course, and also depend on the
display driver you’re using at the moment (just like the current event
code). It really screams to be done in there correctly. And on those
wacky platforms that don’t have anything, you can fall back to the
current old school polling, or the way fastevent does it (which is as
good as you can go without changing the core of SDL, which is what I’m
proposing, but requires threads, which might not be available
everywhere).

There would have to be some queueing still, for events coming from
other threads (duh). But the main loop would try to avoid queuing as
much as possible and dispatch right away. This means not overflowing
the event queue ever, and also means giving the proper control flow
information to the display servers (when you’re overwhelmed by mouse
movement events, the thing to do is not “get more mouse movement
events”, it’s to have the display server coalesce them a bit and get
less).

I’m not 100% sure about the requirements for supporting all the
existing event sources, but it’s just that I’m not sure it can be done
without any polling. This design would certainly support adding a
(non-threaded) time-based event to poll for input from other devices.

The trickiest requirement is to support the 1.2 API. My first reaction
would be to ask why we can’t just break away, like it was done for the
drawing code, but I realize that SDL is popular for porting those
crufty games that are designed around polling, so this has to work
too. And annoyingly, like I said, it’s easier to make a non-blocking
API block than a blocking API not block… Oh well. My plan is an API
where you call a method to run, erm, the runloop for an amount of
time, which might also be 0 (just dispatch one event, or a few already
ready events, no waiting) or infinite (just run until told to quit).
This should provide the opportunity to set up an emulation for a
polling API.
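
To make that concrete, here is a rough sketch of what such a call
could look like. SDL_RunLoop and SDL_RUNLOOP_FOREVER are invented
names for illustration, not actual SDL API; SDL_PollEvent and
SDL_Event are the existing 1.2 calls.

    #include "SDL.h"

    #define SDL_RUNLOOP_FOREVER ((Uint32)-1)

    /* Hypothetical: run the loop for at most timeout_ms milliseconds,
     * dispatching ready events to registered callbacks, then return. */
    extern int SDL_RunLoop(Uint32 timeout_ms);

    /* A fully event-driven program just hands over control: */
    static void event_driven_main(void)
    {
        SDL_RunLoop(SDL_RUNLOOP_FOREVER);   /* runs until told to quit */
    }

    /* Emulating the old polling style: a zero timeout dispatches
     * whatever is already ready (letting default callbacks refill the
     * classic queue), then the application drains it as in SDL 1.2. */
    static void polling_style_frame(void)
    {
        SDL_Event ev;
        SDL_RunLoop(0);                     /* no waiting at all */
        while (SDL_PollEvent(&ev)) {
            if (ev.type == SDL_QUIT) {
                /* stop the main loop here */
            }
        }
        /* ... render a frame ... */
    }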

I’m also having to learn a bit of SDL 1.3, enough that I can make
myself a little test rig, but that should go quickly enough.

On Tue, Jan 6, 2009 at 11:22 PM, Sam Lantinga wrote:


http://pphaneuf.livejournal.com/

I think I figured out how to make the existing “nib-less” SDLmain.m
less hacky (which, by the way, does not mean it is incorrect, just
very clever and non-obvious, maybe overly so, which is the problem
here). Incidentally, I wouldn’t need it at all for my own idea,
which is to make a simple wrapper for NSRunLoop and deliver
(most) events as they are received,
without any queueing or threads.

There would have to be some queueing still, for events coming from
other threads (duh). But the main loop would try to avoid queuing as
much as possible and dispatch right away. This means not overflowing
the event queue ever,

It’s important to distinguish between concurrent execution frames,
logical threads, and kernel multi-threading provisions. If you propose
to deliver events (ie. dispatch event handlers) without any queueing
or threading, you must be proposing to interrupt event handlers to
push other event handlers onto the stack within the same frame of
execution (remember, no threads?) Are you sure this is what you meant?
You cannot make guarantees about when an event handler will return,
so you are forced to solve a concurrency dilemma – until now, the
event queue has been the route, and I suggest that we have no
delusions about doing away with the event queue, only that we find
ways to skip over the event queue whenever the programmer has better
ideas about how his application will behave. A great example of that
is coming up…

and also means giving the proper control flow
information to the display servers (when you’re overwhelmed by mouse
movement events, the thing to do is not “get more mouse movement
events”, it’s to have the display server coalesce them a bit and get
less).

Is that not the proper thing to do? We could debate that, but that
isn’t the point. The point is that we don’t need to make that
decision, some applications will expect one type of behavior and
others expect another (think hand-writing recognition, or drawing
something with a mouse – if the CPU hiccups, pen strokes or other
mouse movement information would be lost to oblivion if we
auto-consolidate incoming pointer events.) The real future for SDL
event handling is to let the user decide: a default event handler
provided by SDL will mimic the simple, backward-compatible behavior of
SDL 1.2, or the user can register his own handler to do what he wants.
It might also make sense to build a few likely candidates into SDL so
the user doesn’t have to implement them himself or herself (for instance,
consolidating mouse movement data might be very appropriate behavior
for some applications; maybe some applications only want that behavior
some of the time. Wouldn’t that be an interesting application…)

I’m not 100% sure about the requirements for supporting all the
existing event sources, but it’s just that I’m not sure it can be done
without any polling. This design would certainly support adding a
(non-threaded) time-based event to poll for input from other devices.

That is not a bad idea for one means of implementing polling alongside
signal-driven event dispatching. However, it is not the only way; as I
mentioned before, a thread can be used for each blocking read needed to
handle incoming events. Not all poll operations return immediately,
and if you have kernel threads available on a particular platform you
should be able to accommodate the possibility that the kernel will be
able to make use of the intervening wait period between calling your
polling API and it returning – if it can be done without introducing
application latency.

The issue is that if you have a timer/alarm event that can remind you
to perform some polling event, then you already have a concurrent
execution frame wherein you’re forced to examine the same questions I
just asked, so whether or not to use one method or another is just
going to be a matter of examining the polling sources and semantics of
sleeping and latency.

The trickiest requirement is to support the 1.2 API. My first reaction
would be to ask why we can’t just break away, like it was done for the
drawing code, but I realize that SDL is popular for porting those
crufty games that are designed around polling, so this has to work
too. And annoyingly, like I said, it’s easier to make a non-blocking
API block than a blocking API not block…

And again I completely disagree. Supporting the 1.2 API should be the
default behavior of SDL 1.3 applications. This can be easily
achieved by giving SDL default concurrent event handlers which simply
push events into the event queue. The main execution frame will pick
up the events in the SDL event queue using the old APIs. There is
absolutely no difficulty here.

Oh well. My plan is an API
where you call a method to run, erm, the runloop for an amount of
time, which might also be 0 (just dispatch one event, or a few already
ready events, no waiting) or infinite (just run until told to quit).
This should provide the opportunity to set up an emulation for a
polling API.

Could you please explain this approach a little further? If I follow
you correctly, this sort of inversion of control is completely
unnecessary as a mandate from the SDL API and it’s more than a big
divergence from how SDL has done things in the past. I contend that
all these things can be controlled by the user without sacrificing
much of anything.

On Fri, Jan 9, 2009 at 2:09 AM, Pierre Phaneuf wrote:


http://codebad.com/

It’s important to distinguish between concurrent execution frames,
logical threads, and kernel multi-threading provisions. If you propose
to deliver events (ie. dispatch event handlers) without any queueing
or threading, you must be proposing to interrupt event handlers to
push other event handlers onto the stack within the same frame of
execution (remember, no threads?) Are you sure this is what you meant?
You cannot make guarantees about when an event handler will return,
so you are forced to solve a concurrency dilemma – until now, the
event queue has been the route, and I suggest that we have no
delusions about doing away with the event queue, only that we find
ways to skip over the event queue whenever the programmer has better
ideas about how his application will behave. A great example of that
is coming up…

No interruption of ongoing event handlers, no. What I’m proposing is
within the scope of Xlib-era programming (we’re not polling like DOS
games used to, but we’re single-threaded, except for signals, and we
shouldn’t rely on those, since outside of Unix/Linux, they’re rarely
available in a consistent manner). If you really want to process more
than one thing at a time (or at least, in a very quickly alternating
manner, if you’ve got only one CPU/core), you’ll have to use threads.
The point is, if you have just one thread processing events, queueing
them will not process them faster, and will in fact hide information
from the rest of the system (that we’re on the losing end of the war).

Right now, in the case of X11 for example, we read in a bunch of
events with a quick sequence of XNextEvent, putting them in the event
queue (that’s the “event pump”), then once we’re done with that, the
program can get them and process them one at a time. It’s been a while
since I checked, but if I remember correctly, if there are MAXEVENTS events
in the queue, further events get dropped entirely, in many cases. To
make things more exciting, we pump events every time we ask for one,
so if the pump gets you more than one event on average, you’re
eventually screwed, and there’s nothing you can do about it other than
process them fast enough so that the average goes below one. Even if
the application does keep up, there is no time advantage to enqueuing
the events; the time it will take overall is basically the time that
it takes the program to process them. In fact, queueing itself has a
small overhead (extra copies, bookkeeping, etc), but I think that’s
not very significant, unless you have a lot of events (similar to what
the documentation for fastevent says motivated it, now that I think
about it, so it might be a problem in some situations, but let’s not
optimize too early).

What I propose is simply that every time we ask SDL for an event, we
get just one. No queueing, just XNextEvent, then return that. In
reality, there will have to be a queue anyway, to receive events sent
using SDL_PushEvent, be it from the same thread or from another one.
So we check the queue first, return the top event if there’s one
there, if there isn’t one, we do XNextEvent and return that. I’ll
overlook the “peeking” and “mask” features for now, they’re both kind
of nasty, but can be implemented by moving events to the queue only
when those icky features are being used.
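
In X11 terms, a minimal sketch of that path might look like this
(push_queue_pop and translate_x_event are invented helpers standing in
for the SDL_PushEvent queue and the X-to-SDL translation; this is not
existing SDL code):

    #include "SDL.h"
    #include <X11/Xlib.h>

    extern int push_queue_pop(SDL_Event *out);  /* hypothetical SDL_PushEvent queue */
    extern int translate_x_event(const XEvent *in, SDL_Event *out);  /* hypothetical */

    static int get_next_event(Display *dpy, SDL_Event *out)
    {
        /* Events posted with SDL_PushEvent (possibly from another
         * thread) still need a small queue; serve those first. */
        if (push_queue_pop(out))
            return 1;

        /* Otherwise take exactly one event from the display connection
         * and translate it; no SDL-side queueing of X events at all. */
        XEvent xev;
        XNextEvent(dpy, &xev);   /* blocks until one event is available */
        return translate_x_event(&xev, out);
    }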

If you reverse the flow of the events, providing callbacks (you’d
register a callback for an event, then calling a hypothetical
SDL_ProcessEvents would call the callbacks synchronously before
returning), you could even skip the SDL_Event structure entirely!
Also, registering callbacks provides more information to SDL about
whether you are interested or not, instead of having to separately
specify flags (which I don’t think you can even do with SDL right
now). It’d be very silly for a game like Quadra, which doesn’t care at
all about mouse motion (only the location of clicks), to overflow the
event queue with motion events that it will promptly ignore, and
possibly miss an actual click event…
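
A sketch of what such a registration API could look like
(SDL_RegisterEventCallback and SDL_ProcessEvents are hypothetical
names taken from the idea above, not existing SDL functions):

    #include <stdio.h>
    #include "SDL.h"

    typedef void (*SDL_EventCallback)(const SDL_Event *event, void *userdata);

    /* Hypothetical API: registering a callback also tells the driver
     * what the application cares about, so it can set up the proper
     * XSelectInput-style event mask. */
    extern int SDL_RegisterEventCallback(Uint8 type, SDL_EventCallback cb,
                                         void *userdata);
    extern int SDL_ProcessEvents(void);  /* calls callbacks synchronously */

    static void on_mouse_click(const SDL_Event *event, void *userdata)
    {
        (void)userdata;
        printf("click at %d,%d\n", event->button.x, event->button.y);
    }

    static void setup_events(void)
    {
        /* Quadra-style: only clicks matter. No callback is registered
         * for SDL_MOUSEMOTION, so the driver never has to queue (or
         * even request) motion events that would just be thrown away. */
        SDL_RegisterEventCallback(SDL_MOUSEBUTTONDOWN, on_mouse_click, NULL);
    }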

The big trick about skipping the event queue is that the platform’s
input system actually provides one already, and, most importantly,
that you can always ADD a queue, but you can’t take one OUT; therefore
a design without the queue leaves you in a situation where adding one
for your special situation is perfectly possible. The one that you add
can have the exact overflow behaviours that you want, too, ignoring
certain events or coalescing them as per your application’s needs.

and also means giving the proper control flow
information to the display servers (when you’re overwhelmed by mouse
movement events, the thing to do is not “get more mouse movement
events”, it’s to have the display server coalesce them a bit and get
less).

Is that not the proper thing to do? We could debate that, but that
isn’t the point. The point is that we don’t need to make that
decision, some applications will expect one type of behavior and
others expect another (think hand-writing recognition, or drawing
something with a mouse – if the CPU hiccups, pen strokes or other
mouse movement information would be lost to oblivion if we
auto-consolidate incoming pointer events.) The real future for SDL
event handling is to let the user decide: a default event handler
provided by SDL will mimic the simple, backward-compatible behavior of
SDL 1.2, or the user can register his own handler to do what he wants.
It might also make sense to build a few likely candidates into SDL so
the user doesn’t have to implement them himself or herself (for instance,
consolidating mouse movement data might be very appropriate behavior
for some applications; maybe some applications only want that behavior
some of the time. Wouldn’t that be an interesting application…)

Well, it doesn’t matter very much, because it’s what happens now
anyway (we’d have to use the motion history buffer in X11 to get
perfect motion), since it’s done by the display server…

It is true that queueing events and pumping the events at every
possible occasion might reduce that, as long as you don’t get
overwhelmed, at which point you’ll eventually just lose random
events, not just lose precision on the motion.

I’m not 100% sure about the requirements for supporting all the
existing event sources, but it’s just that I’m not sure it can be done
without any polling. This design would certainly support adding a
(non-threaded) time-based event to poll for input from other devices.

That is not a bad idea for one means of implementing polling alongside
signal-driven event dispatching. However, it is not the only way; as I
mentioned before, a thread can be used for each blocking read needed to
handle incoming events. Not all poll operations return immediately,
and if you have kernel threads available on a particular platform you
should be able to accommodate the possibility that the kernel will be
able to make use of the intervening wait period between calling your
polling API and it returning – if it can be done without introducing
application latency.

We can’t entirely rely on threads, since they’re not available
everywhere. I was thinking worst case here, of the “we’re on DOS and
this joystick uses PIO”. If you’re not stuck in 1987, then you’re all
good, for sure.

I’m not sure that we agree on what “polling” means, either. The way I
think of polling is where you have a “did anything happen” function
that is non-blocking, for each input source. If you’d block on any of
them, you wouldn’t respond to the others, so you can’t block anywhere,
you have to poll each of them in turn, one after another, with a small
sleep, sched_yield() or some such.

The opposite (in my mind) of “polling” is “event-driven”, as embodied
by the (unfortunately named, for the purpose of this discussion)
poll() system call on Unix (I don’t actually use that one often, I
tend to use select() or epoll).

You’re quite correct about being able to exploit the "sleeping"
CPU/core if you have threads, while one thread is waiting for events,
but that’s rarely much of a problem, since the kind of platforms
advanced enough to have all of that usually provide one
event-dispatching system call able to wait for every kind of event.

The issue is that if you have a timer/alarm event that can remind you
to perform some polling event, then you already have a concurrent
execution frame wherein you’re forced to examine the same questions I
just asked, so whether or not to use one method or another is just
going to be a matter of examining the polling sources and semantics of
sleeping and latency.

You do not have to have a concurrent execution frame for timers or
alarms. On Unix, I normally use a priority queue of timers sorted by
expiration time to pick the timeout to use with select(), for example.
Time is just another event. If I recall, SDL enforces timers to run on
another thread, where you have much better latency, but often end up
having to SDL_PushEvent back to the main thread to do anything,
putting you back at the same point. To get the same effect, you can
easily have a runloop in a separate thread, if you need better
latency, and if you don’t have threading, then you’re screwed either
way, what you’ve got is what you’ve got.
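
A small sketch of that pattern, assuming a list of timers kept sorted
by expiration time (next_timer is a hypothetical accessor; the rest is
plain POSIX):

    #include <stddef.h>
    #include <sys/select.h>
    #include <sys/time.h>

    struct timer {
        struct timeval expires;     /* absolute expiration time */
        void (*fire)(void *arg);
        void *arg;
    };

    /* Hypothetical: earliest-expiring registered timer, or NULL. */
    extern struct timer *next_timer(void);

    static void wait_for_events(int display_fd)
    {
        fd_set rd;
        struct timeval now, timeout, *tvp = NULL;
        struct timer *t = next_timer();

        FD_ZERO(&rd);
        FD_SET(display_fd, &rd);

        if (t) {
            gettimeofday(&now, NULL);
            timersub(&t->expires, &now, &timeout);
            if (timeout.tv_sec < 0)         /* already late: don't block */
                timeout.tv_sec = timeout.tv_usec = 0;
            tvp = &timeout;
        }

        /* Sleep until the display connection has data or the earliest
         * timer is due; time is just another event source. */
        select(display_fd + 1, &rd, NULL, NULL, tvp);

        gettimeofday(&now, NULL);
        if (t && !timercmp(&now, &t->expires, <))
            t->fire(t->arg);                /* dispatch, no queueing */
        /* if FD_ISSET(display_fd, &rd): read and dispatch UI events */
    }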

So if you’re in that worst case “DOS with a PIO joystick”, you can
just set yourself a timer with an expiration of right now every time
that you check the inports. Sure, you’re using 100% of the CPU once
more, but hey, you’re on DOS, everything uses 100% CPU all the time.
:wink:

And again I completely disagree. Supporting the 1.2 API should be the
default behavior of SDL 1.3 applications. This can be easily
achieved by giving SDL default concurrent event handlers which simply
push events into the event queue. The main execution frame will pick
up the events in the SDL event queue using the old APIs. There is
absolutely no difficulty here.

Since I’ve previously demonstrated that the queue had no effect beyond
sometimes losing random events in extreme cases, we’re (logically)
fine. The problem is that sticking an “always on” emulation of the 1.2
API would lose the nice feature of being able to select the set of
events you’re interested in, but that’s easily fixed: just add an
SDL_INIT_EVENT flag for SDL_Init, that new applications can set to
disable the emulation. Old apps don’t pass it in, and get the existing
behaviour (actually slightly different in the extreme cases where
events would have been lost, but that’s okay, the old behaviour is
basically “go nuts”, anything is better).
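
In usage terms the idea is simply this (SDL_INIT_EVENT is the proposed
flag from above, with a made-up value; it is not an existing SDL
constant):

    #include "SDL.h"

    #define SDL_INIT_EVENT 0x00004000   /* hypothetical value for the proposed flag */

    static void init_old_style(void)
    {
        /* Old application, unchanged: gets the 1.2-compatible queue
         * emulation. */
        SDL_Init(SDL_INIT_VIDEO);
    }

    static void init_new_style(void)
    {
        /* New application: opts out of the emulation and registers
         * callbacks for only the events it actually wants. */
        SDL_Init(SDL_INIT_VIDEO | SDL_INIT_EVENT);
    }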

You are correct, there is no difficulty, even for platforms that do
not have threads.

Oh well. My plan is an API
where you call a method to run, erm, the runloop for an amount of
time, which might also be 0 (just dispatch one event, or a few already
ready events, no waiting) or infinite (just run until told to quit).
This should provide the opportunity to set up an emulation for a
polling API.

Could you please explain this approach a little further? If I follow
you correctly, this sort of inversion of control is completely
unnecessary as a mandate from the SDL API and it’s more than a big
divergence from how SDL has done things in the past. I contend that
all these things can be controlled by the user without sacrificing
much of anything.

It could return an SDL_Event, but that’s somewhat less elegant. If you
pass in an SDL_Event* as an “out parameter”, you have to copy the
event, even if you’ve got a perfectly fine one sitting right here in
the queue (remember that the original case for fastevent would still
involve queueing, so cutting down on blitting buffers around might be
nice). You could return a pointer instead, but then where is it
located? Who owns that memory? Should you free it? Eek, hitting the
allocator all the time, too?

Also, one of my interests is positioning SDL to be as low level as
possible on as many platforms as possible. Some platforms, like Mac OS
X, have an inverted flow of control like that already, and as I said
before, it’s easier to turn something into an inverted flow of control
(just need to deal with one event at a time, dispatch right away
without having to convert the event into an SDL_Event necessarily)
than turn it into how SDL currently works (which requires queueing
where there wasn’t before, or weird hacks like running the whole
program from within a single event handler like we currently do).
Adopting the style that has the lowest overhead to adapt to is a good
way to keep close to the metal in as many situations as possible.

I admit that inverting the control flow is a bit heavy-handed (that’s
why I wasn’t sure about it, my idea could be done either way), in that
it forces code to be better behaved, and removes opportunities for
incorrect usage. For example, with Xlib, where you have XSelectInput
to optimize what event gets sent, you can set up your XNextEvent to
process an event, and never get it, because you forgot to change the
XSelectInput event mask (or the reverse, set the mask, then get
spurious events from XNextEvent). With an API where you register a
callback, you can’t screw up: if you register the callback, you
claim your interest and set up dispatch at a single point. It also
enforces having a proper main loop set up, instead of letting bad
habits like "just checking SDL_GetKeyState whenever you feel like"
fester.

But it’s true that we might not want to force good code on people.
Maybe I just did too much Python recently (I used to be a Perl guy), I
blame my employer… ;-)

On Fri, Jan 9, 2009 at 3:29 AM, Donny Viszneki <donny.viszneki at gmail.com> wrote:


http://pphaneuf.livejournal.com/

What follows has a fair bit of hand waving and is taking a very
general view… That means there are lots of special cases where
what I’m going to say is fixed in one way or another. So, take it for
what it is worth.

It is not possible to ensure that you always get every event.

The reason for this is that a computer is a finite device. It has
finite memory and finite processing speed. But, more importantly, the
memory and processing power are allocated in advance, and those
allocations make a lot of assumptions about what is expected to happen
and systems fail when those assumptions are violated.

Consider a common I/O device: the network interface. This device
usually talks to the CPU using interrupts. (I say usually because that
was not always the case.) Does the network interface ever loose
events? Yes, it does. If the network is overloaded (IP or Ethernet,
makes no difference) packets are lost in the network and never reach
my network interface. We handle this problem by acknowledging packets.
That pushes the problem of regenerating the packets back to the
source. And it means that the source must be able to regenerate the
packets and must pause until it is sure the packets it has sent have
been received. That means the source device must pause the application
that is trying to send the packets.

The OS has a limited ability to queue packets that have reached my
machine. If the number of packets it has received exceeds its storage
capacity what can the OS do? Why, it just ignores the packets and does
not acknowledge them even if it received them. In this case, it solves
its shortage of storage by passing the buck across the network to the
source machine.

Can the application help the OS out? Why yes it can. The application
can grab all the pending packets when ever possible and queue them up.
If we do that then the application can act to extend the OS queue
capacity and reduce the chance of lost packets. Don’t believe me? Try
streaming as many UDP/IP packets as you can across a LAN and see the
effect of application queue size on the total number of lost packets.

That is how it works in networks, so what about using devices like
mice and joysticks? Same thing happens actually. It all depends on how
fast you can get the events from the device but it is still the case
that the OS has a limited amount of space to hold input from any
device. But, in the case of a mouse or joystick the device can not
regenerate the input events. What is an OS to do? Both
the mouse and the joystick have movement and button push events. When
the mouse moves you can just modify the event at the top of the queue
(if it is a movement event) and add the current dx,dy to the existing
dx,dy and sort of lie about where the mouse has been. The application
is sure to get the mouse’s current location, but loses information
about its path. You can’t do that with buttons though… you have to
generate a new button event for each change of state. So, if the
application doesn’t get mouse/joystick events often enough the OS
queue fills up and input is just lost.

Again, the application can help out by clearing the queue as often as
possible and storing as many events in application memory as possible.

And, think about this: OS developers tend to want to use as little
space as possible so that the application has as much space as
possible. It is easy for an application to store thousands of events
when the OS may only be willing to store a few.

No matter how much queuing is done at each layer of the input system
it is still possible for a burst of events to be generated at a time
when the application queue is full or when the application is busy.
The event burst will fill the OS queues, and the last part of the
burst will be lost. Nothing you can do about it. What you can do is
make the probability of losing events lower and lower by increasing
the size of the application queues and increasing the frequency at
which the application transfers events from the OS queue to the
application queue.

Oh, yeah… The thing about network packets being lost and the source
OS being finally responsible for resending lost packets and having to
pause an application that is trying to send more packets than the
source machine can queue? Well that is why you can have nonblocking
network input, but you can not have non-blocking network output. You
can get close to nonblocking network output, but you can’t get it
because if all the queues are full from your application to the
destination machine, your application must pause until the queues
are emptied. If you are stuck in traffic you must wait for the cars
in front of you to move before you can move. (OK, I’ve been known to
drive off the Interstate through the grass to the frontage road and in
some cities I have seen people driving on the sidewalks, but even
those resources fill up. :slight_smile:

Some problems just can’t be solved, but you can make the chance of
running into the problem fairly small.

Bob Pendleton


It is not possible to ensure that you always get every event.

I agree. But there’s a big difference between network events that can
get blasted at high speed, and keyboard events that go at up to 20 per
second (that’s assuming a really advanced 120 wpm touch typist). And
most specifically, losing the KeyUp event corresponding to a KeyDown
would be very annoying (the user could un-stick the key by pressing it
again, but it’s far more visible and embarrassing than TCP retries
after dropped packets!). Or, for example, a timer event, I’d be okay
with it arriving late, but never arriving would be fairly insulting
(the memory is all pre-allocated when you register the timer, one only
has to walk the priority queue of registered timers to find out the
expired ones).

Can the application help the OS out? Why yes it can. The application
can grab all the pending packets when ever possible and queue them up.
If we do that then the application can act to extend the OS queue
capacity and reduce the chance of lost packets. Don’t believe me? Try
streaming as many UDP/IP packets as you can across a LAN and see the
effect of application queue size on the total number of lost packets.

TCP has flow control designed into it for that purpose, for example,
and doing user-space queueing foils it. As I mentioned in my last
email, you can’t remove a queueing step once it’s there, so if you
provide one, and it more or less cripples the TCP sliding window, the
application cannot “undo the damage”. The crippling is done. Whereas
if the application decides that it needs to queue in order to operate
properly, if there’s no queueing provided, it can be added, leaving
the options open.

Also, it’s very unclear to me that there is any improvement to get by
moving packets from one queue to the other (from the OS queue to the
application queue). If the buffer isn’t big enough, use
setsockopt(SO_RCVBUF). What I found to be optimal for throughput is to
process UDP packets in batches, with a cap for fairness (don’t want to
starve the UI for the network).

A common mistake is to process (not just receive into a queue, but
actually process) just one packet per notification (which,
incidentally, is what you get if you SDL_PushEvent the packets and
process them using SDL_WaitEvent, since the latter does a
SDL_PumpEvents every time).
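
A sketch of the batched approach on a non-blocking UDP socket
(handle_packet is a stand-in for the application’s own processing, and
the cap is arbitrary):

    #include <stddef.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    #define MAX_PACKETS_PER_WAKEUP 32

    extern void handle_packet(const void *data, size_t len);  /* hypothetical */

    static void drain_socket(int fd)
    {
        char buf[1500];
        int n;

        for (n = 0; n < MAX_PACKETS_PER_WAKEUP; n++) {
            ssize_t len = recv(fd, buf, sizeof(buf), 0);
            if (len < 0)
                break;  /* EAGAIN/EWOULDBLOCK means drained; real errors also stop here */
            handle_packet(buf, (size_t)len);
        }
        /* If the cap was hit, select()/poll() will report the socket
         * as readable again right away, after the UI has had a turn. */
    }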

I’m quite aware that lossage might be possible, but I’m just saying
that losing information (feedback) about the lossage is also very bad,
preventing correct throttling behaviours from kicking in. A classic
example is an old hack where you could do PPP tunnelled over SSH to
have a home-grown VPN. It turns out that this has very bad latency
implications, because SSH provides a fully-reliable transport (thanks
to using TCP), and TCP actually relies on feedback in the form of lost
packets to notice congestion, so running TCP-over-TCP is a bad idea
(see http://sites.inka.de/~W1011/devel/tcp-tcp.html for a good
explanation). Having an extra queue in SDL loses feedback.

Also, people often underestimate the cost of getting and queueing
spurious events. I once worked on packet capture software, and using a
careful kernel-side packet-matching filter (using “tcpdump -d” and
setsockopt(SO_ATTACH_FILTER)) made the difference between the computer
(a mid-range 1.6 GHz machine, not some dinky ARM board) being
overwhelmed (losing packets at 100% CPU usage) and keeping up easily with
a CPU usage in the 10% range. This is related to my point about
applications being able to state whether they are interested or not in
an event, instead of just getting everything and letting the
application throw away events it does not need.

That is how it works in networks, so what about using devices like
mice and joysticks? Same thing happens actually. It all depends on how
fast you can get the events from the device but it is still the case
that the OS has a limited amount of space to hold input from any
device. But, in the case of a mouse or joystick the device can not
regenerate the input events. What is an OS to do? Both
the mouse and the joystick have movement and button push events. When
the mouse moves you can just modify the event at the top of the queue
(if it is a movement event) and add the current dx,dy to the existing
dx,dy and sort of lie about where the mouse has been. The application
is sure to get the mouse’s current location, but loses information
about its path. You can’t do that with buttons though… you have to
generate a new button event for each change of state. So, if the
application doesn’t get mouse/joystick events often enough the OS
queue fills up and input is just lost.

For the mouse, that’s the event coalescing I was talking about, which
does reduce the quality of the input, but is in my opinion better than
"100% accurate, until it becomes utter trash". If your application
isn’t keeping up, slowly and gently degrading is much better than
going into some awful stuttering.

For the buttons, one could drop further button events without dropping
other events, and be careful to post one last ButtonRelease before
doing so. They also happen to be very low-intensity events (rarely
more than 10-20 per second, I’d guess?) and they can be very easy to
process (updating a “button_is_pressed” boolean isn’t exactly a ton more
work than dropping the event). Again, there’s already a queue in the
display server that could lose those events, so why add our own? Is it
better to lose them inside SDL, somehow (the application doesn’t see
them either way)?

Again, the application can help out by clearing the queue as often as
possible and storing as many events in application memory as possible.

And rob the display server of chances of applying its own throttling.
Once you’ve done that, you can’t undo it, but you can always add it
later if that’s a problem…

And, think about this: OS developers tend to want to use as little
space as possible so that the application has as much space as
possible. It is easy for an application to store thousands of events
when the OS may only be willing to store a few.

Most input systems are in user-space already; SDL is getting those
over Unix domain sockets or Mach ports, and Windows has a limit of
10,000 messages (SDL’s MAXEVENTS is a dinky 128 in comparison!). This
is a non-issue. If we have a platform that does not, then the SDL
driver for that platform can provide some queueing to provide a
minimum expected level of service, but doing that in the generic code
isn’t a clear win to me (the SDL core code can definitely provide a
single queue implementation that platform drivers can share, of
course, it should just not be mandatory).

Oh, yeah… The thing about network packets being lost and the source
OS being finally responsible for resending lost packets and having to
pause an application that is trying to send more packets than the
source machine can queue? Well that is why you can have nonblocking
network input, but you can not have non-blocking network output. You
can get close to nonblocking network output, but you can’t get it
because if all the queues are full from your application to the
destination machine, your application must pause until the queues
are emptied. If you are stuck in traffic you must wait for the cars
in front of you to move before you can move. (OK, I’ve been known to
drive off the Interstate through the grass to the frontage road and in
some cities I have seen people driving on the sidewalks, but even
those resources fill up. :slight_smile:

I use non-blocking sending with small kernel SO_SNDBUF on purpose in
some situations, because I want the explicit feedback from the
network, telling me that I’m overloading it (incidentally, the
recipient buffering everything into user-space would screw this up,
again), and I can coalesce events in a way that makes sense for my
application. For example, in Quadra (a T*tris derivative), there’s a
“watch” feature where you can see other players’ blocks as they fall,
but if the bandwidth is too low, you don’t get the full sequence of
block moves with a delay, you just get a single “big move”,
potentially putting it in its final place, and this is accomplished by
reacting appropriately when send() returns EWOULDBLOCK. This clever
protocol is not possible with SDL_net as of now, by the way, even
though it’s completely portable.
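
A sketch of that trick: a deliberately small kernel send buffer plus a
non-blocking socket, so EWOULDBLOCK becomes explicit “you’re sending
too fast” feedback the game can react to (remember_latest_state is a
hypothetical stand-in for Quadra’s coalescing logic; error handling
trimmed):

    #include <errno.h>
    #include <fcntl.h>
    #include <stddef.h>
    #include <sys/socket.h>

    extern void remember_latest_state(const void *msg, size_t len);  /* hypothetical */

    static void setup_feedback_socket(int fd)
    {
        int sndbuf = 4096;                        /* deliberately small */
        setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf));
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
    }

    static void send_move(int fd, const void *msg, size_t len)
    {
        if (send(fd, msg, len, 0) < 0 &&
            (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* The network is backed up: don't queue the incremental
             * move, just remember the latest state and send one
             * coalesced "big move" when the socket is writable again. */
            remember_latest_state(msg, len);
        }
    }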

When I see the traffic on the road outside the office, I just stay at
the office and work some more. This is “free work” I accomplish
instead of being stuck in traffic. If I lose the feedback, then I can
just try to drive home and hope for the best, where I might or might
not “pause” on the freeway. ;-)

On Fri, Jan 9, 2009 at 2:12 PM, Bob Pendleton wrote:


http://pphaneuf.livejournal.com/

Well, I see you did exactly what I asked you not to do. You looked at
the details and not the concepts. I’m only going to respond to a small
part of your “counter” examples. There is no point in arguing details
after telling you that I already know the details.

It is not possible to ensure that you always get every event.

I agree. But there’s a big difference between network events that can
get blasted at high speed, and keyboard events that go at up to 20 per
second (that’s assuming a really advanced 120 wpm touch typist). And
most specifically, losing the KeyUp event corresponding to a KeyDown
would be very annoying (the user could un-stick the key by pressing it
again, but it’s far more visible and embarrassing than TCP retries
after dropped packets!).

And, it can happen in SDL right now that you lose a keyup because the
input queue is already full. Unlikely, but possible.

Or, for example, a timer event, I’d be okay
with it arriving late, but never arriving would be fairly insulting
(the memory is all pre-allocated when you register the timer, one only
has to walk the priority queue of registered timers to find out the
expired ones).

If the timer tries to put an event on a full queue, what happens?
Either the timer has to wait until the queue has an empty slot or the
event has to be dropped. The fact that the queue memory is already
allocated is the reason why the event is dropped. It is just like the
case of network packets, either the event has to be dropped or the
source has to take responsibility for delivering it. If the timer
decides to wait, the event will be delivered late, which as you say is
better than never. What happens if the timer has to wait past the time
of the next event it is supposed to deliver? How far behind can it
get?

Yes, you can reduce the probability to a very low level, but you can’t
get rid of it.

Bob Pendleton


And, it can happen in SDL right now that you lose a keyup because the
input queue is already full. Unlikely, but possible.

Yes, and I don’t like it, hence my wanting to fix that. There might be
event lossage upstream, but it won’t be our fault (and Windows
10,000-deep per thread queue is a bit less likely to lose events than
SDL’s 128-deep one).

If the timer tries to put an event on a full queue, what happens?
Either the timer has to wait until the queue has an empty slot or the
event has to be dropped. The fact that the queue memory is already
allocated is the reason why the event is dropped. It is just like the
case of network packets, either the event has to be dropped or the
source has to take responsibility for delivering it. If the timer
decides to wait, the event will be delivered late, which as you say is
better than never. What happens if the timer has to wait past the time
of the next event it is supposed to deliver? How far behind can it
get?

I said that the timer registrations are already allocated, not the
event queue (I’m trying to abolish using the event queue in the main
code path, remember?). :wink:

You have the list of timers, sorted by expiration time. You have the
current time. You just processed (and dispatched callbacks right away,
without queueing) UI events. You go through the list of timers,
calling the callback, without ever allocating a single byte of memory.
In fact, as you go through the list and remove the expired and
processed timers, you might free memory. You still deliver timer
events in order of their expiration time, so that you don’t go insane.
Never dropping a timer event.
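
A minimal sketch of that walk, with a singly-linked list kept sorted
by expiration time (nothing in it allocates; the nodes were allocated
when the timers were registered):

    #include <stddef.h>
    #include <sys/time.h>

    struct timer_node {
        struct timeval expires;
        void (*fire)(void *arg);
        void *arg;
        struct timer_node *next;    /* sorted by expires, earliest first */
    };

    static struct timer_node *timer_list;   /* registered timers */

    static void dispatch_expired_timers(void)
    {
        struct timeval now;
        gettimeofday(&now, NULL);

        /* Earliest first, so we can stop at the first timer that has
         * not expired yet. */
        while (timer_list && !timercmp(&now, &timer_list->expires, <)) {
            struct timer_node *t = timer_list;
            timer_list = t->next;   /* unlink; whoever registered it owns the memory */
            t->fire(t->arg);        /* call back right away, in expiration order */
        }
    }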

Some things, like networks, are inherently unreliable. But on the
other hand, some other things like time going forward, are fairly
reliable. :wink:

Yes, if your processing takes too long, timers can get behind. But
instead of dropping random events, the program can now make a decision
in its callback, such as skipping rendering some frames, doing only
the game engine processing, thus lowering the framerate without
actually slowing down the game.

I’ve done this stuff for a long time, I know how to do it, and I’ll do
it, there’s no doubt about it. I’d certainly prefer not to fork SDL in
the process (because fixing this can’t be done well on top of it, I
have to mess with its internals), but I’m sure Sam is reasonable, if
what I do works well and seems better, then I’m sure he’ll take in the
patch. If not, I’ll maintain my fork quietly, and that’ll be that. The
only real difficulty here is the SDL 1.2 compatibility, which is
needed for Sam to accept this ever (and quite understandably!). If
it were just up to me, I don’t know, maybe I’d take advantage of the fact
that 1.3 is already incompatible with 1.2, and just say that you have to port
your event code to the new API.

On Fri, Jan 9, 2009 at 6:40 PM, Bob Pendleton wrote:


http://www.linkedin.com/in/pphaneuf

the mouse and the joystick have movement and button push events. When
the mouse moves you can just modify the event at the top of the queue
(if it is a movement event) and add the current dx,dy to the existing
dx,dy and sort of lie about where the mouse has been. The application

It’s even simpler than that on Windows (outside of SDL): when mouse
motion triggers a hardware interrupt, Windows sets a flag for each app
that is looking for mouse events. When the app pumps the event queue,
Windows supplies the event then, in relation to the previous x,y coordinates
for that application, and clears the flag until the next interrupt.

So total storage for mouse motion events, per app, is one bit, plus X
and Y coordinates, and you don’t ever have to dig through the queue to
replace or update an existing mouse motion event. It’s actually really
smart, since this is probably the only event type that would ever really
risk overflowing a well-written program’s queue.
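
In pseudo-SDL terms, the scheme Ryan describes boils down to something
like this (not actual Windows or SDL internals, just the shape of it):

    #include "SDL.h"

    static int mouse_dirty;           /* set when new motion has arrived */
    static int mouse_x, mouse_y;      /* latest position only */
    static int last_x, last_y;        /* position reported at the last pump */

    static void on_hardware_motion(int x, int y)   /* "interrupt" side */
    {
        mouse_x = x;                  /* intermediate positions are dropped */
        mouse_y = y;
        mouse_dirty = 1;
    }

    static void pump_mouse(void)                   /* pump-time side */
    {
        if (mouse_dirty) {
            SDL_Event ev;
            ev.type = SDL_MOUSEMOTION;
            ev.motion.x = mouse_x;
            ev.motion.y = mouse_y;
            ev.motion.xrel = mouse_x - last_x;     /* relative to last pump */
            ev.motion.yrel = mouse_y - last_y;
            SDL_PushEvent(&ev);
            last_x = mouse_x;
            last_y = mouse_y;
            mouse_dirty = 0;
        }
    }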

But then again, it’s also a pathological case of event queue overflow:
you literally drop all mouse motion events between pumps of the queue,
but it doesn’t cause overflow that makes you lose other events…and
between losing a key release and losing mouse motion resolution, it’s
the wiser choice.

–ryan.

At any rate, I’m not going to restrict my plans for SDL development to
what Sam can license unless he wants to give me a job! (Sam, is your
new company hiring?)

Not at the moment! :slight_smile:

See ya,
-Sam Lantinga, Founder and President, Galaxy Gameworks LLC

So… I’m excited about this. Can you post some pseudo code and flow
diagrams?

See ya!
-Sam Lantinga, Founder and President, Galaxy Gameworks LLC

Sam, with all the traffic on this mailing list, it would help us keep track if
you would show and quote the person you’re responding to. I haven’t a clue as
to what you’re excited about, but maybe that’s just me :wink:

Jeff

On Saturday 10 January 2009 21:41, Sam Lantinga wrote:

So… I’m excited about this. Can you post some pseudo code and flow
diagrams?

See ya!
-Sam Lantinga, Founder and President, Galaxy Gameworks LLC

I felt it was intuitive that he meant all of the above. I have fallen
behind in this thread by three emails myself, I imagine Sam wants some
pictures to help him see clearly what everyone is saying.

Haven’t you ever been to a board meeting? :stuck_out_tongue:

P.S. Trying to diagram things is a good exercise for identifying
cross-cutting design considerations!

On Sun, Jan 11, 2009 at 1:21 AM, Jeff Post <j_post at pacbell.net> wrote:



http://codebad.com/

Haven’t you ever been to a board meeting? :stuck_out_tongue:

Far too many of them.

P.S. Trying to diagram things is a good exercise for identifying
cross-cutting design considerations!

If only we had time to diagram everything, life would be much easier :wink:

Jeff

On Saturday 10 January 2009 22:22, Donny Viszneki wrote:

Sorry, to be more clear, I’m really interested in seeing some details
on the architecture and implementation that Pierre Phaneuf has in mind.
He clearly has thought a lot about it, and I’m curious how he plans to
go about it.

BTW, in case nobody noticed, there is already a simple callback mechanism
for SDL events that bypasses the event queue.
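
Presumably this refers to the event filter; with its SDL 1.2-era
signature it looks like this (the callback runs while events are being
pumped, before they reach the queue, and returning 0 drops the event):

    #include "SDL.h"

    static int my_filter(const SDL_Event *event)
    {
        if (event->type == SDL_MOUSEMOTION) {
            /* Handle (or ignore) motion right here and keep it out of
             * the queue, so it can never overflow it. */
            return 0;
        }
        return 1;   /* everything else goes into the queue as usual */
    }

    static void install_filter(void)
    {
        SDL_SetEventFilter(my_filter);
    }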

See ya,
-Sam Lantinga, Founder and President, Galaxy Gameworks LLC


So total storage for mouse motion events, per app, is one bit, plus X and Y
coordinates, and you don’t ever have to dig through the queue to replace or
update an existing mouse motion event. It’s actually really smart, since
this is probably the only event type that would ever really risk overflowing
a well-written program’s queue.

That’s more or less what I’d expect from a reasonable system. On
Windows, the message queue can become pretty exciting when using
things like WSAAsyncSelect (on the server side, anyway), but
otherwise, that’s generally true. But those are optimized in a similar
way: when the interrupts for more packets arrive past the first one,
no need to store more events, just when it goes from “empty buffer” to
"not empty", resetting the bit when it goes back to empty. It’s
actually a bit funkier, due to there being more than one socket, you
have to avoid going O(N) on the number of sockets, but it’s along
those lines…

And also, the queue is 10,000 deep instead of 128. :wink:

But then again, it’s also a pathological case of event queue overflow: you
literally drop all mouse motion events between pumps of the queue, but it
doesn’t cause overflow that makes you lose other events…and between
losing a key release and losing mouse motion resolution, it’s the wiser
choice.

While what you explained is more efficient than what Bob described, in
principle it gets more or less the same effect, the reduction in mouse
motion resolution being the “lie about where the mouse has been” bit
that he talks about.

I agree with you that this is the wiser choice, but what are you
saying in terms of what we should do to SDL? I see things like
SDL_INIT_EVENTTHREAD, and it does look a bit like someone thought that
the opposite was the wiser choice, so I’m guessing there is some
history there?

On Sat, Jan 10, 2009 at 10:13 PM, Ryan C. Gordon wrote:


http://pphaneuf.livejournal.com/

Sorry, to be more clear, I’m really interested in seeing some details
on the architecture and implementation that Pierre Phaneuf has in mind.
He clearly has thought a lot about it, and I’m curious how he plans to
go about it.

I have a pretty clear idea of how I’d do it from scratch, but I’m
trying to think how to make it fit well with the SDL 1.2 API at the
moment. I spent the day yesterday removing various things from
SDL_events.h (and SDL_events_c.h) and seeing what breaks, what needs
what, and so on. I’ll get back with some pseudo-code soon.

BTW, in case nobody noticed, there is already a simple callback mechanism
for SDL events that bypasses the event queue.

The filters? Yeah, I had forgotten about those, but I’d like it if it
was possible for a driver to do the correct XSelectInput (equivalent),
for example.

On Sun, Jan 11, 2009 at 2:33 AM, Sam Lantinga wrote:


http://pphaneuf.livejournal.com/

It seems to me that everyone favors a callback-driven event
dispatching API for the majority of events. Why is it hard to have the
default callbacks fill the SDL event queue, thus being backward
compatible?

On Sun, Jan 11, 2009 at 11:59 AM, Pierre Phaneuf wrote:



http://codebad.com/

Maybe I’m crazy, but from my perspective there exists a significant
duality between pollable event sources and signaled event sources.

That means this:

if a series of discrete events can be “coalesced” then they can be
polled (just think about using an implementation of
SDL_GetKeyboardState() driven by coalescing KEYDOWN and KEYUP events)
and if a data source can be polled regularly and key moments
discriminated from boring moments (meaning that you can tell the
difference between noteworthy changes and note-unworthy changes or a
complete absence of change) then you can receive discrete messages on
a pollable source.
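
For the first direction, a tiny sketch of a pollable keyboard state
maintained purely by coalescing the discrete key events (using SDL 1.2
names; SDL_GetKeyState is the 1.2 relative of the SDL_GetKeyboardState
mentioned above):

    #include "SDL.h"

    static Uint8 key_state[SDLK_LAST];   /* 1 = currently held, 0 = released */

    static void coalesce_key_event(const SDL_Event *event)
    {
        if (event->type == SDL_KEYDOWN)
            key_state[event->key.keysym.sym] = 1;
        else if (event->type == SDL_KEYUP)
            key_state[event->key.keysym.sym] = 0;
    }

    /* The pollable view: same information, no queue of discrete events. */
    static int key_is_down(SDLKey sym)
    {
        return key_state[sym];
    }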

Apparently MSWindows does just this: it coalesces pointer motion data
for each application which avoids queuing messages. It logically
follows, then, that SDL is handing application developers pointer motion
"events" which don’t really exist on MSWindows, they are imagined
through a combination of SDL’s polling action and MSWindows’
coalescent pointer data.

If we recognize the duality of pollable data sources (sources which do
not deliver discrete events, but whose state must be queried) and
discrete or signaled event sources (sources which deliver new data in
specific quanta through some API or other) then we can offer SDL users
an opportunity to choose the update delivery model that suits both
their project and the platform best.

For example on MSWindows, the windowing environment will coalesce
pointer data for you. Maybe that suits your application, so on
MSWindows you tell SDL you want to poll pointer motion. On another
platform without a windowing environment that will coalesce pointer
data for you, if you’ve told SDL you want to poll the pointer
position, it can still happen, through the magic of an abstracting
layer called SDL! I really don’t see anything very hard about this,
and it seems necessary for embedded platforms that might not coalesce
anything for you.
http://codebad.com/