It’s important to distinguish between concurrent execution frames,
logical threads, and kernel multi-threading provisions. If you propose
to deliver events (i.e. dispatch event handlers) without any queueing
or threading, you must be proposing to interrupt event handlers to
push other event handlers onto the stack within the same frame of
execution (remember, no threads?) Are you sure this is what you meant?
You cannot make guarantees about when an event handler will return,
so you are forced to solve a concurrency dilemma – until now, the
event queue has been the route, and I suggest that we have no
delusions about doing away with the event queue, only that we find
ways to skip over the event queue whenever the programmer has better
ideas about how his application will behave. A great example of that
is coming up…
No interruption of ongoing event handlers, no. What I’m proposing is
within the scope of Xlib-era programming (we’re not polling like DOS
games used to, but we’re single-threaded, except for signals, and we
shouldn’t rely on those, since outside of Unix/Linux, they’re rarely
available in a consistent manner). If you really want to process more
than one thing at a time (or at least, in a very quickly alternating
manner, if you’ve got only one CPU/core), you’ll have to use threads.
The point is, if you have just one thread processing events, queueing
them will not process them faster, and will in fact hide information
from the rest of the system (that we’re on the losing end of the war).
Right now, in the case of X11 for example, we read in a bunch of
events with a quick sequence of XNextEvent, putting them in the event
queue (that’s the “event pump”), then once we’re done with that, the
program can get them and process them one at a time. It’s been a while
I checked, but if I remember correctly, if there is MAXEVENTS events
in the queue, further events get dropped entirely, in many cases. To
make things more exciting, we pump events every time we ask for one,
so if the pump gets you more than one event on average, you’re
eventually screwed, and there’s nothing you can do about it other than
process them fast enough that the average goes below one. Even if the application does keep up, there is no time advantage to enqueueing the events; the overall time it takes is basically the time it takes the program to process them. In fact, queueing itself has a
small overhead (extra copies, bookkeeping, etc), but I think that’s
not very significant, unless you have a lot of events (similar to what
the documentation for fastevent says motivated it, now that I think
about it, so it might be a problem in some situations, but let’s not
optimize too early).
What I propose is simply that every time we ask SDL for an event, we
get just one. No queueing, just XNextEvent, then return that. In
reality, there will have to be a queue anyway, to receive events sent
using SDL_PushEvent, be it from the same thread or from another one.
So we check the queue first and return the top event if there’s one there; if there isn’t one, we do XNextEvent and return that. I’ll
overlook the “peeking” and “mask” features for now, they’re both kind
of nasty, but can be implemented by moving events to the queue only
when those icky features are being used.
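A minimal sketch of that flow, with stub names standing in for the real pieces (platform_next_event() plays the role of XNextEvent; none of this is actual SDL code):

```c
#include <stddef.h>

/* Hypothetical sketch: app_queue plus a blocking platform fetch. */
typedef struct { int type; int data; } Event;

#define QUEUE_CAP 16
static Event queue[QUEUE_CAP];
static int queue_len = 0;

/* SDL_PushEvent equivalent: append to the small internal queue. */
static int push_event(Event e) {
    if (queue_len == QUEUE_CAP) return 0;  /* full: caller decides */
    queue[queue_len++] = e;
    return 1;
}

/* Stub for the platform source (would be XNextEvent on X11). */
static Event platform_next_event(void) {
    Event e = { 1 /* e.g. a key press */, 42 };
    return e;
}

/* The proposed "ask for one event, get one event" call:
   drain pushed events first, then go straight to the platform. */
static Event next_event(void) {
    if (queue_len > 0) {
        Event e = queue[0];
        for (int i = 1; i < queue_len; i++) queue[i - 1] = queue[i];
        queue_len--;
        return e;
    }
    return platform_next_event();
}
```

The queue only ever holds events that were explicitly pushed; everything else bypasses it entirely.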
If you reverse the flow of the events, providing callbacks (you’d
register a callback for an event, then calling a hypothetical
SDL_ProcessEvents would call the callbacks synchronously before
returning) could allow even skipping the SDL_Event structure entirely!
Also, registering callbacks provides more information to SDL about
whether you are interested or not, instead of having to separately
specify flags (which I don’t think you can even do with SDL right
now). It’d be very silly for a game like Quadra, which doesn’t care at
all about mouse motion (only the location of clicks), to overflow the
event queue with motion events that it will promptly ignore, and
possibly miss an actual click event…
The big trick about skipping the event queue is that the platform’s input system actually provides one already, and, most importantly, that you can always ADD a queue, but you can’t take one OUT. A design without the queue therefore leaves you in a situation where adding one for your special situation is perfectly possible. The one that you add can have the exact overflow behaviours that you want, too, ignoring certain events or coalescing them as per your application’s needs. It also means giving the proper control flow information to the display servers (when you’re overwhelmed by mouse movement events, the thing to do is not “get more mouse movement events”, it’s to have the display server coalesce them a bit and send fewer).
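For instance, an application-supplied queue could coalesce consecutive motion events instead of overflowing; a rough sketch, with all names invented for illustration:

```c
/* Sketch of an application-added queue with an overflow behaviour
   the application chose itself: consecutive mouse-motion events are
   coalesced so only the latest pointer position is kept. */

enum { EV_MOTION = 1, EV_CLICK = 2 };

typedef struct { int type; int x, y; } Event;

#define CAP 8
static Event q[CAP];
static int q_len = 0;

static void enqueue(Event e) {
    if (q_len > 0 && e.type == EV_MOTION && q[q_len - 1].type == EV_MOTION) {
        q[q_len - 1] = e;   /* coalesce: overwrite with latest position */
        return;
    }
    if (q_len < CAP)
        q[q_len++] = e;     /* dropping instead would also be our choice */
}
```

Click events are never merged away, so the “miss an actual click” failure mode above can’t happen with this policy.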
Is that not the proper thing to do? We could debate that, but that
isn’t the point. The point is that we don’t need to make that
decision, some applications will expect one type of behavior and
others expect another (think hand-writing recognition, or drawing
something with a mouse – if the CPU hiccups, pen strokes or other
mouse movement information would be lost to oblivion if we
auto-consolidate incoming pointer events.) The real future for SDL
event handling is to let the user decide: a default event handler
provided by SDL will mimic the simple, backward-compatible behavior of
SDL 1.2, or the user can register his own handler to do what he wants.
It might also make sense to build a few likely candidates into SDL so
the user doesn’t have to implement them himself (for instance,
consolidating mouse movement data might be very appropriate behavior
for some applications; maybe some applications only want that behavior
some of the time. Wouldn’t that be an interesting application…)
Well, it doesn’t matter very much, because it’s what happens now
anyway (we’d have to use the motion history buffer in X11 to get
perfect motion), since it’s done by the display server…
It is true that queueing events and pumping them at every possible occasion might reduce that, as long as you don’t get overwhelmed, at which point you’ll eventually just lose random events, not just lose precision on the motion.
I’m not 100% sure about the requirements for supporting all the existing event sources; I’m just not sure it can be done without any polling. This design would certainly support adding a
(non-threaded) time-based event to poll for input from other devices.
That is not a bad idea for one means of implementing polling alongside
signal-driven event dispatching. However, it is not the only way; as I mentioned before, a thread can be used for each blocking read needed to
handle incoming events. Not all poll operations return immediately,
and if you have kernel threads available on a particular platform you
should be able to accommodate the possibility that the kernel will be
able to make use of the intervening wait period between calling your
polling API and it returning – if it can be done without introducing
application latency.
We can’t entirely rely on threads, since they’re not available
everywhere. I was thinking worst case here, of the “we’re on DOS and
this joystick uses PIO”. If you’re not stuck in 1987, then you’re all
good, for sure.
I’m not sure that we agree on what “polling” means, either. The way I
think of polling is where you have a “did anything happen” function
that is non-blocking, for each input source. If you’d block on any of
them, you wouldn’t respond to the others, so you can’t block anywhere,
you have to poll each of them in turn, one after another, with a small
sleep, sched_yield() or some such.
The opposite (in my mind) of “polling” is “event-driven”, as embodied
by the (unfortunately named, for the purpose of this discussion)
poll() system call on Unix (I don’t actually use that one often, I
tend to use select() or epoll).
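To illustrate the distinction, here is select() used in the event-driven style: instead of looping over non-blocking “did anything happen” checks with a small sleep, you block on a set of sources until one is ready or a timeout expires. A pipe stands in for a real input source in this sketch:

```c
#include <unistd.h>
#include <sys/select.h>

/* Block until fd is readable or timeout_ms elapses.
   Returns >0 if fd is readable, 0 on timeout, -1 on error. */
static int wait_readable(int fd, int timeout_ms) {
    fd_set rfds;
    struct timeval tv;
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, &tv);
}
```

With several sources, you’d put them all in the fd_set and select() would wake you for whichever one fires first, with no busy loop anywhere.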
You’re quite correct that you can exploit the "sleeping" CPU/core if you have threads, while one thread is waiting for events, but that’s rarely much of a problem, since the kinds of platforms advanced enough to have all of that usually provide a single event-dispatching system call able to wait for every kind of event.
The issue is that if you have a timer/alarm event that can remind you
to perform some polling event, then you already have a concurrent
execution frame wherein you’re forced to examine the same questions I
just asked, so whether to use one method or another is just
going to be a matter of examining the polling sources and semantics of
sleeping and latency.
You do not have to have a concurrent execution frame for timers or
alarms. On Unix, I normally use a priority queue of timers sorted by
expiration time to pick the timeout to use with select(), for example.
Time is just another event. If I recall, SDL forces timers to run on another thread, where you have much better latency, but often end up
having to SDL_PushEvent back to the main thread to do anything,
putting you back at the same point. To get the same effect, you can
easily have a runloop in a separate thread, if you need better
latency, and if you don’t have threading, then you’re screwed either
way, what you’ve got is what you’ve got.
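A sketch of that timer scheme, reduced to its essentials (times in milliseconds, and a sorted array standing in for a real priority queue):

```c
/* Timers kept sorted by expiration; the earliest one decides the
   timeout handed to select(). Illustrative only. */

typedef struct { long expires_ms; } Timer;

#define MAX_TIMERS 16
static Timer timers[MAX_TIMERS];
static int n_timers = 0;

/* Insert keeping the array sorted by expiration time
   (a real implementation would use a heap). */
static void add_timer(long expires_ms) {
    int i = n_timers++;
    while (i > 0 && timers[i - 1].expires_ms > expires_ms) {
        timers[i] = timers[i - 1];
        i--;
    }
    timers[i].expires_ms = expires_ms;
}

/* Timeout to pass to select(): time until the earliest timer,
   clamped at zero if already due; -1 means "block forever". */
static long next_timeout(long now_ms) {
    if (n_timers == 0) return -1;
    long dt = timers[0].expires_ms - now_ms;
    return dt > 0 ? dt : 0;
}
```

When select() returns, you pop every timer whose expiration is at or before “now” and dispatch it like any other event.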
So if you’re in that worst case “DOS with a PIO joystick”, you can
just set yourself a timer with an expiration of right now every time
that you check the inports. Sure, you’re using 100% of the CPU once
more, but hey, you’re on DOS, everything uses 100% CPU all the time.
And again I completely disagree. Supporting the 1.2 API should be the
default behavior of SDL 1.3 applications. This can be easily
achieved by giving SDL default concurrent event handlers which simply
push events into the event queue. The main execution frame will pick
up the events in the SDL event queue using the old APIs. There is
absolutely no difficulty here.
Since I’ve previously demonstrated that the queue had no effect beyond
sometimes losing random events in extreme cases, we’re (logically)
fine. The problem is that sticking an “always on” emulation of the 1.2
API would lose the nice feature of being able to select the set of
events you’re interested in, but that’s easily fixed: just add an
SDL_INIT_EVENT flag for SDL_Init, that new applications can set to
disable the emulation. Old apps don’t pass it in, and get the existing
behaviour (actually slightly different in the extreme cases where
events would have been lost, but that’s okay, the old behaviour is
basically “go nuts”, anything is better).
You are correct, there is no difficulty, even for platforms that do
not have threads.
Oh well. My plan is an API
where you call a method to run, erm, the runloop for an amount of
time, which might also be 0 (just dispatch one event, or a few already
ready events, no waiting) or infinite (just run until told to quit).
This should provide the opportunity to setup an emulation for a
polling API.
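A rough sketch of the zero-duration case of such a runloop, with stub event sources (all names here are hypothetical, not a proposed SDL API):

```c
/* run-for-zero: dispatch only events that are already ready, never
   wait. A real version with a positive budget would block in
   select() until the deadline; that part is omitted here. */

static int pending = 0;      /* stub: count of already-ready events */
static int dispatched = 0;

/* Dispatch one ready event; returns 0 when nothing is ready. */
static int dispatch_one(void) {
    if (pending == 0) return 0;
    pending--;
    dispatched++;
    return 1;
}

static void run_for_zero(void) {
    while (dispatch_one())
        ;                    /* drain, then return immediately */
}
```

A polling-style 1.2 emulation then falls out naturally: SDL_PollEvent becomes “run for zero, then look at what the default handler queued up.”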
Could you please explain this approach a little further? If I follow
you correctly, this sort of inversion of control is completely
unnecessary as a mandate from the SDL API, and it’s quite a big divergence from how SDL has done things in the past. I contend that
all these things can be controlled by the user without sacrificing
much of anything.
It could return an SDL_Event, but that’s somewhat less elegant. If you
pass in an SDL_Event* as an “out parameter”, you have to copy the
event, even if you’ve got a perfectly fine one sitting right here in
the queue (remember that the original case for fastevent would still
involve queueing, so cutting down on blitting buffers around might be
nice). You could return a pointer instead, but then where is it
located? Who owns that memory? Should you free it? Eek, hitting the
allocator all the time, too?
Also, one of my interests is positioning SDL to be as low level as
possible on as many platforms as possible. Some platforms, like Mac OS
X, have an inverted flow of control like that already, and as I said
before, it’s easier to turn something into an inverted flow of control
(just need to deal with one event at a time, dispatch right away
without having to convert the event into an SDL_Event necessarily)
than turn it into how SDL currently works (which requires queueing
where there wasn’t before, or weird hacks like running the whole
program from within a single event handler like we currently do).
Adopting the style that has the lowest overhead to adapt into is good for keeping close to the metal in as many situations as possible.
I admit that inverting the control flow is a bit heavy-handed (that’s why I wasn’t sure about it; my idea could be done either way), in that it forces code to be better behaved, and removes opportunities for
incorrect usage. For example, with Xlib, where you have XSelectInput
to optimize what event gets sent, you can set up your XNextEvent to
process an event, and never get it, because you forgot to change the
XSelectInput event mask (or the reverse, set the mask, then get
spurious events from XNextEvent). With an API where you register a callback, you can’t screw up: by registering the callback, you claim your interest and set up dispatch at a single point. It also
enforces having a proper main loop set up, instead of letting bad
habits like "just checking SDL_GetKeyState whenever you feel like"
fester.
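A toy sketch of that single registration point (all names invented): registering a handler both declares your interest, from which a selection mask can be derived automatically, and sets up dispatch, so the two can never disagree:

```c
/* Callback registration as the single source of truth for both the
   event mask and the dispatch table. Illustrative names only. */

enum { EV_KEY = 0, EV_MOTION = 1, EV_TYPES = 2 };

typedef void (*Handler)(int data);

static Handler handlers[EV_TYPES];  /* NULL means "not interested" */
static int last_key = -1;

static void register_handler(int type, Handler h) {
    handlers[type] = h;   /* one call: interest + dispatch, together */
}

/* The XSelectInput-style mask falls out of the table for free. */
static unsigned interest_mask(void) {
    unsigned mask = 0;
    for (int t = 0; t < EV_TYPES; t++)
        if (handlers[t]) mask |= 1u << t;
    return mask;
}

static void dispatch(int type, int data) {
    if (handlers[type])   /* uninteresting events are simply skipped */
        handlers[type](data);
}

static void on_key(int data) { last_key = data; }
```

There is no way to register for dispatch without also updating the mask, which is exactly the Xlib mistake this closes off.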
But it’s true that we might not want to force good code on people.
Maybe I just did too much Python recently (I used to be a Perl guy), I
blame my employer… ;-)
–
http://pphaneuf.livejournal.com/