Let’s look at the basics of keyboard input.
At the bottom level is the hardware. Keyboards send “scan codes” to the
operating system. The scan codes encode the location of a key and
whether it was pressed or released, nothing else. Scan codes are device
dependent. Every different type of keyboard generates slightly different
scan codes.
The operating system knows what kind of keyboard is attached and
converts scan codes into a small set of key codes. Key codes are mapped
one to one with scan codes. The scan code, whatever it may be, for
"enter" gets mapped to the key code for enter. Operating systems define
a set of key codes and stick to it. But, again, key codes are purely
positional. They do not tell you what character has been input, they
only tell you which key was pressed.
(Ok, that isn’t exactly true. The conversion from scan codes to key
codes can happen in the OS, or the windowing system. And, the user
usually has the ability to change a table that tells the system which scan
codes map to which key codes. This means that anything we do can be
messed up by the users, and often is.)
So,
scan codes (device DEpendent) -> key codes (device INdependent)
If I type the key at the location of the “Z” key on a US qwerty keyboard
and press the key at the same location on a Dvorak keyboard I should get
the same key code no matter what is printed on the top of the key.
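The scan code to key code step can be sketched as a per-device lookup table. This is only an illustration: the keyboard models, the `KeyCode` values, and the function name are hypothetical, and the scan code numbers are illustrative rather than taken from any particular device's documentation.

```c
#include <assert.h>
#include <stddef.h>

/* Device-independent key codes (hypothetical values, for illustration only). */
typedef enum { KEY_NONE = 0, KEY_ENTER, KEY_Z_POSITION } KeyCode;

/* Two made-up keyboard models that report different scan codes. */
typedef enum { KEYBOARD_MODEL_A, KEYBOARD_MODEL_B, KEYBOARD_MODEL_COUNT } KeyboardModel;

typedef struct { unsigned scan_code; KeyCode key_code; } ScanMapEntry;

/* One table per keyboard model; the scan code numbers are illustrative. */
static const ScanMapEntry model_a_map[] = { {0x1C, KEY_ENTER}, {0x2C, KEY_Z_POSITION} };
static const ScanMapEntry model_b_map[] = { {0x5A, KEY_ENTER}, {0x1A, KEY_Z_POSITION} };

/* Translate a device-dependent scan code into a device-independent key code.
   Returns KEY_NONE when the table has no entry for the scan code. */
KeyCode key_code_from_scan_code(KeyboardModel model, unsigned scan_code)
{
    assert(model == KEYBOARD_MODEL_A || model == KEYBOARD_MODEL_B);

    const ScanMapEntry *map = (model == KEYBOARD_MODEL_A) ? model_a_map : model_b_map;
    size_t count = (model == KEYBOARD_MODEL_A)
        ? sizeof model_a_map / sizeof model_a_map[0]
        : sizeof model_b_map / sizeof model_b_map[0];

    for (size_t i = 0; i < count; i++) {
        if (map[i].scan_code == scan_code) {
            return map[i].key_code;
        }
    }
    return KEY_NONE;
}
```

With these tables, `key_code_from_scan_code(KEYBOARD_MODEL_A, 0x2C)` and `key_code_from_scan_code(KEYBOARD_MODEL_B, 0x1A)` both give `KEY_Z_POSITION`: different scan codes, same key code.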
Key codes are used in two different ways. Individual key codes can be
mapped to key symbols. That is, you can find out what symbol is printed
on a specific key based on its key code. The trouble is that that key
symbol can be very different depending on the language supported by that
keyboard. The same key code can be mapped to many different symbols.
Many of the keys do not map to single letters, they map to phrases. For
example: the enter key maps to the phrase “enter” while the “L” key maps
to the single letter “L”. And, to make things more complex the letters
and phrases are in the language and alphabet (or ideograms) of the
language the keyboard supports.
Key symbols are not the end of the process. Sequences of key codes are
composed to create texts. A text is a letter or symbol in a language. In
English “a” and “A” and ^A (control A) are all different texts that are
entered by composing the key code for the “A” key with zero or more
modifier keys. In this case “A” was composed with no modifiers to get
"a", with the right shift key to get “A” and the right control key to
get ^A. The CJK languages (and many others) take composition to extremes
where entering a long sequence of key strokes yields a single character
or in some cases a sequence of characters. Input in those languages
usually goes through an IME (input method editor) which is a small
window that pops up to show you what character is being composed.
So,
key codes (language INdependent) -> key symbols (language DEpendent)
and,
sequences of key codes (language INdependent) -> texts (language DEpendent)
All the information about the key symbols and the rules for composing
symbols into texts is in the computer, either in the OS, or the
windowing system. The keyboard has no information. The keyboard just
tells you which key has been pressed.
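As a rough sketch of composition, here is the “A” key example in C. The modifier flags, the key code constant, and the function name are all hypothetical, and the rule set covers only the one key on a US English layout. A real composer lives in the OS or windowing system and is far more involved.

```c
#include <assert.h>

/* Hypothetical modifier flags and positional key code, for illustration. */
enum { MOD_NONE = 0, MOD_SHIFT = 1 << 0, MOD_CTRL = 1 << 1 };
enum { KC_A_POSITION = 1 };

/* Compose one key code plus modifier state into a "text" for a US English
   layout. Returns the resulting character, or 0 if this sketch has no rule
   for the input. */
unsigned compose_us_english(int key_code, unsigned modifiers)
{
    /* Only the two modifiers defined above are understood here. */
    assert((modifiers & ~(unsigned)(MOD_SHIFT | MOD_CTRL)) == 0);

    if (key_code != KC_A_POSITION) {
        return 0;
    }
    if (modifiers & MOD_CTRL) {
        return 0x01;            /* ^A (control A) */
    }
    if (modifiers & MOD_SHIFT) {
        return 'A';
    }
    return 'a';
}
```

The same key code goes in each time; only the modifier state changes which text comes out.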
What do programmers want to do with a keyboard?
We want to do input :-) And, we would like a little help with output
too.
Key position input.
Let’s say we want to use WASD input for controlling motion in a game. We
want to use the keys based on their location, not based on what is
printed on the keys at those locations.
SDL 1.3 provides this type of input with key press and release events.
The events include an SDLK_* key name that is based on the location of
the key, not what is printed on the key. The names are based on key
symbols printed on a standard US English USB keyboard. No matter what is
printed on the key, if you press a key in the same location as the "W"
on a US keyboard you will get SDLK_W in the key pressed/released event.
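A minimal sketch of position-based WASD handling might look like this. The `PosKey` names and the `Motion` struct are stand-ins invented for the example; they are not the SDL definitions, and a real program would take the key from the SDL key press/release event instead.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in positional key names; these are NOT the SDL definitions. */
typedef enum { POS_KEY_OTHER = 0, POS_KEY_W, POS_KEY_A, POS_KEY_S, POS_KEY_D } PosKey;

/* Current movement direction; each axis is the sum of the held keys. */
typedef struct { int dx; int dy; } Motion;

/* Update the motion vector for one key press (pressed = 1) or
   release (pressed = 0). Keys are matched by position only. */
void apply_key_event(Motion *m, PosKey key, int pressed)
{
    assert(m != NULL);

    int step = pressed ? 1 : -1;
    switch (key) {
    case POS_KEY_W: m->dy -= step; break;   /* up    */
    case POS_KEY_S: m->dy += step; break;   /* down  */
    case POS_KEY_A: m->dx -= step; break;   /* left  */
    case POS_KEY_D: m->dx += step; break;   /* right */
    default: break;                         /* not a movement key */
    }
}
```

Nothing in this code looks at what is printed on the key, so it behaves the same on any layout.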
Key symbol input.
Sometimes we want to know what is on the key cap of the key that was
pressed. For example: if I write a program that uses “Z” to mean “zoom
in” I want to know if the key with the “Z” printed on it was hit.
Trouble is that “Z” and some other keys move around from keyboard to
keyboard. In SDL1.3 SDLK_Z is supposed to be the key on the bottom row
of characters just to the right of the shift key no matter what is
printed on that key. But, on many European keyboards that key has a big
"Y" printed on it. That means that we can not use SDLK_Z to let us know
if the “Z” key was hit. We could just wait for a text input event that
will contain either a “z”, “Z”, or ^Z depending on the state of the
modifier keys, which seems like a good idea to me, but it would be nice
to be able to get the key symbol from a key press event. Not
necessary, but nice to have.
(SDL1.2 mixed up these two problems and let us get the modified
character along with a key press. That has gone away in SDL1.3.)
I see two ways we can solve this problem. We could just send along the
key cap information with the key pressed event. Or, we can have a
function that given an SDLK_* returns the Unicode code point for the key
cap or 0 (zero) for keys like “enter” and “home”. It doesn’t need to do
anything special for key pad keys because the SDLK_* in 1.3 have a
special bit set for key pad keys.
A function like that would let me tell you that input is based on WASD
on a standard keyboard or some other 4 characters on a Greek, Arabic, or
Chinese, keyboard. Because I can look up the key symbol for the keys
independent of the language of the keyboard. Of course, it would not
solve all our internationalization problems but it would be nice. Having
this function means I do not need to send the key symbol in a key
pressed event.
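To make the second option concrete, here is a sketch of what such a lookup function could look like. Everything in it is hypothetical: the `CapKey` and `Layout` enums, the function name, and the hard-coded table. The table only stands in for a query to the system's own keyboard layout data, which is where a real implementation would get its answers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical positional keys and layouts, for illustration only. */
typedef enum { CAP_KEY_ENTER, CAP_KEY_Z_POSITION, CAP_KEY_COUNT } CapKey;
typedef enum { LAYOUT_US, LAYOUT_GERMAN, LAYOUT_COUNT } Layout;

/* Unicode code point printed on each key cap, or 0 for keys such as
   "enter" that carry a phrase rather than a single character. */
static const uint32_t keycap_table[LAYOUT_COUNT][CAP_KEY_COUNT] = {
    [LAYOUT_US]     = { [CAP_KEY_ENTER] = 0, [CAP_KEY_Z_POSITION] = 0x005A }, /* Z */
    [LAYOUT_GERMAN] = { [CAP_KEY_ENTER] = 0, [CAP_KEY_Z_POSITION] = 0x0059 }, /* Y */
};

/* Return the key cap code point for a positional key on a given layout. */
uint32_t keycap_code_point(Layout layout, CapKey key)
{
    assert((int)layout >= 0 && layout < LAYOUT_COUNT);
    assert((int)key >= 0 && key < CAP_KEY_COUNT);
    return keycap_table[layout][key];
}
```

A program that wants “Z” to mean “zoom in” could compare the returned code point against U+005A instead of assuming where the key sits.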
The current API has a function that seems to solve the problem of key
transposition on European keyboards but that does not address the
problem of using a Greek or Chinese keyboard. There is also a function
to get the key cap, but the current implementation and the architecture
(in my possibly very wrong opinion) assume that code will be added to
SDL to address each different type of keyboard. Considering that
all major operating systems have spent years providing
internationalization I think we need to provide a system that makes use
of the existing infrastructure as much as possible.
Composed text input.
The text input event is designed to handle returning any sequence of
characters that can be created by any known composition system. It
completely solves the problem of inputting text in all known languages.
It may create some other interesting problems because of the need for
IMEs to pop up a window on top of the SDL window. But, it seems to
completely solve that problem.
How do existing systems solve the problem?
The X Window system solved this problem (sort of?) in the early '90s and
had to adapt to Unicode later on. In X an input event gives you the scan
code and X key code for a key. You can then look up the key symbol for a
key from the key code, or you can pass the event to a function that
composes key presses into texts. X makes a clean distinction between scan
codes, key codes, key symbols, and texts.
X has 256 possible key codes that can be assigned to the physical keys
on a keyboard. That is big enough to handle the keyboards I have ever heard
of. But, the number of keys has nothing to do with the number of symbols
that can be printed on keys.
X has a huge number of possible key symbols. As it grew and was adopted
by more and more of the world many new key symbols were added. Currently
the complete set of official key symbols includes all the old ones and
all the possible Unicode code points. All told, X has a few million
possible key symbols.
It is pretty easy to create a mapping from X key codes to SDLK_* and
vice versa. Given an X key code it takes only a couple of function calls
to convert the key code to a Unicode code point. It is a short trip from
key codes to Unicode.
All the work of mapping key codes to Unicode is already (supposedly) in
X. We do not have to do it.
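The last hop, from key symbol to Unicode, is mostly arithmetic. The sketch below is my own helper, not an Xlib function. It relies on the X keysym encoding, where the Latin-1 keysyms have the same values as their Unicode code points and keysyms in the range 0x01000100 to 0x0110FFFF are a Unicode code point plus 0x01000000. Older keysym blocks (the legacy Greek or Cyrillic ones, for example) need a lookup table that is left out here, so this version returns 0 for them.

```c
#include <assert.h>
#include <stdint.h>

/* Convert an X keysym value to a Unicode code point.
   Handles only the directly encoded ranges; returns 0 for function keys
   such as "enter" and "home", and for legacy keysyms that would need a
   table. The function name is hypothetical, not part of Xlib. */
uint32_t unicode_from_keysym(uint32_t keysym)
{
    uint32_t code_point = 0;

    if ((keysym >= 0x0020 && keysym <= 0x007E) ||
        (keysym >= 0x00A0 && keysym <= 0x00FF)) {
        /* Latin-1 keysyms match their Unicode code points. */
        code_point = keysym;
    } else if (keysym >= 0x01000100 && keysym <= 0x0110FFFF) {
        /* Unicode keysyms: code point plus 0x01000000. */
        code_point = keysym - 0x01000000;
    }

    assert(code_point <= 0x10FFFF);
    return code_point;
}
```

For example, the keysym 0x010003B1 gives U+03B1 (Greek small alpha), while the keysym for the Return key (0xFF0D) gives 0.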
What I would like to do/change in what we have right now?
(Yeah, I know some of this would cause lots of existing code to stop
compiling… But, this is a wish list and I’m kind of fussy.)
I would really like to either get rid of the Unicode field in the key
press events or redefine it. If it were redefined it would, when
possible, return the Unicode code point for the key cap of the key that
was pressed. It would be set to zero for keys such as “home” and
"enter". Feel free to ignore this suggestion!
I want a function that given an SDLK_* would use the system provided (if
possible) internationalization code to return the Unicode code point of
the symbol on the key. It would return zero for keys like “home” and
"enter". This function solves the key symbol input problem.
I don’t see the need for any other functionality in the keyboard input
system.
BTW, when I was researching all this I ran into SDL-IM; a patch that
adds CJK character composition to SDL 1.2.8. You can find it at:
http://sdl-im.csie.net/
Bob Pendleton