About the 32-bit Unicode Console API

The 32-bit Unicode Console API is a completely new console API for textual user interface programs. By using it, applications programmers no longer need to have 16-bit code in their applications that performs thunking. If the application uses no other 16-bit APIs (either explicitly or via the runtime library of the compiler used to build it), its executables should contain no 16-bit memory objects at all and will be "Pure 32-bit".

On IBM OS/2, the 32-bit Unicode Console API is implemented on top of the 32-bit VIO, MOU, and KBD subsystems. This documentation covers the IBM OS/2 implementation of the 32-bit Unicode Console API.

Concepts

These are the concepts underpinning the design of the 32-bit Unicode Console API. (Note that there are specific variations for IBM OS/2.)

Each process in the system may have open file handles to zero or more console file objects. A console file object has a text-mode screen buffer and an input queue. Every process may also have a default console file object, which it need not have an open file handle for. The "CON" device is an alias for that object. Processes may also open screen buffers and input queues directly. The "CON" device, console file objects, screen buffers, and input queues all have the same console API, although they won't all necessarily support all functionality. (Input queues won't support the screen buffer reading/writing functions, for example.)

Consoles may also be treated as "glass TTYs", and accessed through the normal DosRead() and DosWrite() I/O system API calls. Reading from a console reads its input queue, and optionally displays echoed content to the screen buffer as that input queue is parsed into a buffer of characters. Writing to a console writes to its screen buffer.

The 32-bit Unicode Console API provides more than that "glass TTY" interface. Console API calls are provided for reading and writing cells, characters, and attributes of the screen buffer, for scrolling the screen buffer in whole or in part, for querying and setting the cursor position, for querying and setting the screen buffer size, and for reading and writing events in the input queue. The API is available for all file handles, but will only actually work with handles that are for console objects.

Screen buffers are simple two-dimensional arrays of character cells. A character cell comprises an attribute value, with the same semantics as the attribute value of an IBM CGA display adapter, and a Unicode (strictly speaking, UCS-2) character.
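
As a rough illustration of the cell model just described, the sketch below stores a screen buffer as a flat, row-major array of cells and decodes a CGA-style attribute byte. The type name Cell, its field names, and the helper functions are all hypothetical and are not taken from the API itself; only the bit layout of the CGA attribute byte is standard.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cell layout, for illustration only. The API's real
   cell type and field names may differ. */
typedef struct {
    uint16_t character;  /* 16-bit Unicode character */
    uint8_t  attribute;  /* CGA-style attribute byte */
} Cell;

/* CGA attribute byte: bits 0-3 are the foreground colour, bits 4-6 the
   background colour, and bit 7 is blink (or bright background,
   depending upon the adapter mode). */
unsigned cell_foreground(uint8_t attr) { return attr & 0x0Fu; }
unsigned cell_background(uint8_t attr) { return (attr >> 4) & 0x07u; }
unsigned cell_blink(uint8_t attr)      { return (attr >> 7) & 0x01u; }

/* Cells are stored row by row, so (row, column) maps to a flat index. */
Cell *cell_at(Cell *buffer, unsigned columns, unsigned row, unsigned column)
{
    return &buffer[(size_t)row * columns + column];
}
```

For example, the attribute value 0x1F decodes as foreground 15 (bright white) on background 1 (blue), with blink off.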

Input queues contain input events. Input events are keyboard events, mouse events, or buffer resizing events. Keyboard events comprise a shift state, a device-dependent scan code, an optional device-independent virtual character code, and, for keystrokes that denote actual characters, an optional translated Unicode character code. (Most keys can be translated from device-dependent codes to device-independent codes, and applications must always use the device-independent codes as a first resort.)
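
The event model above could be represented as a tagged union, as in the following sketch. Every name here is hypothetical, as are the contents of the mouse and resize events (which this documentation does not enumerate) and the convention that a zero value means "absent". The identify_key() helper illustrates the rule of using the device-independent code as a first resort and falling back to the scan code only when no such code exists.

```c
#include <stdint.h>

typedef enum { EVENT_KEYBOARD, EVENT_MOUSE, EVENT_BUFFER_RESIZE } EventKind;

/* Hypothetical keyboard event; 0 is assumed to mean "absent". */
typedef struct {
    uint32_t shift_state;
    uint16_t scan_code;     /* device-dependent */
    uint16_t virtual_code;  /* device-independent, optional */
    uint16_t unicode_char;  /* translated character, optional */
} KeyboardEvent;

typedef struct {
    EventKind kind;
    union {
        KeyboardEvent keyboard;
        struct { uint16_t column, row; uint32_t buttons; } mouse;  /* assumed fields */
        struct { uint16_t columns, rows; } resize;                 /* assumed fields */
    } u;
} InputEvent;

typedef enum { KEY_BY_VIRTUAL_CODE, KEY_BY_SCAN_CODE } KeyIdKind;

/* Prefer the device-independent code; use the scan code as a last resort. */
KeyIdKind identify_key(const KeyboardEvent *k, uint16_t *code)
{
    if (k->virtual_code != 0) {
        *code = k->virtual_code;
        return KEY_BY_VIRTUAL_CODE;
    }
    *code = k->scan_code;
    return KEY_BY_SCAN_CODE;
}
```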

Console windows are the visible portions of screen buffers. The window can have any size up to the size of the entire screen buffer, and any location within the buffer such that it does not cross the edges of the buffer. The screen buffer itself can have any size up to the maximum screen buffer size permitted by the system.

For a console that is displayed in a GUI window, no further restrictions are imposed. The console window is the displayed area of the screen buffer, which the user may change at any time by resizing the GUI window. For a console displayed on TUI hardware, further restrictions are imposed. The console window size is fixed at the size that the hardware displays, but it may have any location within the screen buffer. The screen buffer is not restricted in size.
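
The constraint on console windows (no larger than the screen buffer, and not crossing its edges) can be expressed as a simple predicate. This is a minimal sketch; the Size and Window types and the window_fits() function are hypothetical names used only for illustration.

```c
#include <stdbool.h>

typedef struct { unsigned columns, rows; } Size;
typedef struct { unsigned left, top, columns, rows; } Window;

/* True if the window lies wholly within the screen buffer. The size
   checks come first, so the subtractions below cannot wrap around. */
bool window_fits(Size buffer, Window w)
{
    return w.columns <= buffer.columns
        && w.rows    <= buffer.rows
        && w.left    <= buffer.columns - w.columns
        && w.top     <= buffer.rows    - w.rows;
}
```

With an 80 by 300 buffer, an 80 by 25 window may have its top edge at any row from 0 to 275; at row 276 it would cross the bottom edge of the buffer.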

Console code pages

The "glass TTY" interface of DosRead() and DosWrite() operates in terms of an 8-bit character set. This 8-bit character set is mapped onto the Unicode characters used by consoles using a current "code page". A "code page" specifies a particular mapping between 8-bit and UCS-2 characters.
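
Conceptually, a code page can be thought of as a 256-entry lookup table from 8-bit values to 16-bit Unicode characters. The sketch below assumes that representation; the function names are hypothetical, and the sample table is deliberately incomplete (ASCII mapped to itself, three entries borrowed from code page 437, and everything else mapped to U+FFFD). It is not how the API necessarily stores or applies code pages.

```c
#include <stddef.h>
#include <stdint.h>

/* Translate n bytes to 16-bit characters through a code page table. */
size_t codepage_to_unicode(const uint16_t table[256],
                           const unsigned char *in, size_t n, uint16_t *out)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = table[in[i]];
    return n;
}

/* Build an illustrative, incomplete table. */
void build_sample_table(uint16_t table[256])
{
    for (unsigned i = 0; i < 256; ++i)
        table[i] = (i < 0x80) ? (uint16_t)i : 0xFFFDu;
    table[0x80] = 0x00C7;  /* C with cedilla, as in code page 437 */
    table[0x81] = 0x00FC;  /* u with diaeresis, as in code page 437 */
    table[0x82] = 0x00E9;  /* e with acute, as in code page 437 */
}
```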

Console Modes

Various behavioural aspects of the "glass TTY" (a.k.a. "high level") interface to consoles can be controlled by setting console mode flags.

These modes affect "high level" output:

DISABLE_ANSI_CSI_SEQUENCES
If this mode is unset, ANSI CSI sequences are recognised and processed. If this mode is set, ANSI CSI sequences are not recognised, and are displayed as a sequence of appropriate glyphs instead.
DISABLE_ANSI_CONTROL_CHARS
If this mode is unset, the ANSI control characters BS, TAB, FF, LF, CR, and VT will cause the appropriate cursor motions and screen changes. If this mode is set, those ANSI control characters will not be treated specially, and will display the appropriate glyphs.
DISABLE_WRAP_AT_EOL
If this mode is unset, output will wrap around to the next line on the screen once it reaches the end of a line, scrolling the screen up one line if it reaches the last line on the screen. Likewise, the BS ANSI control character (if DISABLE_ANSI_CONTROL_CHARS is not set) will wrap around to the previous line on the screen if the cursor is at the beginning of a line.
ENABLE_UTF8_OUTPUT
If this mode is unset, characters written to the output are treated as single-byte characters and translated to Unicode using the process's current code page. If this mode is set, characters written to the output are expected to form multi-byte UTF-8 sequences that map onto Unicode characters, and the process's current code page is ignored.
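
To illustrate what ENABLE_UTF8_OUTPUT implies, the following sketch decodes UTF-8 bytes into 16-bit Unicode characters. It is an illustration of the standard UTF-8 encoding only, not the console's actual implementation. The function name is hypothetical, and the handling of malformed, truncated, or four-byte sequences (each offending byte becomes U+FFFD) is an assumption, since this documentation does not say what the console does with them.

```c
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT 0xFFFDu

/* Decode UTF-8 into 16-bit characters. The out array must have room
   for n entries. Returns the number of characters written. */
size_t utf8_to_ucs2(const unsigned char *in, size_t n, uint16_t *out)
{
    size_t i = 0, count = 0;
    while (i < n) {
        unsigned char b = in[i];
        uint32_t cp;
        size_t extra, k;
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else { out[count++] = REPLACEMENT; ++i; continue; }
        ++i;
        /* Accumulate six bits from each continuation byte (10xxxxxx). */
        for (k = 0; k < extra && i < n && (in[i] & 0xC0) == 0x80; ++k, ++i)
            cp = (cp << 6) | (in[i] & 0x3Fu);
        /* Reject truncated, overlong, and surrogate encodings. */
        if (k < extra
            || (extra == 1 && cp < 0x80)
            || (extra == 2 && cp < 0x800)
            || (cp >= 0xD800 && cp <= 0xDFFF))
            cp = REPLACEMENT;
        out[count++] = (uint16_t)cp;
    }
    return count;
}
```

For instance, the byte sequence C3 A9 decodes to U+00E9, and E2 82 AC decodes to U+20AC.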

These modes affect "high level" input:

DISABLE_EDITING_KEYS
If this mode is unset, editing keys such as BACKSPACE will cause the appropriate actions in the line buffer. If this mode is set, editing keys will be treated no differently to all other keys. If line input is being echoed and this mode is unset, then ANSI control characters will be displayed in order to display the editing operation, and it is recommended that the output console buffer thus not have DISABLE_ANSI_CONTROL_CHARS set.
DISABLE_LINE_INPUT
If this mode is unset, a call to read the console will not return until the ENTER key is pressed, and the caller's supplied buffer will be filled with characters from the input line. If this mode is set, the call will return when no more keypresses are immediately available, and the ENTER key will not be treated specially.
DISABLE_LINE_INPUT_ECHO
If this mode is unset, line input (which itself will only happen if DISABLE_LINE_INPUT is unset) will be echoed to the output console buffer. If this mode is set, line input will not be echoed.
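
The interaction of the input modes can be modelled with a small simulation. The sketch below is not the console's implementation: the InputModes structure stands in for the real mode flags (whose numeric values are not given here), the keys array stands in for the keypresses that are available, and the choices to omit ENTER from the returned line and to treat the two modes independently are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the console mode flags. */
typedef struct {
    bool disable_editing_keys;
    bool disable_line_input;
} InputModes;

/* Feed keypresses into a line buffer and return the resulting length. */
size_t read_line(InputModes m, const char *keys, size_t nkeys,
                 char *line, size_t cap)
{
    size_t len = 0;
    for (size_t i = 0; i < nkeys; ++i) {
        char c = keys[i];
        /* In line input mode, ENTER completes the line. */
        if (!m.disable_line_input && c == '\r')
            break;
        /* With editing keys enabled, BACKSPACE erases the last character. */
        if (!m.disable_editing_keys && c == '\b') {
            if (len > 0)
                --len;
            continue;
        }
        if (len < cap)
            line[len++] = c;
    }
    return len;
}
```

With both modes unset, the keypresses a, b, x, BACKSPACE, c, ENTER yield the line "abc". With both modes set, BACKSPACE and ENTER are stored in the buffer like any other key.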

The 32-bit Unicode Console API is © Copyright 1999,2000,2009 Jonathan de Boyne Pollard. "Moral" rights are asserted.