Introducing Linux® Screen Reader
The Linux Screen Reader (LSR) project is an open source effort to develop an extensible assistive technology for the GNOME desktop environment. The goal of the project is to create a reusable development platform for building alternative and supplemental user interfaces in support of people with diverse disabilities. The primary use of the LSR platform is to give people with visual impairments access to the GNOME desktop and productivity applications (e.g., Firefox, Eclipse) using speech, Braille, and screen magnification. The extensions packaged with the LSR core are intended to address these needs.
This article summarizes the extensible architecture of LSR and describes how the concepts of extensions and profiles create a complete user experience. It also includes examples of extensions that ship with LSR and form its screen reader user interface, plus how to download, install, and run LSR. Links to other LSR resources are also included.
Extensions, profiles, and the user experience
At the core of LSR is a message pump called the Access Engine. The Access Engine receives desktop accessibility events from the Assistive Technology Service Provider Interface (AT-SPI) and user commands from LSR input device extensions. The engine dispatches these events to script extensions, called Perks, for processing. Perks use a convenience API to query events and their associated widgets for information (e.g., name, label, role, state) and to perform actions on them (e.g., focus, select, expand). Perks report information to the user by sending it to LSR output device extensions. Perks may also show chooser extensions, which are dialog boxes that enable more complex interactions with users than standard hot keys, and monitor extensions, which are tools that help developers debug their own extensions.

Figure 1: Access Engine flow diagram.
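The flow in Figure 1 can be illustrated with a toy message pump. The sketch below is not LSR code: the names Event, Perk, SpeechLikePerk, and AccessEngine, the on_<kind> handler convention, and the list used as an output device are all assumptions made for illustration. It only shows the general pattern of queuing events and dispatching each one to every loaded script.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    """Simplified stand-in for an accessibility event or user command."""
    kind: str          # e.g. "focus", "caret", "command"
    source_name: str   # name of the widget that produced the event


class Perk:
    """Hypothetical script extension: routes events to on_<kind> methods."""

    def handle(self, event, output):
        handler = getattr(self, "on_" + event.kind, None)
        if handler is not None:
            handler(event, output)


class SpeechLikePerk(Perk):
    def on_focus(self, event, output):
        # "Report" to the output device, here just a list of strings.
        output.append("focus: " + event.source_name)


class AccessEngine:
    """Toy message pump: queue events, then dispatch each to every Perk."""

    def __init__(self, perks, output):
        self.perks = perks
        self.output = output
        self.queue = deque()

    def post(self, event):
        self.queue.append(event)

    def pump(self):
        while self.queue:
            event = self.queue.popleft()
            for perk in self.perks:
                perk.handle(event, self.output)


spoken = []
engine = AccessEngine([SpeechLikePerk()], spoken)
engine.post(Event("focus", "OK button"))
engine.post(Event("caret", "text area"))  # no on_caret handler, so ignored
engine.pump()
print(spoken)  # ['focus: OK button']
```

Keeping the pump unaware of what each Perk does is what lets new scripts be added without changing the engine.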
Any number of extensions may be grouped into a profile for LSR. A profile defines which input and output devices are available, which Perks are loaded, and which LSR dialogs to show. For instance, the default user profile packaged with LSR enables support for speech and Braille output. In contrast, the mag profile loads support for magnification only. In essence, the profile system allows a user to pull together various extensions to create a user experience targeting his or her needs.
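One way to picture a profile is as a named record of extension lists. The sketch below is an assumption-laden illustration, not the format LSR uses: the Profile class, the PROFILES table, the extensions_to_load function, and the extension labels are all hypothetical, and the contents of each profile are filled in only as far as the description above allows.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical record of the extensions one profile pulls together."""
    name: str
    devices: list = field(default_factory=list)
    perks: list = field(default_factory=list)
    monitors: list = field(default_factory=list)


# Illustrative contents only; the extension labels are made up for this sketch.
PROFILES = {
    "user": Profile("user", devices=["speech", "braille"],
                    perks=["basic speech", "basic braille"]),
    "mag": Profile("mag", devices=["magnifier"], perks=["basic magnifier"]),
}


def extensions_to_load(profile_name):
    """Return every extension label in a profile, devices first."""
    try:
        profile = PROFILES[profile_name]
    except KeyError:
        raise ValueError("unknown profile: " + profile_name) from None
    return profile.devices + profile.perks + profile.monitors


print(extensions_to_load("mag"))  # ['magnifier', 'basic magnifier']
```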
The following sections describe some of the extensions packaged with LSR and how they contribute to the user experience.
Perks: default, application, and task
The three default Perks currently packaged with LSR operate on every accessible application. The basic speech Perk reports on events such as focus, selection, caret, text, state, and value changes. It defines configurable settings such as word and character echo and speech verbosity. The basic Braille Perk shows the text at the user's current point-of-regard (or focus). It also provides options such as how much to overlap and pad the display while scrolling. Both of these Perks define keyboard and Braille input commands for reviewing the screen, inspecting text attributes, scrolling the Braille display, spelling the current word, and so on. The third default script, the basic magnifier Perk, supports configurable mouse pointer, focus, and caret tracking as well as kinematics options (velocity and acceleration) for smooth panning. Other default Perks may be coded in the future to enable new user experiences for people with other disabilities. For instance, a Perk accepting commands via a switch or speech recognition could be written to support people with mobility impairments.
In addition to the default Perks, LSR ships with scripts written to improve interaction with specific applications. The Perk for the Gaim instant messenger, for example, automatically announces incoming messages, defines additional keyboard commands (e.g., report conversation status as idle, typing, unread), and offers user-configurable options specific to the Gaim program (e.g., read or don't read new messages when Gaim is in the background). Other Perks exist for Firefox, gnome-terminal, metacity, and gdm to compensate for differences or deficiencies in accessibility enablement in these applications. All of these Perks load automatically when their respective applications start.
At runtime, the user can manually load Perks using the Perk chooser (see Figure 3). This ability allows for the creation of Perks that improve certain tasks regardless of the application in which they are performed. For instance, a user might load a spell checking Perk that watches the user type in one or more text areas (e.g., e-mail body, text area on a Web page). When the Perk detects a misspelled word, it might play a small sound effect, "underline" the word on an 8-dot Braille display, or briefly flash the magnified region. If a user finds a particular Perk extremely helpful, he or she can set it to load automatically for all applications or for particular programs.
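To make the spell checking example concrete, here is a minimal sketch of what the core of such a task Perk might look like. Everything in it is hypothetical: the SpellCheckPerk class, the on_text_inserted hook, the tiny word list, and the notify callback (which stands in for a sound effect, a Braille underline, or a magnifier flash) are assumptions for illustration, not part of the LSR API.

```python
import re

# Toy dictionary; a real checker would use a proper word list.
KNOWN_WORDS = {"the", "screen", "reader", "speaks", "text"}


class SpellCheckPerk:
    """Hypothetical task Perk that flags unknown words as the user types."""

    def __init__(self, known_words, notify):
        self.known_words = known_words
        self.notify = notify  # called once per suspect word

    def on_text_inserted(self, text):
        for word in re.findall(r"[A-Za-z']+", text):
            if word.lower() not in self.known_words:
                self.notify(word)


flagged = []
perk = SpellCheckPerk(KNOWN_WORDS, flagged.append)
perk.on_text_inserted("The screan reader speaks text")
print(flagged)  # ['screan']
```

Because the Perk reacts only to inserted text and not to a specific program, the same logic would apply in any application that reports text changes.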
Devices: output and input
Text-to-speech (TTS) synthesis is supported on a number of speech engines using a variety of methods. The IBM TTS engine is accessed through our own pyibmtts wrapper to provide responsive, high quality speech output. The open-source Festival engine and the commercial DECtalk engine are usable through the gnome-speech interface, a component of the GNOME desktop. Additional engines, such as eSpeak and Cicero, are supported by SpeechDispatcher, a generic messaging interface to speech devices. The user may configure the rate, pitch, volume, and other speech properties of these engines, depending on their capabilities [2].
A device definition for BrlAPI, the programmer's interface to BrlTTY, adds support for Braille output to LSR. A user may configure BrlTTY to work with over 25 different Braille displays and multiple translation tables [1]. LSR provides additional options such as continuation character and caret rendering style as well as a way to specify dead cells to skip on the display. Beyond output, this extension also enables the touch cursors and buttons on a standard Braille display to serve as input commands to the screen reader.
The device adapter for gnome-magnifier, another component of the GNOME desktop, makes screen magnification possible in LSR. The magnifier device supports features such as color inversion, viewport size and location, cursor size and color, zoom level, and more.
Finally, the keyboard device extension allows use of the standard keyboard as an input device to LSR. Perks can bind commands such as item, word, and character review to key combinations.
Choosers: settings, Perks, help, and search
The settings chooser makes user configuration of Perks, devices, profiles, and the LSR platform possible. The panels in this dialog box are constructed on-the-fly according to the settings defined by the available extensions. For instance, if a speech device supports a setting called Rate, the settings chooser presents it as a slider with a label, proper minimum and maximum, and tooltip description. Because the widgets are generated automatically, device and Perk developers do not need to worry about creating accessible configuration dialog boxes in addition to writing their code. The settings dialog box is accessed by pressing Alt + Shift + Q according to the default keymap.

Figure 2: Settings chooser showing the options for the basic speech Perk.
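The idea of generating widgets from setting definitions can be sketched as follows. This is an illustration under stated assumptions rather than the way LSR builds its dialog: the RangeSetting and BoolSetting classes, the widget_for function, and the example numbers and units for the Rate setting are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RangeSetting:
    label: str
    description: str
    minimum: int
    maximum: int
    value: int


@dataclass
class BoolSetting:
    label: str
    description: str
    value: bool


def widget_for(setting):
    """Pick a widget description from the type of the setting alone."""
    common = {"label": setting.label, "tooltip": setting.description}
    if isinstance(setting, RangeSetting):
        return {"widget": "slider", "min": setting.minimum,
                "max": setting.maximum, "value": setting.value, **common}
    if isinstance(setting, BoolSetting):
        return {"widget": "checkbox", "value": setting.value, **common}
    raise TypeError("no widget for " + type(setting).__name__)


# Example values are invented for this sketch.
rate = RangeSetting("Rate", "Speech rate in words per minute", 80, 400, 180)
print(widget_for(rate)["widget"])  # slider
```

Since the label and tooltip come straight from the setting definition, every generated control carries the text that an assistive technology needs to describe it.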
A second dialog box, the Perk chooser, allows a user to manually load and unload Perks from an application at runtime. This dialog box allows a person to load a Perk for a temporary task (e.g., spell checking, development tools) and unload it when done. The dialog box can prove useful to both developers and expert users who want to temporarily change the behavior of LSR without restarting it. The Perk chooser is accessed by pressing Alt + Shift + A according to the default keymap.

Figure 3: Perk chooser showing the available Perks for the gnome-terminal application.
Two additional dialog boxes are planned for future inclusion with LSR. The help chooser will offer context sensitive aid. Four sections in the dialog box will state how to use the focused control, how to use the features of the active Perks, what LSR commands are available in the current application, and what accessibility information is provided by the current control. The search chooser will allow the user to enter a string of text which LSR will attempt to locate in the active application. The search will include both on-screen content and accessibility information (e.g., image descriptions), unlike the typical dialog boxes provided by desktop applications.
Monitors: events, tasks, and I/O
Three monitor windows provide information about the events happening inside LSR to aid developers in debugging their extensions. The event monitor shows the raw accessibility events received from the desktop. The task monitor shows the execution of code in the active Perks in response to accessibility events and commands from the user. The I/O (input/output) monitor logs the stream of output and commands sent to all devices as well as all commands received from input devices.

Figure 4: Monitor windows showing the events, tasks, and input/output in LSR.
Obtaining and running the software
The LSR homepage has detailed instructions on how to download and install the latest version of the software. See the downloads section of the LSR wiki for instructions on getting third-party .rpm and .deb packages, the latest source tarball, or even the development code out of Subversion. Once installed, you can run LSR using the shortcut in the System Accessibility menu or by typing lsr in a console window.
Currently, additional work is required to make LSR run at the login screen and to start automatically after login. Instructions for configuring your machine for these cases are available on the LSR wiki. These issues will be addressed in a future version of the GNOME desktop with an improved configuration panel for assistive technologies.
User profiles
Running LSR with the command lsr loads the default user profile, which provides speech and Braille services. The extensions in this profile define hot keys for reviewing the screen, announcing text attributes, moving the pointer and focus, stopping speech, modifying the speech rate, and showing the settings and Perk dialog boxes. The extensions also activate the keys on the Braille display for scrolling the current line, moving to the start or end of a line, moving to the start or end of a document, and clicking using the touch cursors. Numerous options are provided by the Perks and devices in this profile, all of which are configurable through the settings chooser (see Figure 2).
Using the command lsr -p mag starts LSR under the magnifier profile. Options controlling how the magnifier appears on screen and how applications drive the magnifier are configurable in the settings chooser. By default, the single magnifier region tracks the mouse cursor, application focus, and text caret.
The developer profile
Running LSR with the command lsr -p developer starts LSR under the developer profile. This profile loads support for speech, Braille, and magnification. It also loads an additional Developer Perk and the three monitor windows described earlier. The Developer Perk defines new keyboard commands for reading the names of the currently loaded Perks, refreshing the loaded Perks so that changes to the code take effect immediately, and muting all speech output.
The testing profile
The autotest profile, activated by typing lsr -p autotest, uses a special monitor extension that sends all output from LSR to a logging server. The purpose of this profile is to enable automated regression testing of LSR. While LSR is running under this profile, a suite of dogtail test scripts performs actions on desktop applications (e.g., opening windows, activating menu items, entering text). LSR responds to these actions as if a human user were performing them. The output of the Perks that would otherwise be converted to speech is written to log files as the tests progress. The test logs are later compared to control logs to detect differences across versions of LSR.
Figure 5: Automated regression testing flow diagram.
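The final comparison step can be sketched with the Python standard library. The example below assumes that a log is simply a list of text lines. The compare_logs function and the sample log contents are hypothetical, and the actual LSR test harness may compare logs differently.

```python
import difflib


def compare_logs(control_lines, test_lines):
    """Return unified-diff lines; an empty list means no regression found."""
    diff = difflib.unified_diff(
        control_lines, test_lines,
        fromfile="control", tofile="test", lineterm="",
    )
    return list(diff)


# Sample logs invented for this sketch.
control = ["focus: File menu", "item: Open"]
current = ["focus: File menu", "item: Open..."]

differences = compare_logs(control, current)
if differences:
    print("\n".join(differences))
else:
    print("logs match")
```

An empty result means the new version produced the same output as the control run. Any other result pinpoints the lines where the output changed.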
About the author
Peter Parente is a software developer in the Software Group Emerging Technology organization. Peter received his Bachelor's degree in computer engineering and computer science from Rensselaer Polytechnic Institute and his Master's in computer science from the University of North Carolina at Chapel Hill. His keen interest in accessibility, usability, and alternative user interfaces is driving him to complete his doctoral dissertation on generating perceptually-based audio displays for audio-unaware GUI applications. He is a long-time supporter of and contributor to the open source community.
