Icons of Progress
 

Pioneering Machine-Aided Translation


Even with the advent of gigahertz processors and gigabyte memories, machines still fall flat when they try to translate languages. The latest programs, for example, provide intelligible results, but only for routine material such as ordinary e-mails and business web pages.

Part of the reason is that many words in every language have more than one meaning, and phrases can have both literal and idiomatic meanings, which makes translation by non-native speakers a delicate exercise in avoiding embarrassment and, worse, confusion. Add to this the considerable problems presented by individual style and context, not to mention the impressive range of meanings humans can inject into any single word, from love to sarcasm to humor, and you can readily see why even the most advanced machines still struggle to replicate the skill of an experienced human translator.

IBM has a rich history of working on machine-aided translation devices dating back to the 1920s, and is currently refining what may be a breakthrough solution for a next-generation translator.

IBM founder Thomas Watson Sr. saw the problems of language barriers firsthand in his early work with the International Chamber of Commerce. In 1927, under his direction, the company developed its first translation system based on the Filene-Finlay simultaneous translator. It was essentially an audio setup of headphones and dials that allowed users to listen to professional translators translating speeches in real time.

Installed and first used at the League of Nations (a precursor of the United Nations) in 1931, the system allowed listeners to dial in to their native language and hear pre-translated speeches read simultaneously with the proceedings. The IBM Filene-Finlay Translator was later modified and used for simultaneous translations at the Nuremberg war crimes trials following World War II, and at the United Nations.

By the early 1950s, IBM had developed an English-Russian translator using the IBM 701 Electronic Data Processing Machine, the company’s first commercial scientific computer. This program incorporated logic algorithms that made grammatical and semantic “decisions” that mimicked the work of a bilingual human. This work advanced in the 1960s when IBM developed a machine and programs for translating Chinese. IBM researchers used phrase structure analysis to match the meaning of ideographic Chinese characters with other languages.

Around the same time, IBM built the Automatic Language Translator, a custom computer for the military that used a high-speed optical disk with 170,000 words and phrases to translate Russian documents into English.

IBM is also credited with developing the first Braille translator. Working with the American Printing House for the Blind in Louisville, Kentucky, the company introduced the Braille Translation System in the early 1960s. It was based on the IBM 704, the company’s first mass-produced mainframe, and went into service at the American Printing House for the Blind as the APH-IBM system.

More recently, the Thomas J. Watson Research Center, working with the US Defense Department in Washington, DC, invented MASTOR, or Multilingual Automatic Speech-to-Speech Translator. The system, which consists of software and a two-way automatic translation device, can recognize and translate a vocabulary of 50,000 English and 100,000 Iraqi Arabic words. In the fall of 2007, IBM donated the system to the US military to help coalition forces communicate with the Iraqi people.

IBM Research also created technology that translates Arabic television and radio broadcasts into English text. The system, called TALES for Translingual Automatic Language Exploitation System, recognizes Arabic audio and translates it into English text to generate a machine-produced closed caption that enables an English-speaking listener to get the gist of the Arabic content.

In 2006, IBM held an Innovation Jam among all of its employees to identify promising ideas. One of the key issues that surfaced was the challenge of the language barrier inherent in a company with employees in 170 countries. IBM funded a corporate-wide project involving the research labs, software development and consulting services to develop a “Real-Time Translation Service” as a secure, enterprise-strength language translation system that could be used to build a smarter workplace for IBM employees, and eventually be used by IBM’s clients, IBM Business Partners and perhaps the world.

The effort, currently led by IBM’s Multilingual Natural Language Processing group under Salim Roukos, chief technology officer for translation research, resulted in what the company calls the n.Fluent system.

n.Fluent, which is pronounced “en-flü-ənt,” learns as it evolves based on input from usage patterns submitted by multilingual IBM employees. The system is still primarily for internal use, and volunteers in this IBM crowdsourcing project have helped develop it to the point where it can instantaneously translate between English and 11 other languages. Employees currently use it as a secure, real-time means of translating electronic documents, web pages, and even live instant messages.

So far, n.Fluent, which made its internal debut at IBM in 2008, can be used for translating between English and Chinese (both simplified and traditional), Korean, Japanese, French, Italian, Russian, German, Spanish, Portuguese and Arabic. In addition to internal use by IBMers, n.Fluent is also used to provide web support to IBM clients and is being licensed externally for cloud-based translation services.

The n.Fluent team adopted the approach of using algorithms developed from the large number of parallel sentences in related languages, such as English and Spanish, and French and English. But since most of the content in n.Fluent is in English, the team turned to crowdsourcing among IBM employees as a means to capture and continually improve the software’s translation accuracy and quality.

“n.Fluent,” says David Lubensky, one of the IBM researchers who started the project, “is a kind of Harvard Business School case study of how the crowds inside the company help you develop a better product. Our goal is to replicate this over various domains.”

 

Selected team members who contributed to this Icon of Progress:

  • Thomas J. Watson Sr.: IBM CEO, supported Edward A. Filene’s development of a translation device in the 1920s
  • Edward A. Filene: One of the original patent holders on the Filene-Finlay translation system
  • Dr. Cuthbert Hurd: Founded the IBM Applied Science Department and directed the development of the IBM 701 Russian-English translation program
  • Peter Sheridan: Worked on the Russian-English translation program with Dr. Hurd and the Georgetown University scholars between 1954 and 1957
  • Leon Dostert: Founder and director of the Georgetown University Institute of Languages and Linguistics in Washington, DC; helped set up the interpretation system for the Nuremberg war crime trials, and helped train the interpreters and translators
  • Paul Garvin: Linguist who helped Leon Dostert with the Russian-English translation system
  • Anne Schack: Led the development of the first computerized Braille translation program, which converted literary texts, in punched card form, into Grade 2 Braille
  • C. J. Fitch: Led the development of the IBM Wireless Translation System
  • Yuqing Gao: Led the development of the IBM MASTOR System and its donation project to the US government
  • Salim Roukos: CTO of Translation at IBM Research, leads the team that is developing n.Fluent software