Even with the advent of gigahertz processors and gigabyte memories, machines still fall flat when they try to translate languages. The latest programs, for example, provide intelligible results—but only in the comfortable range of use for ordinary e-mails and business web pages.
Part of the reason is that many words in every language have more than one meaning, and phrases can have both literal and idiomatic meanings, which makes translation by non-native speakers a delicate exercise in avoiding embarrassment or, worse, confusion. Add to this the considerable problems presented by individual style and context, not to mention the impressive range of meanings humans can inject into any single word, from love to sarcasm to humor, and you can readily see why even the most advanced machines still struggle to replicate the skill of an experienced human translator.
IBM has a rich history of working on machine-aided translation devices dating back to the 1920s, and is currently refining what may be a breakthrough solution for a next-generation translator.
IBM founder Thomas Watson Sr. saw the problems of language barriers firsthand in his early work with the International Chamber of Commerce. In 1927, under his direction, the company developed its first translation system based on the Filene-Finlay simultaneous translator. It was essentially an audio setup of headphones and dials that allowed users to listen to professional translators translating speeches in real time.
Installed and first used at the League of Nations (a precursor of the United Nations) in 1931, the system allowed listeners to dial in to their native language and hear pre-translated speeches read simultaneously with the proceedings.
By the early 1950s, IBM had developed an English-Russian translator using the IBM 701 Electronic Data Processing Machine, the company’s first commercial scientific computer. This program incorporated logic algorithms that made grammatical and semantic “decisions” that mimicked the work of a bilingual human. This work advanced in the 1960s when IBM developed a machine and programs for translating Chinese. IBM researchers used phrase structure analysis to match the meaning of ideographic Chinese characters with other languages.
Around the same time, IBM built the Automatic Language Translator, a custom computer for the military that used a high-speed optical disk with 170,000 words and phrases to translate Russian documents into English.
IBM is also credited with developing the first Braille translator. Working with the American Printing House for the Blind in Louisville, Kentucky, the company introduced the Braille Translation System in the early 1960s. It was based on the IBM 704, the company’s first mass-produced mainframe, and went into service at the American Printing House for the Blind as the APH-IBM system.
“Speech-to-speech translation systems have the potential to revolutionize the way people around the world who do not speak a common language communicate with one another. There are thousands of different languages spoken; imagine being able to communicate with anyone instantly through the assistance of a universal translator. Breaking such communication barriers would lead to tremendous growth in cultural understanding. Allowing people to accept and live with everyone’s differences would be a very rewarding future.”
“For it is through the print of language that man has ever sought to communicate more widely with his contemporaries, more completely with posterity. Multi-lingualis (sic) has, in part, hindered this quest. Electronic language translation is another stride forward in man’s effort to reach his neighbors. … Concretely, if electronic language translation makes possible, in due course, the translation into the languages of the less developed areas of the world, the basic references and scientific literature in existence in Western languages, this in itself would be significant. The value to research of having current literature in scientific fields readily and promptly available in various idioms is another practical objective.”
“701 Translator,” IBM press release, January 8, 1954
“The advances IBM has made in the research and development of speech-to-speech translation systems have the potential to revolutionize the way people around the world communicate with one another. The military’s use of the MASTOR system is a very exciting example of that capability—one where we see the potential to improve the safety of U.S. service personnel and save lives.”
“Made in IBM Labs: Speech Translation Technology Breaks Through Language Barrier for U.S. Forces in Iraq,” IBM press release, October 12, 2006
More recently, the Thomas J. Watson Research Center—working with the US Defense Department in Washington, DC—invented MASTOR, or Multilingual Automatic Speech-to-Speech Translator. The system, which consists of software and a two-way automatic translation device, can recognize and translate a vocabulary of 50,000 English and 100,000 Iraqi Arabic words. In the fall of 2007, IBM donated the system to the US military to help coalition forces communicate with the Iraqi people. IBM Research also created technology that translates Arabic television and radio broadcasts into English text. The system, called TALES for Translingual Automatic Language Exploitation System, recognizes Arabic audio and translates it into English text to generate a machine-produced closed caption that enables an English-speaking listener to get the gist of the Arabic content.
In 2006, IBM held an Innovation Jam among all of its employees to identify promising ideas. One of the key issues that surfaced was the challenge of the language barrier inherent in a company with employees in 170 countries. IBM funded a corporate-wide project involving the research labs, software development and consulting services to develop a “Real-Time Translation Service” as a secure, enterprise-strength language translation system that could be used to build a smarter workplace for IBM employees, and eventually be used by IBM’s clients, IBM Business Partners and perhaps the world.
The effort, currently led by IBM’s Multilingual Natural Language Processing group under Salim Roukos, chief technology officer for translation research, resulted in what the company calls the n.Fluent system.
n.Fluent, which is pronounced “en-flü-ənt,” learns as it evolves, based on input from usage patterns submitted by multilingual IBM employees. The software is still primarily for internal use, and volunteers in this IBM crowdsourcing project have helped develop it to the point where it can instantaneously translate between English and 11 other languages. Employees currently use it as a secure, real-time means of translating electronic documents, web pages and even live instant messages.
So far, n.Fluent, which made its internal debut at IBM in 2008, can be used for translating between English and Chinese (both simplified and traditional), Korean, Japanese, French, Italian, Russian, German, Spanish, Portuguese and Arabic. In addition to internal use by IBMers, n.Fluent is also used to provide web support to IBM clients and is being licensed externally for cloud-based translation services.
The n.Fluent team adopted the approach of using algorithms developed from the large number of parallel sentences in related languages, such as English and Spanish, and French and English. But since most of the content in n.Fluent is in English, the team turned to crowdsourcing among IBM employees as a means to capture and continually improve the software’s translation accuracy and quality.
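To give a rough sense of what learning from parallel sentences can look like, here is a minimal Python sketch. It is not n.Fluent's actual algorithm, which this article does not describe. As a stand-in it uses a simplified form of IBM Model 1, a classic word-alignment technique, and the function names (`train_word_model`, `best_translation`), the toy Spanish-English corpus and the "employee correction" sentence are all assumptions made up for illustration.

```python
from collections import defaultdict


def train_word_model(pairs, iterations=10):
    """Estimate P(target_word | source_word) from parallel sentences with EM.

    Simplified IBM Model 1 (no NULL word); illustrative only.
    """
    # Start with a flat score of 1.0 for every word pair. The first pass
    # normalizes it, so this behaves like a uniform starting distribution.
    t = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        count = defaultdict(float)  # expected (target, source) pairings
        total = defaultdict(float)  # expected pairings per source word
        for source, target in pairs:
            for f in target:
                norm = sum(t[(f, e)] for e in source)
                for e in source:
                    share = t[(f, e)] / norm
                    count[(f, e)] += share
                    total[e] += share
        # Re-estimate the probabilities from the expected counts.
        t = defaultdict(float)
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


def best_translation(t, source_word):
    """Return the most probable target word, or None if the word is unseen."""
    candidates = {f: p for (f, e), p in t.items() if e == source_word}
    return max(candidates, key=candidates.get) if candidates else None


# Hypothetical toy corpus of (source sentence, target sentence) word lists.
corpus = [
    ("la casa".split(), "the house".split()),
    ("la mesa".split(), "the table".split()),
    ("una mesa".split(), "a table".split()),
]
model = train_word_model(corpus)
print(best_translation(model, "casa"))

# A crowdsourced correction can be treated as one more parallel sentence
# and folded back into training (again, an assumption for illustration).
corrections = [("una casa".split(), "a house".split())]
improved = train_word_model(corpus + corrections)
print(best_translation(improved, "una"))
```

The sketch treats each employee correction as extra training data, which is one simple way a system could keep improving from usage; a production system would work with phrases and far larger corpora.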
“n.Fluent,” says David Lubensky, one of the IBM researchers who started the project, “is a kind of Harvard Business School case study of how the crowds inside the company help you develop a better product. Our goal is to replicate this over various domains.”
Selected team members who contributed to this Icon of Progress:
- Thomas J. Watson Sr.: IBM CEO, supported Edward A. Filene’s development of a translation device in the 1920s
- Edward A. Filene: One of the original patent holders on the Filene-Finlay translation system
- Dr. Cuthbert Hurd: Founded the IBM Applied Science Department and directed the development of the IBM 701 Russian-English translation program
- Peter Sheridan: Worked on the Russian-English translation program with Dr. Hurd and the Georgetown University scholars between 1954 and 1957
- Leon Dostert: Founder and director of the Georgetown University Institute of Languages and Linguistics in Washington, DC; helped set up the interpretation system for the Nuremberg war crime trials, and helped train the interpreters and translators
- Paul Garvin: Linguist who helped Leon Dostert with the Russian-English translation system
- Anne Schack: Led the development of the first computerized Braille translation program, which converted literary texts, in punched card form, into Grade 2 Braille
- C. J. Fitch: Led the development of the IBM Wireless Translation System
- Yuqing Gao: Led the development of the IBM MASTOR System and its donation project to the US government
- Salim Roukos: CTO of Translation at IBM Research, leads the team that is developing n.Fluent software