IBM Achieves Major Breakthrough in Voice Recognition

New Technology Makes Speech Recognition More Like a Conversation, Less Like Talking to a Computer. New IBM Embedded ViaVoice 4.4 Designed to Eliminate Need for Predefined Commands, and Allow Auto Drivers and Handheld Users to Speak Naturally


ARMONK, NY - 24 Jan 2006: IBM today announced a voice technology breakthrough that can allow automobile drivers and handheld device users to speak commands naturally without memorizing specific predetermined commands.

Released as part of the IBM Embedded ViaVoice 4.4 software package, it is a significant technology advance for embedded speech technology in devices and automobile navigation systems.

The new offering is designed to provide users with new flexibility and accuracy in embedded speech devices. Previously, individuals were required to learn, memorize and use a fixed set of phrases and commands to interact with speech recognition systems. For example, to select "Radio 104.3 FM," the new IBM-pioneered technology allows drivers to simply say, "Tune to 104.3," or "Set the radio station to 104.3," or "Change the radio station to 104.3." Any of a wide variety of intuitive commands changes the radio to the desired station, eliminating the need to memorize a specific command list.

IBM Embedded ViaVoice 4.4 features "freeform command recognition," which uses advanced statistical language modeling and semantic interpretation to enable natural language understanding between the user and the voice recognition system. Freeform command recognition permits people to use intuitive command phrases that are not "memorized" for controlling devices, such as radio or navigational systems in automobiles or commands on handheld devices.
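As a rough illustration of the idea of mapping many phrasings onto one command, the following Python sketch may help. It is not IBM's implementation and does not use the ViaVoice API: a simple cue-word and pattern matcher stands in for the statistical language model and semantic interpretation described above, and the names `TUNE_CUES`, `FREQ_PATTERN` and `interpret` are hypothetical.

```python
import re
from typing import Optional

# Hypothetical cue words for this sketch; a real system would rely on a
# trained statistical language model rather than a hand-written set.
TUNE_CUES = {"tune", "set", "change", "switch", "radio", "station"}

# Matches a frequency such as 104.3 or 98.7 (two or three digits, one decimal).
FREQ_PATTERN = re.compile(r"\b(\d{2,3}\.\d)\b")


def interpret(utterance: str) -> Optional[dict]:
    """Map a free-form utterance to a structured command, or None."""
    text = utterance.lower()
    words = set(re.findall(r"[a-z]+", text))
    match = FREQ_PATTERN.search(text)
    # Require both a frequency and at least one tuning cue word.
    if match and words & TUNE_CUES:
        return {"intent": "tune_radio", "frequency_mhz": float(match.group(1))}
    return None


if __name__ == "__main__":
    for phrase in (
        "Tune to 104.3",
        "Set the radio station to 104.3",
        "Change the radio station to 104.3.",
        "What time is it?",
    ):
        print(phrase, "->", interpret(phrase))
```

In this sketch the three radio phrasings from the example above all resolve to the same structured command, while an unrelated phrase yields no command.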

The new package also includes significant improvements in overall accuracy in all noise conditions through the use of new acoustic models, enhanced acoustic model training techniques, and further improvements in speech-silence detection, which is a feature that handles transient noises like road bumps, blowing horns or railroad crossings.

Multiple Language Recognition, More Natural Use
New users can easily operate the system out of the box, and drivers can become less encumbered by the task of remembering specific words or phrasing and maintain their focus on the road. Handheld device users, meanwhile, can perform functions in a more fluid manner that fits into their normal activities. In addition, the new system not only allows for freeform commands, it can recognize those commands in multiple languages.

"Speech recognition has gotten a bad rap in recent years, but it has quietly been maturing into a truly useful technology and is experiencing a second coming," said Jim Holland, Product Line Manager, Embedded Speech, IBM Software Group. "The ability to use natural language commands allows the device to become a more instinctive part of a user's daily routine and reflects our mission to provide information and functionality to users as a readily available service regardless of the environment."

IBM Embedded ViaVoice Version 4.4 delivers market-leading speech technology for mobile devices, such as automobile navigation systems, hands-free phones, personal digital assistants (PDAs) and other smart devices. Embedded device applications can use IBM speech technology either for automatic speech recognition (ASR) that uses human speech to input commands into a mobile device or for text-to-speech (TTS), which uses a synthesized human voice to speak text and other information from a mobile device.
