September 3, 2002
By Cade Metz
Developer: Carnegie Mellon University
Imagine someone who speaks only Croatian easily conversing with someone who speaks only English. This has nearly become a reality at Carnegie Mellon's Language Technologies Institute, thanks to a research project known as Tongues. The triptych of linguistic applications runs on a mini-notebook, listening to speech in one language and spitting out a spoken translation in another.
"We actually took the system to Zagreb and recruited random Croatian volunteers to talk to American officers," says Robert Frederking, chair of graduate programs at the institute and a teacher in Carnegie Mellon's renowned computer science department since 1989. He's spent the last few years working on Tongues' machine translation engine. "Half the time, they could actually carry on a productive conversation."
Tongues is almost epic in scope. It includes a speech recognizer, which turns spoken words into text; a machine translator, which converts the text from one language to another; and a speech synthesizer, which turns the text back into audible words. For conversations to flow in both directions, each engine must work in both languages being translated.
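For readers who think in code, the three-stage flow can be sketched roughly as follows. This is only an illustration, not the Tongues code: the class `SpeechTranslator`, the helper functions, and the two tiny word tables are all hypothetical, and "audio" is faked as plain UTF-8 text so the example runs without a real recognizer or synthesizer.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SpeechTranslator:
    """One direction of the pipeline: audio in, translated audio out."""
    recognize: Callable[[bytes], str]    # speech -> text
    translate: Callable[[str], str]      # text -> text in the other language
    synthesize: Callable[[str], bytes]   # text -> speech

    def run(self, audio: bytes) -> bytes:
        text = self.recognize(audio)
        translated = self.translate(text)
        return self.synthesize(translated)


# Hypothetical stand-ins: "audio" is just UTF-8 text in this sketch.
def fake_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


def table_translator(table: Dict[str, str]) -> Callable[[str], str]:
    # Whole-utterance lookup; unknown input is passed through unchanged.
    return lambda text: table.get(text, text)


# Illustrative entries only.
EN_HR = {"thank you": "hvala", "good day": "dobar dan"}
HR_EN = {hr: en for en, hr in EN_HR.items()}

# A two-way conversation needs one pipeline per direction.
en_to_hr = SpeechTranslator(fake_recognize, table_translator(EN_HR), fake_synthesize)
hr_to_en = SpeechTranslator(fake_recognize, table_translator(HR_EN), fake_synthesize)

print(en_to_hr.run(b"good day"))
print(hr_to_en.run(b"hvala"))
```

Keeping the three stages as separate, swappable functions mirrors the article's point that each engine has to exist for both languages: the structure stays the same, and only the language-specific pieces change.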
The speech recognizer, known as Sphinx, and the speech synthesizer, known as Festival, have been in development for years. Sphinx is an open-source platform developed at Carnegie Mellon in the early 1990s. When Frederking and his team began the Tongues project, Sphinx was fairly adept at understanding English, but very little work had been done with foreign languages. So they spent several weeks teaching it Croatian.
"In order to understand a particular language, you have to record data in each language that covers all possible phonetics in all possible contexts and build a fresh model in the Sphinx framework," says Alan Black, a Tongues researcher.
Black was recruited from the University of Edinburgh, where he had helped develop Festival, the speech synthesizer, which is also an open-source platform. Much like Sphinx, Festival was originally written for use with English but provides a framework that allowed Tongues researchers to construct an adequate Croatian engine.
The team built the system's translator using a technique known as example-based machine translation. Essentially, they created a database that holds a massive list of English phrases and their Croatian equivalents, culling data from bilingual Internet sites and university textbooks. When the engine receives a text phrase in one language, it provides the equivalent text in the other.
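The lookup idea can be illustrated with a toy phrase table. The sketch below is an assumption-laden simplification, not the Tongues engine: it uses a greedy longest-match search over a small hypothetical dictionary, whereas a real example-based system works from a far larger corpus and handles partial matches and word order. The function name `translate_phrases` and the table entries are invented for illustration.

```python
from typing import Dict


def translate_phrases(sentence: str, phrase_table: Dict[str, str]) -> str:
    """Translate by replacing the longest known phrase at each position."""
    words = sentence.lower().split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase starting at word i first.
        for j in range(len(words), i, -1):
            chunk = " ".join(words[i:j])
            if chunk in phrase_table:
                output.append(phrase_table[chunk])
                i = j
                break
        else:
            # No stored example covers this word: pass it through unchanged.
            output.append(words[i])
            i += 1
    return " ".join(output)


# Illustrative English-to-Croatian entries only.
PHRASES = {
    "good day": "dobar dan",
    "thank you": "hvala",
    "yes": "da",
    "no": "ne",
}

print(translate_phrases("Good day thank you", PHRASES))
```

Preferring the longest match means a stored multiword phrase such as "good day" is translated as a unit rather than word by word, which is the main appeal of working from examples.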
Believe it or not, the entire Tongues system runs on a Toshiba Libretto, a pint-size Windows notebook. Despite its complexity, the system has little trouble running on a three-year-old Pentium processor. "Two people can exchange a 10-second sentence in about a minute and a half," says Black.
Within the next decade, American soldiers, doctors, and chaplains abroad will be able to converse with foreigners without learning a new language or fumbling through a dictionary. And that's one step closer to bridging the language gap.
Original URL: http://www.pcmag.com/article2/0,4149,415355,00.asp