Trekkies, pay attention: the simultaneous translation device used in the beloved Star Trek series may finally become reality!
Many people believe the Star Trek series of the 1960s sparked quite a few present-day inventions. There are people, for example, who claim that the flip-open communicators in the series are the predecessors of the clamshell mobile phone that was so popular a few years ago.
Currently, a new development mimics the famous sci-fi TV show: the Universal Translators the crew wore on their space journeys might become reality, as a number of companies are working on simultaneous-translation devices.
Last year, as outlined by The Economist, a number of events took place which show that simultaneous-translation devices are big business. In the summer of 2012, inventor Will Powell introduced a system that can translate a Spanish-English conversation using a hands-free headset and a set of goggles that display the translated text. The downside? The system requires the speakers to be patient and to speak far more slowly than usual.
In November of that same year, the lives of Japanese and English, Chinese or Korean speakers were made much easier when the Japanese mobile-phone operator NTT DoCoMo demonstrated a service which translates phone calls between these languages. A computer listens along, then translates the conversation and even makes sure the voice used to convey the message matches the gender of the speaker.
The most advanced technology appears to come from the headquarters of software titan Microsoft. When the company's chief research officer, Rick Rashid, spoke at a conference in China last October, his English sentences were translated into Mandarin Chinese on the spot. At first, these translated sentences only appeared as subtitles on a screen, but a computer-generated voice that even mimicked the tone of Rashid's voice soon followed.
Though these new devices come in different packages, the problems they have to deal with are the same. Firstly, developers have to find a way to recognise and digitise speech. In the past, software was used to parse words into phonemes (sounds, more or less), which were then identified and reconstructed into words. This method works fine when the number of words used isn't that extensive, but when the vocabulary is substantial, at least one in every four words comes out wrong.
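The classic approach described above can be sketched in a few lines. This is a toy illustration, not any vendor's actual system: the phoneme spellings and the tiny word list are assumptions made up for the example, and a real recogniser would search over probabilities rather than do exact lookups.

```python
# Toy sketch of the classic pipeline: identify phonemes first,
# then reconstruct words from the phoneme sequence.
# The lexicon below is an illustrative assumption, not real data.
PHONEME_LEXICON = {
    ("K", "AE", "T"): "cat",
    ("K", "AA", "T"): "cot",
    ("B", "AE", "T"): "bat",
}

def reconstruct(phonemes):
    """Map a recognised phoneme sequence back to a word, if known."""
    return PHONEME_LEXICON.get(tuple(phonemes), "<unknown>")

print(reconstruct(["K", "AE", "T"]))  # cat
```

With only three words the lookup is unambiguous; the article's point is that as the lexicon grows, acoustically similar entries multiply and this kind of matching starts to fail.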
Microsoft has found a way to reduce this problem. Its software doesn't identify phonemes but senones – sequential triplets of phonemes. This is a little more difficult but makes the recognition of words a whole lot easier. The senone identifier works in much the same way the human brain does, using neurons to judge the strength of the signals of neighbouring neurons. These neurons then send output to other neurons, which repeat the process. Just like the human brain, the software Microsoft has developed has multiple layers – nine, to be exact. When the ninth layer is reached, the system makes a guess about the word it has heard. This new technology results in fewer errors than its predecessors; according to Microsoft, at least a third fewer errors are made. Similar claims are made by Google and Nuance, two other companies that are working on translation software.
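The layered, brain-like structure the article describes can be sketched as a minimal feed-forward network. This is only an illustration of the idea of nine stacked layers passing signals forward; the weights here are random, whereas a real recogniser's weights would be learned from hours of speech.

```python
import math
import random

random.seed(0)

def layer(inputs, weights):
    # Each "neuron" weighs the signals coming from the previous
    # layer's neurons and squashes the weighted sum, loosely
    # mirroring the article's description.
    return [math.tanh(sum(w * x for w, x in zip(ws, inputs)))
            for ws in weights]

def forward(signal, n_layers=9, width=4):
    # Nine stacked layers, as the article says Microsoft's system has.
    # Random weights stand in for what training would normally learn.
    x = signal
    for _ in range(n_layers):
        weights = [[random.uniform(-1, 1) for _ in range(len(x))]
                   for _ in range(width)]
        x = layer(x, weights)
    return x  # the final layer's output drives the word guess

out = forward([0.2, 0.7, 0.1])
print(len(out))  # 4
```

The point of the depth is that each layer can combine its predecessor's judgments into progressively more abstract evidence, so the ninth layer's "guess" rests on far richer features than any single phoneme comparison.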
The second problem translation devices face is that of translation itself. It's not just words that need translating, it's the entire grammatical structure of a language. Google has tackled this problem by using crowd-sourcing. The text entered into its Translate app is compared with millions of other sentences gathered by Google's software, and the app then chooses the most appropriate one. Jibbigo, another company that has created a translator app, pays users in developing countries to correct mother-tongue translations.
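The "choose the most appropriate one" step can be illustrated with a toy selector. The candidate sentences and their counts below are invented for the example; a real system would score candidates with far more sophisticated statistical models than a raw frequency count.

```python
from collections import Counter

# Hypothetical corpus counts: how often each candidate translation
# of one source sentence has been seen before.
candidates = Counter({
    "Where is the station?": 412,
    "Where is station?": 37,
    "Where the station is?": 5,
})

def best_translation(candidates):
    # Pick the candidate seen most often across the corpus - a crude
    # stand-in for "choosing the most appropriate one".
    return candidates.most_common(1)[0][0]

print(best_translation(candidates))  # Where is the station?
```

The appeal of the approach is that grammatical structure never has to be modelled explicitly: if enough humans have already produced the well-formed sentence, frequency alone surfaces it.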
However, these devices are still not 100 per cent accurate. Microsoft hopes that, because its machine translator mimics the voice of the speaker, users will not pay too much attention to the grammatical errors in the translations the device produces. This voice mimicking takes its time, however: the system needs an hour of recorded speech to achieve the effect.
Travellers have different needs for a translation device, as they need one that is portable. Will Powell's invention might be the solution for them, as it only needs a mobile-phone signal, headsets and a laptop. The problem with conversations, though, is that the software needs to determine who is speaking when. Mr Powell has thought of this as well: in his device, all speech is run through two different translation systems at the same time, one English-to-Spanish and one Spanish-to-English. The translation that makes sense is then displayed in the right pair of goggles.
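Powell's "run both directions and keep the one that makes sense" trick can be sketched as follows. The tiny vocabularies and the scoring rule are made-up assumptions for illustration; a real system would use a proper language model to judge which output is plausible.

```python
# Minimal sketch: decide which translation direction applies by
# checking which language the utterance "makes sense" in.
EN_WORDS = {"where", "is", "the", "station"}
ES_WORDS = {"donde", "esta", "la", "estacion"}

def score(text, vocab):
    # Fraction of words recognised in the vocabulary - a crude
    # stand-in for a real "does this make sense?" check.
    words = text.lower().split()
    return sum(w in vocab for w in words) / len(words)

def route(utterance):
    # Both translators would run in parallel; here we only score the
    # input against each language to pick which output to keep.
    en, es = score(utterance, EN_WORDS), score(utterance, ES_WORDS)
    return "translate EN->ES" if en >= es else "translate ES->EN"

print(route("where is the station"))    # translate EN->ES
print(route("donde esta la estacion"))  # translate ES->EN
```

The design choice is pleasingly simple: rather than solving speaker identification, the system lets the nonsensical output eliminate itself.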
These problems show that simultaneous machine translation isn't as developed as its consecutive counterpart: Jibbigo, for instance, has developed a consecutive-translation app with a 40,000-word vocabulary in ten different languages. And then we haven't even started on the fact that people in real-world conversations often interrupt each other and use slang.
That Star Trek device might thus be a number of light years away. Beam me up, Scotty!