Real-Time Digital Translation

Kaelan Guetschow, VTQ Magazine

Real-time Digital In-Ear Translation: Not Just For Star Trek Anymore

As recently as a few years ago, real-time, person-to-person translation without the help of a human translator would have seemed like a concept reserved for science fiction such as The Hitchhiker’s Guide to the Galaxy or Star Trek. But the future is now, and dozens of companies have flooded the market with real-time translation devices that use algorithms and technology to do the translation for you.

Admittedly, real-time digital translation isn’t perfect, and like any emerging technology, it still has bugs that need to be resolved. In its most refined form, real-time digital translation works by recording an audio sample, converting it to text using speech-to-text, translating the text, and serving the result back out using text-to-speech. This process sounds simple, but as always, the devil is in the detail. For example, the system may not work well in noisy environments because it could pick up and try to translate other conversations going on in the background.
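The three stages described above can be sketched in a few lines of Python. This is only an illustration of how the stages chain together, not how any real product is built: the class name, the function names, and the tiny phrasebook are all made up for the example, and the "audio" is just text encoded as bytes so the sketch runs without a microphone or a speech engine.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationPipeline:
    """Chains the three stages: speech-to-text, translate, text-to-speech."""
    speech_to_text: Callable[[bytes], str]
    translate: Callable[[str], str]
    text_to_speech: Callable[[str], bytes]

    def run(self, audio: bytes) -> bytes:
        text = self.speech_to_text(audio)        # stage 1: audio in, text out
        translated = self.translate(text)        # stage 2: text in, translated text out
        return self.text_to_speech(translated)   # stage 3: text in, audio out


# Hypothetical stand-ins for the real engines (assumptions for illustration).
PHRASEBOOK = {"hello": "hallo", "thank you": "danke"}


def fake_speech_to_text(audio: bytes) -> str:
    # Pretend the recording is already text; a real system would run speech recognition.
    return audio.decode("utf-8")


def fake_translate(text: str) -> str:
    # Look the phrase up in a toy English-to-German phrasebook;
    # unknown phrases pass through unchanged.
    return PHRASEBOOK.get(text.lower(), text)


def fake_text_to_speech(text: str) -> bytes:
    # Pretend the bytes are synthesized audio.
    return text.encode("utf-8")


pipeline = TranslationPipeline(fake_speech_to_text, fake_translate, fake_text_to_speech)
spoken_reply = pipeline.run(b"Hello")
```

Because each stage waits for the one before it, any delay in one stage adds directly to the total lag, which is why the speed of the whole chain matters so much in conversation.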

Converting speech to text and translating it takes a lot of processing power, and the system can’t always keep up with the flow of conversation. The AI-generated voice that outputs the translation also needs the right cadence and inflection so that the originally intended meaning can be understood.

There are lots of small problems like these that engineers, developers, and researchers need to solve before the technology can be truly seamless, but that’s not stopping companies from making it readily available for consumer use.

One of the biggest players in the digital real-time translation game is Google. It first introduced real-time translation with its Pixel Buds in 2017 and has since made it a feature of Google Translate that works with all Google Assistant-optimized headphones and phones.

To combat the lag that comes inherently with the translation process, Google uses Neural Machine Translation (NMT) to predict the upcoming sequence of words. Think of it like preloading a game. Translating the predicted speech before the phrase has been finished reduces the processing lag and brings us closer to seamless real-time translation.
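To make the "preloading" idea concrete, here is a toy Python sketch of translating ahead of the speaker. It is not Google's method and involves no neural network: the phrase list, the function names, and the prefix-matching rule are all assumptions chosen to keep the example small. The sketch guesses the full phrase as soon as the words heard so far match only one known phrase, translates that guess early, and falls back to translating at the end if the guess turns out to be wrong.

```python
# Toy English-to-German phrase list (an assumption for illustration).
KNOWN_PHRASES = {
    "where is the train station": "wo ist der Bahnhof",
    "where is the bathroom": "wo ist die Toilette",
    "how much does this cost": "wie viel kostet das",
}


def predict_phrase(words_so_far):
    """Return the one known phrase that starts with the words heard so far, if unambiguous."""
    prefix = " ".join(words_so_far)
    matches = [phrase for phrase in KNOWN_PHRASES if phrase.startswith(prefix)]
    return matches[0] if len(matches) == 1 else None


def speculative_translate(words):
    """Return (translation, number of words heard when the translation was ready)."""
    cached = None      # (predicted phrase, its translation), filled in on the first confident guess
    ready_at = None
    for heard in range(1, len(words) + 1):
        guess = predict_phrase(words[:heard])
        if guess is not None and cached is None:
            cached = (guess, KNOWN_PHRASES[guess])  # translate the guess before the speaker finishes
            ready_at = heard
    full = " ".join(words)
    if cached is not None and cached[0] == full:
        return cached[1], ready_at                  # the guess was right: no extra wait
    # The guess was wrong or never made: translate only now, after the last word.
    return KNOWN_PHRASES.get(full, full), len(words)


translation, ready_at = speculative_translate("where is the train station".split())
```

In this run the translation is ready after the fourth word, "train", one word before the speaker finishes. A real system would work with far larger vocabularies and probabilistic predictions, but the trade-off is the same: guess early to save time, and be prepared to redo the work when the guess is wrong.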

Google isn’t the only company putting effort into the real-time digital translation game. Recently, a Shenzhen-based startup called Timekettle launched the WT2 Language Translator.

Unlike Google’s approach, which relies on software and the adaptation of existing consumer headphones, the WT2 is a set of earpieces that are meant to be worn by two people.

When connected to the app, the WT2 can (almost) seamlessly translate 36 languages and 84 accents. The WT2 stands out because it can automatically detect and interpret languages in real time while filtering out ambient noise and other conversations. And the WT2 does all of this without needing to be prompted by the user. It’s as close to a plug-and-play solution as you can get.

Although there are more options on the market than ever before, real-time digital translation is still in its infancy. We aren’t at Star Trek universal translator level yet, but as technology and research progress, we’ll get closer. Who knows, maybe someday language barriers will be a thing of the past.