A Few Words

Machine Translation: Isn't there an app for that?

Posted by Elanex Marketing Team on Dec 10, 2014 11:19:00 AM

Digital, mobile, social, always-on, all-accessible world

When it comes to communicating, we have very few limitations. Thanks to technology, distance is no longer a challenge. Instead, the last barrier preventing us from connecting with anyone in this always-on, all-accessible world is the lack of a common language.  

But, of course, there is an app for that too.  The question is: Does it deliver?

Long before Google, linguists and computer scientists in the 1950s searched for the Holy Grail of translation: software that could translate as well as humans. This technology is known as machine translation (MT). While the power of automated translation can’t be denied, there’s a reason we haven’t retired the hundreds of thousands of professional freelance translators in the world.

The Challenge: Comprehension vs. Communication

The current state of MT is “good enough” for comprehension in most major language pairs, especially those involving English and European languages. Today’s MT systems provide a good to very good idea of the meaning of a text. This is often called “gisting”: you get the gist, or essence, of the text. For this reason, MT is gaining acceptance for content that, due to the cost of professional translation, normally would not be translated at all, such as emails, blogs and user-generated content. Many quick and easy translation apps and services rely on the power (and low cost) of MT.

However, there is a difference between comprehending a foreign language and being able to communicate in it. MT alone is not suitable for official communications, such as legal contracts or proposals, or for most customer-facing content, such as company websites, user guides and technical manuals. Unfortunately, those unfamiliar with other languages (attention, Americans) often naïvely assume that MT systems produce fluent translations, and they blindly use free or low-cost MT services to translate important content.

The easiest way to understand what a native speaker reading your translation experiences is to machine translate something from another language into your own. You’ll see that while the translation may be understandable, it will not inspire confidence. More critically, due to the nuances of language, it may not even be accurate.

Rules vs. Statistics

In order to appreciate the capabilities of MT, it’s important to understand the two popular approaches to it: Rules vs. Statistics.

In rule-based MT, words are translated from one language to another following specific grammar rules. Government research agencies in particular have invested large sums chasing this method. But what about the exceptions inconveniently found in all language grammars? Programming a set of rules that completely define a language has proven to be impossible, and, as a result, rule-based MT so far has not achieved a broad level of acceptance.
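To make the exception problem concrete, here is a minimal sketch of a rule-based translator in Python. Everything in it is a toy assumption for illustration: the tiny English-to-Spanish lexicon, the word lists, and the single reordering rule are hypothetical and not taken from any real MT system.

```python
# Toy rule-based translator: a bilingual dictionary plus one grammar rule.
# The lexicon, word classes, and rule are illustrative assumptions only.
LEXICON = {
    "the": "el",
    "red": "rojo",
    "great": "gran",
    "car": "coche",
    "man": "hombre",
}
ADJECTIVES = {"red", "great"}
NOUNS = {"car", "man"}


def translate_rule_based(sentence):
    """Translate word by word, applying one reordering rule."""
    words = sentence.lower().split()
    reordered = []
    i = 0
    while i < len(words):
        # Rule: English "adjective noun" becomes Spanish "noun adjective".
        if i + 1 < len(words) and words[i] in ADJECTIVES and words[i + 1] in NOUNS:
            reordered.extend([words[i + 1], words[i]])
            i += 2
        else:
            reordered.append(words[i])
            i += 1
    # Words missing from the lexicon are passed through unchanged.
    return " ".join(LEXICON.get(word, word) for word in reordered)


print(translate_rule_based("the red car"))    # rule works: "el coche rojo"
print(translate_rule_based("the great man"))  # rule misfires: "el hombre gran"
```

The rule handles "the red car" correctly, but "the great man" should be "el gran hombre" in Spanish, because "gran" is one of the adjectives that goes before the noun. Covering that case means adding an exception, and every real grammar has a long tail of them.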

In response to rule-based shortcomings, researchers invented systems that "learn" by analyzing large amounts of already translated text to create statistical probability models for how new text should be translated. These systems, known as statistical machine translation (SMT), have become very advanced in the last few years (Google Translate is a good example), but the results are highly variable. They are greatly influenced by a) the size of the training corpus, the collection of aligned bilingual texts from which the engine “learns”; and b) specific terminology (a glossary) that further “tunes” the engine. Some companies have been very successful in using SMT systems for internal documentation when the systems are properly trained with previously translated materials specific to their company.
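As a rough illustration of how a system can "learn" from aligned bilingual text, the Python sketch below estimates word translation probabilities with a few rounds of expectation-maximization, in the style of the classic IBM Model 1 word-alignment model. It is a simplified teaching example under stated assumptions: the three-sentence German-English corpus and the function names are invented here, and production SMT engines are far more elaborate.

```python
from collections import defaultdict


def train_word_model(pairs, iterations=10):
    """Estimate t(target_word | source_word) from sentence-aligned pairs."""
    # Start with equal weight for every co-occurring word pair. Any constant
    # works, because the first pass below normalizes the weights.
    t = {}
    for src, tgt in pairs:
        for tw in tgt:
            for sw in src:
                t[(tw, sw)] = 1.0

    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target word among the source words in its sentence.
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # M-step: turn the fractional counts back into probabilities.
        t = {(tw, sw): c / total[sw] for (tw, sw), c in count.items()}
    return t


def best_translation(t, source_word):
    """Return the most probable target word for a source word."""
    candidates = {tw: p for (tw, sw), p in t.items() if sw == source_word}
    return max(candidates, key=candidates.get)


# Hypothetical miniature training corpus (tokenized sentence pairs).
corpus = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
model = train_word_model(corpus)
for word in ["das", "Haus", "Buch", "ein"]:
    print(word, "->", best_translation(model, word))
```

No dictionary or grammar rule is supplied: the pairing of "das" with "the" emerges only from the words appearing together repeatedly. This is also why corpus size and domain matter so much, since the model knows nothing beyond what its training text contains.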

By combining MT with expert translators, large volumes of content can be quickly translated and verified by human eyes at about 4 times the speed of standard translation. Good quality MT translations can be "fixed" by human editors in a fraction of the time it would take them to translate from scratch, significantly lowering costs.

A Solution? Post-Editing

The drive to improve MT results has led to what is called "post-editing," in which professional translators edit the machine translation output. This is often called PEMT, for post-editing of machine translation. If the machine translation is of reasonable quality, this approach can significantly reduce time and cost.

There are two views on how to use PEMT. One is to improve the translation just enough to correct terminology and make it more readable, without focusing heavily on grammar or writing style. This can produce readable translations at a fraction of the cost and time of a standard translation. The other approach is to use intensive editing to reach human translation quality levels. The result is much better, but the time and cost can approach those of human translation, depending on the complexity of the text. An additional benefit of the latter approach is that the result can be fed back into the MT engine for tuning purposes, leading to continual improvement of future translations. Some companies have decided that this long-term benefit is worth keeping PEMT costs at traditional human translation levels for now, expecting large time and cost savings later.

However, neither approach works if the machine translation is poor to begin with. Anyone who has tried to edit a very poorly written text will understand why: at some point, it is faster to start from scratch, and the output is of higher quality. A translator who has to post-edit poor MT output would do better to simply start over. This is a very important consideration when evaluating a PEMT solution.
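One way to put a number on how much fixing an MT draft needed is to count the word-level edits between the raw MT output and the post-edited version. The Python sketch below does this with a plain word-level edit distance. It is a simplified relative of edit-rate metrics such as TER (it has no phrase-shift operation), and the function names and the 0.5 cutoff are assumptions chosen for illustration, not an industry standard.

```python
def word_edit_distance(a, b):
    """Minimum number of word insertions, deletions and substitutions
    needed to turn token list a into token list b."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, 1):
        curr = [i]
        for j, word_b in enumerate(b, 1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def post_edit_effort(mt_output, post_edited):
    """Edits per word of the final text: 0.0 means the MT draft was untouched."""
    mt_words, final_words = mt_output.split(), post_edited.split()
    if not final_words:
        return 0.0 if not mt_words else 1.0
    return word_edit_distance(mt_words, final_words) / len(final_words)


# Hypothetical cutoff: above this, translating from scratch may be faster.
RETRANSLATE_THRESHOLD = 0.5

effort = post_edit_effort("the cat sat on mat", "the cat sat on the mat")
print(round(effort, 3), "retranslate" if effort > RETRANSLATE_THRESHOLD else "post-edit")
```

Tracked over a sample of segments, a score like this gives a rough signal of whether an engine's output is worth post-editing at all for a given content type.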

The Future of Translation Services

It’s hard to know what the future holds, but we are certainly seeing an increase in the use of MT in everyday communication. Translation is becoming a utility embedded in many devices and apps, helping to bring down language barriers in everyday interactions.

NTT DoCoMo’s JSpeak is one such example. The mobile carrier’s app is helping tourists navigate Japan by instantly translating spoken Japanese into English or other languages and vice versa. The service offers a preinstalled list of over 700 handy phrases for transportation, restaurants, hotels, shopping, hospitals and other common encounters.

Google is also helping globetrotters with its recent purchase of Word Lens. The app allows you to point your smartphone’s camera at simple text and have it immediately translated into your native language. The app, which can translate signs, menus and even books, replaces the words in the live view onscreen with their English equivalents.

Microsoft is also testing the MT waters with a new feature that instantaneously translates Skype conversations. The software provides an audio translation in a male or female voice of everything being said, plus an onscreen text transcript.

TAUS, the Translation Automation User Society, continues to highlight the necessity and demand for translation in all markets across all industries today, and it’s catching on. For example, in April 2014, the National Institute of Information and Communications Technology (NICT) launched a Global Communication Plan to help create a multilingual speech translation system implemented across different sectors, such as medicine, tourism, finance and even disaster planning and prevention.

You might think this paints a bleak picture for human translators, but MT technology has a long way to go before it can fully replace professional human translation when accuracy and communication count. Moreover, use of MT has been shown to increase demand for professional services once companies learn the benefit of connecting with customers around the world in their own language. A recent report by Common Sense Advisory found consumers overwhelmingly prefer to make purchases in their mother tongue. While simple MT might help a business make itself understood, only a human can make the reader feel like you’re really speaking their language.

To learn more about Elanex’s PEMT solution, check out VeriFast(sm).

