Retrospective: Machine Translation

Acclaro's 10th anniversary is fast approaching, and today we want to take a look at one of the more striking changes in the translation services industry over the past decade: machine translation, or MT. There are two basic types of machine translation technology: rule-based systems and statistical (stats-based) systems.

Rule-Based Machine Translation Technology

Rule-based machine translation relies on built-in linguistic rules and bilingual dictionaries, which can contain millions of entries, for each language pair. This process requires extensive lexicons with morphological, syntactic, and semantic information, along with rules that describe how text is parsed and generated. The software uses these complex rule sets to analyze the source text and transfer its grammatical structure into the target language.

Users can improve the out-of-the-box translation quality by adding their terminology into the translation process, on an as-needed basis. They create user-defined dictionaries which override the system's default settings.
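To make the idea concrete, here is a deliberately tiny sketch of the rule-based approach for English to Spanish. Everything in it is an assumption for illustration: the four-word dictionary, the part-of-speech tags, and the single adjective-noun reordering rule are hypothetical and do not come from Systran or any other real product. It shows a dictionary lookup, one structural rule, and a user-defined dictionary overriding a default entry.

```python
# Toy rule-based MT sketch (English -> Spanish). The dictionary entries and
# the single reordering rule are hypothetical, not a real system's data.
DEFAULT_DICTIONARY = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "car": ("coche", "NOUN"),
    "fast": ("rápido", "ADJ"),
}


def translate(sentence, user_dictionary=None):
    # User-defined entries override the system's default entries.
    lexicon = {**DEFAULT_DICTIONARY, **(user_dictionary or {})}

    # Lexical transfer: look up each word; unknown words pass through as-is.
    tagged = [lexicon.get(word, (word, "UNK")) for word in sentence.lower().split()]

    # Structural transfer: Spanish usually places the adjective after the noun.
    output = []
    i = 0
    while i < len(tagged):
        is_adj_noun = (
            i + 1 < len(tagged)
            and tagged[i][1] == "ADJ"
            and tagged[i + 1][1] == "NOUN"
        )
        if is_adj_noun:
            output.extend([tagged[i + 1][0], tagged[i][0]])
            i += 2
        else:
            output.append(tagged[i][0])
            i += 1
    return " ".join(output)


print(translate("the red car"))
print(translate("the red car", {"car": ("automóvil", "NOUN")}))
```

The second call swaps in a company's preferred term without touching the rules, which is why dictionary-based customization behaves so predictably. A real system needs far more than this, including agreement, inflection, and thousands of rules.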

Ten years ago, rule-based MT systems (like Systran) were pretty much the only game in town. While rule-based MT worked well for large companies in industries with straightforward terminology, such as automotive or manufacturing, it wasn't the best option for many other uses, such as marketing translations. All that changed with Google Translate, which works on a very different model.

Statistical Machine Translation Technology

Statistical machine translation analyzes and indexes huge monolingual and bilingual corpora (a.k.a. collections of existing text and existing translations). We're not kidding: a minimum of 2 million words for a specific domain, and even more for general language, are required before stats-based MT can really function effectively. As a result, statistical machine translation is CPU-intensive and requires an extensive hardware configuration to run translation models at even average performance levels. However, you gain access to a much greater and more diverse pool of possible translations, thereby increasing the odds of getting good quality from the start.
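As a rough illustration of the statistical idea, the sketch below builds a tiny "phrase table" by counting how often each source phrase was translated a given way, then picks the most frequent option. The phrase pairs are made up, and the sketch assumes they have already been aligned, which is itself a large part of the work in a real system. Relative-frequency counting is only one simple way to estimate these probabilities and is not a description of how Google Translate or any specific product works.

```python
from collections import Counter, defaultdict

# Hypothetical, pre-aligned phrase pairs of the kind that might be extracted
# from a bilingual corpus. Real systems learn from millions of words.
ALIGNED_PAIRS = [
    ("bank", "banco"),
    ("bank", "banco"),
    ("bank", "banco"),
    ("bank", "orilla"),
    ("river bank", "orilla del río"),
]


def build_phrase_table(pairs):
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    # Relative frequency: P(target | source) = count(source, target) / count(source)
    return {
        source: {t: c / sum(targets.values()) for t, c in targets.items()}
        for source, targets in counts.items()
    }


def best_translation(table, source):
    # Return the most probable translation, or None for unseen phrases.
    candidates = table.get(source)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


table = build_phrase_table(ALIGNED_PAIRS)
print(table["bank"])
print(best_translation(table, "bank"))
print(best_translation(table, "tree"))
```

The choice depends entirely on what the corpus happened to contain: add a few more pairs and the winner can change, and a phrase the corpus never saw has no translation at all. That is the trade-off in miniature: more data gives more and better options, but the output is harder to predict.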

Google Translate is probably the largest and best-known stats-based system, and for good reason: Google has been indexing translations for years, so they have a large and varied resource base.

What's the difference?

Rule-based MT provides good out-of-domain quality and is by nature predictable. Dictionary-based customization guarantees improved quality and compliance with corporate terminology, but translation results may lack the fluency readers expect. In terms of investment, the customization cycle needed to reach the quality threshold can be long and costly. The performance is high even on standard hardware.

Statistical MT provides good quality when large and qualified corpora are available. The translation is fluent, meaning it reads well and therefore meets user expectations. However, the translation is neither predictable nor consistent. Training from good corpora is automated and cheaper. But quality on general-language text, meaning text outside the specified domain, is poor. Furthermore, statistical MT requires significant hardware to build and manage large translation models.

What's next?

Good question. Technology changes make it an exciting time to work in the translation industry, and we're optimistic about what the next ten years may bring. From our standpoint today, MT and human translation each have their place. While MT may be cheap and fast, it doesn't always produce the best quality or cope well with the illogical and nuanced characteristics inherent in almost every language. If you do use machine translation without a human-driven quality check, your global users might get the wrong message. Human translation is flexible, accurate, and conveys the right idea, but can be a slower and more expensive process. While the ideal solution would be to have a superhuman translator who works 24/7 and outputs a million words a month with 100% accuracy, that's just not the reality of where we are... or at least, not yet. In some instances, a hybrid solution that combines an initial machine translation with a human post-edit may work, provided you need to get a very large amount of translation work done quickly and your content is well suited to current MT technology.

If you are curious about whether machine translation might be a good fit, we wrote a newsletter article on that very topic, and you can also contact us for more information.

Photo attribution: jcorrius


