How MT changes the translation industry
Machine Translation, often referred to as MT, has been around for decades. It has evolved through several iterations: from rule-based, statistical, example-based, and hybrid systems to the current generation, Neural Machine Translation (NMT).
MT became a viable option in the 1990s, but quality and use cases were limited. With each iteration of MT, both improved. Most of us still remember the famously bad translations of product manuals, especially from Asian manufacturers. Those were typical examples of machine translation and its performance at the time.
During the 2010s, with the emergence of NMT, we saw a big leap in quality. This greatly expanded how machine translation can be used in real-world operations and processes. Since then, Google Translate, DeepL, Systran, Yandex, and many others have dramatically improved the quality and performance of their neural machine translation output.
Today, NMT is in use at many large enterprises and LSPs (Language Service Providers, i.e. translation agencies) to help streamline processes and cut costs, with cost reduction being the main focus.
Market share and dynamics
Today, roughly 30% of all content is translated by neural machine translation, while 70% is still done by human translators. A common misconception is that machine translation replaces human translation.
While this is partially true, most machine translation volume comes from content that was previously not translated at all for time and cost reasons. The speed and low cost of MT unlock that potential.
Thus, machine translation is causing the €50bn language industry to grow further rather than cannibalizing human translation.
Different approaches and their implications:
The quantitative approach is used, for example, by Google Translate. Google crawls the internet for content that is available with translations in other languages and trains its engine on the content gathered this way.
+ Large volumes: a large number of data points builds a good average.
+ Lots of languages covered.
- With large volumes, the content cannot be properly vetted, so low-quality content ends up in the training data.
- Output quality suffers as a result.
In contrast, the qualitative approach to NMT focuses on high-quality training content, to give the AI the best possible basis for learning. This approach is used, for example, by DeepL.
+ Quality is high.
- High-quality content is needed for training, and this kind of content is hard to generate.
- Adding a language takes a lot of effort, as high-quality content needs to be gathered first.
- The quality for different language pairs varies greatly.
- Several different NMT providers are needed to cover all needs.
The underlying technology is based on AI language models.
Both proprietary and open-source engines are in use, e.g. OpenNMT, an open-source neural machine translation toolkit based on research from Harvard NLP.
Generally speaking, the quality of the training material is the main factor when it comes to translation quality.
To cover the gap between pure machine translation and human translation, a hybrid has emerged: machine translation with post-editing, or MTPE.
This is one of the fastest-expanding practices in the translation industry, among LSPs, pure hybrid providers like Lengoo, and the leading translation marketplace Lyngual.
+ Fast and cheap due to MT, but all common errors of MT are corrected by a human translator.
- Bad at conveying underlying tone and humorous content, thus not good with creative or marketing content.
The future of neural machine translation
The results and possible applications of GPT-3 are promising for both content creation and translation, and it will be very interesting to watch how they progress.
Where MT makes sense:
· Technical documentation.
· Repetitive content.
· Internal content.
· Quick and dirty translation to understand the content.
Where MT is not the best option:
· Creative content.
· Marketing content.
· Content that will be published in any way (MTPE is recommended as a minimum).
· Content that is supposed to convey emotion or humor.
· Content that needs to be localized, i.e. adapted to local standards and culture.
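The two lists above amount to a simple routing rule, which can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category labels, the `Workflow` enum, and the `choose_workflow` function are hypothetical names invented for this example, and the fallback for unknown content (send it to MTPE) is a cautious guess rather than an industry rule.

```python
from enum import Enum


class Workflow(Enum):
    MT = "raw machine translation"
    MTPE = "machine translation with post-editing"
    HUMAN = "human translation"


# Hypothetical category labels, derived from the two lists above.
MT_FRIENDLY = {"technical_documentation", "repetitive", "internal", "gisting"}
HUMAN_ONLY = {"creative", "marketing", "emotional_or_humorous", "needs_localization"}


def choose_workflow(category: str, will_be_published: bool) -> Workflow:
    """Pick a translation workflow for one piece of content."""
    if category in HUMAN_ONLY:
        # Tone, humor, and cultural adaptation are where MT is weakest.
        return Workflow.HUMAN
    if will_be_published:
        # Published content should get at least a post-editing pass.
        return Workflow.MTPE
    if category in MT_FRIENDLY:
        return Workflow.MT
    # Assumption: unknown categories get a human check to be safe.
    return Workflow.MTPE


if __name__ == "__main__":
    print(choose_workflow("internal", will_be_published=False).value)
    print(choose_workflow("technical_documentation", will_be_published=True).value)
    print(choose_workflow("marketing", will_be_published=False).value)
```

In practice the categories would come from whoever owns the content, and the thresholds for "published" may differ per company; the point is only that the decision can be made explicit before anything is sent to an engine.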
MT and legal concerns
There are several websites where content can be translated via MT for free. Be warned, nothing on the internet is free: you pay for the free service with your content (read the terms and conditions).
This can mean that, legally, your content counts as published once you use these sites, which may put you in breach of any NDA you have signed.
I have seen many companies struggle with this topic. There have even been a number of instances where confidential and critical information was recovered by hackers from these engines, causing a lot of problems. So be mindful and careful, and closely consider which content you put through a free MT service.
The alternative is to use professional engines or services like Lyngual to keep your content safe. The cost of staying safe is very low.