01/11/2022

Custom vs stock models in machine translation: Which is better?

Working with machine translation inevitably means being aware of tradeoffs. When deciding between human and machine translation, for example, we already know it’s a decision of quality vs speed and scalability. When choosing between two different engines, it could be a matter of performance in different languages.

This also applies when choosing between types of machine translation models—do you go with the stock models, or do you need a custom version? Let’s take a look at the options.

What is the difference between stock and custom models?

Before we go on, we need to explain what stock and custom models are.

Let’s go with stock models first. A stock model is the basic version of a machine translation system that is made available for use. Today, most major machine translation providers offer a stock model trained on massive amounts of generic language data. Google Translate and other major MT systems that are freely available for general use are typical examples.

Custom models, on the other hand, are versions of an MT system that have undergone additional training to suit specific needs. MT systems can be customized for any number of domains: legal, life sciences, hospitality, finance, you name it.

For custom models, the user—usually a business or institution—brings in their own language data, usually in the form of glossaries and translation memories, to further train the MT system. This helps it provide more relevant translations, as many key terms and phrases are properly translated.

There is also a subcategory known as “vertical stock models” that falls in between the two types. Vertical stock models are models that have been pretrained by the MT provider for different domains. As such, they are more suitable for use in those domains without businesses having to provide additional language data themselves.

What is the tradeoff between the two?

As we said earlier, there’s always a tradeoff when it comes to choices like this. In the case of stock vs custom models, the tradeoff has to do with price vs quality and relevance of translations.

Custom models are, understandably, more expensive than stock models.

For example, Amazon Translate’s Standard Translation plan, at the time of writing, costs $15 per million characters, while its Active Custom Translation plan costs $60 per million characters. That’s four times as expensive!

Another example is Google’s new Translation Hub, which costs $0.15 per page for its basic service, and $0.50 per page for the advanced version that comes with customization.
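To make the price gap concrete, here is a rough sketch in Python that applies the rates quoted above to a sample workload. The prices are the ones cited in this article and may have changed since; the workload figures (5 million characters, 2,000 pages) and the function names are assumptions chosen purely for illustration.

```python
# Prices quoted in this article (USD). They may be out of date.
AMAZON_STANDARD_PER_MILLION_CHARS = 15.00
AMAZON_CUSTOM_PER_MILLION_CHARS = 60.00
GOOGLE_HUB_BASIC_PER_PAGE = 0.15
GOOGLE_HUB_ADVANCED_PER_PAGE = 0.50


def cost_by_characters(chars: int, price_per_million: float) -> float:
    """Cost in USD of translating `chars` characters at a per-million-character rate."""
    return chars / 1_000_000 * price_per_million


def cost_by_pages(pages: int, price_per_page: float) -> float:
    """Cost in USD of translating `pages` pages at a per-page rate."""
    return pages * price_per_page


# Hypothetical workload, chosen only for illustration.
chars = 5_000_000
pages = 2_000

amazon_stock = cost_by_characters(chars, AMAZON_STANDARD_PER_MILLION_CHARS)
amazon_custom = cost_by_characters(chars, AMAZON_CUSTOM_PER_MILLION_CHARS)
google_basic = cost_by_pages(pages, GOOGLE_HUB_BASIC_PER_PAGE)
google_advanced = cost_by_pages(pages, GOOGLE_HUB_ADVANCED_PER_PAGE)

print(f"Amazon stock:    ${amazon_stock:,.2f}")
print(f"Amazon custom:   ${amazon_custom:,.2f} ({amazon_custom / amazon_stock:.1f}x)")
print(f"Google basic:    ${google_basic:,.2f}")
print(f"Google advanced: ${google_advanced:,.2f} ({google_advanced / google_basic:.1f}x)")
```

The multiplier stays the same at any volume, so the absolute difference grows in step with the amount of content you translate. That difference is the figure to weigh against the quality gains discussed next.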

But the value that custom models provide more than offsets the price in most cases. Stock models may be trained on billions of strings of language and translation data, but that data is generic in nature, so they may generate translations that are wrong or not relevant to your specific needs.

Stock models are useful for translating texts that are simple and don’t feature technical or domain-specific terminology.

With custom models, the output is more likely to be accurate thanks to the language data the user provides. A custom model is more likely than a stock model to handle technical or domain-specific terms and phrases correctly.

Which is the better option?

For most businesses and institutions, going the custom route is usually the optimal choice. It’s more expensive, but the gains in quality and relevance usually outweigh the added cost.

However, not every organization can make full use of custom machine translation. This is especially the case when the user doesn’t have access to the large amounts of language and translation data needed to customize an MT model properly. In this case, a better option would be to seek out MT providers that offer vertical stock models in the relevant domain.

Stock models can be useful for general communications, but they fall short in many of the larger-scale use cases of machine translation.

Parting thoughts

As we mentioned earlier, choices like this come with tradeoffs. But there is a clear winner for most cases: custom models. Businesses and institutions looking to invest in machine translation would do well to go all-in if they have the means to do so.