03/08/2022

Context is key in MT: Highlights from the NeTTT conference 2022 (Day 2)

In a previous installment we brought you quick commentary on some of the panels we attended at the first-ever conference on New Trends in Translation and Technology (NeTTT), covering the first day of the event.

For coverage of the other days of the conference, see our articles here:

MT is not the future, but the now: Highlights from the NeTTT conference 2022 (Day 1)
Towards better MT: Highlights from the NeTTT conference 2022 (Day 3)

As we mentioned in the previous post, the conference brought together both academics and industry players with interest in translation studies, linguistics, machine translation, and other relevant domains, to share the most recent and cutting edge research and insights in the field with one another.

The second day saw a lot more panels on machine translation, and unfortunately we were unable to catch them all. But here are some of the highlights from the ones we attended, along with some special mentions.

Addressing context in machine translation development

The second day of the conference began with a keynote titled “Machine Translation Using Context Information”, presented by Marcello Federico of AWS AI Labs.

Federico emphasizes that machine translation output may look correct out of context, but many external factors can reveal it to be incorrect. These include gender, speech register, and the topic or domain of discourse, among other things that define the context in which the original text, and by extension the translation, operates.

Machine translation has yet to solve these problems, but there is already a lot of research on how generic data can be annotated to control for some of these factors, and on how past translations can be analyzed to provide better output.
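As an illustration (not from the talk itself), one widely studied way to expose such context to an MT system is to prepend control tokens to each source sentence before training, so the model learns to condition its output on factors like gender or register. The token names below are hypothetical, chosen only for this sketch:

```python
def annotate_source(sentence: str, gender: str = "unknown",
                    register: str = "neutral") -> str:
    """Prepend context control tokens to a source sentence.

    Sentences annotated this way can be fed to a standard
    sequence-to-sequence MT model; at inference time, the same
    tokens steer the translation toward the desired context.
    """
    tokens = []
    if gender in ("feminine", "masculine"):
        tokens.append(f"<gender:{gender}>")
    if register in ("formal", "informal"):
        tokens.append(f"<register:{register}>")
    return " ".join(tokens + [sentence])


print(annotate_source("How are you?", gender="feminine", register="formal"))
# <gender:feminine> <register:formal> How are you?
```

The appeal of this approach is that it requires no change to the model architecture; the context signal travels through the ordinary input vocabulary.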

Context in machine translation for subtitles

In line with the theme of context, the next panel we were able to attend discussed a specific context-related problem in “Fixed Language Units and Machine Translation: Pragmatemes in Machine-Translated Subtitles” presented by Judyta Mężyk.

Mężyk defines pragmatemes as “autonomous, polylexical, semantically compositional utterances constrained in their signified by the situation of communication in which they are produced”.

If that sounds like a mouthful, what it basically means is text that communicates its full meaning only in the context it is part of. Examples include greetings such as “Hello!” or “Good morning!” and situational sentences like “How can I help you?” and “Sign here, please.”

Critical or uncritical? MT in the news

Newspapers and media are major purveyors of mainstream views on current topics and trends, and as such make a good subject for study. This is what researcher Elizabeth Marshman tackles in her panel “Weird, Wonderful, Worthy, and Worrying: Use Cases for MT as Described in Canadian Newspapers”.

Marshman shares a range of nuanced, as well as some relatively uncritical, views of machine translation as presented by journalists in various newspapers, painting a somewhat encouraging picture of machine translation’s portrayal in mainstream media. Still, there’s room for improvement in making journalists more aware of the nuances.

Special mentions

As we mentioned earlier, there were a few panels we were unable to attend that we believe deserve special mention.

The first concerns the ethics of developing machine translation models for low-resource languages: “Machine Translation and Technocracy: Mitigating issues of power parity in MT for low-resource languages”, presented by Matt Riemland.

The second is “Human-Adapted MT for Literary Texts: Reality or Fantasy?” presented by Damien Hansen and Emmanuelle Esperança-Rodier. Literature has long been considered a Waterloo for machine translation, so it would be interesting to learn the researchers’ take on the topic.