The University of Sheffield
Department of Computer Science

Peter Roch Undergraduate Dissertation 2015/16

"A picture is worth a thousand words" - a survey on the effect of images in machine translation comprehension.

Supervised by L. Specia

Abstract

Automatically translating human language has long been a sought-after goal in the field of Natural Language Processing (NLP). Machine Translation (MT), one of the oldest subfields of Artificial Intelligence (AI) research, aims to automatically convert text in one natural language into another. MT can significantly lower communication barriers, with enormous potential for positive social and economic impact. The dominant paradigm is Statistical Machine Translation (SMT), which learns to translate from human-translated examples. When translating, human translators have access to a number of contextual cues beyond the segment being translated, for example images associated with the text, or related documents. SMT systems, however, completely disregard any form of non-textual context and make little or no reference to the wider surrounding text. This results in translations that miss relevant information or convey incorrect meaning. Such issues drastically affect reading comprehension and may make translations useless. This is especially critical for user-generated content such as social media posts, which are often short and contain non-standard language, but it applies to a wide range of text types.