Multi-score neural machine translation evaluation : beyond single score metrics

Author: Park, Dojun
Dates: 2024-10-09; 2024-10-09; 2023
Identifier: 1905237219
URN: http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-ds-150244
Handle: http://elib.uni-stuttgart.de/handle/11682/15024
DOI: http://dx.doi.org/10.18419/opus-15005
Language: en
Access rights: info:eu-repo/semantics/openAccess
Classification: 004; 400
Type: masterThesis

Abstract: Translation quality evaluation in machine translation (MT) has traditionally relied on single-score metrics. This paper introduces a multi-score evaluation approach that generates a three-dimensional vector, with one score each for accuracy, fluency, and style. Through extensive empirical investigation, we show that the approach is more interpretable than traditional single-score methods and outperforms them. Among the architectures tested, RemBERT was the most promising, consistently delivering robust results in both reference-based MT evaluation and reference-free quality estimation. The approach gives stakeholders, from MT developers to end users, insight into translation quality that goes beyond a single overall assessment.