Attention mechanism and skip-gram embedded phrases: short and long-distance dependency n-grams for legal corpora

Author(s): Panagiotis Krimpas, Christina Valavani
Subject(s): Translation Studies
Published by: Uniwersytet Adama Mickiewicza
Keywords: computational linguistics; legal terminology; legal translation; Neural Machine Translation; Self-Attention Mechanism; short- and long-distance dependency n-grams; skip-gram algorithm

Summary/Abstract: This article examines common errors that occur in the translation of legal texts. In particular, it focuses on how German texts containing legal terminology are rendered into Modern Greek by Google's machine translation engine. Our case study is the Google-assisted translation of the original (German) version of the Constitution of the Federal Republic of Germany into Modern Greek. A training method is proposed in which phrases are extracted on the basis of occurrence frequency, embedded with the skip-gram algorithm, and then integrated into the self-attention mechanism proposed by Vaswani et al. (2017), in order to minimise human effort and contribute to the development of a robust machine translation system for multi-word legal terms and special phrases. This Neural Machine Translation approach aims at deriving vectorised phrases from large corpora and processing them for translation. The research direction is to increase the in-domain training data set and to enrich the vector dimensions with more information on legal concepts (domain-specific features).
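
The pipeline the abstract describes (frequency-based phrase extraction, skip-gram embedding of the extracted phrases, and self-attention over the embedded sequence) can be illustrated with a minimal sketch. The example below assumes gensim for the skip-gram step and PyTorch's MultiheadAttention for the self-attention step; the toy corpus, the MIN_FREQ threshold, and the merge helper are illustrative assumptions, not the authors' implementation.

    # Minimal sketch of the pipeline described in the abstract; illustrative only.
    from collections import Counter

    import numpy as np
    import torch
    from gensim.models import Word2Vec

    # Toy tokenised German legal corpus; a real system would use a large
    # in-domain corpus such as the text of the Grundgesetz.
    corpus = [
        ["die", "würde", "des", "menschen", "ist", "unantastbar"],
        ["die", "würde", "des", "menschen", "zu", "achten"],
        ["recht", "auf", "leben", "und", "körperliche", "unversehrtheit"],
    ]

    # 1) Frequency-based phrase extraction: keep bigrams at or above an
    #    assumed frequency threshold.
    MIN_FREQ = 2
    bigrams = Counter((a, b) for s in corpus for a, b in zip(s, s[1:]))
    phrases = {bg for bg, n in bigrams.items() if n >= MIN_FREQ}

    def merge(sent):
        """Greedily fuse frequent bigrams into multi-word tokens ("würde_des")."""
        out, i = [], 0
        while i < len(sent):
            if i + 1 < len(sent) and (sent[i], sent[i + 1]) in phrases:
                out.append(sent[i] + "_" + sent[i + 1])
                i += 2
            else:
                out.append(sent[i])
                i += 1
        return out

    merged = [merge(s) for s in corpus]

    # 2) Skip-gram embeddings: sg=1 selects the skip-gram variant in gensim.
    w2v = Word2Vec(merged, vector_size=32, window=2, min_count=1, sg=1)

    # 3) Self-attention (Vaswani et al. 2017) over one embedded sentence.
    sent = merged[0]
    x = torch.from_numpy(np.stack([w2v.wv[t] for t in sent])).unsqueeze(0)
    attn = torch.nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
    out, weights = attn(x, x, x)   # queries, keys and values from the same sentence
    print(sent)                    # merged phrase tokens
    print(weights.squeeze(0))      # attention weights between the phrase tokens

Fusing frequent bigrams before embedding is one simple way to give a multi-word legal term a single vector so that attention operates over whole phrases rather than isolated words; the extraction procedure in the article itself may differ.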

  • Issue Year: 2022
  • Issue No: 52
  • Page Range: 318-350
  • Page Count: 33
  • Language: English