Automatic Alignment of the Psalterium Sinaiticum and the Septuagint Psalms
Author(s): Hanne Eckhoff
Subject(s): Language studies, Language and Literature Studies
Published by: Кирило-Методиевски научен център при Българска академия на науките
Keywords: Old Church Slavonic; Greek; automatic alignment; automatic tagging; aspect.
Summary/Abstract: This paper describes the work on automatically aligning the Psalterium Sinaiticum with the Septuagint psalms in the Tromsø Old Russian and Old Church Slavonic Treebank (TOROT). It briefly accounts for the transcription, text processing and manual annotation of the Psalterium Sinaiticum itself. It then explains the choice of Greek text, describes the automatic lemmatisation and morphological tagging of the Greek text, and calculates and analyses the success rate in a small sample. Next, the algorithm for automatic token-level alignment of the texts is briefly described, and its success rate is calculated and analysed. The results seem quite good from a quantitative perspective (over 90% accuracy in most cases), and it may seem tempting to use the data directly. However, a pilot study of aspect in the Greek and Old Church Slavonic texts shows that the automatically processed Greek parallel leads to considerable data loss, and that much manual sifting of apparent mismatch examples is necessary to arrive at even a preliminary analysis. In a low-resourced historical language such as Old Church Slavonic, we cannot afford to work with this amount of noise and data loss. We can use automatic tagging and alignment to ease our workload, but we have to post-correct the output manually.
Journal: Кирило-Методиевски студии
- Issue Year: 2021
- Issue No: 31
- Page Range: 71-90
- Page Count: 20
- Language: English