POS-TAGGING TARTU CORPUS OF ESTONIAN LEARNER ENGLISH WITH CLAWS7
Author(s): Liina Tammekänd, Reeli Torn-Leesik
Subject(s): Foreign languages learning, Lexis, Computational linguistics, Finno-Ugrian studies
Published by: Eesti Rakenduslingvistika Ühing (ERÜ)
Keywords: Estonian learner English; TCELE; POS-tagging; tagger errors; corpus linguistics
Summary/Abstract: The aim of the study is to examine whether the CLAWS7 tagger is a suitable tool for tagging the Tartu Corpus of Estonian Learner English (TCELE). Extracts were tagged manually and automatically, and the results were compared to calculate the error rate and reveal the possible causes of tagger errors. The error rate was 4.01%. As expected, the tagger ran into some of the disambiguation problems outlined in the CLAWS7 post-editing guide, yet certain tagger errors were also triggered by learner errors.
Journal: Eesti Rakenduslingvistika Ühingu aastaraamat
- Issue Year: 2022
- Issue No: 18
- Page Range: 263-278
- Page Count: 16
- Language: English