POS-TAGGING TARTU CORPUS OF ESTONIAN LEARNER ENGLISH WITH CLAWS7

Author(s): Liina Tammekänd, Reeli Torn-Leesik
Subject(s): Foreign languages learning, Lexis, Computational linguistics, Finno-Ugrian studies
Published by: Eesti Rakenduslingvistika Ühing (ERÜ)
Keywords: Estonian learner English; TCELE; POS-tagging; tagger errors; corpus linguistics

Summary/Abstract: This study examines whether the CLAWS7 tagger is a suitable tool for tagging the Tartu Corpus of Estonian Learner English (TCELE). Extracts were tagged both manually and automatically, and the results were compared to calculate the error rate and to identify possible causes of tagger errors. The error rate was 4.01%. As expected, the tagger ran into some of the disambiguation problems outlined in the CLAWS7 post-editing guide, but certain tagger errors were also triggered by learner errors.

  • Issue Year: 2022
  • Issue No: 18
  • Page Range: 263-278
  • Page Count: 16
  • Language: English