Korpuslingvistiline lähenemine eesti internetikeele automaatsele morfoloogilisele analüüsile
A corpus-based approach to the automatic morphological analysis of Estonian computer-mediated communication

Author(s): Heiki-Jaan Kaalep, Raul Sirel, Kadri Muischnek
Subject(s): Language and Literature Studies
Published by: Eesti Rakenduslingvistika Ühing (ERÜ)
Keywords: computational linguistics; corpus linguistics; morphology; morphosyntax; wordclass; orthography; Estonia

Summary/Abstract: This article concentrates on the aspects of Estonian that differ between computer-mediated communication and the standard written language: orthography and the divergence of word-forms. The authors present an analysis of these differences and propose a way to adapt an existing morphological analyser for analysing computer-mediated communication. The method entails the creation of a user lexicon for the morphological analyser, carried out largely automatically, and the automatic pre-processing of texts.

  • Issue Year: 2011
  • Issue No: 7
  • Page Range: 111-127
  • Page Count: 17
  • Language: Estonian