Heade näitelausete automaattuvastamine eesti keele õppesõnastike jaoks

Kristina Koppel

Heade näitelausete automaattuvastamine eesti keele õppesõnastike jaoks
Automatic detection of good dictionary examples in Estonian learner’s dictionaries

Author(s): Kristina Koppel
Subject(s): Foreign languages learning, Theoretical Linguistics, Lexis
Published by: Eesti Rakenduslingvistika Ühing (ERÜ)
Keywords: corpus lexicography; corpus linguistics; learner’s lexicography; language learning; collocations; usage examples; GDEX; Estonian

Summary/Abstract: This paper first explains how the Good Dictionary Example (GDEX) tool (Kilgarriff et al. 2008) scores corpus sentences and helps the lexicographer automatically select the best examples for dictionaries. Second, it introduces the training datasets, which contain example sentences from the Estonian Collocations Dictionary (ECD). Third, it examines the parameters of good dictionary examples. Most of the paper is based on an analysis of the training datasets and an evaluation of the previous GDEX configurations. The configurations were evaluated with the graphical user interface GDEX Editor. Based on the results of the statistical analysis and the evaluation of the different configurations, a new configuration, GDEX 1.4, is introduced; it implements 16 new parameters. The main parameters of GDEX 1.4 are as follows: the desired sentence is a full sentence; sentence length is 4–20 tokens; the sentence contains a verb; it does not contain low-frequency words or words from the blacklist; the optimal length is 6–12 tokens; and sentences are penalized if they contain more than one adverb, pronoun, proper name, numeral, conjunction or comma, more than two verbs, or certain pronouns. The output of GDEX 1.4 can be used in the ECD project and to create SkELL, a web interface for learners of Estonian.
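The parameters listed in the abstract can be illustrated with a minimal Python sketch. This is not the GDEX 1.4 configuration itself and does not use the GDEX configuration syntax. The `Token` class, the coarse part-of-speech tags, the frequency threshold `MIN_FREQ`, the penalty size `PENALTY`, the full-sentence test, and the split into hard filters and soft penalties are all assumptions made for illustration. The abstract does not list the penalized pronouns or the blacklist, so both are left as empty parameters.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    pos: str   # hypothetical coarse tag, e.g. "VERB", "ADV", "PRON", "PUNCT"
    freq: int  # hypothetical corpus frequency of the word


# Length limits taken from the abstract.
MIN_LEN, MAX_LEN = 4, 20
OPT_MIN, OPT_MAX = 6, 12
# Assumed values; the abstract does not give them.
MIN_FREQ = 5
PENALTY = 0.1
# "More than 1 adverb, pronoun, proper name, numeral, conjunction; more than 2 verbs".
MAX_PER_POS = {"ADV": 1, "PRON": 1, "PROPN": 1, "NUM": 1, "CONJ": 1, "VERB": 2}


def is_candidate(tokens, blacklist=frozenset()):
    """Hard filters (assumed hard): length, full sentence, verb, frequency, blacklist."""
    if not (MIN_LEN <= len(tokens) <= MAX_LEN):
        return False
    # Assumed full-sentence test: capitalised start and final punctuation.
    if not tokens[0].text[:1].isupper() or tokens[-1].text not in {".", "!", "?"}:
        return False
    if not any(t.pos == "VERB" for t in tokens):
        return False
    for t in tokens:
        if t.pos == "PUNCT":
            continue
        if t.freq < MIN_FREQ or t.text.lower() in blacklist:
            return False
    return True


def score(tokens, blacklist=frozenset(), penalized_pronouns=frozenset()):
    """Return a score in [0, 1]; 0.0 means the sentence was filtered out."""
    if not is_candidate(tokens, blacklist):
        return 0.0
    value = 1.0
    if not (OPT_MIN <= len(tokens) <= OPT_MAX):
        value -= PENALTY
    counts = Counter(t.pos for t in tokens)
    for pos, limit in MAX_PER_POS.items():
        if counts[pos] > limit:
            value -= PENALTY
    if sum(t.text == "," for t in tokens) > 1:
        value -= PENALTY
    if any(t.pos == "PRON" and t.text.lower() in penalized_pronouns for t in tokens):
        value -= PENALTY
    return max(0.0, round(value, 2))
```

In this sketch, a sentence that fails any hard filter scores 0.0, and every violated soft constraint subtracts the same fixed penalty. A real configuration would likely weight the parameters differently.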


Journal: Eesti Rakenduslingvistika Ühingu aastaraamat

Issue Year: 2017
Issue No: 13
Page Range: 53-71
Page Count: 19
Language: Estonian
