Автоматично извличане на езикови ресурси за откриване на конфликтни събития за български език
Automatic Learning of Language Resources for Detection of Conflict Events for Bulgarian

Author(s): Hristo Tanev
Subject(s): Language studies, Language and Literature Studies, Theoretical Linguistics, Applied Linguistics, Lexis, Semantics, Comparative Linguistics, Philology
Published by: Institute for Bulgarian Language "Prof. Lyubomir Andreychin", Bulgarian Academy of Sciences
Keywords: event extraction; terminology extraction; dictionaries
Summary/Abstract: In this paper we present an overview of the detection of conflict events, such as battles and other military operations, in news streams. We then evaluate a terminology extraction algorithm for learning Bulgarian lexica specific to military conflicts. Domain-specific dictionaries for conflicts often require thousands of entries, including professions, military ranks, weapons, vehicles, actions, organization names, relevant adjectives and other lexical items. The evaluation shows very promising results: the accuracy of the learning algorithm exceeds 80%, demonstrating the feasibility of event detection for the Bulgarian language.