QUOTE EXTRACTION FROM ESTONIAN MEDIA: ANALYSIS AND TOOLS

Author(s): Dage Särg, Karmen Kink, Karl-Oskar Masing
Subject(s): Media studies, Lexis, Computational linguistics, Finno-Ugrian studies
Published by: Eesti Rakenduslingvistika Ühing (ERÜ)
Keywords: quote extraction; indirect speech; named entity recognition; information extraction; corpus linguistics; computational linguistics; Estonian

Summary/Abstract: This paper describes the identification, adaptation, and creation of the tools needed to build a quote extractor for Estonian media texts that extracts both direct and indirect quotes and attributes each to the correct person, identified by full name and profession. The tools cover named entity recognition and resolution as well as grammar-based extraction of direct and indirect quotes. To better understand indirect speech in Estonian media, we also performed a corpus-linguistic analysis of the quotes that our tools extracted from one week of Estonian news.

  • Issue Year: 2021
  • Issue No: 17
  • Page Range: 249-265
  • Page Count: 17
  • Language: English