Języki słowiańskie i litewski w korpusach równoległych Clarin-PL
Slavic languages and the Lithuanian language in the Clarin-PL parallel corpora

Author(s): Violetta Koseska-Toszewa, Roman Roszko
Subject(s): Language studies, Language and Literature Studies, Theoretical Linguistics, Semantics, Comparative Linguistics, Western Slavic Languages, Baltic Languages, Philology
Published by: Instytut Slawistyki Polskiej Akademii Nauk
Keywords: multilingual parallel corpora; semantic annotation; scope quantification

Summary/Abstract: The strategic scientific purpose of Clarin Eric and Clarin-PL is to support humanities research in a multicultural and multilingual Europe. Polish researchers emphasize building a bridge between the Polish language and its language technologies and other European languages and their language technologies. So far, the Polish scientific community has focused mainly on Polish-English connections. Clarin-PL has been developing the first and only multilingual corpora that pair Polish with other Slavic languages and with Lithuanian: the Polish-Bulgarian-Russian Parallel Corpus and the Polish-Lithuanian Parallel Corpus. The parallel corpora created by the ISS PAS Corpus Linguistics and Semantics Team break with the existing “canons” and give scientists access to interlinked multilingual language resources, limited in the first phase to languages of the three Slavic groups and Lithuanian. In the article, the authors present in detail their original system for the semantic annotation of scope quantification in multilingual parallel corpora, a system not previously used in the literature. Because the system is new, the semantic annotation is carried out manually. The identification of particular values of scope quantification in a sentence, and the attempts to record them presented here, are supported by long-term research conducted by an international team of linguists and computer scientists/mathematicians working on the quantification of names, time and aspect in natural languages.

  • Issue Year: 2016
  • Issue No: 51
  • Page Range: 191-217
  • Page Count: 27
  • Language: Polish