The Bulgarian-Polish-Russian Parallel Corpus
Author(s): Maksim Duszkin, Joanna Satoła-Staśkowiak
Subject(s): Language and Literature Studies
Published by: Instytut Slawistyki Polskiej Akademii Nauk
Keywords: parallel corpora; the Bulgarian-Polish-Russian parallel corpus
Summary/Abstract: The Semantics Laboratory Team of the Institute of Slavic Studies of the Polish Academy of Sciences is planning to begin work on a Bulgarian-Polish-Russian parallel corpus. The three selected languages represent the main groups of Slavic languages: Bulgarian represents the southern group, Polish the western group, and Russian the eastern group. Our project will be the first parallel corpus of these three languages. The planned corpus will be based on material dating from a single period (the 20th century) and will be synchronic in nature. The project will not simply be the sum of separate corpora of the selected languages. One of the problems in creating multilingual parallel corpora is the unequal proportion of translated texts between the selected languages; for example, Polish literature is often translated into Bulgarian, but not vice versa. Bulgarian, Russian and Polish also differ typologically: Bulgarian is an analytic language, whereas Polish and Russian are synthetic. The parallel corpus should have compatible annotation while taking into account the characteristic features of the selected languages. We hope that the Bulgarian-Polish-Russian parallel corpus will serve as a source of linguistic material for contrastive language studies and will prove helpful to linguists, translators, terminologists and students of linguistics. The results of our work will be available on the Internet.
Journal: Cognitive Studies | Études cognitives
- Issue Year: 2011
- Issue No: 11
- Page Range: 241-254
- Page Count: 14
- Language: English