Korpus XIX w. Uniwersytetu Warszawskiego i IJP PAN
Corpus of the 19th Century of the Warsaw University and IJP PAN

Author(s): Marek Łaziński, Rafał L. Górski, Michał Woźniak
Subject(s): Language and Literature Studies
Published by: KSIĘGARNIA AKADEMICKA Sp. z o.o.
Keywords: language corpus; historical Polish; Modern Polish Period; corpus linguistics

Summary/Abstract: The article describes a historical corpus which documents the 19th and early 20th century. The corpus was created as part of a research grant whose objective was to investigate the development of the aspectual system of Polish over the last 250 years against the background of Czech and Russian. An important resource for this investigation was a database of aspectual triplets, which, in turn, was based on materials such as text corpora. Since no large corpus of the 19th and early 20th century was available, this gap needed to be bridged. In the course of the project, such a corpus was created, and it is now publicly accessible with no restrictions. This comprehensive corpus contains over 12 million contemporary words. Its texts originate from major Polish virtual libraries. It is POS-tagged with a tagger designed for 19th-century texts. A web-based concordancer, an adjusted version of ParaVoz, allows for querying the corpus. Queries may be constrained by metadata.

  • Issue Year: 18/2023
  • Issue No: 35
  • Page Range: 125-134
  • Page Count: 10
  • Language: Polish