KORPUS POLSZCZYZNY XVI WIEKU
CORPUS OF THE 16TH-CENTURY POLISH LANGUAGE
Author(s): Krzysztof W. Opaliński, Patrycja Potoniec
Contributor(s): Monika Czarnecka (Translator)
Subject(s): Theoretical Linguistics, Lexis, Historical Linguistics, Western Slavic Languages, 16th Century, Philology
Published by: Dom Wydawniczy ELIPSA
Keywords: lexicography; history of Polish; diachronic corpus of Polish
Summary/Abstract: The corpus of 16th-century Polish was originally conceived to preserve the material basis of Słownik polszczyzny XVI wieku (Dictionary of the 16th-Century Polish Language, SPXVI): a highly valuable collection of 272 texts transliterated according to standardised principles. The project described here consists in creating an online database of these resources and using part of it as the nucleus of a language corpus whose texts are annotated with morphosyntactic tags. The texts are encoded in XML following the TEI (Text Encoding Initiative) P5 guidelines, adapted to 16th-century text. Typographical elements, grammatical categories, and word forms are marked up in the texts. The nucleus of the corpus comprises 135 thousand segments and will be expanded by another 100 thousand to provide material for a tool for automatic annotation of forms. Ultimately, integration with the Diachronic Corpus of Polish is planned.
Journal: Poradnik Językowy
- Issue Year: 2020
- Issue No: 08
- Page Range: 17-31
- Page Count: 15
- Language: Polish
- Content File: PDF