Создание и использование исторических корпусов славянских письменных памятников
Creation and Using of Historical Corpora of Slavonic Manuscripts
Author(s): Viktor A. Baranov
Subject(s): Language studies, Language and Literature Studies, Theoretical Linguistics, Applied Linguistics, Philology
Published by: Институт за литература - БАН
Keywords: Historical Slavonic corpus; Russian chronicles; linguistic statistics
Summary/Abstract: The requirements for historical corpora of medieval texts are determined by the properties of the data and by the historical-linguistic, textological, and linguo-textological tasks to be solved, and they should be met through special tagging, processing procedures, query parameters, and the presentation of retrieved results. The corpus should a) contain metadata on both texts and manuscripts and include both linguistic and analytical tagging; b) support the rendering of documents (facsimile and transcription), concordances, lists, and comparison of subcorpora data; c) simplify graphic-orthographic variation during data search and visualization; d) provide tools both for processing and searching linguistic material and for its further analysis according to traditional methods; and e) support the description and resolution of problems by applying corpus methods that draw on the quantity, distribution, co-occurrence, and variation of linguistic units in large data sets. Meeting these requirements is demonstrated on a subcorpus of three chronicle copies (Laurentian, Hypatian, Radzivilovsky) from the historical corpus project “Manuscript” (manuscripts.ru).
Journal: Scripta & e-Scripta
- Issue Year: 2019
- Issue No: 19
- Page Range: 33-57
- Page Count: 25
- Language: Russian
- Content File-PDF