První korpus mluvčích češtiny v dětském věku
The first corpus of childhood Czech speakers

Author(s): Anna Chromá, Klára Matiasovitsová
Subject(s): Language and Literature Studies, Applied Linguistics, Morphology, Language acquisition, Psycholinguistics, Sociolinguistics, Developmental Psychology, Scientific Life
Published by: Univerzita Karlova v Praze - Filozofická fakulta, Vydavatelství
Keywords: Chroma corpus; Czech language; CHILDES; language acquisition; morphological annotation; linguistic analysis

Summary/Abstract: The article discusses the Chroma corpus, a newly published dataset capturing the spoken interactions of monolingual Czech children aged 19 to 49 months. This corpus fills a chronological gap in Czech language acquisition research and is part of the international CHILDES database. The Chroma corpus includes audio recordings of spontaneous interactions between children and their caregivers, recorded longitudinally over 11 to 27 months. These recordings are transcribed using the CHAT transcription system, which is standard for CHILDES. The corpus contains 99,388 tokens in children's utterances and 238,211 tokens in adult utterances. The transcriptions are annotated morphologically using the MorphoDiTa tool, allowing for detailed linguistic analysis. The Chroma corpus is a significant resource for studying various linguistic phenomena, including morphological and syntactic innovations, and contributes to the broader understanding of first language acquisition.

Details

Journal: Časopis pro moderní filologii

Volume/Year: 106/2024
Issue No: 1
Page Range: 107-109
Page Count: 3
Language: Czech