PrevDistro: An open-access dataset of Hungarian preverb constructions
Author(s): Ágnes Kalivoda
Subject(s): Morphology, Finno-Ugrian studies
Published by: Akadémiai Kiadó
Keywords: preverbs; constructions; corpus-driven; dataset; Hungarian grammar
Summary/Abstract: Hungarian has a prolific system of complex predicate formation combining a separable preverb and a verb. These combinations can enter a wide range of constructions, with the preverb preserving its separability to some extent, depending on the construction in question. The primary concern of this paper is to advance the investigation of these phenomena by presenting PrevDistro (Preverb Distributions), an open-access dataset containing more than 41.5 million corpus occurrences of 49 preverb construction types. The paper gives a detailed introduction to PrevDistro, including design considerations, methodology and the resulting dataset's main characteristics.
- Issue Year: 69/2022
- Issue No: 4
- Page Range: 549-563
- Page Count: 15
- Language: English