Машинско и аудитивно препознавање словенских језика
Machine-Based and Auditory Identification of Slavic Languages
Author(s): Jacek Kudera, Jovana Stevanović
Subject(s): Sociolinguistics, Descriptive linguistics, Western Slavic Languages, Eastern Slavic Languages, South Slavic Languages
Published by: Vilniaus Universiteto Leidykla
Keywords: machine identification of linguistic origin; auditory study; comparison; Slavic languages
Summary/Abstract: This paper presents a comparison of auditory and machine-based identification of linguistic origins. Two studies were conducted to assess the ability of lay listeners and a state-of-the-art machine approach to identify Slavic L1 from delexicalized speech samples. The first study involved 228 native speakers of four Slavic languages (Bulgarian, Czech, Polish and Russian) who had not received any prior training in Slavic philology, phonetics, linguistics, or forensic science. Their task was to identify the linguistic origins of speakers when exposed to limited phonetic cues. The stimuli consisted of meaningless logatomes to control for lexical information. The second study employed machine-based identification of a spoken language, based on two distinct approaches: (1) the formant structure of the phonetic signal and (2) a neural network and vector representation of speech samples. The data showed that Slavic native speakers, even when exposed to limited auditory cues, are able to identify speakers’ L1s. Interestingly, for Bulgarian, the machine-based identification method performed better than the lay listeners. The results of the experiments provide insight into the advantages of hybrid approaches in investigations related to LADO (Language Analysis for the Determination of Origin). Furthermore, the outcomes of this comparison may contribute to the debate on the involvement of native speakers in L1 identification procedures for closely related languages.
Journal: Slavistica Vilnensis
- Issue Year: 69/2024
- Issue No: 1
- Page Range: 56-66
- Page Count: 11
- Language: Serbian