Forvo.com: Jak víc hlav víc namluví
Forvo.com: So many people, so many recordings

Author(s): Michal Škrabal, Pavel Machač
Subject(s): Western Slavic Languages
Published by: AV ČR - Akademie věd České republiky - Ústav pro jazyk český
Keywords: audio pronunciation dictionary; citizen science; crowdsourcing; forvo.com

Summary/Abstract: The article describes the Czech section of the crowdsourced audio dictionary available on the website forvo.com (2008–2021), which is remarkable for its scope, reach, linguistic diversity, and the unique variability of the pronunciation recorded. We compare the website with other open multilingual databases of audio recordings and touch on the dichotomy between the intended concept of the website and its actual form. We also briefly characterize the list of Czech entries and summarize the strengths and weaknesses of the available data for scientific purposes. We then consider the typical users of the website: the provider of audio data (speaker), whose speech behaviour is evidently influenced by the specific recording situation, and the non-native lay recipient (listener), who depends entirely on trusting that the specific pronunciation variants are representative. Finally, we define the notion of representativeness, which will serve in our follow-up study as an evaluation framework for the phonetic analysis of the recordings.

  • Issue Year: 104/2021
  • Issue No: 3
  • Page Range: 143-152
  • Page Count: 10
  • Language: Czech