Identyfikacja bifonematycznych wystąpień wybranych diad ortograficznych w języku polskim na potrzeby automatycznej transkrypcji tekstu
The identification of biphonematic occurrences of selected orthographic dyads in Polish for the purposes of automatic text transcription
Author(s): Daniel Śledziński
Subject(s): Syntax, Descriptive linguistics, Western Slavic Languages
Published by: Polskie Towarzystwo Językoznawcze
Keywords: text transcription; Polish language; identification of inflectional forms; grapheme-to-phoneme (G2P)
Summary/Abstract: The article discusses methods for identifying Polish inflectional forms in which one of the orthographic dyads dz, dź, dż or rz denotes two phonemes. These dyads are mostly monophonematic, so their biphonematic occurrences can be considered exceptional. The issue matters for language processing, chiefly for the automatic conversion of orthographic text into phonological or phonetic transcription, which is the purpose of the multilayer transcription model proposed by the author. The paper briefly presents the assumptions of this model in the context of the problem discussed. The core of the article is a description of research aimed at determining onset orthographic sequences that appear only in certain inflectional forms, which makes it possible to identify those forms.
Journal: Biuletyn Polskiego Towarzystwa Językoznawczego
- Issue Year: LXXVIII/2022
- Issue No: 78
- Page Range: 221-235
- Page Count: 15
- Language: Polish