AJALOOLISTE TEKSTIDE NORMALISEERIMINE
NORMALIZING HISTORICAL TEXTS
Author(s): Gerth Jaanimäe
Subject(s): Historical Linguistics, Computational Linguistics, Estonian Literature, 19th Century
Published by: Eesti Rakenduslingvistika Ühing (ERÜ)
Keywords: NLP; normalizing; language history; corpus linguistics; computational linguistics; language change; non-standard language; digital humanities; Estonian
Summary/Abstract: Normalizing historical texts, that is, converting them to modern spelling, makes it possible to analyze them with tools designed for contemporary language. It also makes the texts searchable by keyword and allows old spellings to be compared automatically with their contemporary counterparts. This article gives a general overview of normalization: the different methods, previously performed experiments, and the main problems that arise with old Estonian texts from the second half of the 19th century.
Journal: Eesti Rakenduslingvistika Ühingu aastaraamat
- Issue Year: 2021
- Issue No: 17
- Page Range: 47-59
- Page Count: 13
- Language: Estonian