Kabyle corpus digital database and exploitation. Test of lexicometric analysis of the identity dimension in the romanesque discourse

Base de données numérique des corpus kabyles et exploitation. Essai d’analyse lexicométrique de la dimension identitaire dans le discours romanesque
Kabyle corpus digital database and exploitation. Test of lexicometric analysis of the identity dimension in the romanesque discourse

Author(s): Arezki Ikherbane, Ramdane Boukherrouf, Noura Tigziri
Subject(s): Language and Literature Studies, Theoretical Linguistics, Applied Linguistics, Lexis
Published by: Jazykovedný ústav Ľudovíta Štúra Slovenskej akadémie vied
Keywords: corpus; kabyle; identity; novel; lexicometry; databases

Summary/Abstract: This contribution shows, through a preliminary analysis of a corpus sample made up of the first five Kabyle novels (1963-1990), what lexicometry, a new statistics-based method, brings to the processing of large corpora and the building of databases. It describes the phases of preliminary corpus processing (transcription, tagging and lemmatization) that come before the various stages of exploitation. In this corpus, we examine the theme of identity raised by the five works, highlighting both the overused vocabulary and what is distinctive about each work relative to the corpus as a whole. Before any quantitative analysis of the vocabulary, however, the data must be prepared: we focus on the orthographic choices needed to remove ambiguities, and on the markup and lemmatization of the corpus. For this we used the Lexico5 software.
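As an illustration of what "overused vocabulary" in one work relative to the whole corpus can mean, here is a minimal sketch in Python. It rests on assumptions: the abstract does not say how Lexico5 computes its scores, so the hypergeometric model below is a common lexicometric choice and not a description of the authors' procedure. The function name `overuse_probabilities`, the part names and the placeholder tokens are all hypothetical and do not come from the Kabyle corpus.

```python
from collections import Counter
from math import comb


def overuse_probabilities(parts):
    """Score each word form in each part of a partitioned corpus.

    `parts` maps a part name (e.g. one novel) to its list of tokens,
    assumed to be already transcribed and lemmatized.
    The score is P(X >= k) under a hypergeometric model: the chance of
    seeing the form at least k times in a part of t tokens if the part
    were a random sample of the whole corpus. A small value suggests
    the form is overused in that part.
    """
    counts = {name: Counter(tokens) for name, tokens in parts.items()}
    total = Counter()
    for part_counts in counts.values():
        total.update(part_counts)
    corpus_size = sum(total.values())

    scores = {}
    for name, part_counts in counts.items():
        part_size = sum(part_counts.values())
        denominator = comb(corpus_size, part_size)
        scores[name] = {}
        for form, k in part_counts.items():
            corpus_freq = total[form]
            upper = min(corpus_freq, part_size)
            tail = sum(
                comb(corpus_freq, i)
                * comb(corpus_size - corpus_freq, part_size - i)
                for i in range(k, upper + 1)
            )
            scores[name][form] = tail / denominator
    return scores


# Hypothetical toy data: two "works" with placeholder tokens.
toy_parts = {
    "work_A": ["x", "x", "x", "y"],
    "work_B": ["y", "y", "z", "z"],
}
toy_scores = overuse_probabilities(toy_parts)
```

In this toy example the form "x" occurs only in work_A, so its score there is lower than that of "y", which is spread across both parts. Exact binomial coefficients are practical only for small samples; a real corpus would need a numerically stable implementation.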

  • Issue Year: 72/2021
  • Issue No: 4
  • Page Range: 894-905
  • Page Count: 12
  • Language: French