From graphematics to phrasal, sentential, and textual semantics through morphosyntax by means of corpus-driven grammar and ontology: A case study on one Tibetan text

Author(s): Aleksei Dobrov, Maria Smirnova
Subject(s): Language and Literature Studies, Applied Linguistics
Published by: Jazykovedný ústav Ľudovíta Štúra Slovenskej akadémie vied
Keywords: Tibetan language; computer ontology; Tibetan corpus; natural language processing; corpus linguistics; parsing

Summary/Abstract: This article presents the current results of an ongoing study on fine-tuning automatic morphosyntactic and semantic annotation by improving the underlying formal grammar and ontology, using one Tibetan text as an example. The ultimate purpose of the work at this stage was to improve linguistic software developed for natural-language processing and understanding, so as to achieve complete annotation of a specific text and a state of the formal model in which all linguistic phenomena observed in the text are explained. This purpose comprises the following tasks: analysing error cases in the annotation of the text from the corpus; eliminating these errors in automatic annotation; and developing the formal grammar and updating the dictionaries. Alongside morphosyntactic analysis, the current approach involves simultaneous semantic analysis. The article describes the semantic annotation of the corpus, which is required by grammar revision and development and was made with the use of a computer ontology. The work is carried out on one of the corpus texts, the grammatical poetic treatise Sum-cu-pa (7th century).

  • Issue Year: 72/2021
  • Issue No: 2
  • Page Range: 319-329
  • Page Count: 11
  • Language: English