Računalna obradba hrvatskih korpusa: povijest, stanje i perspektive
Croatian Corpus Processing: History, State of the art and Perspectives
Author(s): Marko Tadić
Subject(s): Language and Literature Studies
Published by: Hrvatsko filološko društvo
Summary/Abstract: This article surveys Croatian corpus processing. It lists the most important projects from the first Croatian computer corpus (Gundulić's Osman) up to the present. The article focuses on the Croatian National Corpus, which is the central project in corpus linguistics in Croatia today. The Croatian National Corpus consists of two parts: 1) a representative 30-million-word Corpus of Contemporary Croatian Language and 2) the Croatian Electronic Text Archive. The 30-million-word Corpus constitutes the first phase of the Croatian National Corpus, while the second phase will concentrate on expanding the contents of the Croatian Electronic Text Archive. The 30-million-word Corpus, which is now at the stage of advanced planning, with software and a pilot corpus (7.67 million running words) being tested, should be finished in the year 2000.
Journal: Suvremena lingvistika
- Issue Year: 1997
- Issue No: 43-44
- Page Range: 387-394
- Page Count: 8
- Language: Croatian