Aktualizace rozvržení zdrojů Českého národního korpusu s ohledem na revizi vyváženosti jeho struktury
Actualizing the distribution of Czech National Corpus sources through re-evaluation of the balance in corpus structure

Author(s): Jan Králík
Subject(s): Language and Literature Studies
Published by: AV ČR - Akademie věd České republiky - Ústav pro jazyk český

Summary/Abstract: To support the development of balanced corpora, the notion of the “expectations” of future corpus users was introduced (Králík, 2001). Based on several statistical studies of such expectations, the textual structure of SYN2000, the synchronic part of the Czech National Corpus (CNC), was proposed and implemented. The present article discusses two new studies of expectations (Aktér 2001 and ČJ 2001) and suggests important implications for future work on the CNC. Table 1 and Table 2 reveal the stability of expectations in the categories of fiction [krásná literatura] and newspapers and magazines [noviny + časopisy]. Although respondents' daily contact with administrative texts is stable (see Table 3), the distribution of these texts is closely bound to other non-fiction topics, which is why no special attention to administrative texts is proposed. Expectations concerning newspapers and magazines are stable (Table 5), yet they changed radically between 1996 and 2001 (first and last surveys, Table 6). Over the same period, a clear rise in interest in fiction was noted (Table 6). This can be attributed to natural societal development. Accordingly, a strong reduction in newspaper texts and a strong increase in fiction texts are proposed (Table 7 and Table 8).

  • Issue Year: 65/2004
  • Issue No: 2
  • Page Range: 133-142
  • Page Count: 10
  • Language: Czech