CEEOL - Browse Subjects Result

Testing word embeddings for Polish

Author(s): Agnieszka Mykowiecka, Małgorzata Marciniak, Piotr Rychlik / Language(s): English Issue: 17/2017

Distributional semantics represents word meaning as numeric vectors derived from the contexts in which words occur in large text corpora. This paper addresses the problem of constructing such models for the Polish language. It compares the effectiveness of lemma-based and form-based models created with the Continuous Bag of Words (CBOW) and skip-gram approaches on different Polish corpora. The comparison uses the results of two typical tasks solved with the help of distributional semantics: synonymy and analogy recognition. The results show that no single approach to vector creation works best across tasks. The most important factor is the quality and size of the data, but different strategy choices can also lead to significantly different results.

Categorical Model of Structural Operational Semantics for Imperative Language

Author(s): William Steingartner, Valerie Novitzká / Language(s): English Issue: 2/2016

The definition of a programming language consists of a formal definition of its syntax and semantics. One of the most popular semantic methods used in various stages of software engineering is structural operational semantics. It describes program behavior as state changes after the execution of elementary program steps. This feature makes structural operational semantics useful for implementing programming languages and for verification. In this paper we present a new approach to structural operational semantics. We model the behavior of programs in a category of states, where objects are states (an abstraction of computer memory) and morphisms model state changes (the execution of a program in elementary steps). The advantages of a categorical model are its exact mathematical structure, with many useful proved properties, and its graphical illustration of program behavior as a path, i.e. a composition of morphisms. Our approach is able to accentuate the dynamics of structural operational semantics. For simplicity, we assume that data are intuitively typed. Thanks to its visualization and ease of use, our model is not only a new model of the structural operational semantics of imperative programming languages but can also serve educational purposes.

Valency Constructions At Work: A Case Study

Author(s): Temenuzhka Seizova-Nankova / Language(s): English Issue: 1/2016

This corpus-driven analysis highlights the complex behavior of the V+eye(s) collocation (1372 occurrences) drawn from the BNC, more specifically from Kilgarriff’s WordSketch of the lemma EYE. Statistical scores are used to identify patterns of use in relation to relative frequency. As a result, some monovalent, divalent, and trivalent valency constructions are described. Such observations have important implications both for future research on language use and for foreign language learning and teaching.

Identification of the Category of Adjective in English A Corpus-Based Approach

Author(s): Ilina Doykova, Temenuzhka Seizova-Nankova / Language(s): English Issue: 1/2015

It is well known that identifying word classes in English poses problems. Attention is drawn specifically to the class of adjectives and the problems they pose for the researcher and the foreign language learner. The analyzed examples are extracted from the BNC and from a self-made electronic corpus. The article consists of an introduction and three sections. The introduction describes the state of affairs in very general terms. Sections 1, 2 and 3 deal with the different formal classifications of adjectives and how they relate to other word classes, adverbs in particular.

Translating Noun Phrases from English to Bulgarian on the Base of Translation of Maintenance Manual of Dresser Pressure Relief Valve. Difficulties and Characteristics

Author(s): Emilia Toneva / Language(s): English, Bulgarian Issue: 1/2014

The article deals with translation problems arising from the complexity of noun phrase structure in ESP. It aims to offer some practical solutions for dealing with a concrete text and to foreground some difficulties and characteristics.

“Run To The Hills, Run For Your Lives” – For Versus To With Verbs Of Motion (A Corpus-Based Study)

Author(s): Svetlana Nedelcheva / Language(s): English Issue: 1/2014

This paper is a corpus-based study on the semantics of the English prepositions for and to when combined with verbs of motion. The verbs chosen for analysis – run, go, hurry – are all motion verbs which can combine with both prepositions. The aim of the study is to provide evidence for Tyler and Evans’s hypothesis (2003: 153) that the semantics of to is related to reaching a particular target or goal, direction and contact, whereas most senses of for are primarily associated with purposes, intentions and motives, which reflects the more intentional character of its functional element. The study uses the largest corpus of American English available at present – the Corpus of Contemporary American English (COCA). It provides enough excerpts for a comprehensive analysis. The data raise some problems in the interpretation of contexts where to and for are apparently used interchangeably. This leads to the inference that to and for share a high degree of semantic overlap.

Dimensions of Terminology Work

Author(s): Anita Nuopponen / Language(s): English Issue: 25/2018

The article discusses various aspects of terminology work. The terminology literature is examined in search of explicit and implicit dichotomies and trichotomies related to target group, urgency of meeting the need, various needs, purpose, expertise, collaboration, continuity, product, language, material, tool, orientation and systematicity. As a result of the study, a multidimensional typology (concept system) of terminology work is presented, whose categories can be combined in various ways. For example, terminology work can be an independent one-off or recurring activity whose product is a terminological dictionary that is printed, prepared as an e-book or PDF, published on a website, or entered in a term bank as a collection of term entries. In addition, terminology work can be an organized, collaborative activity undertaken by a project group, an organizational unit or a professional terminologist; it can be part of the work of a translator or a writer of specialized texts, carried out continuously or as needed. Contemporary terminology work is very diverse. It requires methods and principles, reference works and textbooks that can cover various needs and provide a suitable basis for different types of terminology work: terminology planning, terminology work that must be done quickly, activities intended for various target groups, including computing solutions, projects, ongoing terminology management, etc.

LE RÔLE DE L’HÉRITAGE SÉMANTIQUE DANS LA TRADUCTION AUTOMATIQUE

Author(s): Michał Hrabia / Language(s): French Issue: 3/2018

The aim of this paper is to present the role of semantic inheritance in one of the linguistic models for machine translation – the object-oriented approach by Wiesław Banyś. In the first part, the author outlines the general concepts of the theory and provides several examples of its application in the disambiguation process. The second part focuses on the hierarchy of object classes and the semantic inheritance of attributes and operations. In fact, it is precisely thanks to the hierarchy postulated in the theory that the linguistic description becomes effective and fully applicable in computer systems.

Sąvokos sąjūdis įsišaknijimas lietuvių kalboje

Author(s): Veslava Čižik-Prokaševa / Language(s): Lithuanian Issue: 79/2019

The aim of this study is to analyse the growing relevance of the concept sąjūdis and its spread in contemporary Lithuanian. By referring to the Corpus of Contemporary Lithuanian Language developed by the Centre of Computational Linguistics of Kaunas Vytautas Magnus University, it was established that 28 words that were formed from the noun sąjūdis directly or from the formations of this word are used in Lithuanian: 14 nouns, 12 adjectives, one verb, and one adverb. These words are discussed from the perspective of word formation in the article by providing their meanings, forms, frequency of usage, and sources. For the sake of comparison, examples found through the web search engine Google are provided.

La e-langue des jeunes en France et en Ukraine

Author(s): Andriy Bilas / Language(s): French Issue: 7/2019

The article offers a comparative study of the new forms of youth slang brought about by the development of Internet communication and of their impact on everyday communication and language. Our study also considers scientific literature examining e-language markers of French and Ukrainian youth striving for sociolinguistic cooperation under the analyzed sociolect circumstances. The multidisciplinary approach makes it possible to construct a typology of the forms of youth e-language.

Les prénoms et les patronymes dans les ressources dictionnairiques pour le traitement automatique du polonais par NooJ

Author(s): Krzysztof Bogacki, Agnieszka Dryjańska / Language(s): French Issue: 19/2019

This paper reports on a study whose purpose was to provide researchers specializing in the automatic processing of natural languages with linguistic resources dedicated to Polish, namely dictionaries and local grammars. Firstly, a morphological dictionary of first names and surnames in NooJ format is presented. The corpus for the dictionary, made up of texts collected from several sources published on the Internet, contains more than 466,000 headwords (7,586 first names and 458,244 surnames). Seeking to reduce the size of the dictionary, we propose a modular approach for the construction of local grammars. It requires, however, the creation of more than 40 local grammars for surnames and almost twice as many for first names. The dictionary recognizes altogether about 33 MB of forms. As the solution based on a list of first names and surnames is time- and disk-space-consuming, we introduce another approach, based on local grammars only. In the final part of the paper, we discuss the advantages and disadvantages of both solutions, as well as semantic and grammatical ambiguities that cannot be overcome in either approach. Secondly, we discuss the reasons for the choice of this part of the lexicon, and next, having given a brief overview of the properties that distinguish proper nouns from common nouns, we describe those properties that have a direct impact on the forms of surnames in Polish and constitute the main sources of opposition among them. In addition to the grammatical categories (case, gender and number) affecting the forms of surnames, we also point out their origin (Slavic, Latin, Greek, biblical etc.). As for the observance of the usage rules of Polish surnames, which may be very strict or more flexible, we have adopted a liberal approach that does not exclude certain forms, although purists may consider them erroneous.

Līdzsvarotais mūsdienu latviešu valodas tekstu korpuss, tā nozīme gramatikas pētījumos

Author(s): Kristine Levane-Petrova / Language(s): Latvian Issue: 10/2019

The main purpose of this paper is to present “The Balanced Corpus of Modern Latvian” (LVK) (www.korpuss.lv), a new 10-million-word representative corpus of contemporary Latvian. It describes the design, composition and text selection criteria of LVK2018, as well as the annotation of the corpus (the metadata and the morphological tagging) and its usage. The history of the LVK series goes back to 2007, when the first 1-million-word corpus was created. The LVK design, compilation and text selection criteria were based on the Latvian Language Corpus Conception. The same corpus design criteria were also used for the subsequent corpora in the LVK series. The last corpus from that series (LVK2013) was released in 2013 with 4.5 million words. All corpora are morphologically annotated, and the texts are also annotated with metadata. LVK2018 was enlarged from LVK2013 using slightly modified versions of the corpus design criteria applied to the previous corpora in the LVK series. LVK2018 is designed as a general-language, representative and balanced corpus that aims to cover the variety of existing texts in some estimated proportions. The corpus contains five different sections: journalism (60%), fiction (20%), scientific (10%), legal (8%), parliamentary transcripts (2%). This work has received financial support from the European Regional Development Fund under the grant agreement No. 1.1.1.1/16/A/219 (Full Stack of Language Resources for Natural Language Understanding and Generation in Latvian).

Latviešu valodas sintaktiski marķētā korpusa gramatikas modelis

Author(s): Laura Rituma, Baiba Saulite, Gunta Nešpore-Bērzkalne / Language(s): Latvian Issue: 10/2019

This paper describes the development of the Latvian Treebank and its grammar model. This corpus is the first syntactically annotated corpus for Latvian, and currently contains approximately 13,000 annotated sentences. A hybrid dependency-constituency model was developed in order to describe Latvian syntactic constructions as accurately as possible, augmenting the commonly used dependency grammars with phrase constructions for certain syntactic elements – analytical word forms and relations other than subordination. The grammar model is based on the idea of a syntactic nucleus, which is a functional syntactic unit consisting of content-words or syntactically inseparable units that are treated. There are three kinds of phrase constructions in the Latvian Treebank grammar model: x-words, coordination and punctuation mark constructions. X-words are used for analytical forms, compound predicates, prepositional phrases etc. Coordination constructions are used for coordinated parts of sentences and coordinated clauses. Punctuation mark constructions are used to annotate different types of constructions that require punctuation in the sentence. The chosen annotation approach and data transformation systems ensure that the corpus is accessible to end users both in the hybrid dependency-constituency model, suitable for research on syntactic phenomena in the Latvian linguistic tradition, and in the Universal Dependencies multilingual model, which is better suited for certain computational linguistics systems. This work has received financial support from the European Regional Development Fund under the grant agreement No. 1.1.1.1/16/A/219 (Full Stack of Language Resources for Natural Language Understanding and Generation in Latvian) in synergy with the grant agreement No. 1.1.1.2/VIAA/1/16/188 (From Abstract Meaning Representation to Natural Language Sentence and Coherent Text Generation).

Дигиталната хуманитаристика: центрове и периферии

Author(s): Susan Schreibman / Language(s): Bulgarian Issue: 23/2020

“Digitale Geisteswissenschaften: Zentren und Peripherien”. This paper explores a history of humanities computing over the past decade as embodied in or represented by A Companion to Digital Humanities (first published in 2004), methodologically, theoretically, and in terms of community practice. It explores digital humanities as an emerging discipline through changes in technology, as well as through evolving conceptions of the field, particularly through the lens of literary studies and new media. The article also explores how the field’s major conference, Digital Humanities, previously titled the Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing (ACH/ALLC), reflects these changes, not only through the themes presented in conference papers but also through the change of the conference title itself.

Using Computational Linguistics Methods in Digital Humanities: On Possibilities and Pitfalls

Author(s): Antske Fokkens, Tommaso Caselli, Minh Le, Pia Sommerauer, Leon van Wissen / Language(s): English Issue: 23/2020

Using two sets of case studies, one related to information extraction and one related to concept change (Section 4), the paper illustrates the possibilities and issues involved in using Computational Linguistics methods in Digital Humanities, with a particular focus on evaluation. It offers both communities, i.e., computational linguists and digital humanists, a set of best practices, or guidelines, on the cross-fertilisation of different methodologies and areas of study.

Дигиталната хуманитаристика: Манифести

Author(s): Reneta Bozhankova / Language(s): Bulgarian Issue: 23/2020

The article discusses the manifestos of digital humanities in chronological order and focuses on their ideas, style characteristics, and similarities to the avant-garde manifestos of the early 20th century. The analysis reviews positions that are critical of digital humanities and attempts to identify its place in the range of contemporary academic disciplines.

Какво е дигитална компетентност и трябва ли да се преподава?

Author(s): Tatyana Angelova / Language(s): Bulgarian Issue: 23/2020

The topic of the article is the concept of digital competence, which is described through criteria and indicators presented by the Framework for Understanding and Development of Digital Competence in Europe. A functional solution to the problem of smart use of information technologies is sought. We defend the view that the education systems in secondary schools and universities have to be changed and reformed according to the paradigm set by new technologies. The digital competence self-assessment matrix is examined to answer the question of whether this type of competence should be taught.

Ерата на цифровата paideia: обучение чрез паралелни изходни и целеви текстове и текстуалeн анализ

Author(s): Yoana Sirakova / Language(s): Bulgarian Issue: 23/2020

Bilingual corpora in language education and research offer a unique instrument for discovering the logical sets of languages and translation techniques in both the original and the target language systems. They provide today’s learners with a means of exploring and acquiring linguistic knowledge individually, without human mediators such as teachers. That is one of the goals of building a parallel bilingual corpus of Bulgarian translations of Roman authors and texts, as a depiction of a peculiar local reception of antiquity and as a means of enhancing linguistic and literary analyses. The paper confines itself to an account of a few case studies, provided by aligned bilingual corpora and computer-assisted analyses, and touches on some issues concerning the dynamic and changing conditions and challenges in education and research today.

Елиминиране на човешкото – от кибернетиката до трансхуманизма

Author(s): Natalia Hristova / Language(s): Bulgarian Issue: 23/2020

In his book The Obsolescence of Humankind (1956), Günther Anders speaks of the Promethean shame of modern man at not being on a par with the things he himself has invented. At the heart of this shame stands the humiliating feeling that, unlike a machine, the human being is an accidental, incomplete result of a blind, unpredictable, incalculable and uncontrollable process of conception and birth; that, unlike machines, he is born, not produced. Desiring to imitate technical tools, modern man sets himself ever higher goals, becoming subject to a specific engineering whose ultimate goal is to transform man into a machine. The article examines this process of eliminating the human, from cybernetics through algorithmic governmentality to biotechnical enhancement.

Между дигитално и виртуално: случаят на периферното тяло

Author(s): Nikolay Genov / Language(s): Bulgarian Issue: 23/2020

The paper explores the notion of virtuality in William Gibson’s latest sci-fi novel “The Peripheral” in terms of its relation to the concepts of the digital and the real. The body is thus used as a medium whose purpose is to negotiate their possible theoretical differences. The article also explores the pervasive role of police authority in the novel’s universe, considering it a part of the established dynamic.
