CEEOL - Browse Subjects Result

Detecting Source Code Plagiarism on .NET Programming Languages using Low-level Representation and Adaptive Local Alignment

Author(s): Faqih Salban Rabbani,Oscar Karnalim / Language(s): English Issue: 1/2017

Even though there are various source code plagiarism detection approaches, only a few works focus on low-level representation for deducing similarity. Most approaches focus only on lexical token sequences extracted from source code. In our view, low-level representation is more beneficial than lexical tokens since its form is more compact than the source code itself. It only considers semantic-preserving instructions and ignores many source code delimiter tokens. This paper proposes a source code plagiarism detection approach which relies on low-level representation. As a case study, we focus our work on .NET programming languages with Common Intermediate Language as their low-level representation. In addition, we incorporate Adaptive Local Alignment for detecting similarity. According to Lim et al., this algorithm outperforms the state-of-the-art code similarity algorithm (i.e. Greedy String Tiling) in terms of effectiveness. According to our evaluation, which involves various plagiarism attacks, our approach is more effective and efficient than the standard lexical-token approach.
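
To illustrate the general idea of local alignment over instruction sequences, here is a minimal sketch. It uses plain Smith-Waterman scoring, not the adaptive variant credited to Lim et al. in the abstract. The scoring weights, the normalisation in `similarity`, and the short opcode lists are assumptions made for illustration only and are not taken from the paper.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best Smith-Waterman local alignment score of two token lists."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if tok_a == tok_b else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def similarity(a, b, match=2):
    """Score normalised by the best possible score of the shorter sequence
    (an assumed normalisation, chosen only to give a value in [0, 1])."""
    shortest = min(len(a), len(b))
    if shortest == 0:
        return 0.0
    return local_alignment_score(a, b, match=match) / (match * shortest)


# Hypothetical instruction sequences written with CIL-style opcode names.
original = ["ldarg.0", "ldarg.1", "add", "stloc.0", "ldloc.0", "ret"]
suspect = ["nop", "ldarg.0", "ldarg.1", "add", "stloc.0", "ldloc.0", "ret"]
unrelated = ["ldstr", "call", "pop", "ret"]

print(similarity(original, suspect))    # 1.0
print(similarity(original, unrelated))  # 0.25
```

Because the alignment is local, the extra leading `nop` in `suspect` does not lower the score: the copied run of instructions is still matched in full.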

Граматичната категория лице във форумното общуване на bg-mamma

Author(s): Bilyana Todorova / Language(s): Ukrainian Issue: 52/2017

The main purpose of the paper is to present some grammatical features of computer-mediated communication (CMC), in particular the verbal category of person. Researchers tend to avoid the specifics of grammar use in CMC texts because they are more difficult to investigate: the grammatical system is more stable and conservative, and differences in grammar use are less discernible. Many investigators claim that CMC occupies a middle position between speech and writing. Hewing and Coffin (2004) consider that the first and second person are preferred in speech, whereas the third person is more frequent in writing. This research therefore investigates the use of personal verb forms in some discussions on the largest Bulgarian forum platform, bg-mamma. The results show that the third person appears more often than the others in the corpus, while the second person is used least.

Lietuvių kalbos morfologija atvirojo kodo Hunspell platformoje

Author(s): Virginijus Dadurkevičius / Language(s): Lithuanian Issue: 90/2017

The paper presents the results of an attempt to build basic Lithuanian language resources using the widespread Hunspell platform. Spelling is the primary target of this open-source platform, but morphological analysis and synthesis are also possible. Moreover, the ability to perform lemmatization (stemming) efficiently makes this platform the best option for text search engines (e.g. Solr/Lucene) and information retrieval. Taggers, grammar checkers and other basic natural language processing tools can also be built using properly built Hunspell language resources. Every Hunspell language resource consists of two files: a dictionary file and an affix file (the latter may be empty). The dictionary contains main forms (lemmas), whereas the affix file contains the morphological rules used to generate all possible forms. As sources for the dictionary we used the Modern Lithuanian Dictionary (6th edition), the Corpus of the Contemporary Lithuanian Language compiled at the Center of Computational Linguistics of Vytautas Magnus University, the database of documents of the Lithuanian Parliament, the versti.eu machine translation corpus of Vilnius University and various public internet sources (1.3 billion tokens in total). The main criteria for the semi-manual compilation of the Lithuanian dictionary of lemmas from these sources were correctness, usability, actuality and approval by language authorities. Deprecated loanwords and extremely rare, exotic, obsolete, jargon or insulting forms were discarded from the list.
The resulting dictionary consists of 171 000 lemmas: 42 000 common nouns, 73 000 proper nouns, 15 000 adjectives, 53 pronouns, 153 numerals, 35 000 verbs, 4 000 adverbs and 2 000 others (prepositions, conjunctions, particles, onomatopoeias, interjections, acronyms and abbreviations). The second component of the language resource, the so-called "affix file", contains information of various kinds: metadata, preferable suggestions for spelling correction, grouping of rules, explicit tags for inflecting and non-inflecting properties, and rules for suffix and affix alteration. In order to make the Hunspell resources suitable for creating basic language tools, e.g. a morphological analyzer and synthesizer, some principles should be kept: 1) every inflection paradigm (consisting of one or more rules) should be thoroughly generated from one single lemma in the dictionary file (this is not trivial, especially for irregular verbs); 2) every individual alteration case should have its own morphological tag, e.g. 'Masc_Sg_Il' for masculine + singular + illative; 3) every dictionary item should have references for part of speech and other non-inflecting information; 4) avoid prefixation via rules and use the dictionary instead, since affixed forms may have completely different meanings and placing them under a single lemma may cause problems for text search engines; 5) do not rely much on calling rules from rules, since the calling depth can be no more than 1. The coverage of contemporary Lithuanian by this implementation of Lithuanian morphology is about 98 percent. The full list of all the theoretically possible forms generated by this resource contains about 17 million entries. This work clearly shows an efficient way for any language (especially one with scarce funding resources) to make basic language tools using a single open-source development platform, Hunspell.
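
The first two principles above (one lemma generating a whole paradigm, and one morphological tag per alteration) can be sketched in a few lines of Python. This is a simplified model written for illustration, not Hunspell's actual file format or API. The rule table, the flag `A`, the helper names, and every tag other than `Masc_Sg_Il` are assumptions, and the three endings shown are only a fragment of the paradigm of the noun "namas".

```python
import re
from typing import NamedTuple


class SuffixRule(NamedTuple):
    strip: str      # ending removed from the lemma
    add: str        # ending appended in its place
    condition: str  # regex that the end of the lemma must match
    tag: str        # morphological tag attached to the generated form


# Hypothetical rule group "A" covering a fragment of one noun paradigm.
RULES = {
    "A": [
        SuffixRule("as", "o", "as", "Masc_Sg_Gen"),
        SuffixRule("as", "ui", "as", "Masc_Sg_Dat"),
        SuffixRule("as", "an", "as", "Masc_Sg_Il"),
    ],
}


def expand(entry, rules=RULES):
    """Generate (form, tag) pairs from a dictionary entry such as 'namas/A'."""
    lemma, _, flags = entry.partition("/")
    forms = [(lemma, "Lemma")]
    for flag in flags:
        for rule in rules.get(flag, []):
            if re.search(rule.condition + "$", lemma) and lemma.endswith(rule.strip):
                stem = lemma[: len(lemma) - len(rule.strip)]
                forms.append((stem + rule.add, rule.tag))
    return forms


def lemmatize(form, dictionary, rules=RULES):
    """Return (lemma, tag) pairs whose generated forms equal the given form."""
    return [
        (entry.partition("/")[0], tag)
        for entry in dictionary
        for generated, tag in expand(entry, rules)
        if generated == form
    ]


dictionary = ["namas/A", "ir"]
print(expand("namas/A"))
print(lemmatize("naman", dictionary))  # [('namas', 'Masc_Sg_Il')]
```

Lemmatization here simply reverses generation by brute force, which is enough to show why each rule needs its own tag: the tag is what lets an analyzer report which form of the lemma it found.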

Null operators, ellipsis, and scrambling

Author(s): Nakamura Masanori / Language(s): English Issue: 2/2016

Move is subject to phase-based locality, whereas Agree is not, a natural consequence of cyclic linearization. Then, null operator movement, having no impact on linearization, should be immune to certain phase-related effects. I show that this prediction is borne out, based on the interactions between (null operator) movement and ellipsis. Furthermore, I extend the present proposal to scrambling in Japanese. It turns out that the observed correlation between movement and ellipsis helps us choose among competing theories of scrambling. Specifically, theoretical as well as empirical considerations support an analysis of scrambling in Japanese as involving either null operator movement or PF movement.

SELECTING NEURAL NETWORK ARCHITECTURE FOR INVESTMENT PROFITABILITY PREDICTIONS

Author(s): Tonimir Kišasondi,Alen Lovrenčić / Language(s): English Issue: 1/2006

In this paper we present a modified neural network architecture and an algorithm that enables neural networks to learn vectors in accordance with user-designed sequences or graph structures. This enables us to use the modified network algorithm to identify, generate or complete specified patterns that are learned in the training phase. The algorithm is based on the idea that neural networks in the human neocortex represent a distributed memory of sequences that are stored in invariant hierarchical form with associative access. The algorithm was tested on our custom-built simulator, which supports the use of our ADT neural network with standard backpropagation and our custom-built training algorithms, and it proved to be useful and successful in modelling graphs.

Computers for Teaching Writing in a Foreign Language

Author(s): Agnieszka Leńko-Szymańska / Language(s): English Issue: 36/1997

The computer has already established its position in foreign language teaching, but it is usually used only for traditional vocabulary and grammar exercises. The author's aim is to show that the real capabilities of the computer are much greater and that they can serve communicative methodology, especially in improving writing skills. The article presents several types of computer programs that support writing, such as word processors, electronic dictionaries and electronic mail. However, the author points out that the computer alone cannot improve students' writing skills unless it is supported by the teacher. The article also discusses a variety of software designed for directly teaching and practising writing skills.

Inteligența artificială visează la jurnalism. Imparțialitatea algoritmilor în relatările despre o lume nouă (de la singularitatea lui Kurzweil la Deep Mind și Quill)

Author(s): Andrei Stipiuc / Language(s): Romanian Issue: 20/2017

Great minds have always imagined the future of humanity and have been daydreaming about tangible worlds made possible with the aid of innovation and technological progress. Dreams have turned, as the case may be, into pages of utopias about coexistence and cohabitation between man and machine, or of dark dystopias, marked by the lack of freedom and of individual fundamental rights. From Jules Verne’s imagined inventions, which are now finding their place in history museums, to the testimonies of Marshall McLuhan, Isaac Asimov, or Arthur C. Clarke, the vast majority of the envisioned products have become as invisible in our daily lives as our kitchen sinks. Currently, technological advancement has polarized the greatest hopes and fears around artificial intelligence and super-brains: shall we become immortal through transplantation of biotechnological consciousness and digital grafts, or shall we find our fatal end, losing control of the space odyssey to the (literally, perhaps) hands of artificial intelligence? The media designs and redesigns new outlines for the automation crisis at a fast pace. While investments flow into computerized laboratories meant to create highly impartial journalistic materials, many journalists are increasingly concerned about their already problematic future. Whether the reports about this new world will be authored by human reporters or by algorithms remains to be seen. Vernor Vinge says, however, that there is not much time left.

From Practice to Theory: the Evolution of English Pre-corpus Monolingual Learner’s Dictionaries

Author(s): Anca Cehan,Nadina Cehan / Language(s): English Issue: 2 (28)/2018

The paper traces the progress of a typically English lexicographic product: the monolingual learner’s dictionary, during the pre-corpus period, by looking at the accretion of its defining features: treatment of phraseology, vocabulary control, presence of grammar information, ordering of headwords, contextual information, restricted defining vocabulary and defining style in the works of the precursors and those of the Vocabulary Control Movement. As such, it presents the defining features of the genre and offers an overview of the main contributions made by the early lexicographers from the sixteenth to the eighteenth centuries, Samuel Johnson, Harold Palmer, A.S. Hornby, and Michael West.

The Verb “Seem” – A Corpus-Based Approach

Author(s): Emilia Toneva,Temenuzhka Seizova-Nankova / Language(s): English Issue: 1/2014

This paper presents a short corpus-based analysis of the verb “seem”. Verbs are very important components of language and have received much linguistic attention. The verb “seem” is one of the linking verbs. The definitions used in this paper are taken from Longman online, Macmillan online, and the Collins COBUILD English Dictionary, and the corpus is extracted from the British National Corpus (BNC). The verb “seem” is analyzed in terms of the different syntactic structures in which it is used.

Quantitative and Qualitative Analysis of the Apology Speech Act ‘Sorry’

Author(s): Deyana Peneva / Language(s): English Issue: 1/2013

The paper presents a quantitative and qualitative analysis of an apology speech act in spoken discourse. The data are taken from the spoken language part of the British National Corpus and concern one particular speech act, the apology, and in the present paper specifically the performative adjective ‘sorry’. The paper describes the reference corpus and the purposes of the quantitative analysis, gives a detailed overview of the basic ‘sorry’ patterns, presents the quantitative analysis itself, and finally attempts to draw conclusions from the quantitative and qualitative analysis of the data.

An Interactive System for the Grammatical Analysis of Written Texts in Romanian

Author(s): Manuela Mihăescu,Dina Vîlcu,Sanda Cherata,Cornel Vîlcu / Language(s): English Issue: Suppl./2011

The paper presents the development, within a research project, of an interactive system of grammatical analysis for texts written in Romanian. It describes the two products realised as practical applications: a grammar checker for Romanian and an educational application that assists in teaching/learning Romanian (as a foreign language).

Timing is everything! On derivational complexity and multiple workspaces

Author(s): Marijana Marelj / Language(s): English Issue: 1/2019

Under any derivational approach, syntactic computations proceed from more complex to less complex domains. Though such multiple workspaces get to be resolved into a single (matrix) workspace, the issue of timing, i.e. the point when multiple workspaces must resolve to a single derivational space, has not been addressed in the literature. I argue that not only the direction but also the timing of syntactic computations is guided by a more general requirement to reduce computational complexity, and I propose the Multiple Workspaces Earliness Hypothesis to address this issue. On the empirical side, the technical apparatus and the analysis I propose allow me to capture the seemingly contradictory binding facts involving locative PPs, as well as to treat adjuncts as relational, rather than absolute, notions.

THE BASICS OF CORPUS LINGUISTICS

Author(s): Tatjana Ponorac / Language(s): English Issue: 34/2013

This paper discusses the term corpus linguistics. It goes on to describe the basic functions, characteristics, application and the key terms of this relatively young discipline. Basically, corpus linguistics is based on texts taken from different life contexts in order to examine the parallel between the theories of language and its realization in concrete situations.

OSNOVI KORPUSNE LINGVISTIKE

Author(s): Tatjana Ponorac / Language(s): Serbian Issue: 34/2013

The paper discusses the notion of corpus linguistics. It points to the basic functions, features, application and key terms that characterize this relatively young discipline. It stresses that corpus linguistics is based on texts taken from different life contexts in order to study the parallel between theories of language and its realization in concrete situations.

LINGUISTIC INNOVATIONS IN COMPUTER MEDIATED COMMUNICATION

Author(s): Marek Weber / Language(s): English Issue: 14 (19)/2008

It can justifiably be claimed that the computerisation of everyday life, the creation of the information society and the public's adoption of the Internet led to the emergence of a new variety of language, abundant in linguistic innovations, used in numerous forms of computer-mediated communication. The arrival of the Internet is regarded by some researchers as a revolutionary event in linguistic terms, and its significance is even compared to the appearance of the media of speech and writing. Texts found on the Internet have been classified into three categories according to whether they existed in the pre-Internet era and, if so, whether they have undergone noticeable modification. The first group comprises genres that have exactly the same counterparts in print versions. The second category includes types of texts that have undergone a substantial transformation in relation to their traditional equivalents, such as electronic mail, internet forums and mailing lists. The last category consists of new genres that were not known before the appearance of the Internet and evolved together with the emergence of computer-mediated communication, such as synchronous chat. The paper attempts to describe characteristic features of the language used in texts representing the last two above-mentioned categories: electronic mail, internet forums, mailing lists and synchronous chat. Particular attention is paid to novel linguistic mechanisms that can be observed in these genres.

Sözbilimsel Yapı Kuramının Metinlerdeki Önemli Birimlerin Belirlenmesine Yönelik Kullanımı

Author(s): Yusuf Aydin,Yusuf Doğan / Language(s): Turkish Issue: 2/2020

The aim of this research is to provide an overview of how Rhetorical Structure Theory can be used to identify important information in texts and in summarization. 204 undergraduate students and 2 domain experts participated in the research. To collect data, an informative text was divided into units according to Rhetorical Structure Theory. Undergraduate students were asked to rate these units according to their level of importance to the text, and the domain experts were asked to determine the role of the units. Krippendorff's Alpha, Kappa, Pearson correlation analysis and descriptive statistics were computed on the data obtained. According to the results, there is moderate agreement between the experts in determining the units in the text. Students are intuitively aware of the nucleus and satellite distinction revealed by Rhetorical Structure Theory. The results also indicate that the contribution of Rhetorical Structure Theory to determining important information in informative texts and to summarizing them is limited. It is anticipated that better results can be obtained in determining important information in informative texts and in summarizing them when the theory is used along with other theories.
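
As a small worked illustration of the kind of agreement statistic named above, the following sketch computes Cohen's kappa for two annotators. It assumes that "Kappa" refers to Cohen's kappa, which the abstract does not specify. The label sequences are invented for the example and are not data from the study; "N" and "S" stand for nucleus and satellite.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical role labels assigned to eight text units by two experts.
expert_1 = ["N", "N", "S", "S", "N", "S", "N", "S"]
expert_2 = ["N", "S", "S", "S", "N", "N", "N", "S"]

print(cohens_kappa(expert_1, expert_2))  # 0.5
```

In this invented example the experts agree on 6 of 8 units (0.75) while chance agreement is 0.5, so kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.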

Terminology as the Basis for Building Engineering Feature-based Models

Author(s): Ricardo Eito Brun / Language(s): English Issue: 27/2020

Satellite operations require the combined use of different tools to support engineering activities and to control the spacecraft. Communication with the spacecraft is managed by the Monitoring and Control System (MCS), which receives telemetry data from the spacecraft and releases telecommands to maintain the satellite’s attitude and flight path. These complex systems are developed as open platforms that can be extended and customised to support mission-specific requirements and objectives. As a general rule, these software applications are good candidates for implementing variability mechanisms in a structured, planned way, and their functionality is a good candidate for analysing the feasibility of applying feature-based modelling techniques. This paper describes the use of terminology analysis to build a feature model that supports requirements analysis for this type of software-based system.

SLOGAN LOCALISATION AT WORK: THE REXONA WEBSITE

Author(s): Cristian Lako / Language(s): English Issue: 22/2017

Websites are the main PR tool of today’s companies as they can be employed to broadcast information on products, events or news and keep customers close. However, each market needs to be catered for individually, even when they share the same language.

ПЕРЕХОДНОСТЬ СМЫСЛОВОГО ГЛАГОЛА В ЯПОНСКИХ БЕНЕФАКТИВНЫХ КОНСТРУКЦИЯХ В СВЕТЕ КОРПУСНЫХ ДАННЫХ

Author(s): Natalia A. Solomkina / Language(s): Russian Issue: 02 (41)/2021

Benefactive constructions in the languages of the world allow main verbs with different transitivity levels. In this article we consider Japanese benefactives (analytic constructions with so-called directionality verbs) in a typological context, and we examine the restrictions that these constructions impose on the transitivity level of the main verb. We survey data from the Balanced Corpus of Contemporary Written Japanese (BCCWJ) and the Corpus of Spontaneous Japanese (CSJ). Using quantitative analysis, we show that in our datasets there is a statistically significant difference between the distribution of valency classes of the main verbs in benefactive constructions and the distribution of valency classes across randomly selected verb forms. In other words, the choice of a main verb depends on the restrictions that the benefactive construction imposes on the transitivity level. We also demonstrate that benefactive constructions lean toward monotransitive main verbs but are not confined to them. Our data confirm the earlier assumption that intransitive verbs with ‘give’ auxiliaries are acceptable if the receiver is not overtly expressed. For ‘receive’ auxiliaries, however, we do not find any limitations linking the transitivity of the main verb with the overt encoding of a benefactor or a beneficiary.

La Place de la Traduction Automatique dans l'Enseignement de la Traduction

Author(s): Erdinç Aslan / Language(s): French Issue: 18/2021

The rapid developments in machine translation in recent years have significantly affected the field of translation. The new methods applied have led to significant improvements in the quality of translation. At the same time, machine translation started to be used in different applications and in different ways, especially on social networks. This situation has made machine translation an area in which large companies are focusing and making significant investments, and as a result, it has increased the market share of machine translation in the sector, and it has made it popular by allowing ordinary people to take an interest in machine translation. Despite this, there is discussion as to how these developments can be reflected in the translator training curriculum and in machine translation. This study focuses on the place of machine translation in Teaching Translation and examines how these systems can be reflected in the Teaching Translation curriculum.
