WorldWideScience

Sample records for stylistic text classification

  1. STYLISTIC FEATURES OF ADVERTISING TEXTS OF INFORMATIVE AND COMPARATIVE TYPES

    Directory of Open Access Journals (Sweden)

    Poddubskaya, O.N.

    2016-06-01

    Full Text Available The relevance of this article stems from the fact that advertising now has a very strong impact on the consumer market, on the political and cultural life of society, and on language and its development as a system. Advertising has given rise to a special set of stylistic features of text, formed under the influence of reviving advertising traditions in the Russian language and under the active impact of energetic and pushy European advertising. The purpose of this study is to explore the stylistic features of informative and comparative advertising texts. The object of research is Russian-language advertising in print media and on television. At the end of the article we draw conclusions about the groups of language means used for different stylistic devices in informative and comparative advertising texts. Analysis of the stylistic features of modern informative and comparative advertising texts can be of great interest to specialists in the theoretical study of modern advertising.

  2. DISCOURSE STYLISTICS AS CONTEXTUALIZED STYLISTICS

    Directory of Open Access Journals (Sweden)

    Marina Katnić-Bakaršić

    2003-01-01

    Full Text Available The focus of the paper is on discourse stylistics, viewed as a contextualized discipline. Context includes various factors (sociohistorical, cognitive, cultural and intertextual). The paper investigates the most important approaches to discourse stylistics: pragmatic stylistics, discourse and/or conversational analysis, cognitive stylistics, critical stylistics, and feminist stylistics. In discourse stylistics, analysis is always combined with interpretation, and description is followed by explanation and critique.

  3. Automated Determination of the Type of Genre and Stylistic Coloring of Russian Texts

    Directory of Open Access Journals (Sweden)

    Barakhnin Vladimir

    2017-01-01

    Full Text Available In this paper we propose an algorithm for the automated determination of the genre type and semantic characteristics of poetic texts in Russian. We formulate approaches to the construction of a joint (“two-dimensional”) classifier of genre types and stylistic colouring of poetic texts, based on determining the interdependence of the genre type and the stylistic colouring of a text. On the basis of these approaches, the principles for forming the training samples for the style and genre-type algorithms are analyzed. Computational experiments on a corpus of A. S. Pushkin's Lyceum lyrics showed good results in determining the stylistic colouring of poetic texts and satisfactory results in determining the genres. The proposed algorithms can be used to automate the complex analysis of Russian poetic texts, significantly facilitating the work of the expert in determining their styles and genres by providing appropriate recommendations.

  4. Multimodal Stylistics: The Happy Marriage of Stylistics and Semiotics

    DEFF Research Database (Denmark)

    Nørgaard, Nina

    2010-01-01

    Stylistics is the systematic study of the ways in which meaning is created by linguistic means in literature and other types of text. It arose from a wish to make literary criticism more “scientific” by anchoring the analysis of literature more solidly in the actual grammar and lexis of the texts ... put up for analysis. Since the first major flourishing of stylistics in the 1960s, different linguistic paradigms and other academic trends of the times have caused the field to branch off into a great variety of sub-fields such as formalist stylistics, functionalist stylistics, cognitive stylistics ... , doctor-patient discourse, academic writing, etc. While forceful in its rigour and systematism, the traditional stylistic approach (whether of a formalist, functionalist, cognitive or other orientation) has until recently largely failed to embrace meanings which are created by semiotic systems other than ...

  5. DISCOURSE STYLISTICS AS CONTEXTUALIZED STYLISTICS

    OpenAIRE

    Marina Katnić-Bakaršić

    2003-01-01

    The focus of the paper is on discourse stylistics, viewed as a contextualized discipline. Context includes various factors (sociohistorical, cognitive, cultural and intertextual). The paper investigates the most important approaches to discourse stylistics: pragmatic stylistics, discourse and/or conversational analysis, cognitive stylistics, critical stylistics, and feminist stylistics. In discourse stylistics, analysis is always combined with interpretation, and description is followed by explana...

  6. Speech Act Classification of German Advertising Texts

    Directory of Open Access Journals (Sweden)

    Артур Нарманович Мамедов

    2015-12-01

    Full Text Available This paper uses the theory of speech acts and the underlying concept of pragmalinguistics to determine the types of speech acts and their classification in German printed advertising texts. We ascertain that the advertising of cars and accessories, household appliances and computer equipment, watches, fancy goods, food, pharmaceuticals, and financial, insurance and legal services, as well as airline advertising, is dominated by a pragmatic principle based on presenting information about the benefits of a product or service. This leads to the frequent use of certain speech acts. The dominant form of influence is to inform the recipient about the characteristics of the advertised product. This information is foregrounded by means of stylistic and syntactic constructions specific to advertising (participial constructions, appositional constructions), which help to emphasize certain notional components within the advertising text. Stylistic and syntactic devices of reduction (parceling constructions) convey the author's idea. Other means, such as repetitions and enumerations, are used by the advertiser to strengthen the selling power of the text. The advertiser focuses the consumer's attention on the characteristics of the product, seeking to convince the consumer of its utility and to influence buying behavior.

  7. Towards an integrated corpus stylistics

    Directory of Open Access Journals (Sweden)

    McIntyre Dan

    2015-12-01

    Full Text Available Over recent years, the use of corpora in stylistic analysis has grown in popularity. However, questions still remain over the remit of corpus stylistics, its distinction from corpus linguistics generally and its capacity to explain complex stylistic effects. This article argues in favour of an integrated corpus stylistics; that is, an approach to corpus stylistics that integrates it with other stylistic methods and analytical frameworks. I suggest that this approach is needed for two main reasons: (i) it is analytically necessary in order to fully explain stylistic effects in texts, and (ii) integrating corpus methods with other stylistic tools is what will distinguish corpus stylistics from corpus linguistics. My argument is supported by reference to examples from Mark Haddon’s novel The Curious Incident of the Dog in the Night-time and the HBO TV series Deadwood. Both these examples rely for their explanation on a combination of corpus stylistic analytical techniques and other stylistic methods of analysis.

  8. A stylistic classification of Russian-language texts based on the random walk model

    Science.gov (United States)

    Kramarenko, A. A.; Nekrasov, K. A.; Filimonov, V. V.; Zhivoderov, A. A.; Amieva, A. A.

    2017-09-01

    A formal approach to text analysis is suggested that is based on the random walk model. The frequencies and reciprocal positions of the vowel letters are matched up by a process of quasi-particle migration. A statistically significant difference in the migration parameters is found for texts of different functional styles. Thus, the possibility of classifying texts using the suggested method is demonstrated. Five groups of texts are singled out that can be distinguished from one another by the parameters of the quasi-particle migration process.

  9. Categorization and Pathology of Persian Stylistic Researches

    Directory of Open Access Journals (Sweden)

    Maryam Dorpar

    2014-08-01

    Full Text Available Abstract: In this article, surveys and studies of Persian style are categorized into two branches, historical and formalistic stylistics. Mohammad Taghi Bahar founded stylistics as an autonomous field of knowledge by publishing his book, History of the Evolution of Persian Prose (1331), for teaching at the University of Tehran. This book, which was influenced by the oral teachings of Qajar-era scholars, set the path that researchers in Persian stylistics have generally followed up to now. However, researchers and critics have introduced various theories and approaches during the last four decades. Stagnation in Persian stylistic research is the main problem considered in this article. The main questions are: “To which branch of stylistics does the Persian stylistic research carried out so far belong?”, “What are the weak points of these studies?” and “What should be done to overcome this stagnation?” The main objective of the article is to take steps toward removing the stagnation from Persian stylistics. Malek osh-Shoara Bahar used periodization in studying prose styles and analyzed the evolution of Persian prose in its lexical aspect (obsolete words, Arabic words, synonyms, word repetition), morphological aspect (verbal prefixes, comparative adjective suffixes), syntactic aspect (precedence of the verb over its dependents, omission of verbs) and rhetorical aspect (simile and allegory, metonymy and metaphor, prolixity and periphrasis, rhymed prose and harmony). In fact, he tried to show both the periods of health and strength and the periods of laxity and corruption of prose. We call Bahar's stylistics, and all research done in his manner, historical stylistics with a traditional attitude. In this method, the consistency and evolution of styles through history are studied and the periodization of styles is taken into account. These studies periodized styles by finding formal similarities and differences. Since, neglecting meaning and text functionality, they have only paid

  11. A Road to Aesthetic Stylistics

    Directory of Open Access Journals (Sweden)

    Samir Al-Sheikh

    2016-08-01

    Full Text Available Being a linguistic phenomenon, poetry is marked by the defamiliarization of language: in poetic discourse there is an aesthetic distortion of the normal codes, in which aesthetic value is the most prominent function of the poetic texture. This study is a new venture in correlating linguistics with aesthetics through the approach called Aesthetic Stylistics (AS). Aesthetic stylistics is the application of the theory of beauty to the intentionally violated components of a literary text. It proceeds from the hypothesis that John Keats's Ode on a Grecian Urn and Kabbani's Maritime Poem are disinterested poetic experiences which create ecstatic responses in the reader's awareness; therefore, the reader's judgment of taste is aesthetic. The study aims at highlighting the stylistic-aesthetic factors which generate the judgment of taste. While drawing heavily on the aestheticism of the Prague Linguistic Circle and Halliday's Functional Linguistics (FL), or what has come to be called Traditional European Functionalism, the study analyzes Keats's Ode and Kabbani's poem in terms of Kant's Kritik der Urteilskraft (KdU). The two circles of linguistic description and aesthetic interpretation are internally interlinked to create the coherence of the stylistic process. The study consists of an introduction and two parts, one on theory and the other on analysis; it is rounded off with concluding remarks elicited from the semiotic quest. Keywords: Stylistics, Functionalism, Aesthetics

  12. The ‘indisciplinarity’ of stylistics

    Directory of Open Access Journals (Sweden)

    Sorlin Sandrine

    2014-12-01

    Full Text Available This paper aims at showing why the stylistician can be construed as a prolific “impostor” in a most positive sense: pledged to no specific linguistic prophet, she can opt for different theoretical linguistic tools (in the sphere of pragmatics, critical discourse analysis, cognitive grammar, etc.) depending on her object of study and what her research question is. The liberty claimed by the stylistician explains why stylistics is the “undisciplined” child of linguistics, shirking any clear definition of its boundaries. It will be argued that stylistics can only exist as a cross-disciplinary field given its conception of language as fundamentally contextualized. If it were a discipline determined by clear-cut pre-established boundaries, stylistics would be far more “disciplined” but would run the risk of serving only itself. The broad goal of this paper is thus to evince that the “indisciplinarity” of stylistics constitutes its very defining essence. With this aim in mind, it will demonstrate what stylistics owes to other disciplines, what it shares with similar language-based disciplines and what it can offer to other fields or practices of knowledge.

  13. Stylistic Analysis of Maya Angelou’s Equality

    Directory of Open Access Journals (Sweden)

    Arina Isti'anah

    2017-11-01

    Full Text Available This research presents a stylistic analysis of Maya Angelou’s poem Equality, chosen because it is one of Angelou’s well-known poems. Stylistic analysis aims at comprehending the meanings of literary or non-literary texts by observing the language devices used in them. To achieve this goal, several language levels were observed: the phonological, graphological, grammatical, and semantic levels. At the phonological level, the repetition of rhyme in some stanzas, assonance, consonance, and alliteration are used to voice Angelou’s dream of freedom for black people. At the graphological level, the prominent punctuation in stanzas 3, 6, and 9 stresses equality as the requirement for the freedom she expected. At the grammatical level, Angelou uses the pronouns I and you as the dominant words in the poem, revealing the different classes the poet experienced in the country. The metaphors in the poem convey meanings such as freedom, voice, effort, and the racism that black people experienced in America. The research concludes that stylistics can be applied to the analysis of literary work so that a thorough appreciation of it can be achieved.

  14. Stylistic devices in comical proverbs

    Directory of Open Access Journals (Sweden)

    Burmistrova L. V.

    2017-04-01

    Full Text Available The article analyses stylistic devices in Russian and English comical proverbs. The author shows their influence on the content of comical proverbs and reveals the comic effect in them.

  15. MODERN LINGUODIDACTIC ASPECTS OF COGNITIVE APPROACH REALIZATION IN TEACHING STYLISTICS OF THE UKRAINIAN LANGUAGE TO STUDENTS

    Directory of Open Access Journals (Sweden)

    Anzhelika Popovych

    2017-09-01

    Full Text Available An approach to teaching stylistics is a fundamental methodological category that defines the system of studying the discipline, the ways of organizing the teaching material, and the peculiarities of the interaction of all components of the educational process: principles, methods, and ways of teaching. The linguocognitive approach to the study of stylistics aims at identifying aspects of the speech picture of the world, interpreting texts from the standpoint of cognitive processes, and forming the cognitive and linguistic culture of students and the corresponding way of linguistic expression. The following levels of the linguocognitive approach to the study of stylistics in higher education are distinguished: the knowledge, practical and educational levels. The knowledge level involves students studying the foundations of cognitive linguistics and cognitive stylistics, systematically considering cognitive structures and processes, understanding the meaning of «concept» and interpreting the linguistic and aesthetic signs of national culture. The perception of a text, its decoding, and its production are realized at the practical level. The educational level is aimed at forming national linguistic and speech consciousness, respect for Ukrainian language traditions, a culture of speech, and the desire to follow the aesthetic and ethical norms of communication. Given contemporary developments in linguistics and linguistic stylistics, what is relevant is not only the clarification of the structural-level stylistic features of texts and the presence of tropes and stylistic figures, but also the identification of aspects of the linguistic picture of the world and of the linguistic and aesthetic signs of national culture. Therefore, cognitive-stylistic analysis of the text is appropriate for lessons in stylistics. The linguocognitive approach to the study of the stylistics of the Ukrainian language is extremely

  16. Some Stylistic Aspects of Social Advertising in Russia

    Directory of Open Access Journals (Sweden)

    Aigul F. Khanova

    2017-10-01

    Full Text Available The article considers some stylistic aspects of advertising database in Russia. It examines the linguistic and stylistic properties and peculiarities of social advertising and the impact it has on public consciousness. It determines that social advertisements in Russia are characterized by vocabulary belonging to low language norms, which reflects the cultural and ethical context. Figurative language and stylistic devices aim to appeal to emotions and make the advertisement more memorable. The authors deem it necessary to create a common database on social advertising in Russia in order to facilitate the analysis of its economic impact, evaluate its capacity to affect a mainstream audience, and determine strategies for building advertising campaigns.

  17. SAW Classification Algorithm for Chinese Text Classification

    OpenAIRE

    Xiaoli Guo; Huiyu Sun; Tiehua Zhou; Ling Wang; Zhaoyang Qu; Jiannan Zang

    2015-01-01

    Given the explosive growth of data, the increasing volume of text places higher demands on the performance of text categorization, demands that existing classification methods cannot satisfy. Based on a study of existing text classification technology and semantics, this paper puts forward a SAW (Structural Auxiliary Word) algorithm oriented to Chinese text classification. The algorithm uses the special space effect of Chinese text where words...

  18. THE COMPOSER AND FOLKLORE PROBLEM: FACTORS OF STYLISTIC STRUCTURE

    Directory of Open Access Journals (Sweden)

    COCEAROVA GALINA

    2017-12-01

    Full Text Available This paper continues the author’s earlier study of the Composer and Folklore problem from the stylistic point of view. It is noted that in academic music, where attention is focused not only on speech or text characteristics but primarily on the linguistic and stylistic material of folklore, the appeal to folk sources leads to the emergence of a number of stylistic factors, both in the formation of the national style and in the field of ethnic culture as a whole and integral stable system. The research points to the role of folklore as the genetic code of ethnic culture, as well as to other factors acting at the level of musical discourse and musical language, contributing to the formation of „language flexibility” (A. Kolmogorov) and, as a result, „flexibility of style”.

  19. A journey through the stylistics of poetry

    DEFF Research Database (Denmark)

    Jensen, Kim Ebensgaard

    2015-01-01

    Review of Peter Verdonk, The Stylistics of Poetry: Context, Cognition, Discourse, History. (Series: Advances in Stylistics). London: Bloomsbury, 2013, xi + 198 pp., ISBN 978-1-4411-5878-9.

  20. An Introduction to Literary Quaranic Stylistics

    Science.gov (United States)

    Almenoar, Lubna

    2010-01-01

    A stylistic analysis is one approach to analyzing a literary text using literary descriptions. The use of literary texts in the literature classroom has been limited to mostly Western sources. This paper is an attempt to create an awareness of the linguistic features present in the English language translations of the meaning of the Quran. The…

  1. From defamiliarization to foregrounding and defeated expectancy: Linguo-stylistic and cognitive sketch

    Directory of Open Access Journals (Sweden)

    Kupchyshyna Yuliya

    2017-12-01

    Full Text Available The article focuses on revealing the nature of defamiliarization, foregrounding, and defeated expectancy from a linguo-stylistic and cognitive perspective. It is stated that defamiliarization, composed of different types of foregrounding, and defeated expectancy as deviation, both generated with a certain stylistic purpose, are complex phenomena. The article highlights the cognitive factors which ensure the creation of defamiliarization and defeated expectancy in literary texts.

  2. DEVELOPMENT OF FOREIGN LANGUAGE STYLISTIC COMPETENCE OF FUTURE PHILOLOGISTS: GRAMMATICAL ASPECT

    Directory of Open Access Journals (Sweden)

    Олена Вовк

    2015-05-01

    Full Text Available The article studies the grammatical aspect of developing the stylistic competence of students of linguistic departments. In particular, it characterizes stylistic competence, defined as the capacity to create adequate utterances under natural conditions of communication according to a concrete situation. To highlight the importance of acquiring stylistic competence, the levels of an individual's speech development are identified and the stages of teaching grammar are differentiated. The approaches to teaching stylistic grammar are characterized within a communicative framework and the relevant skills are elucidated. The role of functional styles in teaching a foreign language is clarified. The idea of teaching students to make register shifts and to mix speech registers in the process of acquiring foreign language competence is highlighted. The theoretical principles are illustrated with appropriate examples of exercises.

  3. Discourse Analysis in Stylistics and Literature Instruction.

    Science.gov (United States)

    Short, Mick

    1990-01-01

    A review of research regarding discourse analysis in stylistics and literature instruction covers studies of text, systematic analysis, meaning, style, literature pedagogy, and applied linguistics. A 10-citation annotated bibliography and a larger unannotated bibliography are included. (CB)

  4. Arabic text classification using Polynomial Networks

    Directory of Open Access Journals (Sweden)

    Mayy M. Al-Tahrawi

    2015-10-01

    Full Text Available In this paper, an Arabic statistical learning-based text classification system has been developed using Polynomial Neural Networks. Polynomial Networks have recently been applied to English text classification, but they have never been used for Arabic text classification. In this research, we investigate the performance of Polynomial Networks in classifying Arabic texts. Experiments are conducted on a dataset widely used in Arabic text classification: the Al-Jazeera News dataset. We chose this dataset to enable direct comparison of the Polynomial Networks classifier with other well-known classifiers evaluated on it in the Arabic text classification literature. The experimental results show that the Polynomial Networks classifier is competitive with state-of-the-art algorithms in the field of Arabic text classification.
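
    The abstract does not describe the network's architecture, features or training procedure, so the following is only a minimal sketch of a polynomial classifier in the general sense: each input vector is expanded into all monomials up to degree 2, and one linear weight vector per class is fitted to one-hot targets. Plain batch gradient descent is swapped in here for whatever solver the paper uses, and all function names, the learning rate and the binary term-presence inputs are assumptions made for illustration.

```python
from itertools import combinations_with_replacement


def poly_expand(x, degree=2):
    """Return [1, all monomials of the entries of x up to `degree`]."""
    terms = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(x)), d):
            prod = 1.0
            for i in idx:
                prod *= x[i]
            terms.append(prod)
    return terms


def train(X, labels, n_classes, lr=0.3, epochs=2000):
    """Fit one weight vector per class by least squares (batch gradient descent).

    Assumes small binary (term-presence) inputs; larger feature values
    would need a smaller learning rate.
    """
    Phi = [poly_expand(x) for x in X]
    n, m = len(Phi), len(Phi[0])
    W = [[0.0] * m for _ in range(n_classes)]
    for _ in range(epochs):
        for c in range(n_classes):
            grad = [0.0] * m
            for phi, y in zip(Phi, labels):
                target = 1.0 if y == c else 0.0
                err = sum(w * p for w, p in zip(W[c], phi)) - target
                for j in range(m):
                    grad[j] += err * phi[j] / n
            W[c] = [w - lr * g for w, g in zip(W[c], grad)]
    return W


def predict(W, x):
    """Return the index of the class with the highest polynomial score."""
    phi = poly_expand(x)
    scores = [sum(w * p for w, p in zip(row, phi)) for row in W]
    return scores.index(max(scores))
```

    The degree-2 cross terms are what let a linear solver separate classes that depend on word co-occurrence; for example, `train([[0, 0], [1, 0], [0, 1], [1, 1]], [0, 1, 1, 0], 2)` fits a pattern that no purely linear model of the two inputs can represent.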

  5. A Stylistic Analysis of D.H. Lawrence’s ‘Sons and Lovers’

    Directory of Open Access Journals (Sweden)

    Nozar Niazi

    2013-05-01

    Full Text Available This paper aims at analyzing D.H. Lawrence’s ‘Sons and Lovers’ using a stylistic approach. Stylistics is a study of the amalgamation of form with content. The stylistic analysis of a novel goes beyond the traditional, intuitive interpretation, because it combines intuition and detailed linguistic analysis of the text. The defining elements of modern language are within the text itself, not prescribed from outside. With modernist texts, usually understanding comes from close study of the language system defined within the text itself. Form, technique and style are considered not as a mere vehicle of the content of the story, but an integral part of the work’s meaning and value. In our analysis of ‘Sons and Lovers’ the resources of language: lexis, syntax, phonology, figurative language, cohesion and coherence, are discussed in relation to the style of discourse in order to explore hidden meanings in the text. The resources of language are shown to be an essential part of the meaning of the novel.

  6. On stylistic automatization of lexical units in various types of contexts

    Directory of Open Access Journals (Sweden)

    В В Зуева

    2009-12-01

    Full Text Available Stylistic automatization of lexical units in various types of contexts is investigated in this article. Following the works of Bohuslav Havránek and other linguists of the Prague Linguistic School, automatization is treated as a contextual narrowing of the meaning of a lexical unit to the level of its complete predictability in situational contexts and the lack of stylistic contradiction with other lexical units in speech.

  7. About the role of stylistic and syntactic devices of expansion in the informational complex of dicteme of a German advertising text

    Directory of Open Access Journals (Sweden)

    Артур Нарманович Мамедов

    2012-12-01

    Full Text Available The article highlights stylistic and syntactic devices of expansion, which act as compositional means, vary the normative syntactic structure of an advertising text, and contribute to sense formation, creating the conditions for realizing the advertiser’s intent. By means of these language elements, which express an invariant tactical sense, the advertiser consciously expands and/or complicates the informative complex of the dicteme, an acting text unit, transmitting superfluous impressive information together with factual information. The combination of factual and impressive items of information activates both the rational and the emotional perceptual channels of the prospective consumer and intensifies the positioning of the advertised article.

  8. Active Learning for Text Classification

    OpenAIRE

    Hu, Rong

    2011-01-01

    Text classification approaches are used extensively to solve real-world challenges. The success or failure of text classification systems hangs on the datasets used to train them; without a good dataset it is impossible to build a quality system. This thesis examines the applicability of active learning in text classification for the rapid and economical creation of labelled training data. Four main contributions are made in this thesis. First, we present two novel selection strategies to cho...
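
    The thesis's own selection strategies are cut off in this truncated abstract, so the sketch below swaps in plain margin-based uncertainty sampling to illustrate the general pool-based active learning loop: train on the labelled set, ask an oracle to label the pool item the model is least sure about, and repeat. The tiny word-overlap model, the function names and the `oracle` callback are all hypothetical stand-ins, not the author's method.

```python
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labeled):
    """Build per-class word counts (a very small bag-of-words model)."""
    model = {}
    for text, label in labeled:
        model.setdefault(label, Counter()).update(tokenize(text))
    return model


def scores(model, text):
    """For each class, the fraction of the text's tokens seen in that class."""
    toks = tokenize(text)
    return {
        label: sum(1 for t in toks if t in counts) / max(len(toks), 1)
        for label, counts in model.items()
    }


def margin(model, text):
    """Gap between the two best class scores; a small gap means uncertain."""
    s = sorted(scores(model, text).values(), reverse=True)
    return s[0] - s[1] if len(s) > 1 else s[0]


def active_learning(seed, pool, oracle, budget):
    """Query the oracle for the `budget` most uncertain pool items, one at a time."""
    labeled, pool = list(seed), list(pool)
    for _ in range(min(budget, len(pool))):
        model = train(labeled)
        query = min(pool, key=lambda t: margin(model, t))  # least certain item
        pool.remove(query)
        labeled.append((query, oracle(query)))
    return train(labeled), labeled
```

    In practice `oracle` would be a human annotator; the point of the loop is that labelling effort goes to the documents the current model cannot separate, rather than to a random sample of the pool.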

  9. Translating children’s literature: some insights from corpus stylistics

    Directory of Open Access Journals (Sweden)

    Anna Čermáková

    2018-01-01

    Full Text Available In this paper I explore the potential of a corpus stylistic approach to the study of literary translation. The study focuses on the translation of children’s literature with its specific constraints, and illustrates this with two corpus linguistic techniques, keyword analysis and cluster analysis, both of which capture specific cases of repetition. In a broader sense, the paper thus discusses the phenomenon of repetition in different literary (stylistic) traditions. These are illustrated by examples from two children’s classics aimed at two different age groups, the Harry Potter and the Winnie the Pooh books, and their translations into Czech. Various shifts in translation, especially in the translation of children’s literature, are often explained by the operation of so-called ‘translation universals’. Though ‘repetition’ as such does not belong to the commonly discussed set of translation universals, the stylistic norms opposing repetition seem to be a strong explanation for the translation shifts identified.
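
    A minimal sketch of the two techniques named in this abstract, assuming the widely used log-likelihood keyness statistic for keywords and repeated word n-grams for clusters. The abstract does not say which statistic or software the author used, and the function names and whitespace tokenization are simplifications introduced for illustration.

```python
import math
from collections import Counter


def tokens(text):
    return text.lower().split()


def keyness(study, reference):
    """Log-likelihood keyness of each word type in `study` against `reference`.

    a, b are the word's frequencies in the two corpora; c, d are corpus sizes.
    The score measures how unexpected the frequencies are, not the direction
    (over-use vs. under-use), which has to be checked separately.
    """
    s, r = Counter(study), Counter(reference)
    c, d = len(study), len(reference)
    result = {}
    for word, a in s.items():
        b = r.get(word, 0)
        e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
        e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
        ll = a * math.log(a / e1)
        if b:
            ll += b * math.log(b / e2)
        result[word] = 2 * ll
    return result


def clusters(toks, n=3, min_freq=2):
    """Return word n-grams (clusters) that occur at least `min_freq` times."""
    grams = Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    return {g: f for g, f in grams.items() if f >= min_freq}
```

    Running both functions on a source text and on its translation, each against a suitable reference corpus, gives a rough way to see whether repeated words and phrases in the original survive in the translation.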

  10. Consideration on the history and the stylistic use of prefixes

    Directory of Open Access Journals (Sweden)

    Antonio Carlos Silva de Carvalho

    2016-07-01

    Full Text Available This paper aims to discuss prefixes from a historical perspective, as well as to observe nuances of stylistic value in them. The choice of subject was basically due to two reasons: (i) the considerations set out by Martins (2003) on the low stylistic productivity caused by prefixal derivation, especially if compared to suffixal derivation; and (ii) the considerations set out by Silva (2009) on the so-called de-language, of negativity, in Manoel de Barros. At first we worked on a brief historically and etymologically-oriented incursion on prefixes; and then, subjecting the reflections we gathered to a punctual corpus by the author, we highlighted examples in which features of the morpho-stylistic nature that contribute to the singularity of his work, also linked to the aesthetics of the fragmentary and to the smallest beings, can be explored.

  11. A contrastive-stylistic study into the tense distribution in English and Slovene fictional texts

    Directory of Open Access Journals (Sweden)

    Silvana Orel Kos

    2008-12-01

    Full Text Available The article addresses contrastive and narratological issues of the unity vs. diversity of temporal spheres in fictional texts. It focuses on the presentation of mimetic discourse within the past time-sphere narrative, trying to establish the narrative or stylistic functions of the present and past time-sphere verb actions with respect to the role of the narrator or that of the character. The diegetic and mimetic functions of verb actions in certain temporal spheres, i.e. tense usage in (free) indirect discourse and (free) direct discourse, will be contrastively studied in original fictional texts and their translations, in both directions between English and Slovene. The character’s mimetic discourse may be presented through different narrative forms, spanning the report-control cline from the forms “in total control” of the character, i.e. free direct discourse, to that “apparently in total control” of the narrator, i.e. speech act and thought act report (cf. Leech and Short 1981: 324). In addition to the character’s verbal and mental responses, the study includes mediated instances of the character’s sensory responses, the basic formula thus being: He said that/thought that/saw that. Our contrastive analysis considers only fictional texts whose diegesis is rendered in the narrative past tenses, as the English language system observes the sequence of tenses, while the Slovene language does not. The diegesis of a fictional text may be completely located in the present time-sphere, yet such texts do not present any major issues in terms of contrastive relevance for the studied language pair.

  12. Style and creativity: towards a theory of creative stylistics

    OpenAIRE

    Yoshifumi, Saitō

    1997-01-01

    The purpose of this thesis is to present a new theory of creative stylistics as an antithesis to traditional description-oriented stylistics. For this purpose it undertakes: (1) a selective historical survey of stylistics with special attention to its academic formation in the context of the theoretical dissociation between linguistics and literary criticism (Chapter 1), (2) a theoretical survey of stylistics with special attention to the way it has been defined and subcategorized (Cha...

  13. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases the accuracy at the same time. The test example is classified using a simpler and smaller model. The training examples in a particular cluster share a common vocabulary. At the time of clustering, we do not take into account the labels of the training examples. After the clusters have been created, the classifier is trained on each cluster, which has reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...
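
The procedure this abstract describes (cluster the training texts without looking at labels, train one small classifier per cluster, route each test text to its nearest cluster) can be sketched as follows. This is only an illustration, not the authors' implementation: the abstract does not name the clustering algorithm or the per-cluster classifier, so the k-means-style clustering on cosine similarity, the nearest-label-centroid classifier, and all function names and sample texts are assumptions.

```python
import math
from collections import Counter, defaultdict

def vectorize(text):
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def centroid(vectors):
    """Summed counts; cosine ignores scale, so no averaging is needed."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return total

def nearest(vec, centroids, allowed=None):
    """Index of the most similar centroid (optionally among `allowed`)."""
    indices = range(len(centroids)) if allowed is None else allowed
    return max(indices, key=lambda i: cosine(vec, centroids[i]))

def fit(texts, labels, k=2, iterations=10):
    vecs = [vectorize(t) for t in texts]
    # Assumption: naive seeding with the first k documents.
    centroids = [Counter(v) for v in vecs[:k]]
    for _ in range(iterations):
        # Clustering step: labels are deliberately not used here.
        assign = [nearest(v, centroids) for v in vecs]
        for c in range(k):
            members = [v for v, a in zip(vecs, assign) if a == c]
            if members:
                centroids[c] = centroid(members)
    assign = [nearest(v, centroids) for v in vecs]
    # One small classifier per cluster: a centroid for each label seen there.
    models = []
    for c in range(k):
        by_label = defaultdict(list)
        for v, a, y in zip(vecs, assign, labels):
            if a == c:
                by_label[y].append(v)
        models.append({y: centroid(vs) for y, vs in by_label.items()})
    return centroids, models

def predict(text, centroids, models):
    vec = vectorize(text)
    usable = [i for i, m in enumerate(models) if m]  # skip empty clusters
    model = models[nearest(vec, centroids, usable)]
    return max(model, key=lambda y: cosine(vec, model[y]))

# Hypothetical toy data, invented for illustration only.
texts = [
    "wire money to offshore account now",
    "team lunch on friday at noon",
    "urgent wire transfer to secret account",
    "friday lunch menu for the team",
    "send rent money to landlord account",
]
labels = ["suspicious", "normal", "suspicious", "normal", "normal"]
centroids, models = fit(texts, labels, k=2)
print(predict("urgent offshore wire transfer", centroids, models))
```

Each per-cluster model only sees the vocabulary and examples of its own cluster, which is the source of the reduced dimensionality the abstract mentions.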

  14. Stylistic Analysis of the Short Story ‘The Last Word’ by Dr. A. R. Tabassum

    Directory of Open Access Journals (Sweden)

    Abdul Bari Khan

    2015-06-01

    Full Text Available In this article a stylistic analysis of the short story ‘The Last Word’ by Dr. A. R. Tabassum is performed. The formative elements of the story, such as point of view, characters and allegorical element, are discussed in detail so as to give a better insight into the story. The story is analyzed stylistically in terms of figures of speech where grammatical, lexical and phonological schemes are considered, following the checklist of linguistic and stylistic categories proposed by Leech and Short. Features of repetition, parallelism, alliteration, consonance, assonance and rhyme are focused on. Finally, the findings and conclusion are given to sum up the discussion. Keywords: stylistics, analysis, short story, last word, allegory, Tabassum

  15. LEXICO-STYLISTIC CHOICES AND MEDIA IDEOLOGY IN NEWSPAPER REPORTS ON NIGER DELTA CONFLICTS

    Directory of Open Access Journals (Sweden)

    Chuka Fred Ononye

    2017-05-01

    Full Text Available Media reports on Niger Delta (henceforth, ND) conflicts have reflected a relationship between lexico-stylistic choices and media ideologies. The existing media studies on the discourse have predominantly utilised pragmatic, stylistic and discourse analytical tools in presenting and labelling discourse participants and/or their ideologies, but neglected how media ideologies can be revealed through lexico-stylistic choices made in the reports. This paper therefore examines the lexico-stylistic choices in the reports in order to establish their link to specific ideological goals of the newspapers in relaying the conflict news. Forty reports on ND conflicts, published between 2003 and 2007 and sampled from two ND-based (The Tide and Pioneer) and two national (The Punch and THISDAY) newspapers, were subjected to stylistic and critical analyses, with insights from structural (relational) semantics and aspects of stylistics discourse. Two broad lexical stylistic choices are identified: paradigmatic features (61.8%), indexed by synonymous, antonymous, hyponymous, colloquial, and register items, and coinages; and syntagmatic features (38.2%), marked by collocations, metaphors, pleonasms, and lexical fields. The features are utilised for three ideological ends; namely, picking out and framing participants as perpetrators of the violence in the discourse, evaluating specific entities and their roles in the conflicts, and reducing the impact of the activities of the news actors. Although there are overlaps, the evaluative ideology is largely associated with the national newspapers, the impact reduction ideology with the ND-based newspapers, while the framist ideology is observed in the two sets of newspapers. With these findings the study has added the lexical stylistics angle to the existing scholarship on ND conflict news discourse. Thus, the newspaper reports on ND conflicts are motivated by their ideological goals to change the reader’s outlook on

  16. DYNAMIC FEATURE SELECTION FOR WEB USER IDENTIFICATION ON LINGUISTIC AND STYLISTIC FEATURES OF ONLINE TEXTS

    Directory of Open Access Journals (Sweden)

    A. A. Vorobeva

    2017-01-01

    Full Text Available The paper deals with the identification and authentication of web users participating in Internet information processes, based on features of their online texts. In digital forensics, web user identification based on various linguistic features can be used to discover the identity of individuals, criminals or terrorists using the Internet to commit cybercrimes. The Internet can be used as a tool in different types of cybercrime (fraud and identity theft, harassment and anonymous threats, terrorist or extremist statements, distribution of illegal content and information warfare). Linguistic identification of web users is a kind of biometric identification; it can be used to narrow down the suspects and to identify and prosecute a criminal. The feature set includes various linguistic and stylistic features extracted from online texts. We propose dynamic feature selection for each web user identification task. Selection is based on calculating the Manhattan distance to the k nearest neighbors (Relief-f algorithm). This approach improves identification accuracy and minimizes the number of features. Experiments were carried out on several datasets with different levels of class imbalance. The results showed that feature relevance varies across sets of web users (probable authors of some text); selecting features for each set of web users improves identification accuracy by 4% on average, which is approximately 1% higher than with a static set of features. The proposed approach is most effective for a small number of training samples (messages per user).
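
The feature weighting step named in this abstract (Relief-f with Manhattan distance to the k nearest neighbors) can be sketched as follows. This is a simplified illustration rather than the paper's implementation: it assumes numeric features already scaled to [0, 1], visits every sample instead of a random subset, and uses hypothetical function names and toy data.

```python
from collections import Counter

def manhattan(a, b):
    """Manhattan (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def relieff_weights(X, y, k=1):
    """Simplified Relief-F. Assumes features are already scaled to [0, 1].

    A feature gains weight when it differs between a sample and its nearest
    neighbours from other classes (misses), and loses weight when it differs
    between a sample and its nearest neighbours of the same class (hits).
    """
    n, d = len(X), len(X[0])
    priors = {c: count / n for c, count in Counter(y).items()}
    weights = [0.0] * d
    for i in range(n):
        for c in priors:
            pool = [j for j in range(n) if j != i and y[j] == c]
            neighbours = sorted(pool, key=lambda j: manhattan(X[i], X[j]))[:k]
            if not neighbours:
                continue
            if c == y[i]:
                factor = -1.0  # hits lower the weight
            else:
                # misses raise it, weighted by the prior of the other class
                factor = priors[c] / (1.0 - priors[y[i]])
            for j in neighbours:
                for f in range(d):
                    diff = abs(X[i][f] - X[j][f])
                    weights[f] += factor * diff / (n * len(neighbours))
    return weights

def select_features(X, y, top=1, k=1):
    """Indices of the `top` highest-weighted features."""
    w = relieff_weights(X, y, k)
    return sorted(range(len(w)), key=lambda f: -w[f])[:top]

# Hypothetical toy data: feature 0 separates the two authors, feature 1 is noise.
X = [[0.0, 0.2], [0.1, 0.9], [0.9, 0.1], [1.0, 0.8]]
y = ["author_a", "author_a", "author_b", "author_b"]
print(relieff_weights(X, y))
print(select_features(X, y, top=1))
```

Running the selection separately for each candidate set of authors, rather than once for the whole corpus, corresponds to the "dynamic" selection the abstract describes.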

  17. Parcellation as a stylistic dominant characteristic in the novel Nigdina by Svetlana Velmar Janković

    Directory of Open Access Journals (Sweden)

    Mimović Milica P.

    2015-01-01

    Full Text Available The subject of this research is parcellation as a stylistic dominant characteristic in the novel 'Nigdina' by Svetlana Velmar Janković, and also as the means of expression which is superior in regard to the other linguistic procedures. Considering the aim of syntactic-stylistic analysis in this paper, the examples were divided according to their syntactic functions identifying and interpreting syntactic structures that change word order, and simultaneously point out stylistic marking of the parcellated structure. The most frequent structures are the ones with parcellated positions of adjuncts. Adverbs and constructions with prepositions and case are comparably found in these syntactic positions. Moreover, the most frequent are adjuncts for the parcellation of syntactic position, then intonational and positional emphasis of apposition, and finally the parcellation of elements such as object, subject and an attributive. In conclusion, this procedure of making syntactic units independent, their frequency and superiority to the other stylistic procedures, contribute to the style of this novel, and point out parcellation as a dominant stylistic characteristic of this novel.

  18. Academic origins and characteristics of the Chinese stylistic restoration

    Directory of Open Access Journals (Sweden)

    Xi Chen

    2016-09-01

    Full Text Available The conservation practice in China, termed “Chinese stylistic restoration” in this study, has been influenced by the traditional Chinese philosophy and construction principles, the modern Chinese conservation theory of Liang Sicheng and Liu Dunzhen, and Western and international theories and policies concerning conservation. This study uses three case studies, namely, Shanghai Zhenru Temple, Jianfu Palace Garden, and Angkor Wat Chau Say Tevoda Temple, to demonstrate the main characteristics of the Chinese stylistic restoration, including its emphasis on style over authenticity, pursuit of a gestalt form, and flexible attitude toward reconstruction. Accordingly, these practices have shaped the current Chinese conservation theory as reflected in the case studies reported in “Principles for the Conservation of Heritage Sites in China” and the Qufu Declaration.

  19. A Stylistic Research of Western Advertisements

    Institute of Scientific and Technical Information of China (English)

    翟蕾

    2014-01-01

    The research involves the following two parts: the first part is the analysis of the advertising language; the second part is the analysis of register, namely applying the advertising context to a wider social context to find a more effective communicative means. Stylistics enables one to carry out discourse analysis of advertising texts from both a microcosmic and a macroscopic perspective. The twofold demonstrative function points out a new theoretical way for advertising research.

  20. Comparison of Aminpour’s Qhazal and Qhazve‘s Qhazal Based on Structural Stylistics

    Directory of Open Access Journals (Sweden)

    Ahmad Rezae

    2014-12-01

    Full Text Available The importance of stylistics in examining texts has resulted in the burgeoning of various stylistic schools, with their differing methodologies. Among them, structural stylistics, which is the more polished version of formal stylistics, looks over the structure of words, sounds and syntax in the text and then focuses on their relation to the content. In fact, structural stylistics, drawing upon the structuralists’ views, is formed on the basis of structural linguistics, and works to analyze a literary text with regard to its organic unity and the inter-relationship of the parts to the whole. In other words, the main purpose is to approach the content of a work through its form and structure. In this method, stylistic features of the work are recognized through understanding the structural proportions between sounds, words and syntax. Accordingly, the structuralists regard style as the manner of deviation and extra-regularity and the frequency of its occurrence in a particular era. The present article first refers to the definitions of style and stylistic schools and enumerates the features of structural stylistics, and then sets out to study and compare two ghazals on the subject of the Holy Defense by Gheisar Aminpour ("Taghvimha" or "Calendars") and Alireza Ghazve ("Ghesmat" or "Destiny"), in the light of structural stylistics. Through analyzing the different parts of the texts, with regard to balances and deviation, we will deal with the relatedness and proportion of these parts to the content. "Calendars" is among the best-known ghazals of Aminpour. It contains the issues of feeling ashamed of martyrs, lamenting over our negligence, and feeling left away from the martyrs. A special sense of grief and sadness, hidden in the particular rhythm and cadence of the words and combinations, helps the poet to express his feelings and thoughts. The ghazal "Destiny" deals also with the distress and exhaustion the poet feels in this

  1. Food and Beverage Stylist and Photography

    OpenAIRE

    BEKAR, Aydan; KARAKULAK, Çisem

    2016-01-01

    A food and beverage stylist makes food and beverages look appetizing by preparing them properly in order to get customers’ attention. A food and beverage photographer gets the most impressive image by using different shooting techniques. Food and beverage stylists and photographers prepare attractive and unusual menus, brochures, banners and ads for food and beverage enterprises so that products can look better when customers see them. People see the works of food and beverage styling and phot...

  2. A Stylistic Analysis of Four Translations of J. D. Salinger's The Catcher in the Rye

    Directory of Open Access Journals (Sweden)

    Silva Bratož

    2004-12-01

    Full Text Available The paper looks at stylistic differences between four translations of J. D. Salinger’s Catcher in the Rye – two Slovene translations, a Serbo-Croatian, and an Italian translation. Firstly, stylistic components relevant to the novel in question are identified. In this respect, the translation of teenage speech and idiom appears to be not only the most conspicuous stylistic feature of the original but also the hardest to translate. Secondly, the ways in which the different translations have rendered certain formal and lexical features of style are compared by determining and describing their function. A large number of examples have been submitted to critical scrutiny, of which only a few representative ones are listed and explained in the paper. Finally, this paper points to some particular difficulties of the four translators in their attempts to reproduce the stylistic components of the original.

  3. Stylistic analysis of songs in beverage advertisement

    Institute of Scientific and Technical Information of China (English)

    周双卉

    2012-01-01

    With the development of advertising, people tend to study its stylistic analysis. In this paper, however, the focus is on the songs in beverage advertisements. The analysis concentrates on the features of beverage advertisement songs and their stylistics. The aim of the paper is to improve the understanding of beverage advertisement songs among the public and scholars.

  4. A Stylistic Analysis of Complexity in William Faulkner's "A Rose for Emily"

    Science.gov (United States)

    Abdurrahman, Israa' Burhanuddin

    2016-01-01

    Applying a stylistic analysis on certain texts refers to the identification of patterns of usage in writing. However, such an analysis is not restricted just to the description of the formal characteristics of texts, but it also tries to elucidate their functional importance for the interpretation of the text. This paper highlights complexity as a…

  5. Categorization and Pathology of Persian Stylistic Researches

    OpenAIRE

    Maryam Dorpar

    2014-01-01

    In the following article, surveys and research on Persian style are categorized into two branches, historical and formalistic. Mohammad Taghi Bahar founded stylistics as an autonomous field of knowledge by publishing his book, History of the evolution of Persian prose (1331), for teaching at the University of Tehran. This book, which was influenced by the verbal instructions of the Qajar dynasty’s scholars, set the path that has generally been followed by researchers in Persian stylistics up ...

  6. MODERN LINGUODIDACTIC ASPECTS OF COGNITIVE APPROACH REALIZATION IN TEACHING STYLISTICS OF THE UKRAINIAN LANGUAGE TO STUDENTS

    OpenAIRE

    Popovych, Anzhelika

    2017-01-01

    An approach to teaching stylistics is a fundamental methodological category that defines the system of studying the discipline, the ways of organizing the teaching material and the peculiarities of the interaction of all components of the educational process: principles, methods, ways of teaching. The linguocognitive approach in the study of stylistics aims at identifying aspects of the speech world picture, interpreting texts from the standpoint of cognitive processes, forming the cognitive an...

  7. Social Media Text Classification by Enhancing Well-Formed Text Trained Model

    Directory of Open Access Journals (Sweden)

    Phat Jotikabukkana

    2016-09-01

    Full Text Available Social media are a powerful communication tool in our era of digital information. The large amount of user-generated data is a useful novel source of data, even though it is not easy to extract the treasures from this vast and noisy trove. Since classification is an important part of text mining, many techniques have been proposed to classify this kind of information. We developed an effective technique of social media text classification by semi-supervised learning utilizing an online news source consisting of well-formed text. The computer first automatically extracts news categories, well-categorized by publishers, as classes for topic classification. A bag of words taken from news articles provides the initial keywords related to their category in the form of word vectors. The principal task is to retrieve a set of new productive keywords. Term Frequency-Inverse Document Frequency (TF-IDF) weighting and the Word Article Matrix (WAM) are used as the main methods. A modification of WAM is recomputed until it becomes the most effective model for social media text classification. The key success factor was enhancing our model with effective keywords from social media. A promising result of 99.50% accuracy was achieved, with more than 98.5% Precision, Recall, and F-measure after updating the model three times.
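
As a rough illustration of the TF-IDF weighting this abstract relies on for picking category keywords, the sketch below scores each term by its relative frequency in a document times the log of its inverse document frequency. The abstract does not state which TF-IDF variant was used, so this particular formula, the whitespace tokenization, the function names and the sample headlines are all assumptions; the Word Article Matrix step is not modeled here.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumes non-empty documents. Weight = (term count / document length)
    * ln(number of documents / number of documents containing the term).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weighted = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weighted.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted

def top_keywords(weights, n=2):
    """The n highest-weighted terms of one document."""
    return sorted(weights, key=lambda term: -weights[term])[:n]

# Hypothetical news snippets, invented for illustration only.
docs = [
    "news election vote result",
    "news football match result",
    "news election campaign vote",
]
weights = tf_idf(docs)
print(top_keywords(weights[1]))
```

A term that occurs in every document ("news" above) gets weight zero, while terms concentrated in one category rise to the top, which is why TF-IDF is a common way to harvest candidate keywords per class.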

  8. Research on Classification of Chinese Text Data Based on SVM

    Science.gov (United States)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data mining has important application value in today’s industry and academia. Text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification: KNN, NB, AB, SVM, decision tree and other classification methods all show good classification performance. The Support Vector Machine (SVM) is a strong classifier in machine learning research. This paper studies the classification performance of the SVM method on Chinese text data, uses it to classify Chinese text, and aims to combine academic research with practical application.

  9. Stylistic Variation In Three English Translations Of The Dead Sea ...

    African Journals Online (AJOL)

    Since the discovery of the Dead Sea Scrolls in 1947, different English translations have been published. In this article the stylistic variation of three of these translations is analysed. It is suggested that the issue of stylistic variation boils down to linguistically inscribed preference in the choice and construction of discourses in the ...

  10. CCM: A Text Classification Method by Clustering

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    In this paper, a new Cluster based Classification Model (CCM) for suspicious email detection and other text classification tasks is presented. Comparative experiments of the proposed model against traditional classification models and the boosting algorithm are also discussed. Experimental results show that the CCM outperforms traditional classification models as well as the boosting algorithm for the task of suspicious email detection on a terrorism domain email dataset and topic categorization on the Reuters-21578 and 20 Newsgroups datasets. The overall finding is that applying a cluster based...

  11. Hot complaint intelligent classification based on text mining

    Directory of Open Access Journals (Sweden)

    XIA Haifeng

    2013-10-01

    Full Text Available The complaint recognizer system plays an important role in ensuring the correct classification of hot complaints and improving the service quality of the telecommunications industry. Customer complaints in the telecommunications industry are distinctive in that they must be handled within a limited time, which causes errors in the classification of hot complaints. The paper presents a model for intelligent hot complaint classification based on text mining, which can classify a hot complaint at the correct level of the complaint navigation. The examples show that the model can classify complaint text efficiently.

  12. Beyond the Law of Transitivity: A Functional Stylistic Study of Maya Angelou's I Know Why the Caged Bird Sings

    Directory of Open Access Journals (Sweden)

    Muthanna Makki Muhammed

    2017-03-01

    Full Text Available The dominant critical focus on Maya Angelou’s writings has been on the thematic features of her texts. Linguistic and stylistic appraisals of her works are generally sparse. This paper is a stylistic study of Maya Angelou’s autobiographical novel I Know Why the Caged Bird Sings. It aims at examining the stylistic features of the text vis-à-vis the semantic Law of Transitivity so as to investigate the features that contribute to the discourse’s moving beyond the sphere of informing into the sphere of interaction and influence. The paper starts with brief notes on stylistics in relation to semantics. This is followed by a discussion of the Law of Transitivity; frequent references are made to John R. Searle’s patterns of metaphor. The varied forms of the relations between the signified or the source (the vehicle) and the signifier or the target (the tenor) in relation to the sign (the common ground) are discussed in the light of the figurative devices employed by the author and the functions achieved in revealing the ideological issues of race and gender in the book. The study also attempts to position the formal and psychological elements within a sociocultural context in order to promote the reader’s understanding of the purposes and functions for which certain linguistic choices are made.

  13. Chiasmus as a Stylistic Device in Donne's and Vaughan's Poetry

    Science.gov (United States)

    I'jam, Dunya Muhammad Miqdad; Fadhil, Zahraa Adnan

    2016-01-01

    This study investigates chiasmus as a stylistic device in ten metaphysical poems (five for John Donne and five for Henry Vaughan). It aims at showing how both, Donne and Vaughan, utilize chiasmus at the different linguistic levels as a stylistic device in their poetry. Thus, to achieve this aim, it is hypothesized that chiasmus as used by Donne…

  14. Stylistic Analysis of Roald Dahl’s Cinderella

    Directory of Open Access Journals (Sweden)

    Henni Henni

    2010-01-01

    Full Text Available The paper presents a stylistic analysis of a rhyme, Cinderella, taken from Dahl’s rhyme collection, Revolting Rhymes. Roald Dahl is famous for his ability in creating extraordinary stories, in which linguistic elements, such as sounds and words, are manipulated to create an amusing story that has an unpredictable plot. The discussion covers an analysis of the narrative structure and the linguistic style applied in the rhyme, together with the discussion of the author’s purpose of applying such style. From the analysis it is found out that the style Dahl applies in the rhyme is especially useful for foregrounding.

  15. Text mining in the classification of digital documents

    Directory of Open Access Journals (Sweden)

    Marcial Contreras Barrera

    2016-11-01

    Full Text Available Objective: Develop an automated classifier for the classification of bibliographic material by means of text mining. Methodology: Text mining is used to develop the classifier, based on a supervised method consisting of two phases, learning and recognition. In the learning phase, the classifier learns patterns through the analysis of bibliographical records of classification Z (library science, information sciences and information resources) retrieved from the LIBRUNAM database; this phase yields a classifier capable of recognizing different subclasses (LC). In the recognition phase, the classifier is validated and evaluated through classification tests: bibliographical records of classification Z are taken at random, classified by a cataloguer and processed by the automated classifier, in order to obtain the precision of the automated classifier. Results: The application of text mining achieved the development of the automated classifier through a supervised document classification method. The precision of the classifier was calculated by comparing the manually and automatically assigned topics, giving a precision of 75.70%. Conclusions: The application of text mining facilitated the creation of the automated classifier, providing useful technology for the classification of bibliographical material with the aim of improving and speeding up the process of organizing digital documents.

  16. THE DYNAMICS OF STYLISTICALLY MARKED VERBAL LEXIS IN THE INFINITIVE FORM IN THE RUSSIAN LITERARY CRITICISM OF THE MIDDLE AND SECOND HALF OF THE 19th CENTURY

    Directory of Open Access Journals (Sweden)

    Yakovenko Larisa Aleksandrovna

    2014-06-01

    Full Text Available The article studies the functioning of stylistically marked verbal lexis in the infinitive form in literary critical articles by Russian publicists of the middle and second half of the 19th century. The critical texts of that period are characterized by the use of verbal lexemes with different functional, stylistic and expressive-emotional coloring. The author reveals the lexical content of infinitive forms and determines the character of their markedness (functional and stylistic, or expressive and emotional). The article presents the dynamics of the use of infinitive forms, which shows that in the texts of the 19th century they are used to express critics' attitude to works of fiction and literary images, and that this attitude is determined by publicists' ideas about the ways of depicting reality. It is revealed that in the second half of the 19th century this form reflects the urge to evaluate the social maturity and literary skill of a writer, which increases the number of stylistically marked lexemes in the texts of that period.

  17. A (FORENSIC) STYLISTIC ANALYSIS OF ADVERBIALS OF ATTITUDE AND EMPHASIS IN SUPREME COURT DECISIONS IN PHILIPPINE ENGLISH

    Directory of Open Access Journals (Sweden)

    Hjalmar Punla Hernandez

    2017-09-01

    Full Text Available Stylistics today has developed into a multiplicity of branches, one of which is forensic stylistics. Being a powerful legal written discourse, Supreme Court decisions are a rich corpus in which the linguistic vis-a-vis stylistic choices of Court justices can be examined. This study is a humble attempt at stylistically analyzing Supreme Court decisions in Philippine English (PhE) drafted by two Filipino justices. Specifically, it sought to investigate the classes, placements, and environments of adverbials of attitude and emphasis employed by the two justices, and to draw their implications for teaching and learning English for Legal Purposes (ELP). Using the frameworks of McMenamin (2012), Quirk, Greenbaum, Leech, and Svartvik (1985), and Dita (2011), 54 randomly selected Supreme Court decisions as primary sources of legal language were analyzed. The results are as follows. Firstly, the classes of adverbials of attitude in Supreme Court decisions in PhE used by the two justices were evaluation of the subject of the clause, judgment on the whole clause, and evaluation of an action performed by the subject of the clause, while the adverbials of emphasis were adverbials of conviction and doubt. Secondly, both kinds of adverbials were frequently placed medially and less often initially in the sentences where they occurred. Thirdly, the two justices put their adverbials within two principal environments, i.e. within functor, and before/after the verb, among others. In these regards, legal and stylistic explanations for these recurrent linguistic features in the two justices’ Court decisions were revealed. Implications of the study for ELP are explained. Lastly, trajectories for future (forensic) stylistic analyses are recommended.

  18. THE PRODUCT DESIGN PROCESS USING STYLISTIC SURFACES

    Directory of Open Access Journals (Sweden)

    Arkadiusz Gita

    2017-06-01

    Full Text Available Increasing consumer requirements for how everyday products look force manufacturers to put more emphasis on product design. Constructors, apart from the functional aspects of the parts they create, are forced to pay attention to the aesthetic aspects. Software for designing A-class surfaces is very helpful in this case. Extensive quality analysis modules facilitate the work and make it possible to obtain models with specific visual features. The authors present a product design process using stylistic surfaces, based on the front panel of a moped casing. In addition, methods of analysis of the design surface and product technology are presented.

  19. A Stylistic Analysis of the Dialogues in Pirates of the Caribbean: On Strange Tides

    Institute of Scientific and Technical Information of China (English)

    李冯茹

    2017-01-01

    Dialogues in classical films are always the concentrated scripts studied by scholars. This thesis performs a stylistic analysis of dialogues from Pirates of the Caribbean: On Strange Tides at the levels of phonology, lexicon, syntax, semantics and pragmatics to make a good attempt in the application of stylistic analysis.

  20. Refutation of stylistic constructs in palaeolithic rock art

    International Nuclear Information System (INIS)

    Bednarik, R.G.

    1995-01-01

    This paper describes the first experiment of applying a series of dating methods at a single rock art site in a "blind test". The rock art in question, in northeastern Portugal, had been unanimously attributed to the Upper Palaeolithic by stylistic comparison. Four independent assessments have produced the identical result that the rock art is in fact of the second half of the Holocene, and mostly under 3,000 years old. This finding is compared with other recent dating results which together show that stylistic dating is not an admissible method of determining the age of Palaeolithic art. (author). 17 refs., 1 fig., 1 photo

  1. Empirical Studies On Machine Learning Based Text Classification Algorithms

    OpenAIRE

    Shweta C. Dharmadhikari; Maya Ingle; Parag Kulkarni

    2011-01-01

    Automatic classification of text documents has become an important research issue nowadays. Proper classification of text documents requires information retrieval, machine learning and natural language processing (NLP) techniques. Our aim is to focus on important approaches to automatic text classification based on machine learning techniques, viz. supervised, unsupervised and semi-supervised. In this paper we present a review of various text classification approaches under machine learning paradig...

  2. Stylistics and the Metaphysics of Poetry

    Science.gov (United States)

    Anderson, Neil

    2007-01-01

    In order to better understand the worth of aesthetic experience in encountering poetry, fresh perspectives are helpful. This paper introduces the reader to modern stylistics: that is linguistic examinations of "the speaker's meaning" in literature and notes such "scientific" approaches to poetry do find common metaphysical ground with leading…

  3. Stylistic analysis of headlines in science journalism: A case study of New Scientist.

    Science.gov (United States)

    Molek-Kozakowska, Katarzyna

    2017-11-01

    This article explores science journalism in the context of the media competition for readers' attention. It offers a qualitative stylistic perspective on how popular journalism colonizes science communication. It examines a sample of 400 headlines collected over the period of 15 months from the ranking of five 'most-read' articles on the website of the international magazine New Scientist. Dominant lexical properties of the sample are first identified through frequency and keyness survey and then analysed qualitatively from the perspective of the stylistic projection of newsworthiness. The analysis illustrates various degrees of stylistic 'hybridity' in online popularization of scientific research. Stylistic patterns that celebrate, domesticate or personalize science coverage (characteristic of popular journalism) are intertwined with devices that foreground tentativeness, precision and informativeness (characteristic of science communication). The article reflects on the implications of including various proportions of academic and popular styles in science journalism.

  4. Classification process in a text document recommender system

    Directory of Open Access Journals (Sweden)

    Dan MUNTEANU

    2005-12-01

    Full Text Available This paper presents the classification process in a recommender system for textual documents taken mainly from the web. In the classification process, the system combines content filters, event filters and collaborative filters, and it uses implicit and explicit feedback to evaluate documents.

  5. Stylistic Features of Comment in Arabic Blogosphere

    Directory of Open Access Journals (Sweden)

    Gabdulzyamil G. Zaynullin

    2017-11-01

    Full Text Available One of the most important issues in the study of the functioning of the Internet language is the definition of the features of each Internet genre presented in online communication, taking into account the linguocultural features of the language in question. This paper studies the genre of Internet comments in the Arabic-speaking blogosphere and reveals its stylistic features. The most common goal of a comment is gratitude, followed by praise. We created a corpus of comments from blogs on various subjects and then tagged it, assigning each comment to a group depending on its subject and communicative goal. With the help of the Lexico 3 software, the most frequent lexical units were identified, the lexical features of the comments were described, the main one being the widespread use of religionyms, and the relationship between the blog subject and the stylistic characteristics of communication was revealed. The article traces the correlation between the literary and colloquial functional styles in the comments, and also concludes that the comments are of a conversational, informal character. The main devices of expressiveness that are characteristic of both network and pre-network communication were revealed, as was the tendency of the analysts to observe in the comments a stable three-part composition (greeting, message, final formula). The influence of traditional Arabic rhetoric, as well as the epistolary genre, was preserved. The results of the paper can be used when studying other genres of Internet communication in Arabic and in comparative studies to create linguistic software.

  6. Classification of protein-protein interaction full-text documents using text and citation network features.

    Science.gov (United States)

    Kolchinsky, Artemy; Abi-Haidar, Alaa; Kaur, Jasleen; Hamed, Ahmed Abdeen; Rocha, Luis M

    2010-01-01

    We participated (as Team 9) in the Article Classification Task of the BioCreative II.5 Challenge: binary classification of full-text documents relevant for protein-protein interaction. We used two distinct classifiers for the online and offline challenges: 1) the lightweight Variable Trigonometric Threshold (VTT) linear classifier we successfully introduced in BioCreative 2 for binary classification of abstracts and 2) a novel Naive Bayes classifier using features from the citation network of the relevant literature. We supplemented the supplied training data with full-text documents from the MIPS database. The lightweight VTT classifier was very competitive in this new full-text scenario: it was a top-performing submission in this task, taking into account the rank product of the Area Under the interpolated precision and recall Curve, Accuracy, Balanced F-Score, and Matthews Correlation Coefficient performance measures. The novel citation network classifier for the biomedical text mining domain, while not a top-performing classifier in the challenge, performed above the central tendency of all submissions, and therefore indicates a promising new avenue to investigate further in bibliome informatics.
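
    Three of the performance measures named in this abstract (Accuracy, Balanced F-Score and Matthews Correlation Coefficient) can be computed from the four cells of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; the function name `binary_metrics` and its argument names are illustrative assumptions and are not taken from the challenge's evaluation code.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, balanced F-score (F1) and MCC from confusion-matrix counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives and true negatives (hypothetical argument names).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Balanced F-score: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Matthews Correlation Coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}
```

    Unlike accuracy, MCC stays near zero for a classifier that always predicts the majority class, which is one reason it is often reported alongside accuracy on unbalanced relevance tasks.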

  7. Parcellation as a stylistic dominant characteristic in the novel Nigdina by Svetlana Velmar Janković

    OpenAIRE

    Mimović Milica P.

    2015-01-01

    The subject of this research is parcellation as a stylistic dominant characteristic in the novel 'Nigdina' by Svetlana Velmar Janković, and also as the means of expression which is superior in regard to the other linguistic procedures. Considering the aim of syntactic-stylistic analysis in this paper, the examples were divided according to their syntactic functions identifying and interpreting syntactic structures that change word order, and simultaneously point out stylistic marking of the p...

  8. Comparison of Aminpour’s Qhazal and Qhazve‘s Qhazal Based on Structural Stylistics

    Directory of Open Access Journals (Sweden)

    Somayye Khorshidi

    2014-11-01

    Full Text Available The importance of stylistics in examining texts has resulted in the burgeoning of various stylistic schools, with their differing methodologies. Among them, structural stylistics – which is the more polished version of formal stylistics – looks over the structure of words, sounds and syntax in the text and then focuses on their relation to the content. In fact, structural stylistics, drawing upon the structuralists' views, is formed on the basis of structural linguistics, and works to analyze a literary text with regard to its organic unity and the inter-relationship of the parts to the whole. In other words, the main purpose is to approach the content of a work through its form and structure. In this method, stylistic features of the work are recognized through understanding the structural proportions between sounds, words and syntax. Accordingly, the structuralists regard the style as the manner of deviation and extra-regularity and the frequency of its occurrence in a particular era. The present article first refers to the definitions of style and stylistic schools and enumerates the features of structural stylistics, and then sets out to study and compare two ghazals on the subject of the Holy Defense by Gheisar Aminpour ("Taghvimha" or "Calendars") and Alireza Ghazve ("Ghesmat" or "Destiny"), in the light of structural stylistics. Through analyzing the different parts of the texts, with regard to balances and deviation, we will deal with the relatedness and proportion of these parts to the content. "Calendars" is among the best-known ghazals of Aminpour. It contains the issues of feeling ashamed of martyrs, lamenting over our negligence, and feeling left away from the martyrs. A special sense of grief and sadness, hidden in the particular rhythm and cadence of the words and combinations, helps the poet to express his feelings and thoughts. The ghazal "Destiny" deals also with the distress and exhaustion the poet feels

  9. TEXT CLASSIFICATION USING NAIVE BAYES UPDATEABLE ALGORITHM IN SBMPTN TEST QUESTIONS

    Directory of Open Access Journals (Sweden)

    Ristu Saptono

    2017-01-01

    Full Text Available Document classification is a growing area of interest in text mining research. Classification can be based on topic, language, and so on. This study was conducted to determine how Naive Bayes Updateable performs in classifying SBMPTN exam questions by theme. Naive Bayes Updateable is an incremental version of the Naive Bayes classifier, an algorithm often used in text classification; it can learn from new data introduced to the system even after the classifier has been built from existing data. The Naive Bayes classifier assigns exam questions to the theme of a field of study by analyzing the keywords that appear in them. A feature selection method, DF-Thresholding, is applied to improve classification performance. Evaluation of the Naive Bayes classifier yields 84.61% accuracy.
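
    The idea of an updateable Naive Bayes classifier with document-frequency (DF) thresholding can be illustrated with a short sketch. This is not the implementation used in the study; it is a plain multinomial Naive Bayes with Laplace smoothing, and the class name `UpdateableNaiveBayes`, the `min_df` parameter and the token lists in the usage note are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class UpdateableNaiveBayes:
    """Multinomial Naive Bayes whose counts are updated one document at a time."""

    def __init__(self, min_df=1):
        self.min_df = min_df                     # DF threshold: keep terms seen in >= min_df documents
        self.class_docs = Counter()              # class -> number of documents seen
        self.term_counts = defaultdict(Counter)  # class -> term -> occurrence count
        self.doc_freq = Counter()                # term -> number of documents containing it

    def update(self, tokens, label):
        """Fold one labelled document into the model without retraining from scratch."""
        self.class_docs[label] += 1
        self.term_counts[label].update(tokens)
        self.doc_freq.update(set(tokens))

    def predict(self, tokens):
        """Return the most probable class, or None if nothing has been learned yet."""
        # DF thresholding: ignore terms that occur in too few documents.
        vocab = {t for t, df in self.doc_freq.items() if df >= self.min_df}
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.term_counts[label]
            class_total = sum(c for t, c in counts.items() if t in vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                if t in vocab:
                    # Laplace-smoothed log likelihood of the term given the class.
                    score += math.log((counts[t] + 1) / (class_total + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

    Because the model stores only counts, calling `update(["atom", "bond"], "chemistry")` after earlier training adds a new theme without revisiting the older documents, which is the property the abstract attributes to the updateable variant.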

  10. Computational text analysis and reading comprehension exam complexity towards automatic text classification

    CERN Document Server

    Liontou, Trisevgeni

    2014-01-01

    This book delineates a range of linguistic features that characterise the reading texts used at the B2 (Independent User) and C1 (Proficient User) levels of the Greek State Certificate of English Language Proficiency exams in order to help define text difficulty per level of competence. In addition, it examines whether specific reader variables influence test takers' perceptions of reading comprehension difficulty. The end product is a Text Classification Profile per level of competence and a formula for automatically estimating text difficulty and assigning levels to texts consistently and re

  11. Text document classification based on mixture models

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana; Malík, Antonín

    2004-01-01

    Vol. 40, No. 3 (2004), pp. 293-304 ISSN 0023-5954 R&D Projects: GA AV ČR IAA2075302; GA ČR GA102/03/0049; GA AV ČR KSK1019101 Institutional research plan: CEZ:AV0Z1075907 Keywords: text classification * text categorization * multinomial mixture model Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.224, year: 2004

  12. The ‘indisciplinarity’ of stylistics

    OpenAIRE

    Sorlin Sandrine

    2014-01-01

    This paper aims at showing why the stylistician can be construed as a prolific “impostor” in a most positive sense: pledged to no specific linguistic prophet, she can opt for different theoretical linguistic tools (in the sphere of pragmatics, critical discourse analysis, cognitive grammar, etc.) depending on her object of study and what her research question is. The liberty claimed by the stylistician explains why stylistics is the “undisciplined” child of linguistics, shirking any clear def...

  13. AN IMPLEMENTATION OF EIS-SVM CLASSIFIER USING RESEARCH ARTICLES FOR TEXT CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    B Ramesh

    2016-04-01

    Full Text Available Automatic text classification is a prominent research topic in text mining. Text pre-processing plays a major role in a text classifier, and efficient pre-processing techniques improve classifier performance. In this paper, we implement the ECAS stemmer, Efficient Instance Selection and a Pre-computed Kernel Support Vector Machine for text classification using recent research articles. We use the ECAS stemmer to find root words, Efficient Instance Selection for dimensionality reduction of the text data, and the Pre-computed Kernel Support Vector Machine to classify the selected instances. Experiments were performed on 750 research articles in three classes: engineering, medical and educational articles. The EIS-SVM classifier provides better performance in real-time research article classification.

  14. Puerto Nuevo and the Origins of the Stylistic-Religious Paracas Tradition

    OpenAIRE

    García, Rubén

    2012-01-01

    Puerto Nuevo was previously defined by García y Pinilla as a phase/style where there is an amalgam of Chavín and Cupisnique cultural elements with those of the south coast at the end of the Initial Period and the beginning of the Early Horizon. This paper presents new evidence and comparative stylistic analyses of contemporary societies that propose that it was during Puerto Nuevo times that the Paracas stylistic and religious tradition initiated, and therefore place it chronologically at th...

  15. Strategies to Increase Accuracy in Text Classification

    NARCIS (Netherlands)

    D. Blommesteijn (Dennis)

    2014-01-01

    Text classification via supervised learning involves various steps, from processing raw data and feature extraction to training and validating classifiers. Within these steps implementation decisions are critical to the resulting classifier accuracy. This paper contains a report of the

  16. Overfitting Reduction of Text Classification Based on AdaBELM

    Directory of Open Access Journals (Sweden)

    Xiaoyue Feng

    2017-07-01

    Full Text Available Overfitting is an important problem in machine learning. Several algorithms, such as the extreme learning machine (ELM), suffer from this issue when facing high-dimensional sparse data, e.g., in text classification. One common issue is that the extent of overfitting is not well quantified. In this paper, we propose a quantitative measure of overfitting referred to as the rate of overfitting (RO) and a novel model, named AdaBELM, to reduce the overfitting. With RO, the overfitting problem can be quantitatively measured and identified. The newly proposed model can achieve high performance on multi-class text classification. To evaluate the generalizability of the new model, we designed experiments based on three datasets, i.e., the 20 Newsgroups, Reuters-21578, and BioMed corpora, which represent balanced, unbalanced, and real application data, respectively. Experiment results demonstrate that AdaBELM can reduce overfitting and outperform classical ELM, decision tree, random forests, and AdaBoost on all three text-classification datasets; for example, it can achieve 62.2% higher accuracy than ELM. Therefore, the proposed model has good generalizability.

  17. Rational kernels for Arabic Root Extraction and Text Classification

    Directory of Open Access Journals (Sweden)

    Attia Nehar

    2016-04-01

    Full Text Available In this paper, we address the problems of Arabic text classification and root extraction using transducers and rational kernels. We introduce a new root extraction approach based on Arabic patterns (Pattern Based Stemmer). Transducers are used to model these patterns, and root extraction is done without relying on any dictionary. Using transducers for extracting roots, documents are transformed into finite state transducers. This document representation allows us to use and explore rational kernels as a framework for Arabic text classification. Root extraction experiments are conducted on three word collections and yield 75.6% accuracy. Classification experiments are done on the Saudi Press Agency dataset, and N-gram kernels are tested with different values of N. Accuracy and F1 are 90.79% and 62.93% respectively. These results show that our approach, when compared with other approaches, is promising, especially in terms of accuracy and F1.
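
    An N-gram kernel of the kind tested above measures document similarity as the inner product of N-gram count vectors. The paper computes such kernels over finite state transducers; the sketch below swaps in a much simpler direct count over character N-grams of plain strings, purely to show what the kernel value represents. The function names are illustrative assumptions.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Count every contiguous character n-gram in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_kernel(a, b, n):
    """Inner product of the n-gram count vectors of strings a and b."""
    counts_a, counts_b = char_ngrams(a, n), char_ngrams(b, n)
    return sum(count * counts_b[gram] for gram, count in counts_a.items())


def normalized_ngram_kernel(a, b, n):
    """Cosine-normalised kernel in [0, 1]; 0.0 if either string has no n-grams."""
    denom = math.sqrt(ngram_kernel(a, a, n) * ngram_kernel(b, b, n))
    return ngram_kernel(a, b, n) / denom if denom else 0.0
```

    Varying `n` changes how much local context each feature captures, which is why the abstract reports experiments with different values of N.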

  18. A STYLISTIC ANALYSIS OF THE LANGUAGE OF POLITICAL ...

    African Journals Online (AJOL)

    user

    job. This study is a synchronic stylistic analysis of the various political utterances used during the electioneering process in ... as “The variety of language according to use” in situations such as ..... “Bruharity” is adapted from the English word.

  19. Stylistic Analysis of the Short Story "The Last Word" by Dr. A. R. Tabassum

    Science.gov (United States)

    Bari Khan, Abdul; Ahmad, Madiha; Ahmad, Sofia; Ijaz, Nida

    2015-01-01

    In this article stylistic analysis of short story "The Last Word" by Dr. A. R. Tabassum is performed. The formative elements of the story, such as point of view, characters and allegorical element, are discussed in detail so as to give a better insight of the story. The story is analyzed stylistically in terms of figures of speech where…

  20. Diagnostic investigation and historical-stylistic evaluation of oil painting on metal board. Example of “Christ Crucified with two mourning angels”

    Directory of Open Access Journals (Sweden)

    Salvatore Lorusso

    2007-07-01

    Full Text Available The oil painting on metal board (40 x 30 cm) under study was bought from the antiquarian French market and bears a very common representation that derives from one of Michelangelo’s designs: “Cristo Crocifisso con due angeli dolenti”. The present paper not only refers to a stylistic and historical-artistic assessment, but also alludes to knowledge in a general sense through diagnostic technique and preservation conditions. The results of the diagnostic study, together with the stylistic analysis, have confirmed that the painting is an ancient one that dates back to the first decades of the XVIIth century.

  1. Aesthetic Proximity: the Role of Stylistic Programme Elements in Format Localisation

    Directory of Open Access Journals (Sweden)

    Jolien van Keulen

    2016-08-01

    Full Text Available Implications of the transnationalisation of television are often studied by focusing on the localisation of the content of formatted programmes. Although television is essentially an audio-visual medium, little attention has been paid to the aesthetic aspects of television texts in relation to transnationalisation and formatting. Transnationalisation of production practices, such as through formatting, implies a transnational aesthetic. At the same time, aspects of style are specific to place, culture or audience. In this article, the localisation of stylistic programme elements is explored using a comparison of two reality format adaptations. It is argued that style plays an important role in the expression of the local in a transnational industry.

  2. Stylistics in the Southeast Asian ESL or EFL Classroom: A Collection of Potential Teaching Activities

    Science.gov (United States)

    Gonzales, Wilkinson Daniel Wong; Flores, Eden R.

    2016-01-01

    For the past few decades, stylistics has emerged as a discipline that encompasses both literary criticism and linguistics. The integration of both disciplines opened many opportunities for English literature and language teachers to get creative in their teaching--by introducing the stylistic approach in their classrooms. However, in a typical…

  3. "If You Have to Ask, You'll Never Know": Effects of Specialised Stylistic Expertise on Predictive Processing of Music.

    Directory of Open Access Journals (Sweden)

    Niels Chr Hansen

    Full Text Available Musical expertise entails meticulous stylistic specialisation and enculturation. Even so, research on musical training effects has focused on generalised comparisons between musicians and non-musicians, and cross-cultural work addressing specialised expertise has traded cultural specificity and sensitivity for other methodological limitations. This study aimed to experimentally dissociate the effects of specialised stylistic training and general musical expertise on the perception of melodies. Non-musicians and professional musicians specialising in classical music or jazz listened to sampled renditions of saxophone solos improvised by Charlie Parker in the bebop style. Ratings of explicit uncertainty and expectedness for different continuations of each melodic excerpt were collected. An information-theoretic model of expectation enabled selection of stimuli affording highly certain continuations in the bebop style, but highly uncertain continuations in the context of general tonal expectations, and vice versa. The results showed that expert musicians have acquired probabilistic characteristics of music influencing their experience of expectedness and predictive uncertainty. While classical musicians had internalised key aspects of the bebop style implicitly, only jazz musicians' explicit uncertainty ratings reflected the computational estimates, and jazz-specific expertise modulated the relationship between explicit and inferred uncertainty data. In spite of this, there was no evidence that non-musicians and classical musicians used a stylistically irrelevant cognitive model of general tonal music providing support for the theory of cognitive firewalls between stylistic models in predictive processing of music.

  4. "If You Have to Ask, You'll Never Know": Effects of Specialised Stylistic Expertise on Predictive Processing of Music

    DEFF Research Database (Denmark)

    Hansen, Niels Christian; Vuust, Peter; Pearce, Marcus

    2016-01-01

    Musical expertise entails meticulous stylistic specialisation and enculturation. Even so, research on musical training effects has focused on generalised comparisons between musicians and non-musicians, and cross-cultural work addressing specialised expertise has traded cultural specificity and sensitivity for other methodological limitations. This study aimed to experimentally dissociate the effects of specialised stylistic training and general musical expertise on the perception of melodies. Non-musicians and professional musicians specialising in classical music or jazz listened to sampled... In spite of this, there was no evidence that non-musicians and classical musicians used a stylistically irrelevant cognitive model of general tonal music providing support for the theory of cognitive firewalls between stylistic models in predictive processing of music.

  5. ARABIC TEXT CLASSIFICATION USING NEW STEMMER FOR FEATURE SELECTION AND DECISION TREES

    Directory of Open Access Journals (Sweden)

    SAID BAHASSINE

    2017-06-01

    Full Text Available Text classification is the process of assigning unclassified text to appropriate classes based on its content. The most prevalent representation for text classification is the bag-of-words vector. In this representation, the words that appear in documents often have multiple morphological structures and grammatical forms. In most cases, these morphological variants of a word belong to the same category. In the first part of this paper, a new stemming algorithm was developed in which each term of a given document is represented by its root. In the second part, a comparative study is conducted of the impact of two stemming algorithms, namely Khoja’s stemmer and our new stemmer (referred to hereafter as origin-stemmer), on Arabic text classification. This investigation was carried out using chi-square as a feature selection method to reduce the dimensionality of the feature space, together with a decision tree classifier. In order to evaluate the performance of the classifier, this study used a corpus of 5070 documents independently classified into six categories (sport, entertainment, business, Middle East, switch and world) with the WEKA toolkit. Recall, F-measure and precision are used to compare the performance of the obtained models. The experimental results show that text classification using the new stemmer outperforms classification using Khoja’s stemmer. The F-measure was 92.9% in the sport category and 89.1% in the business category.
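
    Chi-square feature selection, as mentioned above, scores each term by how strongly its presence depends on a category and keeps only the highest-scoring terms. The sketch below uses the standard 2x2 contingency-table statistic; it is not the WEKA implementation used in the study, and the function names and the toy English tokens in the usage note are assumptions for illustration.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a term/category 2x2 contingency table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category without the term
    d: documents outside the category without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def top_terms(docs, labels, target, k):
    """Return the k terms with the highest chi-square score for category `target`.

    docs is a list of token lists (e.g. stemmed words); labels is the parallel
    list of category names. Ties are broken alphabetically for determinism.
    """
    vocab = set()
    for tokens in docs:
        vocab.update(tokens)

    scores = {}
    for term in vocab:
        a = b = c = d = 0
        for tokens, label in zip(docs, labels):
            present = term in tokens
            in_class = label == target
            if present and in_class:
                a += 1
            elif present:
                b += 1
            elif in_class:
                c += 1
            else:
                d += 1
        scores[term] = chi_square(a, b, c, d)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]
```

    With four toy documents labelled `sport`, `sport`, `business`, `business`, a term that appears only in the sport documents and a term that appears only in the business documents both receive the maximum score, since chi-square rewards negative as well as positive association; a term spread evenly across both categories scores zero and would be dropped first.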

  6. Text Categorization Using Weight Adjusted k-Nearest Neighbor Classification

    National Research Council Canada - National Science Library

    Han, Euihong; Karypis, George; Kumar, Vipin

    1999-01-01

    .... The authors present a nearest neighbor classification scheme for text categorization in which the importance of discriminating words is learned using mutual information and weight adjustment techniques...

  7. Stylistic Analysis of Robert Browning's Poem "Patriot into Traitor"

    Science.gov (United States)

    Ahmed, Mumtaz; Irshad, Ayesha

    2015-01-01

    The stylistic analysis of Robert Browning's poem "Patriot into Traitor" is done by using graphological, phonological, morphological and lexico-syntactic patterns. This analysis is helpful in decoding the underlying meanings of the poem. It clearly brings to surface what the poet really wants to impart.

  8. Motif-Based Text Mining of Microbial Metagenome Redundancy Profiling Data for Disease Classification.

    Science.gov (United States)

    Wang, Yin; Li, Rudong; Zhou, Yuhua; Ling, Zongxin; Guo, Xiaokui; Xie, Lu; Liu, Lei

    2016-01-01

    Text data of 16S rRNA are informative for classifications of microbiota-associated diseases. However, the raw text data need to be systematically processed so that features for classification can be defined/extracted; moreover, the high-dimension feature spaces generated by the text data also pose an additional difficulty. Here we present a Phylogenetic Tree-Based Motif Finding algorithm (PMF) to analyze 16S rRNA text data. By integrating phylogenetic rules and other statistical indexes for classification, we can effectively reduce the dimension of the large feature spaces generated by the text datasets. Using the retrieved motifs in combination with common classification methods, we can discriminate different samples of both pneumonia and dental caries better than other existing methods. We extend the phylogenetic approaches to perform supervised learning on microbiota text data to discriminate the pathological states for pneumonia and dental caries. The results have shown that PMF may enhance the efficiency and reliability in analyzing high-dimension text data.

  9. Comparisons and Selections of Features and Classifiers for Short Text Classification

    Science.gov (United States)

    Wang, Ye; Zhou, Zhi; Jin, Shan; Liu, Debin; Lu, Mi

    2017-10-01

    Short text is considerably different from traditional long text documents due to its shortness and conciseness, which somehow hinders the applications of conventional machine learning and data mining algorithms in short text classification. According to traditional artificial intelligence methods, we divide short text classification into three steps, namely preprocessing, feature selection and classifier comparison. In this paper, we have illustrated step-by-step how we approach our goals. Specifically, in feature selection, we compared the performance and robustness of the four methods of one-hot encoding, tf-idf weighting, word2vec and paragraph2vec, and in the classification part, we deliberately chose and compared Naive Bayes, Logistic Regression, Support Vector Machine, K-nearest Neighbor and Decision Tree as our classifiers. Then, we compared and analysed the classifiers horizontally with each other and vertically with feature selections. Regarding the datasets, we crawled more than 400,000 short text files from Shanghai and Shenzhen Stock Exchanges and manually labeled them into two classes, the big and the small. There are eight labels in the big class, and 59 labels in the small class.
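
    Of the four representations compared above, tf-idf weighting is the easiest to show in a few lines. The sketch below uses one common variant (term frequency normalised by document length, multiplied by the unsmoothed logarithmic inverse document frequency); the paper does not state which variant it used, so the exact formula and the function name `tfidf_vectors` are assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. Weight = (count / document length)
    * log(number of documents / number of documents containing the term).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document

    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors
```

    A term that occurs in every document receives weight zero under this variant, which matters for short texts: with only a handful of tokens per document, a few corpus-wide terms can otherwise dominate a one-hot representation.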

  10. Motif-Based Text Mining of Microbial Metagenome Redundancy Profiling Data for Disease Classification

    Directory of Open Access Journals (Sweden)

    Yin Wang

    2016-01-01

    Full Text Available Background. Text data of 16S rRNA are informative for classifications of microbiota-associated diseases. However, the raw text data need to be systematically processed so that features for classification can be defined/extracted; moreover, the high-dimension feature spaces generated by the text data also pose an additional difficulty. Results. Here we present a Phylogenetic Tree-Based Motif Finding algorithm (PMF) to analyze 16S rRNA text data. By integrating phylogenetic rules and other statistical indexes for classification, we can effectively reduce the dimension of the large feature spaces generated by the text datasets. Using the retrieved motifs in combination with common classification methods, we can discriminate different samples of both pneumonia and dental caries better than other existing methods. Conclusions. We extend the phylogenetic approaches to perform supervised learning on microbiota text data to discriminate the pathological states for pneumonia and dental caries. The results have shown that PMF may enhance the efficiency and reliability in analyzing high-dimension text data.

  11. Text document classification

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana

    No. 62 (2005), pp. 53-54 ISSN 0926-4981 R&D Projects: GA AV ČR IAA2075302; GA AV ČR KSK1019101; GA MŠk 1M0572 Institutional research plan: CEZ:AV0Z10750506 Keywords: document representation * categorization * classification Subject RIV: BD - Theory of Information

  12. Semantic Document Image Classification Based on Valuable Text Pattern

    Directory of Open Access Journals (Sweden)

    Hossein Pourghassem

    2011-01-01

    Full Text Available Knowledge extraction from detected document images is a complex problem in the field of information technology. The problem becomes more intricate given that only a negligible percentage of the detected document images are valuable. In this paper, a segmentation-based classification algorithm is used to analyze the document image. In this algorithm, regions of the image are detected using a two-stage segmentation approach and then classified into document and non-document (pure region) regions by hierarchical classification. A novel definition of valuable is also proposed to classify document images into valuable or invaluable categories. The proposed algorithm is evaluated on a database of document and non-document images obtained from the Internet. Experimental results show the efficiency of the proposed algorithm in semantic document image classification. The proposed algorithm achieves an accuracy of 98.8% on the valuable versus invaluable document image classification problem.

  13. Stylistic Devices in Ben Okri's The Famished Road | Ikechi ...

    African Journals Online (AJOL)

    This paper discusses stylistic devices in Ben Okri's The Famished Road. In the presentation of his story, the novelist makes use of literary devices which enrich readers' understanding and enjoyment of his subject matter. Satire, register, cliché, pidgin and proverbs are some of the devices. Others include: figurative language ...

  14. METAPHOR AS A STYLISTIC DEVICE OF ISLAMIC TEACHING

    Directory of Open Access Journals (Sweden)

    Jumino Suhadi

    2011-06-01

    Full Text Available Metaphor as a Stylistic Device of Islamic Teaching. This article discusses the various types of metaphor found in the holy verses of the Qur’an and in the Hadith, based on the framework of modern literary and linguistic theory. The main data sources in this study consist of holy verses of the Qur’an translated into English by Abdullah Yusuf Ali and several Hadith texts from the Hadith collection written by Habib Muhammad al-Haddar. The purpose of this analysis is to present strong evidence that metaphor is a stylistic device used widely in the Qur’an and the Hadith in conveying Islamic teachings. The results of this study show that all types of metaphor in the broad sense are found in various holy verses of the Qur’an and in the Hadith. According to the author, this argument is very convincing evidence that metaphor is one of the stylistic devices used in conveying Islamic teachings.

  15. Categorizing Children: Automated Text Classification of CHILDES files

    NARCIS (Netherlands)

    Opsomer, Rob; Knoth, Peter; Wiering, Marco; van Polen, Freek; Trapman, Jantine

    2008-01-01

    In this paper we present the application of machine learning text classification methods to two tasks: categorization of children’s speech in the CHILDES Database according to gender and age. Both tasks are binary. For age, we distinguish two age groups between the age of 1.9 and 3.0 years old. The

  16. Nigerian Visual Arts (1970-2003) and the Impact of Some Stylistic ...

    African Journals Online (AJOL)

    African Research Review ... The art productions and techniques from the stylistic tendencies have created vista of ... they experiment with materials and techniques without losing touch with their African identity. ... AJOL African Journals Online.

  17. Floral foregrounding: A corpus-assisted, cognitive stylistic study of the foregrounding of flowers in Mrs Dalloway

    DEFF Research Database (Denmark)

    Jensen, Marie Møller; Lottrup, Katrine; Nordentoft, Signe

    2018-01-01

    The study reported here combines quantitative and qualitative methods from both cognitive stylistics and corpus stylistics to analyze the flower-motif in Virginia Woolf’s novel Mrs Dalloway. The quantitative analysis compared the frequency of flower lemmas in the novel to both a reference corpus consisting of Woolf’s other works and a general corpus (the BNC). The analysis found significant differences between the frequencies in the novel and both corpora. The qualitative analysis is based on the statistically significant results and considers cognitive entrenchment and salience in relation to these. Furthermore, the analysis also links these two notions to different types of foregrounding as conceptualized in stylistics proper. Finally, aspects of repetition, parallelism and symbolism in relation to the flower-motif are considered. In conclusion, it is found that the flower...

  18. Criticism versus stylistics: an analysis of their areas of overlap and ...

    African Journals Online (AJOL)

    Criticism versus stylistics: an analysis of their areas of overlap and contrast. ... AFRREV LALIGENS: An International Journal of Language, Literature and Gender Studies ... in a number of areas, a great deal of discrepancy exists between them.

  19. Toward an enhanced Arabic text classification using cosine similarity and Latent Semantic

    Directory of Open Access Journals (Sweden)

    Fawaz S. Al-Anzi

    2017-04-01

    Full Text Available Cosine similarity is one of the most popular distance measures in text classification problems. In this paper, we used this important measure to investigate the performance of Arabic language text classification. For textual features, the vector space model (VSM) is generally used to represent textual information as numerical vectors. However, Latent Semantic Indexing (LSI) is a better textual representation technique as it maintains semantic information between the words. Hence, we used the singular value decomposition (SVD) method to extract textual features based on LSI. In our experiments, we compared some of the well-known classification methods such as Naïve Bayes, k-Nearest Neighbors, Neural Network, Random Forest, Support Vector Machine, and classification tree. We used a corpus that contains 4,000 documents on ten topics (400 documents for each topic). The corpus contains 2,127,197 words with about 139,168 unique words. The testing set contains 400 documents, 40 documents for each topic. As a weighting scheme, we used Term Frequency.Inverse Document Frequency (TF.IDF). This study reveals that the classification methods that use LSI features significantly outperform the TF.IDF-based methods. It also reveals that k-Nearest Neighbors (based on the cosine measure) and support vector machine are the best performing classifiers.
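
    The pipeline this abstract describes (TF.IDF weighting, cosine similarity, k-Nearest Neighbors) can be illustrated with a minimal sketch. It is not the authors' implementation: the LSI/SVD step is left out, the toy English tokens stand in for a preprocessed Arabic corpus, and the function names and the smoothed IDF formula are assumptions made for the example.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per tokenized document, plus the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Assumed smoothing (+1) so that terms shared by many documents keep some weight.
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs], idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_predict(query, train_vecs, labels, idf, k=3):
    """Label a tokenized query by majority vote among its k most similar documents."""
    q = {t: c * idf[t] for t, c in Counter(query).items() if t in idf}
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(q, train_vecs[i]), reverse=True)
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy corpus (hypothetical data, two topics).
train = [
    "goal match team league".split(),
    "team coach match season".split(),
    "market stock price bank".split(),
    "bank loan market rate".split(),
]
labels = ["sport", "sport", "economy", "economy"]
vecs, idf = tfidf_vectors(train)
prediction = knn_predict("team match score".split(), vecs, labels, idf, k=3)
print(prediction)
```

    In the setting of the abstract, the sparse TF.IDF vectors would first be projected into a lower-dimensional LSI space with SVD, and the same cosine-based neighbor search would then run on the projected vectors.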

  20. Stylistics Analysis in Advertising Discourse: A Case of the Dangote Cement Advertisement in Bamenda- Cameroon

    Directory of Open Access Journals (Sweden)

    Seino Evangeline Agwa Fomukong

    2016-12-01

    Full Text Available The purposes for which language is used determine how the writer or speaker chooses words, syntactic expressions and figurative language, because language has a very powerful effect on people, their actions and thoughts. This is seen in the use of language in various discourse types, which include advertisements. The powerful influence language has on people makes encoders selective in their use of language, especially in advertisements, because they have to persuade the readers. Consequently, they make the language of advertisements positive and emphasize the superiority of their products. This study discusses the advertisement of Dangote Cement on billboards in Bamenda, North West Region, Cameroon, analysing what is communicated, how it is communicated and the interpretation. The analysis used as tools the Textual Conceptual Functions as given by Jeffries (2016), uncovering ideologies and social meanings expressed in the Dangote Cement advertisement using the following apparatus: prioritisation, implying and assumption, listing, naming and description. The study has emphasized the structural analysis and the role of context to reveal functions and underlying meanings of the text. It also concludes that the advertisers use different stylistic devices that carry positivity, and a common ground that makes the readers identify with the advertisements, urging them to go for the Dangote Cement. Keywords: stylistics, language, context, advertisements, ideologies, Dangote

  1. Short text sentiment classification based on feature extension and ensemble classifier

    Science.gov (United States)

    Liu, Yang; Zhu, Xie

    2018-05-01

    With the rapid development of Internet social media, mining the emotional tendencies of short texts from the Internet to acquire useful information has attracted the attention of researchers. At present, the commonly used methods can be divided into rule-based classification and statistical machine learning classification. Although micro-blog sentiment analysis has made good progress, some shortcomings remain, such as insufficient accuracy and a strong dependence on the sentiment classification effect. Aiming at the characteristics of Chinese short texts, such as less information, sparse features, and diverse expressions, this paper considers expanding the original text by mining related semantic information from the reviews, forwarding and other related information. First, this paper uses Word2vec to compute word similarity to extend the feature words. It then uses an ensemble classifier composed of SVM, KNN and HMM to analyze the emotion of the short micro-blog text. The experimental results show that the proposed method can make good use of the comment forwarding information to extend the original features. Compared with the traditional method, the accuracy, recall and F1 value obtained by this method are improved.

  2. Transfer Learning beyond Text Classification

    Science.gov (United States)

    Yang, Qiang

    Transfer learning is a new machine learning and data mining framework that allows the training and test data to come from different distributions or feature spaces. We can find many novel applications of machine learning and data mining where transfer learning is necessary. While much has been done in transfer learning in text classification and reinforcement learning, there has been a lack of documented success stories of novel applications of transfer learning in other areas. In this invited article, I will argue that transfer learning is in fact quite ubiquitous in many real world applications. In this article, I will illustrate this point through an overview of a broad spectrum of applications of transfer learning that range from collaborative filtering to sensor based location estimation and logical action model learning for AI planning. I will also discuss some potential future directions of transfer learning.

  3. I will proclaim myself what I am : corpus stylistics and the language of Shakespeare’s soliloquies

    OpenAIRE

    Murphy, Sean Edward

    2015-01-01

    This article reports on a corpus stylistic study of the language of soliloquies in Shakespeare’s plays. Literary corpus stylistics can use corpus linguistic methods to test claims made by literary critics and identify hitherto unnoticed features. Existing literary studies of soliloquies tend to define and classify them, to trace the history of the form or to offer literary appreciation; yet they pay surprisingly little attention to the language which characterises soliloquies. By creating a s...

  4. Preparing Stylistically Challenging Contemporary Classical Repertoire for Performance: Interpreting "Kumari"

    Science.gov (United States)

    Viney, Liam; Blom, Diana

    2015-01-01

    Research involving the learning processes of musicians seldom examines specific pieces of music, and limited attention has been devoted to the earliest stages of learning a stylistically challenging or new piece of 20th-/21st-century art music. This article describes the processes by which two pianists (the authors) learned Ross Edwards's…

  5. METHODS OF TEXT INFORMATION CLASSIFICATION ON THE BASIS OF ARTIFICIAL NEURAL AND SEMANTIC NETWORKS

    Directory of Open Access Journals (Sweden)

    L. V. Serebryanaya

    2016-01-01

    Full Text Available The article covers the use of the perceptron, the Hopfield artificial neural network and the semantic network for classification of text information. Network training algorithms are studied. An error backpropagation algorithm for the perceptron network and a convergence algorithm for the Hopfield network are implemented. On the basis of the proposed models and algorithms, automatic text classification software is developed and its operation results are evaluated.
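
    As an illustration of what a convergence procedure for a Hopfield network can look like, the following minimal sketch stores two class prototypes with the Hebbian rule and lets a noisy input settle by updating one unit at a time. The bipolar feature vectors, the prototype names and the function names are hypothetical; the abstract does not specify the authors' encoding or algorithm details.

```python
def train_hopfield(patterns):
    """Hebbian weights for bipolar (+1/-1) patterns, with a zero diagonal."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, state, max_sweeps=100):
    """Update units one at a time until a full sweep changes nothing."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            h = sum(w[i][j] * s[j] for j in range(len(s)))
            # A zero input field leaves the unit in its current state.
            new = s[i] if h == 0 else (1 if h > 0 else -1)
            if new != s[i]:
                s[i], changed = new, True
        if not changed:
            break
    return s

# Hypothetical class prototypes: +1 means a feature term is present, -1 absent.
prototypes = {
    "class_a": [1, 1, 1, 1, -1, -1, -1, -1],
    "class_b": [1, -1, 1, -1, 1, -1, 1, -1],
}
weights = train_hopfield(list(prototypes.values()))
noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # class_a with the first feature flipped
settled = recall(weights, noisy)
label = next((name for name, p in prototypes.items() if p == settled), None)
print(label)
```

    The two prototypes are chosen to be orthogonal so that they do not interfere with each other; with correlated prototypes the network can settle into spurious states, which is one reason real systems limit the number of stored patterns.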

  6. Stylistic features of narrative procedure in a psychological short story in the context of teaching interpretation

    Directory of Open Access Journals (Sweden)

    Stakić Mirjana M.

    2016-01-01

    Full Text Available The paper investigates the stylistic features of narrative procedure in a psychological short story in the context of its interpretation in the teaching of the Serbian language and literature. The narrative procedure in a psychological short story is characterized by the use of first-person narration (the I-form), interior monologue and direct interior monologue, dreams, oversights and introspection. It is also characterized by a particular sentence structure, often incomplete and elliptical in form, used to express the conflicts going on in the characters' inner sphere and the verbal interaction between the characters. The narrative procedure applied in a psychological short story indicates that its plot is subordinated to internal psychological experiences. During the interpretation of a psychological short story, students are directed and encouraged, through the interpretation of stylistic and narrative procedures, to discover the complex and often hidden psychological mechanisms which spur the characters to act and influence their behavior, verbal expression and mutual relations. The interpretation of language signs which may have psychological and semantic potential leads to the revealing of unconscious internal psychological processes and mechanisms which take place within a literary character.

  7. The Territory of Language: Linguistics, Stylistics, and the Teaching of Composition.

    Science.gov (United States)

    McQuade, Donald A.

    Intended to chart the interconnections of linguistics, stylistics, and the teaching of composition, this book encourages a productive collective effort to cultivate linguistics among teachers of writing. Chapter titles and their authors are as follows: (1) "Grammar in American College Composition: An Historical Overview" (R. J. Connors);…

  8. A Chinese text classification system based on Naive Bayes algorithm

    Directory of Open Access Journals (Sweden)

    Cui Wei

    2016-01-01

    Full Text Available Addressing the characteristics of Chinese text classification, this paper uses ICTCLAS (the Chinese lexical analysis system of the Chinese Academy of Sciences) for document segmentation, cleans the data and filters stop words, and applies the information gain and document frequency feature selection algorithms to select document features. On this basis, a text classifier based on the Naive Bayes algorithm is implemented, and the system is tested and analysed on the Chinese corpus of Fudan University.
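
    The classification step of such a system can be sketched as a multinomial Naive Bayes classifier with Laplace smoothing. This is a minimal illustration under stated assumptions: documents are taken to be already segmented into tokens (ICTCLAS itself is not called), the stop-word set and the ASCII tokens are placeholders, and feature selection is omitted.

```python
import math
from collections import Counter, defaultdict

STOP_WORDS = {"the", "of", "a"}  # placeholder list, an assumption for the example

class NaiveBayesText:
    """Multinomial Naive Bayes over pre-segmented token lists."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.term_counts[label].update(t for t in tokens if t not in STOP_WORDS)
        self.vocab = {t for counts in self.term_counts.values() for t in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + len(self.vocab)  # Laplace smoothing
            score = math.log(n_docs / total_docs)           # log prior
            for t in tokens:
                if t in self.vocab:                         # unseen terms are skipped
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy training data standing in for segmented documents (hypothetical).
docs = [
    ["football", "match", "team"],
    ["team", "win", "match"],
    ["stock", "market", "bank"],
    ["bank", "rate", "market"],
]
labels = ["sport", "sport", "finance", "finance"]
model = NaiveBayesText().fit(docs, labels)
print(model.predict(["match", "team"]), model.predict(["the", "bank"]))
```

    Working in log space avoids numeric underflow when documents contain many tokens, which matters for full-length articles even though it makes no difference on this toy data.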

  9. "If You Have to Ask, You'll Never Know": Effects of Specialised Stylistic Expertise on Predictive Processing of Music

    OpenAIRE

    Hansen, Niels Chr.; Vuust, Peter; Pearce, Marcus

    2016-01-01

    Musical expertise entails meticulous stylistic specialisation and enculturation. Even so, research on musical training effects has focused on generalised comparisons between musicians and non-musicians, and cross-cultural work addressing specialised expertise has traded cultural specificity and sensitivity for other methodological limitations. This study aimed to experimentally dissociate the effects of specialised stylistic training and general musical expertise on the perception of melodies...

  10. Clipped Wings and the Great Abyss: Cognitive Stylistics and Implicatures in Abiezer Coppe’s ‘Prophetic’ Recantation

    OpenAIRE

    Borgogni Daniele

    2017-01-01

    In this article, two major paradigms within cognitive stylistics, the Conceptual Metaphor Theory (CMT) and the Conceptual Integration Theory (CIT), are applied as largely complementary approaches to discuss the scope and implicatures of the central metaphorical image of Copp’s Return to the wayes of Truth (1651), a text written by one of the most famous radical preachers of the Civil War period as a plea to be released from prison. The article will focus on how the linguistic and ...

  11. Healthcare Text Classification System and its Performance Evaluation: A Source of Better Intelligence by Characterizing Healthcare Text.

    Science.gov (United States)

    Srivastava, Saurabh Kumar; Singh, Sandeep Kumar; Suri, Jasjit S

    2018-04-13

    A machine learning (ML)-based text classification system has several classifiers. The performance evaluation (PE) of the ML system is typically driven by the training data size and the partition protocols used. Such systems lead to low accuracy because the text classification systems lack the ability to model the input text data in terms of noise characteristics. This research study proposes a concept of misrepresentation ratio (MRR) on input healthcare text data and models the PE criteria for validating the hypothesis. Further, such a novel system provides a platform to amalgamate several attributes of the ML system such as: data size, classifier type, partitioning protocol and percentage MRR. Our comprehensive data analysis consisted of five types of text data sets (TwitterA, WebKB4, Disease, Reuters (R8), and SMS); five kinds of classifiers (support vector machine with linear kernel (SVM-L), MLP-based neural network, AdaBoost, stochastic gradient descent and decision tree); and five types of training protocols (K2, K4, K5, K10 and JK). Using the decreasing order of MRR, our ML system demonstrates the mean classification accuracies as: 70.13 ± 0.15%, 87.34 ± 0.06%, 93.73 ± 0.03%, 94.45 ± 0.03% and 97.83 ± 0.01%, respectively, using all the classifiers and protocols. The corresponding AUC is 0.98 for SMS data using Multi-Layer Perceptron (MLP) based neural network. Across all the classifiers, the best accuracy, 91.84 ± 0.04%, is shown by the MLP-based neural network, which is 6% better than previously published results. Further, we observed that as MRR decreases, the system robustness increases, as validated by the standard deviations. The overall text system accuracy using all data types, classifiers, protocols is 89%, thereby showing the entire ML system to be novel, robust and unique. The system is also tested for stability and reliability.
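
    The partition protocols named in the abstract can be illustrated with a short sketch of k-fold splitting. It assumes that K2, K4, K5 and K10 denote k-fold cross-validation with the given k and that JK denotes the jackknife (leave-one-out) protocol; the abstract does not define these labels, and the function names are hypothetical.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    for fold in range(k):
        test = indices[fold::k]  # every k-th sample, offset by the fold number
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test

def jackknife_splits(n_samples):
    """Leave-one-out: k-fold with as many folds as samples (assumed meaning of JK)."""
    return k_fold_splits(n_samples, n_samples)

splits_k5 = list(k_fold_splits(10, 5))
splits_jk = list(jackknife_splits(10))
for train, test in splits_k5:
    print(len(train), len(test))
```

    In practice the sample order would be shuffled (and often stratified by class) before splitting, so that each fold reflects the overall class distribution.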

  12. A Pragma-Stylistic Analysis of President Goodluck Ebele Jonathan Inaugural Speech

    Science.gov (United States)

    Abuya, Eromosele John

    2012-01-01

    The study was an examination through the pragma-stylistic approach to meaning of the linguistic acts that manifest in the Inaugural Speech of Goodluck Ebele Jonathan as the democratically elected president in May 2011 General Elections in Nigeria. Hence, the study focused on speech acts type of locution, illocutionary and perlocutionary in the…

  13. Improving imbalanced scientific text classification using sampling strategies and dictionaries

    Directory of Open Access Journals (Sweden)

    Borrajo L.

    2011-12-01

    Full Text Available Many real applications have the imbalanced class distribution problem, where one of the classes is represented by a very small number of cases compared to the other classes. Among the systems affected are those related to the retrieval and classification of scientific documentation.

  14. Diagnostic investigations and historical-stylistic evaluation on the oil painting: "reading man by oil lamp light"

    Directory of Open Access Journals (Sweden)

    Salvatore Lorusso

    2006-02-01

    Full Text Available This investigation intends to verify the attribution of the oil painting (70 x 50.5 cm), portraying a man reading by oil lamp light, to Gerrit van Hontorst. The note draws not only on a stylistic and art-historical evaluation but also on diagnostic techniques applied to characterize the constituent materials, the execution technique and the state of preservation. This investigation rejects the attribution to the painter Gerrit van Hontorst, but it does not exclude a dating within the XVII century.

  15. The Stylistics Analysis of Internet Language

    Institute of Scientific and Technical Information of China (English)

    ZHOU Huan-huan

    2015-01-01

    Internet language is the product of modern technology, especially of the advancement of Information Technology. It is a social and linguistic phenomenon which has its own stylistic and rhetorical patterns compared with other languages. The reasons for the emergence of Internet language can be summarised into three kinds: firstly, netizens need distinctive language to show their personalities and to heighten the expression of emotions such as sadness, anger and happiness; secondly, the keyboard makes it hard to type whole sentences when people are chatting online or engaged in other activities; lastly, it is fast and convenient, especially when some online activities are time-consuming. Internet language makes the most of the functions of linguistic deviation and satisfies the psychological and practical needs of netizens.

  16. Representativeness in corpora of literary texts: introducing the C18P project

    Directory of Open Access Journals (Sweden)

    Gemeinböck, Iris

    2016-07-01

    Full Text Available Currently there are very few specialised corpora of literary texts that are tailored to the needs of literary critics who are interested in corpus stylistic analyses of prose fiction. Many existing corpora including literary texts were compiled for linguistic research interests and are often unsuitable for corpus stylistic purposes. The paper addresses three of the main problems: the absence of labelling of the texts for literary genre, the use of extracts, and the prevalence of linguistic periodisation schemes. C18P is a corpus of prose fiction designed specifically to address these issues. It traces the early development of the novel from 1700 up until the Victorian era. It can, for instance, be used for an analysis of the characteristic linguistic features of individual literary genres and forms. The following paper introduces the design of the corpus as well as some of its potential uses.

  17. Study children's literature by comparative stylistics approach (Poetry book in Ahmed Shoghi, and Mohammed Alhravy and Abbas Yamini Sharif model)

    Directory of Open Access Journals (Sweden)

    salah addin abdi

    2014-12-01

    Full Text Available Stylistic is disclosure laws creativity in literary discourse structure. And case under limitation in the idea of interdependence between the texts and to look at the texts in interdependence to only Angle comparison especially as were texts between different languages and what it was limited in literary texts turned out to be working round will be the technical side of any aesthetic. This literature poets any three Ahmed Shawki and Mohamed Hrawi Egyptians and Iranian poet Abbas yamini Sharif appear in children's literature be texture , beautiful word baptism imagination and purpose of the enjoyment of the small receiver and educated and refined . This literature poets look alike poets sing one topic in their hair for children is " Alktab " models of pedagogy in their poets be of standard noodles in a level voice , coordination and harmony between the internal and external music And role of repetition with different check rhythm music and show a sense of psychological and the emphasis on meaning. Be guaranteed their poets raising a child and learned, passion is pride and love of science and learning. The imagination in the poet be kind of yamini imagination Altaleva in while an innovative Shawki’s imagination and fantasy Hrawi graph. The study of each relying on poetic texts for all three of them and the energy of poetic language and its technical and creative aesthetic and stylistic comparison methodology which is lean on comparison mainly her and emerged from the comparison in two languages literary two would different Comparative Literature but of its focus on language and style .

  18. Text World Theory and real world readers: From literature to life in a Belfast prison.

    Science.gov (United States)

    Canning, Patricia

    2017-05-01

    Cognitive stylistics offers a range of frameworks for understanding (amongst other things) what producers of literary texts 'do' with language and how they 'do' it. Less prevalent, however, is an understanding of the ways in which these same frameworks offer insights into what readers 'do' (and how they 'do' it). Text World Theory (Werth, 1999; Gavins, 2007; Whiteley, 2011) has proved useful for understanding how and why readers construct mental representations engendered by the act of reading. However, research on readers' responses to literature has largely focused on an 'idealised' reader or an 'experimental' subject-reader often derived from within the academy and conducted using contrived or amended literary fiction. Moreover, the format of traditional book groups (participants read texts privately and discuss them at a later date) as well as online community forums such as Goodreads, means that such studies derive data from post-hoc, rather than real-time textual encounters and discussions. The current study is the first of its kind in analysing real-time reading contexts with real readers during a researcher-led literary project ('read.live.learn') in Northern Ireland's only female prison. In doing so, the study is unique in addressing experimental and post hoc bias. Using Text World Theory, the paper considers the personal and social impact of reader engagement in the talk of the participants. As such, it has three interrelated aims: to argue for the social and personal benefits of reading stylistically rich literature in real-time reading groups; to demonstrate the efficacy of stylistics for understanding how those benefits come about, and to demonstrate the inter-disciplinary value of stylistics, particularly its potential for traversing traditional research parameters.

  19. Prediction of cause of death from forensic autopsy reports using text classification techniques: A comparative study.

    Science.gov (United States)

    Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa

    2018-07-01

    Automatic text classification techniques are useful for classifying plaintext medical documents. This study aims to automatically predict the cause of death from free text forensic autopsy reports by comparing various schemes for feature extraction, term weighting or feature value representation, text classification, and feature reduction. For experiments, the autopsy reports belonging to eight different causes of death were collected, preprocessed and converted into 43 master feature vectors using various schemes for feature extraction, representation, and reduction. Six different text classification techniques were applied to these 43 master feature vectors to construct a classification model that can predict the cause of death. Finally, classification model performance was evaluated using four performance measures, i.e. overall accuracy, macro precision, macro-F-measure, and macro recall. From experiments, it was found that unigram features obtained the highest performance compared to bigram, trigram, and hybrid-gram features. Furthermore, among feature representation schemes, term frequency and term frequency with inverse document frequency obtained similar and better results when compared with binary frequency and normalized term frequency with inverse document frequency. Furthermore, the chi-square feature reduction approach outperformed the Pearson correlation and information gain approaches. Finally, among text classification algorithms, the support vector machine classifier outperforms random forest, Naive Bayes, k-nearest neighbor, decision tree, and ensemble-voted classifier. Our results and comparisons hold practical importance and serve as references for future works. Moreover, the comparison outputs will act as state-of-the-art techniques to compare future proposals with existing automated text classification techniques. Copyright © 2017 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
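
    The four performance measures named in the abstract can be computed as in the following minimal sketch. It assumes the common convention in which macro precision, recall and F-measure are unweighted means of the per-class values; the abstract does not state which convention the authors used, and the class labels in the example are hypothetical.

```python
def macro_metrics(y_true, y_pred):
    """Overall accuracy plus macro-averaged precision, recall and F-measure."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f_scores = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_score = (2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f_scores.append(f_score)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f": sum(f_scores) / n,
    }

# Hypothetical gold labels and predictions for four reports.
y_true = ["trauma", "trauma", "disease", "disease"]
y_pred = ["trauma", "disease", "disease", "disease"]
metrics = macro_metrics(y_true, y_pred)
print(metrics)
```

    Macro averaging gives every class the same weight regardless of its size, which is why it is often reported next to overall accuracy when some classes are rare.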

  20. Real-time stylistic prediction for whole-body human motions.

    Science.gov (United States)

    Matsubara, Takamitsu; Hyon, Sang-Ho; Morimoto, Jun

    2012-01-01

    The ability to predict human motion is crucial in several contexts such as human tracking by computer vision and the synthesis of human-like computer graphics. Previous work has focused on off-line processes with well-segmented data; however, many applications such as robotics require real-time control with efficient computation. In this paper, we propose a novel approach called real-time stylistic prediction for whole-body human motions to satisfy these requirements. This approach uses a novel generative model to represent a whole-body human motion including rhythmic motion (e.g., walking) and discrete motion (e.g., jumping). The generative model is composed of a low-dimensional state (phase) dynamics and a two-factor observation model, allowing it to capture the diversity of motion styles in humans. A real-time adaptation algorithm was derived to estimate both state variables and style parameter of the model from non-stationary unlabeled sequential observations. Moreover, with a simple modification, the algorithm allows real-time adaptation even from incomplete (partial) observations. Based on the estimated state and style, a future motion sequence can be accurately predicted. In our implementation, it takes less than 15 ms for both adaptation and prediction at each observation. Our real-time stylistic prediction was evaluated for human walking, running, and jumping behaviors. Copyright © 2011 Elsevier Ltd. All rights reserved.

  1. Tula song folklore: genre-stylistic and dialectic peculiarities

    Directory of Open Access Journals (Sweden)

    Krasovskaya Nelli Alexandrovna

    2016-06-01

    Full Text Available The article analyzes works of Tula folklore recorded in the western part of the Tula region in terms of genre, stylistic and linguistic features. The relevance of the study lies in the fact that Tula folk songs have not been studied and the linguistic features of the works have not been subjected to serious analysis. The article describes the features of the genre of songs recorded in Belevsky district of Tula region, including the ancient fortunetelling chants, wedding ceremony songs, romantic ballads etc., and cites numerous examples from the lyrics that reflect dialectal features at the phonetic, grammatical and lexical levels. According to the authors, the modern folk song genre retains its diversity and is a kind of storeroom containing priceless linguistic wealth. The analysis allows conclusions to be drawn about the presence and good preservation of South Russian dialect phonetic and grammatical features in the recorded songs. So far, there is no established typology of Tula dialects; therefore, according to the authors, the recording of folklore in the territories bordering on Tula dialects is very important and interesting for further descriptive and comparative work on identifying the eastern and south-south-west differences in Tula dialects.

  2. Approach for Text Classification Based on the Similarity Measurement between Normal Cloud Models

    Directory of Open Access Journals (Sweden)

    Jin Dai

    2014-01-01

    Full Text Available The similarity between objects is a core research area of data mining. In order to reduce the interference of the uncertainty of natural language, a similarity measurement between normal cloud models is applied to text classification research. On this basis, a novel text classifier based on cloud concept jumping up (CCJU-TC) is proposed. It can efficiently accomplish conversion between qualitative concepts and quantitative data. Through the conversion from text set to text information table based on the VSM model, the qualitative text concepts extracted from the same category are jumped up into a whole category concept. According to the cloud similarity between the test text and each category concept, the test text is assigned to the most similar category. A comparison among different text classifiers on different feature selection sets shows that CCJU-TC not only adapts well to different text features but also achieves better classification performance than the traditional classifiers.

  3. A Stylistic Analysis of Linguistic Patterns in Chichamanda Ngozi Adichie’s Purple Hibiscus

    Directory of Open Access Journals (Sweden)

    Muchamad Sholakhuddin Al Fajri

    2017-06-01

    Full Text Available This study aims to carry out a detailed and systematic stylistic analysis of linguistic patterns in the novel Purple Hibiscus by Chimamanda Ngozi Adichie. It analyses a specific extract of the novel in terms of narration and point of view, conversational analysis, speech and thought presentation, and mind style, and examines how the author employs these linguistic devices and patterns to shape characters’ personalities and the relationships between them in the reader’s mind. The results appear to suggest that the author successfully represents the protagonist, Kambili, as an obedient and salient daughter who deeply respects her father, while her father, Eugene, is constructed as a strict and religious father who imposes absolute control on his daughter.

  4. Functional Stylistics and Peripeteic Texts

    DEFF Research Database (Denmark)

    Borchmann, Simon

    2008-01-01

    Using a pragmatically based linguistic description apparatus on literary use of language is not unproblematic. Observations show that literary use of language violates the norms contained by this apparatus. With this paper I suggest how we can deal with this problem by setting up a frame for the ...

  5. Analysis of Influence of Different Relations Types on the Quality of Thesaurus Application to Text Classification Problems

    Directory of Open Access Journals (Sweden)

    Nadezhda S. Lagutina

    2017-01-01

    Full Text Available The main purpose of the article is to analyze how effectively different types of thesaurus relations can be used to solve text classification tasks. The basis of the study is an automatically generated thesaurus of a subject area that contains three types of relations: synonymous, hierarchical and associative. To generate the thesaurus, the authors use a hybrid method based on several linguistic and statistical algorithms for extracting semantic relations. The method makes it possible to create a thesaurus with a sufficiently large number of terms and relations among them. The authors consider two problems: topical text classification and sentiment classification of large newspaper articles. To solve them, the authors developed two approaches that complement standard algorithms with a procedure that takes thesaurus relations into account to determine semantic features of texts. The approach to topical classification combines the standard unsupervised BM25 algorithm with a procedure that takes into account the synonymous and hierarchical relations of the subject-area thesaurus. The approach to sentiment classification consists of two steps. In the first step, a thesaurus is created in which the polarity weights of terms are calculated from the term occurrences in the training set or from the weights of related thesaurus terms. In the second step, the thesaurus is used to compute the features of words from texts and to classify the texts with the SVM or Naive Bayes algorithm. In experiments with the text corpora BBCSport, Reuters and PubMed and a corpus of articles about American immigrants, the authors varied the types of thesaurus relations involved in the classification and the degree of their use. The results of the experiments make it possible to evaluate how efficiently thesaurus relations can be applied to the classification of raw texts and to determine under what conditions particular relations have a greater or lesser effect. In particular, the
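
    The topical-classification idea in the record above (BM25 scoring complemented by thesaurus relations) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not say how BM25 and the thesaurus are combined, so the seed terms, the toy thesaurus, and the helper names `expand` and `classify` are all hypothetical.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` (Okapi BM25)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def expand(terms, thesaurus):
    """Hypothetical helper: add thesaurus neighbours (synonyms, narrower terms)."""
    expanded = set(terms)
    for t in terms:
        expanded.update(thesaurus.get(t, ()))
    return sorted(expanded)

def classify(docs, category_terms, thesaurus):
    """Assign each document to the category whose expanded term set scores highest."""
    per_category = {
        cat: bm25_scores(expand(terms, thesaurus), docs)
        for cat, terms in category_terms.items()
    }
    return [max(per_category, key=lambda c: per_category[c][i])
            for i in range(len(docs))]

# Toy data (assumed, for illustration only).
docs = [
    "the striker scored a goal in the match".split(),
    "the court ruled on the immigration appeal".split(),
    "the keeper saved a penalty".split(),
]
category_terms = {"sport": ["football"], "law": ["court"]}
thesaurus = {"football": ["goal", "match", "penalty"], "court": ["appeal", "ruling"]}
labels = classify(docs, category_terms, thesaurus)
```

    In this toy setup the seed term "football" occurs in none of the documents, so the sport documents are only reachable through the thesaurus expansion, which is the kind of effect the record describes measuring.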

  6. "If You Have to Ask, You'll Never Know": Effects of Specialised Stylistic Expertise on Predictive Processing of Music.

    Science.gov (United States)

    Hansen, Niels Chr; Vuust, Peter; Pearce, Marcus

    2016-01-01

    Musical expertise entails meticulous stylistic specialisation and enculturation. Even so, research on musical training effects has focused on generalised comparisons between musicians and non-musicians, and cross-cultural work addressing specialised expertise has traded cultural specificity and sensitivity for other methodological limitations. This study aimed to experimentally dissociate the effects of specialised stylistic training and general musical expertise on the perception of melodies. Non-musicians and professional musicians specialising in classical music or jazz listened to sampled renditions of saxophone solos improvised by Charlie Parker in the bebop style. Ratings of explicit uncertainty and expectedness for different continuations of each melodic excerpt were collected. An information-theoretic model of expectation enabled selection of stimuli affording highly certain continuations in the bebop style, but highly uncertain continuations in the context of general tonal expectations, and vice versa. The results showed that expert musicians have acquired probabilistic characteristics of music influencing their experience of expectedness and predictive uncertainty. While classical musicians had internalised key aspects of the bebop style implicitly, only jazz musicians' explicit uncertainty ratings reflected the computational estimates, and jazz-specific expertise modulated the relationship between explicit and inferred uncertainty data. In spite of this, there was no evidence that non-musicians and classical musicians used a stylistically irrelevant cognitive model of general tonal music providing support for the theory of cognitive firewalls between stylistic models in predictive processing of music.

  7. The primitives of Santa Clara of Ubeda: stylistic and iconographic study, critical appraisals and vicissitudes of a dispersed heritage

    Directory of Open Access Journals (Sweden)

    Clara Beltrán Catalán

    2016-12-01

    Full Text Available The authors study the stylistic and iconographic aspects of a collection of paintings on wood dating from the 15th and 16th centuries, originally in the Royal Monastery of Santa Clara at Úbeda. This collection was sold in the 1920s with the participation of the antique dealer Celestino Dupont. The research is complemented by an analysis of the critical appraisals given to these works and their history since their introduction into the art market.

  8. Automatic topic identification of health-related messages in online health community using text classification.

    Science.gov (United States)

    Lu, Yingjie

    2013-01-01

    To facilitate patient involvement in online health communities and help patients obtain the informational and emotional support they need, this paper proposes an approach for automatically identifying the topics of health-related messages in an online health community, thus helping patients reach the messages most relevant to their queries efficiently. A feature-based classification framework was used for automatic topic identification. We first collected messages related to some predefined topics in an online health community. We then combined three types of features (n-gram-based features, domain-specific features and sentiment features) to build four feature sets for representing health-related text. Finally, three text classification techniques (C4.5, Naïve Bayes and SVM) were used to evaluate our topic classification model. By comparing different feature sets and classification techniques, we found that n-gram-based, domain-specific and sentiment features were all effective in distinguishing different types of health-related topics. In addition, feature reduction based on information gain also improved topic classification performance. In terms of classification techniques, SVM significantly outperformed C4.5 and Naïve Bayes. The experimental results demonstrated that the proposed approach could identify the topics of online health-related messages efficiently.
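
    The feature-reduction step mentioned in the record above (ranking features by information gain) can be illustrated with a short sketch. It assumes binary "term occurs in message" features and invented toy messages and topic labels; the paper's actual feature sets and thresholds are not given in the abstract.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(docs, labels, term):
    """Reduction in label entropy from knowing whether `term` occurs in a document."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (with_term, without_term) if part)
    return entropy(labels) - remainder

def top_features(docs, labels, k):
    """Keep the k terms with the highest information gain."""
    vocab = sorted({t for d in docs for t in d})
    ranked = sorted(vocab, key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]

# Toy messages as sets of tokens, with assumed topic labels.
docs = [{"dose", "insulin"}, {"insulin", "diet"},
        {"anxious", "alone"}, {"alone", "diet"}]
labels = ["treatment", "treatment", "emotional", "emotional"]
selected = top_features(docs, labels, 2)
```

    Here "insulin" and "alone" each split the two topics perfectly and are kept, while "diet" occurs equally in both topics and carries no information.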

  9. Stylistic features of case reports as a genre of medical discourse.

    Science.gov (United States)

    Lysanets, Yuliia; Morokhovets, Halyna; Bieliaieva, Olena

    2017-03-13

    The present paper discusses the lexical and grammatical peculiarities of English language medical case reports, taking into account their communicative purposes and intentions. The objective of the research is to clarify the principal mechanisms of producing an effective English language medical case report and thus to provide recommendations and guidelines for medical professionals who will deal with this genre. The analysis of medical case reports will largely focus on the most significant linguistic peculiarities, such as the use of active and passive voice, the choice of particular verb tenses, and pronouns. The selected medical case reports will be considered using methods of lexico-grammatical analysis, quantitative examination, and contextual, structural, narrative, and stylistic analyses. The research revealed a range of important stylistic features of medical case reports which markedly distinguish them from other genres of medical scientific writing: educational and instructive intentions, conciseness and brevity, direct and personal tone, and material presented in a narrative style. The present research has shown that the communicative strategies of the analyzed discourse, mentioned immediately above, are effectively implemented by means of specific lexical units and grammatical structures: the dominance of active voice sentences, past simple tense, personal pronouns, and modal verbs. The research has also detected the occasional use of the present perfect, present simple, and future simple tenses and passive voice which also serve particular communicative purposes of medical case reports. Medical case reports possess a range of unique characteristics which differ from those of research articles and other scientific genres within the framework of written medical discourse. 
It is to be emphasized that it is highly important for medical professionals to master the major stylistic principles and communicative intentions of medical case report as a genre in

  10. CLASSIFICATION OF TRAFFIC RELATED SHORT TEXTS TO ANALYSE ROAD PROBLEMS IN URBAN AREAS

    Directory of Open Access Journals (Sweden)

    A. M. M. Saldana-Perez

    2017-09-01

    Full Text Available The Volunteer Geographic Information (VGI) can be used to understand urban dynamics. In this work, a VGI data analysis is carried out over social media posts in order to classify traffic events in big cities that alter the movement of vehicles and people through the roads, such as car accidents, traffic and closures. The classification of traffic events described in short texts is done by applying a supervised machine learning algorithm. In this approach, users are considered sensors that describe their surroundings and provide their geographic position on the social network. The posts are processed by text mining and classified into five groups. Finally, the classified events are grouped in a data corpus and geo-visualized in the study area to detect the places with the most vehicular problems.

  11. Probing the topological properties of complex networks modeling short written texts.

    Directory of Open Access Journals (Sweden)

    Diego R Amancio

    Full Text Available In recent years, graph theory has been widely employed to probe several language properties. More specifically, the so-called word adjacency model has proven useful for tackling several practical problems, especially those relying on textual stylistic analysis. The most common approach to treating texts as networks has simply considered either large pieces of text or entire books. This approach has certainly worked well (many informative discoveries have been made this way), but it raises an uncomfortable question: could there be important topological patterns in small pieces of text? To address this problem, the topological properties of subtexts sampled from entire books were probed. Statistical analyses performed on a dataset comprising 50 novels revealed that most of the traditional topological measurements are stable for short subtexts. When the performance of the authorship recognition task was analyzed, it was found that a proper sampling yields a discriminability similar to the one found with full texts. Surprisingly, the support vector machine classification based on the characterization of short texts outperformed the one performed with entire books. These findings suggest that a local topological analysis of large documents might improve its global characterization. Most importantly, it was verified, as a proof of principle, that short texts can be analyzed with the methods and concepts of complex networks. As a consequence, the techniques described here can be extended in a straightforward fashion to analyze texts as time-varying complex networks.
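
    As a rough illustration of the word adjacency model discussed in the record above, the sketch below links words that occur next to each other and computes two common topological measurements (degree and clustering coefficient) for a whole text or for fixed-size subtexts. It is a simplification under assumptions: the graph is undirected and unweighted, no stopword removal or lemmatization is applied, and the function names are invented for this example.

```python
from collections import defaultdict

def word_adjacency_network(tokens):
    """Undirected graph linking every pair of distinct words that occur side by side."""
    graph = defaultdict(set)
    for a, b in zip(tokens, tokens[1:]):
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def degree(graph, node):
    """Number of distinct neighbours of a word."""
    return len(graph[node])

def clustering(graph, node):
    """Fraction of a word's neighbour pairs that are themselves linked."""
    nbrs = list(graph[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in graph[nbrs[i]])
    return 2.0 * links / (k * (k - 1))

def subtexts(tokens, size):
    """Split a token list into consecutive, non-overlapping subtexts of `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens) - size + 1, size)]

tokens = "the cat saw the dog and the dog saw the cat".split()
network = word_adjacency_network(tokens)
pieces = subtexts(tokens, 5)
```

    Comparing the measurements obtained from each element of `pieces` with those of the full `network` mirrors, in miniature, the stability question the record raises for short subtexts.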

  12. Natural Language Processing Based Instrument for Classification of Free Text Medical Records

    Directory of Open Access Journals (Sweden)

    Manana Khachidze

    2016-01-01

    Full Text Available According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system has to be introduced in the near future. In this context, the problem arises of structuring and classifying documents containing the full history of medical services provided. The present work introduces an instrument for classifying medical records written in the Georgian language. It is the first attempt at such classification of Georgian-language medical records. In total, 24,855 examination records were studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results demonstrated that both machine learning methods performed successfully, with SVM slightly ahead. In the process of classification, a “shrink” method based on feature selection was introduced and applied. At the first stage of classification the results of the “shrink” case were better; however, at the second stage of classification into subclasses, 23% of all documents could not be linked to a single definite subclass (liver or binary system) because of common features characterizing these subclasses. The overall results of the study were successful.

  13. Applying Active Learning to Assertion Classification of Concepts in Clinical Text

    Science.gov (United States)

    Chen, Yukun; Mani, Subramani; Xu, Hua

    2012-01-01

    Supervised machine learning methods for clinical natural language processing (NLP) research require a large number of annotated samples, which are very expensive to build because of the involvement of physicians. Active learning, an approach that actively samples from a large pool, provides an alternative solution. Its major goal in classification is to reduce the annotation effort while maintaining the quality of the predictive model. However, few studies have investigated its uses in clinical NLP. This paper reports an application of active learning to a clinical text classification task: to determine the assertion status of clinical concepts. The annotated corpus for the assertion classification task in the 2010 i2b2/VA Clinical NLP Challenge was used in this study. We implemented several existing and newly developed active learning algorithms and assessed their uses. The outcome is reported in the global ALC score, based on the Area under the average Learning Curve of the AUC (Area Under the Curve) score. Results showed that when the same number of annotated samples was used, active learning strategies could generate better classification models (best ALC – 0.7715) than the passive learning method (random sampling) (ALC – 0.7411). Moreover, to achieve the same classification performance, active learning strategies required fewer samples than the random sampling method. For example, to achieve an AUC of 0.79, the random sampling method used 32 samples, while our best active learning algorithm required only 12 samples, a reduction of 62.5% in manual annotation effort. PMID:22127105
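
    The two quantities reported in the record above can be made concrete with a small sketch: an area under a learning curve (AUC score plotted against the number of annotated samples) and the relative saving in annotation effort. The trapezoidal rule and the normalisation by the sample range are assumptions made here for illustration; the abstract does not give the exact ALC definition used in the study, and the curve values below are invented.

```python
def area_under_learning_curve(sample_sizes, auc_scores):
    """Trapezoidal area under an AUC-vs-samples curve, divided by the x-range.

    Assumed normalisation: a flat curve at AUC = 1.0 yields a score of 1.0.
    """
    area = 0.0
    for i in range(1, len(sample_sizes)):
        width = sample_sizes[i] - sample_sizes[i - 1]
        area += width * (auc_scores[i] + auc_scores[i - 1]) / 2.0
    return area / (sample_sizes[-1] - sample_sizes[0])

def annotation_reduction(passive_samples, active_samples):
    """Relative saving in annotated samples at equal classification performance."""
    return (passive_samples - active_samples) / passive_samples

# Figures quoted in the abstract: 32 samples (random) vs 12 (active learning).
saving = annotation_reduction(32, 12)

# Invented learning curve, for illustration only.
alc = area_under_learning_curve([0, 10, 20], [0.5, 0.7, 0.8])
```

    With the abstract's figures, `saving` evaluates to (32 - 12) / 32 = 0.625, which matches the reported 62.5% reduction.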

  14. Classification of Traffic Related Short Texts to Analyse Road Problems in Urban Areas

    Science.gov (United States)

    Saldana-Perez, A. M. M.; Moreno-Ibarra, M.; Tores-Ruiz, M.

    2017-09-01

    The Volunteer Geographic Information (VGI) can be used to understand urban dynamics. In this work, a VGI data analysis is carried out over social media posts in order to classify traffic events in big cities that alter the movement of vehicles and people through the roads, such as car accidents, traffic and closures. The classification of traffic events described in short texts is done by applying a supervised machine learning algorithm. In this approach, users are considered sensors that describe their surroundings and provide their geographic position on the social network. The posts are processed by text mining and classified into five groups. Finally, the classified events are grouped in a data corpus and geo-visualized in the study area to detect the places with the most vehicular problems.

  15. Translation of Lexical Stylistic Devices from English to Chinese in Commercial Advertisements

    Institute of Scientific and Technical Information of China (English)

    林鑫

    2014-01-01

    With the rapid development of China, a growing number of foreign products are entering the Chinese market. An excellent translation of a product’s advertisement from English to Chinese undoubtedly contributes to its successful promotion in the Chinese market. Although translation practice involves multiple difficulties, the translation of lexical stylistic devices is a particularly big challenge for translators. This is not simply because lexical stylistic devices are diverse in form, but also because most devices involve linguistic and cultural differences between English and Chinese. This thesis analyzes a number of current English-to-Chinese translations of such devices in commercial advertisements, drawn mainly from two translation scholars’ works and the official websites of world-famous brands. Analysis of the selected data shows seven major translation strategies in this respect, namely literal translation, free translation, flexible translation, extended translation, adaptation translation, compensation translation and amplification translation. Moreover, a number of linguistic and cultural issues that translators need to consider are also illustrated.

  16. Active learning for clinical text classification: is it better than random sampling?

    Science.gov (United States)

    Figueroa, Rosa L; Ngo, Long H; Goryachev, Sergey; Wiechmann, Eduardo P

    2012-01-01

    Objective: This study explores active learning algorithms as a way to reduce the requirements for large training sets in medical text classification tasks. Design: Three existing active learning algorithms (distance-based (DIST), diversity-based (DIV), and a combination of both (CMB)) were used to classify text from five datasets. The performance of these algorithms was compared to that of passive learning on the five datasets. We then conducted a novel investigation of the interaction between dataset characteristics and the performance results. Measurements: Classification accuracy and area under receiver operating characteristics (ROC) curves for each algorithm at different sample sizes were generated. The performance of active learning algorithms was compared with that of passive learning using a weighted mean of paired differences. To determine why the performance varies on different datasets, we measured the diversity and uncertainty of each dataset using relative entropy and correlated the results with the performance differences. Results: The DIST and CMB algorithms performed better than passive learning. With a statistical significance level set at 0.05, DIST outperformed passive learning in all five datasets, while CMB was found to be better than passive learning in four datasets. We found strong correlations between the dataset diversity and the DIV performance, as well as the dataset uncertainty and the performance of the DIST algorithm. Conclusion: For medical text classification, appropriate active learning algorithms can yield performance comparable to that of passive learning with considerably smaller training sets. In particular, our results suggest that DIV performs better on data with higher diversity and DIST on data with lower uncertainty. PMID:22707743
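
    Relative entropy, which the record above uses to characterise datasets, can be sketched for two word distributions as follows. The abstract does not say which distributions the authors compared, so the smoothed unigram distributions, the toy token lists, and the function names here are purely illustrative assumptions.

```python
import math
from collections import Counter

def distribution(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; q must be non-zero wherever p is."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p if p[w] > 0)

vocab = ["a", "b"]
p = distribution("a a b".split(), vocab)
q = distribution("a b b".split(), vocab)
divergence = relative_entropy(p, q)
```

    The smoothing keeps every probability positive, so the divergence stays finite even when a word is missing from one of the samples.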

  17. Impact of Stylistic Features, Architectural and Urban Rules of the Algiers Architectural Heritage Dating Between 1830 and 1930 on the Strength of its Buildings during the Earthquake

    Science.gov (United States)

    Souami, M. A.

    2013-07-01

    In another work, we highlighted from a theoretical point of view that there is a relation between earthquake-resistant architectural design codes and the urban and stylistic characteristics of the buildings and urban forms of the Algiers architectural heritage dating between 1830 and 1930. Following this, we hypothesized that its various stylistic and urban characteristics have a direct impact on the resilience of buildings to earthquakes. The purpose of this article is to test the validity of this hypothesis through computer simulation of examples of some of these stylistic and urban characteristics.

  18. Universally Designed Text on the Web: Towards Readability Criteria Based on Anti-Patterns.

    Science.gov (United States)

    Eika, Evelyn

    2016-01-01

    The readability of web texts affects accessibility. The Web Content Accessibility Guidelines (WCAG) state that the recommended reading level should match that of someone who has completed basic schooling. However, WCAG does not give advice on what constitutes an appropriate reading level. Web authors need tools to help compose WCAG-compliant texts, and specific criteria are needed. Classic readability metrics are generally based on lengths of words and sentences and have been criticized for being over-simplistic. Automatic measures and classifications of texts' reading levels employing more advanced constructs remain an unresolved problem. If such measures were feasible, what should these be? This work examines three language constructs not captured by current readability indices but believed to significantly affect actual readability, namely, relative clauses, garden path sentences, and left-branching structures. The goal is to see whether quantifications of these stylistic features reflect readability and how they correspond to common readability measures. Manual assessments of a set of authentic web texts for such uses were conducted. The results reveal that texts related to narratives such as children's stories, which are given the highest readability value, do not contain these constructs. The structures in question occur more frequently in expository texts that aim at educating or disseminating information such as strategy and journal articles. The results suggest that language anti-patterns hold potential for establishing a set of deeper readability criteria.
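
    For context, a classic length-based readability metric of the kind the record above criticizes can be sketched in a few lines. The example uses the Automated Readability Index (4.71 × characters per word + 0.5 × words per sentence − 21.43); the choice of this particular index and the simple regular-expression tokenization are assumptions of this sketch, and it does not implement the anti-pattern criteria the paper explores.

```python
import re

def automated_readability_index(text):
    """Automated Readability Index from average word and sentence lengths."""
    # Naive segmentation: sentences end at ., ! or ?; words are alphanumeric runs.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9]+", text)
    characters = sum(len(w) for w in words)
    return (4.71 * characters / len(words)
            + 0.5 * len(words) / len(sentences)
            - 21.43)

easy = "The cat sat. The dog ran."
hard = "Left-branching subordinate constructions considerably complicate comprehension."
easy_score = automated_readability_index(easy)
hard_score = automated_readability_index(hard)
```

    Because the score depends only on lengths, two sentences of equal length score the same regardless of relative clauses or garden-path structure, which is the limitation the record points to.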

  19. Language Personality of the Publicist: Rhetorical and Stylistic Canon (Yu. Senkevich “To “Ra” Across the Atlantic”)

    Directory of Open Access Journals (Sweden)

    Olga V. Shatalova

    2017-10-01

    Full Text Available The article identifies the communicative and linguistic parameters of the speech of the twentieth-century publicist Yu. Senkevich, which are put forward as a model for forming a language personality in the modern information and communication environment. An analysis of the book «On “Ra” through Atlantic» shows that Yu. N. Senkevich’s work meets the main criteria of popular science journalism: a high degree of reliability and authority on the part of the publicist, and dynamism, drama and intelligence of exposition. Unobtrusive education, based on making scientific knowledge relevant to the addressee and on the dialogization of publicistic discourse, a high level of psychological and philosophical generalization, and the declaration of humanistic values form a specific rhetoric that is supported by the formal and grammatical organization of the publicist’s speech. The priority given to syntactic constructions of a certain type and to stylistic tropes and figures determines the dynamism of the exposition and the scale at which the material is presented. Humour and light self-irony, as significant characteristics of the language personality, form the basis of the rhetorical and stylistic canon realized in Yu. N. Senkevich’s publicistic works: «the human view of people and society», which becomes a necessary reference point in the modern information and communication environment.

  20. DEEP LEARNING MODEL FOR BILINGUAL SENTIMENT CLASSIFICATION OF SHORT TEXTS

    Directory of Open Access Journals (Sweden)

    Y. B. Abdullin

    2017-01-01

    Full Text Available Sentiment analysis of short texts such as Twitter messages and comments in news portals is challenging due to the lack of contextual information. We propose a deep neural network model that uses bilingual word embeddings to solve the sentiment classification problem effectively for a given pair of languages. We apply our approach to two corpora of two different language pairs: English-Russian and Russian-Kazakh. We show how to train a classifier in one language and predict in another. Our approach achieves 73% accuracy for English and 74% accuracy for Russian. For Kazakh sentiment analysis, we propose a baseline method that achieves 60% accuracy, and a method to learn bilingual embeddings from a large unlabeled corpus using bilingual word pairs.

  1. A Corpus-based Stylistic Analysis of Body-Soul and Heaviness-Lightness Metaphors in Kundera's Novel The Unbearable Lightness of Being

    Directory of Open Access Journals (Sweden)

    Khalid Shakir Hussein

    2015-10-01

    Full Text Available This paper represents an attempt to conduct a corpus-based stylistic analysis of two conceptual metaphors in The Unbearable Lightness of Being, a novel written by Milan Kundera. Soul-body and lightness-heaviness metaphors are foregrounded as being central themes all through the novel. The way such metaphors are used in the novel indicates an insightful employment of metaphor as a cognitive tool which empowers language users with a capacity of conceptualizing different experiences. The researcher adopts conceptual metaphor theory to produce a sort of conceptual analysis incorporating Leech's semantic componential analysis within the overall analytic procedure. Different techniques are figured out in relation to the creative ways of manipulating the cognitive level of language, such as conceptual switching, conceptual extension, and conceptual fusion. These creative techniques are carefully used in the novel under investigation with different ranges of metaphorical creativity. Conceptual switching might be simple but very active in deviating from the conventional conceptual system. Conceptual extension marks certain minute elaborations conventional metaphors undergo extending the limits of cognitive conceptualization. As for conceptual fusion, it proves to be interestingly powerful in producing certain aggregations of metaphorical mappings. Keywords: Conceptual Metaphor Theory, Metaphorical Creativity, Metaphorical Mappings, Corpus Stylistics

  2. Clipped Wings and the Great Abyss: Cognitive Stylistics and Implicatures in Abiezer Coppe’s ‘Prophetic’ Recantation

    Directory of Open Access Journals (Sweden)

    Borgogni Daniele

    2017-03-01

    Full Text Available In this article, two major paradigms within cognitive stylistics, the Conceptual Metaphor Theory (CMT) and the Conceptual Integration Theory (CIT), are applied as largely complementary approaches to discuss the scope and implicatures of the central metaphorical image of Coppe’s Return to the wayes of Truth (1651), a text written by one of the most famous radical preachers of the Civil War period as a plea to be released from prison. The article will focus on how the linguistic and cultural contexts of Coppe’s prophetic writing, in their interaction with the dynamic conceptual relationships of a conceptual integration network, open up new possibilities of perspectivizing and insinuating radically different meanings and implicatures: the use of blends in Coppe’s text has a direct effect on the structure of the analogies that can be made between mental spaces, thereby triggering new meaning effects, supplementary symbolizing patterns, and unpredictable perlocutionary effects.

  3. Multidimensionality of Teachers' Graded Responses for Preschoolers' Stylistic Learning Behavior: The Learning-to-Learn Scales

    Science.gov (United States)

    McDermott, Paul A.; Fantuzzo, John W.; Warley, Heather P.; Waterman, Clare; Angelo, Lauren E.; Gadsden, Vivian L.; Sekino, Yumiko

    2011-01-01

    Assessment of preschool learning behavior has become very popular as a mechanism to inform cognitive development and promote successful interventions. The most widely used measures offer sound predictions but distinguish only a few types of stylistic learning and lack sensitive growth detection. The Learning-to-Learn Scales was designed to…

  4. Evaluation and Classification of Syntax Usage in Determining Short-Text Semantic Similarity

    Directory of Open Access Journals (Sweden)

    V. Batanović

    2014-06-01

    Full Text Available This paper outlines and categorizes ways of using syntactic information in a number of algorithms for determining the semantic similarity of short texts. We consider the use of word order information, part-of-speech tagging, parsing and semantic role labeling. We analyze and evaluate the effects of syntax usage on algorithm performance by utilizing the results of a paraphrase detection test on the Microsoft Research Paraphrase Corpus. We also propose a new classification of algorithms based on their applicability to languages with scarce natural language processing tools.

  5. Automatic Amharic text news classification: A neural networks ...

    African Journals Online (AJOL)

    School of Computing and Electrical Engineering, Institute of Technology, Bahir Dar University, Bahir Dar ... The study is on classification of Amharic news automatically using neural networks approach. Learning Vector ... INTRODUCTION.

  6. Zero-Shot Style Transfer in Text Using Recurrent Neural Networks

    OpenAIRE

    Carlson, Keith; Riddell, Allen; Rockmore, Daniel

    2017-01-01

    Zero-shot translation is the task of translating between a language pair where no aligned data for the pair is provided during training. In this work we employ a model that creates paraphrases which are written in the style of another existing text. Since we provide the model with no paired examples from the source style to the target style during training, we call this task zero-shot style transfer. Herein, we identify a high-quality source of aligned, stylistically distinct text in Bible ve...

  7. Magical cooking: Some stylistic characteristics of the novel Like water for chocolate

    Directory of Open Access Journals (Sweden)

    Uršula Kastelic Vukadinović

    2013-12-01

    Full Text Available In the paper we analyze some of the stylistic characteristics of the novel Like Water for Chocolate from the perspective of a non-Mexican reader. The narrator of the story interlaces the typical linguistic structures specific to cooking recipes with the story of a forbidden love between Tita and Pedro. As a good cook, she gives advice and describes the procedures for preparing the food, all of which she intertwines with the story as once told to her, which is explicitly expressed by using reporting verbs. We found that one of the characteristics of the novel is the visualization of the story. The author herself says that her literary creation is based on visual images and that afterwards she turns them into a story. In the novel this is reflected in Tita’s intense response to sensory stimuli. In the article, we highlight examples from the text, structured according to the predominant senses involved. Through the food that Tita prepares, Tita and Pedro establish an unusual, passionate and sensual relationship. Among the ingredients for the dishes that are connected to their love story, the reader encounters many unknown Mexican cultural expressions, which do not hinder the understanding of the story. The text shows that we are dealing with a dish or an ingredient that is mentioned at the beginning as a part of the recipe or presented in such a context. Therefore, the coherence of the text is maintained. The reader takes pleasure in the reading that transports him to the distant worlds of magical realism.

  8. The Use of Systemic-Functional Linguistics in Automated Text Mining

    Science.gov (United States)

    2009-03-01

    what degree two or more documents are similar in terms of their meaning. Simply put, such a cognitive model aims to link the physical manifestation...These features, both in terms of frequency and their chaining across a text, were taken as salient stylistic features that had a direct relationship to...because SFL attempts to model these cognitive processes, this has the potential to improve NLP tasks by making them more ’human-like’. Secondly

  9. Linguistic and Cognitive Characteristics of the Composition of the Text of J. K. Rowling's English Tales

    Science.gov (United States)

    Solodova, Elena

    2015-01-01

    This article focuses on linguistic and cognitive characteristics inherent in the composition of the English postmodern tales written by J.K. Rowling. The composition of the text is viewed as linguistic and cognitive construal that integrates compositional plot structure, compositional meaning structure, linguistic and stylistic means of their…

  10. Stylistics of “Tarikh i Balami”

    Directory of Open Access Journals (Sweden)

    Ali Mohammadi asiyabadi

    2016-05-01

    Full Text Available Abstract "Tarikh i Balami", written by Abu Ali Mohammad Ibn Abdollah Balami, minister of the Samanis in the fourth century, is one of the oldest books of the Islamic period; it covers the period from the creation to the Arab raid, including the prophet's life and the kings. Amir Mansur ibn Noah Samani ordered his minister, Balami, to translate Tabari's book “Tarikh al-Omam va al-Muluk”. On several occasions, however, Balami summarized the original book in the translation, drew on other sources and omitted some material. A comparison of eleven versions of this book shows many differences between them; no two are alike. Replacing the oldest words with new words, and Arabic words with Persian words of the same meaning, is one of the problems that copyists have caused for the stylistics of this book. The writing of "Tarikh i Balami" belongs to the first period of Persian prose in Iran. During this time, which spanned the Samani and Tahiri periods, authors wrote in a simple style. This style of writing is also called the Khurasani style, because the authors lived in Khorasan. The style is also named after Balami, the author of the book, because he was one of the most prominent writers of this genre. A comparison between this book and other books of this period, such as mughaddamih Shahnameh abu Mansuri, the translation of Tafsir i Tabari, Hudud al-alam min al-Mashriq ela al-Maqrib, Tafsir i Pak and others, shows that its most important stylistic features at the language level are significant in several areas. - The author tried to use Persian words, but sometimes shorter and more familiar Arabic words have been substituted. - Some verbs are used in specific ways; for example, the old prefixes “فرا”, “فرو”, “باز”,… are used with verbs. -

  11. MASCULINE LANGUAGE IN INDONESIAN NOVELS: A FEMINIST STYLISTIC APPROACH ON BELENGGU AND PENGAKUAN PARIYEM

    Directory of Open Access Journals (Sweden)

    Supriyadi .

    2014-06-01

    Full Text Available Belenggu is a novel written by Armijn Pane in 1938, whereas Pengakuan Pariyem is a lyrical novel written by Linus Suryadi AG and published in 1980. Both are interesting to analyze from linguistic aspects, especially in relation to gender and patriarchal issues. In this case, the proper approach is the feminist stylistics of Sara Mills, since it analyzes literary works from linguistic aspects and then extends the analysis to the surrounding context at the time of publication. The results show that Belenggu basically uses masculine language, including words, phrases, clauses, sentences, and discourses, when related to its context. Contextually, Belenggu represents the author's response to the conditions of his society, in which women tried to insist on their rights to equality (to men). It also represents the author's criticism of women, holding that it is better for women to keep working domestically and to support their husbands. Meanwhile, Pengakuan Pariyem is a lyrical novel that considers men and women to have mutual relationships, although women still work domestically and men work outside.

  12. Creative and Stylistic Devices Employed by Children During a Storybook Narrative Task: A Cross-Cultural Study

    Science.gov (United States)

    Gorman, Brenda K.; Fiestas, Christine E.; Peña, Elizabeth D.; Clark, Maya Reynolds

    2018-01-01

    Purpose The purpose of this study was to analyze the effects of culture on the creative and stylistic features children employ when producing narratives based on wordless picture books. Method Participants included 60 first- and second-grade African American, Latino American, and Caucasian children. A subset of narratives based on wordless picture books collected as part of a larger study was coded and analyzed for the following creative and stylistic conventions: organizational style (topic centered, linear, cyclical), dialogue (direct, indirect), reference to character relationships (nature, naming, conduct), embellishment (fantasy, suspense, conflict), and paralinguistic devices (expressive sounds, exclamatory utterances). Results Many similarities and differences between ethnic groups were found. No significant differences were found between ethnic groups in organizational style or use of paralinguistic devices. African American children included more fantasy in their stories, Latino children named their characters more often, and Caucasian children made more references to the nature of character relationships. Conclusion Even within the context of a highly structured narrative task based on wordless picture books, culture influences children’s production of narratives. Enhanced understanding of narrative structure, creativity, and style is necessary to provide ecologically valid narrative assessment and intervention for children from diverse cultural backgrounds. PMID:21278258

  13. Monolingual accounting dictionaries for EFL text production

    Directory of Open Access Journals (Sweden)

    Sandro Nielsen

    2006-10-01

    Full Text Available Monolingual accounting dictionaries are important for producing financial reporting texts in English in an international setting, because of the lack of specialised bilingual dictionaries. As the intended user groups have different factual and linguistic competences, they require specific types of information. By identifying and analysing the users' factual and linguistic competences, user needs, use-situations and the stages involved in producing accounting texts in English as a foreign language, lexicographers will have a sound basis for designing the optimal English accounting dictionary for EFL text production. The monolingual accounting dictionary needs to include information about UK, US and international accounting terms, their grammatical properties, their potential for being combined with other words in collocations, phrases and sentences in order to meet user requirements. Data items that deal with these aspects are necessary for the international user group as they produce subject-field specific and register-specific texts in a foreign language, and the data items are relevant for the various stages in text production: draft writing, copyediting, stylistic editing and proofreading.

  14. A STYLISTIC ANALYSIS OF “THE RIME OF THE ANCIENT MARINER”

    Directory of Open Access Journals (Sweden)

    Shaukat Khan

    2016-12-01

    Full Text Available If a specimen of literary art is seen as a fine tapestry of words made by the skilled seamstress, the poet, then the lexis and structure of a language are the raw materials, the fabric and the thread, which are woven into specific patterns to achieve the finished product. The choice of materials and their arrangement into unique patterns always bear an image of their creator, the artist; thus, a close view of them reveals the artist’s identity and brings out the meaningful message that underlies the ornate running threads. Mostly, students of literary studies cannot appreciate the beauty of the literary classics on their own. Consequently, they simply mimic the ideas, and sometimes even the words, of famous professional critics when asked to give their own critical judgment on the aesthetic merit or the thematic quality of a literary work in the shape of a home assignment, classroom presentation or an annual assessment test. The researcher drew the inspiration for this study from an idea expounded in Widdowson (1975) that this mimicry can be replaced by genuine individual opinion if the students, or even those people who have non-academic concerns with literature, are brought to a standpoint from where they can have a closer view of the raw materials, the language resources, which are involved in the making of a literary product. And if the product in its finished form cannot elicit the desired response from them, then making them sensitive to the process of its making can be quite effective in this regard. The present study attempts to show an easy access to the outlandish world of verse by means of the linguistic route, which is laid with the familiar flagstones of grammar and vocabulary. That is, in this study the elusiveness of poetry will be dealt with the precision of a social scientist, the linguist. The approach which serves as the basis of this study is not an invention

  15. Managing interactions between technological and stylistic innovation in the media industries, insights from the introduction of ebook technology in the publishing industry

    NARCIS (Netherlands)

    T.S. Schweizer (Sophie)

    2002-01-01

    The mainstream of innovation research pays a lot of attention to technological innovation, but has neglected its interaction with another type of innovation, which is particularly important in sectors like the furniture, fashion and the media content industries: stylistic innovation.

  16. POETICS OF TRANSCENDENCE: STYLISTIC REDUCTION AS A TOOL FOR REPRESENTATION OF SACRED MEANINGS

    Directory of Open Access Journals (Sweden)

    Elena Brazgovskaya

    2016-10-01

    Full Text Available The main direction of the work is connected to the representation of abstract (transcendent) objects in music and literature. The article analyses "Cantus in Memoriam Benjamin Britten" by Arvo Pärt and some poems of Czesław Miłosz. The metaphysical dimension of reality involves forms and things existing beyond the boundaries of empirical perception and, at first sight, beyond descriptive practices. Abstract objects are available in intellectual experience, but culture must transform them into a symbolic form. As a rule, this is connected to the practice of minimalism in art. The essence of minimalism is the reduction of the number of stylistic tools and the “purification” of perception from visual / auditory images (a non-mimetic use of language). For the representation of the sacred, Pärt uses only the mensural canon form, scale and chord. These “characters” are deprived of descriptive function, but have symbolic potential (the canon as a sign of stopped time, the eternal return). The distinctive feature of Miłosz's style is the pursuit of “clean” signs (indexical and symbolic). There is a reverse side to this distillation of language: the rejection of the subjective position and of emotional experience, and the distance between the person and the object of representation.

  17. Sounding Sacred: The Adoption of Biblical Archaisms in the Book of Mormon and Other 19th Century Texts

    Science.gov (United States)

    Bowen, Gregory A.

    2016-01-01

    The Book of Mormon is a text published in 1830 and considered a sacred work of scripture by adherents of the Latter-day Saint movement. Although written 200 years later, it exhibits many linguistic features of the King James translation of the Bible. Such stylistic imitation has been little studied, though a notable exception is Sigelman &…

  18. Phraseosemantic peculiarities of idioms with the word «silki» (snares) (a case study of Russian classics and modern literature texts)

    Directory of Open Access Journals (Sweden)

    Andrianova D.A.

    2017-03-01

    Full Text Available This article explores changes in the semantic and stylistic meaning of idioms with the word “silki” (snares) during the XVIII–XXI centuries on the basis of Russian classics, modern literature texts and publicistic writing. It is shown that the word “silki” (snares) was used as a biblical expression in ecclesiastic and some fiction texts; this explains its strong negative connotation, which is out of use in up-to-date contexts.

  19. LANGUAGE STYLE OF HABIBURRAHMAN EL-SHIRAZY IN THE DWILOGY OF AYAT-AYAT CINTA: A STYLISTIC STUDY

    Directory of Open Access Journals (Sweden)

    Aflahah Aflahah

    2017-05-01

    Full Text Available The novel is an artwork which closely relates to human life and is considered a representation of the human life journey. The language style in a novel is the embodiment of an author's use of language to express ideas, emotions and opinions, and to give a certain effect. The main problem discussed in this study is the language style found in the novels AAC 1 and 2. This study is about individual language style: the language style of an author who has written best-selling novels, namely Habiburrahman El-Shirazy (HES). The approach considered most appropriate for understanding the language use of HES in the Dwilogy of Ayat-Ayat Cinta (DAAC) is a stylistic study. A stylistic study reveals what language style HES uses and what effects result from it. The research is descriptive and qualitative. The descriptive method is used to describe linguistic facts such as the language style based on lexical choice, sentence structure, and direct or indirect meaning. The language style based on lexical choice (diction) in DAAC shows the typical language style of HES and also shows his ability as a Da’i and man of letters. The analysis of language style based on sentence structure and direct/indirect meaning illustrates the descriptive style of HES. Whatever he describes is very accurate. He gives very detailed information about the object being spoken of, whether the background of the story or the characterization. Through language style, HES describes the emotions experienced by the characters so well that readers are able to empathize. The data obtained reveal that HES uses language style to describe characters or characterization (physical condition, character, and characteristics), to describe the background, to tell the plot, and to convey the message. The results showed the lexical choice of scientific words and religious words are typical of HES’s language style in his DAAC, the

  20. Construction accident narrative classification: An evaluation of text mining techniques.

    Science.gov (United States)

    Goh, Yang Miang; Ubeynarayana, C U

    2017-11-01

    Learning from past accidents is fundamental to accident prevention. Thus, accident and near miss reporting are encouraged by organizations and regulators. However, for organizations managing large safety databases, the time taken to accurately classify accident and near miss narratives will be very significant. This study aims to evaluate the utility of various text mining classification techniques in classifying 1000 publicly available construction accident narratives obtained from the US OSHA website. The study evaluated six machine learning algorithms, including support vector machine (SVM), linear regression (LR), random forest (RF), k-nearest neighbor (KNN), decision tree (DT) and Naive Bayes (NB), and found that SVM produced the best performance in classifying the test set of 251 cases. Further experimentation with tokenization of the processed text and non-linear SVM was also conducted. In addition, a grid search was conducted on the hyperparameters of the SVM models. It was found that the best performing classifiers were linear SVM with unigram tokenization and radial basis function (RBF) SVM with unigram tokenization. In view of its relative simplicity, the linear SVM is recommended. Across the 11 labels of accident causes or types, the precision of the linear SVM ranged from 0.5 to 1, recall ranged from 0.36 to 0.9 and F1 score was between 0.45 and 0.92. The reasons for misclassification were discussed and suggestions on ways to improve the performance were provided. Copyright © 2017 Elsevier Ltd. All rights reserved.
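
    The per-label precision, recall and F1 figures quoted in this abstract can be illustrated with a minimal sketch. It assumes single-label predictions; the label names and the toy predictions below are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def per_label_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1          # correct prediction for this label
        else:
            fp[guess] += 1          # predicted label was wrong
            fn[truth] += 1          # true label was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical accident-type labels, for illustration only.
y_true = ["falls", "falls", "struck", "struck", "struck", "electrocution"]
y_pred = ["falls", "struck", "struck", "struck", "falls", "electrocution"]
scores = per_label_scores(y_true, y_pred)
for label, (p, r, f) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

    Reporting the three scores per label, rather than a single accuracy value, is what makes a range such as "recall from 0.36 to 0.9" visible: rare labels can score poorly even when overall accuracy looks acceptable.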

  1. Orienting task effects on text recall in adulthood.

    Science.gov (United States)

    Simon, E W; Dixon, R A; Nowak, C A; Hultsch, D F

    1982-09-01

    This investigation examined the effects of orienting task-controlled processing on the text recall of younger (18 to 32 years), middle-aged (39 to 51 years), and older (59 to 76 years) adults. The participants were presented with a 500-word narrative text. Three groups performed orienting tasks (syntactic, stylistic, advice) within an incidental memory paradigm. A fourth group was asked for intentional recall. Analysis indicated a significant age by orienting task interaction. Younger adults recalled more propositions when recall was intentional or when it was preceded by a deep-orienting task than when it was preceded by a shallow-orienting task. Middle-aged and older adults recalled more propositions when recall was intentional than when it was incidental, regardless of the depth of the orienting task. There were no significant differences in intentional recall. In addition, a significant age x orienting task x propositional level interaction indicated that younger adults recalled more of the main ideas of the text following deep processing, whereas the middle-aged and older adults recalled more of these ideas following intentional processing.

  2. A Feature Selection Method Based on Fisher's Discriminant Ratio for Text Sentiment Classification

    Science.gov (United States)

    Wang, Suge; Li, Deyu; Wei, Yingjie; Li, Hongxia

    With the rapid growth of e-commerce, product reviews on the Web have become an important information source for customers' decision making when they intend to buy some product. As the reviews are often too many for customers to go through, how to automatically classify them into different sentiment orientation categories (i.e. positive/negative) has become a research problem. In this paper, based on Fisher's discriminant ratio, an effective feature selection method is proposed for product review text sentiment classification. To validate the proposed method, we compared it with other methods based respectively on information gain and mutual information, with a support vector machine adopted as the classifier. In this paper, 6 subexperiments are conducted by combining different feature selection methods with 2 kinds of candidate feature sets. On 1006 review documents of cars, the experimental results indicate that the Fisher's discriminant ratio based on word frequency estimation has the best performance, with an F value of 83.3%, when the candidate features are the words which appear in both positive and negative texts.
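
    As a rough illustration of feature selection by Fisher's discriminant ratio, the sketch below scores each word by the textbook two-class form, (mean_pos - mean_neg)^2 / (var_pos + var_neg), computed over raw per-document word counts. This form and the use of raw counts are assumptions; the abstract does not state the exact estimator used in the paper, and the toy reviews are invented for illustration.

```python
from statistics import mean, pvariance


def fisher_ratio(pos_values, neg_values):
    """Textbook two-class Fisher ratio for one feature (assumed form)."""
    m_pos, m_neg = mean(pos_values), mean(neg_values)
    denom = pvariance(pos_values) + pvariance(neg_values)
    if denom == 0:
        # No within-class spread: perfectly separating or useless feature.
        return float("inf") if m_pos != m_neg else 0.0
    return (m_pos - m_neg) ** 2 / denom


def score_terms(pos_docs, neg_docs):
    """Score every word by the Fisher ratio of its per-document counts."""
    vocab = {word for doc in pos_docs + neg_docs for word in doc}
    return {
        word: fisher_ratio(
            [float(doc.count(word)) for doc in pos_docs],
            [float(doc.count(word)) for doc in neg_docs],
        )
        for word in vocab
    }


# Hypothetical tokenized car reviews.
pos_docs = [["great", "engine", "great"], ["great", "price"], ["smooth", "engine"]]
neg_docs = [["poor", "engine"], ["poor", "price", "poor"], ["noisy", "engine"]]

term_scores = score_terms(pos_docs, neg_docs)
ranked = sorted(term_scores, key=term_scores.get, reverse=True)
print(ranked[:2])  # the two most discriminative words
```

    Words such as "engine" that occur equally often in both classes receive a score of zero, so a selection step that keeps only the top-ranked words would discard them before training the classifier.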

  3. Text, Style, and Author in Hamlet Q1

    Directory of Open Access Journals (Sweden)

    Christy Desmet

    2016-03-01

    Full Text Available The first quarto of Hamlet has traditionally been an embarrassment to attribution studies. Textual and bibliographical studies from the 1980s and beyond have permitted suspect texts to be recovered and performed, but critical appreciation tends to focus on such matters as characterization and performance possibilities rather than the text’s rhetorical integrity and aesthetic qualities. More recently, we have seen greater critical attention to Shakespeare’s suspect texts, which has increased our appreciation for and expanded our notion of Q1 Hamlet as a ‘text’. Opinion remains divided, however, on the question of who ‘wrote’ this play. This essay addresses the authorship debate somewhat indirectly by providing a different view of Hamlet Q1 based on a stylistic analysis that is grounded in Renaissance rhetoric. It characterizes the play’s style as the rhetoric of speed, with brachylogia as its representative rhetorical figure. Through review of theories about the composition of Hamlet Q1 and a rhetorical analysis of its style, the essay seeks to examine how Hamlet’s first quarto might have a recognizable style and how that style might be related to current concepts of authorship.

  4. DIRASAH AL-USLUB WA AL-USLUBIYAH FI NAQD AL-ARABY

    Directory of Open Access Journals (Sweden)

    Habibullah Ali Ibrahim Ali

    2015-08-01

    Full Text Available In this study, we addressed method and style in Arabic criticism. Drawing on the efforts of Arab critics old and new, the subjects of study were as follows. First, the origins of stylistics, meaning stylistics in linguistics, rhetoric, literary criticism and other related fields. Second, the analysis of stylistic levels, where we focus on the premises of stylistic study for literary texts. Third, a study of method in old Arab criticism; in this part of the study, we noted some efforts by Arab critics concerning stylistics and the relevance of their studies to this stylistic aspect. Fourth, the stylistics of contemporary Arab criticism. This part of the survey considered some contemporary Arab critics' studies in stylistics and stated what distinguishes each study from the others. The most important results achieved in this study are included as an appendix.

  5. MT Post-editing: A Text Repair Experience for the Foreign Language Class.

    Directory of Open Access Journals (Sweden)

    Ana Niño

    2007-04-01

    Full Text Available Communication also means having to sort out the problems involved in learning a foreign language, especially with regards to production rather than reception. These learning strategies or skills can also be applied to translation teaching methodology, where students put in practice their risk taking, avoidance, reduction and/ or compensatory strategies in getting the message across. We acknowledge translation as a writing task constrained by the source text. In addition, the translation and the writing cycles have in common a generation stage and a revision stage where grammatical, lexical and stylistic correctness is assessed. Somewhere in the middle between translation and writing skills lies MT (Machine Translation post-editing that involves correcting the raw MT output with the aim of providing a quality text according to the intended purpose. Our research is intended to test the suitability of MT post-editing as an activity to promote error correction and, subsequently, to enhance written production in second and foreign language teaching.

  6. The contribution of the vaccine adverse event text mining system to the classification of possible Guillain-Barré syndrome reports.

    Science.gov (United States)

    Botsis, T; Woo, E J; Ball, R

    2013-01-01

    We previously demonstrated that a general purpose text mining system, the Vaccine adverse event Text Mining (VaeTM) system, could be used to automatically classify reports of anaphylaxis for post-marketing safety surveillance of vaccines. To evaluate the ability of VaeTM to classify reports to the Vaccine Adverse Event Reporting System (VAERS) of possible Guillain-Barré Syndrome (GBS). We used VaeTM to extract the key diagnostic features from the text of reports in VAERS. Then, we applied the Brighton Collaboration (BC) case definition for GBS, and an information retrieval strategy (i.e. the vector space model) to quantify the specific information that is included in the key features extracted by VaeTM and compared it with the encoded information that is already stored in VAERS as Medical Dictionary for Regulatory Activities (MedDRA) Preferred Terms (PTs). We also evaluated the contribution of the primary (diagnosis and cause of death) and secondary (second level diagnosis and symptoms) diagnostic VaeTM-based features to the total VaeTM-based information. MedDRA captured more information and better supported the classification of reports for GBS than VaeTM (AUC: 0.904 vs. 0.777); the lower performance of VaeTM is likely due to the lack of extraction by VaeTM of specific laboratory results that are included in the BC criteria for GBS. On the other hand, the VaeTM-based classification exhibited greater specificity than the MedDRA-based approach (94.96% vs. 87.65%). Most of the VaeTM-based information was contained in the secondary diagnostic features. For GBS, clinical signs and symptoms alone are not sufficient to match MedDRA coding for purposes of case classification, but are preferred if specificity is the priority.

  7. The Contribution of the Vaccine Adverse Event Text Mining System to the Classification of Possible Guillain-Barré Syndrome Reports

    Science.gov (United States)

    Botsis, T.; Woo, E. J.; Ball, R.

    2013-01-01

    Background We previously demonstrated that a general purpose text mining system, the Vaccine adverse event Text Mining (VaeTM) system, could be used to automatically classify reports of anaphylaxis for post-marketing safety surveillance of vaccines. Objective To evaluate the ability of VaeTM to classify reports to the Vaccine Adverse Event Reporting System (VAERS) of possible Guillain-Barré Syndrome (GBS). Methods We used VaeTM to extract the key diagnostic features from the text of reports in VAERS. Then, we applied the Brighton Collaboration (BC) case definition for GBS, and an information retrieval strategy (i.e. the vector space model) to quantify the specific information that is included in the key features extracted by VaeTM and compared it with the encoded information that is already stored in VAERS as Medical Dictionary for Regulatory Activities (MedDRA) Preferred Terms (PTs). We also evaluated the contribution of the primary (diagnosis and cause of death) and secondary (second level diagnosis and symptoms) diagnostic VaeTM-based features to the total VaeTM-based information. Results MedDRA captured more information and better supported the classification of reports for GBS than VaeTM (AUC: 0.904 vs. 0.777); the lower performance of VaeTM is likely due to the lack of extraction by VaeTM of specific laboratory results that are included in the BC criteria for GBS. On the other hand, the VaeTM-based classification exhibited greater specificity than the MedDRA-based approach (94.96% vs. 87.65%). Most of the VaeTM-based information was contained in the secondary diagnostic features. Conclusion For GBS, clinical signs and symptoms alone are not sufficient to match MedDRA coding for purposes of case classification, but are preferred if specificity is the priority. PMID:23650490
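
    The vector space model mentioned in this abstract can be sketched in a few lines. The example below represents a report and a case definition as term-count vectors and compares them with cosine similarity, one common similarity measure in that model. The abstract does not say which weighting or similarity measure the authors used, so the choice of raw counts and cosine, and the feature terms themselves, are illustrative assumptions.

```python
import math
from collections import Counter


def cosine_similarity(terms_a, terms_b):
    """Cosine similarity between two bags of terms (raw counts as weights)."""
    a, b = Counter(terms_a), Counter(terms_b)
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector shares nothing with anything
    return dot / (norm_a * norm_b)


# Hypothetical extracted features and case-definition terms.
report_features = ["weakness", "areflexia", "paresthesia"]
case_definition = ["weakness", "areflexia", "csf protein"]

similarity = cosine_similarity(report_features, case_definition)
print(f"similarity = {similarity:.3f}")
```

    A report whose extracted features omit laboratory terms can never match those dimensions of the case definition, which is consistent with the abstract's explanation for the lower AUC of the VaeTM-based features.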

  8. Prominent stylistic aspects in music of Nāser Khosrow's poetry

    Directory of Open Access Journals (Sweden)

    Morteza Mohseni

    2017-04-01

    Full Text Available It is axiomatic for those with even a little stylistic familiarity with the periods of Persian poetry that the qasidas of Nāser-e Khosrow are totally different from those of the other poets of the fifth lunar century, both in content and in technique. This difference is seen even in those areas of his poetry in which the poet is confined in making innovations. This paper investigates Nāser-e Khosrow's style in the field of the music of poetry. It also aims to identify his stylistic differences from his most famous contemporaries (Onsori, Farrokhi and Manoochehri) in the external, lateral and internal areas. As the first step, all of Nāser-e Khosrow's poems, except for the additions section, were taken as the scope of this study, accompanied by the main parts of the divans (poetical works) of the other three cited poets, which consist of almost 23600 distiches. In the second stage, each poet's divan was separately scrutinized in the three fields of the music of poetry. The frequency of each case was recorded and, after comparing the statistics for Nāser-e Khosrow's poems with those of the other three poets, the stylistic characteristics of his poetry were explored. Statistical information on the poets was generally recorded in a table and some parts of it were shown in a bar graph. It should be noted that the researcher considered the two items of innovation and frequency in all phases of the study. Most of the studies done in the field of Nāser-e Khosrow's music of poetry investigated the prosody and meter of his poetry. Most of these studies considered difficulty and relevancy as the important prosodic characteristics of his poetry. Regarding the lateral and internal areas of Nāser-e Khosrow's poetry, there is not much argument proposed except for using difficult rhymes and nominal radifs (for lateral music of poetry and attending to figures of

  9. Clustering and classification of email contents

    Directory of Open Access Journals (Sweden)

    Izzat Alsmadi

    2015-01-01

    Full Text Available Information users depend heavily on email systems as one of the major sources of communication. Their importance and usage are continuously growing despite the evolution of mobile applications, social networks, etc. Emails are used on both the personal and professional levels. They can be considered official documents in communication among users. Email data mining and analysis can be conducted for several purposes, such as spam detection and classification, subject classification, etc. In this paper, a large set of personal emails is used for the purpose of folder and subject classification. Algorithms are developed to perform clustering and classification for this large text collection. Classification based on N-grams is shown to be the best for such a large text collection, especially as the text is bilingual (i.e. with English and Arabic content).
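
    One reason N-gram features suit mixed English and Arabic text is that character N-grams need no language-specific tokenizer. The sketch below extracts character N-gram counts from a string; whether the paper used character or word N-grams, and what value of N, is not stated in the abstract, so both are assumptions here, and the sample strings are invented.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams after lowercasing and
    collapsing runs of whitespace to single spaces."""
    normalized = " ".join(text.lower().split())
    return Counter(
        normalized[i:i + n] for i in range(len(normalized) - n + 1)
    )


# Python strings are Unicode, so Arabic and English are handled alike.
english = char_ngrams("Meeting at noon", n=3)
arabic = char_ngrams("سلام", n=2)
mixed = char_ngrams("Invoice فاتورة attached", n=3)

print(english.most_common(3))
print(sum(arabic.values()), "Arabic bigrams")
```

    The resulting counts can serve as a feature vector for any classifier or clustering algorithm; short or misspelled words still contribute partial matches, which helps with informal email text.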

  10. Sentiment classification technology based on Markov logic networks

    Science.gov (United States)

    He, Hui; Li, Zhigang; Yao, Chongchong; Zhang, Weizhe

    2016-07-01

    With diverse online media emerging, there is growing concern with the sentiment classification problem. At present, text sentiment classification mainly utilizes supervised machine learning methods, which exhibit a certain domain dependency. On the basis of Markov logic networks (MLNs), this study proposed a cross-domain multi-task text sentiment classification method rooted in transfer learning. Through many-to-one knowledge transfer, labeled text sentiment classification knowledge was successfully transferred into other domains, and the precision of the sentiment classification analysis in the text tendency domain was improved. The experimental results revealed the following: (1) the model based on an MLN demonstrated higher precision than the single individual learning plan model; (2) multi-task transfer learning based on Markov logic networks could acquire more knowledge than self-domain learning. The cross-domain text sentiment classification model could significantly improve the precision and efficiency of text sentiment classification.

  11. Simile: the most salient stylistic feature in Kelile and Demne

    Directory of Open Access Journals (Sweden)

    Maryam Mahmoodi

    2014-11-01

    Full Text Available Abstract Kelile and Demne is one of the most salient samples of Persian technical prose. Rhetorical and semantic figures and figures of speech, namely simile, metaphor, metonymy and irony, are among the stylistic features of this book. Among these, simile, as the most influential tool of imagination, plays a dominant role in the illustrations of the book. In this article, simile has been analyzed and investigated in all its variations in Kelile and Demne. In this book, simile appears in forms ranging from its most laconic (eloquent simile) to its most extensive. The major features of these similes are their outspokenness, explicitness and sometimes their novelty. Among the likening components, the range of image vocabulary is one of the likening features in this book. Also, the point of similarity has usually been abstracted from man's states, shape, place, space, volume and, generally, affairs concerning the visual and tactile senses, so its perception is not too difficult. The variety and extension of likening vehicles in this work are worth contemplating. In analyzing simile on the credit of both parties, we can conclude that ratio-emotional similes are among the most frequent kinds of simile. Nasrollah Monshi has extended the field of emotional similes and has manipulated the relations between objects in a novel way. Allegoric simile has been used abundantly in Kelile and Demne, which justifies the didactic function of this text. Allegory approaches its main role in this book, i.e. arguing and convincing. The contents of allegories in this book are moral and political and, in terms of form, they are anecdotes of animals and human beings. The types of similes on the credit of form, namely equalization similes, implied comparative similes and subtrahend similes, have also been used. Among the salient features of this book, several images together or in interference with each other have been used in one word or sentence. Sometimes similes accompany other

  12. Simile: the most salient stylistic feature in Kelile and Demne

    Directory of Open Access Journals (Sweden)

    Maryam Mahmoodi

    2014-12-01

    Full Text Available Kelile and Demne is one of the most salient samples of Persian technical prose. Rhetorical and semantic figures and figures of speech, namely simile, metaphor, metonymy and irony, are among the stylistic features of this book. Among these, simile, as the most influential imagination tool, plays a dominant role in the illustrations of the book. In this article, simile has been analyzed and investigated in all its variations in Kelile and Demne. In this book, simile appears from its most laconic form (eloquent simile) to its most extensive form. But the major features of these similes are their outspokenness, explicitness and sometimes their novelty. Among the likening components, the range of image vocabulary is one of the likening features in this book. Also, the point of similarity has usually been abstracted from man's states, shape, place, space, volume and, generally, affairs concerning the visual and tactile senses. So, its perception is not too difficult. The variety and extension of likening vehicles in this work are worth contemplating. In the analysis of simile on the credit of both parties, we can conclude that ratio-emotional similes are among the most frequent kinds of simile. And Nasrollah Monshi has extended the field of emotional similes and has manipulated the relations between objects in a novel way. Allegoric simile has been used abundantly in Kelile and Demne. It justifies the didactic function of this text. Allegory approaches its main role in this book, i.e. arguing and convincing. The contents of allegories in this book are moral and political and, in terms of form, they are anecdotes of animals and human beings. The types of similes on the credit of form, namely equalization similes, implied comparative similes and subtrahend similes, have also been used. Among the salient features of this book, several images together or in interference with each other have been used in one word or sentence. Sometimes similes accompany other

  13. Learning From Short Text Streams With Topic Drifts.

    Science.gov (United States)

    Li, Peipei; He, Lu; Wang, Haiyan; Hu, Xuegang; Zhang, Yuhong; Li, Lei; Wu, Xindong

    2017-09-18

    Short text streams such as search snippets and micro blogs have become popular on the Web with the emergence of social media. Unlike traditional text streams, these data are characterized by short length, weak signal, high volume, high velocity, topic drift, etc. Short text stream classification is hence a very challenging and significant task. However, this challenge has received little attention from the research community. Therefore, a new feature extension approach is proposed for short text stream classification with the help of a large-scale semantic network obtained from a Web corpus. It is built on an incremental ensemble classification model for efficiency. First, more semantic contexts based on the senses of terms in short texts are introduced to make up for the data sparsity using the open semantic network, in which all terms are disambiguated by their semantics to reduce the impact of noise. Second, a concept cluster-based topic drift detection method is proposed to effectively track hidden topic drifts. Finally, extensive studies demonstrate that, compared to several well-known concept drift detection methods for data streams, our approach can detect topic drifts effectively, and that it handles short text streams effectively while maintaining efficiency compared to several state-of-the-art short text classification approaches.

  14. Making School Development Credible. Text, Context, Irony

    Directory of Open Access Journals (Sweden)

    Mats Börjesson

    2012-01-01

    Full Text Available

    The article argues for the importance of an open, reflexive-methodological approach when switching between studying text, context and researcher activity. Close linguistic analysis can benefit from being linked with the researcher’s contextualisation of his empirical material as well as with more distanced readings. The more specific starting point for this article is that school development, like other similar terms such as school improvement and the like, makes use of linguistic building blocks with which whole narratives about today’s and tomorrow’s schools can be constructed. The subject of the study is a short text issued by the Swedish Schools Inspectorate (Skolinspektionen). Government language changes according to the authorities’ role in society and their own definitions of their functions, and an important aspect here is the legitimacy of the authorities’ texts. By means of various kinds of close linguistic analysis, the above-mentioned text is studied with regard to choice of categories, hierarchies of modalisation and the rhetorical effects of different types of formulations in a broader political-social landscape. The article concludes with a reflective discussion on the relationship between government language and irony as a stylistic device – a device that is based on the results of the close empirical analysis.[i]



    [i] The article is part of the project ”School  Development as Narrative”, funded by the Swedish Research Council. The author would like to thank the two reviewers for very valuable comments.

  15. Investigation into Text Classification With Kernel Based Schemes

    Science.gov (United States)

    2010-03-01

    Document Matrix; TDMs: Term-Document Matrices; TMG: Text to Matrix Generator; TN: True Negative; TP: True Positive; VSM: Vector Space Model...are represented as a term-document matrix, common evaluation metrics, and the software package Text to Matrix Generator (TMG). The classifier...AND METRICS This chapter introduces the indexing capabilities of the Text to Matrix Generator (TMG) Toolbox. Specific attention is placed on the

  16. The Role of Text Mining in Export Control

    Energy Technology Data Exchange (ETDEWEB)

    Tae, Jae-woong; Son, Choul-woong; Shin, Dong-hoon [Korea Institute of Nuclear Nonproliferation and Control, Daejeon (Korea, Republic of)

    2015-10-15

    The Korean government provides classification services to exporters. It is simple to copy technology such as documents and drawings. Moreover, it is also easy to derive new technology from existing technology. The diversity of technology makes classification difficult because the boundary between strategic and nonstrategic technology is unclear and ambiguous. Reviewers should give sufficient consideration to previous classification cases. However, the increase in classification cases prevents consistent classification. This has made innovative and effective approaches necessary. IXCRS (Intelligent Export Control Review System) is proposed to meet these demands. IXCRS consists of an expert system, a semantic searching system, a full text retrieval system, an image retrieval system and a document retrieval system. The aim of the present paper is to examine the document retrieval system based on text mining and to discuss how to utilize the system. This study has demonstrated how text mining techniques can be applied to export control. The document retrieval system helps reviewers to handle previous classification cases effectively. In particular, it is highly probable that similarity data will help to specify classification criteria. However, an analysis of the system showed a number of problems that remain to be explored, such as a multilanguage problem and an inclusion relationship problem. Further research should be directed at solving these problems and applying more data mining techniques so that the system can be used as a useful tool for export control.

  17. The Role of Text Mining in Export Control

    International Nuclear Information System (INIS)

    Tae, Jae-woong; Son, Choul-woong; Shin, Dong-hoon

    2015-01-01

    The Korean government provides classification services to exporters. It is simple to copy technology such as documents and drawings. Moreover, it is also easy to derive new technology from existing technology. The diversity of technology makes classification difficult because the boundary between strategic and nonstrategic technology is unclear and ambiguous. Reviewers should give sufficient consideration to previous classification cases. However, the increase in classification cases prevents consistent classification. This has made innovative and effective approaches necessary. IXCRS (Intelligent Export Control Review System) is proposed to meet these demands. IXCRS consists of an expert system, a semantic searching system, a full text retrieval system, an image retrieval system and a document retrieval system. The aim of the present paper is to examine the document retrieval system based on text mining and to discuss how to utilize the system. This study has demonstrated how text mining techniques can be applied to export control. The document retrieval system helps reviewers to handle previous classification cases effectively. In particular, it is highly probable that similarity data will help to specify classification criteria. However, an analysis of the system showed a number of problems that remain to be explored, such as a multilanguage problem and an inclusion relationship problem. Further research should be directed at solving these problems and applying more data mining techniques so that the system can be used as a useful tool for export control.

  18. Stylistics of Nafthat ol-Masdur by Zeidari Nasvi

    Directory of Open Access Journals (Sweden)

    Fereydon Tahmasbi

    2016-05-01

    disrupt the sequence of the events. The author, in accordance with the society and the available prose, adopted a mix of Arabic and Persian prose, and this is one of the effects of the socio-political structure on his prose. The presence of different social classes in his book reflects the interaction between literature and society.

  19. Using Unlabeled Data to Improve Text Classification

    National Research Council Canada - National Science Library

    Nigam, Kamal P

    2001-01-01

    .... This dissertation demonstrates that supervised learning algorithms that use a small number of labeled examples and many inexpensive unlabeled examples can create high-accuracy text classifiers...

  20. Information Gain Based Dimensionality Selection for Classifying Text Documents

    Energy Technology Data Exchange (ETDEWEB)

    Dumidu Wijayasekara; Milos Manic; Miles McQueen

    2013-06-01

    Selecting the optimal dimensions for various knowledge extraction applications is an essential component of data mining. Dimensionality selection techniques are utilized in classification applications to increase the classification accuracy and reduce the computational complexity. In text classification, where the dimensionality of the dataset is extremely high, dimensionality selection is even more important. This paper presents a novel, genetic algorithm based methodology, for dimensionality selection in text mining applications that utilizes information gain. The presented methodology uses information gain of each dimension to change the mutation probability of chromosomes dynamically. Since the information gain is calculated a priori, the computational complexity is not affected. The presented method was tested on a specific text classification problem and compared with conventional genetic algorithm based dimensionality selection. The results show an improvement of 3% in the true positives and 1.6% in the true negatives over conventional dimensionality selection methods.
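
    The information gain that this abstract refers to can be illustrated with a minimal sketch. It assumes a binary feature (a term is present in or absent from a document) and discrete class labels; the function names are hypothetical, and the way the authors map information gain to mutation probabilities is not given in the abstract, so it is not reproduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(feature_present, labels):
    """Information gain of a binary feature with respect to the labels.

    feature_present: one truthy/falsy value per document
    labels: the class label of each document, in the same order
    """
    with_term = [y for f, y in zip(feature_present, labels) if f]
    without_term = [y for f, y in zip(feature_present, labels) if not f]
    total = len(labels)
    # Entropy remaining after splitting the documents on the feature.
    conditional = (
        (len(with_term) / total) * entropy(with_term)
        + (len(without_term) / total) * entropy(without_term)
    )
    return entropy(labels) - conditional


if __name__ == "__main__":
    labels = ["spam", "spam", "ham", "ham"]
    print(information_gain([1, 1, 0, 0], labels))  # feature separates the classes
    print(information_gain([1, 0, 1, 0], labels))  # feature carries no class signal
```

    Because each score depends only on the training data, it can be computed once before any search over dimensions starts, which is consistent with the abstract's remark that the gain is calculated a priori.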

  1. Information gathering for CLP classification

    Directory of Open Access Journals (Sweden)

    Ida Marcello

    2011-01-01

    Full Text Available Regulation 1272/2008 includes provisions for two types of classification: harmonised classification and self-classification. The harmonised classification of substances is decided at Community level and a list of harmonised classifications is included in Annex VI of the classification, labelling and packaging Regulation (CLP). If a chemical substance is not included in the harmonised classification list it must be self-classified, based on available information, according to the requirements of Annex I of the CLP Regulation. CLP provides that harmonised classification will be performed for carcinogenic, mutagenic or toxic to reproduction substances (CMR substances) and for respiratory sensitisers category 1, and for other hazard classes on a case-by-case basis. The first step of classification is the gathering of available and relevant information. This paper presents the procedure for gathering information and obtaining data. Data quality is also discussed.

  2. Mapping the stylistic affiliations of Le Corbusier's work

    NARCIS (Netherlands)

    Panigyrakis, P.I.

    2015-01-01

    The paper deals with Le Corbusier's connection to the term "style". A classification of his work in specific architectural styles is discussed, followed by a description of the procedure through which the style of the man Le Corbusier was constructed; in an attempt to search meaning in his continual

  3. Words Matter: Scene Text for Image Classification and Retrieval

    NARCIS (Netherlands)

    Karaoglu, S.; Tao, R.; Gevers, T.; Smeulders, A.W.M.

    Text in natural images typically adds meaning to an object or scene. In particular, text specifies which business places serve drinks (e.g., cafe, teahouse) or food (e.g., restaurant, pizzeria), and what kind of service is provided (e.g., massage, repair). The mere presence of text, its words, and

  4. Means of temporal expressions in newspaper news and press report

    Directory of Open Access Journals (Sweden)

    Čutura Ilijana R.

    2016-01-01

    Full Text Available This paper analyses the most frequent linguistic means of expressing the temporal frame in printed news and press reports. With structuralism as the chosen theoretical framework, the approach of the research is qualitative and stylistic. Since the study belongs to the field of functional stylistics, the primary methods used were functional-stylistic and linguistic-stylistic ones. As the study focuses on two newspaper genres, the comparative-stylistic method was used as well. The analysis was conducted on concrete linguistic excerpts from Serbian daily newspapers published throughout Serbia from 2008 to 2015. The aim of the paper is to show the model of expressing the temporal frame in contemporary Serbian newspapers. The paper provides an overview of the characteristics of the model and the types of temporal expression, as well as their variations in contemporary Serbian newspapers. It also aims to determine the differences between printed news and press reports in the choice of temporal expressions. It is shown that there is a tendency to change the schematized structure of these informative genres, and some innovation in the choice of linguistic means for expressing the meaning of temporally close events. The research is a contribution to journalism stylistics, more precisely to Serbian-language newspaper stylistics, and also contributes to the study of the linguistic and stylistic characteristics of non-literary texts. The study is also relevant because it describes the use of adverbs and adverbial expressions in the journalistic style.

  5. The book classification of William Torrey Harris: influences of Bacon and Hegel in library classification

    Directory of Open Access Journals (Sweden)

    Rodrigo de Sales

    2017-09-01

    Full Text Available The studies of library classification generally interact with the historical contextualization approach and with the classification ideas typical of Philosophy. In the 19th century, the North-American philosopher and educator William Torrey Harris developed a book classification at the St. Louis Public School, based on Francis Bacon and Georg Wilhelm Friedrich Hegel. The objective of this essay is to analyze Harris’s classification, reflecting upon his theoretical and philosophical backgrounds. To achieve such objective, this essay adopts a critical-descriptive approach for analysis. Results show some influences of Bacon and Hegel in Harris’s classification.

  6. FACET CLASSIFICATIONS OF E-LEARNING TOOLS

    Directory of Open Access Journals (Sweden)

    Olena Yu. Balalaieva

    2013-12-01

    Full Text Available The article deals with the classification of e-learning tools based on the facet method, which divides a set of objects in parallel into independent classification groups. At the same time, neither a rigid classification structure nor pre-built finite groups are assumed; classification groups are formed by combining values taken from the relevant facets. An attempt to systematize the existing classifications of e-learning tools from the standpoint of classification theory is made for the first time. Modern Ukrainian and foreign facet classifications of e-learning tools are described; their positive and negative features compared to classifications based on a hierarchical method are analyzed. The author's original facet classification of e-learning tools is proposed.

  7. Classification of Flotation Frothers

    Directory of Open Access Journals (Sweden)

    Jan Drzymala

    2018-02-01

    Full Text Available In this paper, a scheme for classifying flotation frothers is presented. The scheme first identifies the physical system in which a frother is present; four such systems are distinguished, i.e., pure state, aqueous solution, aqueous solution/gas system and aqueous solution/gas/solid system. As a result, there are numerous classifications of flotation frothers. The classifications can be organized into a scheme described in detail in this paper. The paper shows that a meaningful classification of frothers relies on choosing the physical system and then the feature, trend, parameter or parameters according to which the classification is performed. The proposed classification can play a useful role in characterizing and evaluating flotation frothers.

  8. A New English–Arabic Parallel Text Corpus for Lexicographic Applications

    Directory of Open Access Journals (Sweden)

    Hashan Al-Ajmi

    2011-10-01

    Full Text Available

    Abstract: Bilingual lexicographers, translation specialists and English teachers in the Arab world do not have access to computerized corpora of parallel texts for the English–Arabic language pair. This project has been carried out to meet this requirement by establishing the first general parallel corpus of English texts and their Arabic translations. The first phase of the project involved the selection of general source texts having appropriate lexical and stylistic features. The chosen source texts deal with a variety of topics such as the environment, globalization, psychology, history, politics, drama, etc. Their Arabic translations were taken from The World of Knowledge series published by the National Council for Culture, Arts and Letters (NCCAL) in Kuwait.

    Keywords: PARALLEL CORPUS, LEXICOGRAPHY, TRANSLATION, BILINGUAL DICTIONARY, COLLOCATIONS, ALIGNMENT, SYNONYMS, DERIVATIVES, ANTONYMS, GLOSSARY, FREQUENCY


  9. Emotion models for textual emotion classification

    Science.gov (United States)

    Bruna, O.; Avetisyan, H.; Holub, J.

    2016-11-01

    This paper deals with textual emotion classification, which has gained attention in recent years. Emotion classification is used in user experience, product evaluation, national security, and tutoring applications. It attempts to detect the emotional content in the input text and, based on different approaches, to establish what kind of emotional content is present, if any. Textual emotion classification is the most difficult to handle, since it relies mainly on linguistic resources and it introduces many challenges in assigning text to an emotion represented by a proper model. A crucial part of each emotion detector is the emotion model. The focus of this paper is to introduce the emotion models used for classification. Categorical and dimensional models of emotion are explained and some more advanced approaches are mentioned.

  10. A quick survey of text categorization algorithms

    Directory of Open Access Journals (Sweden)

    Dan MUNTEANU

    2007-12-01

    Full Text Available This paper contains an overview of basic formulations and approaches to text classification. This paper surveys the algorithms used in text categorization: handcrafted rules, decision trees, decision rules, on-line learning, linear classifier, Rocchio’s algorithm, k Nearest Neighbor (kNN), Support Vector Machines (SVM).

  11. Development of Feature Set, Classification Implementation and Applications for Vowel Migration/Modification in Sung Filipino (Tagalog) Texts and Perceived Intelligibility

    Directory of Open Access Journals (Sweden)

    Virginia B. Bustos

    2009-12-01

    Full Text Available With the emergence of research on real-time visual feedback to supplement vocal pedagogy, the utilization of technology in the world of music is now seen to accelerate skills learning and enhance cognitive development. The researchers of this project aim to further analyze vowel intelligibility and develop software applications intended to be used not only by professional singers but also by individuals who wish to improve their singing capability. Data in the form of sung vowels and song pieces were obtained from 46 singers. A Listening Test was then conducted on these samples to obtain the ground truth for vowel classification based on human perception. Simulation of the human auditory perception of sung Filipino vowels was performed using formant frequencies and Mel-frequency cepstral coefficients as feature vector inputs to a two-stage Discriminant Analysis classifier. The setup resulted in an over-all Training Set accuracy of 89.4% and an over-all Test Set accuracy of 90.9%. The accuracy of the classifier, measured in terms of the correspondence of vowel classifications obtained from the classifier with the results of the Listening Test, reached 92.3%. Using information obtained from the classifier, offline and online/real-time software applications were developed. The main application features include the display of the spectral envelope and spectrogram, pitch and vibrato analysis and direct feedback on the classification of the sung vowel. These features were recommended by singers who were surveyed and were incorporated in the applications to aid singers to adjust formant locations, directly determine listener’s perception of sung vowels, perform modeling effectively and carry out vowel migration.

  12. Statistical text classifier to detect specific type of medical incidents.

    Science.gov (United States)

    Wong, Zoie Shui-Yee; Akiyama, Masanori

    2013-01-01

    WHO Patient Safety has focused on increasing the coherence and expressiveness of patient safety classification with the foundation of the International Classification for Patient Safety (ICPS). Text classification and statistical approaches have been shown to be successful in identifying safety problems in the aviation industry using incident text information. It has been challenging to comprehend the taxonomy of medical incidents in a structured manner. Independent reporting mechanisms for patient safety incidents have been established in the UK, Canada, Australia, Japan, Hong Kong, etc. This research demonstrates the potential to construct statistical text classifiers to detect specific types of medical incidents using incident text data. An illustrative example for classifying look-alike sound-alike (LASA) medication incidents using structured text from 227 advisories related to medication errors from Global Patient Safety Alerts (GPSA) is shown in this poster presentation. The classifier was built using a logistic regression model. The ROC curve and the AUC value indicated that this is a satisfactorily good model.
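
    The AUC value mentioned in this abstract can be illustrated with a minimal sketch. It assumes binary labels (1 for the target incident type, 0 otherwise) and one classifier score per document; `roc_auc` is a hypothetical helper, not the authors' code, and it uses the pairwise-ranking definition of AUC rather than any particular library.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    AUC equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one, with
    ties counted as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Illustrative scores only; these are not data from the study.
    print(roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

    The quadratic loop is acceptable for a few hundred documents; a rank-based formulation would be preferable for larger collections.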

  13. An Intelligent System For Arabic Text Categorization

    NARCIS (Netherlands)

    Syiam, M.M.; Tolba, Mohamed F.; Fayed, Z.T.; Abdel-Wahab, Mohamed S.; Ghoniemy, Said A.; Habib, Mena Badieh

    Text Categorization (classification) is the process of classifying documents into a predefined set of categories based on their content. In this paper, an intelligent Arabic text categorization system is presented. Machine learning algorithms are used in this system. Many algorithms for stemming and

  14. Expected Classification Accuracy

    Directory of Open Access Journals (Sweden)

    Lawrence M. Rudner

    2005-08-01

    Full Text Available Every time we make a classification based on a test score, we should expect some number of misclassifications. Some examinees whose true ability is within a score range will have observed scores outside of that range. A procedure for providing a classification table of true and expected scores is developed for polytomously scored items under item response theory and applied to state assessment data. A simplified procedure for estimating the table entries is also presented.

  15. Transcending the Versification of Oraliture: Song-Text as Oral Performance among the Ilaje

    Directory of Open Access Journals (Sweden)

    N. Akingbe

    2013-12-01

    Full Text Available Oraliture is a term that is often employed to describe the various genres of oral literature such as proverbs, legends, short stories, traditional songs and rhymes, song-poems, historical narratives, traditional symbols, images, oral performance, myths and other traditional stylistic devices. All these devices constitute vibrant appurtenances of oral narrative performance in Africa. Oral narrative performance is invariably situated within the domain of social communication, which brings together the raconteur/performer and the audience towards the realisation of communal entertainment. While the narrator/performer plays the leading role in an oral performance, the audience’s involvement and participation are realised through song, verbal/choral responses, gestures and/or instrumental/musical accompaniment. This oral practice usually takes place at one time or another in various African communities during festivals and ritual/religious processions, and ranges from story-telling and recitation of poems to song-text and dancing. This paper is essentially concerned with illustrating the use of song-text as oral performance among the Ilaje, a burgeoning coastal subethnic group of the Yoruba in South Western Nigeria. The paper further examines how patriotism, history, death and anti-social behaviours are evaluated through the use of songs among the Ilaje.

  16. Science and Technology Text Mining Basic Concepts

    National Research Council Canada - National Science Library

    Losiewicz, Paul

    2003-01-01

    ...). It then presents some of the most widely used data and text mining techniques, including clustering and classification methods, such as nearest neighbor, relational learning models, and genetic...

  17. Packet Classification by Multilevel Cutting of the Classification Space: An Algorithmic-Architectural Solution for IP Packet Classification in Next Generation Networks

    Directory of Open Access Journals (Sweden)

    Motasem Aldiab

    2008-01-01

    Full Text Available Traditionally, the Internet provides only a “best-effort” service, treating all packets going to the same destination equally. However, providing differentiated services for different users based on their quality requirements is increasingly becoming a demanding issue. For this, routers need to have the capability to distinguish and isolate traffic belonging to different flows. This ability to determine the flow each packet belongs to is called packet classification. Technology vendors are reluctant to support algorithmic solutions for classification due to their nondeterministic performance. Although content addressable memories (CAMs) are favoured by technology vendors due to their deterministic high-lookup rates, they suffer from the problems of high-power consumption and high-silicon cost. This paper provides a new algorithmic-architectural solution for packet classification that mixes CAMs with algorithms based on multilevel cutting of the classification space into smaller spaces. The provided solution utilizes the geometrical distribution of rules in the classification space. It provides the deterministic performance of CAMs, support for dynamic updates, and added flexibility for system designers.

  18. Stable classification of the energy-momentum tensor. Summary

    International Nuclear Information System (INIS)

    Guzman-Sanchez, A.R.; Przanowski, M.; Plevansky, J.

    1990-01-01

    Starting with the algebraic classification of the energy-momentum tensor given by Plebansky, it is established that this classification is unstable under versal deformations and a new (stable) classification is given. In order to keep the text to reasonable length, we just write the basic ideas and some results. (Author)

  19. PRIMERA SISTEMATIZACIÓN DE LAS CARACTERÍSTICAS ESTILÍSTICAS DE LA ALFARERÍA FINA DEL SITIO SORIA 2 (VALLE DE YOCAVIL, NOROESTE ARGENTINO) / First systematization of stylistic characters of fine pottery from Soria 2 site (Yocavil, Northwestern Argentina)

    Directory of Open Access Journals (Sweden)

    Romina Clara Spano

    2011-12-01

    Full Text Available This paper presents a first systematization of the features of early pottery found at the site Soria 2 (Yocavil Valley, Northwestern Argentina), focusing the analysis on specimens belonging to the so-called fine pottery group. We aim to characterize a sample of the abundant ceramic material found in a primary context, for which there is a radiocarbon date from the beginning of the Christian era. The material is classified using the category of style, understood here as the integration of morphological, technological and decorative aspects, which converge in the “ways of doing” current during the occupation of the site. The analytical variables used are detailed: shape, manufacturing technique, paste, firing, surface treatment and decoration. The conjunction of these variables is the basis for proposing stylistic modalities. Additionally, the practices in which the vessels were involved are briefly discussed, taking into account the find contexts (domestic and funerary). The analysis suggests that some specimens in the studied sample show affinities with surrounding areas. Keywords: pottery; stylistic modalities; primary context; Formative; Yocavil valley.

  20. Towards secondary fingerprint classification

    CSIR Research Space (South Africa)

    Msiza, IS

    2011-07-01

    Full Text Available an accuracy figure of 76.8%. This small difference between the two figures is indicative of the validity of the proposed secondary classification module. Keywords: fingerprint core; fingerprint delta; primary classification; secondary classification I..., namely, the fingerprint core and the fingerprint delta. Forensically, a fingerprint core is defined as the innermost turning point where the fingerprint ridges form a loop, while the fingerprint delta is defined as the point where these ridges form a...

  1. An edit script for taxonomic classifications

    Directory of Open Access Journals (Sweden)

    Valiente Gabriel

    2005-08-01

    Full Text Available Abstract Background The NCBI taxonomy provides one of the most powerful ways to navigate sequence databases, but currently users are forced to formulate queries according to a single taxonomic classification. Given that there is no universal agreement on the classification of organisms, providing a single classification places constraints on the questions biologists can ask. However, maintaining multiple classifications is burdensome in the face of a constantly growing NCBI classification. Results In this paper, we present a solution to the problem of generating modifications of the NCBI taxonomy, based on the computation of an edit script that summarises the differences between two classification trees. Our algorithms find the shortest possible edit script based on the identification of all shared subtrees, and take time only quasi-linear in the size of the trees because classification trees have unique node labels. Conclusion These algorithms have been recently implemented, and the software is freely available for download from http://darwin.zoology.gla.ac.uk/~rpage/forest/.
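
    As a rough illustration of the idea in this abstract, the sketch below compares two classification trees whose node labels are unique and lists the operations that turn one into the other. The parent-map representation, the operation names and the `edit_script` function are assumptions made for this example; it is a naive label-by-label diff and does not reproduce the authors' shortest-script algorithm based on shared subtrees.

```python
def edit_script(old, new):
    """Return a list of operations turning tree `old` into tree `new`.

    Each tree is a dict mapping a unique node label to its parent label
    (the root maps to None). Hypothetical representation for illustration.
    """
    ops = []
    # Labels present only in the old tree are deleted.
    for label in sorted(old.keys() - new.keys()):
        ops.append(("delete", label))
    # Labels present only in the new tree are inserted under their new parent.
    for label in sorted(new.keys() - old.keys()):
        ops.append(("insert", label, new[label]))
    # Labels present in both trees but with a different parent are moved.
    for label in sorted(old.keys() & new.keys()):
        if old[label] != new[label]:
            ops.append(("move", label, new[label]))
    return ops


# Made-up toy trees, not taken from the NCBI taxonomy.
old = {"root": None, "Bacteria": "root", "Archaea": "root", "X": "Bacteria"}
new = {"root": None, "Bacteria": "root", "Archaea": "root",
       "X": "Archaea", "Y": "Archaea"}
script = edit_script(old, new)
print(script)
```

    Because labels are unique, matching nodes between the two trees is a dictionary lookup, which is the property the abstract credits for the near-linear running time.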

  2. Text genres and registers the computation of linguistic features

    CERN Document Server

    Fang, Chengyu Alex

    2015-01-01

    This book is a description of some of the most recent advances in text classification as part of a concerted effort to achieve computer understanding of human language. In particular, it addresses state-of-the-art developments in the computation of higher-level linguistic features, ranging from etymology to grammar and syntax for the practical task of text classification according to genres, registers and subject domains. Serving as a bridge between computational methods and sophisticated linguistic analysis, this book will be of particular interest to academics and students of computational linguistics as well as professionals in natural language engineering.

  3. Analysis of Tense Interferential of Verbs in Old Narrative Texts

    Directory of Open Access Journals (Sweden)

    Mahmood Barati khansari

    2014-08-01

    Full Text Available Abstract One of the admirable devices for composing stories in Persian verse and prose is the use of present-tense verbs with past-tense meaning. This grammatical point has escaped the attention of grammarians and stylists: although it occurs repeatedly in the texts, it is not mentioned in the grammar books, and only some researchers and literati have pointed it out in their critical editions. We cite their remarks. First, Allame Qazvini tentatively mentions the interference of verb tenses and the inconsistency of tenses in his edition of Joveini's Jahangoshaye. He writes in the second footnote 2-3 that the verb Mikonam (I do) is present tense in form but simple past in meaning, and that in most old books the form of the verb is present tense while its meaning is simple past (Joveini, 1367, p. 357). Later, Fruzanfar, in his edition of the grammatical notes of Ouhadoddin Kermani's Manaqeb, notes this point and counts it as an instance of the literary art of Eltefat (Fruzanfar, 1347, p. 61). Mohammad Roushan was aware of this grammatical rule and writes in the introduction to his book that the application of this kind of verb is not based on dependent and independent verbs (Khagushi, 1361, p. 24). Yusofi, in his edition of the Bidpay Stories, points to this grammatical point, which had escaped earlier editors of the book. He says that it is a characteristic of the book's prose and adds that the characteristic is found in the present stories (Yusofi, 1364, p. 36). Finally, Dr. Shfi'ee, in his valuable notes on the Mateqol alteir, mentions that this style of storytelling, with the verb in the present tense, is less common in verse, but that verbs of the same meaning and form were used in old Persian as in the present time, although there were inconsistencies in the tense and form of verbs in the past and the grammarians

  4. The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text.

    Science.gov (United States)

    Krallinger, Martin; Vazquez, Miguel; Leitner, Florian; Salgado, David; Chatr-Aryamontri, Andrew; Winter, Andrew; Perfetto, Livia; Briganti, Leonardo; Licata, Luana; Iannuccelli, Marta; Castagnoli, Luisa; Cesareni, Gianni; Tyers, Mike; Schneider, Gerold; Rinaldi, Fabio; Leaman, Robert; Gonzalez, Graciela; Matos, Sergio; Kim, Sun; Wilbur, W John; Rocha, Luis; Shatkay, Hagit; Tendulkar, Ashish V; Agarwal, Shashank; Liu, Feifan; Wang, Xinglong; Rak, Rafal; Noto, Keith; Elkan, Charles; Lu, Zhiyong; Dogan, Rezarta Islamaj; Fontaine, Jean-Fred; Andrade-Navarro, Miguel A; Valencia, Alfonso

    2011-10-03

    Determining usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations, trying to address aspects including how the end user would oversee the generated output, for instance by providing ranked results, textual evidence for human interpretation or measuring time savings by using automated systems. Detecting articles describing complex biological events like PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts. Therefore the BCIII-ACT corpus was provided, which includes a training, development and test set of over 12,000 PPI relevant and non-relevant PubMed abstracts labeled manually by domain experts and recording also the human classification times. The Interaction Method Task (IMT) went beyond abstracts and required mining for associations between more than 3,500 full text articles and interaction detection method ontology concepts that had been applied to detect the PPIs reported in them. A total of 11 teams participated in at least one of the two PPI tasks (10 in ACT and 8 in the IMT) and a total of 62 persons were involved either as participants or in preparing data sets/evaluating these tasks. Per task, each team was allowed to submit five runs offline and another five online via the BioCreative Meta-Server. From the 52 runs submitted for the ACT, the highest Matthews Correlation Coefficient (MCC) score measured was 0.55 at an accuracy of 89% and the best AUC iP/R was 68%. Most ACT teams explored machine learning methods, some of them also used lexical resources like MeSH terms, PSI-MI concepts or particular lists of verbs and nouns, some integrated NER approaches. For the IMT, a total of 42 runs were evaluated by comparing
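
    The abstract reports both accuracy and the Matthews correlation coefficient for the article classification task. The sketch below shows how the two metrics are computed from a binary confusion matrix and why they can diverge on unbalanced data. The counts are invented for illustration and are not BioCreative III results; the function names are assumptions of this example.

```python
import math


def accuracy(tp, fp, fn, tn):
    """Fraction of all items that were labeled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical, unbalanced test set: 100 relevant and 900 non-relevant items.
tp, fp, fn, tn = 50, 50, 50, 850
print(f"accuracy = {accuracy(tp, fp, fn, tn):.2f}")
print(f"MCC      = {mcc(tp, fp, fn, tn):.2f}")
```

    With these made-up counts the classifier is right on 90% of the items, yet its MCC is well below 0.5, because most of the correct decisions come from the large negative class.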

  5. The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text

    Science.gov (United States)

    2011-01-01

    Background Determining usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations, trying to address aspects including how the end user would oversee the generated output, for instance by providing ranked results, textual evidence for human interpretation or measuring time savings by using automated systems. Detecting articles describing complex biological events like PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts. Therefore the BCIII-ACT corpus was provided, which includes a training, development and test set of over 12,000 PPI relevant and non-relevant PubMed abstracts labeled manually by domain experts and recording also the human classification times. The Interaction Method Task (IMT) went beyond abstracts and required mining for associations between more than 3,500 full text articles and interaction detection method ontology concepts that had been applied to detect the PPIs reported in them. Results A total of 11 teams participated in at least one of the two PPI tasks (10 in ACT and 8 in the IMT) and a total of 62 persons were involved either as participants or in preparing data sets/evaluating these tasks. Per task, each team was allowed to submit five runs offline and another five online via the BioCreative Meta-Server. From the 52 runs submitted for the ACT, the highest Matthews Correlation Coefficient (MCC) score measured was 0.55 at an accuracy of 89% and the best AUC iP/R was 68%. Most ACT teams explored machine learning methods, some of them also used lexical resources like MeSH terms, PSI-MI concepts or particular lists of verbs and nouns, some integrated NER approaches. For the IMT, a total of 42 runs were

  6. CNN for breaking text-based CAPTCHA with noise

    Science.gov (United States)

    Liu, Kaixuan; Zhang, Rong; Qing, Ke

    2017-07-01

    A CAPTCHA ("Completely Automated Public Turing test to tell Computers and Humans Apart") system is a program that most humans can pass but current computer programs can hardly pass. As the most common type of CAPTCHA, the text-based CAPTCHA has been widely used on websites to defend against network bots. In order to break text-based CAPTCHAs, in this paper two trained CNN models are connected for the segmentation and classification of CAPTCHA images. Then, based on these two models, we apply sliding window segmentation and voting classification methods to realize an end-to-end CAPTCHA breaking system with a high success rate. The experimental results show that our method is robust and effective in breaking text-based CAPTCHAs with noise.

  7. Multi-label literature classification based on the Gene Ontology graph

    Directory of Open Access Journals (Sweden)

    Lu Xinghua

    2008-12-01

    Full Text Available Abstract Background The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. Results In this research, we investigated the methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We have studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. Conclusion Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate
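
    To make the notion of top-down hierarchical classification concrete, here is a minimal sketch of a generic top-down descent over a small term graph. It is not the authors' algorithm: the term names, the per-term scores (stand-ins for the outputs of trained per-term classifiers) and the threshold are all hypothetical.

```python
def top_down_predict(children, scores, root, threshold=0.5):
    """Descend from `root`, keeping terms whose score reaches `threshold`.

    A term is only examined if one of its parents was accepted, so every
    predicted term is connected to the root through accepted terms.
    """
    accepted = []
    seen = {root}
    frontier = [root]
    while frontier:
        node = frontier.pop()
        for child in children.get(node, []):
            if child in seen:
                continue  # already examined via another parent
            seen.add(child)
            if scores.get(child, 0.0) >= threshold:
                accepted.append(child)
                frontier.append(child)
    return sorted(accepted)


# Hypothetical toy graph and classifier scores for one document.
children = {
    "root": ["binding", "catalysis"],
    "binding": ["protein binding", "DNA binding"],
    "catalysis": ["kinase activity"],
}
scores = {
    "binding": 0.9,
    "catalysis": 0.2,
    "protein binding": 0.7,
    "DNA binding": 0.3,
    "kinase activity": 0.95,
}
predicted = top_down_predict(children, scores, "root")
print(predicted)
```

    In this toy run "kinase activity" is never examined, despite its high score, because its parent was rejected. A flat multi-label classifier would have no such constraint.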

  8. KUR'N'DA TEŞBİHLİ ANLATIM ÜSLÛBU

    Directory of Open Access Journals (Sweden)

    Muhammed Aydın

    2001-12-01

    Full Text Available There are various stylistic features in the Qur'an. These features contribute greatly to the understanding and the clarification of the Qur'an. One of the most significant stylistic features is certainly the similitude. The similitude is a very helpful way for the remembrance of the Qur'anic data. In this article we concentrate on the importance of this stylistic device in the Qur'an. In addition, we try to show that the similitude plays a very important role in the understanding of the Glorious Qur'an.

  9. Vietnamese Document Representation and Classification

    Science.gov (United States)

    Nguyen, Giang-Son; Gao, Xiaoying; Andreae, Peter

    Vietnamese is very different from English and little research has been done on Vietnamese document classification, or indeed, on any kind of Vietnamese language processing, and only a few small corpora are available for research. We created a large Vietnamese text corpus with about 18000 documents, and manually classified them based on different criteria such as topics and styles, giving several classification tasks of different difficulty levels. This paper introduces a new syllable-based document representation at the morphological level of the language for efficient classification. We tested the representation on our corpus with different classification tasks using six classification algorithms and two feature selection techniques. Our experiments show that the new representation is effective for Vietnamese categorization, and suggest that best performance can be achieved using syllable-pair document representation, an SVM with a polynomial kernel as the learning algorithm, and using Information gain and an external dictionary for feature selection.
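
    A syllable-pair representation of the kind mentioned above can be sketched as follows. The sketch relies on the fact that written Vietnamese separates syllables with spaces, so whitespace splitting yields syllables and adjacent pairs can be counted as features. The function name and the sample string are assumptions of this example, and the paper's actual preprocessing and feature selection are not reproduced.

```python
from collections import Counter


def syllable_pairs(text):
    """Count adjacent syllable pairs in a whitespace-delimited text."""
    syllables = text.lower().split()
    # zip pairs each syllable with its successor: (s0, s1), (s1, s2), ...
    return Counter(zip(syllables, syllables[1:]))


# Artificial sample string, chosen only to show repeated pairs.
doc = "học sinh học sinh học"
features = syllable_pairs(doc)
for pair, count in sorted(features.items()):
    print(pair, count)
```

    The resulting counts could then be fed to any vector-based learner, such as the SVM the abstract reports as performing best.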

  10. Fuzzy One-Class Classification Model Using Contamination Neighborhoods

    Directory of Open Access Journals (Sweden)

    Lev V. Utkin

    2012-01-01

    Full Text Available A fuzzy classification model is studied in the paper. It is based on the contaminated (robust model which produces fuzzy expected risk measures characterizing classification errors. Optimal classification parameters of the models are derived by minimizing the fuzzy expected risk. It is shown that an algorithm for computing the classification parameters is reduced to a set of standard support vector machine tasks with weighted data points. Experimental results with synthetic data illustrate the proposed fuzzy model.

  11. Retrogressive harmonic motion as structural and stylistic characteristic of pop-rock music

    Science.gov (United States)

    Carter, Paul S.

    The central issue addressed in this dissertation is that of progressive and retrogressive harmonic motion as it is utilized in the repertoire of pop-rock music. I believe that analysis in these terms may prove to be a valuable tool for the understanding of the structure, style and perception of this music. Throughout my study of this music, various patterns of progressive and retrogressive harmonic motions within a piece reveal a kind of musical character about it, a character on which much of a work's style, organization and extramusical nature often depends. Several influential theorists, especially Jean-Philippe Rameau, Hugo Riemann, and Arnold Schoenberg, have addressed the issues of functional harmony and the nature of the motion between chords of a tonal harmonic space. After assessing these views, I have found that it is possible to differentiate between two fundamental types of harmonic motions. This difference, one that I believe is instrumental in characterizing pop-rock music, is the basis for the analytical perspective I wish to embrace. After establishing a method of evaluating tonal harmonic root motions in these terms, I wish to examine a corpus of this music in order to discover what a characterization of its harmonic motion may reveal about each piece. Determining this harmonic character may help to establish structural and stylistic traits for that piece, its genre, composer, period, or even its sociological purpose. Conclusions may then be drawn regarding the role these patterns play in defining musical style traits of pop-rock. Partly as a tool for serving the study mentioned above, I develop a graphical method of accounting for root motion that I name the tonal "Space-Plot". This apparatus allows the analyst to measure several facets of the harmonic motion of the music, and to see a wide scope of relations in and around a diatonic key.

  12. Assessing Unmet Information Needs of Breast Cancer Survivors: Exploratory Study of Online Health Forums Using Text Classification and Retrieval.

    Science.gov (United States)

    McRoy, Susan; Rastegar-Mojarad, Majid; Wang, Yanshan; Ruddy, Kathryn J; Haddad, Tufia C; Liu, Hongfang

    2018-05-15

    Patient education materials given to breast cancer survivors may not be a good fit for their information needs. Needs may change over time, be forgotten, or be misreported, for a variety of reasons. An automated content analysis of survivors' postings to online health forums can identify expressed information needs over a span of time and be repeated regularly at low cost. Identifying these unmet needs can guide improvements to existing education materials and the creation of new resources. The primary goals of this project are to assess the unmet information needs of breast cancer survivors from their own perspectives and to identify gaps between information needs and current education materials. This approach employs computational methods for content modeling and supervised text classification to data from online health forums to identify explicit and implicit requests for health-related information. Potential gaps between needs and education materials are identified using techniques from information retrieval. We provide a new taxonomy for the classification of sentences in online health forum data. 260 postings from two online health forums were selected, yielding 4179 sentences for coding. After annotation of data and training alternative one-versus-others classifiers, a random forest-based approach achieved F1 scores from 66% (Other, dataset2) to 90% (Medical, dataset1) on the primary information types. 136 expressions of need were used to generate queries to indexed education materials. Upon examination of the best two pages retrieved for each query, 12% (17/136) of queries were found to have relevant content by all coders, and 33% (45/136) were judged to have relevant content by at least one. Text from online health forums can be analyzed effectively using automated methods. 
Our analysis confirms that breast cancer survivors have many information needs that are not covered by the written documents they typically receive, as our results suggest that at most

  13. Optimizing tree-species classification in hyperspectral images

    CSIR Research Space (South Africa)

    Barnard, E

    2010-11-01

    Full Text Available for classification. Scaling of these components so that all features have equal variance is found to be useful, and their best performance (88.9% accurate classification) is achieved with 15 scaled features and a support vector machine as classifier. A graphical...

  14. Sentiment classification of Roman-Urdu opinions using Naïve Bayesian, Decision Tree and KNN classification techniques

    Directory of Open Access Journals (Sweden)

    Muhammad Bilal

    2016-07-01

    Full Text Available Sentiment mining is a field of text mining that determines the attitude of people towards a particular product, topic or politician in newsgroup posts, review sites, comments on Facebook posts, Twitter, etc. There are many issues involved in opinion mining. One important issue is that opinions could be in different languages (English, Urdu, Arabic, etc.). Tackling each language according to its orientation is a challenging task. Most of the research work in sentiment mining has been done on the English language. Currently, limited research is being carried out on sentiment classification of other languages like Arabic, Italian, Urdu and Hindi. In this paper, three classification models are used for text classification using the Waikato Environment for Knowledge Analysis (WEKA). Opinions written in Roman-Urdu and English are extracted from a blog. These extracted opinions are documented in text files to prepare a training dataset containing 150 positive and 150 negative opinions as labeled examples. A testing data set is supplied to the three different models and the results in each case are analyzed. The results show that Naïve Bayesian outperformed Decision Tree and KNN in terms of accuracy, precision, recall and F-measure.

  15. Stylistics of Khaje Abd ol-Lah Ansari’s Epistles

    Directory of Open Access Journals (Sweden)

    Azadeh Poode

    2014-07-01

    Full Text Available Abstract Stylistics is a field of knowledge to which writers and speakers have paid particular attention. A writer's literary style shows how he expresses his thought and is the key to how his speech conveys meaning to the addressee's mind. The type of words, structures and sentences and the way of interpreting meaning are elements of literary style. The influence of Khaje Abd ol-Lah Ansari's words on Sufi didactic literature is well known among literary scholars. The integration of deeply mystical concepts with an eloquent, song-like style has made Khaje's works eternal, so stylistic research on his works can bring out the fine points of their literary aesthetics. Besides Al-Sufi categories, his five epistles are the most preferred among his works and are described in this study. Khaje Abd ol-Lah Ansari is among the celebrities of mysticism and Persian literature; in addition to his numerous writings, he deserves analysis in this respect in order to study his methods in writing his Sufi didactic works, especially the five epistles Kanz ol-Salekin, Vareda'at, Del va Jan, Haft Hesar and Ghalandar nama, and the pattern they provided for later works. At the lexical level, Khaje Abd ol-Lah's style does not have a manifest feature. His only lexical feature is repetition, which is seen at three levels: letter, word and sentence. Sometimes he repeats a word in several consecutive sentences and even over several pages. Old words and sounds are very few in the epistles; this simplicity of language relates to the addressees, who are common people, and to the didactic nature of his works. At the linguistic and literary level, the epistles are closer to the second period of Persian prose than to the first, and there is no sign of archaism in these works. Arabic words are used moderately, and most of them are used in their modern sense. At the syntactic level, he has used prefixed verbs, especially "Dar" and

  16. Active Authentication Linguistic Modalities

    Science.gov (United States)

    2013-12-01

    collected dataset. We broadly categorize these sensors according to the degree of conscious cognitive involvement measured by the sensors. The distinction... stylistically closest candidate author to unknown writings. In an unsupervised setting, a set of writings whose authorship is unknown are classified into style...unknown text is classified by a unary author-specific classifier. The text is attributed to an author if and only if it is stylistically close enough to

  17. Communicating English for Science and Technology

    DEFF Research Database (Denmark)

    Mousten, Birthe

    The book introduces and discusses some of the ideas, stylistics, methods, aids and conventions used in English for Science and Technology. The book centres on a mix of theoretical considerations, examples, drills and texts.

  18. Radar transmitter classification using non-stationary signal classifier

    CSIR Research Space (South Africa)

    Du Plessis, MC

    2009-07-01

    Full Text Available support vector machine which is applied to the radar pulse's time-frequency representation. The time-frequency representation is refined using particle swarm optimization to increase the classification accuracy. The classification accuracy is tested...

  19. Raw materials resources classification and characterisation for ...

    African Journals Online (AJOL)

    Raw materials resources classification and characterisation for ceramic tableware production in Nigeria. PSA Irabor. Abstract. No Abstract. Journal of Applied Science, Engineering and Technology Vol. 2(1) 2002: 48-52.

  20. Classification of the regional aborigine ceramic production and its distribution in the central region of Cuba based on NAA and EPMA

    International Nuclear Information System (INIS)

    Padilla Lavarez, Roman; Celaya Gonzalez, Miriam; Godo Rodriguez, Pedro Pablo; Pla, Rita Rosa; Montoya Rossi, Eduardo; Van Espen, Pierr M.

    2001-01-01

    Archaeological research in Cuba has been oriented mainly to the study of the eastern and western territories of the island: the former province of Oriente, considered the location with the highest density of Agroalfarero communities, and the former provinces of Pinar del Rio, La Habana and Matanzas, where mostly Archaic sites have been discovered. However, despite its rich diversity and relevance, the central region of Cuba has been only partially and incompletely studied. Diverse social and economic structures developed in this region at the heart of the Cuban archipelago: it was the centre of confluence where communities arriving from the eastern territories, bringing strong agricultural and pottery traditions, interacted with the Archaic groups established there centuries earlier, so that successive transculturation and assimilation processes must have taken place. The classification of the aboriginal pottery manufacture of the central region into three major stylistic variations (Jagua, Yayabo and Yaguajay) is described in detail, following the latest interpretations of Celaya and Godo (1998) on the origin and formal transformation of the themes represented in the pottery. The suitability of combining Neutron Activation Analysis and Electron Probe Microanalysis for establishing a compositional classification of archaeological pottery is discussed; the resulting compositional groups served as new evidence for the discussion of ceramic production and regional interaction between the Subtaino and Archaic settlements.

  1. RESEARCH OF CLASSIFICATION FEATURES OF THE FINANCIAL CONTROL

    Directory of Open Access Journals (Sweden)

    Knarik K. Arabyan

    2013-01-01

    Full Text Available One of the major problems in financial control theory is the improvement of its classification features. There is no consensus on the classification of the forms and methods of financial control. This hinders the development of methodology and the investigation of other issues in financial control theory. In the article, the author summarizes scientists' approaches to studying the classification features of financial control.

  2. Classifications of patterned hair loss: a review

    Directory of Open Access Journals (Sweden)

    Mrinal Gupta

    2016-01-01

    Full Text Available Patterned hair loss is the most common cause of hair loss seen in both the sexes after puberty. Numerous classification systems have been proposed by various researchers for grading purposes. These systems vary from the simpler systems based on recession of the hairline to the more advanced multifactorial systems based on the morphological and dynamic parameters that affect the scalp and the hair itself. Most of these preexisting systems have certain limitations. Currently, the Hamilton-Norwood classification system for males and the Ludwig system for females are most commonly used to describe patterns of hair loss. In this article, we review the various classification systems for patterned hair loss in both the sexes. Relevant articles were identified through searches of MEDLINE and EMBASE. Search terms included but were not limited to androgenic alopecia classification, patterned hair loss classification, male pattern baldness classification, and female pattern hair loss classification. Further publications were identified from the reference lists of the reviewed articles.

  3. Five-way Smoking Status Classification Using Text Hot-Spot Identification and Error-correcting Output Codes

    OpenAIRE

    Cohen, Aaron M.

    2008-01-01

    We participated in the i2b2 smoking status classification challenge task. The purpose of this task was to evaluate the ability of systems to automatically identify patient smoking status from discharge summaries. Our submission included several techniques that we compared and studied, including hot-spot identification, zero-vector filtering, inverse class frequency weighting, error-correcting output codes, and post-processing rules. We evaluated our approaches using the same methods as the i2...

  4. Experiments on Supervised Learning Algorithms for Text Categorization

    Science.gov (United States)

    Namburu, Setu Madhavi; Tu, Haiying; Luo, Jianhui; Pattipati, Krishna R.

    2005-01-01

    Modern information society is facing the challenge of handling massive volumes of online documents, news, intelligence reports, and so on. How to use the information accurately and in a timely manner becomes a major concern in many areas. While the general information may also include images and voice, we focus on the categorization of text data in this paper. We provide a brief overview of the information processing flow for text categorization, and discuss two supervised learning algorithms, viz., support vector machines (SVM) and partial least squares (PLS), which have been successfully applied in other domains, e.g., fault diagnosis [9]. While SVM has been well explored for binary classification and was reported as an efficient algorithm for text categorization, PLS has not yet been applied to text categorization. Our experiments are conducted on three data sets: the Reuters-21578 dataset about corporate mergers and data acquisitions (ACQ), WebKB and the 20-Newsgroups. Results show that the performance of PLS is comparable to SVM in text categorization. A major drawback of SVM for multi-class categorization is that it requires a voting scheme based on the results of pair-wise classification. PLS does not have this drawback and could be a better candidate for multi-class text categorization.
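
    The pair-wise voting scheme mentioned in this abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the toy one-dimensional "classifier" and the class labels other than ACQ are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def pairwise_vote(classes, binary_decide, x):
    """Return the class that wins the most pairwise contests for input x.

    binary_decide(a, b, x) stands in for a trained binary classifier
    (for example an SVM) that returns either a or b.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[binary_decide(a, b, x)] += 1
    # Ties are broken by the order of `classes`.
    return max(classes, key=lambda c: votes[c])


# Hypothetical stand-in for trained binary classifiers: each class has a
# 1-D prototype and the closer prototype wins the pairwise contest.
PROTOTYPES = {"acq": 0.0, "earn": 5.0, "crude": 10.0}


def toy_decide(a, b, x):
    return a if abs(x - PROTOTYPES[a]) <= abs(x - PROTOTYPES[b]) else b


predicted = pairwise_vote(list(PROTOTYPES), toy_decide, 6.2)
print(predicted)
```

    With k classes this scheme needs k(k-1)/2 binary decisions per document, which is the overhead the abstract contrasts with PLS.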

  5. Issues surrounding the classification of accounting information

    Directory of Open Access Journals (Sweden)

    Huibrecht Van der Poll

    2011-06-01

    Full Text Available The act of classifying information created by accounting practices is ubiquitous in the accounting process; from recording to reporting, it has almost become second nature. The classification has to correspond to the requirements and demands of the changing environment in which it is practised. Evidence suggests that the current classification of items in financial statements is not keeping pace with the needs of users and the new financial constructs generated by the industry. This study addresses the issue of classification in two ways: by means of a critical analysis of classification theory and practices and by means of a questionnaire that was developed and sent to compilers and users of financial statements. A new classification framework for accounting information in the balance sheet and income statement is proposed.

  6. Classification of Cortical Brain Malformations

    Directory of Open Access Journals (Sweden)

    J Gordon Millichap

    2008-03-01

    Full Text Available Clinical, radiological, and genetic classifications of 113 cases of malformations of cortical development (MCD) were evaluated at the Erasmus Medical Center-Sophia Children's Hospital, Rotterdam, the Netherlands.

  7. Some Issues in the Automatic Classification of U.S. Patents Working Notes for the AAAI-98 Workshop on Learning for Text Categorization

    National Research Council Canada - National Science Library

    Larkey, Leah

    1998-01-01

    The classification of U.S. patents poses some special problems due to the enormous size of the corpus, the size and complex hierarchical structure of the classification system, and the size and structure of patent documents...

  8. AUTHORS' STRATEGIES IN THE ENGLISH TEXT (ON THE EXAMPLE OF THE TRILOGY "LORD OF THE RINGS" BY J. R. R. TOLKIEN)

    Directory of Open Access Journals (Sweden)

    Suvorova, P.E.

    2016-06-01

    Full Text Available Authors of the article proceed from the assumption that a certain author's orientation in the selection of the artistic worldbuilding can be revealed in any text. This concerns not the stylistic techniques and choice of the artistic means but the detection of formation of strategy and division of a literary text. It describes the process of creating a space-time organization of the text according to the author's task and intention. It is proved that the landscape in the time continuum in the trilogy of J. R. R. Tolkien forms a certain visual impression and transmits information of the associative (psychological type, meaning that, the author's textual strategy is aimed at identifying links between internal and external information about the space-time picture of the world. The authors come to the conclusion about three main strategies in the creation of a space-time model in a trilogy of J. R. R. Tolkien simultaneously implemented in the creation of the visual image, transfer of information of associative type, usage of the landscape and temporal situation description for the implementation and disclosure of basic idea of the work.

  9. Proposal plan of classification faceted for federal universities

    Directory of Open Access Journals (Sweden)

    Renata Santos Brandão

    2017-09-01

    Full Text Available This study aims to present a faceted classification plan for the archival management of documents in the federal universities of Brazil. For this, a literature review was carried out on archival management in Brazil, the types of classification plans and Ranganathan's theory of faceted classification, through searches in databases in the areas of Librarianship and Archivology. The classification plan used in the Federal Institutions of Higher Education was identified to represent the functional facet, and a structural classification plan was created to represent the structural facet. The two classification plans were inserted into a digital repository management system to give rise to the faceted classification plan. The system used was Tainacan, free WordPress-based software used in digital document management. The developed faceted classification plan allows the user to choose and even combine ways of searching for information, which ensures greater efficiency in information retrieval.

  10. Automatic Hierarchical Color Image Classification

    Directory of Open Access Journals (Sweden)

    Jing Huang

    2003-02-01

    Full Text Available Organizing images into semantic categories can be extremely useful for content-based image retrieval and image annotation. Grouping images into semantic classes is a difficult problem, however. Image classification attempts to solve this hard problem by using low-level image features. In this paper, we propose a method for hierarchical classification of images via supervised learning. This scheme relies on using a good low-level feature and subsequently performing feature-space reconfiguration using singular value decomposition to reduce noise and dimensionality. We use the training data to obtain a hierarchical classification tree that can be used to categorize new images. Our experimental results suggest that this scheme not only performs better than standard nearest-neighbor techniques, but also has both storage and computational advantages.

  11. GLOBAL LAND COVER CLASSIFICATION USING MODIS SURFACE REFLECTANCE PRODUCTS

    Directory of Open Access Journals (Sweden)

    K. Fukue

    2016-06-01

    Full Text Available The objective of this study is to develop a high-accuracy land cover classification algorithm for the global scale by using multi-temporal MODIS land reflectance products. In this study, a time-domain co-occurrence matrix was introduced as a classification feature which provides a time-series signature of land covers. Further, a non-parametric minimum distance classifier was introduced for the time-domain co-occurrence matrix, which performs multi-dimensional pattern matching between the time-domain co-occurrence matrices of a classification target pixel and of each classification class. The global land cover classification experiments were conducted by applying the proposed classification method to 46 multi-temporal (in one year) SR (Surface Reflectance) and NBAR (Nadir BRDF-Adjusted Reflectance) products, respectively. The 17 IGBP land cover categories were used in our classification experiments. As a result, the SR and NBAR products showed similar classification accuracy of 99%.

  12. Saving our science from ourselves: the plight of biological classification

    Directory of Open Access Journals (Sweden)

    Malte C. Ebach

    2011-06-01

    Full Text Available Saving our science from ourselves: the plight of biological classification. Biological classification (nomenclature, taxonomy, and systematics) is being sold short. The desire for new technologies, faster and cheaper taxonomic descriptions, identifications, and revisions is symptomatic of a lack of appreciation and understanding of classification. The problem of gadget-driven science, a lack of best practice and the inability to accept classification as a descriptive and empirical science are discussed. The worst-case scenario is a future in which classifications are purely artificial and uninformative.

  13. Negation handling in sentiment classification using rule-based adapted from Indonesian language syntactic for Indonesian text in Twitter

    Science.gov (United States)

    Amalia, Rizkiana; Arif Bijaksana, Moch; Darmantoro, Dhinta

    2018-03-01

    The presence of a negation word can change the polarity of a text; if it is not handled properly, it will affect the performance of sentiment classification. Negation words in Indonesian are ‘tidak’, ‘bukan’, ‘belum’ and ‘jangan’. There are also conjunctions that can reverse the actual value, such as ‘tetapi’ or ‘tapi’. Unigram features have shortcomings in dealing with negation because they treat the negation word and the negated words as separate words. A general approach to negation handling in English text gives the tag ‘NEG_’ to the words following a negation until the first punctuation mark. But this may tag words that are not negated, and the approach does not handle negation and conjunction in one sentence. This study proposes a rule-based method that determines which words are negated by adapting the syntactic rules of negation in Indonesian to determine the scope of negation. Adapting the syntactic rules and tagging with “NEG_”, using an SVM classifier with an RBF kernel, gives better performance than the other experiments. In terms of the average F1-score, the proposed method improves on the baselines by 1.79% (baseline without negation handling) and 5% (baseline with existing negation handling) for a dataset in which all tweets contain negation words. For the second dataset, in which the tweets contain varying numbers of negation words, it improves on the baselines by 2.69% (without negation handling) and 3.17% (with existing negation handling).
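
    The general baseline described in this abstract (tag every word after a negation word with NEG_ until the first punctuation mark) can be sketched as below. This shows only that baseline, not the rule-based scope detection the study proposes. The function name, the tokenizer and the example sentence are assumptions made for illustration; the negation word list is the one given in the abstract.

```python
import re

# Indonesian negation words listed in the abstract.
NEGATION_WORDS = {"tidak", "bukan", "belum", "jangan"}


def tag_negation(text):
    """Prefix NEG_ to every word between a negation word and the next punctuation."""
    # Split into word tokens and single punctuation characters.
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    tagged = []
    in_scope = False
    for tok in tokens:
        if not re.match(r"\w", tok):
            # Any punctuation mark closes the negation scope.
            in_scope = False
            tagged.append(tok)
        elif tok in NEGATION_WORDS:
            in_scope = True
            tagged.append(tok)
        elif in_scope:
            tagged.append("NEG_" + tok)
        else:
            tagged.append(tok)
    return tagged


# Invented example sentence ("I do not like this film, but the actors are good.").
example = tag_negation("Saya tidak suka film ini, tapi aktornya bagus.")
print(example)
```

    In the example the comma happens to end the scope before ‘tapi’. Without it, every remaining word would be tagged, which is the over-tagging problem the abstract points out.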

  14. Analysis of Tense Interferential of Verbs in Old Narrative Texts

    Directory of Open Access Journals (Sweden)

    Amir Zeighami

    2014-07-01

    Full Text Available Abstract One of the admirable methods of composing stories in Persian verse and prose is the use of present tense verbs with a past tense meaning. This grammatical point has gone unnoticed by grammarians and stylists: although it occurs repeatedly in the texts, it is not mentioned in grammar books, but some researchers and literati have pointed it out in their critical editions. We cite their remarks. First, Allame Qazvini, with some doubt, mentions the interference of verb tenses and the inconsistency of tenses in his edition of the text of the Jahangoshaye Joveini. He writes in the second footnote 2-3 that the verb Mikonam (I do) is in the present tense form but its meaning is simple past tense. As has been observed, in most old books the form of the verb is present tense but its meaning is simple tense (Joveini, 1367, p. 357). Later, Fruzanfar, in the grammatical notes to his edition of Ouhadoddin Kermani's Manaqeb, points to this and counts it as part of the literary art of Eltefat (Fruzanfar, 1347, p. 61). Mohammad Roushan was aware of this grammatical rule and writes in the introduction of his book: the application of this kind of verb that is not on the basis of the dependent and independent verbs (Khagushi, 1361, p. 24). Yusofi, in his edition of the Bidpay Stories, points to this grammatical point, which had been hidden from the editors of the book. He says that this grammatical point is a characteristic of the book's prose. He adds that the characteristic includes in the present stories (Yusofi, 1364, p. 36). Finally, Dr. Shfi'ee in his valuable notes on the Mateqol altei their mentions that this style of telling stories (the verb in the present tense) is less common in verse, but the verbs in the same meaning and forms were used in old Persian as in the present time but there were inconsistence in the time and the form of the verbs in the past and

  15. Classifying Classifications

    DEFF Research Database (Denmark)

    Debus, Michael S.

    2017-01-01

    This paper critically analyzes seventeen game classifications. The classifications were chosen on the basis of diversity, ranging from pre-digital classification (e.g. Murray 1952), over game studies classifications (e.g. Elverdam & Aarseth 2007) to classifications of drinking games (e.g. LaBrie et al. 2013). The analysis aims at three goals: The classifications’ internal consistency, the abstraction of classification criteria and the identification of differences in classification across fields and/or time. Especially the abstraction of classification criteria can be used in future endeavors into the topic of game classifications.

  16. Technological and stylistic evaluation of the Early Bronze Age pottery at Tarsus-Gozlukule, Turkey: Pottery production and its interaction with economic, social, and cultural spheres

    Science.gov (United States)

    Unlu, Elif

    This dissertation presents a technological and stylistic assessment of Early Bronze Age pottery production at Tarsus-Gozlukule, a multi-period mound settlement located in the Cilician Plain in southern Turkey. Pottery production, like all other man-made objects, is firstly a technological act. This dissertation maintains that material style (involving formal, technical, and decorative choices expressed by the artisan) of an artifact should be investigated as a whole as such an integrative study would be the most adequate way of understanding economic circumstances, social representation, and cultural boundaries. To facilitate this integrative investigation, seventy-two samples of Early Bronze Age pottery excavated from Tarsus-Gozlukule in the 1930s and 1940s were selected for mineralogical, morphological, and chemical analyses. Petrographic and powder X-Ray Diffraction analyses were performed to determine the mineralogical makeup, Environmental Scanning Electron Microscope imagery was used to determine the morphology of these samples, and semi-quantitative Energy Dispersive X-Ray Spectroscopy analysis was performed on some samples to determine chemical properties of the clays. As a result of these scientific analyses various fabric groups were established. Afterwards formal shape and stylistic analysis was performed where shapes and surface treatments of the samples were analyzed and compared to the known local and non-local examples. Such an integrative approach to pottery production facilitates a better definition of the local pottery production process and enables an assessment of the technological know-how of the local pottery producers, their labor organization and its role within the operating markets, their function within the sociopolitical structure, and how such issues relate to the cultural boundaries within the community. Defining the paradigm of the local pottery production process leads to a broader investigation of issues related to the technological

  17. Empirical investigations into full-text protein interaction Article Categorization Task (ACT) in the BioCreative II.5 Challenge.

    Science.gov (United States)

    Lan, Man; Su, Jian

    2010-01-01

    The selection of protein interaction documents is one important application for biology research and has a direct impact on the quality of downstream BioNLP applications, i.e., information extraction and retrieval, summarization, QA, etc. The BioCreative II.5 Challenge Article Categorization task (ACT) involves doing a binary text classification to determine whether a given structured full-text article contains protein interaction information. This may be the first attempt at classification of full-text protein interaction documents in the wider community. In this paper, we compare and evaluate the effectiveness of different section types in full-text articles for text classification. Moreover, in practice, the small number of true-positive samples results in unstable performance and an unreliable classifier trained on them. Previous research on learning with skewed class distributions has altered the class distribution using oversampling and downsampling. We also investigate the skewed protein interaction classification and analyze the effect of various issues related to the choice of external sources, oversampling training sets, classifiers, etc. We report on the various factors above to show that 1) a full-text biomedical article contains a wealth of scientific information important to users that may not be completely represented by abstracts and/or keywords, which improves the accuracy performance of classification and 2) reinforcing true-positive samples significantly increases the accuracy and stability performance of classification.
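
    Oversampling of a skewed training set, as mentioned in this abstract, can be illustrated with plain random oversampling of the positive class. This is a generic sketch, not the authors' procedure; the function name, the fixed seed and the toy data are assumptions made for the example.

```python
import random


def oversample_minority(samples, labels, positive=1, seed=0):
    """Duplicate randomly chosen positive samples until both classes are the same size.

    Every label other than `positive` is treated as the negative class.
    """
    rng = random.Random(seed)
    pos = [s for s, y in zip(samples, labels) if y == positive]
    neg = [s for s, y in zip(samples, labels) if y != positive]
    if not pos or len(pos) >= len(neg):
        # Nothing to do: no positives to copy, or positives are not the minority.
        return list(samples), list(labels)
    extra = [rng.choice(pos) for _ in range(len(neg) - len(pos))]
    return list(samples) + extra, list(labels) + [positive] * len(extra)


# Toy data: one positive document among five.
docs = ["doc_a", "doc_b", "doc_c", "doc_d", "doc_e"]
labels = [1, 0, 0, 0, 0]
balanced_docs, balanced_labels = oversample_minority(docs, labels)
print(len(balanced_docs), balanced_labels.count(1), balanced_labels.count(0))
```

    Only the training split should be resampled in this way; evaluating on duplicated positives would overstate performance.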

  18. Mapping of the Universe of Knowledge in Different Classification Schemes

    Directory of Open Access Journals (Sweden)

    M. P. Satija

    2017-06-01

    Full Text Available Given the variety of approaches to mapping the universe of knowledge that have been presented and discussed in the literature, the purpose of this paper is to systematize their main principles and their applications in the major general modern library classification schemes. We conducted an analysis of the literature on classification and the main classification systems, namely Dewey/Universal Decimal Classification, Cutter’s Expansive Classification, Subject Classification of J.D. Brown, Colon Classification, Library of Congress Classification, Bibliographic Classification, Rider’s International Classification, Bibliothecal Bibliographic Klassification (BBK), and Broad System of Ordering (BSO). We conclude that the arrangement of the main classes can be done following four principles that are not mutually exclusive: ideological principle, social purpose principle, scientific order, and division by discipline. The paper provides examples and analysis of each system. We also conclude that as knowledge is ever-changing, classifications also change and present a different structure of knowledge depending upon the society and time of their design.

  19. Utilizing Multi-Field Text Features for Efficient Email Spam Filtering

    Directory of Open Access Journals (Sweden)

    Wuying Liu

    2012-06-01

    Full Text Available Large-scale spam emails cause a serious waste of time and resources. This paper investigates the text features of email documents and the feature noise among multi-field texts, resulting in an observation of a power law distribution of feature strings within each text field. Based on this observation, we propose an efficient filtering approach including a compound weight method and a lightweight field text classification algorithm. The compound weight method considers both the historical classifying ability of each field classifier and the classifying contribution of each text field in the email currently being classified. The lightweight field text classification algorithm straightforwardly calculates the arithmetical average of multiple conditional probabilities predicted from feature strings according to a string-frequency index that stores labeled emails. The string-frequency index structure has a random-sampling-based compressible property owing to the power law distribution and can largely reduce the storage space. The experimental results in the TREC spam track show that the proposed approach can complete the filtering task at low space cost and high speed, and its overall performance 1-ROCA exceeds the best one among the participants at the trec07p evaluation.
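
    The idea of averaging conditional probabilities looked up in a string-frequency index can be sketched as follows. The abstract does not give the index layout or the probability estimate, so everything here is an assumption: the class name, the per-feature spam/ham counters, the estimate spam / (spam + ham) and the neutral default of 0.5. The compound weight method is not covered.

```python
from collections import defaultdict


class StringFrequencyIndex:
    """Hypothetical index mapping each feature string to [spam_count, ham_count]."""

    def __init__(self):
        self.counts = defaultdict(lambda: [0, 0])

    def add(self, features, is_spam):
        """Record one labeled email, counting each distinct feature once."""
        for feature in set(features):
            self.counts[feature][0 if is_spam else 1] += 1

    def spam_score(self, features, default=0.5):
        """Arithmetic mean of P(spam | feature) over the features found in the index."""
        probs = []
        for feature in set(features):
            if feature in self.counts:
                spam, ham = self.counts[feature]
                probs.append(spam / (spam + ham))
        return sum(probs) / len(probs) if probs else default


index = StringFrequencyIndex()
index.add(["free", "money"], is_spam=True)
index.add(["free", "meeting"], is_spam=False)
index.add(["money", "now"], is_spam=True)

score = index.spam_score(["free", "money", "hello"])
print(score)
```

    Here "free" contributes 0.5, "money" contributes 1.0 and the unseen "hello" is ignored, so the score is their mean.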

  20. Classification of remotely sensed images

    CSIR Research Space (South Africa)

    Dudeni, N

    2008-10-01

    Full Text Available For this research, the researchers examine various existing image classification algorithms with the aim of demonstrating how these algorithms can be applied to remote sensing images. These algorithms are broadly divided into supervised...

  1. Internet and library classification as determinants of students ...

    African Journals Online (AJOL)

    Internet and library classification as determinants of students utilisation of information resources in University of Calabar Library. ...

  2. Specific classification of financial analysis of enterprise activity

    Directory of Open Access Journals (Sweden)

    Synkevych Nadiia I.

    2014-01-01

    Full Text Available Although modern scientific literature offers a wide variety of classifications of the types of financial analysis of enterprise activity, which differ in their approach to classification and in the number and content of their classification features, no comprehensive comparison and analysis of the existing classifications has been carried out. This explains the urgency of this study. The article studies scientists' classifications of the types of financial analysis and presents its own approach to the problem. Based on the results of the analysis, the article refines and builds a specific classification of financial analysis of enterprise activity and offers classification by the following features: objects, subjects, goals of study, automation level, time period of the analytical base, scope of study, organisation system, classification features of the subject, spatial belonging, sufficiency, information sources, periodicity, criterial base, method of data selection for analysis and time direction. All types of financial analysis differ significantly in their inherent properties and parameters depending on the goals of the financial analysis. The developed specific classification enables subjects of financial analysis of enterprise activity to identify the specific type of financial analysis that correctly meets the set goals.

  3. Land-cover classification with an expert classification algorithm using digital aerial photographs

    Directory of Open Access Journals (Sweden)

    José L. de la Cruz

    2010-05-01

    Full Text Available The purpose of this study was to evaluate the usefulness of the spectral information of digital aerial sensors in determining land-cover classification using new digital techniques. The land covers that have been evaluated are the following: (1) bare soil, (2) cereals, including maize (Zea mays L.), oats (Avena sativa L.), rye (Secale cereale L.), wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.), (3) high protein crops, such as peas (Pisum sativum L.) and beans (Vicia faba L.), (4) alfalfa (Medicago sativa L.), (5) woodlands and scrublands, including holly oak (Quercus ilex L.) and common retama (Retama sphaerocarpa L.), (6) urban soil, (7) olive groves (Olea europaea L.) and (8) burnt crop stubble. The best result was obtained using an expert classification algorithm, achieving a reliability rate of 95%. This result showed that the images of digital airborne sensors hold considerable promise for the future in the field of digital classifications because these images contain valuable information that takes advantage of the geometric viewpoint. Moreover, new classification techniques reduce problems encountered using high-resolution images, while achieving reliabilities that are better than those achieved with traditional methods.

  4. Understanding about the classification of pulp inflammation

    Directory of Open Access Journals (Sweden)

    Trijoedani Widodo

    2007-03-01

    Full Text Available Although most authors now use the classification into reversible pulpitis and irreversible pulpitis, many dentists still do not apply this new classification. The research used a descriptive method, administering a questionnaire to dentists from various dental clinics. Twenty-two dentists participated in the research. All respondents use a diagnosis sheet when examining patients. Nonetheless, it could not be determined which diagnosis card was used, and most of the dentists still use the old classification. Responses regarding the new classification: (a) had heard of the new classification, but it was not clear (36.3%); (b) had never heard of the new classification at all (63.6%). Responses regarding whether new developments are important to follow up: (a) information on new developments is very important (27.2%); (b) it is important to have new information (68.3%); (c) new information is not important (8%). It was concluded that information on the development of the classification of pulp inflammation had not reached the dentists.

  5. Text Mining in Organizational Research.

    Science.gov (United States)

    Kobayashi, Vladimer B; Mol, Stefan T; Berkers, Hannah A; Kismihók, Gábor; Den Hartog, Deanne N

    2018-07-01

    Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies.

  6. Learning Convolutional Text Representations for Visual Question Answering

    OpenAIRE

    Wang, Zhengyang; Ji, Shuiwang

    2017-01-01

    Visual question answering is a recently proposed artificial intelligence task that requires a deep understanding of both images and texts. In deep learning, images are typically modeled through convolutional neural networks, and texts are typically modeled through recurrent neural networks. While the requirement for modeling images is similar to traditional computer vision tasks, such as object recognition and image classification, visual question answering raises a different need for textual...

  7. CADASTRAL CLASSIFICATION OF THE LAND PLOTS IN UKRAINE

    Directory of Open Access Journals (Sweden)

    KIRICHEK Yu. O.

    2016-04-01

    Full Text Available Summary. This work concerns the development of a national system for the classification of land plots. The developed classification will make it possible to solve correctly a number of related cadastral, land management, valuation and other tasks. Classifications of land, improvements and real estate in general are analysed, and proposals are made for creating a new classification of land plots in Ukraine. Today the Ukrainian real estate market has no single system that divides property into groups, classes and types. This significantly complicates work and makes it impossible to be fully aware of the specific situation in the real estate market. Property classification is designed to solve this problem; it is used to move from a diversity of individual properties to a limited number of classes of valuation objects. The classification distinguishes the functional purpose (use) of the objects being valued, which determines the difference in value.

  8. A proposed data base system for detection, classification and ...

    African Journals Online (AJOL)

    A proposed data base system for detection, classification and location of fault on electricity company of Ghana electrical distribution system. Isaac Owusu-Nyarko, Mensah-Ananoo Eugine. Abstract. No Abstract. Keywords: database, classification of fault, power, distribution system, SCADA, ECG.

  9. Definition and classification of epilepsy. Classification of epileptic seizures 2016

    Directory of Open Access Journals (Sweden)

    K. Yu. Mukhin

    2017-01-01

    Full Text Available Epilepsy is one of the most common neurological diseases, especially in childhood and adolescence. The incidence varies from 15 to 113 cases per 100 000 population, with the maximum among children under 1 year old. The prevalence of epilepsy is high, ranging from 5 to 8 cases (in some regions, 10 cases) per 1000 children under 15 years old. Classification of the disease is of great importance for diagnosis, treatment and prognosis. The article presents a novel strategy for the classification of epileptic seizures, developed in 2016. It contains a number of brand new concepts, including a very important one: some seizures previously considered to be only generalized or only focal can, in fact, be both focal and generalized. They include tonic, atonic and myoclonic seizures and epileptic spasms. The term “secondarily generalized seizure” is replaced by the term “bilateral tonic-clonic seizure” (since it is not a separate type of epileptic seizure, and the term reflects the spread of discharge from any area of the cerebral cortex and the evolution of any type of focal seizure). The International League Against Epilepsy recommends abandoning the term “pseudo-epileptic seizures” and replacing it with the term “psychogenic non-epileptic seizures”. If a doctor is not sure that seizures are epileptic in nature, the term “paroxysmal event” should be used without specifying the disease. The conception of childhood epileptic encephalopathies, developed within this novel classification project, is one of the most significant achievements, since in this case not only the seizures but even epileptiform activity can induce severe disorders of higher mental functions. In addition to a detailed description of the new strategy for the classification of epileptic seizures, the article contains a comprehensive review of the existing principles of epilepsy and epileptic seizure classification.

  10. The Application of Machine Learning Algorithms for Text Mining based on Sentiment Analysis Approach

    Directory of Open Access Journals (Sweden)

    Reza Samizade

    2018-06-01

    Full Text Available Classifying online texts and comments from social media users into positive and negative sentiment is of high importance in research areas related to text mining. In this research, we applied supervised classification methods to classify Persian texts in cyberspace by sentiment. The result is a system that can decide whether a comment published in cyberspace, for example on a social network, is positive or negative. Comments published on Persian movie and movie-review websites from 1392 to 1395 form the data set for this research. Part of the data is used for training and the rest for testing. Prior to implementing the algorithms, pre-processing steps such as tokenizing, removing stop words, and n-gram extraction were applied to the texts. Naïve Bayes, neural networks (NN) and support vector machines (SVM) were used for text classification in this study. Out-of-sample tests showed no evidence that the accuracy of the SVM approach is statistically higher than that of Naïve Bayes, or that the accuracy of Naïve Bayes is not statistically higher than that of the NN approach. However, the researchers can conclude that the accuracy of classification using the SVM approach is statistically higher than that of the NN approach at the 5% level.
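
    The pre-processing and classification steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the stop-word list, the English toy comments, and the names `features` and `NaiveBayes` are all assumptions made for illustration, and only the Naïve Bayes classifier (with add-one smoothing) is shown.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real system would use a Persian one.
STOP_WORDS = {"the", "a", "is", "was", "this", "and", "it"}

def features(text, n=2):
    """Tokenize, drop stop words, and return unigrams plus word n-grams."""
    tokens = [t for t in re.findall(r"\w+", text.lower()) if t not in STOP_WORDS]
    grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return tokens + grams

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = features(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for f in features(text):
                if f in self.vocab:  # features never seen in training are ignored
                    score += math.log((self.feature_counts[label][f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Made-up training comments, for illustration only.
train_texts = [
    "great movie loved it",
    "wonderful acting and great story",
    "terrible movie hated it",
    "boring story and awful acting",
]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("great story"))
print(model.predict("awful boring movie"))
```

    Holding out part of the labelled comments for testing, as the abstract describes, would amount to calling `fit` on one subset and `predict` on the other.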

  11. AN OBJECT-BASED METHOD FOR CHINESE LANDFORM TYPES CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    H. Ding

    2016-06-01

    Full Text Available Landform classification is a necessary task in various fields of landscape and regional planning, for example landscape evaluation, erosion studies and hazard prediction. This study proposes an improved object-based classification for Chinese landform types using the factor importance analysis of random forest and the gray-level co-occurrence matrix (GLCM). In this research, based on a 1 km DEM of China, the combination of terrain factors extracted from the DEM is selected by correlation analysis and Sheffield's entropy method. A random forest classification tree is applied to evaluate the importance of the terrain factors, which are used as multi-scale segmentation thresholds. The GLCM is then computed to build the knowledge base for classification. The classification result was checked using the 1:4,000,000 Chinese Geomorphological Map as reference. The overall classification accuracy of the proposed method is 5.7% higher than that of ISODATA unsupervised classification and 15.7% higher than that of the traditional object-based classification method.
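
    As a rough illustration of the GLCM step, the sketch below counts co-occurring gray-level pairs on a small grid and derives two common texture statistics. It assumes the terrain values have already been quantized to a few integer levels; the function names, the offset, and the toy grids are hypothetical, and the homogeneity formula is one common definition rather than necessarily the one used in the study.

```python
from collections import Counter

def glcm(image, dx=1, dy=0, symmetric=True):
    """Normalized co-occurrence probabilities of gray levels (i, j) at offset (dx, dy)."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
                if symmetric:  # also count the reversed pair
                    counts[(image[r2][c2], image[r][c])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def contrast(p):
    """Large when neighbouring cells differ strongly in gray level."""
    return sum(prob * (i - j) ** 2 for (i, j), prob in p.items())

def homogeneity(p):
    """Close to 1 when neighbouring cells have similar gray levels."""
    return sum(prob / (1 + (i - j) ** 2) for (i, j), prob in p.items())

# Toy grids standing in for quantized terrain values.
flat = [[1, 1], [1, 1]]
rough = [[0, 3], [3, 0]]
print(contrast(glcm(flat)), homogeneity(glcm(flat)))
print(contrast(glcm(rough)), homogeneity(glcm(rough)))
```

    A flat patch gives zero contrast, while a patch whose neighbours alternate between low and high levels gives a high contrast, which is the kind of texture difference a landform knowledge base could draw on.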

  12. Improvement of Classification of Enterprise Circulating Funds

    Directory of Open Access Journals (Sweden)

    Rohanova Hanna O.

    2014-02-01

    Full Text Available The goal of the article is to reveal possibilities for increasing the efficiency of managing enterprise circulating funds by improving their classification features. Having analysed, systematised and supplemented the approaches of many economists to the classification of enterprise circulating funds, the article offers a grouping of their classification features. As a result of the study, the article offers an expanded classification of circulating funds, which clearly shows the role of circulating funds in managing enterprise finance and the economy in general. The article supplements and groups classification features of enterprise circulating funds by organisation level, functioning character, sources of formation and their cost, and level of management efficiency. The article shows that the provided grouping of classification features allows comprehensive and purposeful influence on indicators of the efficiency of circulating funds and facilitates their rational management in general. The prospect for further studies in this direction is identifying the extent to which production enterprises attract loan resources to finance circulating funds.

  13. Enterprise Potential: Essence, Classification and Types

    Directory of Open Access Journals (Sweden)

    Turylo Anatolii M.

    2014-02-01

    Full Text Available The article considers existing approaches to the classification of enterprise potential as an economic notion. It offers the authors' own vision of the classification of enterprise potential, which meets modern tendencies of enterprise development. Classification makes possible a wider description and assessment of enterprise potential and also allows identification of its most significant characteristics. The classification of enterprise potential is developed by different criteria: by functions, by resource support, by ability to adapt, by the level of detection, by the spectrum of possibilities taken into account, by the period of coverage of possibilities and by the level of use. Analysis of the components of enterprise potential yields a complete and trustworthy assessment of the state of an enterprise. The adaptation potential of an enterprise is based on the principles of systematicity and dynamism; it characterises the possibilities for adjusting an enterprise to external and internal economic conditions.

  14. KNN BASED CLASSIFICATION OF DIGITAL MODULATED SIGNALS

    Directory of Open Access Journals (Sweden)

    Sajjad Ahmed Ghauri

    2016-11-01

    Full Text Available Demodulation without knowledge of the modulation scheme requires Automatic Modulation Classification (AMC). When the receiver has limited information about the received signal, AMC becomes an essential process. AMC has an important place in many civil and military fields, such as modern electronic warfare, interfering source recognition, frequency management and link adaptation. In this paper we explore the use of K-nearest neighbor (KNN) for modulation classification with different distance measurement methods. Five modulation schemes are used for classification: Binary Phase Shift Keying (BPSK), Quadrature Phase Shift Keying (QPSK), Quadrature Amplitude Modulation (QAM), 16-QAM and 64-QAM. Higher order cumulants (HOC) are used as the input feature set to the classifier. Simulation results show that the proposed classification method provides better results for the considered modulation formats.
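
    The idea of running KNN with different distance measures can be sketched as follows. The feature vectors are made-up two-dimensional points standing in for cumulant features, not values computed from real signals, and the three distance functions are common choices assumed here because the abstract does not list the ones it used.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def knn_predict(train, query, k=3, distance=euclidean):
    """Majority vote among the k training points closest to the query."""
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical feature vectors; real inputs would be higher order cumulants.
train = [
    ((0.0, 0.1), "BPSK"), ((0.2, 0.0), "BPSK"), ((0.1, 0.2), "BPSK"),
    ((1.0, 1.1), "QPSK"), ((1.2, 0.9), "QPSK"), ((0.9, 1.0), "QPSK"),
]
for metric in (euclidean, manhattan, chebyshev):
    print(metric.__name__, knn_predict(train, (0.15, 0.1), k=3, distance=metric))
```

    Passing the distance function as a parameter keeps the classifier itself unchanged, so comparing metrics only requires swapping one argument.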

  15. A new approach to the classification of African oral texts | Kam ...

    African Journals Online (AJOL)

    All these reasons led to a re-examination of the various oral genres in the African context and to a proposed division of these texts into five broad categories. Keywords: oral literature, oral genres, oral texts, discourse, utterances, joking games, researchers in oral literature. Tydskrif vir Letterkunde ...

  16. THE PROBLEMS OF FIXED ASSETS CLASSIFICATION FOR ACCOUNTING

    Directory of Open Access Journals (Sweden)

    Sophiia Kafka

    2016-06-01

    Full Text Available This article provides a critical analysis of research on accounting for fixed assets and identifies the basic issues of fixed assets accounting developed by Ukrainian scientists during 1999-2016. It is established that the problems of non-current assets taxation and classification are the most noteworthy. In the dissertations, fixed assets classification is treated only for particular branches, so its improvement is important. The purpose of the article is to develop a science-based classification of fixed assets for accounting purposes, since their composition is quite diverse. The classification of fixed assets for accounting purposes has been summarized and developed in Figure 1 according to the results of the research. The analysis of existing approaches to the classification of fixed assets has made it possible to specify its basic types and justify the classification criteria for the main objects of fixed assets. Key words: non-current assets, fixed assets, accounting, valuation, classification of the fixed assets. JEL: G M41

  17. Event Classification using Concepts

    NARCIS (Netherlands)

    Boer, M.H.T. de; Schutte, K.; Kraaij, W.

    2013-01-01

    The semantic gap is one of the challenges in the GOOSE project. In this paper a Semantic Event Classification (SEC) system is proposed as an initial step in tackling the semantic gap challenge in the GOOSE project. This system uses semantic text analysis, multiple feature detectors using the BoW

  18. Extension classification method for low-carbon product cases

    Directory of Open Access Journals (Sweden)

    Yanwei Zhao

    2016-05-01

    Full Text Available In low-carbon product design, intelligent decision systems integrated with certain classification algorithms recommend existing design cases to designers. However, these systems mostly depend on prior experience, and product designers not only expect to get a satisfactory case from an intelligent system but also hope for assistance in modifying unsatisfactory cases. In this article, we propose a new categorization method composed of static and dynamic classification based on extension theory. This classification method can be integrated into a case-based reasoning system to get accurate classification results and to give designers detailed information about unsatisfactory cases. First, we establish the static classification model for cases using a dependent function in a hierarchical structure. Then, for dynamic classification, we transform cases based on the case model, attributes, attribute values, and dependent function, so that cases can undergo qualitative changes. Finally, the applicability of the proposed method is demonstrated through a case study of screw air compressor cases.

  19. El frañol en Pas pleurer de Lydie Salvayre y su traducción al español

    Directory of Open Access Journals (Sweden)

    Benoît Filhol

    2018-04-01

    Full Text Available Set in the Spanish Civil War and resonant with autobiographical echoes, Lydie Salvayre’s Pas pleurer exemplifies the contact between Spanish and French by means of the mixed language called frañol. In terms of reception, this prominent stylistic factor, often highlighted by critics, led to a mixture of praise and disapproval. This article has a two-fold objective: first, to conduct an exhaustive analysis of frañol in the novel by means of a classification system that allows us to interpret Salvayre’s stylistic choices; second, to examine the Spanish translation of Pas pleurer, taking into consideration the challenges embedded in the translation of frañol. Lydie Salvayre’s novel Pas pleurer delves into the historical episode of the Spanish Civil War through a story with autobiographical overtones, and joins the body of works written in fragnol, a language that mixes French and Spanish. This linguistic specificity was the aspect most often mentioned by critics, and its reception drew both praise and disapproval. This article pursues two objectives. First, we carried out an exhaustive analysis of fragnol in the novel through a classification system adapted to the work, which allows us to interpret the stylistic choice made by Salvayre. Second, we examine the Spanish translation of Pas pleurer, considering that rendering fragnol constitutes a first-order challenge for the translator.

  20. Facial aging: A clinical classification

    Directory of Open Access Journals (Sweden)

    Shiffman Melvin

    2007-01-01

    Full Text Available The purpose of this classification of facial aging is to have a simple clinical method to determine the severity of the aging process in the face. This allows a quick estimate as to the types of procedures that the patient would need to have the best results. Procedures that are presently used for facial rejuvenation include laser, chemical peels, suture lifts, fillers, modified facelift and full facelift. The physician is already using his best judgment to determine which procedure would be best for any particular patient. This classification may help to refine these decisions.

  1. Using Shakespeare's Sotto Voce to Determine True Identity From Text

    Directory of Open Access Journals (Sweden)

    David Kernot

    2018-03-01

    Full Text Available Little is known of the private life of William Shakespeare, but he is famous for his collection of plays and poems, even though many of the works attributed to him were published anonymously. Determining the identity of Shakespeare has fascinated scholars for 400 years, and four significant figures in English literary history have been suggested as likely alternatives to Shakespeare for some disputed works: Bacon, de Vere, Stanley, and Marlowe. A myriad of computational and statistical tools and techniques have been used to determine the true authorship of his works. Many of these techniques rely on basic statistical correlations, word counts, collocated word groups, or keyword density, but no one method has been decided on. We suggest that an alternative technique that uses word semantics to draw on personality can provide an accurate profile of a person. To test this claim, we analyse the works of Shakespeare, Christopher Marlowe, and Elizabeth Cary. We use Word Accumulation Curves, Hierarchical Clustering overlays, Principal Component Analysis, and Linear Discriminant Analysis techniques in combination with RPAS, a multi-faceted text analysis approach that draws on a writer's personality, or self, to identify subtle characteristics within a person's writing style. Here we find that RPAS can separate the known authored works of Shakespeare from those of Marlowe and Cary. Further, it separates their contested works, that is, works suspected of being written by others. While few authorship identification techniques identify self from the way a person writes, we demonstrate that these stylistic characteristics are as applicable 400 years ago as they are today and have the potential to be used within cyberspace for law enforcement purposes.

  2. Patent Keyword Extraction Algorithm Based on Distributed Representation for Patent Classification

    Directory of Open Access Journals (Sweden)

    Jie Hu

    2018-02-01

    Full Text Available Many text mining tasks such as text retrieval, text summarization, and text comparisons depend on the extraction of representative keywords from the main text. Most existing keyword extraction algorithms are based on a discrete bag-of-words type of word representation of the text. In this paper, we propose a patent keyword extraction algorithm (PKEA) based on the distributed Skip-gram model for patent classification. We also develop a set of quantitative performance measures for keyword extraction evaluation based on information gain and cross-validation, based on Support Vector Machine (SVM) classification, which are valuable when human-annotated keywords are not available. We used a standard benchmark dataset and a homemade patent dataset to evaluate the performance of PKEA. Our patent dataset includes 2500 patents from five distinct technological fields related to autonomous cars (GPS systems, lidar systems, object recognition systems, radar systems, and vehicle control systems). We compared our method with Frequency, Term Frequency-Inverse Document Frequency (TF-IDF), TextRank and Rapid Automatic Keyword Extraction (RAKE). The experimental results show that our proposed algorithm provides a promising way to extract keywords from patent texts for patent classification.
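
    To make the comparison baselines concrete, here is a minimal sketch of TF-IDF keyword extraction, one of the methods the abstract compares against. It is not PKEA itself, which relies on Skip-gram embeddings; the three toy documents and the function names are assumptions for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def tfidf_keywords(docs, top_k=3):
    """Return the top_k terms of each document ranked by TF-IDF score."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    df = Counter()  # number of documents containing each term
    for tokens in tokenized:
        df.update(set(tokens))
    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        # Sort by descending score; break ties alphabetically for stable output.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results

# Toy documents loosely modelled on the technological fields in the abstract.
docs = [
    "the lidar sensor emits laser pulses and the lidar measures range",
    "the radar sensor emits radio waves and the radar measures range",
    "the controller steers the vehicle and the controller brakes the vehicle",
]
keywords = tfidf_keywords(docs, top_k=2)
print(keywords)
```

    Terms that occur in every document, such as "the", receive a score of zero, which is why field-specific words rise to the top of each list.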

  3. Cluster Validity Classification Approaches Based on Geometric Probability and Application in the Classification of Remotely Sensed Images

    Directory of Open Access Journals (Sweden)

    LI Jian-Wei

    2014-08-01

    Full Text Available On the basis of the cluster validity function based on geometric probability in [1, 2], this paper proposes a cluster analysis method based on geometric probability for processing large amounts of data in a rectangular area. The basic idea is top-down stepwise refinement: first categories, then subcategories. At every clustering level, the cluster validity function based on geometric probability is applied first to determine the clusters and the gathering direction; the cluster centers and cluster borders are then determined. Using TM remote sensing image classification examples, the method is compared with the supervised and unsupervised classification in ERDAS and with the cluster analysis method based on geometric probability in a two-dimensional square proposed in [2]. Results show that the proposed method can significantly improve the classification accuracy.

  4. Modified angle's classification for primary dentition

    Directory of Open Access Journals (Sweden)

    Kaushik Narendra Chandranee

    2017-01-01

    Full Text Available Aim: This study aims to propose a modification of Angle's classification for primary dentition and to assess its applicability in children from Nagpur, Central India. Methods: A modification of Angle's classification has been proposed for application in primary dentition. Lowercase Roman numerals i/ii/iii are used as primary dentition notation to represent Angle's Class I/II/III molar relationships of the permanent dentition, respectively. To assess the applicability of the modified Angle's classification, a cross-sectional sample of 2000 preschool children from central India, 3–6 years of age and residing in the metropolitan city of Nagpur in Maharashtra state, was selected randomly as per the inclusion and exclusion criteria. Results: The majority of children (93.35%) were found to have bilateral Class i, followed by 2.5% with bilateral Class ii and 0.2% with bilateral half-cusp Class iii molar relationships as per the modified Angle's classification for primary dentition. About 3.75% of children had various combinations of Class ii relationships and 0.2% had a Class iii subdivision relationship. Conclusions: A modification of Angle's classification for application in primary dentition has been proposed. A cross-sectional investigation using the new classification revealed Class ii molar relationships in 6.25% and Class iii molar relationships in 0.4% of preschool children in the metropolitan city of Nagpur. Application of the modified Angle's classification to other population groups is warranted to validate its routine application in clinical pediatric dentistry.

  5. TEXT CLASSIFICATION FOR AUTOMATIC DETECTION OF E-CIGARETTE USE AND USE FOR SMOKING CESSATION FROM TWITTER: A FEASIBILITY PILOT.

    Science.gov (United States)

    Aphinyanaphongs, Yin; Lulejian, Armine; Brown, Duncan Penfold; Bonneau, Richard; Krebs, Paul

    2016-01-01

    Rapid increases in e-cigarette use and potential exposure to harmful byproducts have shifted public health focus to e-cigarettes as a possible drug of abuse. Effective surveillance of use and prevalence would allow appropriate regulatory responses. An ideal surveillance system would collect usage data in real time, focus on populations of interest, include populations unable to take the survey, allow a breadth of questions to answer, and enable geo-location analysis. Social media streams may provide this ideal system. To realize this use case, a foundational question is whether we can detect e-cigarette use at all. This work reports two pilot tasks using text classification to automatically identify Tweets that indicate e-cigarette use and/or e-cigarette use for smoking cessation. We build and define both datasets and compare the performance of four state-of-the-art classifiers and a keyword search for each task. Our results demonstrate excellent classifier performance of up to 0.90 and 0.94 area under the curve in each category. These promising initial results form the foundation for further studies to realize the ideal surveillance solution.
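
    The reported metric, area under the curve, can be computed directly from classifier scores. The sketch below assumes the curve in question is the ROC curve (the abstract does not say) and uses the pairwise-ranking formulation; the labels and scores are invented for illustration.

```python
def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented example: 1 = tweet indicates e-cigarette use, 0 = it does not.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

    This quadratic-time form is only meant to show what the number means; for large tweet collections a rank-based computation would be preferable.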

  6. Multivariate Approaches to Classification in Extragalactic Astronomy

    Directory of Open Access Journals (Sweden)

    Didier Fraix-Burnet

    2015-08-01

    Full Text Available Clustering objects into synthetic groups is a natural activity of any science. Astrophysics is not an exception and is now facing a deluge of data. For galaxies, the one-century old Hubble classification and the Hubble tuning fork are still largely in use, together with numerous mono- or bivariate classifications most often made by eye. However, a classification must be driven by the data, and sophisticated multivariate statistical tools are used more and more often. In this paper we review these different approaches in order to situate them in the general context of unsupervised and supervised learning. We insist on the astrophysical outcomes of these studies to show that multivariate analyses provide an obvious path toward a renewal of our classification of galaxies and are invaluable tools to investigate the physics and evolution of galaxies.

  7. When phonetics matters: creation and perception of female images in song folklore

    Directory of Open Access Journals (Sweden)

    Stashko Halyna

    2017-06-01

    Full Text Available This paper presents a stylistic analysis of female images in American song folklore in order to examine how sound symbolic language elements contribute to the construction of verbal images. The results obtained show the link between sound and meaning and how such phonetic means of stylistics as assonance, alliteration, and onomatopoeia function to reinforce the meanings of words or to set the mood typical of the characters. Their synergy helps create and interpret female images and provides relevant atmosphere and background to them in folk song texts.

  8. La literatura como recurso didáctico complementario en la enseñanza y aprendizaje de la dermatología Literature as a complementary educational resource in the teaching and learning of dermatology

    Directory of Open Access Journals (Sweden)

    Francisco Vázquez-López

    2012-03-01

    Full Text Available Introduction. Reflecting on literary resources can improve their use as a learning tool for health professionals, in both cross-disciplinary and specialty-specific contexts. We propose a novel classification of literary resources, from a medical perspective, with the aim of facilitating their application in the teaching and learning of dermatology. Materials and methods. Qualitative analysis of multiple literary texts and selection of those texts that included references to a common skin disease (acne), which was selected as a paradigm. Results. A classification based on five groups was produced: autobiography of the skin disease; the writer describing the skin disease of others; skin disease as a stylistic device in literature; the physician as writer: stylistic devices in the medical description of the disease; and medical literature related to the selected texts. Conclusions. Literary resources, from a medical perspective, can be classified into five groups; this classification can facilitate their use in teaching and their integration as a didactic resource in specialized dermatology teaching and in medical education in general. Introduction. To classify literary texts from a medical perspective facilitates their application in the teaching and learning of medicine and dermatology. Materials and methods. Qualitative analysis of multiple literary texts and selection of those texts that contain references to a common skin disease (acne), which was selected as a paradigm. Results. A classification was carried out of the literary texts analyzed, based on five groups: autobiography of the skin disease; the writer describing skin disease in others; skin disease as a stylistic device in literature; the physician as writer: stylistic devices in the medical description of the

  9. A Semisupervised Cascade Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Stamatis Karlos

    2016-01-01

    Full Text Available Classification is one of the most important tasks of data mining techniques, which have been adopted by several modern applications. The shortage of labeled data in the majority of these applications has shifted interest towards semisupervised methods. Under such schemes, the use of collected unlabeled data combined with a clearly smaller set of labeled examples leads to similar or even better classification accuracy than supervised algorithms, which use labeled examples exclusively during the training phase. A novel approach for improving semisupervised classification using the Cascade Classifier technique is presented in this paper. The main characteristic of the Cascade Classifier strategy is the use of a base classifier to enlarge the feature space by adding either the predicted class or the class probability distribution of the initial data. The classifier of the second level is supplied with the new dataset and extracts the decision for each instance. In this work, a self-trained NB∇C4.5 classifier algorithm is presented, which combines the characteristics of Naive Bayes as a base classifier and the speed of C4.5 for final classification. We performed an in-depth comparison with other well-known semisupervised classification methods on standard benchmark datasets and conclude that the presented technique has better accuracy in most cases.
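
    The cascade idea described here, where a base classifier's prediction is appended to the feature vector before a second classifier decides, can be sketched as follows. This is not the NB∇C4.5 algorithm: a nearest-centroid rule stands in for Naive Bayes, 1-nearest-neighbour stands in for C4.5, the self-training loop over unlabeled data is omitted, and all names and data are hypothetical.

```python
import math

def centroid_fit(X, y):
    """Base learner (stand-in for Naive Bayes): one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def centroid_predict(centroids, row):
    return min(centroids, key=lambda label: math.dist(centroids[label], row))

def augment(centroids, X, classes):
    """Append the base learner's predicted class (as an index) to each row."""
    return [list(row) + [float(classes.index(centroid_predict(centroids, row)))]
            for row in X]

def one_nn_predict(X_train, y_train, row):
    """Second-level learner (stand-in for C4.5): 1-nearest neighbour."""
    best = min(range(len(X_train)), key=lambda i: math.dist(X_train[i], row))
    return y_train[best]

# Toy labelled data.
X = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.2]]
y = ["a", "a", "b", "b"]
classes = sorted(set(y))
centroids = centroid_fit(X, y)
X_aug = augment(centroids, X, classes)  # original features + predicted class

def cascade_predict(row):
    row_aug = augment(centroids, [row], classes)[0]
    return one_nn_predict(X_aug, y, row_aug)

print(cascade_predict([0.1, 0.2]), cascade_predict([1.1, 0.9]))
```

    In a self-trained variant, the most confident predictions on unlabeled rows would be added to `X` and `y` and both levels refitted, repeating until no confident predictions remain.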

  10. Boosting bonsai trees for handwritten/printed text discrimination

    Science.gov (United States)

    Ricquebourg, Yann; Raymond, Christian; Poirriez, Baptiste; Lemaitre, Aurélie; Coüasnon, Bertrand

    2013-12-01

    Boosting over decision stumps has proved its efficiency in Natural Language Processing, essentially with symbolic features, and its good properties (fast, few and non-critical parameters, not sensitive to over-fitting) could be of great interest in the numeric world of pixel images. In this article we investigated the use of boosting over small decision trees in image classification for the discrimination of handwritten and printed text. We then conducted experiments to compare it with the usual SVM-based classification, revealing convincing results with very close performance, but with faster predictions and far less black-box behaviour. These promising results suggest using this classifier in more complex recognition tasks, such as multiclass problems.

  11. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    Full Text Available Existing material classification schemes are intended to improve inventory management. However, different materials have different quality-related attributes, especially in the aircraft industry. In order to reduce cost without sacrificing quality, we propose a quality-oriented material classification system that considers material quality character, quality cost, and quality influence. The Analytic Hierarchy Process supports feature selection and the classification decision. We use the improved Kraljic Portfolio Matrix to establish a three-dimensional classification model. Aircraft materials can be divided into eight types, including general type, key type, risk type, and leveraged type. To improve the classification accuracy for various materials, the Support Vector Machine algorithm is introduced. Finally, we compare the SVM and the BP neural network in the application. The results show that the SVM algorithm is more efficient and accurate and that the quality-oriented material classification is valuable.

  12. A Gameplay Definition through Videogame Classification

    Directory of Open Access Journals (Sweden)

    Damien Djaouti

    2008-01-01

    Full Text Available This paper is part of an experimental approach aimed at building a classification of videogames. Inspired by the methodology that Propp used for the classification of Russian fairy tales, we have identified recurrent diagrams within the rules of videogames, which we call “Gameplay Bricks”. The combinations of these different bricks should allow us to represent a classification of all videogames in accordance with their rules. In this article, we study the nature of these bricks, especially the link they seem to have with two types of game rules: the rules that allow the player to “manipulate” the elements of the game, and the rules defining the “goal” of the game. This study leads to a hypothesis about the nature of gameplay.

  13. Into the Past: A Step Towards a Robust Kimberley Rock Art Chronology.

    Directory of Open Access Journals (Sweden)

    June Ross

    Full Text Available The recent establishment of a minimum age estimate of 39.9 ka for the origin of rock art in Sulawesi has challenged claims that Western Europe was the locus for the production of the world's earliest art assemblages. Tantalising excavated evidence found across northern Australia suggests that Australia too contains a wealth of ancient art. However, the dating of rock art itself remains the greatest obstacle to be addressed if the significance of Australian assemblages is to be recognised on the world stage. A recent archaeological project in the northwest Kimberley trialled three dating techniques in order to establish chronological markers for the proposed, regional, relative stylistic sequence. Applications using optically-stimulated luminescence (OSL) provided nine minimum age estimates for fossilised mudwasp nests overlying a range of rock art styles, while Accelerator Mass Spectrometry radiocarbon (AMS 14C) results provided an additional four. Results confirm that at least one phase of the northwest Kimberley rock art assemblage is Pleistocene in origin. A complete motif located on the ceiling of a rockshelter returned a minimum age estimate of 16 ± 1 ka. Further, our results demonstrate the inherent problems in relying solely on stylistic classifications to order rock art assemblages into temporal sequences. An earlier than expected minimum age estimate for one style and a maximum age estimate for another together illustrate that the Holocene Kimberley rock art sequence is likely to be far more complex than generally accepted, with different styles produced contemporaneously well into the last few millennia. It is evident that reliance on techniques that produce minimum age estimates means that many more dating programs will need to be undertaken before the stylistic sequence can be securely dated.

  14. Decimal Classification Editions

    Directory of Open Access Journals (Sweden)

    Zenovia Niculescu

    2009-01-01

    Full Text Available The study approaches the evolution of Dewey Decimal Classification editions from the perspective of updating the terminology and of reallocating and expanding the main and auxiliary structure of the Dewey indexing language. The comparative analysis of DDC editions emphasizes the efficiency of the Dewey scheme in improving the informational offer through revised and developed basic index terms, as well as in making better use of the auxiliary notations.

  15. A New Method for Solving Supervised Data Classification Problems

    Directory of Open Access Journals (Sweden)

    Parvaneh Shabanzadeh

    2014-01-01

    Full Text Available Supervised data classification is one of the techniques used to extract nontrivial information from data. Classification is a widely used technique in various fields, including data mining, industry, medicine, science, and law. This paper considers a new algorithm for supervised data classification problems associated with the cluster analysis. The mathematical formulations for this algorithm are based on nonsmooth, nonconvex optimization. A new algorithm for solving this optimization problem is utilized. The new algorithm uses a derivative-free technique, with robustness and efficiency. To improve classification performance and efficiency in generating classification model, a new feature selection algorithm based on techniques of convex programming is suggested. Proposed methods are tested on real-world datasets. Results of numerical experiments have been presented which demonstrate the effectiveness of the proposed algorithms.

  16. Combining multiple classifiers for age classification

    CSIR Research Space (South Africa)

    Van Heerden, C

    2009-11-01

    Full Text Available The authors compare several different classifier combination methods on a single task, namely speaker age classification. This task is well suited to combination strategies, since significantly different feature classes are employed. Support vector...

  17. Semi-Supervised Learning for Classification of Protein Sequence Data

    Directory of Open Access Journals (Sweden)

    Brian R. King

    2008-01-01

    Full Text Available Protein sequence data continue to become available at an exponential rate. Annotation of functional and structural attributes of these data lags far behind, with only a small fraction of the data understood and labeled by experimental methods. Classification methods that are based on semi-supervised learning can increase the overall accuracy of classifying partly labeled data in many domains, but very few methods exist that have shown their effect on protein sequence classification. We show how proven methods from text classification can be applied to protein sequence data, as we consider both existing and novel extensions to the basic methods, and demonstrate restrictions and differences that must be considered. We demonstrate comparative results against the transductive support vector machine, and show superior results on the most difficult classification problems. Our results show that large repositories of unlabeled protein sequence data can indeed be used to improve predictive performance, particularly in situations where there are fewer labeled protein sequences available, and/or the data are highly unbalanced in nature.

  18. Exploring different approaches for music genre classification

    Directory of Open Access Journals (Sweden)

    Antonio Jose Homsi Goulart

    2012-07-01

    Full Text Available In this letter, we present different approaches for music genre classification. The proposed techniques, which are composed of a feature extraction stage followed by a classification procedure, explore both the variations of parameters used as input and the classifier architecture. Tests were carried out with three styles of music, namely blues, classical, and lounge, which are considered informally by some musicians as being “big dividers” among music genres, showing the efficacy of the proposed algorithms and establishing a relationship between the relevance of each set of parameters for each music style and each classifier. In contrast to other works, entropies and fractal dimensions are the features adopted for the classifications.

  19. Automatic Classification of Attacks on IP Telephony

    Directory of Open Access Journals (Sweden)

    Jakub Safarik

    2013-01-01

    Full Text Available This article proposes an algorithm for automatic analysis of attack data in an IP telephony network using a neural network. Data for the analysis is gathered from various monitoring applications running in the network. Such monitoring systems are a typical part of today's networks. Information from them is usually used after an attack. It is possible to use automatic classification of IP telephony attacks for nearly real-time classification and for countering or mitigating potential attacks. The classification uses the proposed neural network, and the article covers the design of the neural network and its practical implementation. It also contains methods for neural network learning and data gathering functions from a honeypot application.

  20. Land Cover and Land Use Classification with TWOPAC: towards Automated Processing for Pixel- and Object-Based Image Classification

    Directory of Open Access Journals (Sweden)

    Stefan Dech

    2012-09-01

    Full Text Available We present a novel and innovative automated processing environment for the derivation of land cover (LC) and land use (LU) information. This processing framework named TWOPAC (TWinned Object and Pixel based Automated classification Chain) enables the standardized, independent, user-friendly, and comparable derivation of LC and LU information, with minimized manual classification labor. TWOPAC allows classification of multi-spectral and multi-temporal remote sensing imagery from different sensor types. TWOPAC enables not only pixel-based classification, but also allows classification based on object-based characteristics. Classification is based on a Decision Tree approach (DT) for which the well-known C5.0 code has been implemented, which builds decision trees based on the concept of information entropy. TWOPAC enables automatic generation of the decision tree classifier based on a C5.0-retrieved ascii-file, as well as fully automatic validation of the classification output via sample based accuracy assessment. Envisaging the automated generation of standardized land cover products, as well as area-wide classification of large amounts of data in preferably a short processing time, standardized interfaces for process control, Web Processing Services (WPS), as introduced by the Open Geospatial Consortium (OGC), are utilized. TWOPAC’s functionality to process geospatial raster or vector data via web resources (server, network) enables TWOPAC’s usability independent of any commercial client or desktop software and allows for large scale data processing on servers. Furthermore, the components of TWOPAC were built-up using open source code components and are implemented as a plug-in for Quantum GIS software for easy handling of the classification process from the user’s perspective.
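
    The entropy concept mentioned in the abstract above can be illustrated with a minimal sketch. It is not the C5.0 code or any part of TWOPAC: the function names, the toy spectral-index values and the labels are hypothetical, and the split criterion shown is plain information gain, assumed here for simplicity.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting samples at value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty separates nothing
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

# Hypothetical toy data: one spectral index per pixel and its land-cover label.
index_values = [0.10, 0.15, 0.20, 0.60, 0.70, 0.80]
cover = ["soil", "soil", "soil", "forest", "forest", "forest"]

best_gain = information_gain(index_values, cover, threshold=0.4)
poor_gain = information_gain(index_values, cover, threshold=0.12)
```

    A tree builder would evaluate many candidate thresholds per feature and keep the one with the largest gain; in this toy data the threshold 0.4 separates the two classes completely, while 0.12 does not.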

  1. SHIP CLASSIFICATION FROM MULTISPECTRAL VIDEOS

    Directory of Open Access Journals (Sweden)

    Frederique Robert-Inacio

    2012-05-01

    Full Text Available Surveillance of a seaport can be achieved by different means: radar, sonar, cameras, radio communications and so on. Such surveillance aims, on the one hand, to manage cargo and tanker traffic and, on the other hand, to prevent terrorist attacks in sensitive areas. In this paper an application to video-surveillance of a seaport entrance is presented, and more particularly the different steps that enable mobile shapes to be classified. This classification is based on a parameter measuring the degree of similarity between the shape under study and a set of reference shapes. The classification result describes the considered mobile in terms of shape and speed.

  2. A New Classification Approach Based on Multiple Classification Rules

    OpenAIRE

    Zhongmei Zhou

    2014-01-01

    A good classifier can correctly predict new data for which the class label is unknown, so it is important to construct a high accuracy classifier. Hence, classification techniques are very useful in ubiquitous computing. Associative classification achieves higher classification accuracy than some traditional rule-based classification approaches. However, the approach also has two major deficiencies. First, it generates a very large number of association classification rules, especially when t...

  3. 78 FR 68983 - Cotton Futures Classification: Optional Classification Procedure

    Science.gov (United States)

    2013-11-18

    ...-AD33 Cotton Futures Classification: Optional Classification Procedure AGENCY: Agricultural Marketing... regulations to allow for the addition of an optional cotton futures classification procedure--identified and... response to requests from the U.S. cotton industry and ICE, AMS will offer a futures classification option...

  4. Multiclass Boosting with Adaptive Group-Based kNN and Its Application in Text Categorization

    Directory of Open Access Journals (Sweden)

    Lei La

    2012-01-01

    Full Text Available AdaBoost is an excellent committee-based tool for classification. However, its effectiveness and efficiency in multiclass categorization face challenges from methods based on support vector machines (SVM), neural networks (NN), naïve Bayes, and k-nearest neighbor (kNN). This paper uses a novel multi-class AdaBoost algorithm to avoid reducing the multi-class classification problem to multiple two-class classification problems. This novel method is more effective. In addition, it keeps the accuracy advantage of existing AdaBoost. An adaptive group-based kNN method is proposed in this paper to build more accurate weak classifiers and in this way keep the number of basis classifiers in an acceptable range. To further enhance the performance, weak classifiers are combined into a strong classifier through a double iterative weighted scheme, yielding an adaptive group-based kNN boosting algorithm (AGkNN-AdaBoost). We implement AGkNN-AdaBoost in a Chinese text categorization system. Experimental results showed that the classification algorithm proposed in this paper performs better in both precision and recall than many other text categorization methods, including traditional AdaBoost. In addition, the processing speed is significantly higher than that of the original AdaBoost and many other classic categorization algorithms.

  5. A Way Forward for Ship Classification and Technical Services

    Directory of Open Access Journals (Sweden)

    Lam-Bee Goh

    2014-04-01

    Full Text Available Classification societies are one of the key organizations that promote the highest standards in ship safety and quality shipping. The paper reviews the ship classification industry and identifies what the classification societies can do to add value to the maritime industry more effectively. To meet this objective, an analysis of the five competitive forces is carried out, together with an opinion survey of some of the leading shipping companies, to assess and establish some of the key factors which should be considered when formulating an overall business strategy for the growth of the classification services business. The findings from the study are discussed together with the strategic options and choices. A value chain analysis of the classification services industry, together with ship management and operation, is undertaken to explore the opportunities for classification societies. These findings also provide guidance to policy-makers who design and seek to implement more effective international shipping policies.

  6. Integrating Globality and Locality for Robust Representation Based Classification

    Directory of Open Access Journals (Sweden)

    Zheng Zhang

    2014-01-01

    Full Text Available The representation based classification method (RBCM) has shown huge potential for face recognition since it first emerged. The linear regression classification (LRC) method and the collaborative representation classification (CRC) method are two well-known RBCMs. LRC and CRC exploit training samples of each class and all the training samples to represent the testing sample, respectively, and subsequently conduct classification on the basis of the representation residual. The LRC method can be viewed as a “locality representation” method because it just uses the training samples of each class to represent the testing sample and it cannot embody the effectiveness of the “globality representation.” On the contrary, it seems that the CRC method cannot own the benefit of locality of the general RBCM. Thus we propose to integrate CRC and LRC to perform more robust representation based classification. The experimental results on benchmark face databases substantially demonstrate that the proposed method achieves high classification accuracy.

  7. Involvement of Machine Learning for Breast Cancer Image Classification: A Survey

    Directory of Open Access Journals (Sweden)

    Abdullah-Al Nahid

    2017-01-01

    Full Text Available Breast cancer is one of the largest causes of women’s death in the world today. Advanced engineering of natural image classification techniques and Artificial Intelligence methods has largely been used for the breast-image classification task. The involvement of digital image classification gives doctors and physicians a second opinion, and it saves their time. Despite the various publications on breast image classification, very few review papers are available which provide a detailed description of breast cancer image classification techniques, feature extraction and selection procedures, classification measuring parameterizations, and image classification findings. We have put a special emphasis on the Convolutional Neural Network (CNN) method for breast image classification. Along with the CNN method we have also described the involvement of the conventional Neural Network (NN), Logic Based classifiers such as the Random Forest (RF) algorithm, Support Vector Machines (SVM), Bayesian methods, and a few of the semisupervised and unsupervised methods which have been used for breast image classification.

  8. Classification of low-resource livestock producers in the North West ...

    African Journals Online (AJOL)

    Classification of low-resource livestock producers in the North West Province. I.V. Nsahlai, A.T. Sedumedi. Abstract. (South African J of Animal Science, 2000, 30, Supplement 1: 109-110).

  9. Document representations for classification of short web-page descriptions

    Directory of Open Access Journals (Sweden)

    Radovanović Miloš

    2008-01-01

    Full Text Available Motivated by applying Text Categorization to classification of Web search results, this paper describes an extensive experimental study of the impact of bag-of-words document representations on the performance of five major classifiers - Naïve Bayes, SVM, Voted Perceptron, kNN and C4.5. The texts, representing short Web-page descriptions sorted into a large hierarchy of topics, are taken from the dmoz Open Directory Web-page ontology, and classifiers are trained to automatically determine the topics which may be relevant to a previously unseen Web-page. Different transformations of input data: stemming, normalization, logtf and idf, together with dimensionality reduction, are found to have a statistically significant improving or degrading effect on classification performance measured by classical metrics - accuracy, precision, recall, F1 and F2. The emphasis of the study is not on determining the best document representation which corresponds to each classifier, but rather on describing the effects of every individual transformation on classification, together with their mutual relationships.
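
    The bag-of-words transformations named in the abstract above (logtf, idf, normalization) can be sketched as follows. The exact formulas used in the study are not given in the abstract, so the common definitions log(1 + tf) and log(N / df) are assumed here; the function name `bow_vectors` and the three sample descriptions are hypothetical, and stemming and dimensionality reduction are left out.

```python
import math
from collections import Counter

def bow_vectors(docs, use_logtf=True, use_idf=True, normalize=True):
    """Turn whitespace-tokenized documents into weighted bag-of-words vectors."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vec = []
        for term in vocab:
            weight = float(tf[term])
            if use_logtf:
                weight = math.log(1.0 + weight)      # assumed logtf definition
            if use_idf:
                weight *= math.log(n_docs / df[term])  # assumed idf definition
            vec.append(weight)
        if normalize:
            norm = math.sqrt(sum(x * x for x in vec))
            if norm > 0:
                vec = [x / norm for x in vec]        # scale to unit length
        vectors.append(vec)
    return vocab, vectors

# Hypothetical short Web-page descriptions.
docs = ["open directory web page", "web search results", "web page ontology"]
vocab, vectors = bow_vectors(docs)
```

    Each switch can be toggled independently, which mirrors the idea of measuring the effect of one transformation at a time. With idf enabled, a term that occurs in every description (here "web") receives weight zero.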

  10. Defuzzification Strategies for Fuzzy Classifications of Remote Sensing Data

    Directory of Open Access Journals (Sweden)

    Peter Hofmann

    2016-06-01

    Full Text Available The classes in fuzzy classification schemes are defined as fuzzy sets, partitioning the feature space through fuzzy rules, defined by fuzzy membership functions. Applying fuzzy classification schemes in remote sensing allows each pixel or segment to be an incomplete member of more than one class simultaneously, i.e., one that does not fully meet all of the classification criteria for any one of the classes and is a member of more than one class simultaneously. This can lead to fuzzy, ambiguous and uncertain class assignation, which is unacceptable for many applications, indicating the need for a reliable defuzzification method. Defuzzification in remote sensing has, to date, been performed by “crisp-assigning” each fuzzy-classified pixel or segment to the class for which it best fulfills the fuzzy classification rules, regardless of its classification fuzziness, uncertainty or ambiguity (maximum method). The defuzzification of an uncertain or ambiguous fuzzy classification leads to a more or less reliable crisp classification. In this paper the most common parameters for expressing classification uncertainty, fuzziness and ambiguity are analysed and discussed in terms of their ability to express the reliability of a crisp classification. This is done by means of a typical practical example from Object Based Image Analysis (OBIA).
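
    The maximum method described in the abstract above amounts to picking the class with the highest membership degree. The sketch below shows that step together with one possible ambiguity indicator, the gap between the best and second-best membership. The indicator is an assumption made for illustration, not necessarily one of the parameters analysed in the paper, and the function name and membership values are hypothetical.

```python
def defuzzify_max(memberships):
    """Crisp-assign a segment to its best class.

    memberships: dict mapping class name -> membership degree in [0, 1].
    Returns (best_class, best_degree, stability), where stability is the
    gap between the best and the second-best membership degree.
    """
    ranked = sorted(memberships.items(), key=lambda item: item[1], reverse=True)
    best_class, best_degree = ranked[0]
    second_degree = ranked[1][1] if len(ranked) > 1 else 0.0
    return best_class, best_degree, best_degree - second_degree

# Hypothetical fuzzy classification result for one image segment.
segment = {"water": 0.20, "forest": 0.70, "urban": 0.65}
label, degree, stability = defuzzify_max(segment)
```

    Here the segment is assigned to "forest", but the small gap to "urban" flags the crisp label as ambiguous even though the winning membership degree is fairly high.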

  11. NEW CLASSIFICATION OF ECOPOLICES

    Directory of Open Access Journals (Sweden)

    VOROBYOV V. V.

    2016-09-01

    Full Text Available Problem statement. Ecopolices are the newest stage of urban planning. They have to be considered as material and energy informational structures included in the dynamic-evolutionary matrix nets of exchange processes in ecosystems. However, no ecopolice classifications have yet been developed on the basis of such approaches, and this determines the topicality of the article. Analysis of publications on theoretical and applied aspects of ecopolice formation showed that work on them is carried out mainly in the context of the latest scientific and technological achievements in various fields of knowledge. These settlements are technocratic. They are connected with the morphology of space and the network structures of regional and local natural ecosystems, have no independent stability, and cannot exist without continuous human support. In other words, they do not fit the idea of an ecopolice. The time has come for an objective, symbiotic search for the ecopolice concept, together with the development of ecopolice classifications. Purpose statement. To develop objective evidence for ecopolices and to propose a new classification of them. Conclusion. The classification of ecopolices has to rest on the idea of correlating the elements of their general plans and the types of human activity with the natural mechanism of receiving, processing and transmitting matter, energy and information between geo-ecosystems, the planet, man, the material part of ecopolices and the Cosmos. A new classification of ecopolices should be based on the principles of multi-dimensional, time-spaced symbiotic clarity with exchange ecosystem networks. With this approach, the function of an ecopolice derives not from the subjective anthropocentric economy but from the holistic objective of the Genesis paradigm. Or, put otherwise, not from the Consequence but from the Cause.

  12. Differential Classification of Dementia

    Directory of Open Access Journals (Sweden)

    E. Mohr

    1995-01-01

    Full Text Available In the absence of biological markers, dementia classification remains complex both in terms of characterization as well as early detection of the presence or absence of dementing symptoms, particularly in diseases with possible secondary dementia. An empirical, statistical approach using neuropsychological measures was therefore developed to distinguish demented from non-demented patients and to identify differential patterns of cognitive dysfunction in neurodegenerative disease. Age-scaled neurobehavioral test results (Wechsler Adult Intelligence Scale—Revised and Wechsler Memory Scale) from Alzheimer's (AD) and Huntington's (HD) patients, matched for intellectual disability, as well as normal controls were used to derive a classification formula. Stepwise discriminant analysis accurately (99% correct) distinguished controls from demented patients, and separated the two patient groups (79% correct). Variables discriminating between HD and AD patient groups consisted of complex psychomotor tasks, visuospatial function, attention and memory. The reliability of the classification formula was demonstrated with a new, independent sample of AD and HD patients which yielded virtually identical results (classification accuracy for dementia: 96%; AD versus HD: 78%). To validate the formula, the discriminant function was applied to Parkinson's (PD) patients, 38% of whom were classified as demented. The validity of the classification was demonstrated by significant PD subgroup differences on measures of dementia not included in the discriminant function. Moreover, a majority of demented PD patients (65%) were classified as having an HD-like pattern of cognitive deficits, in line with previous reports of the subcortical nature of PD dementia. This approach may thus be useful in classifying presence or absence of dementia and in discriminating between dementia subtypes in cases of secondary or coincidental dementia.

  13. Classification of forensic autopsy reports through conceptual graph-based document representation model.

    Science.gov (United States)

    Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa; Al-Garadi, Mohammed Ali

    2018-06-01

    Text categorization has been used extensively in recent years to classify plain-text clinical reports. This study employs text categorization techniques for the classification of open narrative forensic autopsy reports. One of the key steps in text classification is document representation. In document representation, a clinical report is transformed into a format that is suitable for classification. The traditional document representation technique for text categorization is the bag-of-words (BoW) technique. In this study, the traditional BoW technique is ineffective in classifying forensic autopsy reports because it merely extracts frequent but discriminative features from clinical reports. Moreover, this technique fails to capture word inversion, as well as word-level synonymy and polysemy, when classifying autopsy reports. Hence, the BoW technique suffers from low accuracy and low robustness unless it is improved with contextual and application-specific information. To overcome the aforementioned limitations of the BoW technique, this research aims to develop an effective conceptual graph-based document representation (CGDR) technique to classify 1500 forensic autopsy reports from four (4) manners of death (MoD) and sixteen (16) causes of death (CoD). Term-based and Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) based conceptual features were extracted and represented through graphs. These features were then used to train a two-level text classifier. The first level classifier was responsible for predicting MoD. In addition, the second level classifier was responsible for predicting CoD using the proposed conceptual graph-based document representation technique. To demonstrate the significance of the proposed technique, its results were compared with those of six (6) state-of-the-art document representation techniques. 
    Lastly, this study compared the effects of one-level classification and two-level classification on the experimental results.

  14. Towards Automatic Classification of Wikipedia Content

    Science.gov (United States)

    Szymański, Julian

    Wikipedia - the Free Encyclopedia encounters the problem of proper classification of new articles every day. The assignment of articles to categories is performed manually and is a time-consuming task. It requires knowledge of the Wikipedia structure that is beyond typical editor competence, which leads to human-caused mistakes: omitted or wrong assignments of articles to categories. The article presents an application of an SVM classifier for automatic classification of documents from The Free Encyclopedia. The classifier has been tested using two text representations: inter-document connections (hyperlinks) and word content. The results of the experiments, evaluated on hand-crafted data, show that the Wikipedia classification process can be partially automated. The proposed approach can be used for building a decision support system which suggests to editors the best categories for new content entered into Wikipedia.

  15. Classification of Metal-Deficient Dwarfs in the Vilnius Photometric System

    Directory of Open Access Journals (Sweden)

    Lazauskaitė R.

    2003-12-01

    Full Text Available Methods used for the quantitative classification of metal-deficient stars in the Vilnius photometric system are reviewed. We present a new calibration of absolute magnitudes for dwarfs and subdwarfs, based on Hipparcos parallaxes. The new classification scheme is applied to a sample of Population II visual binaries.

  16. Titulus Scuola: the new file classification schema for Italian schools

    Directory of Open Access Journals (Sweden)

    Gianni Penzo Doria

    2017-05-01

    Full Text Available This article presents the new national file classification schema for Italian schools, produced by the Italian Directorate General of Archives of the Ministry for Cultural Heritage, within the project Titulus Scuola. This classification schema represents the starting point for a standard document system, aimed at digital administration.

  17. Is overall similarity classification less effortful than single-dimension classification?

    Science.gov (United States)

    Wills, Andy J; Milton, Fraser; Longmore, Christopher A; Hester, Sarah; Robinson, Jo

    2013-01-01

    It is sometimes argued that the implementation of an overall similarity classification is less effortful than the implementation of a single-dimension classification. In the current article, we argue that the evidence securely in support of this view is limited, and report additional evidence in support of the opposite proposition--overall similarity classification is more effortful than single-dimension classification. Using a match-to-standards procedure, Experiments 1A, 1B and 2 demonstrate that concurrent load reduces the prevalence of overall similarity classification, and that this effect is robust to changes in the concurrent load task employed, the level of time pressure experienced, and the short-term memory requirements of the classification task. Experiment 3 demonstrates that participants who produced overall similarity classifications from the outset have larger working memory capacities than those who produced single-dimension classifications initially, and Experiment 4 demonstrates that instructions to respond meticulously increase the prevalence of overall similarity classification.

  18. Memristive Perceptron for Combinational Logic Classification

    Directory of Open Access Journals (Sweden)

    Lidan Wang

    2013-01-01

    Full Text Available The resistance of the memristor depends upon the past history of the input current or voltage, so it can function as a synapse in neural networks. In this paper, a novel perceptron combined with the memristor is proposed to implement the combinational logic classification. The relationship between the memristive conductance change and the synapse weight update is deduced, and the memristive perceptron model and its synaptic weight update rule are explored. The feasibility of the novel memristive perceptron for implementing the combinational logic classification (NAND, NOR, XOR, and NXOR) is confirmed by MATLAB simulation.
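
    As a software analogue of the perceptron described in the abstract above, the sketch below trains a single threshold unit on the NAND truth table with the classic perceptron learning rule. It does not model a memristor: the numeric weight update merely stands in for the conductance change, and the function names, learning rate and epoch count are assumptions chosen for illustration.

```python
def predict(weights, bias, x1, x2):
    """Threshold unit: fires (1) when the weighted sum is positive."""
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0

def train_perceptron(samples, epochs=20, lr=1.0):
    """Classic perceptron rule: move the weights by lr * error * input."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - predict(weights, bias, x1, x2)
            weights[0] += lr * error * x1
            weights[1] += lr * error * x2
            bias += lr * error
    return weights, bias

# Truth tables for two of the gates named in the abstract.
NAND = [((0, 0), 1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

nand_w, nand_b = train_perceptron(NAND)
nand_correct = sum(predict(nand_w, nand_b, *x) == t for x, t in NAND)

xor_w, xor_b = train_perceptron(XOR)
xor_correct = sum(predict(xor_w, xor_b, *x) == t for x, t in XOR)
```

    NAND is linearly separable, so the single unit learns all four input patterns. XOR and NXOR are not linearly separable, so a single unit of this kind always misclassifies at least one pattern; how the paper realises those gates is not stated in the abstract.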

  19. EEG BASED COGNITIVE WORKLOAD CLASSIFICATION DURING NASA MATB-II MULTITASKING

    Directory of Open Access Journals (Sweden)

    Sushil Chandra

    2015-06-01

    Full Text Available The objective of this experiment was to determine the best possible input EEG feature for classification of workload while designing load balancing logic for an automated operator. The input features compared in this study consisted of spectral features of electroencephalography, objective scoring and subjective scoring. The method is used to identify the best EEG feature as an input to neural network classifiers for workload classification, to identify the channels which could provide classification with the highest accuracy, and to identify the EEG feature which could discriminate among workload levels without adding any classifiers. The results showed that the Engagement Index is the best feature for neural network classification.

  20. Basic Hand Gestures Classification Based on Surface Electromyography

    Directory of Open Access Journals (Sweden)

    Aleksander Palkowski

    2016-01-01

    Full Text Available This paper presents an innovative classification system for hand gestures using 2-channel surface electromyography analysis. The system developed uses the Support Vector Machine classifier, for which the kernel function and parameter optimisation are conducted additionally by the Cuckoo Search swarm algorithm. The system developed is compared with standard Support Vector Machine classifiers with various kernel functions. The average classification rate of 98.12% has been achieved for the proposed method.

  1. Application of Cocktail method in vegetation classification

    Directory of Open Access Journals (Sweden)

    Hamed Asadi

    2016-09-01

    Full Text Available This study assesses the application of the Cocktail method to the classification of large vegetation databases. For this purpose, a Buxus hyrcana dataset consisting of 442 relevés with 89 species was used. To run the Cocktail method, a preliminary classification was first made with modified TWINSPAN, and a phi analysis of the resulting groups selected the five species with the highest fidelity values. Sociological species groups were then formed by examining the co-occurrence of these five species with other species in the database. Twenty-one plant communities belonging to 6 variants, 17 subassociations, 11 associations, 4 alliances, 1 order and 1 class were recognized by assigning 379 relevés to the sociological species groups using logical formulas. In addition, 63 relevés that the logical formulas did not assign to any sociological species group were assigned by the FPFI index to the group with the highest index value. Given the 91% agreement between the Braun-Blanquet and Cocktail classifications, we suggest the Cocktail method to vegetation scientists as an efficient alternative to the Braun-Blanquet method for classifying large vegetation databases.

  2. Knowledge Dictionary for Information Extraction on the Arabic Text Data

    Directory of Open Access Journals (Sweden)

    Wahyu Jauharis Saputra

    2013-04-01

    Full Text Available Information extraction is an early stage of textual data analysis. It is required to obtain information from textual data that can be used for analysis processes such as classification and categorization. Textual data is strongly influenced by the language. Arabic is gaining significant attention in many studies because the Arabic language is very different from others and, in contrast to other languages, tools and research for Arabic are still lacking. The information extracted using the knowledge dictionary is a concept of expression. A knowledge dictionary is usually constructed manually by an expert, which takes a long time and is specific to one problem only. This paper proposes a method for automatically building a knowledge dictionary. The knowledge dictionary is formed by classifying sentences having the same concept, assuming that they will have a high similarity value. The extracted concepts can be used as features for subsequent computational processes such as classification or categorization. The dataset used in this paper was an Arabic text dataset. The extraction result was tested using a decision tree classification engine; the highest precision value obtained was 71.0% and the highest recall value was 75.0%.

  3. Structural Analysis of the Oxymoron in the Sonnets of William Shakespeare

    Directory of Open Access Journals (Sweden)

    Liliya R. Sakaeva

    2017-11-01

    Full Text Available This paper considers the structural groups of oxymoron in the Russian and English languages. It is also relevant to study oppositional lexical units represented in heterogeneous system languages from the standpoint of linguistic and extralinguistic meanings, since the figures of contrast are inconceivable without the associative-emotional and evaluative qualifications of the objects of opposition. The paper analyzes the nature of the oxymoron and its functions in two differently structured languages and describes its lexical and semantic characteristics. In the linguistic literature there is no generalized, concrete and universal structural and semantic classification of this stylistic device. This study attempts to create a structural and semantic classification combining all the existing varieties of this figure of contrast. The analysis is applied in the linguistic examination of the Sonnets written by William Shakespeare. When studying, systemizing and analyzing the opposite units, it is extremely important to study their structural features. The main objective of this study is to identify and describe the types of oxymoron in the language of Shakespeare’s sonnets.

  4. Text Classification and Distributional features techniques in Datamining and Warehousing

    OpenAIRE

    Bethu, Srikanth; Babu, G Charless; Vinoda, J; Priyadarshini, E; Rao, M Raghavendra

    2013-01-01

    Text categorization is traditionally done using term frequency and inverse document frequency. This type of method is not very good because some words that are not so important may appear in the document. The term frequency of unimportant words may increase, and the document may be classified in the wrong category. To reduce the error of classifying documents in the wrong category, distributional features are introduced. In the distributional features, the distribution of the words in ...
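
    The term frequency and inverse document frequency weighting that this abstract takes as its baseline can be sketched as follows. This is a minimal illustration, assuming whitespace tokenization and the common tf * log(N / df) form; the toy documents are made up, and the distributional features proposed in the paper are not reproduced because the abstract is truncated.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Made-up toy corpus, used only to illustrate the weighting.
docs = [
    "the match ended in a draw",
    "the election ended in a recount",
    "the striker scored in the match",
]
scores = tf_idf(docs)
```

    In this toy corpus, a word that occurs in every document (such as "the") receives a weight of zero, while a word confined to one document (such as "draw") receives the largest weight in that document.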

  5. "What is relevant in a text document?": An interpretable machine learning approach.

    Directory of Open Access Journals (Sweden)

    Leila Arras

    Full Text Available Text documents can be described by a number of abstract concepts such as semantic category, writing style, or sentiment. Machine learning (ML) models have been trained to automatically map documents to these abstract concepts, making it possible to annotate very large text collections, more than could be processed by a human in a lifetime. Besides predicting the text's category very accurately, it is also highly desirable to understand how and why the categorization process takes place. In this paper, we demonstrate that such understanding can be achieved by tracing the classification decision back to individual words using layer-wise relevance propagation (LRP), a recently developed technique for explaining predictions of complex non-linear classifiers. We train two word-based ML models, a convolutional neural network (CNN) and a bag-of-words SVM classifier, on a topic categorization task and adapt the LRP method to decompose the predictions of these models onto words. Resulting scores indicate how much individual words contribute to the overall classification decision. This enables one to distill relevant information from text documents without an explicit semantic information extraction step. We further use the word-wise relevance scores for generating novel vector-based document representations which capture semantic information. Based on these document vectors, we introduce a measure of model explanatory power and show that, although the SVM and CNN models perform similarly in terms of classification accuracy, the latter exhibits a higher level of explainability which makes it more comprehensible for humans and potentially more useful for other applications.
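
    The idea of decomposing a prediction onto words is easiest to see for a purely linear bag-of-words scorer, where each word's share of the score is its weight times its count. The sketch below shows only that simplified case, with the bias term ignored; the weights and counts are hypothetical, and the paper's actual LRP adaptation for the SVM and CNN models is not reproduced here.

```python
def word_relevances(weights, counts):
    """Split a linear score sum(w * x) into one contribution per word."""
    return {word: weights.get(word, 0.0) * count for word, count in counts.items()}


# Hypothetical class weights for a "sports" category (not from the paper).
weights = {"goal": 1.5, "match": 1.0, "election": -2.0}
# Hypothetical word counts for one document.
counts = {"goal": 2, "match": 1, "the": 3}

relevances = word_relevances(weights, counts)
score = sum(relevances.values())
# Words ordered from most to least supportive of the class.
ranked = sorted(relevances, key=relevances.get, reverse=True)
```

    The per-word contributions sum to the classifier score, which is the conservation property that relevance decompositions of this kind aim for.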

  6. Key-phrase based classification of public health web pages.

    Science.gov (United States)

    Dolamic, Ljiljana; Boyer, Célia

    2013-01-01

    This paper describes and evaluates a public health web page classification model based on key phrase extraction and matching. Easily extendible in terms of both new classes and new languages, this method proves to be a good solution for text classification faced with a total lack of training data. To evaluate the proposed solution we have used a small collection of public health related web pages created by a double-blind manual classification. Our experiments have shown that by choosing an adequate threshold value, the desired value for either precision or recall can be achieved.
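
    The role of the threshold can be illustrated with a small sketch. It assumes, purely for illustration, that a page is scored by the share of a class's key phrases found in its text; the phrase lists, the scoring rule and the function name are hypothetical and are not taken from the paper.

```python
def classify_by_key_phrases(text, class_phrases, threshold):
    """Return classes whose share of matched key phrases reaches the threshold."""
    lowered = text.lower()
    assigned = []
    for label, phrases in class_phrases.items():
        matched = sum(1 for phrase in phrases if phrase in lowered)
        if phrases and matched / len(phrases) >= threshold:
            assigned.append(label)
    return assigned


# Hypothetical key phrases per class; no training data is involved.
class_phrases = {
    "vaccination": ["vaccine", "immunization schedule", "booster dose"],
    "nutrition": ["balanced diet", "vitamin", "calorie intake"],
}
page = "Ask your doctor about the vaccine and the booster dose for travellers."

strict = classify_by_key_phrases(page, class_phrases, threshold=0.9)
lenient = classify_by_key_phrases(page, class_phrases, threshold=0.5)
```

    A higher threshold assigns fewer labels, which tends to favour precision, and a lower one assigns more, which tends to favour recall.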

  7. MULTI-TEMPORAL CLASSIFICATION AND CHANGE DETECTION USING UAV IMAGES

    Directory of Open Access Journals (Sweden)

    S. Makuti

    2018-05-01

    Full Text Available In this paper different methodologies for the classification and change detection of UAV image blocks are explored. UAV is not only the cheapest platform for image acquisition but it is also the easiest platform to operate in repeated data collections over a changing area like a building construction site. Two change detection techniques have been evaluated in this study: the pre-classification and the post-classification algorithms. These methods are based on three main steps: feature extraction, classification and change detection. A set of state-of-the-art features has been used in the tests: colour features (HSV), textural features (GLCM) and 3D geometric features. For classification purposes a Conditional Random Field (CRF) has been used: the unary potential was determined using the Random Forest algorithm while the pairwise potential was defined by the fully connected CRF. In the performed tests, different feature configurations and settings have been considered to assess the performance of these methods in such a challenging task. Experimental results showed that the post-classification approach outperforms the pre-classification change detection method. This was analysed using the overall accuracy: post-classification reached an accuracy of up to 62.6 %, whereas pre-classification change detection reached 46.5 %. These results represent a first useful indication for future works and developments.

  8. Classification across gene expression microarray studies

    Directory of Open Access Journals (Sweden)

    Kuner Ruprecht

    2009-12-01

    Full Text Available Abstract Background The increasing number of gene expression microarray studies represents an important resource in biomedical research. As a result, gene expression based diagnosis has entered clinical practice for patient stratification in breast cancer. However, the integration and combined analysis of microarray studies remains a challenge. We assessed the potential benefit of data integration on the classification accuracy and systematically evaluated the generalization performance of selected methods on four breast cancer studies comprising almost 1000 independent samples. To this end, we introduced an evaluation framework which aims to establish good statistical practice and a graphical way to monitor differences. The classification goal was to correctly predict estrogen receptor status (negative/positive) and histological grade (low/high) of each tumor sample in an independent study which was not used for the training. For the classification we chose support vector machines (SVM), predictive analysis of microarrays (PAM), random forest (RF) and k-top scoring pairs (kTSP). Guided by considerations relevant for classification across studies we developed a generalization of kTSP which we evaluated in addition. Our derived version (DV) aims to improve the robustness of the intrinsic invariance of kTSP with respect to technologies and preprocessing. Results For each individual study the generalization error was benchmarked via complete cross-validation and was found to be similar for all classification methods. The misclassification rates were substantially higher in classification across studies, when each single study was used as an independent test set while all remaining studies were combined for the training of the classifier. However, with increasing number of independent microarray studies used in the training, the overall classification performance improved. DV performed better than the average and showed slightly less variance. In

  9. A linear-RBF multikernel SVM to classify big text corpora.

    Science.gov (United States)

    Romero, R; Iglesias, E L; Borrajo, L

    2015-01-01

    Support vector machine (SVM) is a powerful technique for classification. However, SVM is not suitable for classification of large datasets or text corpora, because the training complexity of SVMs is highly dependent on the input size. Recent developments in the literature on the SVM and other kernel methods emphasize the need to consider multiple kernels or parameterizations of kernels because they provide greater flexibility. This paper shows a multikernel SVM to manage highly dimensional data, providing an automatic parameterization with low computational cost and improving results against SVMs parameterized under a brute-force search. The model consists in spreading the dataset into cohesive term slices (clusters) to construct a defined structure (multikernel). The new approach is tested on different text corpora. Experimental results show that the new classifier has good accuracy compared with the classic SVM, while the training is significantly faster than several other SVM classifiers.

  10. Nurses' perception about risk classification in an emergency service

    Directory of Open Access Journals (Sweden)

    Cristiane Chaves de Souza

    2014-04-01

    Full Text Available Objective. To understand how nurses perceive the practice of risk classification in an emergency service. Methodology. This qualitative study included 11 nurses with at least two months of experience in the risk classification of patients who visited the emergency service. Semistructured interviews were used to collect the information. The data were collected between August and December 2011. For data analysis, Bardin's theoretical framework was used. Results. The nurses in the study consider risk classification a work organization instrument that permits closer contact between nurses and patients. The nursing skills needed for risk classification were identified: knowledge about the scale used, clinical perspective, patience and agility. The availability of risk classification scales was the main facilitator of this work. The main difficulties were the disorganization of the care network and the health team's lack of knowledge of the protocol. Conclusion. Risk classification offers an opportunity for professional autonomy to the extent that it is chiefly responsible for regulating care at the entry door of the emergency services.

  11. Proverbs 30:18-19 in the Light of Ancient Mesopotamian Cuneiform Texts

    Directory of Open Access Journals (Sweden)

    Böck, Barbara

    2009-12-01

    Full Text Available The meaning of Proverbs 30:18-19 has long been disputed. Most scholars interpret the Biblical couplets textually on stylistic features only; an explanation of the contextual association between the four motifs mentioned (eagle, serpent, boat, man and woman) has not yet been undertaken. The present paper aims at shedding light on the motivation for this association, taking into consideration ancient Near Eastern cuneiform compositions for the first time. It is further suggested that Proverbs 30:18-19 derived originally from a riddle that had its setting in a wedding ceremony.

  12. Learning soil classification with the Kayapó indians

    Directory of Open Access Journals (Sweden)

    Cooper Miguel

    2005-01-01

    Full Text Available The Kayapó Xicrin do Cateté (Xicrin) indigenous reserve is located within the Amazon forest in Pará (Brazil). The Xicrins have developed a soil classification system that is incorporated in their language and culture. The etymology of their classification system and its logical structure make it similar to and comparable with modern soil classification. The etymology of the Xicrin's language is based on the junction of radicals to form words for different soil names. The name of the soil is formed by the main noun radical "puka", to which adjectives referring to soil morphological attributes are added. Modern classification systems are also based on similar morphological variables, and analytical support for defining boundaries of chemical or physical soil attributes is important only in lower hierarchical levels. Soil scientists have developed a soil classification system that is sensitive to the restrictions and potentialities the soil will show for modern agriculture. The Xicrins classify soils for what is important for their life style, i.e. a harmonic and friendly life with the resources they gain from the forest.

  13. A statistical approach to root system classification.

    Directory of Open Access Journals (Sweden)

    Gernot Bodner

    2013-08-01

    Full Text Available Plant root systems have a key role in ecology and agronomy. Despite the rapid increase in root studies, there is still no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for plant functional type identification in ecology can be applied to the classification of root systems. We demonstrate that combining principal component and cluster analysis yields a meaningful classification of rooting types based on morphological traits. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. Biplot inspection is used to determine key traits and to ensure stability in cluster based grouping. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong, but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Three rooting types emerged from measured data, distinguished by diameter/weight, density and spatial distribution respectively. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity efforts in architectural measurement

  14. Classification in context

    DEFF Research Database (Denmark)

    Mai, Jens Erik

    2004-01-01

    This paper surveys classification research literature, discusses various classification theories, and shows that the focus has traditionally been on establishing a scientific foundation for classification research. This paper argues that a shift has taken place, and suggests that contemporary...... classification research focus on contextual information as the guide for the design and construction of classification schemes....

  15. Classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2017-01-01

    This article presents and discusses definitions of the term “classification” and the related concepts “concept/conceptualization,” “categorization,” “ordering,” “taxonomy” and “typology.” It further presents and discusses theories of classification including the influences of Aristotle...... and Wittgenstein. It presents different views on forming classes, including logical division, numerical taxonomy, historical classification, hermeneutical and pragmatic/critical views. Finally, issues related to artificial versus natural classification and taxonomic monism versus taxonomic pluralism are briefly...

  16. Phenotype classification of zebrafish embryos by supervised learning.

    Directory of Open Access Journals (Sweden)

    Nathalie Jeanray

    Full Text Available Zebrafish is increasingly used to assess biological properties of chemical substances and thus is becoming a specific tool for toxicological and pharmacological studies. The effects of chemical substances on embryo survival and development are generally evaluated manually through microscopic observation by an expert and documented by several typical photographs. Here, we present a methodology to automatically classify brightfield images of wildtype zebrafish embryos according to their defects by using an image analysis approach based on supervised machine learning. We show that, compared to manual classification, automatic classification results in 90 to 100% agreement with consensus voting of biological experts in nine out of eleven considered defects in 3 days old zebrafish larvae. Automation of the analysis and classification of zebrafish embryo pictures reduces the workload and time required for the biological expert and increases the reproducibility and objectivity of this classification.

  17. Nonlinear programming for classification problems in machine learning

    Science.gov (United States)

    Astorino, Annabella; Fuduli, Antonio; Gaudioso, Manlio

    2016-10-01

    We survey some nonlinear models for classification problems arising in machine learning. In the last years this field has become more and more relevant due to a lot of practical applications, such as text and web classification, object recognition in machine vision, gene expression profile analysis, DNA and protein analysis, medical diagnosis, customer profiling etc. Classification deals with separation of sets by means of appropriate separation surfaces, which is generally obtained by solving a numerical optimization model. While linear separability is the basis of the most popular approach to classification, the Support Vector Machine (SVM), in the recent years using nonlinear separating surfaces has received some attention. The objective of this work is to recall some of such proposals, mainly in terms of the numerical optimization models. In particular we tackle the polyhedral, ellipsoidal, spherical and conical separation approaches and, for some of them, we also consider the semisupervised versions.

  18. A Look at the Practice of Risk Classification: Integrative Review

    Directory of Open Access Journals (Sweden)

    Luiz Alves Morais Filho

    2017-03-01

    Full Text Available Introduction: the increase in the number of patients in urgent and emergency services brought the need for screening/risk classification as a way to organize urgent and emergency care in health institutions. Objectives: to understand, based on Brazilian scientific production, how risk classification is practised in the Brazilian context and how nurses are involved in it. Methods: an integrative review was carried out; data collection took place in September 2015 in the following databases: Scientific Electronic Library Online (SciELO), Medical Literature Analysis and Retrieval System Online (Medline), the Latin American and Caribbean System of Information on Health Sciences (LILACS) and Google Scholar. Results: 9,874 articles were found and 33 were selected for analysis. The results were organized in four categories: risk classification as assistance qualifier; organization of risk classification; operational weaknesses of risk classification; and the nurse's role in risk classification. Conclusion: We conclude that risk classification qualifies the assistance in emergency services; there are many difficulties in operating risk classification; and the nurse has been established as a professional with the technical and legal competence to perform it.

  19. Automated authorship attribution using advanced signal classification techniques.

    Directory of Open Access Journals (Sweden)

    Maryam Ebrahimpour

    Full Text Available In this paper, we develop two automated authorship attribution schemes, one based on Multiple Discriminant Analysis (MDA) and the other based on a Support Vector Machine (SVM). The classification features we exploit are based on word frequencies in the text. We adopt an approach of preprocessing each text by stripping it of all characters except a-z and space. This is in order to increase the portability of the software to different types of texts. We test the methodology on a corpus of undisputed English texts, and use leave-one-out cross validation to demonstrate classification accuracies in excess of 90%. We further test our methods on the Federalist Papers, which have a partly disputed authorship and a fair degree of scholarly consensus. And finally, we apply our methodology to the question of the authorship of the Letter to the Hebrews by comparing it against a number of original Greek texts of known authorship. These tests identify where some of the limitations lie, motivating a number of open questions for future work. An open source implementation of our methodology is freely available for use at https://github.com/matthewberryman/author-detection.
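
    The preprocessing step and the word-frequency features described here can be sketched in a few lines. This is an illustrative sketch rather than the authors' implementation: lowercasing before stripping is an assumption (the abstract only says that everything except a-z and space is removed), and the function names and sample sentence are hypothetical.

```python
import re
from collections import Counter


def preprocess(text):
    """Lowercase the text and keep only the characters a-z and space."""
    return re.sub(r"[^a-z ]", "", text.lower())


def relative_word_frequencies(text):
    """Map each word to its share of all words in the cleaned text."""
    words = preprocess(text).split()
    counts = Counter(words)
    return {word: count / len(words) for word, count in counts.items()}


# Hypothetical sample sentence, not taken from the paper's corpus.
sample = "The People, the STATE; and the Union!"
cleaned = preprocess(sample)
freqs = relative_word_frequencies(sample)
```

    Because line breaks are also stripped under this rule, words separated only by a newline would be joined; replacing newlines with spaces beforehand is one way to avoid that.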

  20. A Hybrid Feature Selection Approach for Arabic Documents Classification

    NARCIS (Netherlands)

    Habib, Mena Badieh; Sarhan, Ahmed A. E.; Salem, Abdel-Badeeh M.; Fayed, Zaki T.; Gharib, Tarek F.

    Text Categorization (classification) is the process of classifying documents into a predefined set of categories based on their content. Text categorization algorithms usually represent documents as bags of words and consequently have to deal with huge number of features. Feature selection tries to

  1. Vaccine adverse event text mining system for extracting features from vaccine safety reports.

    Science.gov (United States)

    Botsis, Taxiarchis; Buttolph, Thomas; Nguyen, Michael D; Winiecki, Scott; Woo, Emily Jane; Ball, Robert

    2012-01-01

    To develop and evaluate a text mining system for extracting key clinical features from vaccine adverse event reporting system (VAERS) narratives to aid in the automated review of adverse event reports. Based upon clinical significance to VAERS reviewing physicians, we defined the primary (diagnosis and cause of death) and secondary features (eg, symptoms) for extraction. We built a novel vaccine adverse event text mining (VaeTM) system based on a semantic text mining strategy. The performance of VaeTM was evaluated using a total of 300 VAERS reports in three sequential evaluations of 100 reports each. Moreover, we evaluated the VaeTM contribution to case classification; an information retrieval-based approach was used for the identification of anaphylaxis cases in a set of reports and was compared with two other methods: a dedicated text classifier and an online tool. The performance metrics of VaeTM were text mining metrics: recall, precision and F-measure. We also conducted a qualitative difference analysis and calculated sensitivity and specificity for classification of anaphylaxis cases based on the above three approaches. VaeTM performed best in extracting diagnosis, second level diagnosis, drug, vaccine, and lot number features (lenient F-measure in the third evaluation: 0.897, 0.817, 0.858, 0.874, and 0.914, respectively). In terms of case classification, high sensitivity was achieved (83.1%); this was equal and better compared to the text classifier (83.1%) and the online tool (40.7%), respectively. Our VaeTM implementation of a semantic text mining strategy shows promise in providing accurate and efficient extraction of key features from VAERS narratives.
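
    The evaluation metrics named in this abstract follow standard definitions, sketched below from raw counts of true and false positives and negatives. The counts in the example are illustrative only and are not taken from the VaeTM evaluation.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and balanced F-measure from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity is recall on positives; specificity is recall on negatives."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


# Illustrative counts only; they are not taken from the VaeTM evaluation.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=18, fp=2)
```

    Sensitivity and recall are the same quantity; specificity additionally requires the count of true negatives, which the text mining metrics do not use.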

  2. Drug-related webpages classification based on multi-modal local decision fusion

    Science.gov (United States)

    Hu, Ruiguang; Su, Xiaojing; Liu, Yanxin

    2018-03-01

    In this paper, multi-modal local decision fusion is used for drug-related webpage classification. First, meaningful text is extracted through HTML parsing, and effective images are chosen by the FOCARSS algorithm. Second, six SVM classifiers are trained for six kinds of drug-taking instruments, which are represented by PHOG. One SVM classifier is trained for cannabis, which is represented by the mid-feature of the BOW model. For each instance in a webpage, seven SVMs give seven labels for its image, and another seven labels are given by searching for the names of drug-taking instruments and cannabis in its related text. By concatenating the seven image labels and the seven text labels, the representation of the instances in a webpage is generated. Last, Multi-Instance Learning is used to classify the drug-related webpages. Experimental results demonstrate that the classification accuracy of multi-instance learning with multi-modal local decision fusion is much higher than that of single-modal classification.

  3. A simple phenotypic classification for celiac disease

    Directory of Open Access Journals (Sweden)

    Ajit Sood

    2018-04-01

    Full Text Available Background/Aims: Celiac disease is a global health problem. The presentation of celiac disease has unfolded over years and it is now known that it can manifest at different ages, has varied presentations, and is prone to develop complications, if not managed properly. Although the Oslo definitions provide consensus on the various terminologies used in literature, there is no phenotypic classification providing a composite diagnosis for the disease. Methods: Various variables identified for phenotypic classification included age at diagnosis, age at onset of symptoms, clinical presentation, family history and complications. These were applied to the existing registry of 1,664 patients at Dayanand Medical College and Hospital, Ludhiana, India. In addition, age was evaluated as below 15 and below 18 years. Cross tabulations were used for the verification of the classification using the existing data. Expert opinion was sought from both international and national experts of varying fields. Results: After empirical verification, age at diagnosis was considered appropriate in between A1 (<18) and A2 (≧18). The disease presentation has been classified into 3 types: P1 (classical), P2 (non-classical) and P3 (asymptomatic). Complications were considered as absent (C0) or present (C1). A single phenotypic classification based on these 3 characteristics, namely age at the diagnosis, clinical presentation, and intestinal complications (APC classification), was derived. Conclusions: APC classification (age at diagnosis, presentation, complications) is a simple disease explanatory classification for patients with celiac disease aimed at providing a composite diagnosis.
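
    The three characteristics combine into a composite code, which the following sketch spells out. The category boundaries come from the abstract; the function name, the presentation keys and the "A1 P2 C0" label layout are assumptions made for illustration.

```python
def apc_code(age_at_diagnosis, presentation, has_complications):
    """Build an APC label such as 'A1 P2 C0' from the three characteristics."""
    presentation_codes = {
        "classical": "P1",
        "non-classical": "P2",
        "asymptomatic": "P3",
    }
    if presentation not in presentation_codes:
        raise ValueError(f"unknown presentation: {presentation!r}")
    # A1 is age at diagnosis below 18 years, A2 is 18 years or older.
    age_code = "A1" if age_at_diagnosis < 18 else "A2"
    # C0 means complications absent, C1 means complications present.
    complication_code = "C1" if has_complications else "C0"
    return f"{age_code} {presentation_codes[presentation]} {complication_code}"


# Hypothetical patients, used only to show the resulting labels.
child_label = apc_code(12, "classical", False)
adult_label = apc_code(18, "asymptomatic", True)
```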

  4. Adaptive SVM for Data Stream Classification

    Directory of Open Access Journals (Sweden)

    Isah A. Lawal

    2017-07-01

    Full Text Available In this paper, we address the problem of learning an adaptive classifier for the classification of continuous streams of data. We present a solution based on incremental extensions of the Support Vector Machine (SVM) learning paradigm that updates an existing SVM whenever new training data are acquired. To ensure that the SVM effectiveness is guaranteed while exploiting the newly gathered data, we introduce an on-line model selection approach in the incremental learning process. We evaluated the proposed method on real world applications including on-line spam email filtering and human action classification from videos. Experimental results show the effectiveness and the potential of the proposed approach.

  5. 78 FR 54970 - Cotton Futures Classification: Optional Classification Procedure

    Science.gov (United States)

    2013-09-09

    ... Service 7 CFR Part 27 [AMS-CN-13-0043] RIN 0581-AD33 Cotton Futures Classification: Optional Classification Procedure AGENCY: Agricultural Marketing Service, USDA. ACTION: Proposed rule. SUMMARY: The... optional cotton futures classification procedure--identified and known as ``registration'' by the U.S...

  6. Dante per i bambini: percorsi tra riduzioni e riscritture nella prima metà del Novecento

    Directory of Open Access Journals (Sweden)

    Sabrina Fava

    2014-12-01

    Full Text Available During the twentieth century, adaptations of the Divine Comedy for young people began to appear as the cultural industry established itself. This gave impetus to publishing for young people and, thanks to the process of popular education, fostered the popularization of culture among children. The essay presents the literary history of abridgements of the Divine Comedy and highlights that their stylistic quality helped to develop interest among young readers. As readers grow up, this stylistic quality leads them to the original text. The aesthetic beauty of the Divine Comedy can be brought to children without being lost. The adapted texts underline the potential of the original text and show different ways of educating future adult readers.

  7. Knowledge-based approach to video content classification

    Science.gov (United States)

    Chen, Yu; Wong, Edward K.

    2001-01-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand-sides of rules contain high level and low level features, while the right-hand-sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather, reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidences, and to handle the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, and it demonstrated the validity of the proposed approach.
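
    The inexact reasoning mentioned here is usually associated with certainty factors in the range -1 to 1. The sketch below shows the commonly cited rule for combining two certainty factors that bear on the same hypothesis; whether the prototype uses exactly this form is not stated in the abstract, and the evidence values in the example are hypothetical.

```python
def combine_certainty(cf1, cf2):
    """Combine two certainty factors in [-1, 1] for the same hypothesis."""
    if cf1 >= 0 and cf2 >= 0:
        # Two supporting pieces of evidence reinforce each other.
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        # Two opposing pieces of evidence reinforce the disbelief.
        return cf1 + cf2 * (1 + cf1)
    denominator = 1 - min(abs(cf1), abs(cf2))
    if denominator == 0:
        # Total belief against total disbelief has no defined combination.
        raise ValueError("cannot combine certainty factors of +1 and -1")
    return (cf1 + cf2) / denominator


# Hypothetical evidence for the class "news" from two different rules.
from_motion = 0.6
from_text = 0.5
combined = combine_certainty(from_motion, from_text)
conflicting = combine_certainty(0.8, -0.4)
```

    The rule is commutative, so the order in which rules fire does not change the combined certainty of a class.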

  8. Revue bibliographique: les méthodes chimiques d'identification et de classification des champignons

    Directory of Open Access Journals (Sweden)

    Verscheure M.

    2002-01-01

    Full Text Available Chemotaxonomy of fungi: a review. In recent years, advances in molecular methods and analytical techniques have enabled scientists to classify microorganisms on the basis of biochemical characteristics. This classification, called chemotaxonomy, includes molecular methods and chemical methods which provide additional data and lead to better identification and/or classification.

  9. Oscillating feature subset search algorithm for text categorization

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana; Somol, Petr; Pudil, Pavel

    2006-01-01

    Vol. 44, No. 4225 (2006), pp. 578-587 ISSN 0302-9743 R&D Projects: GA AV ČR IAA2075302; GA MŠk 2C06019 EU Projects: European Commission(XE) 507752 - MUSCLE Institutional research plan: CEZ:AV0Z10750506 Keywords: text classification * feature selection * oscillating search algorithm * Bhattacharyya distance Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.402, year: 2005

  10. Inventory classification based on decoupling points

    Directory of Open Access Journals (Sweden)

    Joakim Wikner

    2015-01-01

    Full Text Available The ideal state of continuous one-piece flow may never be achieved. Still the logistics manager can improve the flow by carefully positioning inventory to buffer against variations. Strategies such as lean, postponement, mass customization, and outsourcing all rely on strategic positioning of decoupling points to separate forecast-driven from customer-order-driven flows. Planning and scheduling of the flow are also based on classification of decoupling points as master scheduled or not. A comprehensive classification scheme for these types of decoupling points is introduced. The approach rests on identification of flows as being either demand based or supply based. The demand or supply is then combined with exogenous factors, classified as independent, or endogenous factors, classified as dependent. As a result, eight types of strategic as well as tactical decoupling points are identified resulting in a process-based framework for inventory classification that can be used for flow design.

  11. Wagner classification and culture analysis of diabetic foot infection

    Directory of Open Access Journals (Sweden)

    Fatma Bozkurt

    2011-03-01

    Full Text Available The aim of this study was to determine the concordance ratio between microorganisms isolated from deep tissue culture and those from superficial culture in patients with diabetic foot according to Wagner’s wound classification method. Materials and methods: A total of 63 patients with diabetic foot infection, who were admitted to Dicle University Hospital between October 2006 and November 2007, were included in the study. Wagner’s classification method was used for wound classification. For microbiologic studies, superficial and deep tissue specimens were obtained from each patient and were rapidly sent to the laboratory for aerobic and anaerobic cultures. Microbiologic data were analyzed and interpreted using the sensitivity and specificity formulas. Results: Thirty-eight (60%) of the patients were in Wagner’s classification ≤2, while 25 (40%) patients were in Wagner’s classification ≥3. According to our culture results, 66 (69%) Gram-positive and 30 (31%) Gram-negative microorganisms grew in Wagner classification ≤2 patients, while in Wagner classification ≥3 patients, 25 (35%) Gram-positive and 46 (65%) Gram-negative microorganisms grew. Microorganisms grew in 89% of superficial cultures and 64% of the deep tissue cultures in patients with Wagner classification ≤2, while microorganisms grew in 64% of Wagner classification ≥3. Conclusion: In ulcers of diabetic foot infections, initial treatment should be started according to the result of sterile superficial culture, but a deep tissue culture should be taken if the patient is unresponsive to initial treatment.

  12. A comparative evaluation of sequence classification programs

    Directory of Open Access Journals (Sweden)

    Bazinet Adam L

    2012-05-01

    Full Text Available Abstract. Background: A fundamental problem in modern genomics is to taxonomically or functionally classify DNA sequence fragments derived from environmental sampling (i.e., metagenomics). Several different methods have been proposed for doing this effectively and efficiently, and many have been implemented in software. In addition to varying their basic algorithmic approach to classification, some methods screen sequence reads for 'barcoding genes' like 16S rRNA, or various types of protein-coding genes. Due to the sheer number and complexity of methods, it can be difficult for a researcher to choose one that is well-suited for a particular analysis. Results: We divided the very large number of programs that have been released in recent years for solving the sequence classification problem into three main categories based on the general algorithm they use to compare a query sequence against a database of sequences. We also evaluated the performance of the leading programs in each category on data sets whose taxonomic and functional composition is known. Conclusions: We found significant variability in classification accuracy, precision, and resource consumption of sequence classification programs when used to analyze various metagenomics data sets. However, we observe some general trends and patterns that will be useful to researchers who use sequence classification programs.

  13. CLASSIFICATION BY USING MULTISPECTRAL POINT CLOUD DATA

    Directory of Open Access Journals (Sweden)

    C. T. Liao

    2012-07-01

    Full Text Available Remote sensing images are generally recorded in a two-dimensional format containing multispectral information. The semantic information is also clearly visualized, so ground features can be readily recognized and classified via supervised or unsupervised classification methods. Nevertheless, multispectral images depend heavily on light conditions, and the classification results lack three-dimensional semantic information. On the other hand, LiDAR has become a main technology for acquiring high-accuracy point cloud data. The advantages of LiDAR are a high data acquisition rate, independence from light conditions, and the direct production of three-dimensional coordinates. However, compared with multispectral images, LiDAR lacks multispectral information, which remains a challenge in ground feature classification from massive point cloud data. Consequently, by combining the advantages of both LiDAR and multispectral images, point cloud data with three-dimensional coordinates and multispectral information can provide an integrated solution for point cloud classification. This research therefore acquires visible light and near-infrared images via close-range photogrammetry and matches the images automatically through a free online service to generate a multispectral point cloud. Three-dimensional affine coordinate transformation is then used to compare the data increment. Finally, given thresholds on height and color information are used for classification.

  14. TENSOR MODELING BASED FOR AIRBORNE LiDAR DATA CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    N. Li

    2016-06-01

    Full Text Available Feature selection and description is a key factor in the classification of Earth observation data. In this paper a classification method based on tensor decomposition is proposed. First, multiple features are extracted from the raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the “raw” data attributes. Then, the feature rasters of LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation can keep the initial spatial structure and ensure that the neighborhood is taken into account. Based on a small number of component features, a k-nearest-neighbor classification is applied.

  15. Classification of refrigerants; Classification des fluides frigorigenes

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2001-07-01

    This document is based on the US standard ANSI/ASHRAE 34, published in 2001 and entitled 'Designation and safety classification of refrigerants'. This classification makes it possible to organize clearly, on an international basis, all the refrigerants used in the world by means of a codification of refrigerants that corresponds to their chemical composition. This note explains the codification: prefix, suffixes (hydrocarbons and derived fluids, azeotropic and non-azeotropic mixtures, various organic compounds, non-organic compounds), and safety classification (toxicity, flammability, case of mixtures). (J.S.)

  16. Transportation Modes Classification Using Sensors on Smartphones

    Directory of Open Access Journals (Sweden)

    Shih-Hau Fang

    2016-08-01

    Full Text Available This paper investigates the classification of transportation and vehicular modes using big data from smartphone sensors. The three types of sensors used in this paper are the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms (decision trees, K-nearest neighbor, and support vector machine) to classify the user’s transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives, including the accuracy for both modes, the execution time, and the model size. Results show that the proposed features enhance the accuracy; the support vector machine provides the best classification accuracy but requires the longest prediction time. This paper also investigates the vehicle classification mode and compares the results with those of the transportation modes.

  17. Supervised Cross-Modal Factor Analysis for Multiple Modal Data Classification

    KAUST Repository

    Wang, Jingbin

    2015-10-09

    In this paper we study the problem of learning from multiple modal data for the purpose of document classification. In this problem, each document is composed of two different modals of data, i.e., an image and a text. Cross-modal factor analysis (CFA) has been proposed to project the two different modals of data to a shared data space, so that the classification of an image or a text can be performed directly in this space. A disadvantage of CFA is that it ignores the supervision information. In this paper, we improve CFA by incorporating the supervision information to represent and classify both image and text modals of documents. We project both image and text data to a shared data space by factor analysis, and then train a class label predictor in the shared space to use the class label information. The factor analysis parameter and the predictor parameter are learned jointly by solving one single objective function. With this objective function, we minimize the distance between the projections of the image and text of the same document, and the classification error of the projection measured by the hinge loss function. The objective function is optimized by an alternate optimization strategy in an iterative algorithm. Experiments on two different multiple modal document data sets show the advantage of the proposed algorithm over other CFA methods.

  18. Performance Evaluation of Frequency Transform Based Block Classification of Compound Image Segmentation Techniques

    Science.gov (United States)

    Selwyn, Ebenezer Juliet; Florinabel, D. Jemi

    2018-04-01

    Compound image segmentation plays a vital role in the compression of computer screen images. Computer screen images are images that mix textual, graphical, or pictorial contents. In this paper, we present a comparison of two transform-based block classification methods for compound images based on metrics such as speed of classification, precision and recall rate. Block-based classification approaches normally divide the compound images into fixed-size, non-overlapping blocks. Then a frequency transform such as the Discrete Cosine Transform (DCT) or the Discrete Wavelet Transform (DWT) is applied to each block. Mean and standard deviation are computed for each 8 × 8 block and are used as the feature set to classify the compound images into text/graphics and picture/background blocks. The classification accuracy of block-classification-based segmentation techniques is measured by evaluation metrics such as precision and recall rate. Compound images with smooth backgrounds and complex backgrounds, containing text of varying size, colour and orientation, are considered for testing. Experimental evidence shows that DWT-based segmentation improves recall rate and precision rate by approximately 2.3% over DCT-based segmentation, with an increase in block classification time, for both smooth and complex background images.
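
    The per-block feature step described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the DCT-II is written out by hand and left unnormalised, the DC coefficient is dropped so that overall brightness does not dominate, and the two test blocks are hypothetical.

```python
import math
from statistics import mean, pstdev


def dct_1d(values):
    """Unnormalised DCT-II of a sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def dct_2d(block):
    """Separable 2-D DCT-II: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def block_features(block):
    """Mean and standard deviation of the AC coefficient magnitudes.

    Dropping the DC term at index [0][0] is an assumption of this sketch.
    """
    coeffs = dct_2d(block)
    ac = [
        abs(c)
        for r, row in enumerate(coeffs)
        for k, c in enumerate(row)
        if (r, k) != (0, 0)
    ]
    return mean(ac), pstdev(ac)


# Hypothetical 8 x 8 blocks: a flat background and sharp vertical stripes.
flat = [[128] * 8 for _ in range(8)]
striped = [[0, 255] * 4 for _ in range(8)]
print(block_features(flat))
print(block_features(striped))
```

    A flat block yields features close to zero, while a block with sharp edges (as text tends to produce) yields large values, which is the kind of separation a text/graphics versus picture/background classifier could exploit.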

  19. Automatically classifying sentences in full-text biomedical articles into Introduction, Methods, Results and Discussion.

    Science.gov (United States)

    Agarwal, Shashank; Yu, Hong

    2009-12-01

    Biomedical texts can be typically represented by four rhetorical categories: Introduction, Methods, Results and Discussion (IMRAD). Classifying sentences into these categories can benefit many other text-mining tasks. Although many studies have applied different approaches for automatically classifying sentences in MEDLINE abstracts into the IMRAD categories, few have explored the classification of sentences that appear in full-text biomedical articles. We first evaluated whether sentences in full-text biomedical articles could be reliably annotated into the IMRAD format and then explored different approaches for automatically classifying these sentences into the IMRAD categories. Our results show an overall annotation agreement of 82.14% with a Kappa score of 0.756. The best classification system is a multinomial naïve Bayes classifier trained on manually annotated data that achieved 91.95% accuracy and an average F-score of 91.55%, which is significantly higher than baseline systems. A web version of this system is available online at http://wood.ims.uwm.edu/full_text_classifier/.
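
    For readers unfamiliar with the classifier family named above, a minimal multinomial naïve Bayes sketch is given below. It is not the authors' system: the whitespace tokenisation, the Laplace smoothing constant, the class name and the four toy training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Bag-of-words multinomial naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed value)

    def fit(self, sentences, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(sentences, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, far smaller than any real annotated corpus.
train_sentences = [
    "we collected samples and measured expression",
    "samples were incubated and measured",
    "accuracy increased significantly in our results",
    "the results show accuracy increased",
]
train_labels = ["Methods", "Methods", "Results", "Results"]

model = MultinomialNB().fit(train_sentences, train_labels)
print(model.predict("samples were measured"))
```

    Working in log space avoids numerical underflow when many word probabilities are multiplied together.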

  20. An Online Multisensor Data Fusion Framework for Radar Emitter Classification

    Directory of Open Access Journals (Sweden)

    Dongqing Zhou

    2016-01-01

    Full Text Available Radar emitter classification is a special application of data clustering for classifying unknown radar emitters in airborne electronic support systems. In this paper, a novel online multisensor data fusion framework is proposed for radar emitter classification against the background of network-centric warfare. The framework is composed of local processing and multisensor fusion processing, from which the rough and precise classification results are obtained, respectively. Moreover, the proposed algorithm needs neither prior knowledge nor a training process; it can dynamically update the number of clusters and the cluster centers when new pulses arrive. Finally, the experimental results show that the proposed framework is an efficacious way to solve the radar emitter classification problem in networked warfare.

  1. On the introduction of secondary fingerprint classification

    CSIR Research Space (South Africa)

    Msiza, IS

    2011-07-01

    Full Text Available The concept of fingerprint classification is an important one because of the need to, before executing a database search procedure, virtually break the fingerprint template database into smaller, manageable partitions. This is done in order to avoid...

  2. Sentiment Classification of Documents in Serbian: The Effects of Morphological Normalization and Word Embeddings

    Directory of Open Access Journals (Sweden)

    V. Batanović

    2017-11-01

    Full Text Available An open issue in the sentiment classification of texts written in Serbian is the effect of different forms of morphological normalization and the usefulness of leveraging large amounts of unlabeled texts. In this paper, we assess the impact of lemmatizers and stemmers for Serbian on classifiers trained and evaluated on the Serbian Movie Review Dataset. We also consider the effectiveness of using word embeddings, generated from a large unlabeled corpus, as classification features.

  3. LDA boost classification: boosting by topics

    Science.gov (United States)

    Lei, La; Qiao, Guo; Qimin, Cao; Qitao, Li

    2012-12-01

    AdaBoost is an effective classification algorithm, especially in text categorization (TC) tasks. Its methodology of setting up a classifier committee and voting on the documents to be classified can achieve high categorization precision. However, the traditional Vector Space Model easily leads to the curse of dimensionality and feature sparsity problems, which seriously affect classification performance. This article proposes a novel classification algorithm called LDABoost, based on the boosting idea, which uses Latent Dirichlet Allocation (LDA) to model the feature space. Instead of using words or phrases, LDABoost uses latent topics as features. In this way, the feature dimension is significantly reduced. An improved Naïve Bayes (NB) is designed as the weak classifier; it keeps the efficiency advantage of the classic NB algorithm and has higher precision. Moreover, a two-stage iterative weighted method called Cute Integration is proposed to improve accuracy by integrating weak classifiers into a strong classifier in a more rational way. Mutual Information is used as the metric for weight allocation. The voting information and the categorization decisions made by the base classifiers are fully utilized in generating the strong classifier. Experimental results reveal that LDABoost, which performs categorization in a low-dimensional space, has higher accuracy than traditional AdaBoost algorithms and many other classic classification algorithms. Moreover, its runtime consumption is lower than that of different versions of AdaBoost and of TC algorithms based on support vector machines and neural networks.

  4. Effectiveness of Multivariate Time Series Classification Using Shapelets

    Directory of Open Access Journals (Sweden)

    A. P. Karpenko

    2015-01-01

    Full Text Available Typically, time series classifiers require signal pre-processing (filtering signals from noise, artifact removal, etc.), enhancement of signal features (amplitude, frequency, spectrum, etc.), and classification of the signal features in feature space using classical techniques and classification algorithms for multivariate data. We consider a method of classifying time series that does not require enhancement of the signal features. The method uses time series shapelets, i.e. small fragments of a series that are most representative of one of its classes. Despite the significant number of publications on the theory of shapelets and their application to time series classification, the task of evaluating the effectiveness of this technique remains relevant. The objective of this publication is to study the effectiveness of a number of modifications of the original shapelet method as applied to multivariate series classification, which is a little-studied problem. The paper presents the problem statement of multivariate time series classification using shapelets and describes the basic shapelet-based method of binary classification, as well as various generalizations and a proposed modification of the method. It also offers software that implements the modified method, along with results of computational experiments confirming the effectiveness of the algorithmic and software solutions. The paper shows that the modified method and its software implementation reach a classification accuracy of about 85% at best. The shapelet search time increases in proportion to the input data dimension.
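
    The basic idea of binary classification with a single shapelet can be sketched as below. This is a generic univariate illustration, not the modified multivariate method the abstract evaluates: the distance is taken as the minimum Euclidean distance over all sliding windows, and the shapelet, the threshold and the two example series are hypothetical.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any window of the series."""
    m = len(shapelet)
    if m > len(series):
        raise ValueError("shapelet must not be longer than the series")
    return min(
        math.sqrt(sum((series[i + j] - shapelet[j]) ** 2 for j in range(m)))
        for i in range(len(series) - m + 1)
    )


def contains_shapelet(series, shapelet, threshold):
    """Binary decision: True when some window lies within the distance threshold."""
    return shapelet_distance(series, shapelet) <= threshold


# Hypothetical data: a small peak as the shapelet, one series with it, one without.
peak = [1.0, 2.0, 1.0]
with_peak = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
without_peak = [0.0] * 7

print(shapelet_distance(with_peak, peak))
print(contains_shapelet(without_peak, peak, threshold=1.0))
```

    Finding good shapelets means scanning many candidate fragments with this distance, which is where most of the search time goes.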

  5. Joint Feature Selection and Classification for Multilabel Learning.

    Science.gov (United States)

    Huang, Jun; Li, Guorong; Huang, Qingming; Wu, Xindong

    2018-03-01

    Multilabel learning deals with examples having multiple class labels simultaneously. It has been applied to a variety of applications, such as text categorization and image annotation. A large number of algorithms have been proposed for multilabel learning, most of which concentrate on multilabel classification problems and only a few of them are feature selection algorithms. Current multilabel classification models are mainly built on a single data representation composed of all the features which are shared by all the class labels. Since each class label might be decided by some specific features of its own, and the problems of classification and feature selection are often addressed independently, in this paper, we propose a novel method which can perform joint feature selection and classification for multilabel learning, named JFSC. Different from many existing methods, JFSC learns both shared features and label-specific features by considering pairwise label correlations, and builds the multilabel classifier on the learned low-dimensional data representations simultaneously. A comparative study with state-of-the-art approaches manifests a competitive performance of our proposed method both in classification and feature selection for multilabel learning.

  6. Shared Features of L2 Writing: Intergroup Homogeneity and Text Classification

    Science.gov (United States)

    Crossley, Scott A.; McNamara, Danielle S.

    2011-01-01

    This study investigates intergroup homogeneity within high intermediate and advanced L2 writers of English from Czech, Finnish, German, and Spanish first language backgrounds. A variety of linguistic features related to lexical sophistication, syntactic complexity, and cohesion were used to compare texts written by L1 speakers of English to L2…

  7. A simplified immunohistochemical classification of skeletal muscle fibres in mouse

    Directory of Open Access Journals (Sweden)

    M. Kammoun

    2014-06-01

    Full Text Available The classification of muscle fibres is of particular interest for the study of skeletal muscle properties in a wide range of scientific fields, especially animal phenotyping. It is therefore important to define a reliable method for classifying fibre types. The aim of this study was to establish a simplified method for the immunohistochemical classification of fibres in mouse. To carry it out, we first tested a combination of several anti-myosin heavy chain (MyHC) antibodies in order to choose the minimum number of antibodies needed to implement a semi-automatic classification. Then, we compared the classification of fibres to the MyHC electrophoretic pattern on the same samples. Only two anti-MyHC antibodies on serial sections, together with fluorescent labeling of laminin, were necessary to properly classify fibre types in Tibialis Anterior and Soleus mouse muscles under normal physiological conditions. This classification was virtually identical to the classification obtained by electrophoretic separation of MyHC. This immunohistochemical classification can be applied to the total area of Tibialis Anterior and Soleus mouse muscles. Thus, we provide here a useful, simple and time-efficient method for immunohistochemical classification of fibres, applicable for research in mouse.

  8. PROGRESSIVE DENSIFICATION AND REGION GROWING METHODS FOR LIDAR DATA CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    J. L. Pérez-García

    2012-07-01

    Full Text Available At present, airborne laser scanner systems are one of the most frequently used methods for obtaining digital terrain elevation models. While they have the advantage of direct measurement on the object, the points of the resulting point cloud need to be classified according to whether they belong to the ground. This need to classify raw data has led to the appearance of multiple filters focused on classifying LiDAR information. Following this approach, this paper presents a classification method that combines LiDAR data segmentation techniques and progressive densification to locate the points belonging to the ground. The proposed methodology is tested on several datasets with different terrain characteristics and data availability. In all cases, we analyze the advantages and disadvantages obtained compared with applying the techniques individually and, in particular, the benefits derived from integrating both classification techniques. To provide more comprehensive quality control of the classification process, the results obtained have been compared with those derived from a manual procedure, which is used as the reference classification. The results are also compared with other automatic classification methodologies included in some commercial software packages that are well proven by users for LiDAR data treatment.

  9. ANALYSIS OF THE GUIDELINES FOR CLASSIFICATION OF ADVERTISING COSTS IN TAXATION

    Directory of Open Access Journals (Sweden)

    A. Diederichs

    2016-07-01

    Full Text Available Advertising plays a distinct role in economies around the world. Previous studies have not resolved the question related to the classification of advertising as an expense or capital asset. Understanding the principles set out in The Income Tax Act 58 of 1962 with regard to the classification of advertising cost as capital or revenue in nature is important, since the incorrect interpretation of principles will have a direct impact on tax liability. The focus of this study is the classification of advertising costs for tax purposes. Research questions posed in this paper are answered through the development of a classification process that may assist with the classification of advertising costs for the purpose of taxation. Guidelines for the classification of advertising costs as capital or revenue in nature are needed to correctly classify advertising costs for tax purposes. Furthermore, it is determined when advertising costs will be regarded as capital in nature. A qualitative research approach is applied, including a literature review of case law and income tax acts. The contribution of this study is found in the guidelines set for the classification of advertising costs for tax purposes by using principles from national and international case law.

  10. On the Feature Selection and Classification Based on Information Gain for Document Sentiment Analysis

    Directory of Open Access Journals (Sweden)

    Asriyanti Indah Pratiwi

    2018-01-01

    Full Text Available Sentiment analysis of movie reviews is a need of today's lifestyle. Unfortunately, an enormous number of features makes sentiment analysis slow and less sensitive. Finding the optimum feature selection and classification is still a challenge. In order to handle an enormous number of features and provide better sentiment classification, an information-based feature selection and classification scheme is proposed. The proposed method reduces unnecessary features by more than 90%, while the proposed classification scheme achieves 96% accuracy in sentiment classification. From the experimental results, it can be concluded that the combination of the proposed feature selection and classification achieves the best performance so far.
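
    As a rough illustration of information-gain-based feature selection (the criterion named in the record's title), the sketch below scores each term by how much knowing its presence reduces label entropy and keeps the top-scoring terms. It uses the textbook definition only; the function names, the set-of-tokens document representation and the four toy reviews are assumptions, and the authors' exact scheme may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """Entropy of the labels minus their entropy conditioned on term presence."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) + (
        len(without_term) / n
    ) * entropy(without_term)
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Keep the k terms with the highest information gain (ties broken by name)."""
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Hypothetical toy reviews, each represented as a set of tokens.
docs = [{"great", "movie"}, {"great", "plot"}, {"boring", "movie"}, {"boring", "plot"}]
labels = ["pos", "pos", "neg", "neg"]

print(select_features(docs, labels, k=2))
```

    In this toy data the sentiment words separate the classes perfectly while the topic words carry no information, so only the former survive selection.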

  11. Improving Classification of Protein Interaction Articles Using Context Similarity-Based Feature Selection.

    Science.gov (United States)

    Chen, Yifei; Sun, Yuxing; Han, Bing-Qing

    2015-01-01

    Protein interaction article classification is a text classification task in the biological domain to determine which articles describe protein-protein interactions. Since the feature space in text classification is high-dimensional, feature selection is widely used for reducing the dimensionality of features to speed up computation without sacrificing classification performance. Many existing feature selection methods are based on the statistical measure of document frequency and term frequency. One potential drawback of these methods is that they treat features separately. Hence, first we design a similarity measure between the context information to take word cooccurrences and phrase chunks around the features into account. Then we introduce the similarity of context information to the importance measure of the features to substitute the document and term frequency. Hence we propose new context similarity-based feature selection methods. Their performance is evaluated on two protein interaction article collections and compared against the frequency-based methods. The experimental results reveal that the context similarity-based methods perform better in terms of the F1 measure and the dimension reduction rate. Benefiting from the context information surrounding the features, the proposed methods can select distinctive features effectively for protein interaction article classification.

  12. Improving settlement type classification of aerial images

    CSIR Research Space (South Africa)

    Mdakane, L

    2014-10-01

    Full Text Available , an automated method can be used to help identify human settlements in a fixed, repeatable and timely manner. The main contribution of this work is to improve generalisation on settlement type classification of aerial imagery. Images acquired at different dates...

  13. Classification of hydrocephalus: critical analysis of classification categories and advantages of "Multi-categorical Hydrocephalus Classification" (Mc HC).

    Science.gov (United States)

    Oi, Shizuo

    2011-10-01

    Hydrocephalus is a complex pathophysiology with disturbed cerebrospinal fluid (CSF) circulation. Numerous classification attempts have been published, focusing on various criteria such as associated anomalies/underlying lesions, CSF circulation/intracranial pressure patterns, clinical features, and other categories. However, no definitive classification exists that comprehensively covers the variety of these aspects. The new classification of hydrocephalus, "Multi-categorical Hydrocephalus Classification" (Mc HC), was invented and developed to cover all aspects of hydrocephalus with all considerable classification items and categories. The ten "Mc HC" categories are I: onset (age, phase), II: cause, III: underlying lesion, IV: symptomatology, V: pathophysiology 1-CSF circulation, VI: pathophysiology 2-ICP dynamics, VII: chronology, VIII: post-shunt, IX: post-endoscopic third ventriculostomy, and X: others. From a 100-year search of publications related to the classification of hydrocephalus, 14 representative publications were reviewed and divided into the 10 categories. The Baumkuchen classification graph made from the round o'clock classification demonstrated the historical tendency of deviation to the categories in pathophysiology, either CSF or ICP dynamics. In the preliminary clinical application, it was concluded that "Mc HC" is extremely effective in expressing the individual state with various categories in the past and present condition or among the compatible cases of hydrocephalus, along with the possible chronological change in the future.

  14. A novel Neuro-fuzzy classification technique for data mining

    Directory of Open Access Journals (Sweden)

    Soumadip Ghosh

    2014-11-01

    Full Text Available In our study, we proposed a novel Neuro-fuzzy classification technique for data mining. The inputs to the Neuro-fuzzy classification system were fuzzified by applying a generalized bell-shaped membership function. The proposed method utilized a fuzzification matrix in which the input patterns were associated with a degree of membership to different classes. Based on the value of the degree of membership, a pattern would be attributed to a specific category or class. We applied our method to ten benchmark data sets from the UCI machine learning repository for classification. Our objective was to analyze the proposed method and then compare its performance with two powerful supervised classification algorithms, Radial Basis Function Neural Network (RBFNN) and Adaptive Neuro-fuzzy Inference System (ANFIS). We assessed the performance of these classification methods in terms of different performance measures such as accuracy, root-mean-square error, kappa statistic, true positive rate, false positive rate, precision, recall, and f-measure. In every aspect the proposed method proved to be superior to the RBFNN and ANFIS algorithms.
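
    The fuzzification step can be illustrated with the commonly used form of the generalized bell-shaped membership function, 1 / (1 + |(x - c) / a|^(2b)). The sketch below is an assumption-laden toy, not the authors' system: the three class labels, their (a, b, c) parameters and the argmax assignment rule are hypothetical, and a real fuzzification matrix would cover every feature of every pattern.

```python
def gbell(x, a, b, c):
    """Generalized bell membership: 1 / (1 + |(x - c) / a| ** (2 * b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


# Hypothetical (a, b, c) parameters: width, slope and centre per class.
CLASS_PARAMS = {
    "low": (1.0, 2.0, 0.0),
    "medium": (1.0, 2.0, 2.5),
    "high": (1.0, 2.0, 5.0),
}


def fuzzify(x):
    """One row of a fuzzification matrix: membership degree of x in each class."""
    return {label: gbell(x, *params) for label, params in CLASS_PARAMS.items()}


def assign_class(x):
    """Attribute the input to the class with the highest membership degree."""
    memberships = fuzzify(x)
    return max(memberships, key=memberships.get)


print(fuzzify(4.8))
print(assign_class(4.8))
```

    The membership equals 1 at the centre c and 0.5 at a distance a from it, while b controls how steeply it falls off in between.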

  15. Use of UAV-Borne Spectrometer for Land Cover Classification

    Directory of Open Access Journals (Sweden)

    Sowmya Natesan

    2018-04-01

    Full Text Available Unmanned aerial vehicles (UAV) are being used for low altitude remote sensing for thematic land classification using visible light and multi-spectral sensors. The objective of this work was to investigate the use of a UAV equipped with a compact spectrometer for land cover classification. The UAV platform used was a DJI Flamewheel F550 hexacopter equipped with GPS and Inertial Measurement Unit (IMU) navigation sensors, and a Raspberry Pi processor and camera module. The spectrometer used was the FLAME-NIR, a near-infrared spectrometer for hyperspectral measurements. RGB images and spectrometer data were captured simultaneously. As spectrometer data do not provide continuous terrain coverage, the locations of their ground elliptical footprints were determined from the bundle adjustment solution of the captured images. For each of the spectrometer ground ellipses, the land cover signature at the footprint location was determined to enable the characterization, identification, and classification of land cover elements. To attain a continuous land cover classification map, spatial interpolation was carried out from the irregularly distributed labeled spectrometer points. The accuracy of the classification was assessed using spatial intersection with the object-based image classification performed using the RGB images. Results show that in homogeneous land cover, like water, the accuracy of classification is 78%, and in mixed classes, like grass, trees and manmade features, the average accuracy is 50%, thus indicating the contribution of hyperspectral measurements of low altitude UAV-borne spectrometers to improving land cover classification.

  16. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  17. Bookseller’s Classification: Classification Examples and Criteria of Croatian Booksellers in Sales Catalogs and Book Lists from the Beginning of the 20th Century

    Directory of Open Access Journals (Sweden)

    Nada Topić

    2012-12-01

    Full Text Available The aim of the paper is to investigate the (sales) classification practices of Croatian bookstores at the beginning of the 20th century. Through content analysis of 17 sales lists/catalogs of books from Dubrovnik, Split, Zadar, Karlovac, Zagreb and Osijek, the classification structure has been reconstructed, and the criteria according to which booksellers' offerings were classified in the early 20th century have been determined. The analysis established the following criteria of bookstore classification: topic/content, form/type of work, type of corpus, genre, language, purpose, publishing series, publisher, time of publication, (new) edition, time of publication/purchase, customer's specific interests, number, letter and author. The order of enumeration within specific categories is mostly alphabetic, numeric or according to order of publication. Unlike library classification and classification systems in general, the problem of bookstore classification is rarely addressed in existing sources. Research studies that focus on the history of bookselling, even if they reveal how booksellers' offers were classified, remain on a descriptive level without any deeper analysis of the criteria or possible reasons for such classification. Therefore, the contribution of the paper is a detailed analysis of a larger sample of bookstore sales catalogs, and also an attempt to illuminate the criteria and reasons for creating a system of bookstore classification in the defined historical, spatial and temporal context.

  18. A Novel Texture Classification Procedure by using Association Rules

    Directory of Open Access Journals (Sweden)

    L. Jaba Sheela

    2008-11-01

    Full Text Available Texture can be defined as a local statistical pattern of texture primitives in the observer’s domain of interest. Texture classification aims to assign texture labels to unknown textures, according to training samples and classification rules. Association rules have been used in various applications during the past decades. Association rules capture both structural and statistical information, and automatically identify the structures that occur most frequently and the relationships that have significant discriminative power. Association rules can therefore be adapted to capture frequently occurring local structures in textures. This paper describes the use of association rules for the texture classification problem. The experimental studies performed show the effectiveness of the association rules. The overall success rate is about 98%.

  19. The Performance of EEG-P300 Classification using Backpropagation Neural Networks

    Directory of Open Access Journals (Sweden)

    Arjon Turnip

    2013-12-01

    Full Text Available Electroencephalogram (EEG) recordings provide an important means of brain-computer communication, but the accuracy of their classification is very limited by unforeseeable signal variations relating to artifacts. In this paper, we propose a classification method for time-series EEG-P300 signals using backpropagation neural networks to predict the qualitative properties of a subject’s mental tasks by extracting useful information from the highly multivariate non-invasive recordings of brain activity. To test the improvement in the EEG-P300 classification performance (i.e., classification accuracy and transfer rate) with the proposed method, comparative experiments were conducted using Bayesian Linear Discriminant Analysis (BLDA). Finally, the result of the experiment showed that the average classification accuracy was 97% and the maximum improvement of the average transfer rate was 42.4%, indicating the considerable potential of using EEG-P300 for the continuous classification of mental tasks.

  20. Tweet-based Target Market Classification Using Ensemble Method

    Directory of Open Access Journals (Sweden)

    Muhammad Adi Khairul Anshary

    2016-09-01

    Full Text Available Target market classification is aimed at focusing marketing activities on the right targets. Classification of target markets can be done through data mining and by utilizing data from social media, e.g. Twitter. The end results of data mining are learning models that can classify new data. Ensemble methods can improve the accuracy of the models and therefore provide better results. In this study, classification of target markets was conducted on a dataset of 3000 tweets in order to extract features. Classification models were constructed to manipulate the training data using two ensemble methods (bagging and boosting). To investigate the effectiveness of the ensemble methods, this study used the CART (classification and regression tree) algorithm for comparison. Three categories of consumer goods (computers, mobile phones and cameras) and three categories of sentiments (positive, negative and neutral) were classified towards three target-market categories. Machine learning was performed using Weka 3.6.9. The results on the test data showed that the bagging method improved the accuracy of CART by 1.9% (to 85.20%). On the other hand, for sentiment classification, the ensemble methods were not successful in increasing the accuracy of CART. The results of this study may be taken into consideration by companies that approach their customers through social media, especially Twitter.
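
    Bagging, one of the two ensemble methods named in this abstract, trains each base model on a bootstrap resample of the training data and combines the models by majority vote. The sketch below is a minimal illustration under stated assumptions, not the study's Weka setup: it swaps the full CART learner for a depth-1 threshold rule (a decision stump), and the two tweet features and the toy data are hypothetical.

```python
import random
from collections import Counter


def train_stump(data):
    """Fit a one-feature threshold rule (a depth-1 tree) by minimising training errors."""
    best = None
    for f in range(len(data[0][0])):
        for threshold in sorted({x[f] for x, _ in data}):
            left = [y for x, y in data if x[f] <= threshold]
            right = [y for x, y in data if x[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # degenerate resample: fall back to the majority label
        label = Counter(y for _, y in data).most_common(1)[0][0]
        return lambda x: label
    _, f, threshold, left_label, right_label = best
    return lambda x: left_label if x[f] <= threshold else right_label


def bagging(data, n_models=25, seed=0):
    """Train n_models stumps on bootstrap resamples and predict by majority vote."""
    rng = random.Random(seed)
    models = [train_stump([rng.choice(data) for _ in data]) for _ in range(n_models)]

    def predict(x):
        return Counter(model(x) for model in models).most_common(1)[0][0]

    return predict


# Hypothetical toy features per tweet: (positive-word count, negative-word count).
train = [((3, 0), "positive"), ((2, 1), "positive"), ((4, 1), "positive"),
         ((0, 3), "negative"), ((1, 2), "negative"), ((1, 4), "negative")]
classify = bagging(train)
print(classify((5, 0)), classify((0, 5)))
```

    An odd number of models is used so that a two-class vote cannot tie.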

  1. The effects of shadow removal on across-date settlement type classification of quickbird images

    CSIR Research Space (South Africa)

    Luus, FPS

    2012-07-01

    Full Text Available QuickBird imagery acquired on separate dates may have significant differences in viewing- and illumination geometries, which can negatively impact across-date settlement type classification accuracy. The effect of cast shadows on classification...

  2. Approaches to Substance of Social Infrastructure and to Its Classification

    Directory of Open Access Journals (Sweden)

    Kyrychenko Sergiy O.

    2016-03-01

    Full Text Available The article studies and analyzes approaches to both the substance and the classification of social infrastructure objects as a specific constellation of subsystems and components. To this end, the following tasks have been formulated: analysis of existing methods for determining the classification of social infrastructure; classification of the branches of social infrastructure using a functional-dedicated approach; formulation of the author's own definition of the substance of social infrastructure. It has been determined that, to date, social infrastructure is most often classified according to its functional tasks, although there are other approaches to classification. The author's definition of the substance of social infrastructure has been formulated as follows: social infrastructure is a body of economy branches (public utilities, management, public safety and environment, socio-economic services), the purpose of which is to influence the reproductive potential and overall conditions of human activity in the spheres of work, everyday living, family, social-political, spiritual and intellectual development as well as life activity.

  3. Classification and Segmentation of Satellite Orthoimagery Using Convolutional Neural Networks

    Directory of Open Access Journals (Sweden)

    Martin Längkvist

    2016-04-01

    Full Text Available The availability of high-resolution remote sensing (HRRS) data has opened up the possibility for interesting new applications, such as per-pixel classification of individual objects in greater detail. This paper shows how a convolutional neural network (CNN) can be applied to multispectral orthoimagery and a digital surface model (DSM) of a small city for a full, fast and accurate per-pixel classification. The predicted low-level pixel classes are then used to improve the high-level segmentation. Various design choices of the CNN architecture are evaluated and analyzed. The investigated land area is fully manually labeled into five categories (vegetation, ground, roads, buildings and water), and the classification accuracy is compared to other per-pixel classification works on other land areas that have a similar choice of categories. The results of the full classification and segmentation on selected segments of the map show that CNNs are a viable tool for solving both the segmentation and object recognition task for remote sensing data.

  4. Meta-language for land use classification systems

    CSIR Research Space (South Africa)

    Cooper, Antony K

    2014-04-01

    Full Text Available This presentation provides an overview of a meta-language for land use classification. It also explains why land use can’t always be determined from imagery, and why land use is not the same as land cover, zoning or planning - though...

  5. General regression and representation model for classification.

    Directory of Open Access Journals (Sweden)

    Jianjun Qian

    Full Text Available Recently, the regularized coding-based classification methods (e.g. SRC and CRC) have shown great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR) for classification. GRR not only has the advantages of CRC, but also makes full use of the prior information (e.g. the correlations between representation residuals and representation coefficients) and the specific information (weight matrix of image pixels) to enhance the classification performance. GRR uses the generalized Tikhonov regularization and K Nearest Neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel) weights of the test sample. With the proposed model as a platform, we design two classifiers: basic general regression and representation classifier (B-GRR) and robust general regression and representation classifier (R-GRR). The experimental results demonstrate the performance advantages of the proposed methods over state-of-the-art algorithms.
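
    For orientation, generalized Tikhonov regularization in its textbook form is written out below. The notation is assumed here and is not taken from the paper: A is the matrix of training samples, y the test sample, x the representation coefficients, x_0 a prior mean for x, and P and Q symmetric positive definite weighting matrices for the residual and the coefficients. How GRR actually learns these weights is not stated in the abstract.

```latex
\hat{x} \;=\; \arg\min_{x}\; (Ax - y)^{\top} P\,(Ax - y) \;+\; (x - x_0)^{\top} Q\,(x - x_0)
\quad\Longrightarrow\quad
\hat{x} \;=\; x_0 + \left(A^{\top} P A + Q\right)^{-1} A^{\top} P\,(y - A x_0)
```

    Setting P = I, Q = \lambda I and x_0 = 0 recovers ordinary ridge (standard Tikhonov) regression, which is the special case in which residuals are treated as uncorrelated.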

  6. REAL-TIME INTELLIGENT MULTILAYER ATTACK CLASSIFICATION SYSTEM

    Directory of Open Access Journals (Sweden)

    T. Subbhulakshmi

    2014-01-01

    Full Text Available Intrusion Detection Systems (IDS) take the lion’s share of the current security infrastructure. Detection of intrusions is vital for initiating defensive procedures. Intrusion detection has traditionally been done by statistical and distance-based methods. A threshold value is used in these methods to indicate the level of normalcy; when the network traffic crosses that level, it is flagged as anomalous. When new intrusion events occur, which are increasingly a key part of system security, the statistical techniques cannot detect them. To overcome this issue, learning techniques are used, which help in identifying new intrusion activities in a computer system. The objective of the system proposed in this paper is to classify intrusions using an Intelligent Multi Layered Attack Classification System (IMLACS), which helps in detecting and classifying intrusions with improved classification accuracy. The intelligent multi-layered approach contains three intelligent layers. The first layer involves binary Support Vector Machine classification for distinguishing normal traffic from attacks. The second layer involves neural network classification to classify the attacks into classes of attacks. The third layer involves a fuzzy inference system to classify the attacks into various subclasses. The proposed IMLACS is able to detect intrusive behavior in networks because the system contains a three-layer intelligent classification and a better set of rules. Feature selection is also used to improve the time of detection. The experimental results show that IMLACS achieves a classification rate of 97.31%.

  7. Land-Use and Land-Cover Mapping Using a Gradable Classification Method

    Directory of Open Access Journals (Sweden)

    Keigo Kitada

    2012-05-01

    Full Text Available Conventional spectral-based classification methods have significant limitations in the digital classification of urban land-use and land-cover classes from high-resolution remotely sensed data because of the lack of consideration given to the spatial properties of images. To recognize the complex distribution of urban features in high-resolution image data, texture information consisting of a group of pixels should be considered. Lacunarity is an index used to characterize different texture appearances. It is often reported that the land-use and land-cover in urban areas can be effectively classified using the lacunarity index with high-resolution images. However, the applicability of the maximum-likelihood approach for hybrid analysis has not been reported. A more effective approach that employs the original spectral data and lacunarity index can be expected to improve the accuracy of the classification. A new classification procedure referred to as the “gradable classification method” is proposed in this study. This method improves the classification accuracy in incremental steps. The proposed classification approach integrates several classification maps created from original images and lacunarity maps, which consist of lacunarity values, to create a new classification map. The results of this study confirm the suitability of the gradable classification approach, which produced a higher overall accuracy (68%) and kappa coefficient (0.64) than those (65% and 0.60, respectively) obtained with the maximum-likelihood approach.
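
    The lacunarity index referred to in this abstract is commonly estimated with the gliding-box method: slide a box over the image, record the "mass" (number of occupied pixels) in each position, and divide the second moment of the masses by the squared mean. The sketch below assumes a binary image and a square box; the abstract does not say which lacunarity estimator or box sizes the study used, and the two test images are hypothetical.

```python
def lacunarity(image, box):
    """Gliding-box lacunarity of a 2-D binary image for a square box of side `box`."""
    rows, cols = len(image), len(image[0])
    masses = []
    for r in range(rows - box + 1):
        for c in range(cols - box + 1):
            # Mass = number of occupied pixels inside the box at this position.
            masses.append(sum(image[r + i][c + j] for i in range(box) for j in range(box)))
    mean = sum(masses) / len(masses)
    if mean == 0:
        raise ValueError("image contains no occupied pixels")
    second_moment = sum(m * m for m in masses) / len(masses)
    return second_moment / (mean * mean)


# Two hypothetical 4x4 textures with the same density (8 occupied pixels each).
checkerboard = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1]]
clumped = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
print(lacunarity(checkerboard, 2), lacunarity(clumped, 2))
```

    A value of 1 means every box holds the same mass (a homogeneous texture); larger values indicate gaps and clumping, which is why the index can separate textures that have identical pixel density.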

  8. The classification of fatty acids of lipids from seeds of Persea ...

    African Journals Online (AJOL)

    Gas liquid chromatographic analyses of Persea gratissima and ... as oil seeds and the fatty-acids of seed lipids could be potential sources of industrial oil. Keywords: Classification, fatty acids, GLC and Lipids.

  9. The Reliability of Classifications of Proximal Femoral Fractures with 3-Dimensional Computed Tomography: The New Concept of Comprehensive Classification

    Directory of Open Access Journals (Sweden)

    Hiroaki Kijima

    2014-01-01

    Full Text Available The reliability of proximal femoral fracture classifications using 3DCT was evaluated, and a comprehensive “area classification” was developed. Eleven orthopedists (5–26 years from graduation) classified 27 proximal femoral fractures at one hospital from June 2013 to July 2014 based on preoperative images. Various classifications were compared to “area classification.” In “area classification,” the proximal femur is divided into 4 areas with 3 boundary lines: Line-1 is the center of the neck, Line-2 is the border between the neck and the trochanteric zone, and Line-3 links the inferior borders of the greater and lesser trochanters. A fracture only in the first area was classified as a pure first area fracture; one in the first and second area was classified as a 1-2 type fracture. In the same way, fractures were classified as pure 2, 3-4, 1-2-3, and so on. “Area classification” reliability was highest when orthopedists with varying experience classified proximal femoral fractures using 3DCT. Other classifications cannot classify proximal femoral fractures if they exceed each classification’s particular zones. However, fractures that exceed the target zones are “dangerous” fractures. “Area classification” can classify such fractures, and it is therefore useful for selecting osteosynthesis methods.

  10. Binary Classification Method of Social Network Users

    Directory of Open Access Journals (Sweden)

    I. A. Poryadin

    2017-01-01

    Full Text Available The subject of research is a method for binary classification of social network users based on analysis of the data they have posted. The relevance of gaining information about a person by examining the content of his/her pages in social networks is illustrated with examples. The most common approach to this task is visual browsing. An order of the regional authority in our country illustrates that its use in school education is needed. The article shows the limitations of visual browsing of pupils’ pages in social networks as a tool for the teacher and the school psychologist, and justifies automating the analysis of social network users’ data. It reviews publications that describe methods for acquiring, processing, and analyzing such data, and considers their advantages and disadvantages. The article also gives arguments to support a proposal to study the classification method of social network users. One such method is credit scoring, which is used in banks and credit institutions to assess the solvency of clients. Given the high efficiency of the method, a significant expansion of its use to other areas of society is proposed. The possibility of using logistic regression as the mathematical apparatus of the proposed method of binary classification has been justified. Such an approach makes it possible to take into account the different types of data extracted from social networks, among them personal user data, information about hobbies, friends, graphic and text information, and behaviour characteristics. The article describes a number of existing methods of data transformation that can be applied to solve the problem. An experiment on binary gender-based classification of social network users is described. The logistic model obtained for this example includes multiple logical variables obtained by transforming the user surnames. This experiment confirms the feasibility of the proposed method. Further work is to define a system
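
    A logistic model over logical (0/1) variables of the kind this abstract describes can be sketched in a few lines. Everything specific below is an assumption made for illustration: the two surname-derived indicator features, the toy labels, and the use of plain batch gradient descent are hypothetical and not taken from the article.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 if the modelled probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Hypothetical logical variables derived from a surname:
# [surname ends in "-a", surname ends in "-ov"]; label 1 = female, 0 = male.
X = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 0], [0, 1]]
y = [1, 1, 0, 0, 1, 0]
w, b = train_logistic(X, y)
print(predict(w, b, [1, 0]), predict(w, b, [0, 1]))
```

    Because the output is a probability, the same model can be thresholded at values other than 0.5, which is how credit-scoring style applications usually trade false positives against false negatives.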

  11. Recognizing Cursive Typewritten Text Using Segmentation-Free System

    Directory of Open Access Journals (Sweden)

    Mohammad S. Khorsheed

    2015-01-01

    Full Text Available Feature extraction plays an important role in text recognition as it aims to capture essential characteristics of the text image. Feature extraction algorithms range widely between features that are robust but hard to extract and features that are noise-sensitive but easy to extract. Among those feature types are statistical features, which are derived from the statistical distribution of the image pixels. This paper presents a novel method for feature extraction where simple statistical features are extracted from a one-pixel wide window that slides across the text line. The feature set is clustered in the feature space using vector quantization. The feature vector sequence is then fed to a classification engine for training and recognition purposes. The recognition system is applied to a data corpus which includes cursive Arabic text of more than 600 A4-size sheets typewritten in multiple computer-generated fonts. The system performance is compared to a previously published system from the literature with a similar engine but a different feature set.

  12. Hand eczema classification

    DEFF Research Database (Denmark)

    Diepgen, T L; Andersen, Klaus Ejner; Brandao, F M

    2008-01-01

    of the disease is rarely evidence based, and a classification system for different subdiagnoses of hand eczema is not agreed upon. Randomized controlled trials investigating the treatment of hand eczema are called for. For this, as well as for clinical purposes, a generally accepted classification system...... A classification system for hand eczema is proposed. Conclusions It is suggested that this classification be used in clinical work and in clinical trials....

  13. Lie Group Classification of a Generalized Lane-Emden Type System in Two Dimensions

    Directory of Open Access Journals (Sweden)

    Motlatsi Molati

    2012-01-01

    Full Text Available The aim of this work is to perform a complete Lie symmetry classification of a generalized Lane-Emden type system in two dimensions which models many physical phenomena in biological and physical sciences. The classical approach of group classification is employed for classification. We show that several cases arise in classifying the arbitrary parameters, the forms of which include amongst others the power law nonlinearity, and exponential and quadratic forms.

  14. Classification of the web

    DEFF Research Database (Denmark)

    Mai, Jens Erik

    2004-01-01

    This paper discusses the challenges faced by investigations into the classification of the Web and outlines inquiries that are needed to use principles for bibliographic classification to construct classifications of the Web. This paper suggests that the classification of the Web meets challenges...... that call for inquiries into the theoretical foundation of bibliographic classification theory....

  15. Security classification of information

    Energy Technology Data Exchange (ETDEWEB)

    Quist, A.S.

    1993-04-01

    This document is the second of a planned four-volume work that comprehensively discusses the security classification of information. The main focus of Volume 2 is on the principles for classification of information. Included herein are descriptions of the two major types of information that governments classify for national security reasons (subjective and objective information), guidance to use when determining whether information under consideration for classification is controlled by the government (a necessary requirement for classification to be effective), information disclosure risks and benefits (the benefits and costs of classification), standards to use when balancing information disclosure risks and benefits, guidance for assigning classification levels (Top Secret, Secret, or Confidential) to classified information, guidance for determining how long information should be classified (classification duration), classification of associations of information, classification of compilations of information, and principles for declassifying and downgrading information. Rules or principles of certain areas of our legal system (e.g., trade secret law) are sometimes mentioned to provide added support to some of those classification principles.

  16. Physio-climatic classification of South Africa's woodland biome

    CSIR Research Space (South Africa)

    Fairbanks, DHK

    2000-07-01

    Full Text Available monthly temperature, total plant-available water balance of soil, elevation, landscape topographic position, and landscape soil fertility were used as input classification variables. The map data were submitted to a factor analysis and varimax axis...

  17. Text Mining in Biomedical Domain with Emphasis on Document Clustering.

    Science.gov (United States)

    Renganathan, Vinaitheerthan

    2017-07-01

    With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.

  18. Odor Classification using Agent Technology

    Directory of Open Access Journals (Sweden)

    Sigeru OMATU

    2014-03-01

    Full Text Available In order to measure and classify odors, a Quartz Crystal Microbalance (QCM) can be used. In the present study, seven QCM sensors and three different odors are used. The system has been developed as a virtual organization of agents using an agent platform called PANGEA (Platform for Automatic coNstruction of orGanizations of intElligent Agents). This is a platform for developing open multi-agent systems, specifically those including organizational aspects. The main reason for the use of agents is the scalability of the platform, i.e. the way in which it models the services. The system models functionalities as services inside the agents, or as Service Oriented Approach (SOA) architecture compliant services using Web Services. This allows the odor classification system to be adapted with new algorithms, tools and classification techniques.

  19. Android Malware Classification Using K-Means Clustering Algorithm

    Science.gov (United States)

    Hamid, Isredza Rahmi A.; Syafiqah Khalid, Nur; Azma Abdullah, Nurul; Rahman, Nurul Hidayah Ab; Chai Wen, Chuah

    2017-08-01

    Malware is designed to gain access to or damage a computer system without the user's notice. Moreover, attackers exploit malware to commit crime or fraud. This paper proposes an Android malware classification approach based on the K-Means clustering algorithm. We evaluate the proposed model in terms of accuracy using machine learning algorithms. Two datasets, Virus Total and Malgenome, were selected to demonstrate the application of the K-Means clustering algorithm. We classify the Android malware into three clusters: ransomware, scareware and goodware. Nine features were considered for each dataset: Lock Detected, Text Detected, Text Score, Encryption Detected, Threat, Porn, Law, Copyright and Moneypak. We used IBM SPSS Statistics software for data classification and WEKA tools to evaluate the built clusters. The proposed K-Means clustering algorithm shows promising results with high accuracy when tested using the Random Forest algorithm.
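
    K-Means with three clusters, as used in this abstract, alternates between assigning each sample to its nearest centroid and recomputing each centroid as the mean of its members. The sketch below is a minimal pure-Python illustration, not the SPSS/WEKA workflow of the paper. The deterministic farthest-point seeding and the two-dimensional toy feature vectors are assumptions made only so the example is reproducible.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_centroids(points, k):
    """Assumed seeding: start at points[0], then repeatedly add the point
    farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, max_iter=100):
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every point.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical two-feature samples forming three well-separated groups.
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (0, 10), (1, 10), (0, 11)]
labels, centroids = kmeans(samples, k=3)
print(labels)
```

    Cluster indices carry no meaning by themselves; mapping them to names such as ransomware, scareware and goodware requires inspecting the members of each cluster afterwards.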

  20. Combining automatic table classification and relationship extraction in extracting anticancer drug-side effect pairs from full-text articles.

    Science.gov (United States)

    Xu, Rong; Wang, QuanQiu

    2015-02-01

    Anticancer drug-associated side effect knowledge often exists in multiple heterogeneous and complementary data sources. A comprehensive anticancer drug-side effect (drug-SE) relationship knowledge base is important for computation-based drug target discovery, drug toxicity prediction and drug repositioning. In this study, we present a two-step approach by combining table classification and relationship extraction to extract drug-SE pairs from a large number of high-profile oncological full-text articles. The data consists of 31,255 tables downloaded from the Journal of Oncology (JCO). We first trained a statistical classifier to classify tables into SE-related and -unrelated categories. We then extracted drug-SE pairs from SE-related tables. We compared drug side effect knowledge extracted from JCO tables to that derived from FDA drug labels. Finally, we systematically analyzed relationships between anti-cancer drug-associated side effects and drug-associated gene targets, metabolism genes, and disease indications. The statistical table classifier is effective in classifying tables into SE-related and -unrelated (precision: 0.711; recall: 0.941; F1: 0.810). We extracted a total of 26,918 drug-SE pairs from SE-related tables with a precision of 0.605, a recall of 0.460, and an F1 of 0.520. Drug-SE pairs extracted from JCO tables are largely complementary to those derived from FDA drug labels; as many as 84.7% of the pairs extracted from JCO tables have not been included in a side effect database constructed from FDA drug labels. Side effects associated with anticancer drugs positively correlate with drug target genes, drug metabolism genes, and disease indications. Copyright © 2014 Elsevier Inc. All rights reserved.
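
    The F1 values quoted in this abstract are the harmonic mean of precision and recall. As a quick check of the table-classifier figures, a minimal sketch (the function name is hypothetical):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Table classifier figures reported in the abstract.
table_f1 = f1_score(0.711, 0.941)
print(round(table_f1, 3))  # 0.81
```

    The harmonic mean is pulled toward the smaller of the two inputs, so a classifier cannot reach a high F1 by maximising recall alone.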

  1. Global Optimization Ensemble Model for Classification Methods

    Directory of Open Access Journals (Sweden)

    Hina Anwar

    2014-01-01

    Full Text Available Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of a classifier when solving a supervised learning problem, such as the bias-variance tradeoff, dimensionality of the input space, and noise in the input data space. All these problems affect the accuracy of a classifier and are the reason that there is no globally optimal method for classification. There is no generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC) that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity.

  2. Hazard classification methodology

    International Nuclear Information System (INIS)

    Brereton, S.J.

    1996-01-01

    This document outlines the hazard classification methodology used to determine the hazard classification of the NIF LTAB, OAB, and the support facilities on the basis of radionuclides and chemicals. The hazard classification determines the safety analysis requirements for a facility.

  3. A Data Mining Classification Approach for Behavioral Malware Detection

    Directory of Open Access Journals (Sweden)

    Monire Norouzi

    2016-01-01

    Full Text Available Data mining techniques have numerous applications in malware detection. Classification method is one of the most popular data mining techniques. In this paper we present a data mining classification approach to detect malware behavior. We proposed different classification methods in order to detect malware based on the feature and behavior of each malware. A dynamic analysis method has been presented for identifying the malware features. A suggested program has been presented for converting a malware behavior executive history XML file to a suitable WEKA tool input. To illustrate the performance efficiency as well as training data and test, we apply the proposed approaches to a real case study data set using WEKA tool. The evaluation results demonstrated the availability of the proposed data mining approach. Also our proposed data mining approach is more efficient for detecting malware and behavioral classification of malware can be useful to detect malware in a behavioral antivirus.

  4. A Classification Framework Applied to Cancer Gene Expression Profiles

    Directory of Open Access Journals (Sweden)

    Hussein Hijazi

    2013-01-01

    Full Text Available Classification of cancer based on gene expression has provided insight into possible treatment strategies. Thus, developing machine learning methods that can successfully distinguish among cancer subtypes or normal versus cancer samples is important. This work discusses supervised learning techniques that have been employed to classify cancers. Furthermore, a two-step feature selection method based on an attribute estimation method (e.g., ReliefF) and a genetic algorithm was employed to find a set of genes that can best differentiate between cancer subtypes or normal versus cancer samples. The application of different classification methods (e.g., decision tree, k-nearest neighbor, support vector machine (SVM), bagging, and random forest) on 5 cancer datasets shows that no classification method universally outperforms all the others. However, k-nearest neighbor and linear SVM generally improve the classification performance over other classifiers. Finally, incorporating diverse types of genomic data (e.g., protein-protein interaction data and gene expression) increases the prediction accuracy as compared to using gene expression alone.

  5. A Comparative Analysis of Classification Algorithms on Diverse Datasets

    Directory of Open Access Journals (Sweden)

    M. Alghobiri

    2018-04-01

    Full Text Available Data mining involves the computational process of finding patterns in large data sets. Classification, one of the main domains of data mining, involves generalizing known structure to apply to a new dataset and predict its class. Various classification algorithms are used to classify various data sets. They are based on different methods such as probability, decision tree, neural network, nearest neighbor, boolean and fuzzy logic, kernel-based etc. In this paper, we apply three diverse classification algorithms to ten datasets. The datasets have been selected based on their size and/or the number and nature of their attributes. Results are discussed using performance evaluation measures such as precision, accuracy, F-measure, Kappa statistics, mean absolute error, relative absolute error, ROC Area etc. Comparative analysis has been carried out using the performance evaluation measures of accuracy, precision, and F-measure. We specify features and limitations of the classification algorithms for datasets of diverse nature.
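
    The evaluation measures named in this record can be illustrated with a minimal sketch. It assumes a two-class problem summarized by confusion-matrix counts; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, F-measure and Cohen's kappa
    from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    # Agreement expected by chance, from the row and column marginals.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    p_e = p_yes + p_no
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "kappa": kappa,
    }


# Hypothetical counts: 40 true positives, 10 false positives,
# 5 false negatives, 45 true negatives.
metrics = binary_metrics(tp=40, fp=10, fn=5, tn=45)
```

    Kappa corrects accuracy for chance agreement, which is one reason it is often reported next to accuracy when class sizes differ across datasets.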

  6. Classification and regression trees

    CERN Document Server

    Breiman, Leo; Olshen, Richard A; Stone, Charles J

    1984-01-01

    The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.

  7. Classification of Clouds in Satellite Imagery Using Adaptive Fuzzy Sparse Representation

    Directory of Open Access Journals (Sweden)

    Wei Jin

    2016-12-01

    Full Text Available Automatic cloud detection and classification using satellite cloud imagery have various meteorological applications such as weather forecasting and climate monitoring. Cloud pattern analysis has recently become a research hotspot. Since satellites sense the clouds remotely from space, and different cloud types often overlap and convert into each other, there must be some fuzziness and uncertainty in satellite cloud imagery. Satellite observation is susceptible to noise, while traditional cloud classification methods are sensitive to noise and outliers; it is hard for traditional cloud classification methods to achieve reliable results. To deal with these problems, a satellite cloud classification method using adaptive fuzzy sparse representation-based classification (AFSRC) is proposed. Firstly, by defining adaptive parameters related to attenuation rate and critical membership, an improved fuzzy membership is introduced to accommodate the fuzziness and uncertainty of satellite cloud imagery; secondly, by effective combination of the improved fuzzy membership function and sparse representation-based classification (SRC), atoms in the training dictionary are optimized; finally, an adaptive fuzzy sparse representation classifier for cloud classification is proposed. Experimental results on FY-2G satellite cloud images show that the proposed method not only improves the accuracy of cloud classification, but also has strong stability and adaptability with high computational efficiency.

  8. WHO/ISUP classification of the urothelial tumors of the urinary bladder

    Directory of Open Access Journals (Sweden)

    Zdenka Ovčak

    2005-09-01

    Full Text Available Background: The authors present the current classification of urothelial neoplasms of the urinary bladder. The 1973 classification of urothelial tumors of the urinary bladder was, despite some imperfections, used relatively successfully for more than thirty years. The three-grade classification of papillary urothelial tumors without invasion has been based on evaluation of variations in the architecture of the covering epithelium and tumor cell anaplasia. As recommended by the International Society of Urological Pathologists (ISUP), the World Health Organisation (WHO) accepted the new WHO/ISUP classification in 1998, which was revised in 2002 and finally published in 2004. With the intention of avoiding unnecessary diagnosis of cancer in patients having papillary urothelial tumors with rare invasive or metastatic growth, this classification introduced a new entity, the papillary urothelial neoplasia of low malignant potential (PUNLMP). The additional change in classification was the division of invasive urothelial neoplasms into only low and high grade urothelial carcinomas. Conclusions: The authors’ opinion is that although the old classification is no longer recommended for use, the new one does not resolve the elementary objections to the previous classification, such as terminological unsuitability and insufficient scientific reasoning. Our proposed solution in the classification of papillary urothelial neoplasms would be the application of criteria analogous to those used in the diagnostics of papillary noninvasive tumors of the head and neck or alimentary tract.

  9. Systematic analysis of ocular trauma by a new proposed ocular trauma classification

    Directory of Open Access Journals (Sweden)

    Bhartendu Shukla

    2017-01-01

    Full Text Available Purpose: The current classification of ocular trauma does not incorporate adnexal trauma, injuries that are attributable to a nonmechanical cause and destructive globe injuries. This study proposes a new classification system of ocular trauma which is broader-based to allow for the classification of a wider range of ocular injuries not covered by the current classification. Methods: A clinic-based cross-sectional study to validate the proposed classification. We analyzed 535 cases of ocular injury from January 1, 2012 to February 28, 2012 over a 4-year period in an eye hospital in central India using our proposed classification system and compared it with conventional classification. Results: The new classification system allowed for classification of all 535 cases of ocular injury. The conventional classification was only able to classify 364 of the 535 trauma cases. Injuries involving the adnexa, nonmechanical injuries and destructive globe injuries could not be classified by the conventional classification, thus missing about 33% of cases. Conclusions: Our classification system shows an improvement over existing ocular trauma classification as it allows for the classification of all types of ocular injuries and will allow for better and more specific prognostication. This system has the potential to aid communication between physicians and result in better patient care. It can also provide a more authentic, wide spectrum of ocular injuries in correlation with etiology. By including adnexal injuries and nonmechanical injuries, we have been able to classify all 535 cases of trauma. Otherwise, about 30% of cases would have been excluded from the study.

  10. CLASSIFICATION OF LEARNING MANAGEMENT SYSTEMS

    Directory of Open Access Journals (Sweden)

    Yu. B. Popova

    2016-01-01

    Full Text Available The use of information technologies and, in particular, learning management systems increases the opportunities of teachers and students to reach their goals in education. Such systems provide learning content, help organize and monitor training, collect progress statistics and take into account the individual characteristics of each user. Currently, there is a huge inventory of both paid and free systems, physically located both on college servers and in the cloud, offering different feature sets, licensing schemes and costs. This creates the problem of choosing the best system. This problem is partly due to the lack of a comprehensive classification of such systems. Analysis of more than 30 of the most common automated learning management systems has shown that a classification of such systems should be carried out according to certain criteria, under which systems of the same type can be considered. The classification features offered by the author are: cost, functionality, modularity, compliance with the customer’s requirements, the integration of content, the physical location of a system, and adaptability of training. Considering learning management systems within these classifications and taking into account the current trends of their development, it is possible to identify the main requirements for them: functionality, reliability, ease of use, low cost, support for the SCORM standard or Tin Can API, modularity and adaptability. In accordance with these requirements, the Software Department of FITR BNTU has, under the guidance of the author, been developing, using and continuously improving its own learning management system since 2009.

  11. An Incremental Classification Algorithm for Mining Data with Feature Space Heterogeneity

    Directory of Open Access Journals (Sweden)

    Yu Wang

    2014-01-01

    Full Text Available Feature space heterogeneity often exists in many real world data sets so that some features are of different importance for classification over different subsets. Moreover, the pattern of feature space heterogeneity might dynamically change over time as more and more data are accumulated. In this paper, we develop an incremental classification algorithm, Supervised Clustering for Classification with Feature Space Heterogeneity (SCCFSH), to address this problem. In our approach, supervised clustering is implemented to obtain a number of clusters such that samples in each cluster are from the same class. After the removal of outliers, relevance of features in each cluster is calculated based on their variations in this cluster. The feature relevance is incorporated into distance calculation for classification. The main advantage of SCCFSH lies in the fact that it is capable of solving a classification problem with feature space heterogeneity in an incremental way, which is favorable for online classification tasks with continuously changing data. Experimental results on a series of data sets and application to a database marketing problem show the efficiency and effectiveness of the proposed approach.

  12. Overview of Four Functional Classification Systems Commonly Used in Cerebral Palsy

    Directory of Open Access Journals (Sweden)

    Andrea Paulson

    2017-04-01

    Full Text Available Cerebral palsy (CP) is the most common physical disability in childhood. CP comprises a heterogeneous group of disorders that can result in spasticity, dystonia, muscle contractures, weakness and coordination difficulty that ultimately affects the ability to control movements. Traditionally, CP has been classified using a combination of the motor type and the topographical distribution, as well as subjective severity level. Imprecise terms such as these tell very little about what a person is able to do functionally and can impair clear communication between providers. More recently, classification systems have been created employing a simple ordinal grading system of functional performance. These systems allow a more precise discussion between providers, as well as better subject stratification for research. The goal of this review is to describe four common functional classification systems for cerebral palsy: the Gross Motor Function Classification System (GMFCS), the Manual Ability Classification System (MACS), the Communication Function Classification System (CFCS), and the Eating and Drinking Ability Classification System (EDACS). These measures are all standardized, reliable, and complementary to one another.

  13. The Classification of Hysteria and Related Disorders: Historical and Phenomenological Considerations

    Directory of Open Access Journals (Sweden)

    Carol S. North

    2015-11-01

    Full Text Available This article examines the history of the conceptualization of dissociative, conversion, and somatoform syndromes in relation to one another, chronicles efforts to classify these and other phenomenologically-related psychopathology in the American diagnostic system for mental disorders, and traces the subsequent divergence in opinions of dissenting sectors on classification of these disorders. This article then considers the extensive phenomenological overlap across these disorders in empirical research, and from this foundation presents a new model for the conceptualization of these disorders. The classification of disorders formerly known as hysteria and phenomenologically-related syndromes has long been contentious and unsettled. Examination of the long history of the conceptual difficulties, which remain inherent in existing classification schemes for these disorders, can help to address the continuing controversy. This review clarifies the need for a major conceptual revision of the current classification of these disorders. A new phenomenologically-based classification scheme for these disorders is proposed that is more compatible with the agnostic and atheoretical approach to diagnosis of mental disorders used by the current classification system.

  14. Personal Relationship in Les Murray’s Poem

    Directory of Open Access Journals (Sweden)

    Henriono Nugroho

    2009-01-01

    Full Text Available Stylistics is a linguistic analysis on literary and non-literary texts. This article is concerned with a systemic stylistic analysis on a poem in terms of Systemic Functional Linguistics and Verbal Art Semiotics. It uses library research, qualitative data, documentary study, descriptive method and intrinsic-objective approach. The semantic analysis results in both automatized and foregrounded meanings. Then the automatized meaning produces lexical cohesion and in turn, it produces subject matter. Meanwhile, the foregrounded meaning produces the literary meaning and in turn, it creates theme. Finally, the analysis indicates that the subject matter is about daily works, the literary meaning is about the complete severance and the theme is about personal relationship.

  15. PASTEC: an automatic transposable element classification tool.

    Directory of Open Access Journals (Sweden)

    Claire Hoede

    Full Text Available SUMMARY: The classification of transposable elements (TEs) is a key step towards deciphering their potential impact on the genome. However, this process is often based on manual sequence inspection by TE experts. With the wealth of genomic sequences now available, this task requires automation, making it accessible to most scientists. We propose a new tool, PASTEC, which classifies TEs by searching for structural features and similarities. This tool outperforms currently available software for TE classification. The main innovation of PASTEC is the search for HMM profiles, which is useful for inferring the classification of unknown TEs on the basis of conserved functional domains of the proteins. In addition, PASTEC is the only tool providing an exhaustive spectrum of possible classifications to the order level of the Wicker hierarchical TE classification system. It can also automatically classify other repeated elements, such as SSR (Simple Sequence Repeats), rDNA or potential repeated host genes. Finally, the output of this new tool is designed to facilitate manual curation by providing biologists with all the evidence accumulated for each TE consensus. AVAILABILITY: PASTEC is available as a REPET module or standalone software (http://urgi.versailles.inra.fr/download/repet/REPET_linux-x64-2.2.tar.gz). It requires a Unix-like system. There are two standalone versions: one of which is parallelized (requiring Sun Grid Engine or Torque), and the other of which is not.

  16. Proposal of a new classification scheme for periocular injuries

    Directory of Open Access Journals (Sweden)

    Devi Prasad Mohapatra

    2017-01-01

    Full Text Available Background: Eyelids are important structures: they protect the globe from trauma and brightness, maintain the integrity of the tear film, move tears towards the lacrimal drainage system and contribute to the aesthetic appearance of the face. Ophthalmic trauma is an important cause of morbidity among individuals and has also been responsible for additional cost of healthcare. Periocular trauma involving eyelids and adjacent structures has been found to have increased recently, probably due to the increased pace of life and increased dependence on machinery. A comprehensive classification of periocular trauma would help in stratifying these injuries as well as in studying outcomes. Material and Methods: This study was carried out at our institute from June 2015 to Dec 2015. We searched multiple English language databases for existing classification systems for periocular trauma. We designed a system of classification of periocular soft tissue injuries based on clinico-anatomical presentations. This classification was applied prospectively to patients presenting with periocular soft tissue injuries to our department. Results: A comprehensive classification scheme was designed consisting of five types of periocular injuries. A total of 38 eyelid injuries in 34 patients were evaluated in this study. According to the System for Peri-Ocular Trauma (SPOT) classification, Type V injuries were most common. SPOT Type II injuries were more common isolated injuries among all zones. Discussion: Classification systems are necessary in order to provide a framework in which to scientifically study the etiology, pathogenesis, and treatment of diseases in an orderly fashion. The SPOT classification has taken into account the periocular soft tissue injuries, i.e., upper eyelid, lower eyelid, medial and lateral canthus injuries, based on observed clinico-anatomical patterns of eyelid injuries. Conclusion: The SPOT classification seems to be a reliable

  17. Data classification based on the hybrid intellectual technology

    Directory of Open Access Journals (Sweden)

    Demidova Liliya

    2018-01-01

    Full Text Available In this paper, a data classification technique based on the successive application of the SVM and Parzen classifiers is suggested. The Parzen classifier is applied to data which can be both correctly and erroneously classified using the SVM classifier, and which are located in the experimentally defined subareas near the hyperplane that separates the classes. Here, the SVM classifier is used with the default parameter values, and the optimal parameter values of the Parzen classifier are determined using a genetic algorithm. Experimental results confirming the effectiveness of the proposed hybrid intellectual data classification technology are presented.

  18. A Cognitive Computing Approach for Classification of Complaints in the Insurance Industry

    Science.gov (United States)

    Forster, J.; Entrup, B.

    2017-10-01

    In this paper we present and evaluate a cognitive computing approach for classification of dissatisfaction and four complaint-specific classes in correspondence documents between insurance clients and an insurance company. A cognitive computing approach includes the combination of classical natural language processing methods, machine learning algorithms and the evaluation of hypotheses. The approach combines a MaxEnt machine learning algorithm with language modelling, tf-idf and sentiment analytics to create a multi-label text classification model. The result is trained and tested with a set of 2500 original insurance communication documents written in German, which have been manually annotated by the partnering insurance company. With an F1-score of 0.9, a reliable text classification component has been implemented and evaluated. Finally, an outlook towards a cognitive computing insurance assistant is given.
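
    The tf-idf weighting mentioned in this record can be sketched as follows. This is a generic textbook formulation (term frequency times the logarithm of inverse document frequency), not the authors' pipeline; the function name `tf_idf` and the toy English tokens are hypothetical stand-ins for the German correspondence documents.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document,
    using tf = count / document length and idf = ln(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        length = len(tokens)
        weights.append({
            term: (count / length) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus of three tokenized messages.
corpus = [
    ["claim", "delayed", "claim"],
    ["claim", "paid"],
    ["service", "delayed"],
]
weights = tf_idf(corpus)
```

    Terms that appear in few documents receive higher weights than terms spread across the corpus, which is what makes the resulting vectors useful as classifier input.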

  19. Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor

    Directory of Open Access Journals (Sweden)

    Chang Xu

    2018-05-01

    Full Text Available This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is performed using a single anisotropic magnetoresistive sensor, with the vehicle models involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains: the original vehicle magnetic dataset collected is imbalanced, which may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.
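
    The oversampling idea behind SMOTE can be sketched briefly: a synthetic minority sample is placed at a random position on the segment between a minority sample and one of its nearest minority-class neighbours. The sketch below is a simplified illustration of that general idea, not the implementation used in the paper; the function name `smote`, its parameters and the toy points are hypothetical, and it assumes at least two minority samples.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples
    by interpolating towards one of the k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


# Hypothetical two-feature minority class with three samples.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote(minority_points, n_new=5)
```

    The synthetic points would be added to the training set only, so that a classifier such as KNN sees a more balanced class distribution while the test data stay untouched.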

  20. Archival classification: new usage scenarios among semantic web and traditio of digital samples

    Directory of Open Access Journals (Sweden)

    Alessandro Alfier

    2017-05-01

    Full Text Available Starting from the acknowledgement of the basic purpose assigned by tradition to classification within documents management, the article faces the issues related to new needs and usage, related to the digital scenarios, that would allow classification to consolidate its tradition of effectiveness in a new digital environment. The key point of the article is represented by the in-depth analysis of the possible synergies between classification-related activities and the International Standard for Describing Functions (ISDF, developed by ICA in 2007. The article highlights how an approach to classification elaborated from the ISDF perspective allows classification itself to enrich from purposes and semantic web related usage, and with the traditio of digital documents.

  1. THE LOW BACKSCATTERING TARGETS CLASSIFICATION IN URBAN AREAS

    Directory of Open Access Journals (Sweden)

    L. Shi

    2012-07-01

    Full Text Available Polarimetric and Interferometric Synthetic Aperture Radar (POLINSAR) is widely used in urban areas nowadays. Because of its physical and geometric sensitivity, POLINSAR is suitable for city classification, power-line detection, building extraction, etc. As a new X-band POLINSAR radar, the Chinese prototype airborne system XSAR works with high spatial resolution in azimuth (0.1 m) and slant range (0.4 m). In land applications, SAR image classification is a useful tool to distinguish areas of interest and obtain target information. Bare soil, cement roads, water and building shadows are common scenes in urban areas. Because low backscattering objects (LBO) with similar scattering mechanisms (all odd bounce except for shadow) always exist in XSAR images, classes are usually confused in the Wishart-H-Alpha and Freeman-Durden methods. It is very hard to distinguish those targets using only the general information. To overcome this shortcoming, this paper explores an improved algorithm for refined LBO classification based on Pre-Classification in urban areas. Firstly, the Pre-Classification is applied to the polarimetric data and the mixture class containing LBO is marked. Then, the polarimetric covariance matrix C3 is re-estimated on the Pre-Classification results to get more reliable results. Finally, the occurrence space, which combines the entropy and the phase-difference standard deviation between the HH and VV channels, is used to refine the Pre-Classification results. The XSAR airborne experiments show that the improved method has the potential to distinguish the mixture classes among low backscattering objects.

  2. Using Shakespeare's Sotto Voce to Determine True Identity From Text

    Science.gov (United States)

    Kernot, David; Bossomaier, Terry; Bradbury, Roger

    2018-01-01

    Little is known of the private life of William Shakespeare, but he is famous for his collection of plays and poems, even though many of the works attributed to him were published anonymously. Determining the identity of Shakespeare has fascinated scholars for 400 years, and four significant figures in English literary history have been suggested as likely alternatives to Shakespeare for some disputed works: Bacon, de Vere, Stanley, and Marlowe. A myriad of computational and statistical tools and techniques have been used to determine the true authorship of his works. Many of these techniques rely on basic statistical correlations, word counts, collocated word groups, or keyword density, but no one method has been decided on. We suggest that an alternative technique that uses word semantics to draw on personality can provide an accurate profile of a person. To test this claim, we analyse the works of Shakespeare, Christopher Marlowe, and Elizabeth Cary. We use Word Accumulation Curves, Hierarchical Clustering overlays, Principal Component Analysis, and Linear Discriminant Analysis techniques in combination with RPAS, a multi-faceted text analysis approach that draws on a writer's personality, or self to identify subtle characteristics within a person's writing style. Here we find that RPAS can separate the known authored works of Shakespeare from Marlowe and Cary. Further, it separates their contested works, works suspected of being written by others. While few authorship identification techniques identify self from the way a person writes, we demonstrate that these stylistic characteristics are as applicable 400 years ago as they are today and have the potential to be used within cyberspace for law enforcement purposes. PMID:29599734

  3. Classification, disease, and diagnosis.

    Science.gov (United States)

    Jutel, Annemarie

    2011-01-01

    Classification shapes medicine and guides its practice. Understanding classification must be part of the quest to better understand the social context and implications of diagnosis. Classifications are part of the human work that provides a foundation for the recognition and study of illness: deciding how the vast expanse of nature can be partitioned into meaningful chunks, stabilizing and structuring what is otherwise disordered. This article explores the aims of classification, their embodiment in medical diagnosis, and the historical traditions of medical classification. It provides a brief overview of the aims and principles of classification and their relevance to contemporary medicine. It also demonstrates how classifications operate as social framing devices that enable and disable communication, assert and refute authority, and are important items for sociological study.

  4. Classification of Hyperspectral Images Using Kernel Fully Constrained Least Squares

    Directory of Open Access Journals (Sweden)

    Jianjun Liu

    2017-11-01

    Full Text Available As a widely used classifier, sparse representation classification (SRC) has shown good performance for hyperspectral image classification. Recent works have highlighted that it is the collaborative representation mechanism under SRC that makes SRC a highly effective technique for classification purposes. If the dimensionality and the discrimination capacity of a test pixel are high, other norms (e.g., the ℓ2-norm) can be used to regularize the coding coefficients besides the sparsity-inducing ℓ1-norm. In this paper, we show that in the kernel space the nonnegative constraint can also play the same role, and thus suggest the investigation of kernel fully constrained least squares (KFCLS) for hyperspectral image classification. Furthermore, in order to improve the classification performance of KFCLS by incorporating spatial-spectral information, we investigate two kinds of spatial-spectral methods using two regularization strategies: (1) the coefficient-level regularization strategy, and (2) the class-level regularization strategy. Experimental results on four real hyperspectral images demonstrate the effectiveness of the proposed KFCLS, and show which way of incorporating spatial-spectral information is efficient in the regularization framework.

  5. Comparison between Possibilistic c-Means (PCM) and Artificial Neural Network (ANN) Classification Algorithms in Land use/ Land cover Classification

    Directory of Open Access Journals (Sweden)

    Ganchimeg Ganbold

    2017-03-01

    Full Text Available There are several statistical classification algorithms available for land use/land cover classification. However, each has a certain bias or compromise. Some methods, like the parallelepiped approach in supervised classification, cannot classify continuous regions within a feature. On the other hand, while the unsupervised classification method takes maximum advantage of spectral variability in an image, the maximally separable clusters in spectral space may not do much for our perception of important classes in a given study area. In this research, the output of an ANN algorithm was compared with the Possibilistic c-Means, an improvement of the fuzzy c-Means, on both moderate resolution Landsat 8 and high resolution Formosat 2 images. The Formosat 2 image comes with an 8 m spectral resolution on the multispectral data. This multispectral image data was resampled to 10 m in order to maintain a uniform ratio of 1:3 against the Landsat 8 image. Six classes were chosen for analysis: dense forest, eucalyptus, water, grassland, wheat and riverine sand. Using a standard false color composite (FCC), the six features reflected differently in the infrared region, with wheat producing the brightest pixel values. Signature collection per class was therefore easily obtained for all classifications. The outputs of both ANN and FCM were analyzed separately for accuracy, and an error matrix was generated to assess the quality and accuracy of the classification algorithms. When the results of the two methods are compared on a per-class basis, ANN had a crisper output compared to PCM, which yielded clusters with pixels especially on the moderate resolution Landsat 8 imagery.

  6. Feature generation and representations for protein-protein interaction classification.

    Science.gov (United States)

    Lan, Man; Tan, Chew Lim; Su, Jian

    2009-10-01

    Automatically detecting protein-protein interaction (PPI) relevant articles is a crucial step for large-scale biological database curation. Previous work adopted POS tagging, shallow parsing and sentence splitting techniques, but these achieved worse performance than the simple bag-of-words representation. In this paper, we generated and investigated multiple types of feature representations in order to further improve the performance of the PPI text classification task. Besides the traditional domain-independent bag-of-words approach and term weighting methods, we also explored other domain-dependent features, i.e. protein-protein interaction trigger keywords, protein named entities and advanced ways of incorporating Natural Language Processing (NLP) output. The integration of these multiple features has been evaluated on the BioCreAtIvE II corpus. The experimental results showed that both the advanced way of using NLP output and the integration of bag-of-words and NLP output improved the performance of text classification. Specifically, in comparison with the best performance achieved in the BioCreAtIvE II IAS, the feature-level and classifier-level integration of multiple features improved classification performance by 2.71% and 3.95%, respectively.

  7. A classification system for tableting behaviors of binary powder mixtures

    Directory of Open Access Journals (Sweden)

    Changquan Calvin Sun

    2016-08-01

    Full Text Available The ability to predict tableting properties of a powder mixture from individual components is of both fundamental and practical importance to the efficient formulation development of tablet products. A common tableting classification system (TCS) of binary powder mixtures facilitates the systematic development of new knowledge in this direction. Based on the dependence of tablet tensile strength on weight fraction in a binary mixture, three main types of tableting behavior are identified. Each type is further divided to arrive at a total of 15 sub-classes. The proposed classification system lays a framework for a better understanding of powder interactions during compaction. Potential applications and limitations of this classification system are discussed.

  8. Parallelizing Gene Expression Programming Algorithm in Enabling Large-Scale Classification

    Directory of Open Access Journals (Sweden)

    Lixiong Xu

    2017-01-01

    Full Text Available As one of the most effective function mining algorithms, the Gene Expression Programming (GEP) algorithm has been widely used in classification, pattern recognition, prediction, and other research fields. Based on self-evolution, GEP is able to mine an optimal function for dealing with further complicated tasks. However, in big data research, GEP suffers from low efficiency due to its long mining processes. To improve the efficiency of GEP in big data research, especially for processing large-scale classification tasks, this paper presents a parallelized GEP algorithm using the MapReduce computing model. The experimental results show that the presented algorithm is scalable and efficient for processing large-scale classification tasks.

  9. An evaluation of classification algorithms for intrusion detection ...

    African Journals Online (AJOL)

    An evaluation of classification algorithms for intrusion detection. ... Most of the available IDSs use all the 41 features in the network to evaluate and search for intrusive pattern in which ...

  10. Five-way smoking status classification using text hot-spot identification and error-correcting output codes.

    Science.gov (United States)

    Cohen, Aaron M

    2008-01-01

    We participated in the i2b2 smoking status classification challenge task. The purpose of this task was to evaluate the ability of systems to automatically identify patient smoking status from discharge summaries. Our submission included several techniques that we compared and studied, including hot-spot identification, zero-vector filtering, inverse class frequency weighting, error-correcting output codes, and post-processing rules. We evaluated our approaches using the same methods as the i2b2 task organizers, using micro- and macro-averaged F1 as the primary performance metric. Our best performing system achieved a micro-F1 of 0.9000 on the test collection, equivalent to the best performing system submitted to the i2b2 challenge. Hot-spot identification, zero-vector filtering, classifier weighting, and error correcting output coding contributed additively to increased performance, with hot-spot identification having by far the largest positive effect. High performance on automatic identification of patient smoking status from discharge summaries is achievable with the efficient and straightforward machine learning techniques studied here.
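
    The difference between the micro- and macro-averaged F1 metrics named in this abstract can be shown with a short sketch. The labels below are made-up illustration data, not i2b2 data, and the function names are hypothetical.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def micro_macro_f1(gold, pred):
    """Return (micro-F1, macro-F1) for single-label multi-class predictions."""
    labels = sorted(set(gold) | set(pred))
    per_class = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        per_class.append((tp, fp, fn))
    # Macro: average the per-class F1 scores, so rare classes count equally.
    macro = sum(f1(*counts) for counts in per_class) / len(labels)
    # Micro: pool the counts over all classes, so frequent classes dominate.
    micro = f1(sum(c[0] for c in per_class),
               sum(c[1] for c in per_class),
               sum(c[2] for c in per_class))
    return micro, macro

# Illustrative labels only (assumed data): one dominant class, two rare ones.
gold = ["unknown"] * 4 + ["non-smoker", "current smoker"]
pred = ["unknown"] * 4 + ["non-smoker", "non-smoker"]

micro, macro = micro_macro_f1(gold, pred)
print(round(micro, 4), round(macro, 4))  # 0.8333 0.5556
```

    In this toy case the single error on a rare class lowers macro-F1 far more than micro-F1, which is why reporting both is informative when class sizes are skewed.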

  11. English for Science and Technology - Stylistics and Methods

    DEFF Research Database (Denmark)

    Mousten, Birthe

    The book covers basic methods for summarizing and editing of EST writing (English for Science and Technology). In addition, translation of basically technically oriented texts is covered with a view to an evaluation of formality, complexity and audience recognition in connection with different text...

  12. Standard classification: Physics

    International Nuclear Information System (INIS)

    1977-01-01

    This is a draft standard classification of physics. The conception is based on the physics part of the systematic catalogue of the Bayerische Staatsbibliothek and on the classification given in standard textbooks. The ICSU-AB classification now used worldwide by physics information services was not taken into account. (BJ) [de

  13. Progress in the diagnosis and classification of pituitary adenomas

    Directory of Open Access Journals (Sweden)

    Luis V Syro

    2015-06-01

    Full Text Available Pituitary adenomas are common neoplasms. Their classification is based upon size, invasion of adjacent structures, sporadic or familial cases, biochemical activity, clinical manifestations, morphological characteristics, response to treatment and recurrence. Although they are considered benign tumors, some of them are difficult to treat due to their tendency to recur, despite standardized treatment. Functional tumors present other challenges for normalizing their biochemical activity. Novel approaches for early diagnosis as well as different perspectives on classification may help to identify subgroups of patients with similar characteristics, creating opportunities to match each patient with the best personalized treatment option. In this paper we present the progress in the diagnosis and classification of different subgroups of patients with pituitary tumors that may be managed with specific considerations according to their tumor subtype.

  14. CLINICAL AND MORPHOLOGICAL CLASSIFICATION OF CEREBRAL INTRAVENTRICULAR HEMORRHAGES

    Directory of Open Access Journals (Sweden)

    V. V. Vlasyuk

    2013-01-01

    Full Text Available Inconsistency of the current classification of cerebral intraventricular hemorrhages is discussed in the article. The author explains why including subependymal (1st stage) and intracerebral (4th stage) hemorrhages in this classification is inconsistent. A new classification of cerebral intraventricular hemorrhages including their origin, phases and stages is offered. The most common origin of intraventricular hemorrhages is subependymal hemorrhage (82.2%). Two phases of hemorrhage were distinguished: bleeding phase and resorption phase. Stages of intraventricular hemorrhages reflecting the blood movement after the onset of bleeding are the following: 1) infill of up to ½ of the lateral ventricles without their enlargement; 2) infill of more than ½ of the lateral ventricles with their enlargement; 3) infill of the IV ventricle, of the cerebellomedullary cistern and its dislocation into the subarachnoid space of the cerebellum, pons varolii, medulla oblongata and spinal cord.

  15. Classification of Franchise Networks in the Retail Trade

    Directory of Open Access Journals (Sweden)

    Grygorenko Tetyana M.

    2016-11-01

    Full Text Available The article clarifies the definitions of the concepts of «franchise network», «franchise trade network» and «franchise retail network», motivated by the lack of a unified approach to the interpretation of these concepts. A classification of franchise networks in the retail trade is developed, taking into account peculiarities in the operation of this sub-sector of the market economy; classification attributes are identified and types of franchise retail chains are characterized. The proposed classification of franchise retail networks is adapted to the economic situation in Ukraine and the specifics of national franchise relations. It will facilitate a deeper understanding of the essence of the formation and operation of franchise retail chains, help Ukrainian entrepreneurs to justify choosing the franchising model most suitable for them, and allow them to establish such a network with regard to various attributes using a complex approach.

  16. Vision-Based Perception and Classification of Mosquitoes Using Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Masataka Fuchida

    2017-01-01

    Full Text Available The need for a novel automated mosquito perception and classification method is becoming increasingly essential in recent years, with steeply increasing number of mosquito-borne diseases and associated casualties. There exist remote sensing and GIS-based methods for mapping potential mosquito inhabitants and locations that are prone to mosquito-borne diseases, but these methods generally do not account for species-wise identification of mosquitoes in closed-perimeter regions. Traditional methods for mosquito classification involve highly manual processes requiring tedious sample collection and supervised laboratory analysis. In this research work, we present the design and experimental validation of an automated vision-based mosquito classification module that can deploy in closed-perimeter mosquito inhabitants. The module is capable of identifying mosquitoes from other bugs such as bees and flies by extracting the morphological features, followed by support vector machine-based classification. In addition, this paper presents the results of three variants of support vector machine classifier in the context of mosquito classification problem. This vision-based approach to the mosquito classification problem presents an efficient alternative to the conventional methods for mosquito surveillance, mapping and sample image collection. Experimental results involving classification between mosquitoes and a predefined set of other bugs using multiple classification strategies demonstrate the efficacy and validity of the proposed approach with a maximum recall of 98%.

  17. A Classification System for Hospital-Based Infection Outbreaks

    Directory of Open Access Journals (Sweden)

    Paul S. Ganney

    2010-01-01

    Full Text Available Outbreaks of infection within semi-closed environments such as hospitals, whether inherent in the environment (such as Clostridium difficile (C. diff) or methicillin-resistant Staphylococcus aureus (MRSA)) or imported from the wider community (such as Norwalk-like viruses (NLVs)), are difficult to manage. As part of our work on modelling such outbreaks, we have developed a classification system to describe the impact of a particular outbreak upon an organization. This classification system may then be used in comparing appropriate computer models to real outbreaks, as well as in comparing different real outbreaks in, for example, the comparison of differing management and containment techniques and strategies. Data from NLV outbreaks in the Hull and East Yorkshire Hospitals NHS Trust (the Trust) over several previous years are analysed and classified, both for infection within staff (where the end of infection date may not be known) and within patients (where it generally is known). A classification system consisting of seven elements is described, along with a goodness-of-fit method for comparing a new classification to previously known ones, for use in evaluating a simulation against history and thereby determining how ‘realistic’ (or otherwise) it is.

  18. A simplified classification system for partially edentulous spaces

    Directory of Open Access Journals (Sweden)

    Bhandari Aruna J, Bhandari Akshay J

    2014-04-01

    Full Text Available Background: There is no single universally employed classification system that will specify the exact edentulous situation. Several classification systems exist to group the situation and avoid confusion. There are classifications based on edentulous areas, finished restored prostheses, type of direct retainers or fulcrum lines. Some are based on the placement of the implants. The widely accepted Kennedy-Applegate classification does not give any idea about length, span or number of teeth missing. Rule 6 governing the application of the Kennedy method states that additional edentulous areas are referred to as modification number 1, 2 etc. Rule 7 states that the extent of the modification is not considered; only the number of edentulous areas is considered. Hence there is a need to modify the Kennedy-Applegate System. Aims: This new classification system is an attempt to modify the Kennedy-Applegate System so as to give an exact idea about missing teeth, space, span, side and areas of partially edentulous arches. Methods and Material: This system will provide information regarding Maxillary or Mandibular partially edentulous arches, Left or Right side, length of the edentulous space, number of teeth missing and whether there will be a tooth-borne or tooth-tissue-borne prosthesis. Conclusions: This classification is easy to apply and communicate, and will also help to design the removable cast partial denture in a more logical and systematic way. Also, this system will give an idea of the edentulous status and the number of missing teeth in fixed, hybrid or implant prostheses.

  19. The paradox of atheoretical classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2016-01-01

    A distinction can be made between “artificial classifications” and “natural classifications,” where artificial classifications may adequately serve some limited purposes, but natural classifications are overall most fruitful by allowing inference and thus many different purposes. There is strong...... support for the view that a natural classification should be based on a theory (and, of course, that the most fruitful theory provides the most fruitful classification). Nevertheless, atheoretical (or “descriptive”) classifications are often produced. Paradoxically, atheoretical classifications may...... be very successful. The best example of a successful “atheoretical” classification is probably the prestigious Diagnostic and Statistical Manual of Mental Disorders (DSM) since its third edition from 1980. Based on such successes one may ask: Should the claim that classifications ideally are natural...

  20. EFFECTIVE MULTI-RESOLUTION TRANSFORM IDENTIFICATION FOR CHARACTERIZATION AND CLASSIFICATION OF TEXTURE GROUPS

    Directory of Open Access Journals (Sweden)

    S. Arivazhagan

    2011-11-01

    Full Text Available Texture classification is important in applications of computer image analysis for characterization or classification of images based on local spatial variations of intensity or color. Texture can be defined as consisting of mutually related elements. This paper proposes an experimental approach for identification of a suitable multi-resolution transform for characterization and classification of different texture groups based on statistical and co-occurrence features derived from multi-resolution transformed sub bands. The statistical and co-occurrence feature sets are extracted for various multi-resolution transforms such as Discrete Wavelet Transform (DWT), Stationary Wavelet Transform (SWT), Double Density Wavelet Transform (DDWT) and Dual Tree Complex Wavelet Transform (DTCWT), and then the transform that maximizes the texture classification performance for the particular texture group is identified.

  1. A review on fault classification methodologies in power transmission systems: Part-II

    Directory of Open Access Journals (Sweden)

    Avagaddi Prasad

    2018-05-01

    Full Text Available The vast extent of power systems and applications requires improved techniques for fault classification in power transmission systems, to increase the efficiency of the systems and to avoid major damage. For this purpose, the technical literature proposes a large number of methods. The paper analyzes the technical literature, summarizing the most important methods that can be applied to fault classification in power transmission systems. Part 2 of the article is named “A review on fault classification methodologies in power transmission systems”. In this Part 2 we discuss the advanced technologies developed by various researchers for fault classification in power transmission systems. Keywords: Transmission line protection, Protective relaying, Soft computing techniques

  2. CREST--classification resources for environmental sequence tags.

    Directory of Open Access Journals (Sweden)

    Anders Lanzén

    Full Text Available Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com.
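
    The lowest common ancestor idea named in this abstract can be sketched as follows. This is a generic illustration only, not CREST's implementation: the toy taxonomy is heavily simplified (intermediate ranks are skipped), the function names are hypothetical, and the alignment scoring and rank similarity criteria described above are not modelled.

```python
def lineage(taxon, parent):
    """Return the path from the root of the taxonomy down to `taxon`."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path[::-1]

def lowest_common_ancestor(taxa, parent):
    """Deepest node shared by the lineages of all taxa (None if no shared root)."""
    lineages = [lineage(t, parent) for t in taxa]
    lca = None
    for nodes in zip(*lineages):
        if len(set(nodes)) != 1:
            break
        lca = nodes[0]
    return lca

# Toy child -> parent map (assumed, simplified taxonomy for illustration).
parent = {
    "Bacteria": "root",
    "Archaea": "root",
    "Proteobacteria": "Bacteria",
    "Firmicutes": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Pseudomonas": "Proteobacteria",
    "Bacillus": "Firmicutes",
}

# A read whose best alignments hit several reference taxa is assigned to
# the deepest node that covers all of those hits.
print(lowest_common_ancestor(["Escherichia", "Pseudomonas"], parent))  # Proteobacteria
print(lowest_common_ancestor(["Escherichia", "Bacillus"], parent))     # Bacteria
```

    The design trades resolution for safety: the more the hits disagree, the higher in the taxonomy the assignment lands.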

  3. Combining Machine Learning and Natural Language Processing to Assess Literary Text Comprehension

    Science.gov (United States)

    Balyan, Renu; McCarthy, Kathryn S.; McNamara, Danielle S.

    2017-01-01

    This study examined how machine learning and natural language processing (NLP) techniques can be leveraged to assess the interpretive behavior that is required for successful literary text comprehension. We compared the accuracy of seven different machine learning classification algorithms in predicting human ratings of student essays about…

  4. Transporter Classification Database (TCDB)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Transporter Classification Database details a comprehensive classification system for membrane transport proteins known as the Transporter Classification (TC)...

  5. An evaluation of classification systems for stillbirth

    Directory of Open Access Journals (Sweden)

    Pattinson Robert

    2009-06-01

    Full Text Available Abstract Background: Audit and classification of stillbirths is an essential part of clinical practice and a crucial step towards stillbirth prevention. Due to the limitations of the ICD system and lack of an international approach to an acceptable solution, numerous disparate classification systems have emerged. We assessed the performance of six contemporary systems to inform the development of an internationally accepted approach. Methods: We evaluated the following systems: Amended Aberdeen, Extended Wigglesworth, PSANZ-PDC, ReCoDe, Tulip and CODAC. Nine teams from 7 countries applied the classification systems to cohorts of stillbirths from their regions using 857 stillbirth cases. The main outcome measures were: the ability to retain the important information about the death using the InfoKeep rating; the ease of use according to the Ease rating (both measures used a five-point scale with a score ... Results: InfoKeep scores were significantly different across the classifications (p ≤ 0.01) due to low scores for Wigglesworth and Aberdeen. CODAC received the highest mean (SD) score of 3.40 (0.73), followed by PSANZ-PDC, ReCoDe and Tulip [2.77 (1.00), 2.36 (1.21), 1.92 (1.24), respectively]. Wigglesworth and Aberdeen resulted in a high proportion of unexplained stillbirths and CODAC and Tulip the lowest. While Ease scores were different (p ≤ 0.01), all systems received satisfactory scores; CODAC received the highest score. Aberdeen and Wigglesworth showed poor agreement with kappas of 0.35 and 0.25 respectively. Tulip performed best with a kappa of 0.74. The remainder had good to fair agreement. Conclusion: The Extended Wigglesworth and Amended Aberdeen systems cannot be recommended for classification of stillbirths. Overall, CODAC performed best with PSANZ-PDC and ReCoDe performing well. Tulip was shown to have the best agreement and a low proportion of unexplained stillbirths. The virtues of these systems need to be considered in the development of an
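
    The agreement figures in this abstract are kappa statistics. The abstract does not say which kappa variant was used, so the sketch below assumes Cohen's kappa for two raters purely for illustration; the cause-of-death labels are made-up data and the function name is hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each assigned categories independently at their own base rates.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)

# Illustrative ratings only (assumed data): two teams classify eight cases.
team_a = ["placental", "placental", "infection", "unexplained",
          "unexplained", "placental", "infection", "unexplained"]
team_b = ["placental", "placental", "infection", "unexplained",
          "placental", "placental", "unexplained", "unexplained"]

kappa = cohens_kappa(team_a, team_b)
print(round(kappa, 3))  # 0.61
```

    Here the raw agreement is 0.75, but kappa is lower because some of that agreement would be expected by chance alone.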

  6. A Spectral-Texture Kernel-Based Classification Method for Hyperspectral Images

    Directory of Open Access Journals (Sweden)

    Yi Wang

    2016-11-01

    Full Text Available Classification of hyperspectral images always suffers from high dimensionality and very limited labeled samples. Recently, spectral-spatial classification has attracted considerable attention and can achieve higher classification accuracy and smoother classification maps. In this paper, a novel spectral-spatial classification method for hyperspectral images using kernel methods is investigated. For a given hyperspectral image, the principal component analysis (PCA) transform is first performed. Then, the first principal component of the input image is segmented into non-overlapping homogeneous regions by using the entropy rate superpixel (ERS) algorithm. Next, the local spectral histogram model is applied to each homogeneous region to obtain the corresponding texture features. Because this step is performed within each homogeneous region, instead of within a fixed-size image window, the obtained local texture features in the image are more accurate, which can effectively benefit the improvement of classification accuracy. In the following step, a contextual spectral-texture kernel is constructed by combining spectral information in the image and the extracted texture information using the linearity property of the kernel methods. Finally, the classification map is achieved by the support vector machine (SVM) classifier using the proposed spectral-texture kernel. Experiments on two benchmark airborne hyperspectral datasets demonstrate that our method can effectively improve classification accuracies, even though only a very limited training sample is available. Specifically, our method can achieve from 8.26% to 15.1% higher overall accuracy than the traditional SVM classifier. The performance of our method was further compared to several state-of-the-art classification methods for hyperspectral images using objective quantitative measures and a visual qualitative evaluation.
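
    The abstract says the spectral-texture kernel is built using the linearity property of kernels but does not give its exact form. One common way to do this, assumed here purely for illustration, is a weighted sum of two valid kernels, which is itself a valid kernel. The RBF base kernels, the weight `mu`, the `gamma` values and all feature vectors below are assumptions, not taken from the paper.

```python
import math

def rbf(x, y, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def composite_kernel(spec_x, tex_x, spec_y, tex_y,
                     mu=0.5, gamma_spec=1.0, gamma_tex=1.0):
    """Weighted sum of a spectral kernel and a texture kernel (0 <= mu <= 1)."""
    k_spec = rbf(spec_x, spec_y, gamma_spec)
    k_tex = rbf(tex_x, tex_y, gamma_tex)
    return mu * k_spec + (1 - mu) * k_tex

# Illustrative pixels only (assumed data): each has a spectral vector and
# a texture vector taken from its homogeneous region.
pixel_a = {"spec": [0.2, 0.4, 0.1], "tex": [0.5, 0.5]}
pixel_b = {"spec": [0.3, 0.4, 0.2], "tex": [0.1, 0.9]}

k_ab = composite_kernel(pixel_a["spec"], pixel_a["tex"],
                        pixel_b["spec"], pixel_b["tex"], mu=0.6)
k_ba = composite_kernel(pixel_b["spec"], pixel_b["tex"],
                        pixel_a["spec"], pixel_a["tex"], mu=0.6)
k_aa = composite_kernel(pixel_a["spec"], pixel_a["tex"],
                        pixel_a["spec"], pixel_a["tex"], mu=0.6)
print(k_ab, k_ba, k_aa)
```

    A Gram matrix filled with such values could be passed to any SVM implementation that accepts a precomputed kernel; `mu` controls how much the classifier relies on spectral versus texture similarity.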

  7. IMPROVING CLASSIFICATIONS OF ECONOMIC SCIENCES IN A THESAURUS

    Directory of Open Access Journals (Sweden)

    Sergey Vladimirovich Lesnikov

    2013-09-01

    Full Text Available The goal is to study the thesaurus as an instrument to define the classification of economic sciences, to adapt their classification to the increased information flow, to increase accuracy of allocation of information resources with consideration of the users’ needs, and to suggest alterations to the classification of economic sciences made by the Institute of Scientific Information for Social Sciences of the Russian Academy of Sciences (INION RAN) in 2001. The authors see the classification of economic sciences as a product of social communications theory – a differentiated aspect of social research. Modern science is subdivided into various aspects with varied subjects and methods. The latter overlap and form a hierarchy of concepts in science within the same research subject. The authors stress the importance of information retrieval systems for developing scientific knowledge. Information retrieval systems can immediately deliver data from different areas of science to the user, who can then integrate the information and obtain a vivid picture of the research subject. Search engines and rubricators are becoming increasingly important as there is a tendency to isolated thinking among many Internet users. The authors have devised an approach to using the thesaurus as a means of classifying sciences and as a hyperlanguage of science. The suggested methodological approach to structuring terms and notions via a thesaurus has been tested at Syktyvkar State University and the Syktyvkar branch of Saint-Petersburg Economic University. Methods: deduction, induction, analysis, synthesis, abstraction technique, classification. Results: the stages and main sections of the information-retrieval thesaurus of the hyperlanguage of economic science have been defined on the basis of existing classification systems of scientific knowledge. Scope of application of results: library services, information technology, education. DOI: http://dx.doi.org/10.12731/2218-7405-2013-8-22

  8. Classification of non-performing loans portfolio using Multilayer Perceptron artificial neural networks

    Directory of Open Access Journals (Sweden)

    Flávio Clésio Silva de Souza

    2014-06-01

    Full Text Available The purpose of the present research is to apply a Multilayer Perceptron (MLP) neural network technique to create classification models from a portfolio of Non-Performing Loans (NPLs) to classify this type of credit derivative. These credit derivatives are characterized as the amount of loans that were not paid and are already overdue more than 90 days. Since these titles are, because of legislative motives, moved by losses, Credit Rights Investment Funds (FDIC) perform the purchase of these debts and the recovery of the credits. Using the Multilayer Perceptron (MLP) architecture of Artificial Neural Network (ANN), classification models regarding the posterior recovery of these debts were created. To evaluate the performance of the models, classification evaluation metrics for neural networks with different architectures were presented. The results of the classifications were satisfactory, given that the classification models were successful under the presented economic cost structure.

  9. Monitoring nanotechnology using patent classifications: an overview and comparison of nanotechnology classification schemes

    Energy Technology Data Exchange (ETDEWEB)

    Jürgens, Björn, E-mail: bjurgens@agenciaidea.es [Agency of Innovation and Development of Andalusia, CITPIA PATLIB Centre (Spain); Herrero-Solana, Victor, E-mail: victorhs@ugr.es [University of Granada, SCImago-UGR (SEJ036) (Spain)

    2017-04-15

    Patents are an essential information source used to monitor, track, and analyze nanotechnology. When it comes to searching for nanotechnology-related patents, a keyword search is often incomplete and struggles to cover such an interdisciplinary discipline. Patent classification schemes can reveal far better results since they are assigned by experts who classify the patent documents according to their technology. In this paper, we present the most important classifications to search nanotechnology patents and analyze how nanotechnology is covered in the main patent classification systems used in search systems nowadays: the International Patent Classification (IPC), the United States Patent Classification (USPC), and the Cooperative Patent Classification (CPC). We conclude that nanotechnology has a significantly better patent coverage in the CPC since considerably more nanotechnology documents were retrieved than by using other classifications, and thus recommend its use for all professionals involved in nanotechnology patent searches.

  10. Monitoring nanotechnology using patent classifications: an overview and comparison of nanotechnology classification schemes

    International Nuclear Information System (INIS)

    Jürgens, Björn; Herrero-Solana, Victor

    2017-01-01

    Patents are an essential information source used to monitor, track, and analyze nanotechnology. When it comes to searching for nanotechnology-related patents, a keyword search is often incomplete and struggles to cover such an interdisciplinary discipline. Patent classification schemes can reveal far better results since they are assigned by experts who classify the patent documents according to their technology. In this paper, we present the most important classifications to search nanotechnology patents and analyze how nanotechnology is covered in the main patent classification systems used in search systems nowadays: the International Patent Classification (IPC), the United States Patent Classification (USPC), and the Cooperative Patent Classification (CPC). We conclude that nanotechnology has a significantly better patent coverage in the CPC since considerably more nanotechnology documents were retrieved than by using other classifications, and thus recommend its use for all professionals involved in nanotechnology patent searches.

  11. Single-labelled music genre classification using content-based features

    CSIR Research Space (South Africa)

    Ajoodha, R

    2015-11-01

    Full Text Available In this paper we use content-based features to perform automatic classification of music pieces into genres. We categorise these features into four groups: features extracted from the Fourier transform’s magnitude spectrum, features designed...

  12. Current Trends in the Molecular Classification of Renal Neoplasms

    Directory of Open Access Journals (Sweden)

    Andrew N. Young

    2006-01-01

    Full Text Available Renal cell carcinoma (RCC) is the most common form of kidney cancer in adults. RCC is a significant challenge for pathologic diagnosis and clinical management. The primary approach to diagnosis is by light microscopy, using the World Health Organization (WHO) classification system, which defines histopathologic tumor subtypes with distinct clinical behavior and underlying genetic mutations. However, light microscopic diagnosis of RCC subtypes is often difficult due to variable histology. In addition, the clinical behavior of RCC is highly variable and therapeutic response rates are poor. Few clinical assays are available to predict outcome in RCC or correlate behavior with histology. Therefore, novel RCC classification systems based on gene expression should be useful for diagnosis, prognosis, and treatment. Recent microarray studies have shown that renal tumors are characterized by distinct gene expression profiles, which can be used to discover novel diagnostic and prognostic biomarkers. Here, we review clinical features of kidney cancer, the WHO classification system, and the growing role of molecular classification for diagnosis, prognosis, and therapy of this disease.

  13. Domain Adaptation for Opinion Classification: A Self-Training Approach

    Directory of Open Access Journals (Sweden)

    Yu, Ning

    2013-03-01

    Full Text Available Domain transfer is a widely recognized problem for machine learning algorithms because models built upon one data domain generally do not perform well in another data domain. This is especially a challenge for tasks such as opinion classification, which often has to deal with insufficient quantities of labeled data. This study investigates the feasibility of self-training in dealing with the domain transfer problem in opinion classification via leveraging labeled data in non-target data domain(s) and unlabeled data in the target domain. Specifically, self-training is evaluated for effectiveness in sparse data situations and feasibility for domain adaptation in opinion classification. Three types of Web content are tested: edited news articles, semi-structured movie reviews, and the informal and unstructured content of the blogosphere. Findings of this study suggest that, when there are limited labeled data, self-training is a promising approach for opinion classification, although the contributions vary across data domains. Significant improvement was demonstrated for the most challenging data domain, the blogosphere, when a domain transfer-based self-training strategy was implemented.
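
    The self-training loop described in this abstract can be sketched in a few lines. This is a generic illustration, not the study's method: a nearest-centroid classifier on one-dimensional scores stands in for a real opinion classifier, the confidence rule (a distance margin) is an assumption, and all data and names are hypothetical.

```python
def centroids(points, labels):
    """Mean feature value per class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(x, cents):
    """Return (label, margin): nearest class and how much closer it is
    than the runner-up. Assumes at least two classes."""
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    margin = abs(x - cents[ranked[1]]) - abs(x - cents[ranked[0]])
    return ranked[0], margin

def self_train(labeled_x, labeled_y, unlabeled_x, min_margin=1.0, max_rounds=10):
    """Repeatedly add confidently predicted unlabeled points to the training set."""
    xs, ys, pool = list(labeled_x), list(labeled_y), list(unlabeled_x)
    for _ in range(max_rounds):
        cents = centroids(xs, ys)
        scored = [(x, *predict(x, cents)) for x in pool]
        confident = [(x, y) for x, y, margin in scored if margin >= min_margin]
        if not confident:
            break  # nothing left that the model is sure about
        for x, y in confident:
            xs.append(x)
            ys.append(y)
            pool.remove(x)
    return centroids(xs, ys), pool

# Labeled data from a source domain, unlabeled data from a shifted target
# domain (assumed toy values).
source_x, source_y = [0.0, 1.0, 9.0, 10.0], ["neg", "neg", "pos", "pos"]
target_x = [2.0, 3.0, 7.0, 8.0, 5.2]

final_centroids, still_unlabeled = self_train(source_x, source_y, target_x)
print(final_centroids)   # {'neg': 1.5, 'pos': 8.5}
print(still_unlabeled)   # [5.2]
```

    The class centroids drift toward the target-domain data as confident predictions are absorbed, while the ambiguous point stays unlabeled rather than being forced into a class.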

  14. A systematic literature review of automated clinical coding and classification systems.

    Science.gov (United States)

    Stanfill, Mary H; Williams, Margaret; Fenton, Susan H; Jenders, Robert A; Hersh, William R

    2010-01-01

    Clinical coding and classification processes transform natural language descriptions in clinical text into data that can subsequently be used for clinical care, research, and other purposes. This systematic literature review examined studies that evaluated all types of automated coding and classification systems to determine the performance of such systems. Studies indexed in Medline or other relevant databases prior to March 2009 were considered. The 113 studies included in this review show that automated tools exist for a variety of coding and classification purposes, focus on various healthcare specialties, and handle a wide variety of clinical document types. Automated coding and classification systems themselves are not generalizable, nor are the results of the studies evaluating them. Published research shows these systems hold promise, but these data must be considered in context, with performance relative to the complexity of the task and the desired outcome.

  15. Improving the Computational Performance of Ontology-Based Classification Using Graph Databases

    Directory of Open Access Journals (Sweden)

    Thomas J. Lampoltshammer

    2015-07-01

    Full Text Available The increasing availability of very high-resolution remote sensing imagery (i.e., from satellites, airborne laser scanning, or aerial photography) represents both a blessing and a curse for researchers. The manual classification of these images, or other similar geo-sensor data, is time-consuming and leads to subjective and non-deterministic results. For this reason, (semi-)automated classification approaches are in high demand in affected research areas. Ontologies provide a proper way of automated classification for various kinds of sensor data, including remotely sensed data. However, the processing of data entities, so-called individuals, is one of the most cost-intensive computational operations within ontology reasoning. Therefore, an approach based on graph databases is proposed to overcome the high time consumption of the classification task. The introduced approach shifts the classification task from the classical Protégé environment and its common reasoners to the proposed graph-based approaches. For the validation, the authors tested the approach on a simulation scenario based on a real-world example. The results demonstrate a promising improvement in classification speed, up to 80,000 times faster than the Protégé-based approach.

  16. EEG Eye State Identification Using Incremental Attribute Learning with Time-Series Classification

    Directory of Open Access Journals (Sweden)

    Ting Wang

    2014-01-01

    Full Text Available Eye state identification is a common time-series classification problem and a hot spot in recent research. Electroencephalography (EEG) is widely used in eye state classification to detect human cognitive state. Previous research has validated the feasibility of machine learning and statistical approaches for EEG eye state classification. This paper proposes a novel approach for EEG eye state identification using incremental attribute learning (IAL) based on neural networks. IAL is a novel machine learning strategy which gradually imports and trains features one by one. Previous studies have verified that such an approach is applicable to a number of pattern recognition problems. However, little of this previous research on IAL focused on its application to time-series problems. Therefore, it is still unknown whether IAL can be employed to cope with time-series problems like EEG eye state classification. Experimental results in this study demonstrate that, with proper feature extraction and feature ordering, IAL can not only cope efficiently with time-series classification problems, but also exhibit better classification performance in terms of classification error rates than conventional and some other approaches.

  17. APPLICATION OF SENSOR FUSION TO IMPROVE UAV IMAGE CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    S. Jabari

    2017-08-01

    Full Text Available Image classification is one of the most important tasks of remote sensing projects, including those based on UAV images. Improving the quality of UAV images directly affects the classification results and can save a huge amount of time and effort in this area. In this study, we show that sensor fusion can improve image quality, which in turn increases the accuracy of image classification. Here, we tested two sensor fusion configurations by using a Panchromatic (Pan) camera along with either a colour camera or a four-band multi-spectral (MS) camera. We use the Pan camera to benefit from its higher sensitivity and the colour or MS camera to benefit from its spectral properties. The resulting images are then compared to the ones acquired by a high resolution single Bayer-pattern colour camera (here referred to as HRC). We assessed the quality of the output images by performing image classification tests. The outputs show that the proposed sensor fusion configurations can achieve higher accuracies compared to the images of the single Bayer-pattern colour camera. Therefore, incorporating a Pan camera on board in UAV missions and performing image fusion can help achieve higher quality images and accordingly higher accuracy classification results.

  18. An Efficient Ensemble Learning Method for Gene Microarray Classification

    Directory of Open Access Journals (Sweden)

    Alireza Osareh

    2013-01-01

    Full Text Available Gene microarray analysis and classification have proven an effective route to the diagnosis of diseases and cancers. However, it has also been revealed that the basic classification techniques have intrinsic drawbacks in achieving accurate gene classification and cancer diagnosis. On the other hand, classifier ensembles have received increasing attention in various applications. Here, we address the gene classification issue using the RotBoost ensemble methodology. This method is a combination of the Rotation Forest and AdaBoost techniques, which in turn preserves both desirable features of an ensemble architecture, that is, accuracy and diversity. To select a concise subset of informative genes, 5 different feature selection algorithms are considered. To assess the efficiency of RotBoost, other nonensemble/ensemble techniques including Decision Trees, Support Vector Machines, Rotation Forest, AdaBoost, and Bagging are also deployed. Experimental results have revealed that the combination of the fast correlation-based feature selection method with the ICA-based RotBoost ensemble is highly effective for gene classification. In fact, the proposed method can create ensemble classifiers which outperform not only the classifiers produced by conventional machine learning but also the classifiers generated by two widely used conventional ensemble learning methods, that is, Bagging and AdaBoost.

  19. Hindi vowel classification using QCN-MFCC features

    Directory of Open Access Journals (Sweden)

    Shipra Mishra

    2016-09-01

    Full Text Available In the presence of environmental noise, speakers tend to increase their vocal effort to improve the audibility of their voice. This involuntary adjustment is known as the Lombard effect (LE). Due to LE the signal-to-noise ratio of speech increases, but at the same time the loudness, pitch and duration of phonemes change. Hence, the accuracy of automatic speech recognition systems degrades. In this paper, the effect of unsupervised equalization of the Lombard effect is investigated for a Hindi vowel classification task using a Hindi database designed at TIFR Mumbai, India. The proposed Quantile-based Dynamic Cepstral Normalization MFCC (QCN-MFCC) features, along with baseline MFCC features, have been used for vowel classification. A Hidden Markov Model (HMM) is used as the classifier. It is observed that QCN-MFCC features give a maximum improvement of 5.97% and 5% over MFCC features for the context-dependent and context-independent cases respectively. It is also observed that QCN-MFCC features give an improvement of 13% and 11.5% over MFCC features for context-dependent and context-independent classification of mid vowels.

  20. CLASS-PAIR-GUIDED MULTIPLE KERNEL LEARNING OF INTEGRATING HETEROGENEOUS FEATURES FOR CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    Q. Wang

    2017-10-01

    Full Text Available In recent years, many studies on remote sensing image classification have shown that using multiple features from different data sources can effectively improve the classification accuracy. As a very powerful means of learning, multiple kernel learning (MKL) can conveniently incorporate a variety of features. The conventional combined kernel learned by MKL can be regarded as a compromise among all basic kernels for all classes in classification. It is the best overall, but not optimal for each specific class. To address this problem, this paper proposes a class-pair-guided MKL method to integrate the heterogeneous features (HFs) from multispectral image (MSI) and light detection and ranging (LiDAR) data. In particular, the one-against-one strategy is adopted, which converts the multiclass classification problem into a plurality of two-class classification problems. Then, we select the best kernel from a pre-constructed set of basic kernels for each class-pair by kernel alignment (KA) in the process of classification. The advantage of the proposed method is that only the best kernel for the classification of any two classes is retained, which leads to greatly enhanced discriminability. Experiments are conducted on two real data sets, and the experimental results show that the proposed method achieves the best performance in terms of classification accuracies in integrating the HFs for classification when compared with several state-of-the-art algorithms.
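
Selecting a kernel per class-pair by kernel alignment can be sketched as follows. This is only an illustration under stated assumptions, not the paper's method: it uses the standard kernel-target alignment (Frobenius inner product between the Gram matrix and the label matrix, normalized by both norms), scalar toy features, and two hypothetical candidate kernels; the names `gram`, `alignment`, and `best_kernel_for_pair` are invented here.

```python
import math

def gram(xs, kernel):
    """Gram matrix of a kernel function over a list of samples."""
    return [[kernel(a, b) for b in xs] for a in xs]

def alignment(K, y):
    """Kernel-target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F) for labels in {-1, +1}."""
    n = len(y)
    num = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    norm_k = math.sqrt(sum(v * v for row in K for v in row))
    return num / (norm_k * n)  # ||yy^T||_F equals n when every label is +/-1

def best_kernel_for_pair(xs, y, kernels):
    """Return (name, score) of the candidate kernel with the highest alignment."""
    scores = {name: alignment(gram(xs, k), y) for name, k in kernels.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]

# Hypothetical candidates and toy samples from one class-pair.
kernels = {
    "linear": lambda a, b: a * b,
    "rbf": lambda a, b: math.exp(-(a - b) ** 2),
}
xs = [-2.0, -1.0, 1.0, 2.0]
y = [-1, -1, 1, 1]
name, score = best_kernel_for_pair(xs, y, kernels)
print(name, round(score, 3))  # linear 0.9
```

In a one-against-one setting, this selection would be repeated for every pair of classes, so different pairs may end up with different kernels.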

  1. Deteksi Penyakit Dengue Hemorrhagic Fever dengan Pendekatan One Class Classification

    Directory of Open Access Journals (Sweden)

    Zida Ziyan Azkiya

    2017-10-01

    Full Text Available A two-class classification problem maps input into two target classes. In certain cases, training data is available only in the form of a single class, as in the case of Dengue Hemorrhagic Fever (DHF) patients, where only data of positive patients is available. In this paper, we report our experiment in building a classification model for detecting DHF infection using the One Class Classification (OCC) approach. Data for this study is sourced from laboratory tests of patients with dengue fever. The OCC methods compared are One-Class Support Vector Machine and One-Class K-Means. The results show that the SVM method obtained precision = 1.0, recall = 0.993, F1 score = 0.997, and accuracy of 99.7%, while the K-Means method obtained precision = 0.901, recall = 0.973, F1 score = 0.936, and accuracy of 93.3%. This indicates that the SVM method is slightly superior to K-Means for One-Class Classification of DHF patients.
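
For readers checking the reported metrics, the F1 score is the harmonic mean of precision and recall. The short sketch below (the function name `f1_score` is a hypothetical helper, not taken from the paper) reproduces the K-Means figure from the precision and recall quoted in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as quoted for the One-Class K-Means model.
kmeans_f1 = f1_score(0.901, 0.973)
print(round(kmeans_f1, 3))  # 0.936
```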

  2. BioNames: linking taxonomy, texts, and trees

    Directory of Open Access Journals (Sweden)

    Roderic D.M. Page

    2013-10-01

    Full Text Available BioNames is a web database of taxonomic names for animals, linked to the primary literature and, wherever possible, to phylogenetic trees. It aims to provide a taxonomic “dashboard” where at a glance we can see a summary of the taxonomic and phylogenetic information we have for a given taxon and hence provide a quick answer to the basic question “what is this taxon?” BioNames combines classifications from the Global Biodiversity Information Facility (GBIF) and GenBank, images from the Encyclopedia of Life (EOL), animal names from the Index of Organism Names (ION), and bibliographic data from multiple sources including the Biodiversity Heritage Library (BHL) and CrossRef. The user interface includes display of full text articles, interactive timelines of taxonomic publications, and zoomable phylogenies. It is available at http://bionames.org.

  3. Artificial neural network classification using a minimal training set - Comparison to conventional supervised classification

    Science.gov (United States)

    Hepner, George F.; Logan, Thomas; Ritter, Niles; Bryant, Nevin

    1990-01-01

    Recent research has shown an artificial neural network (ANN) to be capable of pattern recognition and the classification of image data. This paper examines the potential for the application of neural network computing to satellite image processing. A second objective is to provide a preliminary comparison of conventional and ANN classification. An artificial neural network can be trained to do land-cover classification of satellite imagery using selected sites representative of each class in a manner similar to conventional supervised classification. One of the major problems associated with recognition and classification of patterns from remotely sensed data is the time and cost of developing a set of training sites. This research compares the use of an ANN back propagation classification procedure with a conventional supervised maximum likelihood classification procedure using a minimal training set. When using a minimal training set, the neural network is able to provide a land-cover classification superior to the classification derived from the conventional classification procedure. This research is the foundation for developing application parameters for further prototyping of software and hardware implementations for artificial neural networks in satellite image and geographic information processing.

  4. Small-scale classification schemes

    DEFF Research Database (Denmark)

    Hertzum, Morten

    2004-01-01

    Small-scale classification schemes are used extensively in the coordination of cooperative work. This study investigates the creation and use of a classification scheme for handling the system requirements during the redevelopment of a nation-wide information system. This requirements...... classification inherited a lot of its structure from the existing system and rendered requirements that transcended the framework laid out by the existing system almost invisible. As a result, the requirements classification became a defining element of the requirements-engineering process, though its main...... effects remained largely implicit. The requirements classification contributed to constraining the requirements-engineering process by supporting the software engineers in maintaining some level of control over the process. This way, the requirements classification provided the software engineers...

  5. CLASSIFICATION OF VISUAL IMPAIRED CHILDREN ON THE BASIS OF USAGE OF INFORMATION-COMMUNICATION TECHNOLOGIES USE IN EDUCATION.

    Directory of Open Access Journals (Sweden)

    K.O. Kosova

    2010-11-01

    Full Text Available This article discusses the development of a systematic classification of visually impaired children. The new classification should connect visual abilities with the use of specific software and hardware in education.

  6. Modified Mahalanobis Taguchi System for Imbalance Data Classification

    Directory of Open Access Journals (Sweden)

    Mahmoud El-Banna

    2017-01-01

    Full Text Available The Mahalanobis Taguchi System (MTS) is considered one of the most promising binary classification algorithms for handling imbalanced data. Unfortunately, MTS lacks a method for determining an efficient threshold for the binary classification. In this paper, a nonlinear optimization model, named the Modified Mahalanobis Taguchi System (MMTS), is formulated based on minimizing the distance between the MTS Receiver Operating Characteristics (ROC) curve and the theoretical optimal point. To validate the MMTS classification efficacy, it has been benchmarked against Support Vector Machines (SVMs), Naive Bayes (NB), Probabilistic Mahalanobis Taguchi Systems (PTM), Synthetic Minority Oversampling Technique (SMOTE), Adaptive Conformal Transformation (ACT), Kernel Boundary Alignment (KBA), Hidden Naive Bayes (HNB), and other improved Naive Bayes algorithms. MMTS outperforms the benchmarked algorithms, especially when the imbalance ratio is greater than 400. A real-life case study in the manufacturing sector is used to demonstrate the applicability of the proposed model and to compare its performance with the Mahalanobis Genetic Algorithm (MGA).
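
The idea of choosing a threshold by minimizing the distance between the ROC curve and the ideal point (false positive rate 0, true positive rate 1) can be sketched as below. This is an assumption-laden illustration, not the paper's nonlinear optimization model: it substitutes a brute-force search over observed scores, the scores stand in for Mahalanobis distances, and the names `roc_point` and `closest_to_ideal_threshold` as well as the data are hypothetical.

```python
import math

def roc_point(scores, labels, threshold):
    """Return (false positive rate, true positive rate) when score >= threshold means 'positive'."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return fp / (fp + tn), tp / (tp + fn)

def closest_to_ideal_threshold(scores, labels):
    """Pick the observed score whose ROC point lies nearest to (FPR=0, TPR=1)."""
    best = None
    for t in sorted(set(scores)):
        fpr, tpr = roc_point(scores, labels, t)
        dist = math.hypot(fpr, 1.0 - tpr)
        if best is None or dist < best[1]:
            best = (t, dist)
    return best

# Hypothetical distance-like scores; label 1 marks the rare (abnormal) class.
scores = [0.5, 0.8, 1.1, 3.0, 2.9, 3.5, 0.9, 1.0]
labels = [0, 0, 0, 0, 1, 1, 0, 1]
threshold, distance = closest_to_ideal_threshold(scores, labels)
print(threshold, round(distance, 4))  # 2.9 0.3887
```

Because both axes of the ROC plane are rates, the criterion is insensitive to class proportions, which is one reason distance-to-ideal thresholds are often considered for imbalanced data.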

  7. Multiview Discriminative Geometry Preserving Projection for Image Classification

    Directory of Open Access Journals (Sweden)

    Ziqiang Wang

    2014-01-01

    Full Text Available In many image classification applications, it is common to extract multiple visual features from different views to describe an image. Since different visual features have their own specific statistical properties and discriminative powers for image classification, the conventional solution for multiple view data is to concatenate these feature vectors as a new feature vector. However, this simple concatenation strategy not only ignores the complementary nature of different views, but also runs into the “curse of dimensionality.” To address this problem, we propose a novel multiview subspace learning algorithm in this paper, named multiview discriminative geometry preserving projection (MDGPP), for feature extraction and classification. MDGPP can not only preserve the intraclass geometry and interclass discrimination information under a single view, but also explore the complementary property of different views to obtain a low-dimensional optimal consensus embedding by using an alternating-optimization-based iterative algorithm. Experimental results on face recognition and facial expression recognition demonstrate the effectiveness of the proposed algorithm.

  8. A Two-Level Sound Classification Platform for Environmental Monitoring

    Directory of Open Access Journals (Sweden)

    Stelios A. Mitilineos

    2018-01-01

    Full Text Available STORM is an ongoing European research project that aims at developing an integrated platform for monitoring, protecting, and managing cultural heritage sites through technical and organizational innovation. Part of the scheduled preventive actions for the protection of cultural heritage is the development of wireless acoustic sensor networks (WASNs) that will be used for assessing the impact of human-generated activities as well as for monitoring potentially hazardous environmental phenomena. Collected sound samples will be forwarded to a central server where they will be automatically classified in a hierarchical manner; anthropogenic and environmental activity will be monitored, and stakeholders will be alerted in the case of potential malevolent behavior or natural phenomena like excess rainfall, fire, gale, high tides, and waves. Herein, we present an integrated platform that includes sound sample denoising using wavelets, feature extraction from sound samples, Gaussian mixture modeling of these features, and a powerful two-layer neural network for automatic classification. We contribute to previous work by extending the proposed classification platform to perform low-level classification too, i.e., to classify sounds into further subclasses that include airplane, car, and pistol sounds for the anthropogenic sound class; bird, dog, and snake sounds for the biophysical sound class; and fire, waterfall, and gale for the geophysical sound class. Classification results exhibit outstanding classification accuracy in both high-level and low-level classification, thus demonstrating the feasibility of the proposed approach.

  9. Automated classification of mouse pup isolation syllables: from cluster analysis to an Excel based ‘mouse pup syllable classification calculator’

    Directory of Open Access Journals (Sweden)

    Jasmine Grimsley

    2013-01-01

    Full Text Available Mouse pups vocalize at high rates when they are cold or isolated from the nest. The proportions of each syllable type produced carry information about disease state and are being used as behavioral markers for the internal state of animals. Manual classifications of these vocalizations identified ten syllable types based on their spectro-temporal features. However, manual classification of mouse syllables is time consuming and vulnerable to experimenter bias. This study uses an automated cluster analysis to identify acoustically distinct syllable types produced by CBA/CaJ mouse pups, and then compares the results to prior manual classification methods. The cluster analysis identified two syllable types, based on their frequency bands, that have continuous frequency-time structure, and two syllable types featuring abrupt frequency transitions. Although cluster analysis computed fewer syllable types than manual classification, the clusters represented well the probability distributions of the acoustic features within syllables. These probability distributions indicate that some of the manually classified syllable types are not statistically distinct. The characteristics of the four classified clusters were used to generate a Microsoft Excel-based mouse syllable classifier that rapidly categorizes syllables, with over a 90% match, into the syllable types determined by cluster analysis.

  10. Cataract in small animals: classification and treatment

    Directory of Open Access Journals (Sweden)

    Fahiano Montiani Ferreira

    1997-02-01

    Full Text Available Cataract means any opacity present in the lens, lens capsule or both. The opacities may vary in size, location, shape and rate of progression. Slit-lamp biomicroscopy makes it possible to examine them with precision, determining their exact location and peculiarities, resulting in a safe, accurate diagnosis. Due to their variable origin and appearance, several methods of classification have been used. Classifications by aetiology, grade of maturity, location and age of the patient are presented in this review. Surgical removal is the only effective therapy for this disease. Among the surgical techniques available to this day, endocapsular phacoemulsification excels for its better results, despite its high cost, when compared to classical intracapsular and extracapsular lens extractions.

  11. GRAPH-BASED SEMI-SUPERVISED HYPERSPECTRAL IMAGE CLASSIFICATION USING SPATIAL INFORMATION

    Directory of Open Access Journals (Sweden)

    N. Jamshidpour

    2017-09-01

    Full Text Available Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attention. For example, the lack of enough labeled samples and the high dimensionality problem are the two most important issues, which degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning (SSL) is to overcome these issues through the contribution of unlabeled samples, which are available in enormous amounts. In this paper, we propose a graph-based semi-supervised classification method, which uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationship among pixels in spectral and spatial spaces respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set is too scarce. When there were only five labeled samples for each class, the performance improved by 5.92% and 10.76% compared to spatial graph-based SSL, for the AVIRIS Indian Pine and Pavia University data sets respectively.
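
Merging two graph Laplacians into a weighted joint graph can be sketched as follows. The abstract does not give the merging rule, so the convex combination with a weight `alpha`, the unnormalized Laplacian L = D - W, and the tiny three-pixel adjacency matrices are all assumptions made for illustration; the function names are hypothetical.

```python
def laplacian(weights):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix with zero diagonal."""
    n = len(weights)
    return [
        [(sum(weights[i]) if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]

def joint_laplacian(w_spectral, w_spatial, alpha=0.5):
    """Convex combination alpha * L_spectral + (1 - alpha) * L_spatial (assumed merging rule)."""
    l_spec = laplacian(w_spectral)
    l_spat = laplacian(w_spatial)
    return [
        [alpha * a + (1.0 - alpha) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(l_spec, l_spat)
    ]

# Hypothetical 3-pixel example: spectral similarity vs. spatial adjacency.
w_spectral = [[0, 1, 0], [1, 0, 2], [0, 2, 0]]
w_spatial = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
L = joint_laplacian(w_spectral, w_spatial, alpha=0.5)
for row in L:
    print(row)
```

Each row of the merged matrix still sums to zero, so the result remains a valid Laplacian of a graph whose edge weights are the weighted average of the two input graphs.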

  12. OGIRISI: a New Journal of African Studies - Vol 5 (2008)

    African Journals Online (AJOL)

    A Stylistic Analysis Of The Language Of Political Campaigns In Nigeria: Evidence From The 2007 General Elections. VE Omozuwa, EUC Ezejideaku, 40-54. http://dx.doi.org/10.4314/og.v5i1.52327 ...

  13. Gynecomastia Classification for Surgical Management: A Systematic Review and Novel Classification System.

    Science.gov (United States)

    Waltho, Daniel; Hatchell, Alexandra; Thoma, Achilleas

    2017-03-01

    Gynecomastia is a common deformity of the male breast, where certain cases warrant surgical management. There are several surgical options, which vary depending on the breast characteristics. To guide surgical management, several classification systems for gynecomastia have been proposed. A systematic review was performed to (1) identify all classification systems for the surgical management of gynecomastia, and (2) determine the adequacy of these classification systems to appropriately categorize the condition for surgical decision-making. The search yielded 1012 articles, and 11 articles were included in the review. Eleven classification systems in total were ascertained, and a total of 10 unique features were identified: (1) breast size, (2) skin redundancy, (3) breast ptosis, (4) tissue predominance, (5) upper abdominal laxity, (6) breast tuberosity, (7) nipple malposition, (8) chest shape, (9) absence of sternal notch, and (10) breast skin elasticity. On average, classification systems included two or three of these features. Breast size and ptosis were the most commonly included features. Based on their review of the current classification systems, the authors believe the ideal classification system should be universal and cater to all causes of gynecomastia; be surgically useful and easy to use; and should include a comprehensive set of clinically appropriate patient-related features, such as breast size, breast ptosis, tissue predominance, and skin redundancy. None of the current classification systems appears to fulfill these criteria.

  14. Clay-illuvial soils in the Polish and international soil classifications

    Directory of Open Access Journals (Sweden)

    Kabała Cezary

    2015-12-01

    Full Text Available Soils with a clay-illuvial subsurface horizon are the most widespread soil type in Poland and differ significantly in morphology and properties developed under variable environmental conditions. Despite the long history of investigations, the rules of classification and cartography of clay-illuvial soils have been continually discussed and modified. The division of clay-illuvial soils into three soil types, introduced to the Polish soil classification in 2011, has been criticized as excessively extended, incoherent with the other parts and rules of the classification, hard to introduce in soil cartography and poorly correlated with the international soil classifications. Reintroducing a single type of clay-illuvial soils (“gleby płowe”) into the soil classification in Poland was justified and recommended, together with 10 soil subtypes listed in a hierarchical order. The subtypes may be combined if the soil has diagnostic features of more than one soil subtype. Clear rules of soil name generalization (reduction of the number of subtypes for one soil) were suggested for soil cartography on various scales. Among the most important of the distinguished soil subtypes are the “eroded” or “truncated” clay-illuvial soils.

  15. Agent Collaborative Target Localization and Classification in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Sheng Wang

    2007-07-01

    Full Text Available Wireless sensor networks (WSNs) are autonomous networks that have been frequently deployed to collaboratively perform target localization and classification tasks. Their autonomous and collaborative features resemble the characteristics of agents. Such similarities inspire the development of a heterogeneous agent architecture for WSNs in this paper. The proposed agent architecture views WSNs as multi-agent systems, and mobile agents are employed to reduce in-network communication. According to the architecture, an energy-based acoustic localization algorithm is proposed. In localization, the estimate of the target location is obtained by steepest descent search. The search algorithm adapts to measurement environments by dynamically adjusting its termination condition. With the agent architecture, target classification is accomplished by a distributed support vector machine (SVM). Mobile agents are employed for feature extraction and distributed SVM learning to reduce communication load. Desirable learning performance is guaranteed by combining support vectors and convex hull vectors. Fusion algorithms are designed to merge SVM classification decisions made from various modalities. Real-world experiments with MICAz sensor nodes are conducted for vehicle localization and classification. Experimental results show that the proposed agent architecture remarkably facilitates WSN designs and algorithm implementation. The localization and classification algorithms also prove to be accurate and energy efficient.

  16. A Soft Intelligent Risk Evaluation Model for Credit Scoring Classification

    Directory of Open Access Journals (Sweden)

    Mehdi Khashei

    2015-09-01

    Full Text Available Risk management is one of the most important branches of business and finance. Classification models are the most popular and widely used analytical group of data mining approaches that can greatly help financial decision makers and managers to tackle credit risk problems. However, the literature clearly indicates that, despite proposing numerous classification models, credit scoring is often a difficult task. On the other hand, there is no universal credit-scoring model in the literature that can be accurately and explanatorily used in all circumstances. Therefore, the research for improving the efficiency of credit-scoring models has never stopped. In this paper, a hybrid soft intelligent classification model is proposed for credit-scoring problems. In the proposed model, the unique advantages of the soft computing techniques are used in order to modify the performance of the traditional artificial neural networks in credit scoring. Empirical results of Australian credit card data classifications indicate that the proposed hybrid model outperforms its components, and also other classification models presented for credit scoring. Therefore, the proposed model can be considered as an appropriate alternative tool for binary decision making in business and finance, especially in high uncertainty conditions.

  17. [The inadequacy of official classification of work accidents in Brazil].

    Science.gov (United States)

    Cordeiro, Ricardo

    2018-02-19

    Traditionally, work accidents in Brazil have been categorized in government documents and legal and academic texts as typical work accidents and commuting accidents. Given the increase in urban violence and the increasingly precarious work conditions in recent decades, this article addresses the conceptual inadequacy of this classification and its implications for the underestimation of work accidents in the country. An alternative classification is presented as an example and a contribution to the discussion on the improvement of statistics on work-related injuries in Brazil.

  18. Mining protein function from text using term-based support vector machines

    Science.gov (United States)

    Rice, Simon B; Nenadic, Goran; Stapley, Benjamin J

    2005-01-01

    Background Text mining has spurred huge interest in the domain of biology. The goal of the BioCreAtIvE exercise was to evaluate the performance of current text mining systems. We participated in Task 2, which addressed assigning Gene Ontology terms to human proteins and selecting relevant evidence from full-text documents. We approached it as a modified form of the document classification task. We used a supervised machine-learning approach (based on support vector machines) to assign protein function and select passages that support the assignments. As classification features, we used a protein's co-occurring terms that were automatically extracted from documents. Results The results evaluated by curators were modest, and quite variable for different problems: in many cases we have relatively good assignment of GO terms to proteins, but the selected supporting text was typically non-relevant (precision spanning from 3% to 50%). The method appears to work best when a substantial set of relevant documents is obtained, while it works poorly on single documents and/or short passages. The initial results suggest that our approach can also mine annotations from text even when an explicit statement relating a protein to a GO term is absent. Conclusion A machine learning approach to mining protein function predictions from text can yield good performance only if sufficient training data is available, and significant amount of supporting data is used for prediction. The most promising results are for combined document retrieval and GO term assignment, which calls for the integration of methods developed in BioCreAtIvE Task 1 and Task 2. PMID:15960835

  19. Experiments on Classification of Electroencephalography (EEG) Signals in Imagination of Direction using Stacked Autoencoder

    Directory of Open Access Journals (Sweden)

    Kenta Tomonaga

    2017-08-01

    Full Text Available This paper presents classification methods for electroencephalography (EEG) signals in imagination of direction measured by a portable EEG headset. In the authors' previous studies, principal component analysis extracted significant features from EEG signals to construct neural network classifiers. To improve the performance, the authors have implemented a Stacked Autoencoder (SAE) for the classification. The SAE carries out feature extraction and classification in the form of a multi-layered neural network. Experimental results showed that the SAE outperformed the previous classifiers.

  20. Mineral resources of Slovakia, questions of classification and valuation

    Directory of Open Access Journals (Sweden)

    Baláž Peter

    1999-06-01

    Full Text Available According to the Constitution of the Slovak Republic, the mineral resources of Slovakia are owned by the Slovak Republic. In 1997, 721 exclusive mineral deposits of mineral fuels, metals and industrial minerals were registered in Slovakia. The classification into economic and uneconomic reserves/resources requires annual updating to reflect changes in market mineral prices and mine production costs. In terms of economic valuation of mineral resources, a new United Nations international classification for reserves/resources appears to be a promising alternative. Changes to geological and mining legislation are necessary for a real valuation of Slovak mineral resources.

  1. Supervised Self-Organizing Classification of Superresolution ISAR Images: An Anechoic Chamber Experiment

    Directory of Open Access Journals (Sweden)

    Radoi Emanuel

    2006-01-01

    Full Text Available The problem of the automatic classification of superresolution ISAR images is addressed in the paper. We describe an anechoic chamber experiment involving ten scale-reduced aircraft models. The radar images of these targets are reconstructed using the MUSIC-2D (multiple signal classification) method coupled with two additional processing steps: phase unwrapping and symmetry enhancement. A feature vector is then proposed including Fourier descriptors and moment invariants, which are calculated from the target shape and the scattering center distribution extracted from each reconstructed image. The classification is finally performed by a new self-organizing neural network called SART (supervised ART), which is compared to two standard classifiers, MLP (multilayer perceptron) and fuzzy KNN (k-nearest neighbors). While the classification accuracy is similar, SART is shown to outperform the two other classifiers in terms of training speed and classification speed, especially for large databases. It is also easier to use since it does not require any input parameter related to its structure.

  2. CLASSIFICATION OF CROPLANDS THROUGH FUSION OF OPTICAL AND SAR TIME SERIES DATA

    Directory of Open Access Journals (Sweden)

    S. Park

    2016-06-01

    Full Text Available Many satellite sensors including the Landsat series have been extensively used for land cover classification. Studies have been conducted to mitigate classification problems associated with the use of single data (e.g., cloud contamination) through multi-sensor data fusion and the use of time series data. This study investigated two areas with different environment and climate conditions: one in South Korea and the other in the US. Cropland classification was conducted by using multi-temporal Landsat 5, Radarsat-1 and digital elevation models (DEM) based on two machine learning approaches (i.e., random forest and support vector machines). Seven classification scenarios were examined and evaluated through accuracy assessment. Results show that SVM produced the best performance (overall accuracy of 93.87%) when using all temporal and spectral data as input variables. Normalized Difference Water Index (NDWI), SAR backscattering, and Normalized Difference Vegetation Index (NDVI) were identified as more contributing variables than the others for cropland classification.
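
    The two spectral indices named in this abstract share the same normalized-difference form. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and the green/NIR form of NDWI is an assumption, since the abstract does not say which NDWI variant was used.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b); defined as 0.0 when both inputs are zero."""
    total = a + b
    if total == 0:
        return 0.0
    return (a - b) / total


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return normalized_difference(nir, red)


def ndwi(green: float, nir: float) -> float:
    """NDWI in its green/NIR form (assumed variant, see lead-in)."""
    return normalized_difference(green, nir)


if __name__ == "__main__":
    # Made-up reflectance values, for illustration only.
    print(ndvi(nir=0.5, red=0.1))
    print(ndwi(green=0.3, nir=0.1))
```

    Both indices fall in the range -1 to 1 for non-negative reflectances, which makes them convenient as per-pixel input variables to classifiers such as random forest or SVM.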

  3. A Novel Vehicle Classification Using Embedded Strain Gauge Sensors

    Directory of Open Access Journals (Sweden)

    Qi Wang

    2008-11-01

    Full Text Available This paper presents a new vehicle classification and develops a traffic monitoring detector to provide reliable vehicle classification to aid traffic management systems. The approach measures the dynamic strain caused by vehicles crossing the pavement to obtain the corresponding vehicle parameters, wheelbase and number of axles, and then classifies the vehicle accurately. A system prototype with five embedded strain sensors was developed to validate the accuracy and effectiveness of the classification method. From the special arrangement of the sensors and the different times at which a vehicle arrives at the sensors, one can accurately estimate the vehicle's speed and, from it, the vehicle wheelbase and number of axles. Because of measurement errors and vehicle characteristics, there is a lot of overlap between vehicle wheelbase patterns. Therefore, directly setting up a fixed threshold for vehicle classification often leads to low-accuracy results. Using machine learning pattern recognition to deal with this problem is believed to be one of the most effective approaches. In this study, support vector machines (SVMs) were used to integrate the classification features extracted from the strain sensors to automatically classify vehicles into five types, ranging from small vehicles to combination trucks, along the lines of the Federal Highway Administration vehicle classification guide. Test bench and field experiments will be introduced in this paper. Two support vector machine classification algorithms (one-against-all, one-against-one) are used to classify single sensor data and multiple sensor combination data. Comparison of the two classification method results shows that the classification accuracy is very close using single data or multiple data. Our results indicate that using multiclass SVM-based fusion multiple sensor data significantly improves

  4. Noi restituiri enesciene

    Directory of Open Access Journals (Sweden)

    Țăranu

    2017-10-01

    Full Text Available The article refers to George Enescu's early creation or, more precisely, to what is preserved from this section in the archives of the George Enescu Museum in Bucharest. "The fantasy for violin and orchestra", which dates from 1896–1897 and is an unsigned and unfinished score without a well-defined stylistic individuality; the first part of the "Concert for piano and orchestra" (1897), still tending to a classic paradigm, with certain post-Brahmsian echoes; the variants of a "Suite Roumaine" (1896), to be considered for the presence of folkloric "engrams"; the "Piano Suite in Four Hands" (1898), stylistically representing the academic area; and finally, "Pastorale Fantezie pour petit orchestre" (1899), which takes place on more personal stylistic coordinates, come one by one into the attention of the author. George Enescu's school works denote a sovereign dominance of classical forms, an admirable mastery in symphonic construction, and a remarkable feeling for the features of concert and chamber music. The research highlights the evolution of the young composer in time, his stylistic maturity, and the gradual change of his creative optics: from the preoccupation with academic exigencies to his own language, marked by elements of Romanian folklore and announcing the appearance of the two "Romanian Rhapsodies".

  5. Investigating and Annotating the Role of Citation in Biomedical Full-Text Articles.

    Science.gov (United States)

    Yu, Hong; Agarwal, Shashank; Frid, Nadya

    2009-11-01

    Citations are ubiquitous in scientific articles and play important roles for representing the semantic content of a full-text biomedical article. In this work, we manually examined full-text biomedical articles to analyze the semantic content of citations in full-text biomedical articles. After developing a citation relation schema and annotation guideline, our pilot annotation results show an overall agreement of 0.71, and here we report on the research challenges and the lessons we've learned while trying to overcome them. Our work is a first step toward automatic citation classification in full-text biomedical articles, which may contribute to many text mining tasks, including information retrieval, extraction, summarization, and question answering.

  6. A 38 million words Dutch text corpus and its users | Kruyt | Lexikos

    African Journals Online (AJOL)

    In August 1996, the 38 Million Words Corpus was available for consultation by the international research community. The present paper reports on the characteristics of this corpus (design, text classification, linguistic annotation) and on its use, both in dictionary projects and in linguistic research. In spite of limitations with ...

  7. ASIST SIG/CR Classification Workshop 2000: Classification for User Support and Learning.

    Science.gov (United States)

    Soergel, Dagobert

    2001-01-01

    Reports on papers presented at the 62nd Annual Meeting of ASIST (American Society for Information Science and Technology) for the Special Interest Group in Classification Research (SIG/CR). Topics include types of knowledge; developing user-oriented classifications, including domain analysis; classification in the user interface; and automatic…

  8. Couinaud's classification v.s. Cho's classification. Their feasibility in the right hepatic lobe

    International Nuclear Information System (INIS)

    Shioyama, Yasukazu; Ikeda, Hiroaki; Sato, Motohito; Yoshimi, Fuyo; Kishi, Kazushi; Sato, Morio; Kimura, Masashi

    2008-01-01

    The objective of this study was to investigate whether the new classification system proposed by Cho is feasible for clinical use compared with the classical Couinaud classification. One hundred consecutive abdominal CT cases were studied using a 64-slice or an 8-slice multislice CT scanner, and three-dimensional portal vein images were created for analysis on a workstation. We applied both Cho's classification and the classical Couinaud classification to each case according to their definitions. Three diagnostic radiologists rated feasibility from category 1 (unable to classify) to category 5 (clearly classifiable, in full agreement with the original classification criteria). In each case, we also tried to judge whether Cho's or the classical Couinaud classification conveyed anatomical information more easily. The readers could classify portal veins clearly (category 5) in 77 to 80% of cases, and clearly (category 5) or almost clearly (category 4) in 86 to 93% of cases, with both classifications. There was no statistically significant difference in feasibility between the two classifications. In 15 cases we felt that Couinaud's classification was more convenient than Cho's for conveying anatomical information to physicians, because in these cases two large portal veins ramified cranially and caudally from the right main portal vein and we could not classify P5 as a branch of the antero-ventral segment (AVS). Conversely, in 17 cases we felt that Cho's classification was more convenient, because the right posterior portal vein ramified into several small branches and we could not divide the right posterior branch into P6 and P7. The anterior fissure vein was clearly seen in only 60 cases. Comparing the classical Couinaud classification with Cho's in terms of feasibility of classification, there was no statistically significant difference. We propose we routinely report hepatic anatomy with the classical Couinaud classification and in the preoperative cases we

  9. Werner State Structure and Entanglement Classification

    Directory of Open Access Journals (Sweden)

    David W. Lyons

    2012-01-01

    Full Text Available We present applications of the representation theory of Lie groups to the analysis of structure and local unitary classification of Werner states, sometimes called the decoherence-free states, which are states of n quantum bits left unchanged by local transformations that are the same on each particle. We introduce a multiqubit generalization of the singlet state and a construction that assembles these qubits into Werner states.

  10. Odontoid fracture that is not listed in the existing classifications A new subtype of odontoid fracture: case report

    Directory of Open Access Journals (Sweden)

    Adam D.

    2016-03-01

    Full Text Available Background: There is a significant variety of odontoid fracture classifications along with corresponding treatment strategies. There are though cases which cannot be framed within the existing classifications.

  11. Pathological Bases for a Robust Application of Cancer Molecular Classification

    Directory of Open Access Journals (Sweden)

    Salvador J. Diaz-Cano

    2015-04-01

    Full Text Available Any robust classification system depends on its purpose and must refer to accepted standards, its strength relying on predictive values and a careful consideration of known factors that can affect its reliability. In this context, a molecular classification of human cancer must refer to the current gold standard (histological classification) and try to improve it with key prognosticators for metastatic potential, staging and grading. Although organ-specific examples have been published based on proteomics, transcriptomics and genomics evaluations, the most popular approach uses gene expression analysis as a direct correlate of cellular differentiation, which represents the key feature of the histological classification. RNA is a labile molecule that varies significantly with the preservation protocol; its transcription reflects the adaptation of the tumor cells to the microenvironment, it can be passed on through mechanisms of intercellular transfer of genetic information (exosomes), and it is exposed to epigenetic modifications. More robust classifications should be based on stable molecules, at the genetic level represented by DNA, to improve reliability, and their analysis must deal with the concept of intratumoral heterogeneity, which is at the origin of tumor progression and is the byproduct of the selection process during the clonal expansion and progression of neoplasms. The simultaneous analysis of multiple DNA targets and next generation sequencing offer the best practical approach for an analytical genomic classification of tumors.

  12. The Improvement of Land Cover Classification by Thermal Remote Sensing

    Directory of Open Access Journals (Sweden)

    Liya Sun

    2015-06-01

    Full Text Available Land cover classification has been widely investigated in remote sensing for agricultural, ecological and hydrological applications. Landsat images with multispectral bands are commonly used to study the numerous classification methods in order to improve the classification accuracy. Thermal remote sensing provides valuable information to investigate the effectiveness of the thermal bands in extracting land cover patterns. k-NN and Random Forest algorithms were applied to both the single Landsat 8 image and the time series Landsat 4/5 images for the Attert catchment in the Grand Duchy of Luxembourg, trained and validated by the ground-truth reference data considering the three-level classification scheme from COoRdination of INformation on the Environment (CORINE) using the 10-fold cross validation method. The accuracy assessment showed that compared to the visible and near infrared (VIS/NIR) bands, the time series of thermal images alone can produce comparatively reliable land cover maps with the best overall accuracy of 98.7% to 99.1% for Level 1 classification and 93.9% to 96.3% for the Level 2 classification. In addition, the combination with the thermal band improves the overall accuracy by 5% and 6% for the single Landsat 8 image in Level 2 and Level 3 category and provides the best classified results with all seven bands for the time series of Landsat TM images.

  13. A new circulation type classification based upon Lagrangian air trajectories

    Directory of Open Access Journals (Sweden)

    Alexandre M. Ramos

    2014-10-01

    Full Text Available A new classification method of the large-scale circulation characteristic for a specific target area (NW Iberian Peninsula) is presented, based on the analysis of 90-h backward trajectories arriving in this area calculated with the 3-D Lagrangian particle dispersion model FLEXPART. A cluster analysis is applied to separate the backward trajectories in up to five representative air streams for each day. Specific measures are then used to characterise the distinct air streams (e.g., curvature of the trajectories, cyclonic or anticyclonic flow, moisture evolution, origin and length of the trajectories). The robustness of the presented method is demonstrated in comparison with the Eulerian Lamb weather type classification. A case study of the 2003 heatwave is discussed in terms of the new Lagrangian circulation and the Lamb weather type classifications. It is shown that the new classification method adds valuable information about the pertinent meteorological conditions, which are missing in an Eulerian approach. The new method is climatologically evaluated for the five-year time period from December 1999 to November 2004. The ability of the method to capture the inter-seasonal circulation variability in the target region is shown. Furthermore, the multi-dimensional character of the classification is briefly discussed, in particular with respect to inter-seasonal differences. Finally, the relationship between the new Lagrangian classification and the precipitation in the target area is studied.

  14. LOCAL WEATHER CLASSIFICATIONS FOR ENVIRONMENTAL APPLICATIONS

    Directory of Open Access Journals (Sweden)

    Katarzyna PIOTROWICZ

    2013-03-01

    Full Text Available Two approaches to local weather type definitions are presented and illustrated for selected stations in Poland and Hungary. The subjective classification, continuing long traditions, especially in Poland, relies on diurnal values of local weather elements. The main types are defined according to temperature, with some sub-types considering relative sunshine duration, diurnal precipitation totals, relative humidity and wind speed. The classification does not make a difference between the seasons of the year, but the occurrence of the classes obviously reflects the annual cycle. Another important feature of this classification is that only a minor part of the theoretically possible combinations of the various types and sub-types occurs in all stations of both countries. The objective version of the classification starts from ten possible weather elements, which are reduced to four according to factor analysis, based on strong correlation between the elements. This analysis yields 3 to 4 factors depending on the specific criteria of selection. The further cluster analysis uses four selected weather elements belonging to different rotated factors. They are the diurnal mean values of temperature, of relative humidity, of cloudiness and of wind speed. From the possible ways of hierarchical cluster analysis (i.e. no a priori assumption on the number of classes), the method of furthest neighbours is selected, and the arguments for this decision are given in the paper. These local weather types are important tools in understanding the role of weather in various environmental indicators, in climatic generalisation of short samples by stratified sampling and in interpretation of climate change.

  15. On the Issue in Classification of Financial Control Types

    Directory of Open Access Journals (Sweden)

    Lvova I. G.

    2014-10-01

    Full Text Available The article is devoted to issues in classifying the types (forms) of financial supervision, in order to regulate legal budget relationships. The author analyzes the approaches existing in the scientific literature to the concept and content of internal and external controls.

  16. Improving Anomaly Detection for Text-Based Protocols by Exploiting Message Structures

    Directory of Open Access Journals (Sweden)

    Christian M. Mueller

    2010-12-01

    Full Text Available Service platforms using text-based protocols need to be protected against attacks. Machine-learning algorithms with pattern matching can be used to detect even previously unknown attacks. In this paper, we present an extension to known Support Vector Machine (SVM) based anomaly detection algorithms for the Session Initiation Protocol (SIP). Our contribution is to extend the number of different features used for classification (the feature space) by exploiting the structure of SIP messages, which reduces the false positive rate. Additionally, we show how combining our approach with attribute reduction significantly improves throughput.

  17. Classification of Pulse Waveforms Using Edit Distance with Real Penalty

    Directory of Open Access Journals (Sweden)

    Zhang Dongyu

    2010-01-01

    Full Text Available Advances in sensor and signal processing techniques have provided effective tools for quantitative research in traditional Chinese pulse diagnosis (TCPD). Because of the inevitable intraclass variation of pulse patterns, the automatic classification of pulse waveforms has remained a difficult problem. In this paper, by referring to the edit distance with real penalty (ERP) and the recent progress in k-nearest neighbors (KNN) classifiers, we propose two novel ERP-based KNN classifiers. Taking advantage of the metric property of ERP, we first develop an ERP-induced inner product and a Gaussian ERP kernel, then embed them into difference-weighted KNN classifiers, and finally develop two novel classifiers for pulse waveform classification. The experimental results show that the proposed classifiers are effective for accurate classification of pulse waveforms.
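
    As a rough illustration of the distance this abstract builds on, the sketch below computes the edit distance with real penalty by dynamic programming and plugs it into a plain majority-vote KNN. It is not the paper's method: the difference-weighted KNN, the ERP-induced inner product and the Gaussian ERP kernel are not reproduced, the function names are hypothetical, and the gap reference value of 0.0 is an assumed default.

```python
from typing import Hashable, List, Sequence, Tuple


def erp_distance(x: Sequence[float], y: Sequence[float], gap: float = 0.0) -> float:
    """Edit distance with real penalty between two real-valued sequences."""
    n, m = len(x), len(y)
    # dp[i][j] holds the ERP distance between the prefixes x[:i] and y[:j].
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + abs(x[i - 1] - gap)
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + abs(y[j - 1] - gap)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(
                dp[i - 1][j - 1] + abs(x[i - 1] - y[j - 1]),  # match both elements
                dp[i - 1][j] + abs(x[i - 1] - gap),           # align x[i-1] with a gap
                dp[i][j - 1] + abs(y[j - 1] - gap),           # align y[j-1] with a gap
            )
    return dp[n][m]


def knn_predict(
    query: Sequence[float],
    train: List[Tuple[Sequence[float], Hashable]],
    k: int = 3,
    gap: float = 0.0,
) -> Hashable:
    """Majority vote among the k training sequences closest to query under ERP."""
    ranked = sorted(train, key=lambda item: erp_distance(query, item[0], gap))
    votes = {}
    for _, label in ranked[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```

    Unlike dynamic time warping, ERP charges gaps against a fixed reference value, which is what gives it the metric property the abstract relies on.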

  18. Classification of Patients Treated for Infertility Using the IVF Method

    Directory of Open Access Journals (Sweden)

    Malinowski Paweł

    2015-12-01

    Full Text Available One of the most effective methods of infertility treatment is in vitro fertilization (IVF. Effectiveness of the treatment, as well as classification of the data obtained from it, is still an ongoing issue. Classifiers obtained so far are powerful, but even the best ones do not exhibit equal quality concerning possible treatment outcome predictions. Usually, lack of pregnancy is predicted far too often. This creates a constant need for further exploration of this issue. Careful use of different classification methods can, however, help to achieve that goal.

  19. Nonlinear Inertia Classification Model and Application

    Directory of Open Access Journals (Sweden)

    Mei Wang

    2014-01-01

    Full Text Available The classification model of the support vector machine (SVM) overcomes the problem of a large number of samples. But the kernel parameter and the punishment factor have great influence on the quality of the SVM model. Particle swarm optimization (PSO) is an evolutionary search algorithm based on swarm intelligence, which is suitable for parameter optimization. Accordingly, a nonlinear inertia convergence classification model (NICCM) is proposed after the nonlinear inertia convergence PSO (NICPSO) is developed in this paper. The velocity of NICPSO is first defined as the weighted velocity of the inertia PSO, and the inertia factor is selected to be a nonlinear function. NICPSO is used to optimize the kernel parameter and the punishment factor of the SVM. Then, the NICCM classifier is trained by using the optimal punishment factor and the optimal kernel parameter that come from the optimal particle. Finally, NICCM is applied to the classification of the normal state and fault states of online power cable. It is experimentally shown that the iteration number for the proposed NICPSO to reach the optimal position decreases from 15 to 5 compared with PSO; the training duration is decreased by 0.0052 s and the recognition precision is increased by 4.12% compared with SVM.

  20. Classification with support hyperplanes

    NARCIS (Netherlands)

    G.I. Nalbantov (Georgi); J.C. Bioch (Cor); P.J.F. Groenen (Patrick)

    2006-01-01

    A new classification method is proposed, called Support Hyperplanes (SHs). To solve the binary classification task, SHs consider the set of all hyperplanes that do not make classification mistakes, referred to as semi-consistent hyperplanes. A test object is classified using

  1. Text Mining of UU-ITE Implementation in Indonesia

    Science.gov (United States)

    Hakim, Lukmanul; Kusumasari, Tien F.; Lubis, Muharman

    2018-04-01

    At present, social media and networks act as one of the main platforms for sharing information, ideas, thoughts and opinions. Many people share their knowledge and express their views on specific topics or current hot issues that interest them. Social media texts carry rich information about complaints, comments, recommendations and suggestions posted in response to government initiatives or policies intended to address certain issues. This study examines the sentiment of netizens, as the vocal part of the citizenry, about the implementation of UU ITE, the first cyberlaw in Indonesia, as a means to identify the current tendency of citizen perception. To perform the text mining, this study used the Twitter REST API, while R was used for classification analysis based on hierarchical clustering.

  2. download full text

    African Journals Online (AJOL)

    Adopting a surveillance system for antibacterial use has therefore become a more realistic ..... Financial support was obtained from the African Poverty Related Infection ... classification and Defined Daily Dose system methodology in Canada.

  3. Multispectral LiDAR Data for Land Cover Classification of Urban Areas

    Directory of Open Access Journals (Sweden)

    Salem Morsy

    2017-04-01

    Full Text Available Airborne Light Detection And Ranging (LiDAR) systems usually operate at a monochromatic wavelength measuring the range and the strength of the reflected energy (intensity) from objects. Recently, multispectral LiDAR sensors, which acquire data at different wavelengths, have emerged. This allows for recording of a diversity of spectral reflectance from objects. In this context, we aim to investigate the use of multispectral LiDAR data in land cover classification using two different techniques. The first is image-based classification, where intensity and height images are created from LiDAR points and then a maximum likelihood classifier is applied. The second is point-based classification, where ground filtering and Normalized Difference Vegetation Indices (NDVIs) computation are conducted. A dataset of an urban area located in Oshawa, Ontario, Canada, is classified into four classes: buildings, trees, roads and grass. An overall accuracy of up to 89.9% and 92.7% is achieved from image classification and 3D point classification, respectively. A radiometric correction model is also applied to the intensity data in order to remove the attenuation due to the system distortion and terrain height variation. The classification process is then repeated, and the results demonstrate that there are no significant improvements achieved in the overall accuracy.

  4. Literatūrinio modernizmo preliudai XIX–XX amžių sandūros Vilniuje. Interjero kultūra ir privačiojo estetizmo tekstai | Preludes of Literary Modernism in Fin-de-siècle Vilnius: Interior Culture and Private Aestheticism Texts

    Directory of Open Access Journals (Sweden)

    Mindaugas Kvietkauskas

    2006-01-01

    Full Text Available The genesis prerequisites of the early literary Modernism appear in Vilnius during the last decade of the 19th century when the city encountered the first manifestations of the cultural consciousness of the modern liberal bourgeoisie. The first publicist texts of the independent Russian media (in 1898 the daily Северо-Западное Слово started circulating) and the architectural discourse of the city (the stylistics of Vilnius Land Bank colonies) witness the classical liberal ideology of modernisation according to which the rational progress of civilisation harmoniously coincides with the individualistic values. The then still very scarce texts of Vilnius Russian and Polish literature and aesthetics were gradually moving away from the positivism of the 19th century towards the expression of modern individualism, bourgeois privacy, value liberalism, and aesthetic refinement. Vilnius literary texts generously depicted aesthetical interiors as private individual space, cultural shades from the dehumanising effect of technical progress (W. Benjamin). The aesthetical, closed interior turns into a semantically crucial space in the Vilnius novels Dwór w Haliniszkach (1903) and Z miłości (1903) by the Polish writer E. Jeleńska-Dmochowska. The private interior fantasies and the feeling of social isolation are clear features of mentality in novellas by the Vilnius Russian writer Yevgeny Shveder (Наброски и силуэты, 1904). The larger part of the early creative period of Kazys Puida should also be attributed to the melancholic discourse of the private aestheticism (Iš sermėgiaus krūtinės, 1906; Ruduo, 1906). The texts by the above authors sporadically demonstrated stylistic features of Parnassian and Decadent aestheticisms. The first collection of Polish poems published in the 20th century in Vilnius, the book Fale by Stanisława Szadurska (1906), also represented some melancholic features of the salon aesthetics and the classical

  5. DOE LLW classification rationale

    International Nuclear Information System (INIS)

    Flores, A.Y.

    1991-01-01

    This report presents the rationale behind the US Department of Energy's (DOE) classification of low-level radioactive waste (LLW). The classification is based on the Nuclear Regulatory Commission's classification system. DOE site operators met to review the qualifications and characteristics of the classification systems. They evaluated performance objectives, developed waste classification tables, and compiled dose limits on the waste. A goal of the LLW classification system was to allow each disposal site the freedom to develop limits on radionuclide inventories and concentrations according to its own site-specific characteristics. This goal was achieved with the adoption of a performance objectives system based on a performance assessment, with site-specific environmental conditions and engineered disposal systems.

  6. Content Abstract Classification Using Naive Bayes

    Science.gov (United States)

    Latif, Syukriyanto; Suwardoyo, Untung; Aldrin Wihelmus Sanadi, Edwin

    2018-03-01

    This study aims to classify abstracts of English-language journals based on the most frequently used words in each abstract. The research uses a text mining system that extracts text data to search for information in a set of documents. The data set consists of 120 abstracts downloaded from www.computer.org, grouped into three categories: DM (Data Mining), ITS (Intelligent Transport System) and MM (Multimedia). The system uses the naive Bayes algorithm to classify journal abstracts, with a feature selection process that uses term weighting to give a weight to each word. Dimension reduction removes words that rarely appear in each document, with reduction parameters tested from 10% to 90% of 5.344 words. The performance of the classification system is tested with a confusion matrix based on comparative test data and test data. The best classification results were obtained with 75% of the total data used for training and 25% for testing. Accuracy rates for the DM, ITS and MM categories were 100%, 100% and 86%, respectively, with a dimension reduction parameter of 30% and a learning rate between 0.1 and 0.5.
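
    The word-count-based naive Bayes classification described in this record can be illustrated with a minimal sketch. It is not the authors' system: the toy documents, the function names `train_naive_bayes` and `classify`, and the use of add-one (Laplace) smoothing in place of the term weighting and dimension reduction mentioned in the abstract are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """Train on a list of (text, label) pairs; returns (log_priors, word_counts, vocab)."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    n_docs = sum(label_counts.values())
    log_priors = {label: math.log(c / n_docs) for label, c in label_counts.items()}
    return log_priors, word_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior under add-one smoothing."""
    log_priors, word_counts, vocab = model
    best_label, best_score = None, -math.inf
    for label, log_prior in log_priors.items():
        total = sum(word_counts[label].values())
        score = log_prior
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training are ignored
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus using the three category names from the abstract.
TOY_DOCS = [
    ("clustering association rules frequent patterns", "DM"),
    ("mining frequent patterns from transaction data", "DM"),
    ("traffic flow prediction for vehicle routing", "ITS"),
    ("vehicle detection at road intersections", "ITS"),
    ("video coding and image compression", "MM"),
    ("audio and video streaming quality", "MM"),
]
MODEL = train_naive_bayes(TOY_DOCS)
```

    Calling `classify("frequent patterns in data", MODEL)` picks the category whose training words best match the query. A real system would add the weighting and vocabulary pruning steps that the abstract describes.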

  7. Tax reliefs in the Russian Federation, their definition, types and classification

    Directory of Open Access Journals (Sweden)

    Natalia Soloveva

    2012-12-01

    Full Text Available The present article analyzes the definition of tax allowances fixed in the Tax Code of the Russian Federation and the classification of tax allowances into tax exceptions, tax abatements and tax discharges. The article also covers the author's classification of tax allowances into direct and indirect ones, according to the economic benefits obtained by taxpayers as a result of using tax allowances. In the conclusion, the author determines an exhaustive list of tax allowances in Russian tax legislation.

  8. Asteroid taxonomic classifications

    International Nuclear Information System (INIS)

    Tholen, D.J.

    1989-01-01

    This paper reports on three taxonomic classification schemes developed and applied to the body of available color and albedo data. Asteroid taxonomic classifications according to two of these schemes are reproduced.

  9. SPORT FOOD ADDITIVE CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    I. P. Prokopenko

    2015-01-01

    Full Text Available Correctly organized nutritive and pharmacological support is an important component of an athlete's preparation for competitions, maintenance of optimal shape, fast recovery, and rehabilitation after injuries and exhaustion. Special products of enhanced biological value (BAS) for athletes' nutrition are used for this purpose. They supply the athlete's body with easy-to-use energy sources, yielded materials and biologically active substances that regulate and activate exchange reactions which proceed with difficulty during certain kinds of physical training. The article presents a classification of sport supplements that can be used before warm-up and training, after training, and in breaks between competitions.

  10. A NEW WASTE CLASSIFYING MODEL: HOW WASTE CLASSIFICATION CAN BECOME MORE OBJECTIVE?

    Directory of Open Access Journals (Sweden)

    Burcea Stefan Gabriel

    2015-07-01

    Full Text Available The waste management specialist must be able to identify and analyze waste generation sources and to propose proper solutions to prevent waste generation and encourage waste minimisation. In certain situations, such as implementing an integrated waste management system and configuring waste collection methods and capacities, practitioners can face the challenge of classifying the generated waste. This tends to be all the more demanding because the literature does not provide a coherent system of criteria required for an objective waste classification process. Waste incineration will no doubt call for a different waste classification than waste composting or mechanical and biological treatment. In this case the main question is: what are the proper classification criteria which can be used to achieve an objective waste classification? The article provides a short critical literature review of the existing waste classification criteria and suggests the conclusion that the literature cannot provide a unitary waste classification system which is unanimously accepted and assumed by ideologists and practitioners. There are various classification criteria and interesting perspectives in the literature regarding waste classification, but the most common criteria by which specialists classify waste into several classes, categories and types are the generation source, physical and chemical features, aggregation state, origin or derivation, hazard degree, etc. The traditional classification criteria divide waste into various categories, subcategories and types; such an approach is a conjectural one, because the criteria used inevitably differ significantly according to the context in which the waste classification is required; hence the need to unify the waste classification systems. For the first part of the article, the indirect observation research method was used, analyzing the literature and the various

  11. Improving Hyperspectral Image Classification Method for Fine Land Use Assessment Application Using Semisupervised Machine Learning

    Directory of Open Access Journals (Sweden)

    Chunyang Wang

    2015-01-01

    Full Text Available The study of land use/cover can reflect changing patterns of population, economy, agricultural structure adjustment, policy, and traffic, and can provide better service for regional economic development and urban evolution. Fine land use/cover assessment using hyperspectral image classification is a growing focus in many fields. Semisupervised learning, which takes a large number of unlabeled samples and a minority of labeled samples and effectively improves classification and prediction accuracy, has become a new research direction. In this paper, we propose improving fine land use/cover assessment with a semisupervised hyperspectral classification method. Test analysis of the study area showed that the semisupervised classification method could improve overall classification precision and the objective assessment of land use/cover results.

  12. Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features

    Directory of Open Access Journals (Sweden)

    Linyi Li

    2017-01-01

    Full Text Available In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. A fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images.

  13. A Framework for Text Mining in Scientometric Study: A Case Study in Biomedicine Publications

    Science.gov (United States)

    Silalahi, V. M. M.; Hardiyati, R.; Nadhiroh, I. M.; Handayani, T.; Rahmaida, R.; Amelia, M.

    2018-04-01

    Data on Indonesian research publications in the domain of biomedicine has been collected and text mined for the purpose of a scientometric study. The goal is to build a predictive model that classifies research publications by their potency for downstreaming. The model is based on the drug development processes adapted from the literature. An effort is described to build the conceptual model and to develop a corpus of research publications in the domain of Indonesian biomedicine. An investigation is then conducted into the problems associated with building a corpus and validating the model. Based on our experience, a framework is proposed to manage a scientometric study based on text mining. Our method shows the effectiveness of conducting a scientometric study based on text mining in order to get a valid classification model. This valid model is mainly supported by iterative and close interactions with the domain experts, starting from identifying the issues and building a conceptual model through to labelling, validation and interpretation of results.

  14. A hierarchical approach of hybrid image classification for land use and land cover mapping

    Directory of Open Access Journals (Sweden)

    Rahdari Vahid

    2018-01-01

    Full Text Available Remote sensing data analysis can provide thematic maps describing land use and land cover (LULC) in a short period. Using a proper image classification method for an area is important to overcome the possible limitations of satellite imagery for producing land-use and land-cover maps. In the present study, a hierarchical hybrid image classification method was used to produce LULC maps using the Landsat Thematic Mapper (TM) for the year 1998 and the Operational Land Imager (OLI) for the year 2016. Images were classified using the proposed hybrid image classification method, a vegetation cover crown percentage map from the normalized difference vegetation index, Fisher supervised classification and object-based image classification methods. Accuracy assessment results showed that the hybrid classification method produced maps with total accuracy up to 84 percent with a kappa statistic value of 0.81. Results of this study showed that the proposed classification method worked better with the OLI sensor than with TM. Although OLI has a higher radiometric resolution than TM, the LULC map produced using TM is almost as accurate as the one produced using OLI, which is because of the LULC definitions and image classification methods used.
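
    Two quantities named in this record, the normalized difference vegetation index and the kappa statistic, have standard definitions that a short sketch can make concrete. The function names and the sample confusion matrix below are hypothetical and are not taken from the study; the formulas are the usual NDVI = (NIR - Red) / (NIR + Red) and Cohen's kappa = (p_o - p_e) / (1 - p_e).

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel's reflectances."""
    denom = nir + red
    # Guard against division by zero for pixels with no signal in either band.
    return 0.0 if denom == 0 else (nir - red) / denom


def overall_accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    matrix[i][j] is the number of samples of reference class i that were
    mapped to class j.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix, for illustration only.
SAMPLE_MATRIX = [[40, 10],
                 [10, 40]]
ACCURACY, KAPPA = overall_accuracy_and_kappa(SAMPLE_MATRIX)
```

    Kappa discounts the agreement expected by chance, which is why it is normally lower than the overall accuracy computed from the same confusion matrix.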

  15. An Improved Brain-Inspired Emotional Learning Algorithm for Fast Classification

    Directory of Open Access Journals (Sweden)

    Ying Mei

    2017-06-01

    Full Text Available Classification is an important task of machine intelligence in the field of information. The artificial neural network (ANN) is widely used for classification. However, the traditional ANN shows slow training speed, and it is hard to meet the real-time requirement for large-scale applications. In this paper, an improved brain-inspired emotional learning (BEL) algorithm is proposed for fast classification. The BEL algorithm was put forward to mimic the high speed of the emotional learning mechanism in the mammalian brain, which has the superior features of fast learning and low computational complexity. To improve the accuracy of BEL in classification, the genetic algorithm (GA) is adopted for optimally tuning the weights and biases of the amygdala and orbitofrontal cortex in the BEL neural network. The combined algorithm, named GA-BEL, has been tested on eight University of California at Irvine (UCI) datasets and two well-known databases (Japanese Female Facial Expression, Cohn–Kanade). The comparisons of experiments indicate that the proposed GA-BEL is more accurate than the original BEL algorithm, and it is much faster than the traditional algorithm.

  16. Comparison Of Power Quality Disturbances Classification Based On Neural Network

    Directory of Open Access Journals (Sweden)

    Nway Nway Kyaw Win

    2015-07-01

    Full Text Available Power quality disturbances (PQDs) cause serious problems for the reliability, safety and economy of the power system network. To improve electric power quality, PQDs must be detected and classified by type of transient fault. A methodology is presented for automatic classification of eight types of PQ signals (flicker, harmonics, sag, swell, impulse, fluctuation, notch and oscillatory), based on software analysis using the wavelet transform with the multiresolution analysis (MRA) algorithm and on two feed-forward neural networks: the probabilistic neural network (PNN) and the multilayer feed-forward (MLFF) neural network. The Db4 wavelet family is chosen in this system to calculate the values of detailed energy distributions as input features for classification, because it performs well in detecting and localizing various types of PQ disturbances. This technique classifies the types of PQD problems and events. The classifiers classify and identify the disturbance type according to the energy distribution. The results show that the PNN can analyze different power disturbance types efficiently and that it has better classification accuracy than MLFF.
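
    The idea of using detail-coefficient energies from a multiresolution wavelet decomposition as classifier inputs can be sketched as follows. This is an illustrative simplification, not the method of the record: the Haar wavelet is swapped in for Db4 so that the example needs no external library, and the function names and test signals are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def detail_energies(signal, levels):
    """Energy (sum of squares) of the detail coefficients at each level.

    Level 1 captures the fastest variations; each further level decomposes
    the remaining approximation, as in multiresolution analysis.
    """
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    return energies


# Hypothetical signals: a flat waveform and a rapidly alternating one.
FLAT = [1.0] * 8
ALTERNATING = [1.0, -1.0] * 4
FLAT_FEATURES = detail_energies(FLAT, 3)
ALTERNATING_FEATURES = detail_energies(ALTERNATING, 3)
```

    The flat signal has no detail energy at any level, while the alternating signal concentrates all of its energy in the first level. Per-level differences of this kind are what an energy-distribution feature vector passes to a classifier.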

  17. A Novel Algorithm for Imbalance Data Classification Based on Neighborhood Hypergraph

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2014-01-01

    Full Text Available The classification problem for imbalanced data is receiving increasing attention. So far, many significant methods have been proposed and applied in many fields, but more efficient methods are still needed. Although the hypergraph is an efficient tool for knowledge discovery, it may not be powerful enough to deal with data in the boundary region. In this paper, the neighborhood hypergraph is presented, combining rough set theory and the hypergraph. A novel classification algorithm for imbalanced data based on the neighborhood hypergraph is then developed, which is composed of three steps: initialization of hyperedges, classification of the training data set, and substitution of hyperedges. In an experiment with 10-fold cross validation on 18 data sets, the proposed algorithm achieved higher average accuracy than the other methods.

  18. The Performance of LBP and NSVC Combination Applied to Face Classification

    Directory of Open Access Journals (Sweden)

    Mohammed Ngadi

    2016-01-01

    Full Text Available The growing demand in the field of security has led to the development of interesting approaches in face classification. From the outset, these works have focused on extracting the invariant features of the face to build a single model easily identifiable by classification algorithms. Our goal in this article is to develop more efficient practical methods for face detection. We present a new fast and accurate approach based on local binary patterns (LBP) for feature extraction, combined with the new Neighboring Support Vector Classifier (NSVC) for classification. The experimental results on different natural images show that the proposed method can get very good results at a very short detection time. The best precision obtained by LBP-NSVC exceeds 99%.
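
    The local binary pattern features mentioned in this record can be illustrated with the basic 3x3 operator. The sketch below covers only the textbook form of LBP (each of the 8 neighbours is thresholded against the centre pixel and the results are packed into one byte). The neighbour ordering, function names and sample images are assumptions, and the NSVC classifier is not reproduced.

```python
# Clockwise neighbour offsets starting at the top-left pixel (an arbitrary
# but fixed ordering; any consistent ordering gives a valid LBP variant).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """8-bit LBP code of the interior pixel at (row, col)."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels.

    The histogram is the feature vector that would be handed to a classifier.
    """
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Hypothetical grayscale patches, for illustration only.
UNIFORM = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
GRADIENT = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

    Because each code depends only on whether neighbours are brighter or darker than the centre, the histogram is unchanged by uniform brightness shifts. That property is one reason LBP is a common choice for face features.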

  19. Sparse Representation Based Multi-Instance Learning for Breast Ultrasound Image Classification

    Directory of Open Access Journals (Sweden)

    Lu Bing

    2017-01-01

    Full Text Available We propose a novel method based on sparse representation for breast ultrasound image classification under the framework of multi-instance learning (MIL). After image enhancement and segmentation, a concentric circle is used to extract the global and local features for improving the accuracy in diagnosis and prediction. The classification problem of ultrasound images is converted to a sparse representation based MIL problem. Each instance of a bag is represented as a sparse linear combination of all basis vectors in the dictionary, and then the bag is represented by one feature vector which is obtained via sparse representations of all instances within the bag. The sparse and MIL problem is further converted to a conventional learning problem that is solved by a relevance vector machine (RVM). Results of single classifiers are combined to be used for classification. Experimental results on the breast cancer datasets demonstrate the superiority of the proposed method in terms of classification accuracy as compared with state-of-the-art MIL methods.

  20. Classification of Indonesian quote on Twitter using Naïve Bayes

    Science.gov (United States)

    Rachmadany, A.; Pranoto, Y. M.; Gunawan; Multazam, M. T.; Nandiyanto, A. B. D.; Abdullah, A. G.; Widiaty, I.

    2018-01-01

    A quote is a sentence written in the hope that its reader will become a strong personality, an individual who always improves to move forward and achieve success. Social media is a place where people express their feelings to the world, and sometimes that expression takes the form of quotes. The purpose of this study was to classify Indonesian quotes on Twitter using Naïve Bayes. The experiment applies text classification to Twitter data written by Twitter users: tweets that are quotes are then classified again into 6 categories (Love, Life, Motivation, Education, Religion, Others). The language used is Indonesian and the method used is Naive Bayes. The result of the experiment is a web application holding a collection of classified Indonesian quotes. This classification makes it easy for the user to find quotes by class or keyword. For example, when a user wants to find a 'motivation' quote, this classification would be very useful.