WorldWideScience

Sample records for text mining-based approach

  1. The Application of Machine Learning Algorithms for Text Mining based on Sentiment Analysis Approach

    Directory of Open Access Journals (Sweden)

    Reza Samizade

    2018-06-01

    Classification of cyber texts and comments into positive and negative sentiment categories among social media users is of high importance in the research area related to text mining. In this research, we applied supervised classification methods to classify Persian texts based on sentiment in cyberspace. The result of this research is a system that can decide whether a comment published in cyberspace, such as on social networks, is positive or negative. The comments published on Persian movie and movie review websites from 1392 to 1395 are used as the data set for this research. Part of these data is used for training and the rest for testing. Prior to implementing the algorithms, pre-processing steps such as tokenizing, removing stop words, and n-gram extraction were applied to the texts. Naïve Bayes, neural networks (NN) and support vector machines (SVM) were used for text classification in this study. Out-of-sample tests showed no evidence that the accuracy of the SVM approach is statistically higher than that of Naïve Bayes, or that the accuracy of Naïve Bayes is not statistically higher than that of the NN approach. However, the researchers can conclude that the accuracy of classification using the SVM approach is statistically higher than that of the NN approach at the 5% significance level.
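
    The pre-processing and classification steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the stop-word list, the toy English reviews, and the function names `features`, `train`, and `predict` are assumptions made for illustration, and only the Naïve Bayes classifier (multinomial, with add-one smoothing) is shown.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list and toy English data; the study itself used Persian reviews.
STOP_WORDS = {"the", "a", "is", "was", "this", "and", "it"}


def features(text, n_max=2):
    """Tokenize, drop stop words, and emit all 1..n_max-grams."""
    tokens = [t for t in re.findall(r"\w+", text.lower()) if t not in STOP_WORDS]
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams


def train(samples):
    """Collect the counts needed by a multinomial Naive Bayes classifier."""
    class_docs = Counter()
    feat_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_docs[label] += 1
        for f in features(text):
            feat_counts[label][f] += 1
            vocab.add(f)
    return class_docs, feat_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    class_docs, feat_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)
        denom = sum(feat_counts[label].values()) + len(vocab)
        for f in features(text):
            if f in vocab:  # features never seen in training are ignored
                score += math.log((feat_counts[label][f] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


TRAIN = [
    ("a great movie with a moving story", "positive"),
    ("great acting and a great script", "positive"),
    ("boring plot and weak acting", "negative"),
    ("this film was boring and slow", "negative"),
]
MODEL = train(TRAIN)
```

    Calling `predict(MODEL, "great story")` classifies a new comment; an SVM or neural network would consume the same n-gram features.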

  2. Argo: an integrative, interactive, text mining-based workbench supporting curation

    Science.gov (United States)

    Rak, Rafal; Rowley, Andrew; Black, William; Ananiadou, Sophia

    2012-01-01

    Curation of biomedical literature is often supported by the automatic analysis of textual content that generally involves a sequence of individual processing components. Text mining (TM) has been used to enhance the process of manual biocuration, but has been focused on specific databases and tasks rather than an environment integrating TM tools into the curation pipeline, catering for a variety of tasks, types of information and applications. Processing components usually come from different sources and often lack interoperability. The well established Unstructured Information Management Architecture is a framework that addresses interoperability by defining common data structures and interfaces. However, most of the efforts are targeted towards software developers and are not suitable for curators, or are otherwise inconvenient to use on a higher level of abstraction. To overcome these issues we introduce Argo, an interoperable, integrative, interactive and collaborative system for text analysis with a convenient graphic user interface to ease the development of processing workflows and boost productivity in labour-intensive manual curation. Robust, scalable text analytics follow a modular approach, adopting component modules for distinct levels of text analysis. The user interface is available entirely through a web browser that saves the user from going through often complicated and platform-dependent installation procedures. Argo comes with a predefined set of processing components commonly used in text analysis, while giving the users the ability to deposit their own components. The system accommodates various areas and levels of user expertise, from TM and computational linguistics to ontology-based curation. One of the key functionalities of Argo is its ability to seamlessly incorporate user-interactive components, such as manual annotation editors, into otherwise completely automatic pipelines. As a use case, we demonstrate the functionality of an in

  3. A Process Mining Based Service Composition Approach for Mobile Information Systems

    Directory of Open Access Journals (Sweden)

    Chengxi Huang

    2017-01-01

    Due to the growing trend in applying big data and cloud computing technologies in information systems, it is becoming an important issue to handle the connection between large-scale data and the associated business processes in the Internet of Everything (IoE) environment. Service composition, as a widely used phase in system development, has some limits when the complexity of relationships among data increases. Considering the expanding scale and the variety of devices in mobile information systems, a process mining based service composition approach is proposed in this paper in order to improve the adaptiveness and efficiency of compositions. Firstly, preprocessing is conducted to extract existing service execution information from server-side logs. Then process mining algorithms are applied to discover the overall event sequence with preprocessed data. After that, a scene-based service composition is applied to aggregate scene information and relocate services of the system. Finally, a case study that applied the work in a mobile medical application shows that the approach is practical and valuable in improving service composition adaptiveness and efficiency.
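
    The first two steps of this abstract (log preprocessing, then discovery of event sequences) can be sketched as follows. This is an illustrative assumption, not the paper's algorithm: the log format, the service names, and the choice of a directly-follows count as the discovery step are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical server-side log records: (session id, timestamp, service name).
LOG = [
    ("s1", 1, "login"), ("s1", 2, "query_record"), ("s1", 3, "logout"),
    ("s2", 1, "login"), ("s2", 2, "query_record"), ("s2", 3, "update_record"),
    ("s2", 4, "logout"),
]


def build_traces(log):
    """Preprocessing: group events by session and order them by timestamp."""
    by_case = defaultdict(list)
    for case_id, timestamp, service in log:
        by_case[case_id].append((timestamp, service))
    return {case: [s for _, s in sorted(events)] for case, events in by_case.items()}


def directly_follows(traces):
    """Count how often service b is invoked immediately after service a."""
    counts = Counter()
    for trace in traces.values():
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


TRACES = build_traces(LOG)
DFG = directly_follows(TRACES)
```

    Frequent pairs in `DFG` indicate services that tend to be invoked in sequence, which is the kind of information a composition step could aggregate.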

  4. Fuzzy OLAP association rules mining-based modular reinforcement learning approach for multiagent systems.

    Science.gov (United States)

    Kaya, Mehmet; Alhajj, Reda

    2005-04-01

    Multiagent systems and data mining have recently attracted considerable attention in the field of computing. Reinforcement learning is the most commonly used learning process for multiagent systems. However, it still has some drawbacks, including modeling other learning agents present in the domain as part of the state of the environment, and some states are experienced much less than others, or some state-action pairs are never visited during the learning phase. Further, before completing the learning process, an agent cannot exhibit a certain behavior in some states that may be experienced sufficiently. In this study, we propose a novel multiagent learning approach to handle these problems. Our approach is based on utilizing the mining process for modular cooperative learning systems. It incorporates fuzziness and online analytical processing (OLAP) based mining to effectively process the information reported by agents. First, we describe a fuzzy data cube OLAP architecture which facilitates effective storage and processing of the state information reported by agents. This way, the action of the other agent, even one not in the visual environment of the agent under consideration, can simply be predicted by extracting online association rules, a well-known data mining technique, from the constructed data cube. Second, we present a new action selection model, which is also based on association rules mining. Finally, we generalize not sufficiently experienced states, by mining multilevel association rules from the proposed fuzzy data cube. Experimental results obtained on two different versions of a well-known pursuit domain show the robustness and effectiveness of the proposed fuzzy OLAP mining based modular learning approach. Finally, we tested the scalability of the approach presented in this paper and compared it with our previous work on modular-fuzzy Q-learning and ordinary Q-learning.

  5. Text mining-based in silico drug discovery in oral mucositis caused by high-dose cancer therapy.

    Science.gov (United States)

    Kirk, Jon; Shah, Nirav; Noll, Braxton; Stevens, Craig B; Lawler, Marshall; Mougeot, Farah B; Mougeot, Jean-Luc C

    2018-08-01

    Oral mucositis (OM) is a major dose-limiting side effect of chemotherapy and radiation used in cancer treatment. Due to the complex nature of OM, currently available drug-based treatments are of limited efficacy. Our objectives were (i) to determine genes and molecular pathways associated with OM and wound healing using computational tools and publicly available data and (ii) to identify drugs formulated for topical use targeting the relevant OM molecular pathways. OM and wound healing-associated genes were determined by text mining, and the intersection of the two gene sets was selected for gene ontology analysis using the GeneCodis program. Protein interaction network analysis was performed using STRING-db. Enriched gene sets belonging to the identified pathways were queried against the Drug-Gene Interaction database to find drug candidates for topical use in OM. Our analysis identified 447 genes common to both the "OM" and "wound healing" text mining concepts. Gene enrichment analysis yielded 20 genes representing six pathways and targetable by a total of 32 drugs which could possibly be formulated for topical application. A manual search on ClinicalTrials.gov confirmed no relevant pathway/drug candidate had been overlooked. Twenty-five of the 32 drugs can directly affect the PTGS2 (COX-2) pathway, the pathway that has been targeted in previous clinical trials with limited success. Drug discovery using in silico text mining and pathway analysis tools can facilitate the identification of existing drugs that have the potential of topical administration to improve OM treatment.

  6. ASCOT: a text mining-based web-service for efficient search and assisted creation of clinical trials

    Science.gov (United States)

    2012-01-01

    Clinical trials are mandatory protocols describing medical research on humans and among the most valuable sources of medical practice evidence. Searching for trials relevant to some query is laborious due to the immense number of existing protocols. Apart from search, writing new trials includes composing detailed eligibility criteria, which might be time-consuming, especially for new researchers. In this paper we present ASCOT, an efficient search application customised for clinical trials. ASCOT uses text mining and data mining methods to enrich clinical trials with metadata, which in turn serve as effective tools to narrow down search. In addition, ASCOT integrates a component for recommending eligibility criteria based on a set of selected protocols. PMID:22595088

  7. Theoretical approaches to creation of robotic coal mines based on the synthesis of simulation technologies

    Science.gov (United States)

    Fryanov, V. N.; Pavlova, L. D.; Temlyantsev, M. V.

    2017-09-01

    Methodological approaches to theoretical substantiation of the structure and parameters of robotic coal mines are outlined. The results of mathematical and numerical modeling revealed the features of manifestation of geomechanical and gas dynamic processes in the conditions of robotic mines. Technological solutions for the design and manufacture of technical means for robotic mine are adopted using the method of economic and mathematical modeling and in accordance with the current regulatory documents. For a comparative performance evaluation of technological schemes of traditional and robotic mines, methods of cognitive modeling and matrix search for subsystem elements in the synthesis of a complex geotechnological system are applied. It is substantiated that the process of technical re-equipment of a traditional mine with a phased transition to a robotic mine will reduce unit costs by almost 1.5 times with a significant social effect due to a reduction in the number of personnel engaged in hazardous work.

  8. Working with text tools, techniques and approaches for text mining

    CERN Document Server

    Tourte, Gregory J L

    2016-01-01

    Text mining tools and technologies have long been a part of the repository world, where they have been applied to a variety of purposes, from pragmatic aims to support tools. Research areas as diverse as biology, chemistry, sociology and criminology have seen effective use made of text mining technologies. Working With Text collects a subset of the best contributions from the 'Working with text: Tools, techniques and approaches for text mining' workshop, alongside contributions from experts in the area. Text mining tools and technologies in support of academic research include supporting research on the basis of a large body of documents, facilitating access to and reuse of extant work, and bridging between the formal academic world and areas such as traditional and social media. Jisc have funded a number of projects, including NaCTem (the National Centre for Text Mining) and the ResDis programme. Contents are developed from workshop submissions and invited contributions, including: Legal considerations in te...

  9. Networks Models of Actin Dynamics during Spermatozoa Postejaculatory Life: A Comparison among Human-Made and Text Mining-Based Models

    Directory of Open Access Journals (Sweden)

    Nicola Bernabò

    2016-01-01

    Here we realized a networks-based model representing the process of actin remodelling that occurs during the acquisition of fertilizing ability of human spermatozoa (HumanMade_ActinSpermNetwork, HM_ASN). Then, we compared it with the networks provided by two different text mining tools: Agilent Literature Search (ALS) and PESCADOR. As a reference, we used the data from the online repository Kyoto Encyclopaedia of Genes and Genomes (KEGG), referring to actin dynamics in a more general biological context. We found that HM_ASN and the networks from KEGG data shared the same scale-free topology following the Barabasi-Albert model, thus suggesting that the information is spread within the network quickly and efficiently. On the contrary, the networks obtained by ALS and PESCADOR have a scale-free hierarchical architecture, which implies a different pattern of information transmission. Also, the hubs identified within the networks are different: HM_ASN and KEGG networks contain as hubs several molecules known to be involved in actin signalling; ALS was unable to find hubs other than “actin,” whereas PESCADOR gave some nonspecific results. This seems to suggest that human-made information retrieval in the case of a specific event, such as actin dynamics in human spermatozoa, could be a reliable strategy.

  10. Text mining with R a tidy approach

    CERN Document Server

    Silge, Julia

    2017-01-01

    Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you'll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You'll learn how tidytext and other tidy tools in R can make text analysis easier and more effective. The authors demonstrate how treating text as data frames enables you to manipulate, summarize, and visualize characteristics of text. You'll also learn how to integrate natural language processing (NLP) into effective workflows. Practical code examples and data explorations will help you generate real insights from literature, news, and social media. Learn how to apply the tidy text format to NLP Use sentiment analysis to mine the emotional content of text Identify a document's most important terms with frequency measurements E...

  11. Texts, Transmissions, Receptions. Modern Approaches to Narratives

    NARCIS (Netherlands)

    Lardinois, A.P.M.H.; Levie, S.A.; Hoeken, H.; Lüthy, C.H.

    2015-01-01

    The papers collected in this volume study the function and meaning of narrative texts from a variety of perspectives. The word 'text' is used here in the broadest sense of the term: it denotes literary books, but also oral tales, speeches, newspaper articles and comics. One of the purposes of this

  12. Writing Treatment for Aphasia: A Texting Approach

    Science.gov (United States)

    Beeson, Pelagie M.; Higginson, Kristina; Rising, Kindle

    2013-01-01

    Purpose: Treatment studies have documented the therapeutic and functional value of lexical writing treatment for individuals with severe aphasia. The purpose of this study was to determine whether such retraining could be accomplished using the typing feature of a cellular telephone, with the ultimate goal of using text messaging for…

  13. Pedoinformatics Approach to Soil Text Analytics

    Science.gov (United States)

    Furey, J.; Seiter, J.; Davis, A.

    2017-12-01

    The several extant schemas for the classification of soils rely on differing criteria, but the major soil science taxonomies, including the United States Department of Agriculture (USDA) and the international harmonized World Reference Base for Soil Resources systems, are based principally on inferred pedogenic properties. These taxonomies largely result from compiled individual observations of soil morphologies within soil profiles, and the vast majority of this pedologic information is contained in qualitative text descriptions. We present text mining analyses of hundreds of gigabytes of parsed text and other data in the digitally available USDA soil taxonomy documentation, the Soil Survey Geographic (SSURGO) database, and the National Cooperative Soil Survey (NCSS) soil characterization database. These analyses implemented iPython calls to Gensim modules for topic modelling, with latent semantic indexing completed down to the lowest taxon level (soil series) paragraphs. Via a custom extension of the Natural Language Toolkit (NLTK), approximately one percent of the USDA soil series descriptions were used to train a classifier for the remainder of the documents, essentially by treating soil science words as comprising a novel language. While location-specific descriptors at the soil series level are amenable to geomatics methods, unsupervised clustering of the occurrence of other soil science words did not closely follow the usual hierarchy of soil taxa. We present preliminary phrasal analyses that may account for some of these effects.

  14. Cognition-Based Approaches for High-Precision Text Mining

    Science.gov (United States)

    Shannon, George John

    2017-01-01

    This research improves the precision of information extraction from free-form text via the use of cognitive-based approaches to natural language processing (NLP). Cognitive-based approaches are an important, and relatively new, area of research in NLP and search, as well as linguistics. Cognitive approaches enable significant improvements in both…

  15. Associated diacritical watermarking approach to protect sensitive arabic digital texts

    Science.gov (United States)

    Kamaruddin, Nurul Shamimi; Kamsin, Amirrudin; Hakak, Saqib

    2017-10-01

    Among multimedia content, one of the most predominant media is text content. There have been many efforts to protect and secure text information over the Internet. The limitations of existing works have been identified in terms of watermark capacity, time complexity and memory complexity. In this work, an invisible digital watermarking approach has been proposed to protect and secure the most sensitive text, i.e. the Digital Holy Quran. The proposed approach works by XOR-ing only those Quranic letters that have certain diacritics associated with them. Due to the sensitive nature of the Holy Quran, diacritics play a vital role in the meaning of a particular verse. Hence, securing letters with certain diacritics will preserve the original meaning of Quranic verses in case of an alteration attempt. Initial results have shown that the proposed approach is promising, with less memory complexity and time complexity compared to existing approaches.

  16. The semiotics of typography in literary texts. A multimodal approach

    DEFF Research Database (Denmark)

    Nørgaard, Nina

    2009-01-01

    to multimodal discourse proposed, for instance, by Kress & Van Leeuwen (2001) and Baldry & Thibault (2006), and, more specifically, the multimodal approach to typography suggested by Van Leeuwen (2005b; 2006), in order to sketch out a methodological framework applicable to the description and analysis...... of the semiotic potential of typography in literary texts....

  17. Towards Technological Approaches for Concept Maps Mining from Text

    Directory of Open Access Journals (Sweden)

    Camila Zacche Aguiar

    2018-04-01

    Concept maps are resources for the representation and construction of knowledge. They allow showing, through concepts and relationships, how knowledge about a subject is organized. Technological advances have boosted the development of approaches for the automatic construction of a concept map, to facilitate and provide the benefits of that resource more broadly. Due to the need to better identify and analyze the functionalities and characteristics of those approaches, we conducted a detailed study on technological approaches for automatic construction of concept maps published between 1994 and 2016 in the IEEE Xplore, ACM and Elsevier Science Direct databases. From this study, we elaborate a categorization defined on two perspectives, Data Source and Graphic Representation, and fourteen categories. That study collected 30 relevant articles, which were applied to the proposed categorization to identify the main features and limitations of each approach. A detailed view of these approaches, their characteristics and techniques is presented, enabling a quantitative analysis. In addition, the categorization has given us objective conditions to establish new specification requirements for a new technological approach aiming at concept maps mining from texts.

  18. Building a glaucoma interaction network using a text mining approach.

    Science.gov (United States)

    Soliman, Maha; Nasraoui, Olfa; Cooper, Nigel G F

    2016-01-01

    The volume of biomedical literature and its underlying knowledge base is rapidly expanding, making it beyond the ability of a single human being to read through all the literature. Several automated methods have been developed to help make sense of this dilemma. The present study reports on the results of a text mining approach to extract gene interactions from the data warehouse of published experimental results which are then used to benchmark an interaction network associated with glaucoma. To the best of our knowledge, there is, as yet, no glaucoma interaction network derived solely from text mining approaches. The presence of such a network could provide a useful summative knowledge base to complement other forms of clinical information related to this disease. A glaucoma corpus was constructed from PubMed Central and a text mining approach was applied to extract genes and their relations from this corpus. The extracted relations between genes were checked using reference interaction databases and classified generally as known or new relations. The extracted genes and relations were then used to construct a glaucoma interaction network. Analysis of the resulting network indicated that it bears the characteristics of a small world interaction network. Our analysis showed the presence of seven glaucoma linked genes that defined the network modularity. A web-based system for browsing and visualizing the extracted glaucoma related interaction networks is made available at http://neurogene.spd.louisville.edu/GlaucomaINViewer/Form1.aspx. This study has reported the first version of a glaucoma interaction network using a text mining approach. The power of such an approach is in its ability to cover a wide range of glaucoma related studies published over many years. Hence, a bigger picture of the disease can be established. To the best of our knowledge, this is the first glaucoma interaction network to summarize the known literature. The major findings were a set of
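
    The pipeline this abstract describes (extract gene relations from a corpus, check them against a reference database, build a network) can be sketched in a few lines. Everything below is an assumption for illustration: the placeholder gene symbols, the sentences, the reference set, and the use of simple sentence-level co-occurrence as the relation extractor are not taken from the study.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical inputs: placeholder gene symbols, toy sentences, and a toy reference set.
GENES = {"GENEA", "GENEB", "GENEC", "GENED"}
SENTENCES = [
    "GENEA interacts with GENEB in the studied tissue.",
    "Variants in GENEC modify the GENEA phenotype.",
    "GENEB binds GENED.",
    "GENEA and GENEB were both reported in the cohort.",
]
KNOWN = {frozenset(("GENEB", "GENED"))}


def extract_edges(sentences, genes):
    """Treat two dictionary genes appearing in one sentence as a candidate relation."""
    edges = Counter()
    for sentence in sentences:
        found = sorted({tok for tok in re.findall(r"\w+", sentence) if tok in genes})
        for a, b in combinations(found, 2):
            edges[frozenset((a, b))] += 1
    return edges


def classify(edges, known):
    """Label each extracted relation as 'known' or 'new' against the reference set."""
    return {pair: ("known" if pair in known else "new") for pair in edges}


EDGES = extract_edges(SENTENCES, GENES)
LABELS = classify(EDGES, KNOWN)
```

    The edge counts in `EDGES` form a weighted, undirected network on which properties such as modularity could then be analysed.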

  19. Genre based Approach to Teach Writing Descriptive Text

    Directory of Open Access Journals (Sweden)

    Putu Ngurah Rusmawan

    2017-10-01

    This study aims to discuss how teaching and learning activities were carried out using the Genre based Approach in teaching descriptive text writing at junior high school. This study was conducted in the classroom of VII-1. Therefore, the appropriate design was a qualitative research design. The subject of the study was the English teacher. To collect data, the researcher used observation and interview. The findings of the study showed that the teaching and learning activities carried out by the teacher fulfilled the basic competencies. The teacher carried out the opening teaching activities by greeting, asking about the students’ preparation for the lesson, checking the students’ attendance list, and informing them of the learning objective. The teacher carried out the main teaching activities by explaining how to write a descriptive text, giving and asking for opinions, eliciting the students’ understanding, and prompting and directing them to do exercises. The teacher carried out the closing teaching activities by directing the students to continue at home and eliciting the students’ reflection on what they had learned at that time.

  20. Adjustable typography: an approach to enhancing low vision text accessibility.

    Science.gov (United States)

    Arditi, Aries

    2004-04-15

    Millions of people have low vision, a disability condition caused by uncorrectable or partially correctable disorders of the eye. The primary goal of low vision rehabilitation is increasing access to printed material. This paper describes how adjustable typography, a computer graphic approach to enhancing text accessibility, can play a role in this process, by allowing visually-impaired users to customize fonts to maximize legibility according to their own visual needs. Prototype software and initial testing of the concept is described. The results show that visually-impaired users tend to produce a variety of very distinct fonts, and that the adjustment process results in greatly enhanced legibility. But this initial testing has not yet demonstrated increases in legibility over and above the legibility of highly legible standard fonts such as Times New Roman.

  1. Trace of Knowledge: Benchmarking Novel Text Mining Based Measurements

    DEFF Research Database (Denmark)

    Woltmann, Sabrina

    The impact of public research outcomes on economies, and societies, in particular, in terms of innovation and development is widely accepted and empirically investigated [9, 3]. However, many studies suggest a systematic underestimation of the impact and benefits of public research. Empirical stu...

  2. A multiresolutional approach to fuzzy text meaning: A first attempt

    Energy Technology Data Exchange (ETDEWEB)

    Mehler, A.

    1996-12-31

    The present paper focuses on the connotative meaning aspect of language signs especially above the level of words. In this context the view is taken that texts can be defined as a kind of supersign, to which, in the same way as to other signs, a meaning can be assigned. A text can therefore be described as the result of a sign articulation which connects the material text sign with a corresponding meaning. For the constitution of the structural text meaning a kind of a semiotic composition principle is responsible, which leads to the emergence of interlocked levels of language units, demonstrating different grades of resolution. Starting on the level of words, and going through the level of sentences this principle reaches finally the level of texts by aggregating step by step the meaning of a unit on a higher level out of the meanings of all components one level below, which occur within this unit. Besides, this article will elaborate the hypothesis that the meaning constitution as a two-stage process, corresponding to the syntagmatic and paradigmatic restrictions of language elements among each other, obtains equally on the level of texts. On text level this two-levelledness leads to the constitution of the connotative text meaning, whose constituents are determined on word level by the syntagmatic and paradigmatic relations of the words. The formalization of the text meaning representation occurs with the help of fuzzy set theory.

  3. Nonverbatim Captioning in Dutch Television Programs: A Text Linguistic Approach

    Science.gov (United States)

    Schilperoord, Joost; de Groot, Vanja; van Son, Nic

    2005-01-01

    In the Netherlands, as in most other European countries, closed captions for the deaf summarize texts rather than render them verbatim. Caption editors argue that in this way television viewers have enough time to both read the text and watch the program. They also claim that the meaning of the original message is properly conveyed. However, many…

  4. A Relational Reasoning Approach to Text-Graphic Processing

    Science.gov (United States)

    Danielson, Robert W.; Sinatra, Gale M.

    2017-01-01

    We propose that research on text-graphic processing could be strengthened by the inclusion of relational reasoning perspectives. We briefly outline four aspects of relational reasoning: "analogies," "anomalies," "antinomies", and "antitheses". Next, we illustrate how text-graphic researchers have been…

  5. Towards Technological Approaches for Concept Maps Mining from Text

    OpenAIRE

    Camila Zacche Aguiar; Davidson Cury; Amal Zouaq

    2018-01-01

    Concept maps are resources for the representation and construction of knowledge. They allow showing, through concepts and relationships, how knowledge about a subject is organized. Technological advances have boosted the development of approaches for the automatic construction of a concept map, to facilitate and provide the benefits of that resource more broadly. Due to the need to better identify and analyze the functionalities and characteristics of those approaches, we conducted a detailed...

  6. An Approach to Retrieval of OCR Degraded Text

    Directory of Open Access Journals (Sweden)

    Yuen-Hsien Tseng

    1998-12-01

    The major problem with retrieval of OCR text is the unpredictable distortion of characters due to recognition errors. Because users have no idea of such distortion, the terms they query can hardly match the terms stored in the OCR text exactly. Thus retrieval effectiveness is significantly reduced, especially for low-quality input. To reduce the losses from retrieving such noisy OCR text, a fault-tolerant retrieval strategy based on automatic keyword extraction and fuzzy matching is proposed. In this strategy, terms, correct or not, and their term frequencies are extracted from the noisy text and presented for browsing and selection in response to users' initial queries. With an understanding of the real terms stored in the noisy text and of their estimated frequency distributions, users may then choose appropriate terms for more effective searching. A text retrieval system based on this strategy has been built. Examples that show its effectiveness are demonstrated. Finally, some OCR issues for further enhancing retrieval effectiveness are discussed.
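
    The strategy in this abstract (extract terms and frequencies from the noisy text, then offer the user the stored terms that resemble the query) can be sketched with the Python standard library. The noisy sample text, the function names, and the similarity cutoff are assumptions for illustration; `difflib.get_close_matches` stands in for whatever fuzzy matching the described system uses.

```python
import difflib
import re
from collections import Counter

# Hypothetical OCR output with typical character confusions (1 for l/i, rn for m).
NOISY_TEXT = (
    "informat1on retrieval of docurnents; "
    "information retrieva1 from noisy docurnents"
)


def term_frequencies(text):
    """Extract every term, correct or not, together with its frequency."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def candidate_terms(query, freqs, cutoff=0.8, limit=5):
    """Return (term, frequency) pairs from the noisy text that resemble the query."""
    matches = difflib.get_close_matches(query.lower(), list(freqs), n=limit, cutoff=cutoff)
    return [(term, freqs[term]) for term in matches]


FREQS = term_frequencies(NOISY_TEXT)
```

    A user querying `documents` would be shown the stored variant `docurnents` with its frequency and could add it to the search.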

  7. Biomarker Identification Using Text Mining

    Directory of Open Access Journals (Sweden)

    Hui Li

    2012-01-01

    Identifying molecular biomarkers has become one of the important tasks for scientists to assess the different phenotypic states of cells or organisms correlated to the genotypes of diseases from large-scale biological data. In this paper, we propose a text-mining-based method to discover biomarkers from PubMed. First, we construct a database based on a dictionary, and then we use a finite state machine to identify the biomarkers. Our method of text mining provides a highly reliable approach to discover the biomarkers in the PubMed database.
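
    One way to read the dictionary-plus-finite-state-machine step is as a token-level trie that is walked over each abstract. The sketch below follows that reading under stated assumptions: the dictionary entries, the `END` marker, and the longest-match policy are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical dictionary entries; a real system would load them from a curated database.
DICTIONARY = ["prostate specific antigen", "troponin", "c reactive protein"]
END = "<end>"  # marks an accepting state; cannot collide with a [a-z0-9]+ token


def build_fsm(terms):
    """Build a token-level trie: each state maps a token to the next state."""
    root = {}
    for term in terms:
        state = root
        for token in term.split():
            state = state.setdefault(token, {})
        state[END] = term
    return root


def find_terms(text, fsm):
    """Scan the text and report the longest dictionary match at each position."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    hits, i = [], 0
    while i < len(tokens):
        state, j, last = fsm, i, None
        while j < len(tokens) and tokens[j] in state:
            state = state[tokens[j]]
            j += 1
            if END in state:
                last = (state[END], j)  # remember the longest accepted term so far
        if last:
            hits.append(last[0])
            i = last[1]
        else:
            i += 1
    return hits


FSM = build_fsm(DICTIONARY)
```

    For example, `find_terms("Serum C-reactive protein and troponin were elevated.", FSM)` reports the two dictionary terms in the order they occur.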

  8. A Novel Approach for Arabic Text Steganography Based on the “BloodGroup” Text Hiding Method

    Directory of Open Access Journals (Sweden)

    S. Malalla

    2017-04-01

    Full Text Available Steganography is the science of hiding certain messages (data) in groups of irrelevant data, possibly of another form. The purpose of steganography is covert communication to hide the existence of a message from an intermediary. Text steganography is the process of embedding a secret message (text) in another text (cover text) so that the existence of the secret message cannot be detected by a third party. This paper presents a novel approach for text steganography using the Blood Group (BG) method based on the behavior of blood groups. Experimentally, it is found that the proposed method achieves good results in capacity, hiding capacity, time complexity, robustness, visibility, and similarity, which shows its superiority compared to several existing methods.

  9. The Application of Text Mining in Business Research

    DEFF Research Database (Denmark)

    Preuss, Bjørn

    2017-01-01

    The aim of this paper is to present a methodological concept in business research that has the potential to become one of the most powerful methods in the upcoming years when it comes to research qualitative phenomena in business and society. It presents a selection of algorithms as well elaborat...... on potential use cases for a text mining based approach to qualitative data analysis....

  10. Intertextual Content Analysis: An Approach for Analysing Text-Related Discussions with Regard to Movability in Reading and How Text Content Is Handled

    Science.gov (United States)

    Hallesson, Yvonne; Visén, Pia

    2018-01-01

    Reading and discussing texts as a means for learning subject content are regular features within educational contexts. This paper presents an approach for intertextual content analysis (ICA) of such text-related discussions revealing what the participants make of the text. Thus, in contrast to many other approaches for analysing conversation that…

  11. Approach to Mathematics in Textbooks at Tertiary Level--Exploring Authors' Views about Their Texts

    Science.gov (United States)

    Randahl, Mira

    2012-01-01

    The aim of this article is to present and discuss some results from an inquiry into mathematics textbooks authors' visions about their texts and approaches they choose when new concepts are introduced. Authors' responses are discussed in relation to results about students' difficulties with approaching calculus reported by previous research. A…

  12. Methodological Demonstration of a Text Analytics Approach to Country Logistics System Assessments

    DEFF Research Database (Denmark)

    Kinra, Aseem; Mukkamala, Raghava Rao; Vatrapu, Ravi

    2017-01-01

    The purpose of this study is to develop and demonstrate a semi-automated text analytics approach for the identification and categorization of information that can be used for country logistics assessments. In this paper, we develop the methodology on a set of documents for 21 countries using...... and the text analyst. Implications are discussed and future work is outlined....

  13. Opinion Mining in Latvian Text Using Semantic Polarity Analysis and Machine Learning Approach

    Directory of Open Access Journals (Sweden)

    Gatis Špats

    2016-07-01

    Full Text Available In this paper we demonstrate approaches for opinion mining in Latvian text. Authors have applied, combined and extended results of several previous studies and public resources to perform opinion mining in Latvian text using two approaches, namely, semantic polarity analysis and machine learning. One of the most significant constraints that make application of opinion mining for written content classification in Latvian text challenging is the limited publicly available text corpora for classifier training. We have joined several sources and created a publicly available extended lexicon. Our results are comparable to or outperform current achievements in opinion mining in Latvian. Experiments show that lexicon-based methods provide more accurate opinion mining than the application of a Naive Bayes machine learning classifier on Latvian tweets. Methods used during this study could be further extended using human annotators, unsupervised machine learning and bootstrapping to create larger corpora of classified text.

  14. Modelling text as process a dynamic approach to EFL classroom discourse

    CERN Document Server

    Yang, Xueyan

    2010-01-01

    A discourse analysis that is not based on grammar is likely to end up as a running commentary on a text, whereas a grammar-based one tends to treat text as a finished product rather than an on-going process. This book offers an approach to discourse analysis that is both grammar-based and oriented towards text as process. It proposes a model called TEXT TYPE within the framework of Hallidayan systemic-functional linguistics, which views grammatical choices in a text not as elements that combine to form a clause structure, but as semantic features that link successive clauses into an unfolding

  15. Interdisciplinary Approach to the Mental Lexicon: Neural Network and Text Extraction From Long-term Memory

    Directory of Open Access Journals (Sweden)

    Vardan G. Arutyunyan

    2013-01-01

    Full Text Available The paper touches upon the principles of mental lexicon organization in the light of recent research in psycho- and neurolinguistics. As a focal point of discussion, two main approaches to mental lexicon functioning are considered: the modular or dual-system approach, developed within generativism, and the opposite single-system approach, whose representatives are the connectionists and supporters of network models. The paper is an endeavor towards advocating the viewpoint that the mental lexicon is a complex psychological organization based upon a specific composition of neural network. In this regard, the paper further elaborates on the matter of storing text in human mental space and introduces a model of text extraction from long-term memory. Based upon data available, the author develops a methodology of modeling structures of knowledge representation in the systems of artificial intelligence.

  16. Building a protein name dictionary from full text: a machine learning term extraction approach

    Directory of Open Access Journals (Sweden)

    Campagne Fabien

    2005-04-01

    Full Text Available Abstract. Background: The majority of information in the biological literature resides in full text articles, instead of abstracts. Yet, abstracts remain the focus of many publicly available literature data mining tools. Most literature mining tools rely on pre-existing lexicons of biological names, often extracted from curated gene or protein databases. This is a limitation, because such databases have low coverage of the many name variants which are used to refer to biological entities in the literature. Results: We present an approach to recognize named entities in full text. The approach collects high frequency terms in an article, and uses support vector machines (SVM) to identify biological entity names. It is also computationally efficient and robust to noise commonly found in full text material. We use the method to create a protein name dictionary from a set of 80,528 full text articles. Only 8.3% of the names in this dictionary match SwissProt description lines. We assess the quality of the dictionary by studying its protein name recognition performance in full text. Conclusion: This dictionary term lookup method compares favourably to other published methods, supporting the significance of our direct extraction approach. The method is strong in recognizing name variants not found in SwissProt.

  17. Predicting Text Comprehension, Processing, and Familiarity in Adult Readers: New Approaches to Readability Formulas

    Science.gov (United States)

    Crossley, Scott A.; Skalicky, Stephen; Dascalu, Mihai; McNamara, Danielle S.; Kyle, Kristopher

    2017-01-01

    Research has identified a number of linguistic features that influence the reading comprehension of young readers; yet, less is known about whether and how these findings extend to adult readers. This study examines text comprehension, processing, and familiarity judgment provided by adult readers using a number of different approaches (i.e.,…

  18. Place as Text: Approaches to Active Learning. 2nd Edition. National Collegiate Honors Council Monograph Series

    Science.gov (United States)

    Braid, Bernice, Ed.; Long, Ada, Ed.

    2010-01-01

    The decade since publication of "Place as Text: Approaches to Active Learning" has seen an explosion of interest and productivity in the field of experiential education. This monograph presents a story of an experiment and a blueprint of sorts for anyone interested in enriching an existing program or willing to experiment with pedagogy…

  19. Word-level recognition of multifont Arabic text using a feature vector matching approach

    Science.gov (United States)

    Erlandson, Erik J.; Trenkle, John M.; Vogt, Robert C., III

    1996-03-01

    Many text recognition systems recognize text imagery at the character level and assemble words from the recognized characters. An alternative approach is to recognize text imagery at the word level, without analyzing individual characters. This approach avoids the problem of individual character segmentation, and can overcome local errors in character recognition. A word-level recognition system for machine-printed Arabic text has been implemented. Arabic is a script language, and is therefore difficult to segment at the character level. Character segmentation has been avoided by recognizing text imagery of complete words. The Arabic recognition system computes a vector of image-morphological features on a query word image. This vector is matched against a precomputed database of vectors from a lexicon of Arabic words. Vectors from the database with the highest match score are returned as hypotheses for the unknown image. Several feature vectors may be stored for each word in the database. Database feature vectors generated using multiple fonts and noise models allow the system to be tuned to its input stream. Used in conjunction with database pruning techniques, this Arabic recognition system has obtained promising word recognition rates on low-quality multifont text imagery.
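
    The matching step described above (compare a query word's feature vector against a precomputed database, allow several vectors per word, and return the best-scoring words as hypotheses) can be sketched as below. The abstract does not say which match score or features the system uses; cosine similarity, the three-dimensional toy vectors, and the transliterated words are assumptions made purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_word(query_vec, database, top_k=3):
    """Rank lexicon words by their best-matching stored vector.

    database is a list of (word, feature_vector) pairs; a word may appear
    several times, e.g. once per font or noise model.
    """
    best = {}
    for word, vec in database:
        score = cosine(query_vec, vec)
        if score > best.get(word, -1.0):
            best[word] = score
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]


# Hypothetical database: two stored vectors for "kitab", one for each other word.
database = [
    ("kitab", [0.9, 0.1, 0.3]),
    ("kitab", [0.8, 0.2, 0.4]),
    ("qalam", [0.1, 0.9, 0.2]),
    ("bayt", [0.3, 0.3, 0.9]),
]
hypotheses = match_word([0.85, 0.15, 0.35], database)
print(hypotheses)
```

    Keeping only the best score per word is one simple way to let multiple fonts contribute without a word appearing twice in the hypothesis list.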

  20. An Approach to a Comprehensive Test Framework for Analysis and Evaluation of Text Line Segmentation Algorithms

    Directory of Open Access Journals (Sweden)

    Zoran N. Milivojevic

    2011-09-01

    Full Text Available The paper introduces a testing framework for the evaluation and validation of text line segmentation algorithms. Text line segmentation represents the key action for correct optical character recognition. Many of the tests for the evaluation of text line segmentation algorithms deal with text databases as reference templates. Because of the mismatch, the reliable testing framework is required. Hence, a new approach to a comprehensive experimental framework for the evaluation of text line segmentation algorithms is proposed. It consists of synthetic multi-like text samples and real handwritten text as well. Although the tests are mutually independent, the results are cross-linked. The proposed method can be used for different types of scripts and languages. Furthermore, two different procedures for the evaluation of algorithm efficiency based on the obtained error type classification are proposed. The first is based on the segmentation line error description, while the second one incorporates well-known signal detection theory. Each of them has different capabilities and convenience, but they can be used as supplements to make the evaluation process efficient. Overall the proposed procedure based on the segmentation line error description has some advantages, characterized by five measures that describe measurement procedures.

  1. A Novel Text Clustering Approach Using Deep-Learning Vocabulary Network

    Directory of Open Access Journals (Sweden)

    Junkai Yi

    2017-01-01

    Full Text Available Text clustering is an effective approach to collect and organize text documents into meaningful groups for mining valuable information on the Internet. However, there exist some issues to tackle such as feature extraction and data dimension reduction. To overcome these problems, we present a novel approach named deep-learning vocabulary network. The vocabulary network is constructed based on related-word set, which contains the “cooccurrence” relations of words or terms. We replace term frequency in feature vectors with the “importance” of words in terms of vocabulary network and PageRank, which can generate more precise feature vectors to represent the meaning of text clustering. Furthermore, sparse-group deep belief network is proposed to reduce the dimensionality of feature vectors, and we introduce coverage rate for similarity measure in Single-Pass clustering. To verify the effectiveness of our work, we compare the approach to the representative algorithms, and experimental results show that feature vectors in terms of deep-learning vocabulary network have better clustering performance.

  2. Task-based Language Teaching and Text Types in Teaching Writing Using Communicative Approach

    Directory of Open Access Journals (Sweden)

    Riyana Sari Ni Nyoman

    2018-01-01

    Full Text Available One of the most important language competencies in the teaching-learning process is writing. The present study focused on investigating the effect of communicative approach with task-based language teaching and communicative approach on the students’ writing competency at SMP N 2 Kediri viewed from text types (i.e., descriptive, recount, and narrative). To analyze the data, the design of the experimental study was posttest-only comparison groups by involving 60 students that were selected as the sample of the study through cluster random design. The sample’s post tests were assessed by using analytical scoring rubric. The data were then analyzed by using One-way ANOVA and the post hoc test was done by computing Multiple Comparison using Tukey HSD Test. The result showed that there was a significant difference of the effect of communicative approach with task-based language teaching and communicative approach on the students’ writing competency. These findings are expected to give contribution in teaching English, particularly writing.

  3. Separation in Data Mining Based on Fractal Nature of Data

    Czech Academy of Sciences Publication Activity Database

    Jiřina, Marcel; Jiřina jr., M.

    2013-01-01

    Vol. 3, No. 1 (2013), pp. 44-60 ISSN 2225-658X Institutional support: RVO:67985807 Keywords: nearest neighbor * fractal set * multifractal * IINC method * correlation dimension Subject RIV: JC - Computer Hardware; Software http://sdiwc.net/digital-library/separation-in-data-mining-based-on-fractal-nature-of-data.html

  4. A Network of Themes: A Qualitative Approach to Gerhard Richter's Text

    Directory of Open Access Journals (Sweden)

    Narvika Bovcon

    2017-07-01

    Full Text Available Gerhard Richter's books Text – a collection of painter's verbal statements about his artistic method – and Atlas – 783 sheets with images, mainly photographs and visual notations – are two archives that complement the understanding of his diverse artistic practice. The paper presents a textual model that experimentally simulates a possible ordering principle for archives. Richter's statements in the book Text are cut up and used as short quotations. Those that relate to multiple aspects of the painter's oeuvre are identified as hubs in the semantic network. The hubs are organized paratactically, as an array of different themes. The paper presents a methodological hypothesis and an experimental model that aim to connect the research of real networks with the paradigms of humanistic interpretation. We have to bear in mind that the network is a result of the researcher's interpretative approach, which is added to the initial archive included in the book Text. The breaking up of Richter's poetics into atoms of quotations is an experimental proposal of a new textuality in art history and humanities, which has its own history. In comparison to digital archives with complex interfaces that often tend to obscure the content, the elements in our experiment appear as specific configurations of the semantic network and are presented in a limited number of linear texts. The method of listing of quotations gathers the fragments into a potential “whole”, i.e. a narrativized gateway to an archive according to the researcher's interpretation.

  5. Text mining approach to predict hospital admissions using early medical records from the emergency department.

    Science.gov (United States)

    Lucini, Filipe R; S Fogliatto, Flavio; C da Silveira, Giovani J; L Neyeloff, Jeruza; Anzanello, Michel J; de S Kuchenbecker, Ricardo; D Schaan, Beatriz

    2017-04-01

    Emergency department (ED) overcrowding is a serious issue for hospitals. Early information on short-term inward bed demand from patients receiving care at the ED may reduce the overcrowding problem, and optimize the use of hospital resources. In this study, we use text mining methods to process data from early ED patient records using the SOAP framework, and predict future hospitalizations and discharges. We try different approaches for pre-processing text records and for predicting hospitalization. Sets-of-words are obtained via binary representation, term frequency, and term frequency-inverse document frequency. Unigrams, bigrams and trigrams are tested for feature formation. Feature selection is based on χ² and F-score metrics. In the prediction module, eight text mining methods are tested: Decision Tree, Random Forest, Extremely Randomized Tree, AdaBoost, Logistic Regression, Multinomial Naïve Bayes, Support Vector Machine (linear kernel) and Nu-Support Vector Machine (linear kernel). Prediction performance is evaluated by F1-scores. Precision and Recall values are also reported for all text mining methods tested. Nu-Support Vector Machine was the text mining method with the best overall performance. Its average F1-score in predicting hospitalization was 77.70%, with a standard deviation (SD) of 0.66%. The method could be used to manage daily routines in EDs such as capacity planning and resource allocation. Text mining could provide valuable information and facilitate decision-making by inward bed management teams. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
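
    The three set-of-words representations and the evaluation metric named in this abstract can be illustrated with a small standard-library sketch. It is not the authors' pipeline: the plain log(N / df) inverse document frequency, the function names, and the two sample notes are assumptions, and the classifiers themselves are left out.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=3):
    """Return all unigrams, bigrams and trigrams (up to n_max) as strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def vectorize(docs, scheme="tfidf"):
    """Represent each document as {feature: weight}: binary, tf, or tf-idf."""
    counts = [Counter(ngrams(doc.lower().split())) for doc in docs]
    doc_freq = Counter(feature for c in counts for feature in c)
    vectors = []
    for c in counts:
        if scheme == "binary":
            vectors.append({f: 1 for f in c})
        elif scheme == "tf":
            vectors.append(dict(c))
        else:  # tf-idf with a plain log(N / df) weighting (an assumption)
            vectors.append(
                {f: tf * math.log(len(docs) / doc_freq[f]) for f, tf in c.items()}
            )
    return vectors


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the positive class (e.g. hospitalization)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical record snippets and labels, for illustration only.
docs = ["chest pain and dyspnea", "mild chest pain"]
tfidf = vectorize(docs)
metrics = precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(tfidf[1], metrics)
```

    A feature shared by every document, such as the bigram "chest pain" here, receives a tf-idf weight of zero under this weighting, which is why the abstract's authors compare it against the binary and raw term-frequency representations.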

  6. Approach to mathematics in textbooks at tertiary level - exploring authors' views about their texts

    Science.gov (United States)

    Randahl, Mira

    2012-10-01

    The aim of this article is to present and discuss some results from an inquiry into mathematics textbooks authors' visions about their texts and approaches they choose when new concepts are introduced. Authors' responses are discussed in relation to results about students' difficulties with approaching calculus reported by previous research. A questionnaire has been designed and sent to seven authors of the most used calculus textbooks in Norway and four authors have responded. The responses show that the authors mainly view teaching in terms of transmission so they focus mainly on getting the mathematical content correct and 'clear'. The dominant view is that the textbook is intended to help the students to learn by explaining and clarifying. The authors prefer the approach to introduce new concepts based on the traditional way of perceiving mathematics as a system of definitions, examples and exercises. The results of this study may enhance our understanding of the role of the textbook at tertiary level. They may also form a foundation for further research.

  7. Axiomatic Ontology Learning Approaches for English Translation of the Meaning of Quranic Texts

    Directory of Open Access Journals (Sweden)

    Saad Saidah

    2017-01-01

    Full Text Available Ontology learning (OL) is the computational task of generating a knowledge base in the form of an ontology, given an unstructured corpus in natural language (NL). While most works in the field of ontology learning have been primarily based on a statistical approach to extract lightweight OL, very few attempts have been made to extract axiomatic OL (called heavyweight OL) from NL text documents. Axiomatic OL supports more precise formal logic-based reasoning when compared to lightweight OL. Lexico-syntactic pattern matching and statistical approaches cannot lead to very accurate learning, mostly because of several linguistic nuances in the NL. Axiomatic OL is an alternative methodology that has not been explored much, where a deep linguistic analysis in computational linguistics is used to generate formal axioms and definitions instead of simply inducing a taxonomy. The ontology that is created not only stores the information about the application domain in explicit knowledge, but also can deduce the implicit knowledge from this ontology. This research will explore the English translation of the meaning of Quranic texts.

  8. Text messaging approach improves weight loss in patients with nonalcoholic fatty liver disease: A randomized study.

    Science.gov (United States)

    Axley, Page; Kodali, Sudha; Kuo, Yong-Fang; Ravi, Sujan; Seay, Toni; Parikh, Nina M; Singal, Ashwani K

    2018-05-01

    Nonalcoholic fatty liver disease (NAFLD) is emerging as the most common liver disease. The only effective treatment is 7%-10% weight loss. Mobile technology is increasingly used in weight management. This study was performed to evaluate the effects of text messaging intervention on weight loss in patients with NAFLD. Thirty well-defined NAFLD patients (mean age 52 years, 67% females, mean BMI 38) were randomized 1:1 to control group: counselling on healthy diet and exercise, or intervention group: text messages in addition to healthy life style counselling. NAFLD text messaging program sent weekly messages for 22 weeks on healthy life style education. Primary outcome was change in weight. Secondary outcomes were changes in liver enzymes and lipid profile. Intervention group lost an average of 6.9 lbs. (P = .03) compared to gain of 1.8 lbs. in the control group (P = .45). Intervention group also showed a decrease in ALT level (-12.5 IU/L, P = .035) and improvement in serum triglycerides (-28 mg/dL, P = .048). There were no changes in the control group on serum ALT level (-6.1 IU/L, P = .46) and on serum triglycerides (-20.3 mg/dL P = .27). Using one-way analysis of variance, change in outcomes in intervention group compared to control group was significant for weight (P = .02) and BMI (P = .02). Text messaging on healthy life style is associated with reduction in weight in NAFLD patients. Larger studies are suggested to examine benefits on liver histology, and assess long-term impact of this approach in patients with NAFLD. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  9. Using text mining for study identification in systematic reviews: a systematic review of current approaches.

    Science.gov (United States)

    O'Mara-Eves, Alison; Thomas, James; McNaught, John; Miwa, Makoto; Ananiadou, Sophia

    2015-01-14

    The large and growing number of published studies, and their increasing rate of publication, makes the task of identifying relevant studies in an unbiased way for inclusion in systematic reviews both complex and time consuming. Text mining has been offered as a potential solution: through automating some of the screening process, reviewer time can be saved. The evidence base around the use of text mining for screening has not yet been pulled together systematically; this systematic review fills that research gap. Focusing mainly on non-technical issues, the review aims to increase awareness of the potential of these technologies and promote further collaborative research between the computer science and systematic review communities. Five research questions led our review: what is the state of the evidence base; how has workload reduction been evaluated; what are the purposes of semi-automation and how effective are they; how have key contextual problems of applying text mining to the systematic review field been addressed; and what challenges to implementation have emerged? We answered these questions using standard systematic review methods: systematic and exhaustive searching, quality-assured data extraction and a narrative synthesis to synthesise findings. The evidence base is active and diverse; there is almost no replication between studies or collaboration between research teams and, whilst it is difficult to establish any overall conclusions about best approaches, it is clear that efficiencies and reductions in workload are potentially achievable. On the whole, most suggested that a saving in workload of between 30% and 70% might be possible, though sometimes the saving in workload is accompanied by the loss of 5% of relevant studies (i.e. a 95% recall). Using text mining to prioritise the order in which items are screened should be considered safe and ready for use in 'live' reviews. 
    The use of text mining as a 'second screener' may also be used cautiously
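
    The trade-off reported above between workload saving and recall can be made concrete with a short calculation. The sketch below assumes one common way of framing it: items are screened in the order a tool ranks them, and screening stops once a target share of the relevant studies has been found. The function name, the stopping rule, and the synthetic ranking are assumptions, not the review's own evaluation method.

```python
import math


def screening_effort(ranked_labels, target_recall=0.95):
    """Return (n_screened, recall, workload_saving) for a prioritized list.

    ranked_labels holds 1 for a relevant study and 0 otherwise, in the
    order in which the tool presents items to the reviewer.
    """
    total_relevant = sum(ranked_labels)
    if total_relevant == 0:
        raise ValueError("the list contains no relevant studies")
    needed = math.ceil(target_recall * total_relevant)
    found = 0
    for n_screened, label in enumerate(ranked_labels, start=1):
        found += label
        if found >= needed:
            break
    recall = found / total_relevant
    saving = 1 - n_screened / len(ranked_labels)
    return n_screened, recall, saving


# Synthetic ranking of 100 citations with 20 relevant ones, mostly near the top.
labels = [0] * 100
for position in list(range(15)) + [19, 29, 39, 49, 89]:
    labels[position] = 1

n_screened, recall, saving = screening_effort(labels)
print(n_screened, recall, saving)
```

    In this synthetic case the reviewer stops after half of the list, which corresponds to a 50% workload saving at 95% recall; the one relevant study left far down the ranking is the cost of stopping early.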

  10. LIBERAL THOUGHT IN QUR’ANIC STUDIES: Tracing Humanistic Approach to Sacred Text in Islamic Scholarship

    Directory of Open Access Journals (Sweden)

    M. Nur Kholis Setiawan

    2007-03-01

    Full Text Available The literary approach to the Qur’an developed by al-Khuli drew strong criticism from its opponents, in whose opinion applying a literary paradigm to the study of the Qur’an meant treating the Qur’an as a human text, which they saw as clear evidence of a liberal mode of thinking that departs from the spirit of the Qur’an. This article shows the opposite of what they have claimed. The data show that linguistic aspects of the Qur’an have succeeded in creating an intellectual connection among progressive and liberal scholars in the classical and modern eras. This supports the assumption that progressive and liberal thought, one of whose indicators is freedom of thought in Charles Kurzman’s terms, is a “child” of Islamic civilization. Freedom of thought in classical Islamic scholarship should be the élan of intellectualism, including in the field of Qur’anic studies.

  11. Chemical Topic Modeling: Exploring Molecular Data Sets Using a Common Text-Mining Approach.

    Science.gov (United States)

    Schneider, Nadine; Fechner, Nikolas; Landrum, Gregory A; Stiefl, Nikolaus

    2017-08-28

    Big data is one of the key transformative factors which increasingly influences all aspects of modern life. Although this transformation brings vast opportunities it also generates novel challenges, not the least of which is organizing and searching this data deluge. The field of medicinal chemistry is not different: more and more data are being generated, for instance, by technologies such as DNA encoded libraries, peptide libraries, text mining of large literature corpora, and new in silico enumeration methods. Handling those huge sets of molecules effectively is quite challenging and requires compromises that often come at the expense of the interpretability of the results. In order to find an intuitive and meaningful approach to organizing large molecular data sets, we adopted a probabilistic framework called "topic modeling" from the text-mining field. Here we present the first chemistry-related implementation of this method, which allows large molecule sets to be assigned to "chemical topics" and investigating the relationships between those. In this first study, we thoroughly evaluate this novel method in different experiments and discuss both its disadvantages and advantages. We show very promising results in reproducing human-assigned concepts using the approach to identify and retrieve chemical series from sets of molecules. We have also created an intuitive visualization of the chemical topics output by the algorithm. This is a huge benefit compared to other unsupervised machine-learning methods, like clustering, which are commonly used to group sets of molecules. Finally, we applied the new method to the 1.6 million molecules of the ChEMBL22 data set to test its robustness and efficiency. In about 1 h we built a 100-topic model of this large data set in which we could identify interesting topics like "proteins", "DNA", or "steroids". Along with this publication we provide our data sets and an open-source implementation of the new method (CheTo) which

  12. The Voice of Chinese Health Consumers: A Text Mining Approach to Web-Based Physician Reviews.

    Science.gov (United States)

    Hao, Haijing; Zhang, Kunpeng

    2016-05-10

    skills and bedside manner, general appreciation from patients, and description of various symptoms. To the best of our knowledge, our work is the first study using an automated text-mining approach to analyze a large amount of unstructured textual data of Web-based physician reviews in China. Based on our analysis, we found that Chinese reviewers mainly concentrate on a few popular topics. This is consistent with the goal of Chinese online health platforms and demonstrates the health care focus in China's health care system. Our text-mining approach reveals a new research area on how to use big data to help health care providers, health care administrators, and policy makers hear patient voices, target patient concerns, and improve the quality of care in this age of patient-centered care. Also, on the health care consumer side, our text mining technique helps patients make more informed decisions about which specialists to see without reading thousands of reviews, which is simply not feasible. In addition, our comparison analysis of Web-based physician reviews in China and the United States also indicates some cultural differences.

  13. Two approaches to gathering text corpora from the WorldWideWeb

    CSIR Research Space (South Africa)

    Botha, G

    2005-11-01

    Full Text Available Many applications of pattern recognition to natural language processing require large text corpora in a specified language. For many of the languages of the world, such corpora are not readily available, but significant quantities of text...

  14. Improving Collaborative Learning in the Classroom: Text Mining Based Grouping and Representing

    Science.gov (United States)

    Erkens, Melanie; Bodemer, Daniel; Hoppe, H. Ulrich

    2016-01-01

    Orchestrating collaborative learning in the classroom involves tasks such as forming learning groups with heterogeneous knowledge and making learners aware of the knowledge differences. However, gathering information on which the formation of appropriate groups and the creation of graphical knowledge representations can be based is very effortful…

  15. A constructivist approach to e-text design for use in undergraduate physiology courses.

    Science.gov (United States)

    Rhodes, Ashley E; Rozell, Timothy G

    2015-09-01

    Electronic textbooks, or e-texts, will have an increasingly important role in college science courses within the next few years due to the rising costs of traditional texts and the increasing availability of software allowing instructors to create their own e-text. However, few guidelines exist in the literature to aid instructors in the development and design specifically of e-texts using sound learning theories; this is especially true for undergraduate physiology e-texts. In this article, we describe why constructivism is a very important educational theory for e-text design and how it may be applied in e-text development by instructors. We also provide examples of two undergraduate physiology e-texts that were designed in accordance with this educational theory but for learners of quite different backgrounds and prior knowledge levels. Copyright © 2015 The American Physiological Society.

  16. DataToText: A Consumer-Oriented Approach to Data Analysis

    Science.gov (United States)

    Kenny, David A.

    2010-01-01

    DataToText is a project developed where the user communicates the relevant information for an analysis and DataToText computer routine produces text output that describes in words, tables, and figures the results from the analyses. Two extended examples are given, one an example of a moderator analysis and the other an example of a dyadic data…

  17. A Constructivist Approach to E-Text Design for Use in Undergraduate Physiology Courses

    Science.gov (United States)

    Rhodes, Ashley E.; Rozell, Timothy G.

    2015-01-01

    Electronic textbooks, or e-texts, will have an increasingly important role in college science courses within the next few years due to the rising costs of traditional texts and the increasing availability of software allowing instructors to create their own e-text. However, few guidelines exist in the literature to aid instructors in the…

  18. A Study of Readability of Texts in Bangla through Machine Learning Approaches

    Science.gov (United States)

    Sinha, Manjira; Basu, Anupam

    2016-01-01

    In this work, we have investigated text readability in Bangla language. Text readability is an indicator of the suitability of a given document with respect to a target reader group. Therefore, text readability has huge impact on educational content preparation. The advances in the field of natural language processing have enabled the automatic…

  19. Individual differences in reading comprehension : A componential approach to eighth graders’ expository text comprehension

    NARCIS (Netherlands)

    Welie, C.J.M.

    2017-01-01

    Why do secondary school students differ in their text comprehension? This is an important question because many secondary school students are unable to achieve the level of text comprehension required to enable learning from their school book texts. This thesis contributes to answering this question

  20. Approach for Text Classification Based on the Similarity Measurement between Normal Cloud Models

    Directory of Open Access Journals (Sweden)

    Jin Dai

    2014-01-01

    Full Text Available The similarity between objects is a core research area of data mining. To reduce the interference caused by the uncertainty of natural language, a similarity measurement between normal cloud models is applied to text classification. On this basis, a novel text classifier based on cloud concept jumping up (CCJU-TC) is proposed. It can efficiently convert between qualitative concepts and quantitative data. The text set is converted into a text information table based on the VSM model, and the qualitative text concepts extracted from the same category are jumped up into a whole-category concept. According to the cloud similarity between the test text and each category concept, the test text is assigned to the most similar category. Comparison among different text classifiers on different feature selection sets shows that CCJU-TC not only adapts well to different text features but also achieves better classification performance than the traditional classifiers.

  1. Development of Human Face Literature Database Using Text Mining Approach: Phase I.

    Science.gov (United States)

    Kaur, Paramjit; Krishan, Kewal; Sharma, Suresh K

    2018-06-01

    The face is an important part of the human body by which an individual communicates in society. Its importance can be highlighted by the fact that a person deprived of a face cannot sustain in the living world. The number of experiments being performed and the number of research papers being published in the domain of the human face have surged in the past few decades. Several scientific disciplines conduct research on the human face, including Medical Science, Anthropology, Information Technology (Biometrics, Robotics, Artificial Intelligence, etc.), Psychology, Forensic Science, and Neuroscience. This signals the need to collect and manage data concerning the human face so that public and free access to it can be provided to the scientific community. This can be attained by developing databases and tools on the human face using a bioinformatics approach. The current research focuses on creating a database of literature data on the human face. The database can be accessed on the basis of specific keywords, journal name, date of publication, author's name, etc. The collected research papers will be stored in the form of a database. Hence, the database will be beneficial to the research community, as comprehensive information dedicated to the human face can be found in one place. Information related to facial morphologic features, facial disorders, facial asymmetry, facial abnormalities, and many other parameters can be extracted from this database. The front end has been developed using HyperText Markup Language and Cascading Style Sheets. The back end has been developed using Hypertext Preprocessor (PHP). JavaScript has been used as the scripting language. MySQL (Structured Query Language) is used for database development, as it is the most widely used Relational Database Management System. XAMPP (X (cross platform), Apache, MySQL, PHP, Perl) open source web application software has been used as the server. The database is still under the…

  2. Text in context: a textual-linguistic approach to Amos 4: 7-8

    Directory of Open Access Journals (Sweden)

    del Barco del Barco, Francisco Javier

    2002-12-01

    Full Text Available This article will study Amos 4:7-8 from a textlinguistic approach: the form of this section will be analyzed within the structure of the chapter in which it is inserted. Such an analysis is needed because the set of verb forms used seems to be different from the rest of verb forms used in the chapter. While the whole chapter tends to be structured as a brief chain of narrative passages with wayyiqtol, the structure of Amos 4:7-8 seems to be a predictive section -developed through weqatal- inserted or pasted in the middle of the chapter. Translations usually do not note the difference between the set of verb forms used. A textlinguistic analysis of Amos 4:7-8 will show that the kind of discourse used here is different from the one used in the rest of the chapter, and, therefore, this difference should be reflected in the translation. The specific function of some discourse types is also discussed.

    This article presents an analysis of Amos 4:7-8 based on the premises of text linguistics. The form of the text is analyzed taking into account the structure of the chapter in which it is inserted. This analysis is necessary because the set of verb forms used in the proposed section does not seem to be the same as that of the rest of the chapter. While the chapter as a whole is a narrative discourse structured around wayyiqtol, Amos 4:7-8 seems to follow the pattern of predictive discourse developed from weqatal. A textual analysis is necessary because biblical translations do not seem to reflect the change in the use of verb forms. In addition to this analysis, the specific function of some discourse types is also discussed.

  3. An improved algorithm for information hiding based on features of Arabic text: A Unicode approach

    Directory of Open Access Journals (Sweden)

    A.A. Mohamed

    2014-07-01

    Full Text Available Steganography is the practice of hiding secret information in a cover medium so that other individuals fail to realize its existence. Because a text file has little data redundancy compared with other carrier files, text steganography is a difficult problem to solve. In this paper, we propose a new, promising steganographic algorithm for Arabic text based on features of Arabic text. The focus is on a more secure algorithm and a high carrier capacity. Our extensive experiments using the proposed algorithm resulted in a high capacity of the carrier media. The embedding capacity ratio of the proposed algorithm is high. In addition, our algorithm can resist traditional attacking methods, since it keeps the changes in the carrier text as small as possible.

  4. Tracing Knowledge Transfer from Universities to Industry: A Text Mining Approach

    DEFF Research Database (Denmark)

    Woltmann, Sabrina; Alkærsig, Lars

    2017-01-01

    This paper identifies transferred knowledge between universities and the industry by proposing the use of a computational linguistic method. Current research on university-industry knowledge exchange relies often on formal databases and indicators such as patents, collaborative publications and license agreements, to assess the contribution to the socioeconomic surrounding of universities. We, on the other hand, use the texts from university abstracts to identify university knowledge and compare them with texts from firm webpages. We use these text data to identify common key words and thereby identify overlapping contents among the texts. As method we use a well-established word ranking method from the field of information retrieval, term frequency–inverse document frequency (TFIDF), to identify commonalities between texts from university. In examining the outcomes of the TFIDF statistic we find... is the first step to enable the identification of common knowledge and knowledge transfer via text mining to increase its measurability.
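    The TFIDF ranking named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the weighting variant (relative term frequency times log(N/df)), the helper names `tfidf` and `common_keywords`, and the four toy documents are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def common_keywords(weights_a, weights_b):
    """Terms that carry a positive TF-IDF weight in both documents."""
    return {t for t, w in weights_a.items() if w > 0 and weights_b.get(t, 0) > 0}


# Hypothetical corpus: one university abstract, one firm webpage, two unrelated texts.
docs = [
    "the enzyme catalysis improves the biofuel yield",
    "the company sells biofuel and enzyme products",
    "the weather is sunny and warm",
    "the football league results",
]
weights = tfidf(docs)
overlap = common_keywords(weights[0], weights[1])
```

    A word that occurs in every document (here "the") receives an idf of zero and drops out, so only the topical overlap between the first two texts remains. The unrelated background documents matter: with only two documents in the corpus, every shared term would also get a zero weight under this idf variant.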

  5. Oral History as Complement to Place-as-Text: Approaches to Service Learning

    Science.gov (United States)

    Pederson, JoEllen; Znosko, Jessi; Peters, Jesse; Cannata, Susan M.

    2018-01-01

    The purpose of this paper is to discuss the advantages of combining place-as-text curriculum with an oral history collection to act as catalysts for transformational learning. These experiential and service learning practices complement each other to enrich the encounters students are afforded. First, the nature and procedures of place-as-text and…

  6. A Machine Learning Approach to Measurement of Text Readability for EFL Learners Using Various Linguistic Features

    Science.gov (United States)

    Kotani, Katsunori; Yoshimi, Takehiko; Isahara, Hitoshi

    2011-01-01

    The present paper introduces and evaluates a readability measurement method designed for learners of EFL (English as a foreign language). The proposed readability measurement method (a regression model) estimates the text readability based on linguistic features, such as lexical, syntactic and discourse features. Text readability refers to the…

  7. Semi-supervised probabilistics approach for normalising informal short text messages

    CSIR Research Space (South Africa)

    Modupe, A

    2017-03-01

    Full Text Available The growing use of informal social text messages on Twitter is one of the known sources of big data. These types of messages are noisy and frequently rife with acronyms, slang, grammatical errors and non-standard words, causing grief for natural...

  8. Standard Chinese: A Modular Approach. Student Text. Module 1: Orientation; Module 2: Biographic Information.

    Science.gov (United States)

    Defense Language Inst., Monterey, CA.

    Texts in spoken Standard Chinese were developed to improve and update Chinese materials to reflect current usage in Beijing and Taipei. The focus is on communicating in Chinese in practical situations, and the texts summarize and supplement tapes. The overall course is organized into 10 situational modules, student workbooks, and resource modules.…

  9. Standard Chinese: A Modular Approach. Student Text. Module 3: Money; Module 4: Directions.

    Science.gov (United States)

    Defense Language Inst., Monterey, CA.

    Texts in spoken Standard Chinese were developed to improve and update Chinese materials to reflect current usage in Beijing and Taipei. The focus is on communicating in practical situations, and the texts summarize and supplement tapes. The overall course is organized into 10 situational modules, student workbooks, and resource modules. This text…

  10. Balancing Linguistic and Social Needs: Evaluating Texts Using a Critical Language Awareness Approach

    Science.gov (United States)

    Case, Rod E.; Ndura, Elavie; Righettini, Marielena

    2005-01-01

    English as a second language (ESL) content-based texts are often evaluated for their presentation of sound second-language teaching practices. While such reviews are important and valuable, they ignore an examination of the race, class, and gender issues introduced in the texts. A critical perspective on textbook evaluation organized around the…

  11. Texting As A Discursive Approach For The Production Of Agricultural Solutions

    Directory of Open Access Journals (Sweden)

    Ronan G. Zagado

    2015-08-01

    Full Text Available This paper demonstrates how the short messaging service (SMS), popularly known as texting, has facilitated the production of solutions to farm issues, using the Farmers' Text Centre (FTC) of the Philippine Rice Research (PhilRice) as the case study. Text messages registered in the FTC database in 2010, covering one cropping season, were discourse analyzed. Interpretive qualitative research, particularly Grounded Theory, was employed to interpret/theorize said data. Since texting is a new, emerging discourse in agricultural development, Grounded Theory allows the explication of theoretical accounts that explain its existence and impact. Results indicate that timing (queries received within working days from 8am to 5pm get a speedy response), content (the easier the question, the faster it gets a reply), length (the shorter the message, the better), and clarity of the query/text message, as well as cultural factors such as greetings and terms of respect, are all important governing factors in texting for farm use. Moreover, analysis reveals that the series of text messages sent back and forth by farmers and agricultural specialists in the FTC suggests a dynamic process of negotiation rather than passive information sharing. The analysis further reveals that texting has allowed farmers to have access to negotiated knowledge rather than a standard scientific recommendation vis-à-vis the solution to their farm issues. The term "negotiated" implies that farmers are actively involved in knowledge production via texting. "Textholder" is coined in this paper to describe farmers and agricultural specialists as co-creators of knowledge in texting, as opposed to their traditional roles as knowledge user and knowledge generator, respectively. From the analysis, reflections, implications, and theoretical contributions are drawn in relation to the value of SMSing in agricultural extension and communication.

  12. Word2vec and dictionary based approach for Uyghur text filtering

    Science.gov (United States)

    Tohti, Turdi; Zhao, Yunxing; Musajan, Winira

    2017-08-01

    With the emergence of deep learning, the representation of words in computers has made major breakthroughs, and the effectiveness of text processing based on word vectors has also improved significantly. This paper first maps all patterns into a more abstract vector space using a Uyghur-Chinese dictionary and the deep learning tool Word2vec. Second, similar patterns are found according to the characteristics of the original pattern. Finally, texts are filtered using the Wu-Manber algorithm. Experiments show that this method clearly improves the filtering accuracy and recall for Uyghur text information.

  13. Mentor Texts and the Coding of Academic Writing Structures: A Functional Approach

    Directory of Open Access Journals (Sweden)

    Wilder Yesid Escobar Alméciga

    2014-10-01

    Full Text Available The purpose of the present pedagogical experience was to address the English language writing needs of university-level students pursuing a degree in bilingual education with an emphasis in the teaching of English. Using mentor texts and coding academic writing structures, an instructional design was developed to directly address the shortcomings presented through a triangulated needs analysis. Through promoting awareness of international standards of writing as well as fostering an understanding of the inherent structures of academic texts, a methodology intended to increase academic writing proficiency was explored. The study suggests that mentor texts and the coding of academic writing structures can have a positive impact on the production of students’ academic writing.

  14. Using text mining for study identification in systematic reviews: a systematic review of current approaches

    OpenAIRE

    O'Mara-Eves, Alison; Thomas, James; McNaught, John; Miwa, Makoto; Ananiadou, Sophia

    2015-01-01

    Background: The large and growing number of published studies, and their increasing rate of publication, makes the task of identifying relevant studies in an unbiased way for inclusion in systematic reviews both complex and time consuming. Text mining has been offered as a potential solution: through automating some of the screening process, reviewer time can be saved. The evidence base around the use of text mining for screening has not yet been pulled together systematically; this systematic...

  15. A new approach to the classification of African oral texts | Kam ...

    African Journals Online (AJOL)

    All these reasons led to a re-examination of the different oral genres in the African context and to a proposed division of these texts into five broad categories. Keywords: oral literature, oral genres, oral texts, discourse, utterances, joking games, researchers in oral literature. Tydskrif vir Letterkunde ...

  16. Linguacultural space “Man-Nature” in literary texts: cognitive and pragmatic approach

    Directory of Open Access Journals (Sweden)

    Eldarova Ruzanna Alievna

    2016-06-01

    Full Text Available The extent to which images of nature are represented in literary texts, and their links to the mind of the author, the hero, and the reader, can be considered one of the most important sources for identifying the parameters of the national picture of the world and the individual author's transformation of its components. Research that identifies patterns in the functioning of linguacultural spaces in texts can yield new results for the linguistic picture of the world of an ethnic group, because literary texts reflect both archetypal, stereotyped images peculiar to a linguistic culture and ethnic group as a whole and individual authorial images, which characterize a particular linguistic identity and its conception of the world. The cognitive paradigm of modern linguistics, anthropocentric in nature, allows culture to be considered as a process that models language, which naturally highlights the problem of the linguacultural predetermination of value. Of great importance in this regard is the concept of space as a linguacultural cognitive model of objective reality. The cognitive-pragmatic potential of a literary text is deepened by the introduction of descriptions of nature, since they always realize the ethical, aesthetic, and intellectual abilities of the creative subject.

  17. Resins and Gums in Historical Iatrosophia Texts from Cyprus - A Botanical and Medico-pharmacological Approach.

    Science.gov (United States)

    Lardos, Andreas; Prieto-Garcia, José; Heinrich, Michael

    2011-01-01

    This study explores historical iatrosophia texts from Cyprus from a botanical and medico-pharmacological point of view focusing on remedies containing resins and gums. The iatrosophia are a genre of Greek medical literature of Byzantine origin and can be described as medicine handbooks which serve as therapeutic repositories containing recipes or advice. To extract and analyze information on plant usage in such sources - which are largely unedited texts and so far have not been translated - we investigate (i) the relationship of the iatrosophia to Dioscorides' De Materia Medica as well as historic pharmaceutical books or standard texts on modern phytotherapy and (ii) the validity of the remedies by comparing them to modern scientific data on reported biological activities. In the six texts investigated 27 substances incorporating plant exudates are mentioned. They are obtained from over 43 taxa of higher plants and in particular are used to treat dermatological, gastrointestinal, and respiratory tract conditions. The comparison to historic pharmaceutical books and phytotherapy texts reflects the gradual decline of the use of plant exudates in Western medicine. While remarkable parallels to Dioscorides' text exist, the non-Dioscoridean influence suggests a complex pattern of knowledge exchange. Overall, this resulted in an integration of knowledge from so far poorly understood sources. The comparison with bioscientific data reveals a fragmentary picture and highlights the potential of these unexplored substances and their uses. Where relevant bioscientific data are available, we generally found a confirmation. This points to a largely rational use of the associated remedies. Taken together, the iatrosophia are a valuable resource for ethnopharmacological and natural product research. Most importantly they contribute to the understanding of the development of herbal medicines in the (Eastern) Mediterranean and Europe.

  18. Using Short Texts to Teach English as Second Language: An Integrated Approach

    Science.gov (United States)

    Kembo, Jane

    2016-01-01

    The teacher of English Language is often hard pressed to find interesting and authentic ways to present language to target second language speakers. While language can be taught and learned, part of it must be acquired and short texts provide powerful tools for doing so and reinforcing what has been taught/learned. This paper starts from research,…

  19. Illustrations as Adjuncts to Prose: A Text-Appropriate Processing Approach.

    Science.gov (United States)

    Waddill, Paula J.; And Others

    1988-01-01

    The effects of pictorial illustrations on memory for text were studied in 144 college students. Two experiments indicated that illustrations serve a supplementary function; adjunct pictures alone, without special processing instructions, do not help learners encode information that is not normally encoded in the first place. (SLD)

  20. Introducing the interpretation of medieval Hindi texts into the Hindi curriculum: An alternative approach

    Czech Academy of Sciences Publication Activity Database

    Strnad, Jaroslav

    2010-01-01

    Vol. 9, No. 2 (2010), pp. 25-38. ISSN 1648-2662. [Regional Conference on Indology for Central and Eastern Europe - New Perspectives in Education about India /2./. Vilnijus, 24.8.2006-26.8.2006] Institutional research plan: CEZ:AV0Z90210515. Keywords: Hindi * texts * analysis. Subject RIV: AI - Linguistics

  1. Mentor Texts and the Coding of Academic Writing Structures: A Functional Approach

    Science.gov (United States)

    Escobar Alméciga, Wilder Yesid; Evans, Reid

    2014-01-01

    The purpose of the present pedagogical experience was to address the English language writing needs of university-level students pursuing a degree in bilingual education with an emphasis in the teaching of English. Using mentor texts and coding academic writing structures, an instructional design was developed to directly address the shortcomings…

  2. "What is relevant in a text document?": An interpretable machine learning approach.

    Directory of Open Access Journals (Sweden)

    Leila Arras

    Full Text Available Text documents can be described by a number of abstract concepts such as semantic category, writing style, or sentiment. Machine learning (ML) models have been trained to automatically map documents to these abstract concepts, allowing one to annotate very large text collections, more than could be processed by a human in a lifetime. Besides predicting the text's category very accurately, it is also highly desirable to understand how and why the categorization process takes place. In this paper, we demonstrate that such understanding can be achieved by tracing the classification decision back to individual words using layer-wise relevance propagation (LRP), a recently developed technique for explaining predictions of complex non-linear classifiers. We train two word-based ML models, a convolutional neural network (CNN) and a bag-of-words SVM classifier, on a topic categorization task and adapt the LRP method to decompose the predictions of these models onto words. Resulting scores indicate how much individual words contribute to the overall classification decision. This enables one to distill relevant information from text documents without an explicit semantic information extraction step. We further use the word-wise relevance scores for generating novel vector-based document representations which capture semantic information. Based on these document vectors, we introduce a measure of model explanatory power and show that, although the SVM and CNN models perform similarly in terms of classification accuracy, the latter exhibits a higher level of explainability which makes it more comprehensible for humans and potentially more useful for other applications.
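    The idea of decomposing a prediction onto words is easiest to see for a linear bag-of-words model. The sketch below is a simplified illustration and not the paper's code: it assumes a linear score f(x) = w·x + b and assigns each word the relevance w_i · x_i. The weights, bias, and word counts are hypothetical.

```python
def word_relevances(weights, word_counts):
    """Relevance of each word for a linear score f(x) = sum_i w_i * x_i + b.

    Each word receives R_i = w_i * x_i, so the relevances sum to f(x) - b.
    """
    return {word: weights.get(word, 0.0) * count for word, count in word_counts.items()}


def linear_score(weights, bias, word_counts):
    """Classifier output for one document given its bag-of-words counts."""
    return sum(weights.get(w, 0.0) * c for w, c in word_counts.items()) + bias


# Hypothetical weights of a "sports" class and one document's word counts.
sports_weights = {"goal": 1.5, "match": 0.5, "election": -2.0}
sports_bias = -0.25
document = {"goal": 2, "match": 1, "the": 3}

relevances = word_relevances(sports_weights, document)
score = linear_score(sports_weights, sports_bias, document)
```

    Here "goal" carries most of the positive evidence, while "the" has no learned weight and receives zero relevance. For the CNN described in the abstract, relevance is instead propagated backwards layer by layer, which this sketch does not attempt.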

  3. A Novel Approach in Text-Independent Speaker Recognition in Noisy Environment

    Directory of Open Access Journals (Sweden)

    Nona Heydari Esfahani

    2014-10-01

    Full Text Available In this paper, robust text-independent speaker recognition is taken into consideration. The proposed method operates on manually silence-removed utterances that are segmented into smaller speech units containing a few phones and at least one vowel. The segments are the basic units for long-term feature extraction. Sub-band entropy is directly extracted in each segment. A robust vowel detection method is then applied to each segment to separate a high-energy vowel that is used as the unit for pitch frequency and formant extraction. By applying a clustering technique, the extracted short-term features, namely MFCC coefficients, are combined with the long-term features. Experiments using an MLP classifier show that the average speaker recognition accuracy is 97.33% for clean speech and 61.33% in a noisy environment at -2 dB SNR, which shows an improvement compared to other conventional methods.

  4. Understanding disciplinary vocabularies using a full-text enabled domain-independent term extraction approach.

    Science.gov (United States)

    Yan, Erjia; Williams, Jake; Chen, Zheng

    2017-01-01

    Publication metadata help deliver rich analyses of scholarly communication. However, research concepts and ideas are more effectively expressed through unstructured fields such as full texts. Thus, the goals of this paper are to employ a full-text enabled method to extract terms relevant to disciplinary vocabularies, and through them, to understand the relationships between disciplines. This paper uses an efficient, domain-independent term extraction method to extract disciplinary vocabularies from a large multidisciplinary corpus of PLoS ONE publications. It finds a power-law pattern in the frequency distributions of terms present in each discipline, indicating a semantic richness potentially sufficient for further study and advanced analysis. The salient relationships amongst these vocabularies become apparent in application of a principal component analysis. For example, Mathematics and Computer and Information Sciences were found to have similar vocabulary use patterns along with Engineering and Physics; while Chemistry and the Social Sciences were found to exhibit contrasting vocabulary use patterns along with the Earth Sciences and Chemistry. These results have implications to studies of scholarly communication as scholars attempt to identify the epistemological cultures of disciplines, and as a full text-based methodology could lead to machine learning applications in the automated classification of scholarly work according to disciplinary vocabularies.

  5. Stopping Antidepressants and Anxiolytics as Major Concerns Reported in Online Health Communities: A Text Mining Approach.

    Science.gov (United States)

    Abbe, Adeline; Falissard, Bruno

    2017-10-23

    Internet is a particularly dynamic way to quickly capture the perceptions of a population in real time. Complementary to traditional face-to-face communication, online social networks help patients to improve self-esteem and self-help. The aim of this study was to use text mining on material from an online forum exploring patients' concerns about treatment (antidepressants and anxiolytics). Concerns about treatment were collected from discussion titles in patients' online community related to antidepressants and anxiolytics. To examine the content of these titles automatically, we used text mining methods, such as word frequency in a document-term matrix and co-occurrence of words using a network analysis. It was thus possible to identify topics discussed on the forum. The forum included 2415 discussions on antidepressants and anxiolytics over a period of 3 years. After a preprocessing step, the text mining algorithm identified the 99 most frequently occurring words in titles, among which were escitalopram, withdrawal, antidepressant, venlafaxine, paroxetine, and effect. Patients' concerns were related to antidepressant withdrawal, the need to share experience about symptoms, effects, and questions on weight gain with some drugs. Patients' expression on the Internet is a potential additional resource in addressing patients' concerns about treatment. Patient profiles are close to that of patients treated in psychiatry. ©Adeline Abbe, Bruno Falissard. Originally published in JMIR Mental Health (http://mental.jmir.org), 23.10.2017.
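    The two text mining steps named in this abstract, word frequency from a document-term matrix and word co-occurrence for a network analysis, can be sketched as follows. This is only an illustration: the stop word list, the tokenizer, and the three discussion titles are hypothetical and are not taken from the study.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stop word list (an assumption, not the study's list).
STOPWORDS = {"a", "the", "of", "and", "to", "on", "after", "with", "my"}


def tokenize(title):
    """Lowercase a title, keep runs of ASCII letters, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", title.lower()) if w not in STOPWORDS]


def word_frequencies(titles):
    """Number of titles containing each word (column sums of a binary document-term matrix)."""
    freq = Counter()
    for title in titles:
        freq.update(set(tokenize(title)))
    return freq


def cooccurrences(titles):
    """Count unordered word pairs that appear in the same title.

    Each pair is an edge of the co-occurrence network; the count is its weight.
    """
    pairs = Counter()
    for title in titles:
        words = sorted(set(tokenize(title)))
        pairs.update(combinations(words, 2))
    return pairs


# Hypothetical discussion titles.
titles = [
    "Withdrawal after stopping escitalopram",
    "Escitalopram withdrawal symptoms",
    "Weight gain with paroxetine",
]
freq = word_frequencies(titles)
edges = cooccurrences(titles)
```

    Sorting the words before pairing keeps each edge in a single canonical order, so ("escitalopram", "withdrawal") is counted once per title regardless of word order in the title.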

  6. A Text Mining Approach for Extracting Lessons Learned from Project Documentation: An Illustrative Case Study

    Directory of Open Access Journals (Sweden)

    Benjamin Matthies

    2017-12-01

    Full Text Available Lessons learned are important building blocks for continuous learning in project-based organisations. Nonetheless, the practical reality is that lessons learned are often not consistently reused for organisational learning. Two problems are commonly described in this context: information overload and the lack of procedures and methods for the assessment and implementation of lessons learned. This paper addresses these problems, and appropriate solutions are combined in a systematic lessons learned process. Latent Dirichlet Allocation is presented to solve the first problem. Regarding the second problem, established risk management methods are adapted. The entire lessons learned process is demonstrated in a practical case study.

  7. Review Essay: Autobiography as Scientific Text: A Dialectical Approach to the Role of Experience

    Directory of Open Access Journals (Sweden)

    Wolff-Michael Roth

    2004-01-01

    Full Text Available The Sneaky Kid and Its Aftermath is a first-person account of the (sexual) intimacy between a researcher (Harry WOLCOTT) and his research participant (Brad, the sneaky kid). Two years after the events, the sneaky kid returned with a vengeance, beating up the researcher and burning down his house. Autobiographical texts may lead readers to confuse author and literary figure of the same name. Any critique of the protagonist potentially can be read as a critique of the author and therefore as an ad hominem attack; to mark the difference, I propose to differentiate the two for the purpose of deconstruction (here, Harry WOLCOTT and Wally Haircut, respectively). In my reading, the relationship between Wally Haircut and Brad is highly unsymmetrical in terms of FOUCAULT's knowledge/power concept and BOURDIEU's analyses of the relations between economic, social, cultural, and symbolic capital. Wally Haircut, I will argue in part, had everything to gain in these dimensions and his research participant, the "sneaky kid," had everything to lose. This is just how it turned out. Unfortunately, Harry WOLCOTT failed to draw on existing social theory to provide a reasonable explanation of the events. I conclude with a "two thumbs down." URN: urn:nbn:de:0114-fqs040199

  8. A scalable machine-learning approach to recognize chemical names within large text databases

    Directory of Open Access Journals (Sweden)

    Wren Jonathan D

    2006-09-01

    Full Text Available Motivation: The use or study of chemical compounds permeates almost every scientific field and in each of them, the amount of textual information is growing rapidly. There is a need to accurately identify chemical names within text for a number of informatics efforts such as database curation, report summarization, tagging of named entities and keywords, or the development/curation of reference databases. Results: A first-order Markov Model (MM) was evaluated for its ability to distinguish chemical names from words, yielding ~93% recall in recognizing chemical terms and ~99% precision in rejecting non-chemical terms on smaller test sets. However, because total false-positive events increase with the number of words analyzed, the scalability of name recognition was measured by processing 13.1 million MEDLINE records. The method yielded precision ranges from 54.7% to 100%, depending upon the cutoff score used, averaging 82.7% for approximately 1.05 million putative chemical terms extracted. Extracted chemical terms were analyzed to estimate the number of spelling variants per term, which correlated with the total number of times the chemical name appeared in MEDLINE. This variability in term construction was found to affect both information retrieval and term mapping when using PubMed and Ovid.

  9. [Hygienic Assessment of Educational Texts: Methodical Approaches and Evaluation of Difficulties for Children of Secondary Textbooks].

    Science.gov (United States)

    Kuchma, V R; Tkachuk, E A

    2015-01-01

    The understandability and readability of the text are significant indicators in the evaluation of textbooks. The aim of the study was to provide a rationale for improving the readability and understandability of textbooks. The material comprised 60 modern textbooks for the 5th-11th classes on History, Physics and Biology, and 23 textbooks from editions of the 1960-1980's. The Flesch index was used to assess readability and the Fogh index to evaluate understandability. The readability and understandability of texts in the textbooks of the 1960-1980's and in the modern editions show no differences, indicating the same complexity of old and modern textbooks for students. The indicator of understandability of textbooks for primary classes corresponds to the age norm and is 4.4±0.2 points. The indicator of readability for these books is below the age norm at 53.8±2.9 points, which increases the physiological cost of educational activities for children of primary school age. The readability and understandability of school textbooks for children are a significant factor in the intensity of training activities and can be objectively assessed by the Flesch and Fogh indices, which makes these indices appropriate for an objective hygienic assessment of the tension of educational activities for children. The main direction for optimizing the tension of educational activity is to reduce the intellectual and emotional loads on children by making textbooks easier to read through compliance with the age peculiarities of students.
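
    The two indices named in this abstract can be sketched in a few lines. The sketch below uses the classic English-language Flesch Reading Ease coefficients and assumes that the "Fogh index" refers to the Gunning fog index; the abstract does not say which variants or language-specific coefficients the authors used, so both functions and their parameter names are illustrative only.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease with the classic English coefficients.

    Higher scores mean easier text. Coefficients adapted to other
    languages differ; these are an assumption, not the study's values.
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Gunning fog index (assumed here to be the abstract's 'Fogh index').

    complex_words counts words of three or more syllables.
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.4 * ((words / sentences) + 100.0 * (complex_words / words))
```

    Both functions take pre-computed counts, so syllable counting, which is language-dependent, stays outside the sketch.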

  10. Systematic analysis of molecular mechanisms for HCC metastasis via text mining approach.

    Science.gov (United States)

    Zhen, Cheng; Zhu, Caizhong; Chen, Haoyang; Xiong, Yiru; Tan, Junyuan; Chen, Dong; Li, Jin

    2017-02-21

    To systematically explore the molecular mechanism for hepatocellular carcinoma (HCC) metastasis and identify regulatory genes with text mining methods. Genes with highest frequencies and significant pathways related to HCC metastasis were listed. A handful of proteins such as EGFR, MDM2, TP53 and APP, were identified as hub nodes in PPI (protein-protein interaction) network. Compared with unique genes for HBV-HCCs, genes particular to HCV-HCCs were fewer, but may participate in more extensive signaling processes. VEGFA, PI3KCA, MAPK1, MMP9 and other genes may play important roles in multiple phenotypes of metastasis. Genes in abstracts of the HCC-metastasis literature were identified. Word frequency analysis, KEGG pathway and PPI network analysis were performed. Then co-occurrence analysis between genes and metastasis-related phenotypes was carried out. Text mining is effective for revealing potential regulators or pathways, but its purpose should be specific, and the combination of various methods will be more useful.

  11. Parenthetical Cohesive Explicitness: A Linguistic Approach for a Modified Translation of the Quranic Text

    Directory of Open Access Journals (Sweden)

    Mohammad Amin Hawamdeh

    2015-09-01

    Full Text Available Motivated by the severe criticism the Hilali and Khan (HK) Translation of the Holy Quran has received for its too many parenthetical insertions, this study aimed at linguistically realizing how such added pieces of information could be for necessary cohesive explicitness or worthless redundant interpolation. Methodically, the HK translation of the first 8 verses of Chapter 18 (The Cave, Surah Al Kahf) of the Holy Quran was selected to be a subject material. A number of 15 instances of explicitation put in parentheses were encountered; they were found to be based upon 23 cohesive (grammatical/lexical) relationships and, hence, to be considered as ones of cohesive explicitness. Eventually, such an analysis could be of use for modifying the available translations of the Holy Quran.

  12. Evaluating Approaches to Rendering Braille Text on a High-Density Pin Display.

    Science.gov (United States)

    Morash, Valerie S; Russomanno, Alexander; Gillespie, R Brent; OModhrain, Sile

    2017-10-13

    Refreshable displays for tactile graphics are typically composed of pins that have smaller diameters and spacing than standard braille dots. We investigated configurations of high-density pins to form braille text on such displays using non-refreshable stimuli produced with a 3D printer. Normal dot braille (diameter 1.5 mm) was compared to high-density dot braille (diameter 0.75 mm) wherein each normal dot was rendered by high-density simulated pins alone or in a cluster of pins configured in a diamond, X, or square; and to "blobs" that could result from covering normal braille and high-density multi-pin configurations with a thin membrane. Twelve blind participants read MNREAD sentences displayed in these conditions. For high-density simulated pins, single pins were as quickly and easily read as normal braille, but diamond, X, and square multi-pin configurations were slower and/or harder to read than normal braille. We therefore conclude that as long as center-to-center dot spacing and dot placement is maintained, the dot diameter may be open to variability for rendering braille on a high density tactile display.

  13. Injury risk in Danish youth and senior elite handball using a new SMS text messages approach.

    Science.gov (United States)

    Moller, Merete; Attermann, Jorn; Myklebust, Grethe; Wedderkopp, Niels

    2012-06-01

    To assess the injury incidence in elite handball, and if gender and previous injuries are risk factors for new injuries. Cohort study of 517 male and female elite handball players (age groups under (u)16, u-18 and senior). Participants completed a web survey establishing injury history, demographic information and sports experience, and provided weekly reports of time-loss injuries and handball exposure for 31 weeks by short message service text messaging (SMS). Injuries were further classified by telephone interview. The weekly response rate ranged from 85% to 90% illustrating the promise of the SMS system as a tool in injury surveillance. Of 448 reported injuries, 165 injuries (37%) were overuse injuries and 283 (63%) traumatic injuries. Knee (19%) and ankle (29%) were the most common traumatic injuries. The injury incidence during match play was 23.5 (95% CI 17.8 to 30.4), 15.1 (95% CI 9.7 to 22.2), 11.1 (95% CI 7.0 to 16.6) injuries per 1000 match hours among senior, u-18 and u-16 players, respectively. U-18 male players had an overall 1.76 (95% CI 1.10 to 2.80) times higher risk of injury compared to females. Having had two or more previous injuries causing absence from handball for more than 4 weeks increased the risk of new injury in the u-16 group (IRR: 1.79 (95% CI 1.03 to 3.11)-2.23 (95% CI 1.22 to 4.10)). The incidence of time-loss injuries in elite handball was higher during match play than previously reported in recreational handball. Previous injuries were a risk factor for new injuries among u-16 players. Male players had a significantly higher injury rate in the u-18 group.

  14. Interdisciplinarity in translation teaching: competence-based education, translation task-based approach, context-based text typology

    Directory of Open Access Journals (Sweden)

    Edelweiss Vitol Gysel

    2017-05-01

    Full Text Available In the context of competence-based teaching, this paper draws upon the model of Translation Competence (TC) put forward by the PACTE group (2003) to establish a dialogue between cognitive-constructivist paradigms for translation teaching and the model of the Context-based Text Typology (MATTHIESSEN et al., 2007). In this theoretical environment, it proposes a model for the design of a Teaching Unit (TU) for the development of the bilingual competence in would-be-translators. To this end, it explores translation as a cognitive, communicative and textual activity (HURTADO ALBIR, 2011) and considers its teaching from the translation task-based approach (HURTADO ALBIR, 1999). This approach is illustrated through the practical example of the design of a TU elaborated for the subject ‘Introduction to Specialized Translation’, part of the curricular grid of the program ‘Secretariado Executivo’ at Universidade Federal de Santa Catarina. Aspects such as the establishment of learning objectives and their alignment with the translation tasks composing the TU are addressed for this specific pedagogical situation. We argue for the development of textual competences by means of the acquisition of strategies derived from the Context-based Text Typology to solve problems arising from the translation of different text types and contextual configurations.

  15. Interword and intraword pause threshold in the writing of texts by children and adolescents : a methodological approach

    Directory of Open Access Journals (Sweden)

    Florence Chenu

    2014-03-01

    Full Text Available Writing words in real life involves setting objectives, imagining a recipient, translating ideas into linguistic forms, managing grapho-motor gestures, etc. Understanding writing requires observation of the processes as they occur in real time. Analysis of pauses is one of the preferred methods for accessing the dynamics of writing and is based on the idea that pauses are behavioral correlates of cognitive processes. However, there is a need to clarify what we are observing when studying pause phenomena, as we will argue in the first section. This taken into account, the study of pause phenomena can be considered following two approaches. A first approach, driven by temporality, would define a threshold and observe where pauses, i.e. periods of scriptural inactivity, occur. A second approach, linguistically driven, would define structural units and look for scriptural inactivity at the boundaries of these units or within these units. Taking a temporally driven approach, we present two methods which aim at the automatic identification of scriptural inactivity which is most likely not attributable to grapho-motor management in texts written by children and adolescents using digitizing tablets in association with Eye and Pen© (Chesnet & Alamargot, 2005). The first method is purely statistical and is based on the idea that the distribution of pauses exhibits different Gaussian components, each of them corresponding to a different type of pause. After having reviewed the limits of this statistical method, we present a second method based on writing dynamics which attempts to identify breaking points in the writing dynamics rather than relying only on pause duration. This second method needs to be refined to overcome the fact that calculation is impossible when there is insufficient data, which is often the case when working with young scriptors.
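
    The temporally driven approach described above (fix a threshold, then report where scriptural inactivity exceeds it) can be illustrated with a minimal sketch. The input format (pen-down intervals in milliseconds) and the function name are assumptions made for this example; the abstract gives neither a data format nor a threshold value, and this is not the authors' Gaussian-component or writing-dynamics method.

```python
from typing import List, Tuple

Stroke = Tuple[int, int]  # (pen-down start, pen-down end), in milliseconds


def find_pauses(strokes: List[Stroke], threshold_ms: int) -> List[Tuple[int, int, int]]:
    """Return (pause_start, pause_end, duration) for every gap between
    consecutive strokes that lasts at least threshold_ms.

    Strokes are assumed to be sorted by start time and non-overlapping.
    """
    pauses = []
    for (_, prev_end), (next_start, _) in zip(strokes, strokes[1:]):
        gap = next_start - prev_end
        if gap >= threshold_ms:
            pauses.append((prev_end, next_start, gap))
    return pauses


if __name__ == "__main__":
    # Hypothetical recording: two short gaps and one long one.
    recording = [(0, 100), (150, 300), (2500, 2600), (2650, 2700)]
    print(find_pauses(recording, threshold_ms=2000))
```

    Whether a reported pause falls between words or inside a word would then be decided by aligning the stroke intervals with the written text, which is outside this sketch.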

  16. Visualization and Analysis of a Cardio Vascular Disease and MUPP1-related Biological Network combining Text Mining and Data Warehouse Approaches

    Directory of Open Access Journals (Sweden)

    Sommer Björn

    2010-03-01

    Full Text Available Detailed investigation of socially important diseases with modern experimental methods has resulted in the generation of large volume of valuable data. However, analysis and interpretation of this data needs application of efficient computational techniques and systems biology approaches. In particular, the techniques allowing the reconstruction of associative networks of various biological objects and events can be useful. In this publication, the combination of different techniques to create such a network associated with an abstract cell environment is discussed in order to gain insights into the functional as well as spatial interrelationships. It is shown that experimentally gained knowledge enriched with data warehouse content and text mining data can be used for the reconstruction and localization of a cardiovascular disease developing network beginning with MUPP1/MPDZ (multi-PDZ domain protein).

  17. Implicit prosody mining based on the human eye image capture technology

    Science.gov (United States)

    Gao, Pei-pei; Liu, Feng

    2013-08-01

    Eye tracking has become one of the main methods for analyzing recognition issues in human-computer interaction, and human eye image capture is its key problem. Building on further research, a new human-computer interaction method is introduced to enrich the forms of speech synthesis. We propose a method of Implicit Prosody mining based on human eye image capture technology that extracts parameters from images of the eyes during reading, uses them to control and drive prosody generation in speech synthesis, and establishes a prosodic model with high simulation accuracy. The duration model is a key issue for prosody generation. For the duration model, this paper puts forward a new idea: obtaining the gaze duration of the eyes during reading from eye image capture, and synchronously controlling this duration and the pronunciation duration in speech synthesis. The movement of the eyes during reading is a comprehensive, multi-factor interactive process involving gaze, twitching and backsight. Therefore, how to extract the appropriate information from images of the eyes needs to be considered, and the gaze regularity of the eyes needs to be obtained as a reference for modeling. Based on an analysis of three current eye movement control models and the characteristics of Implicit Prosody reading, the relative independence between the speech processing system for text and the eye movement control system is discussed. It was shown that, under the same text familiarity condition, the gaze duration of the eyes during reading and the duration of internal voice pronunciation are synchronous. An eye gaze duration model based on the prosodic structure of the Chinese language is presented to replace previous methods of machine learning and probability forecasting, to obtain readers' real internal reading rhythm and to synthesize voice with personalized rhythm. This research will enrich the forms of human-computer interaction, and will have practical significance and application prospects in terms of

  18. Computing symmetrical strength of N-grams: a two pass filtering approach in automatic classification of text documents.

    Science.gov (United States)

    Agnihotri, Deepak; Verma, Kesari; Tripathi, Priyanka

    2016-01-01

    The contiguous sequences of the terms (N-grams) in the documents are symmetrically distributed among different classes. The symmetrical distribution of the N-Grams raises uncertainty about which class an N-Gram belongs to. In this paper, we focused on the selection of the most discriminating N-Grams by reducing the effects of symmetrical distribution. In this context, a new text feature selection method named the symmetrical strength of the N-Grams (SSNG) is proposed using a two pass filtering based feature selection (TPF) approach. Initially, in the first pass of the TPF, the SSNG method chooses various informative N-Grams from the entire extracted N-Grams of the corpus. Subsequently, in the second pass the well-known Chi Square (χ²) method is used to select a few of the most informative N-Grams. Further, to classify the documents, two standard classifiers, Multinomial Naive Bayes and Linear Support Vector Machine, have been applied on ten standard text data sets. In most of the datasets, the experimental results state that the performance and success rate of the SSNG method using the TPF approach are superior to the state-of-the-art methods viz. Mutual Information, Information Gain, Odds Ratio, Discriminating Feature Selection and χ².
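
    The second pass relies on the standard χ² statistic for a term and a class. A minimal sketch of that statistic on a 2x2 document-count table is given below; the function and argument names are illustrative, and the SSNG scoring used in the first pass is not reproduced because the abstract does not define it.

```python
def chi_square(in_class_with: int, out_class_with: int,
               in_class_without: int, out_class_without: int) -> float:
    """Chi-square statistic for one N-gram and one class.

    Arguments are document counts:
      in_class_with      docs of the class that contain the N-gram
      out_class_with     docs of other classes that contain it
      in_class_without   docs of the class that do not contain it
      out_class_without  docs of other classes that do not contain it
    """
    a, b, c, d = in_class_with, out_class_with, in_class_without, out_class_without
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # a zero marginal carries no information
    return n * (a * d - b * c) ** 2 / denominator


if __name__ == "__main__":
    # An N-gram spread evenly over both classes scores 0.
    print(chi_square(25, 25, 25, 25))
    # An N-gram concentrated in one class scores high.
    print(chi_square(40, 10, 10, 40))
```

    An N-gram that is distributed symmetrically across classes gets a score of zero, which is the kind of feature the abstract describes as uninformative.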

  19. Personality and Education Mining based Job Advisory System

    Directory of Open Access Journals (Sweden)

    Rajendra S. Choudhary

    2014-09-01

    Full Text Available Every job demands an employee with some specific qualities in addition to the basic educational qualification. For example, an introverted person cannot be a good leader despite a very good academic qualification. Thinking and logical ability are required for a person to be a successful software engineer. So, the aim of this paper is to present a novel approach for advising an ideal job to the job seeker while considering both his personality traits and educational qualification. Very well-known theories of personality, like the MBTI indicator and OCEAN theory, are used for personality mining. For education mining, a score based system is used. The score based system captures the information from attributes like most scoring subject, dream job etc. After personality mining, the resultant values are coalesced with the information extracted from education mining. And finally, the most suited jobs, in terms of personality and educational qualification, are recommended to the job seekers. The experiment is conducted on students who have earned an engineering degree in the field of computer science, information technology and electronics. Nevertheless, the same architecture can easily be extended to other educational degrees also. To the best of the author’s knowledge, this is the first e-job advisory system that recommends the job best suited to one’s personality using both MBTI and OCEAN theory.

  20. Development and testing of a text-mining approach to analyse patients' comments on their experiences of colorectal cancer care.

    Science.gov (United States)

    Wagland, Richard; Recio-Saucedo, Alejandra; Simon, Michael; Bracher, Michael; Hunt, Katherine; Foster, Claire; Downing, Amy; Glaser, Adam; Corner, Jessica

    2016-08-01

    Quality of cancer care may greatly impact on patients' health-related quality of life (HRQoL). Free-text responses to patient-reported outcome measures (PROMs) provide rich data but analysis is time and resource-intensive. This study developed and tested a learning-based text-mining approach to facilitate analysis of patients' experiences of care and develop an explanatory model illustrating impact on HRQoL. Respondents to a population-based survey of colorectal cancer survivors provided free-text comments regarding their experience of living with and beyond cancer. An existing coding framework was tested and adapted, which informed learning-based text mining of the data. Machine-learning algorithms were trained to identify comments relating to patients' specific experiences of service quality, which were verified by manual qualitative analysis. Comparisons between coded retrieved comments and a HRQoL measure (EQ5D) were explored. The survey response rate was 63.3% (21 802/34 467), of which 25.8% (n=5634) participants provided free-text comments. Of retrieved comments on experiences of care (n=1688), over half (n=1045, 62%) described positive care experiences. Most negative experiences concerned a lack of post-treatment care (n=191, 11% of retrieved comments) and insufficient information concerning self-management strategies (n=135, 8%) or treatment side effects (n=160, 9%). Associations existed between HRQoL scores and coded algorithm-retrieved comments. Analysis indicated that the mechanism by which service quality impacted on HRQoL was the extent to which services prevented or alleviated challenges associated with disease and treatment burdens. Learning-based text mining techniques were found useful and practical tools to identify specific free-text comments within a large dataset, facilitating resource-efficient qualitative analysis. This method should be considered for future PROM analysis to inform policy and practice. Study findings indicated that

  1. Analyzing discourse and text complexity for learning and collaborating: a cognitive approach based on natural language processing

    CERN Document Server

    Dascălu, Mihai

    2014-01-01

    With the advent and increasing popularity of Computer Supported Collaborative Learning (CSCL) and e-learning technologies, the need of automatic assessment and of teacher/tutor support for the two tightly intertwined activities of comprehension of reading materials and of collaboration among peers has grown significantly. In this context, a polyphonic model of discourse derived from Bakhtin’s work as a paradigm is used for analyzing both general texts and CSCL conversations in a unique framework focused on different facets of textual cohesion. As specificity of our analysis, the individual learning perspective is focused on the identification of reading strategies and on providing a multi-dimensional textual complexity model, whereas the collaborative learning dimension is centered on the evaluation of participants’ involvement, as well as on collaboration assessment. Our approach based on advanced Natural Language Processing techniques provides a qualitative estimation of the learning process and enhance...

  2. Text mining and natural language processing approaches for automatic categorization of lay requests to web-based expert forums.

    Science.gov (United States)

    Himmel, Wolfgang; Reincke, Ulrich; Michelmann, Hans Wilhelm

    2009-07-22

    Both healthy and sick people increasingly use electronic media to obtain medical information and advice. For example, Internet users may send requests to Web-based expert forums, or so-called "ask the doctor" services. To automatically classify lay requests to an Internet medical expert forum using a combination of different text-mining strategies. We first manually classified a sample of 988 requests directed to an involuntary childlessness forum on the German website "Rund ums Baby" ("Everything about Babies") into one or more of 38 categories belonging to two dimensions ("subject matter" and "expectations"). After creating start and synonym lists, we calculated the average Cramer's V statistic for the association of each word with each category. We also used principal component analysis and singular value decomposition as further text-mining strategies. With these measures we trained regression models and determined, on the basis of best regression models, for any request the probability of belonging to each of the 38 different categories, with a cutoff of 50%. Recall and precision of a test sample were calculated as a measure of quality for the automatic classification. According to the manual classification of 988 documents, 102 (10%) documents fell into the category "in vitro fertilization (IVF)," 81 (8%) into the category "ovulation," 79 (8%) into "cycle," and 57 (6%) into "semen analysis." These were the four most frequent categories in the subject matter dimension (consisting of 32 categories). The expectation dimension comprised six categories; we classified 533 documents (54%) as "general information" and 351 (36%) as a wish for "treatment recommendations." The generation of indicator variables based on the chi-square analysis and Cramer's V proved to be the best approach for automatic classification in about half of the categories. In combination with the two other approaches, 100% precision and 100% recall were realized in 18 (47%) out of the 38
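
    The evaluation step described above (assign a request to a category when its predicted probability reaches the 50% cutoff, then compute precision and recall on a test sample) can be sketched as follows. The function name and the toy inputs are hypothetical; the regression models that would produce the probabilities are not shown.

```python
from typing import Sequence, Tuple


def precision_recall(probabilities: Sequence[float],
                     labels: Sequence[bool],
                     cutoff: float = 0.5) -> Tuple[float, float]:
    """Precision and recall for one category.

    A request is predicted to belong to the category when its
    probability is at least `cutoff`. Returns (precision, recall);
    a ratio with an empty denominator is reported as 0.0.
    """
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have equal length")
    predicted = [p >= cutoff for p in probabilities]
    tp = sum(1 for pred, y in zip(predicted, labels) if pred and y)
    fp = sum(1 for pred, y in zip(predicted, labels) if pred and not y)
    fn = sum(1 for pred, y in zip(predicted, labels) if not pred and y)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical model outputs and manual labels for four requests.
    print(precision_recall([0.9, 0.6, 0.4, 0.2], [True, False, True, False]))
```

    With 38 categories, the same function would be called once per category on that category's probabilities and manual labels.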

  3. Dramatis persona in poetical and practical approach of dramatic text in 17th century French theory of theatre

    Directory of Open Access Journals (Sweden)

    Michał Bajer

    2009-01-01

    Full Text Available The idea of the dramatis persona posited by the first French theatre theorists of the Richelieu circle, Jean Chapelain and Jules de la Mesnardiere, emerges as a quite literal implementation of the Aristotelian concepts unfolded in the sixth and fifteenth chapter of his Poetics. In a later period, the third of the aforementioned group of authors, François Hédelin d’Aubignac, dismisses the Aristotelian categories, erecting his theory upon the elements adopted from the Roman theory of rhetoric. The analysis of the Persona in classical drama theory allows us to reconstruct the relation between these two 17th century dramatic approaches. The former is the traditional perspective relying on the postulations of the Aristotelian theory. The latter, which is a practical grasp, is new to the 17th century’s dramatic mindset, and was formulated by abbé d’Aubignac. Whereas the axis of poetics is the structural analysis of a work of art, it is the functioning of that work of art in the theatrical process of communication between the stage and the audience that remains the core interest of the practical approach. In this process, the rhetorical effect of presence of the dramatis persona should be created in the imagination of the spectator-auditor. The subject of analysis is common to both perspectives and the discrepancies concern merely aspects of its description. Therefore poetics and practice are neither competitive nor mutually exclusive, but can both legitimately coexist in the description of the very same work of art.

  4. Personality and Education Mining based Job Advisory System

    OpenAIRE

    Rajendra S. Choudhary; Rajul Kukreja; Nitika Jain; Shikha Jain

    2014-01-01

    Every job demands an employee with some specific qualities in addition to the basic educational qualification. For example, an introvert person cannot be a good leader despite of a very good academic qualification. Thinking and logical ability is required for a person to be a successful software engineer. So, the aim of this paper is to present a novel approach for advising an ideal job to the job seeker while considering his personality trait and educational qualification both. Very well-kno...

  5. Study on the Method of Association Rules Mining Based on Genetic Algorithm and Application in Analysis of Seawater Samples

    Directory of Open Access Journals (Sweden)

    Qiuhong Sun

    2014-04-01

    Full Text Available Building on data mining research, this paper studies data mining based on genetic algorithms. The genetic algorithm is briefly introduced, and the two important theories underlying it, the schema (template) theorem and the principle of implicit parallelism, are discussed. Focusing on the application of genetic algorithms to association rule mining, the paper proposes improvements to the structure of the fitness function and to the data encoding; in particular, building on earlier studies, an improved algorithm with adaptive Pc and Pm (crossover and mutation probabilities) is applied to the genetic algorithm, thereby improving its efficiency. Finally, an association rule mining algorithm based on the genetic algorithm is applied to a database of seawater samples and shown to be effective.
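
    The abstract does not state the fitness function. Association rule mining is conventionally scored by support and confidence, and a genetic algorithm for rule mining would typically build its fitness from these two quantities; the sketch below computes them for a single candidate rule. The function names and the toy transactions are assumptions made for illustration, not the paper's definitions.

```python
from typing import Iterable, List, Set


def support(transactions: List[Set[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    wanted = frozenset(items)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if wanted <= t)
    return hits / len(transactions)


def confidence(transactions: List[Set[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Confidence of the rule antecedent -> consequent."""
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    antecedent_support = support(transactions, antecedent)
    if antecedent_support == 0.0:
        return 0.0
    return support(transactions, antecedent | consequent) / antecedent_support


if __name__ == "__main__":
    # Hypothetical transactions; item names are placeholders.
    data = [{"a", "b"}, {"a"}, {"a", "b", "c"}, {"b"}]
    print(support(data, {"a", "b"}), confidence(data, {"a"}, {"b"}))
```

    A fitness value could then be any monotone combination of the two, for example a weighted sum; that choice is left open here because the source does not specify it.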

  6. Data Mining Based on Cloud-Computing Technology

    Directory of Open Access Journals (Sweden)

    Ren Ying

    2016-01-01

    Full Text Available There are performance bottlenecks and scalability problems when a traditional data-mining system is used in cloud computing. In this paper, we present a data-mining platform based on cloud computing. Compared with a traditional data-mining system, this platform is highly scalable, has massive data processing capacities, is service-oriented, and has low hardware cost. This platform can support the design and applications of a wide range of distributed data-mining systems.

  7. Text Mining.

    Science.gov (United States)

    Trybula, Walter J.

    1999-01-01

    Reviews the state of research in text mining, focusing on newer developments. The intent is to describe the disparate investigations currently included under the term text mining and provide a cohesive structure for these efforts. A summary of research identifies key organizations responsible for pushing the development of text mining. A section…

  8. The Use of History of Science Texts in Teaching Science: Two Cases of an Innovative, Constructivist Approach

    Science.gov (United States)

    Koliopoulos, Dimitris; Dossis, Sotiris; Stamoulis, Efthymios

    2007-01-01

    This study proposes an empirical classification of ways to introduce elements of the history of science into science teaching, as well as describing a special way to do so characterized by the introduction of short extracts from historical texts. The aim is to motivate students to participate in problem-solving activities and to transform their…

  9. A Digital Humanities Approach to the History of Science Eugenics Revisited in Hidden Debates by Means of Semantic Text Mining

    OpenAIRE

    Huijnen, Pim; Laan, Fons; de Rijke, Maarten; Pieters, Toine

    2014-01-01

    Comparative historical research on the intensity, diversity and fluidity of public discourses has been severely hampered by the extraordinary task of manually gathering and processing large sets of opinionated data in news media in different countries. At most 50,000 documents have been systematically studied in a single comparative historical project in the subject area of heredity and eugenics. Digital techniques, like the text mining tools WAHSP and BILAND we have developed in two succ...

  10. GRAM-CNN: a deep learning approach with local context for named entity recognition in biomedical text.

    Science.gov (United States)

    Zhu, Qile; Li, Xiaolin; Conesa, Ana; Pereira, Cécile

    2018-05-01

    Best performing named entity recognition (NER) methods for biomedical literature are based on hand-crafted features or task-specific rules, which are costly to produce and difficult to generalize to other corpora. End-to-end neural networks achieve state-of-the-art performance without hand-crafted features and task-specific knowledge in non-biomedical NER tasks. However, in the biomedical domain, using the same architecture does not yield competitive performance compared with conventional machine learning models. We propose a novel end-to-end deep learning approach for biomedical NER tasks that leverages the local contexts based on n-gram character and word embeddings via Convolutional Neural Network (CNN). We call this approach GRAM-CNN. To automatically label a word, this method uses the local information around a word. Therefore, the GRAM-CNN method does not require any specific knowledge or feature engineering and can be theoretically applied to a wide range of existing NER problems. The GRAM-CNN approach was evaluated on three well-known biomedical datasets containing different BioNER entities. It obtained an F1-score of 87.26% on the Biocreative II dataset, 87.26% on the NCBI dataset and 72.57% on the JNLPBA dataset. Those results put GRAM-CNN in the lead of the biological NER methods. To the best of our knowledge, we are the first to apply CNN based structures to BioNER problems. The GRAM-CNN source code, datasets and pre-trained model are available online at: https://github.com/valdersoul/GRAM-CNN. Contact: andyli@ece.ufl.edu or aconesa@ufl.edu. Supplementary data are available at Bioinformatics online.

  11. A preliminary approach to creating an overview of lactoferrin multi-functionality utilizing a text mining method.

    Science.gov (United States)

    Shimazaki, Kei-ichi; Kushida, Tatsuya

    2010-06-01

    Lactoferrin is a multi-functional metal-binding glycoprotein that exhibits many biological functions of interest to many researchers from the fields of clinical medicine, dentistry, pharmacology, veterinary medicine, nutrition and milk science. To date, a number of academic reports concerning the biological activities of lactoferrin have been published and are easily accessible through public data repositories. However, as the literature is expanding daily, this presents challenges in understanding the larger picture of lactoferrin function and mechanisms. In order to overcome the "analysis paralysis" associated with lactoferrin information, we attempted to apply a text mining method to the accumulated lactoferrin literature. To this end, we used the information extraction system GENPAC (provided by Nalapro Technologies Inc., Tokyo). This information extraction system uses natural language processing and text mining technology. This system analyzes the sentences and titles from abstracts stored in the PubMed database, and can automatically extract binary relations that consist of interactions between genes/proteins, chemicals and diseases/functions. We expect that such information visualization analysis will be useful in determining novel relationships among a multitude of lactoferrin functions and mechanisms. We have demonstrated the utilization of this method to find pathways of lactoferrin participation in neovascularization, Helicobacter pylori attack on gastric mucosa, atopic dermatitis and lipid metabolism.

  12. Research design: qualitative, quantitative and mixed methods approaches. Creswell John W. Sage. 320pp. £29. ISBN 0761924426.

    Science.gov (United States)

    2004-09-01

    The second edition of Creswell's book has been significantly revised and updated. The author clearly sets out three approaches to research: quantitative, qualitative and mixed methods. As someone who has used mixed methods in my research, it is refreshing to read a textbook that addresses this. The differences between the approaches are clearly identified and a rationale for using each methodological stance provided.

  13. Texting for Health: An Evaluation of a Population Approach to Type 2 Diabetes Risk Reduction With a Personalized Message.

    Science.gov (United States)

    Khurshid, Anjum; Brown, Lisanne; Mukherjee, Snigdha; Abebe, Nebeyou; Kulick, David

    2015-11-01

    txt4health is an innovative, 14-week, interactive, population-based mobile health program for individuals at risk of type 2 diabetes, developed under the Beacon Community Program in the Greater New Orleans, La., area. A comprehensive social marketing campaign sought to enroll hard-to-reach, at-risk populations using a combination of mass media and face-to-face engagement in faith-based and retail environments. Little is known about the effectiveness of social marketing for mobile technology application in the general population. A systematic evaluation of the campaign identified successes and barriers to implementing a population-based mobile health program. Face-to-face engagement helped increase program enrollment after the initial launch; otherwise, enrollment leveled off over time. Results show positive trends in reaching target populations and in the use of mobile phones to record personal health information and set goals for reducing the risk of type 2 diabetes. The lessons from the txt4health campaign can help inform the development and programmatic strategies to provide a person-level intervention using a population-level approach for individuals at risk for diabetes as well as aid in chronic disease management.

  14. Examining Thematic Similarity, Difference, and Membership in Three Online Mental Health Communities from Reddit: A Text Mining and Visualization Approach.

    Science.gov (United States)

    Park, Albert; Conway, Mike; Chen, Annie T

    2018-01-01

    Social media, including online health communities, have become popular platforms for individuals to discuss health challenges and exchange social support with others. These platforms can provide support for individuals who are concerned about social stigma and discrimination associated with their illness. Although mental health conditions can share similar symptoms and even co-occur, the extent to which discussion topics in online mental health communities are similar, different, or overlapping is unknown. Discovering the topical similarities and differences could potentially inform the design of related mental health communities and patient education programs. This study employs text mining, qualitative analysis, and visualization techniques to compare discussion topics in publicly accessible online mental health communities for three conditions: Anxiety, Depression and Post-Traumatic Stress Disorder. First, online discussion content for the three conditions was collected from three Reddit communities (r/Anxiety, r/Depression, and r/PTSD). Second, content was pre-processed, and then clustered using the k-means algorithm to identify themes that were commonly discussed by members. Third, we qualitatively examined the common themes to better understand them, as well as their similarities and differences. Fourth, we employed multiple visualization techniques to form a deeper understanding of the relationships among the identified themes for the three mental health conditions. The three mental health communities shared four themes: sharing of positive emotion, gratitude for receiving emotional support, and sleep- and work-related issues. Depression clusters tended to focus on self-expressed contextual aspects of depression, whereas the Anxiety Disorders and Post-Traumatic Stress Disorder clusters addressed more treatment- and medication-related issues. Visualizations showed that discussion topics from the Anxiety Disorders and Post-Traumatic Stress Disorder subreddits
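
    The clustering step described above (pre-process, then group posts with k-means) can be sketched as follows. This is a minimal pure-Python version of Lloyd's algorithm on raw term counts, not the study's pipeline: the toy posts, the bag-of-words representation and the choice of the first and third post as starting centroids are assumptions made so the example stays small and deterministic.

```python
from collections import Counter


def bag_of_words(docs):
    """Turn each document into a term-count vector over a shared, sorted vocabulary."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vectors.append([float(counts[w]) for w in vocab])
    return vocab, vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, initial_centroids, max_iter=100):
    """Lloyd's algorithm: assign each vector to its nearest centroid, then recompute means."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Invented toy posts, not data from the study.
docs = [
    "cannot sleep at night",
    "sleep is bad at night",
    "work stress at my job",
    "job stress from work",
]
vocab, vectors = bag_of_words(docs)
labels, _ = kmeans(vectors, initial_centroids=[vectors[0], vectors[2]])
```

    With these inputs the two sleep-related posts end up in one cluster and the two work-related posts in the other. A real analysis would typically weight terms (for example with TF-IDF), remove stop words and try several values of k.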

  15. Improving Students' Ability in Writing Hortatory Exposition Texts by Using Process-Genre Based Approach with YouTube Videos as the Media

    Directory of Open Access Journals (Sweden)

    Fifin Naili Rizkiyah

    2017-06-01

    Full Text Available Abstract: This research aims to find out how the Process-Genre Based Approach strategy, with YouTube videos as the media, is employed to improve students' ability in writing hortatory exposition texts. This study uses a collaborative classroom action research design following the procedures of planning, implementing, observing, and reflecting. The procedures for carrying out the strategy are: (1) relating several issues/cases to the students' background knowledge and introducing the generic structures and linguistic features of hortatory exposition text as the BKoF stage, (2) analyzing the generic structure and the language features used in the text and getting a model of how to write a hortatory exposition text by using the YouTube video as the MoT stage, (3) writing a hortatory exposition text collaboratively in a small group and in pairs through process writing as the JCoT stage, and (4) writing a hortatory exposition text individually as the ICoT stage. The result shows that the use of the Process-Genre Based Approach and YouTube videos can improve the students' ability in writing hortatory exposition texts. The percentage of students achieving a score above the minimum passing grade (70) improved from only 15.8% (3 out of 19 students) in the preliminary study to 100% (22 students) in Cycle 1. In addition, the score for each aspect (content, organization, vocabulary, grammar, and mechanics) also improved. Key Words: writing ability, hortatory exposition text, process-genre based approach, youtube video

  16. Text Maps: Helping Students Navigate Informational Texts.

    Science.gov (United States)

    Spencer, Brenda H.

    2003-01-01

    Notes that a text map is an instructional approach designed to help students gain fluency in reading content area materials. Discusses how the goal is to teach students about the important features of the material and how the maps can be used to build new understandings. Presents the procedures for preparing and using a text map. (SG)

  17. The Effects of Using Multimodal Approaches in Meaning-Making of 21st Century Literacy Texts Among ESL Students in a Private School in Malaysia

    Directory of Open Access Journals (Sweden)

    Malini Ganapathy

    2016-04-01

    Full Text Available In today’s globalised digital era, students are inevitably engaged in various multimodal texts due to their active participation in social media and frequent usage of mobile devices on a daily basis. Such daily activities advocate the need for a transformation in the teaching and learning of ESL lessons in order to promote students’ capabilities in making meaning of different literacy texts which students come across in their ESL learning activities. This paper puts forth the framework of Multimodality in the restructuring of the teaching and learning of ESL with the aim of investigating its effects and students’ perspectives on the use of multimodal approaches underlying the Multiliteracies theory. Using focus group interviews, this qualitative case study examines the effectiveness of ESL teaching and learning using the Multimodal approaches on literacy in meaning-making among 15 students in a private school in Penang, Malaysia. The results confirm the need to reorientate the teaching and learning of ESL with the focus on multimodal pedagogical practices as it promotes positive learning outcomes among students. The implications of this study suggest that the multimodal approaches integrated in the teaching and learning of ESL have the capacity to promote students’ autonomy in learning, improve motivation to learn and facilitate various learning styles. Keywords: Multimodal Approaches; Multiliteracies; Monomodal; Flipped Classroom; Literacy; Multimodal texts; iPad

  18. Systematic text condensation

    DEFF Research Database (Denmark)

    Malterud, Kirsti

    2012-01-01

    To present background, principles, and procedures for a strategy for qualitative analysis called systematic text condensation and discuss this approach compared with related strategies.

  19. Suggestions toward some discourse-analytic approaches to text difficulty: with special reference to ‘T-unit configuration’ in the textual unfolding

    Directory of Open Access Journals (Sweden)

    Kazem Lotfipour-Saedi

    2015-01-01

    Full Text Available This paper represents some suggestions towards discourse-analytic approaches for ESL/EFL education, with the focus on identifying the textual forms which can contribute to the textual difficulty. Textual difficulty / comprehensibility, rather than being purely text-based or reader-dependent, is certainly a matter of interaction between text and reader. The paper will look at some of the textual factors which can be argued to make a text more or less readable for the same reader. The main focus here will be on academic texts. The high cognitive load and low readability of the expository texts in various academic disciplines will be argued to belong to certain textual strategies as well as variations in the configurations of the T-units as the prime scaffolding for the textualization process. Different categories of these variations to be discussed here will be exemplified from a few academic and expository registers. More extensive textual analyses will, of course, be necessary in order to be able to make evidential suggestions for possible correlations between certain types and clusters of T-unit configurations on the one hand, and cognitive load and readability indices on the other, across various academic registers, genres and disciplines.

  20. Bilingual approach to online cancer genetics education for Deaf American Sign Language users produces greater knowledge and confidence than English text only: A randomized study.

    Science.gov (United States)

    Palmer, Christina G S; Boudreault, Patrick; Berman, Barbara A; Wolfson, Alicia; Duarte, Lionel; Venne, Vickie L; Sinsheimer, Janet S

    2017-01-01

    Deaf American Sign Language-users (ASL) have limited access to cancer genetics information they can readily understand, increasing risk for health disparities. We compared effectiveness of online cancer genetics information presented using a bilingual approach (ASL with English closed captioning) and a monolingual approach (English text). Bilingual modality would increase cancer genetics knowledge and confidence to create a family tree; education would interact with modality. We used a parallel 2:1 randomized pre-post study design stratified on education. 150 Deaf ASL-users ≥18 years old with computer and internet access participated online; 100 (70 high, 30 low education) and 50 (35 high, 15 low education) were randomized to the bilingual and monolingual modalities. Modalities provide virtually identical content on creating a family tree, using the family tree to identify inherited cancer risk factors, understanding how cancer predisposition can be inherited, and the role of genetic counseling and testing for prevention or treatment. 25 true/false items assessed knowledge; a Likert scale item assessed confidence. Data were collected within 2 weeks before and after viewing the information. Significant interaction of language modality, education, and change in knowledge scores was observed (p = .01). High education group increased knowledge regardless of modality (Bilingual: p [...] information than a monolingual approach. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
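
    The 2:1 allocation stratified on education reported above can be illustrated with a short sketch. It is not the study's randomization procedure, which the abstract does not describe: the function name `stratified_allocation`, the simple shuffle within each stratum and the fixed seed are assumptions. Trials often use permuted blocks instead, so treat this only as a way to see how the reported group sizes arise.

```python
import random


def stratified_allocation(participants, ratio=(2, 1), seed=0):
    """Randomly allocate participants to two arms in the given ratio within each stratum.

    `participants` is a list of (participant_id, stratum) pairs.
    Returns a dict mapping participant_id to "bilingual" or "monolingual".
    """
    rng = random.Random(seed)
    by_stratum = {}
    for pid, stratum in participants:
        by_stratum.setdefault(stratum, []).append(pid)

    allocation = {}
    for stratum, ids in by_stratum.items():
        rng.shuffle(ids)
        # Number assigned to the first arm, e.g. 2/3 of the stratum for a 2:1 ratio.
        n_first = len(ids) * ratio[0] // sum(ratio)
        for i, pid in enumerate(ids):
            allocation[pid] = "bilingual" if i < n_first else "monolingual"
    return allocation


# 105 high-education and 45 low-education participants, matching the reported totals.
participants = [(i, "high") for i in range(105)] + [(i, "low") for i in range(105, 150)]
allocation = stratified_allocation(participants)
```

    Allocating within each stratum separately is what yields 70 high and 30 low education participants in the bilingual arm and 35 high and 15 low in the monolingual arm, whichever individuals the shuffle happens to pick.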

  1. Time delay and profit accumulation effect on a mine-based uranium market clearing model

    International Nuclear Information System (INIS)

    Auzans, Aris; Teder, Allan; Tkaczyk, Alan H.

    2016-01-01

    Highlights: • Improved version of a mine-based uranium market clearing model for the front-end uranium market and enrichment industries is proposed. • A profit accumulation algorithm and time delay function provide a more realistic uranium mine decision-making process. • Operational decision delay increased uranium market price volatility. - Abstract: The mining industry faces a number of challenges such as market volatility, investment safety, issues surrounding employment and productivity. Therefore, computer simulations are highly relevant in order to reduce financial risks associated with these challenges. In the mining industry, each firm must compete with other mines and the basic target is profit maximization. The aim of this paper is to evaluate the world uranium (U) supply by simulating financial management challenges faced by an individual U mine that are caused by a variety of regulation issues. In this paper a front-end nuclear fuel cycle tool is used to simulate market conditions and the effects they have on the stability of U supply. An individual U mine’s exit or entry in the market might cause changes in the U supply side which can increase or decrease the market price. In this paper we offer a more advanced version of a mine-based U market clearing model. The existing U market model incorporates the market of primary U from uranium mines with secondary uranium (depleted uranium DU), enriched uranium (HEU) and enrichment services. In the model each uranium mine acts as an independent agent that is able to make operational decisions based on the market price. This paper introduces a more realistic decision-making algorithm for an individual U mine that adds constraints to production decisions. The authors added an accumulated profit model, which allows for the profits accumulated to cover any possible future economic losses, and the time-delay algorithm to simulate the delayed process of reopening a U mine. The U market simulation covers time period 2010
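
    To make the two mechanisms named in the abstract concrete (accumulated profit covering later losses, and a delay before a closed mine reopens), here is a toy single-mine simulation. It is a sketch under stated assumptions and not the authors' model: the function name `simulate_mine`, the closing rule, the reopening rule based on consecutive profitable periods, and all prices and costs are invented for illustration.

```python
def simulate_mine(prices, unit_cost, capacity, reopen_delay):
    """Simulate one mine's open/closed state over a price series.

    Assumed rules (illustrative only, not the paper's algorithm):
      * an open mine keeps producing at a loss while its accumulated profit covers the loss;
      * otherwise it closes;
      * a closed mine reopens only after the price has exceeded its unit cost
        for `reopen_delay` consecutive periods.
    Returns a list of (is_open, accumulated_profit) tuples, one per period.
    """
    is_open = True
    accumulated = 0.0
    profitable_streak = 0
    history = []
    for price in prices:
        margin = (price - unit_cost) * capacity
        if is_open:
            if margin >= 0 or accumulated + margin >= 0:
                accumulated += margin  # produce; reserves absorb any loss
            else:
                is_open = False  # reserves cannot cover this period's loss
                profitable_streak = 0
        else:
            profitable_streak = profitable_streak + 1 if price > unit_cost else 0
            if profitable_streak >= reopen_delay:
                is_open = True
                profitable_streak = 0
        history.append((is_open, accumulated))
    return history


# Hypothetical price path: one good period, three bad ones, then recovery.
history = simulate_mine(
    prices=[60, 40, 40, 40, 60, 60, 60],
    unit_cost=50,
    capacity=1,
    reopen_delay=2,
)
states = [is_open for is_open, _ in history]
```

    In this run the mine survives the first low-price period on its reserves, closes in the second, and reopens only two periods after the price recovers, which is the kind of lag the abstract links to higher price volatility.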

  2. Time delay and profit accumulation effect on a mine-based uranium market clearing model

    Energy Technology Data Exchange (ETDEWEB)

    Auzans, Aris [Institute of Physics, University of Tartu, Ostwaldi 1, EE-50411 Tartu (Estonia); Teder, Allan [School of Economics and Business Administration, University of Tartu, Narva mnt 4, EE-51009 Tartu (Estonia); Tkaczyk, Alan H., E-mail: alan@ut.ee [Institute of Physics, University of Tartu, Ostwaldi 1, EE-50411 Tartu (Estonia)

    2016-12-15

    Highlights: • Improved version of a mine-based uranium market clearing model for the front-end uranium market and enrichment industries is proposed. • A profit accumulation algorithm and time delay function provide a more realistic uranium mine decision-making process. • Operational decision delay increased uranium market price volatility. - Abstract: The mining industry faces a number of challenges such as market volatility, investment safety, issues surrounding employment and productivity. Therefore, computer simulations are highly relevant in order to reduce financial risks associated with these challenges. In the mining industry, each firm must compete with other mines and the basic target is profit maximization. The aim of this paper is to evaluate the world uranium (U) supply by simulating financial management challenges faced by an individual U mine that are caused by a variety of regulation issues. In this paper a front-end nuclear fuel cycle tool is used to simulate market conditions and the effects they have on the stability of U supply. An individual U mine’s exit or entry in the market might cause changes in the U supply side which can increase or decrease the market price. In this paper we offer a more advanced version of a mine-based U market clearing model. The existing U market model incorporates the market of primary U from uranium mines with secondary uranium (depleted uranium DU), enriched uranium (HEU) and enrichment services. In the model each uranium mine acts as an independent agent that is able to make operational decisions based on the market price. This paper introduces a more realistic decision-making algorithm for an individual U mine that adds constraints to production decisions. The authors added an accumulated profit model, which allows for the profits accumulated to cover any possible future economic losses, and the time-delay algorithm to simulate the delayed process of reopening a U mine. The U market simulation covers time period 2010

  3. Zum Bildungspotenzial biblischer Texte = On the educational potential of biblical texts

    Directory of Open Access Journals (Sweden)

    Theis, Joachim

    2017-11-01

    Full Text Available Biblical education as a holistic process goes far beyond biblical learning. It must be understood as a lifelong process in which both biblical texts and those who seek to understand them act, each appropriating its counterpart in a dialogical way. The recipient's horizon of understanding is not an empty room to be filled with the text alone, nor is the text dead material that can only be examined cognitively. The recipient discovers the meaning of the biblical text by recomposing it through existential appropriation, so the text is brought to life in each individual reality. Scientific insights, subjective structures, and the community of understanders must all be included to avoid potential one-sidedness. Unfortunately, a particular negative association very often obscures the approach to the Bible: biblical work as part of religious education still appears in a cognitively oriented habit that respects neither the vitality and sovereignty of the biblical texts nor the students' desire for meaning. Moreover, the Bible is misused for teaching moral terms or for pontificating. Such pitfalls can be countered by biblical didactics that are empowerment didactics. Respecting the sovereignty of biblical texts, these didactics assist understanders in their individuation by opening the texts with a focus on the understander's otherness. Thus both the text and the recipient become subjects in a dialogue. The Biblical-Enabling-Didactics approach lets the Bible become, ever anew, a book of life. Understood from within their hermeneutics, empowerment didactics could be raised to the principle of biblical didactics in general and grow into an essential element of holistic education.

  4. Core Values in Nursing Care Based on the Experiences of Nurses Engaged in Neonatal Nursing: A Text-mining Approach for Analyzing Reflection Records

    Science.gov (United States)

    Watanabe, Hiromi; Okuda, Reiko; Hagino, Hiroshi

    2018-01-01

    Background: Strong feelings about and enthusiasm for nursing care are reflected in nurses’ thoughts and behaviors in clinical practice and affect their profession. This study was conducted to identify the characteristics of core values in nursing care based on the experiences of nurses engaged in neonatal nursing through a process for recognizing the conceptualization of nursing. Methods: We conceptualized nursing care in 43 nurses who were involved in neonatal nursing using a reflection sheet. We classified descriptions on a sheet based on the Three-Staged Recognition scheme and analyzed them using a text-mining approach. Results: Nurses involved in neonatal nursing recognized that they must take care of the “child,” “mother,” and “family.” Important elements of nursing in nurses with less than 5 years versus 5 or more years of neonatal nursing experience were classified into seven clusters, respectively. These elements were mainly related to family members in both groups. In nurses with less than 5 years of experience, four clusters of one-way communication by nurses were observed in the analysis of the key elements in nursing. On the other hand, five clusters of mutual relationships between patients, their family members, and nurses were observed in nurses with 5 or more years of experience. Conclusion: The core value of nurses engaged in neonatal nursing is family-oriented nursing. Nurses with 5 or more years of neonatal nursing experience understand patients and their family members well through establishing relationships and providing comfort and safety while taking care of them. PMID:29599621

  5. Semantic Mining based on graph theory and ontologies. Case Study: Cell Signaling Pathways

    Directory of Open Access Journals (Sweden)

    Carlos R. Rangel

    2016-08-01

    Full Text Available In this paper we use concepts from graph theory and cellular biology, represented as ontologies, to carry out semantic mining tasks on signaling pathway networks. Specifically, the paper describes the semantic enrichment of signaling pathway networks. A cell signaling network describes the basic cellular activities and their interactions. The main contribution of this paper is in the signaling pathway research area: it proposes a new technique to analyze and understand how changes in these networks may affect the transmission and flow of information, which produce diseases such as cancer and diabetes. Our approach is based on three concepts from graph theory (modularity, clustering and centrality) frequently used in social network analysis. Our approach consists of two phases: the first uses the graph theory concepts to determine the cellular groups in the network, which we call communities; the second uses ontologies for the semantic enrichment of the cellular communities. The measures used from graph theory allow us to determine the set of cells that are close (for example, in a disease) and the main cells in each community. We analyze our approach in two cases: TGF-β and Alzheimer's disease.
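
    Two of the graph measures named above, centrality and clustering, can be computed on a small network as in the following sketch. It uses degree centrality and the local clustering coefficient as simple stand-ins; the abstract does not say which variants the authors used, modularity is left out for brevity, and the four-node graph with nodes "A" to "D" is a made-up placeholder rather than real pathway data.

```python
def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    return {node: len(neigh) / (n - 1) for node, neigh in graph.items()}


def clustering_coefficient(graph, node):
    """Share of a node's neighbour pairs that are themselves connected."""
    neigh = sorted(graph[node])
    k = len(neigh)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neigh[j] in graph[neigh[i]]
    )
    return 2 * links / (k * (k - 1))


# Toy undirected network stored as adjacency sets; node names are placeholders.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
centrality = degree_centrality(graph)
clustering_a = clustering_coefficient(graph, "A")
```

    Node "A" touches every other node, so it has the highest degree centrality, while only one of its three neighbour pairs is linked, which gives it a low clustering coefficient. High centrality with low clustering is the usual signature of a hub that bridges groups.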

  6. Directed Activities Related to Text: Text Analysis and Text Reconstruction.

    Science.gov (United States)

    Davies, Florence; Greene, Terry

    This paper describes Directed Activities Related to Text (DART), procedures that were developed and are used in the Reading for Learning Project at the University of Nottingham (England) to enhance learning from texts and that fall into two broad categories: (1) text analysis procedures, which require students to engage in some form of analysis of…

  7. A Customizable Text Classifier for Text Mining

    Directory of Open Access Journals (Sweden)

    Yun-liang Zhang

    2007-12-01

    Full Text Available Text mining deals with complex and unstructured texts. Usually a particular collection of texts that is specified to one or more domains is necessary. We have developed a customizable text classifier for users to mine the collection automatically. It derives from the sentence category of the HNC theory and corresponding techniques. It can start with a few texts, and it can adjust automatically or be adjusted by user. The user can also control the number of domains chosen and decide the standard with which to choose the texts based on demand and abundance of materials. The performance of the classifier varies with the user's choice.

  8. The Effects of Using Multimodal Approaches in Meaning-Making of 21st Century Literacy Texts among ESL Students in a Private School in Malaysia

    Science.gov (United States)

    Ganapathy, Malina; Seetharam, Saundravalli A/P

    2016-01-01

    In today's globalised digital era, students are inevitably engaged in various multimodal texts due to their active participation in social media and frequent usage of mobile devices on a daily basis. Such daily activities advocate the need for a transformation in the teaching and learning of ESL lessons in order to promote students' capabilities…

  9. Prediction of Learning and Comprehension when Adolescents Read Multiple Texts: The Roles of Word-Level Processing, Strategic Approach, and Reading Motivation

    Science.gov (United States)

    Braten, Ivar; Ferguson, Leila E.; Anmarkrud, Oistein; Stromso, Helge I.

    2013-01-01

    Sixty-five Norwegian 10th graders used the software Read&Answer 2.0 (Vidal-Abarca et al., 2011) to read five different texts presenting conflicting views on the controversial scientific issue of sun exposure and health. Participants were administered a multiple-choice topic-knowledge measure before and after reading, a word recognition task,…

  10. Compreensão textual em alunos de segunda e terceira séries: uma abordagem cognitiva = Text comprehension in second and third graders: a cognitive approach

    Directory of Open Access Journals (Sweden)

    Jerusa Fumagalli de Salles

    2004-04-01

    Full Text Available This study aimed to analyze the text comprehension of students in the 2nd and 3rd grades. The sample consisted of 76 children with a mean age of 8.1 years. Each child read the story, retold it and, afterwards, answered questions about it. The retellings were analyzed according to the Text Comprehension model of Kintsch and van Dijk (1978) and Kintsch (1988, 1998). The sample recalled a mean of 21.07% of the propositional structure of the story, with macropropositions being reported most frequently. Third-grade students outperformed second-grade students in recalling the less relevant micropropositions of the text and in answering specific questions about the story. A significant correlation was found between age and retelling of the textual macrostructure. The results suggest that during the first years of schooling there is an improvement in the memory for details, whereas the retention of the essential ideas is influenced by age differences.

  11. How did Popular Science Become a Legend? On the linguistic communication of “Science Culture” book series in 1990s Taiwan from the approach of text analysis

    Directory of Open Access Journals (Sweden)

    Ruey-Lin Chen

    2018-01-01

    Full Text Available Commonwealth Publishing Co. in Taiwan has published a series of popular science books, named Science Culture, since 1991. This series has achieved great success in publication and in marketing; to date it has published over 164 volumes and sold a great number of hard copies. It is widely regarded as a publishing legend. How did it succeed? What strategies has it adopted to become such a legend? This paper shows that the series' success depends on two strategies: exciting subjects and strengthening the first impression. Three related tactics or techniques (publishing scientific biographies, literary rhetoric, and romanticizing titles) are applied to realize the two strategies of the series. This paper reveals these strategies and techniques by investigating the writing style of books in the series, comparing the titles of the series with other titles of popular science books before 1990, and conducting interviews with the editors of that series.

  12. Unpacking the Black Box: A Formative Research Approach to the Development of Theory-Driven, Evidence-Based, and Culturally Safe Text Messages in Mobile Health Interventions.

    Science.gov (United States)

    Maar, Marion A; Yeates, Karen; Toth, Zsolt; Barron, Marcia; Boesch, Lisa; Hua-Stewart, Diane; Liu, Peter; Perkins, Nancy; Sleeth, Jessica; Wabano, Mary Jo; Williamson, Pamela; Tobe, Sheldon W

    2016-01-22

    evidence-based text message created by researchers and the message received by the recipient in mobile health interventions. These discrepancies were primarily generated by six mediators of meaning in SMS messages: (1) negative or non-affirming framing of advocacies, (2) fear- or stress-inducing content, (3) oppressive or authoritarian content, (4) incongruity with cultural and traditional practices, (5) disconnect with the reality of the social determinants of health and the diversity of cultures within a population, and (6) lack of clarity and/or practicality of content. These 6 mediators of meaning provide the basis for sound strategies for message development because they impact directly on the target populations' capability, opportunity, and motivation for behavior change. The quality of text messages impacts significantly on the effectiveness of a mobile health intervention. Our research underscores the urgent need for interventions to incorporate and evaluate the quality of SMS messages and to examine the mediators of meaning within each targeted cultural and demographic group. Reporting on this aspect of mobile health intervention research will allow researchers to move away from the current black box of SMS text message development, thus improving the transparency of the process as well as the quality of the outcomes.

  13. O texto bíblico e a igreja católica romana: aproximações pastorais = Bible text and Roman Catholic Church: approaches pastoral

    Directory of Open Access Journals (Sweden)

    Junqueira, Sérgio Rogério Azevedo

    2013-01-01

    Full Text Available This text is part of a qualitative historical study of the use of the biblical text in pastoral work. Spanning the period from the beginning of the Christian era through the medieval, Renaissance, modern, and contemporary periods, this brief historical study will serve as a premise for further stages of research on the pastoral use of the Bible. This matters because, over the centuries, the use of the biblical text was accompanied by various questions about who should interpret it and how, considering tradition and the magisterium, so that access to the text was restricted for most Christians. It is a long history, of which this study aims to present some pointers in order to raise new questions, above all about the place of Scripture in the pastoral work of the Church today, especially in school pastoral work.

  14. Avoid violence, rioting, and outrage; approach celebration, delight, and strength: Using large text corpora to compute valence, arousal, and the basic emotions.

    Science.gov (United States)

    Westbury, Chris; Keith, Jeff; Briesemeister, Benny B; Hofmann, Markus J; Jacobs, Arthur M

    2015-01-01

    Ever since Aristotle discussed the issue in Book II of his Rhetoric, humans have attempted to identify a set of "basic emotion labels". In this paper we propose an algorithmic method for evaluating sets of basic emotion labels that relies upon computed co-occurrence distances between words in a 12.7-billion-word corpus of unselected text from USENET discussion groups. Our method uses the relationship between human arousal and valence ratings collected for a large list of words, and the co-occurrence similarity between each word and emotion labels. We assess how well the words in each of 12 emotion label sets (proposed by various researchers over the past 118 years) predict the arousal and valence ratings on a test and validation dataset, each consisting of over 5970 items. We also assess how well these emotion labels predict lexical decision residuals (LDRTs), after co-varying out the effects attributable to basic lexical predictors. We then demonstrate a generalization of our method to determine the most predictive "basic" emotion labels from among all of the putative models of basic emotion that we considered. As well as contributing empirical data towards the development of a more rigorous definition of basic emotions, our method makes it possible to derive principled computational estimates of emotionality (specifically, of arousal and valence) for all words in the language.

  15. Advantages of combined touch screen technology and text hyperlink for the pathology grossing manual: a simple approach to access instructive information in biohazardous environments.

    Science.gov (United States)

    Qu, Zhenhong; Ghorbani, Rhonda P; Li, Hongyan; Hunter, Robert L; Hannah, Christina D

    2007-03-01

    Gross examination, encompassing description, dissection, and sampling, is a complex task and an essential component of surgical pathology. Because of the complexity of the task, standardized protocols to guide the gross examination often become a bulky manual that is difficult to use. This problem is further compounded by the high specimen volume and biohazardous nature of the task. As a result, such a manual is often underused, leading to errors that are potentially harmful and time consuming to correct, a common chronic problem affecting many pathology laboratories. To combat this problem, we have developed a simple method that incorporates complex text and graphic information of a typical procedure manual and yet allows easy access to any intended instructive information in the manual. The method uses the Object-Linking-and-Embedding function of Microsoft Word (Microsoft, Redmond, WA) to establish hyperlinks among different contents, and then it uses the touch screen technology to facilitate navigation through the manual on a computer screen installed at the cutting bench with no need for a physical keyboard or a mouse. It takes less than 4 seconds to reach any intended information in the manual by 3 to 4 touches on the screen. A 3-year follow-up study shows that this method has increased use of the manual and has improved the quality of gross examination. The method is simple and can be easily tailored to different formats of instructive information, allowing flexible organization, easy access, and quick navigation. Increased compliance to instructive information reduces errors at the grossing bench and improves work efficiency.

  16. Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine.

    Directory of Open Access Journals (Sweden)

    Ayush Singhal

    2016-11-01

    Full Text Available The practice of precision medicine will ultimately require databases of genes and mutations for healthcare providers to reference in order to understand the clinical implications of each patient's genetic makeup. Although the highest quality databases require manual curation, text mining tools can facilitate the curation process, increasing accuracy, coverage, and productivity. However, to date there are no available text mining tools that offer high-accuracy performance for extracting such triplets from biomedical literature. In this paper we propose a high-performance machine learning approach to automate the extraction of disease-gene-variant triplets from biomedical literature. Our approach is unique because we identify the genes and protein products associated with each mutation from not just the local text content, but from a global context as well (from the Internet and from all literature in PubMed). Our approach also incorporates protein sequence validation and disease association using a novel text-mining-based machine learning approach. We extract disease-gene-variant triplets from all abstracts in PubMed related to a set of ten important diseases (breast cancer, prostate cancer, pancreatic cancer, lung cancer, acute myeloid leukemia, Alzheimer's disease, hemochromatosis, age-related macular degeneration (AMD), diabetes mellitus, and cystic fibrosis). We then evaluate our approach in two ways: (1) a direct comparison with the state of the art using benchmark datasets; (2) a validation study comparing the results of our approach with entries in a popular human-curated database (UniProt) for each of the previously mentioned diseases. In the benchmark comparison, our full approach achieves a 28% improvement in F1-measure (from 0.62 to 0.79) over the state-of-the-art results. For the validation study with UniProt Knowledgebase (KB), we present a thorough analysis of the results and errors. Across all diseases, our approach returned 272 triplets

  17. Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine.

    Science.gov (United States)

    Singhal, Ayush; Simmons, Michael; Lu, Zhiyong

    2016-11-01

    The practice of precision medicine will ultimately require databases of genes and mutations for healthcare providers to reference in order to understand the clinical implications of each patient's genetic makeup. Although the highest quality databases require manual curation, text mining tools can facilitate the curation process, increasing accuracy, coverage, and productivity. However, to date there are no available text mining tools that offer high-accuracy performance for extracting such triplets from biomedical literature. In this paper we propose a high-performance machine learning approach to automate the extraction of disease-gene-variant triplets from biomedical literature. Our approach is unique because we identify the genes and protein products associated with each mutation from not just the local text content, but from a global context as well (from the Internet and from all literature in PubMed). Our approach also incorporates protein sequence validation and disease association using a novel text-mining-based machine learning approach. We extract disease-gene-variant triplets from all abstracts in PubMed related to a set of ten important diseases (breast cancer, prostate cancer, pancreatic cancer, lung cancer, acute myeloid leukemia, Alzheimer's disease, hemochromatosis, age-related macular degeneration (AMD), diabetes mellitus, and cystic fibrosis). We then evaluate our approach in two ways: (1) a direct comparison with the state of the art using benchmark datasets; (2) a validation study comparing the results of our approach with entries in a popular human-curated database (UniProt) for each of the previously mentioned diseases. In the benchmark comparison, our full approach achieves a 28% improvement in F1-measure (from 0.62 to 0.79) over the state-of-the-art results. For the validation study with UniProt Knowledgebase (KB), we present a thorough analysis of the results and errors. Across all diseases, our approach returned 272 triplets (disease

  18. Active Learning for Text Classification

    OpenAIRE

    Hu, Rong

    2011-01-01

    Text classification approaches are used extensively to solve real-world challenges. The success or failure of text classification systems hangs on the datasets used to train them; without a good dataset it is impossible to build a quality system. This thesis examines the applicability of active learning in text classification for the rapid and economical creation of labelled training data. Four main contributions are made in this thesis. First, we present two novel selection strategies to cho...

  19. Text-Fabric

    NARCIS (Netherlands)

    Roorda, Dirk

    2016-01-01

    Text-Fabric is a Python3 package for Text plus Annotations. It provides a data model, a text file format, and a binary format for (ancient) text plus (linguistic) annotations. The emphasis of this all is on: data processing; sharing data; and contributing modules. A defining characteristic is that

  20. Contextual Text Mining

    Science.gov (United States)

    Mei, Qiaozhu

    2009-01-01

    With the dramatic growth of text information, there is an increasing need for powerful text mining systems that can automatically discover useful knowledge from text. Text is generally associated with all kinds of contextual information. Those contexts can be explicit, such as the time and the location where a blog article is written, and the…

  1. XML and Free Text.

    Science.gov (United States)

    Riggs, Ken Roger

    2002-01-01

    Discusses problems with marking free text, text that is either natural language or semigrammatical but unstructured, that prevent well-formed XML from marking text for readily available meaning. Proposes a solution to mark meaning in free text that is consistent with the intended simplicity of XML versus SGML. (Author/LRW)

  2. A method for integrating and ranking the evidence for biochemical pathways by mining reactions from text

    Science.gov (United States)

    Miwa, Makoto; Ohta, Tomoko; Rak, Rafal; Rowley, Andrew; Kell, Douglas B.; Pyysalo, Sampo; Ananiadou, Sophia

    2013-01-01

    Motivation: To create, verify and maintain pathway models, curators must discover and assess knowledge distributed over the vast body of biological literature. Methods supporting these tasks must understand both the pathway model representations and the natural language in the literature. These methods should identify and order documents by relevance to any given pathway reaction. No existing system has addressed all aspects of this challenge. Method: We present novel methods for associating pathway model reactions with relevant publications. Our approach extracts the reactions directly from the models and then turns them into queries for three text mining-based MEDLINE literature search systems. These queries are executed, and the resulting documents are combined and ranked according to their relevance to the reactions of interest. We manually annotate document-reaction pairs with the relevance of the document to the reaction and use this annotation to study several ranking methods, using various heuristic and machine-learning approaches. Results: Our evaluation shows that the annotated document-reaction pairs can be used to create a rule-based document ranking system, and that machine learning can be used to rank documents by their relevance to pathway reactions. We find that a Support Vector Machine-based system outperforms several baselines and matches the performance of the rule-based system. The success of the query extraction and ranking methods are used to update our existing pathway search system, PathText. Availability: An online demonstration of PathText 2 and the annotated corpus are available for research purposes at http://www.nactem.ac.uk/pathtext2/. Contact: makoto.miwa@manchester.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23813008

  3. E-text

    DEFF Research Database (Denmark)

    Finnemann, Niels Ole

    2018-01-01

    text can be defined by taking as point of departure the digital format in which everything is represented in the binary alphabet. While the notion of text, in most cases, lends itself to be independent of medium and embodiment, it is also often tacitly assumed that it is, in fact, modeled around...... the print medium, rather than written text or speech. In late 20th century, the notion of text was subject to increasing criticism as in the question raised within literary text theory: is there a text in this class? At the same time, the notion was expanded by including extra linguistic sign modalities...

  4. Texting on the Move

    Science.gov (United States)

    ... text. What's the Big Deal? The problem is multitasking. No matter how young and agile we are, ... on something other than the road. In fact, driving while texting (DWT) can be more dangerous than ...

  5. SA-Search: a web tool for protein structure mining based on a Structural Alphabet.

    Science.gov (United States)

    Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre

    2004-07-01

    SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of fast 3D similarity searches such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search.
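
    The exact-word search described above can be illustrated on a toy example. This is a minimal sketch, assuming the 3D structures have already been encoded as 1D strings of structural-alphabet letters; it swaps in a sorted suffix array for the suffix tree named in the abstract, and the encoded string and function names are hypothetical, not taken from SA-Search.

```python
from bisect import bisect_left


def build_suffix_array(s):
    """Return the start positions of all suffixes of s in lexicographic order."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def find_word(s, suffix_array, word):
    """Return the sorted start positions at which word occurs exactly in s."""
    # Truncating sorted suffixes to len(word) keeps them sorted, so a binary
    # search finds the first candidate. (Rebuilding the prefix list on every
    # query is wasteful; a real index would avoid it.)
    prefixes = [s[i:i + len(word)] for i in suffix_array]
    pos = bisect_left(prefixes, word)
    hits = []
    while pos < len(prefixes) and prefixes[pos] == word:
        hits.append(suffix_array[pos])
        pos += 1
    return sorted(hits)


# Hypothetical structural-alphabet encoding of one protein chain.
encoded = "ABBCABBA"
index = build_suffix_array(encoded)
print(find_word(encoded, index, "ABB"))  # start positions of the word "ABB"
```

    A suffix array gives the same exact-match answers as a suffix tree with simpler code, at the cost of slower construction in this naive form.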

  6. Text Coherence in Translation

    Science.gov (United States)

    Zheng, Yanping

    2009-01-01

    In the thesis a coherent text is defined as a continuity of senses of the outcome of combining concepts and relations into a network composed of knowledge space centered around main topics. And the author maintains that in order to obtain the coherence of a target language text from a source text during the process of translation, a translator can…

  7. A Novel Method of Interestingness Measures for Association Rules Mining Based on Profit

    Directory of Open Access Journals (Sweden)

    Chunhua Ju

    2015-01-01

    Full Text Available Association rules mining is an important topic in the domain of data mining and knowledge discovery. Some papers have presented several interestingness measure methods; the most typical are Support, Confidence, Lift, Improve, and so forth. But their limitations are obvious: no objective criterion, lack of a statistical base, inability to define negative relationships, and so forth. This paper proposes three new methods, Bi-lift, Bi-improve, and Bi-confidence, for Lift, Improve, and Confidence, respectively. Then, on the basis of the utility function and the executing cost of rules, we propose an interestingness function based on profit (IFBP) considering subjective preferences and characteristics of the specific application object. Finally, a novel measure framework is proposed to improve the traditional one through experimental analysis. In conclusion, the new methods and measure framework are superior to the traditional ones in the aspects of objective criterion, comprehensive definition, and practical application.
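
    For readers unfamiliar with the traditional measures named above, the following minimal sketch computes Support, Confidence, and Lift for a rule A -> B using their standard textbook definitions. It does not implement the paper's Bi-lift, Bi-improve, Bi-confidence, or IFBP measures, which the abstract does not define; the basket data and function names are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence divided by the consequent's support; 1.0 means independence."""
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )


# Hypothetical market-basket data: one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(baskets, {"bread", "milk"}))       # 2 of 4 baskets
print(confidence(baskets, {"bread"}, {"milk"}))  # 2 of the 3 bread baskets
print(lift(baskets, {"bread"}, {"milk"}))        # below 1: slight negative association
```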

  8. The Production Measurement Model of Open Pit Mine Based on Truck Operation Diagram

    Directory of Open Access Journals (Sweden)

    Sun Xiao-Yu

    2016-01-01

    Full Text Available Conventional production measurement in truck dispatching systems for open pit mines has not been effectively expressed by a mathematical model, which hampers subsequent data mining and creates a compatibility issue when applying the production measurement with fixed assignment of trucks. In this study, based on the proposed concept that a truck is not only the carrier of transported material but also acts as the bridge and linkage between the loading sites and the unloading sites, a new truck operation diagram was established, which was further developed into a basic data matrix and a production measurement model. The new model allows the production measurement to be calculated separately for transport, loading, unloading, material, etc., as well as for any combination of more than one factor as needed. It solved the compatibility issue between conventional production measurement and the production measurement of fixed assignment of trucks, with good practical results.

  9. Air Pollution Monitoring and Mining Based on Sensor Grid in London

    Directory of Open Access Journals (Sweden)

    John Hassard

    2008-06-01

    Full Text Available In this paper, we present a distributed infrastructure based on wireless sensor networks and Grid computing technology for air pollution monitoring and mining, which aims to develop low-cost and ubiquitous sensor networks to collect real-time, large-scale and comprehensive environmental data from road traffic emissions for air pollution monitoring in urban environments. The main informatics challenges with respect to constructing the high-throughput sensor Grid are discussed in this paper. We present a two-layer network framework, a P2P e-Science Grid architecture, and the distributed data mining algorithm as the solutions to address the challenges. We simulated the system in TinyOS to examine the operation of each sensor as well as the networking performance. We also present the distributed data mining result to examine the effectiveness of the algorithm.

  10. A DATA-MINING BASED METHOD FOR THE GAIT PATTERN ANALYSIS

    Directory of Open Access Journals (Sweden)

    Marcelo Rudek

    2015-12-01

    Full Text Available The paper presents a method developed for gait classification based on the analysis of the trajectory of the pressure centres (CoP) extracted from the contact points of the feet with the ground during walking. The data acquisition is performed by means of a walkway with embedded tactile sensors. The proposed method includes capturing procedures, standardization of data, creation of an organized repository (data warehouse), and development of a process mining. A graphical analysis is applied to examine the footprint signature patterns. The aim is to obtain a visual interpretation of the grouping by situating it into the normal walking patterns or deviations associated with an individual way of walking. The method consists of data classification automation which divides them into healthy and non-healthy subjects in order to assist in rehabilitation treatments for people with related mobility problems.
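
    As an illustration of the quantity being tracked, the centre of pressure for one frame of sensor data is commonly computed as the pressure-weighted mean of the sensor positions. The sketch below assumes each tactile sensor reports an (x, y, pressure) triple; this data layout and the function name are assumptions for illustration, not details from the paper.

```python
def centre_of_pressure(readings):
    """Return the pressure-weighted mean position (x, y) for one frame.

    readings is an iterable of (x, y, pressure) triples, one per sensor.
    Returns None when no sensor is loaded (total pressure is zero).
    """
    readings = list(readings)
    total = sum(p for _, _, p in readings)
    if total == 0:
        return None
    cop_x = sum(x * p for x, _, p in readings) / total
    cop_y = sum(y * p for _, y, p in readings) / total
    return cop_x, cop_y


# Hypothetical frame: two loaded sensors, the second carrying three times the load.
frame = [(0.0, 0.0, 1.0), (4.0, 2.0, 3.0)]
print(centre_of_pressure(frame))  # lies three quarters of the way to the second sensor
```

    Applying this frame by frame yields the CoP trajectory whose shape the abstract uses for classification.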

  11. Gas Emission Prediction Model of Coal Mine Based on CSBP Algorithm

    Directory of Open Access Journals (Sweden)

    Xiong Yan

    2016-01-01

    Full Text Available In view of the nonlinear characteristics of gas emission in a coal working face, a prediction method is proposed based on a cuckoo search algorithm optimized BP neural network (CSBP). In the CSBP algorithm, cuckoo search is adopted to optimize the weight and threshold parameters of the BP network and to obtain the global optimal solutions. Furthermore, the twelve main factors affecting gas emission in the coal working face are taken as the input vectors of the CSBP algorithm, the gas emission is taken as the output vector, and then the prediction model of the BP neural network with optimal parameters is established. The results show that the CSBP algorithm has better generalization ability and higher prediction accuracy, and can be utilized effectively in the prediction of coal mine gas emission.

  12. Data mining-based coefficient of influence factors optimization of test paper reliability

    Science.gov (United States)

    Xu, Peiyao; Jiang, Huiping; Wei, Jieyao

    2018-05-01

    Testing is a significant part of the teaching process. It demonstrates the final outcome of school teaching through teachers' teaching level and students' scores. The analysis of test papers is a complex operation characterized by non-linear relations among the length of the paper, time duration and the degree of difficulty. It is therefore difficult with general methods to optimize the coefficient of influence factors under different conditions in order to get test papers with clearly higher reliability [1]. With data mining techniques like Support Vector Regression (SVR) and Genetic Algorithm (GA), we can model the test paper analysis and optimize the coefficient of impact factors for higher reliability. The test results show that the combination of SVR and GA effectively improves reliability. The optimized coefficients of influence factors are practical in actual application, and the whole optimizing operation can offer a model basis for test paper analysis.

  13. Vocabulary Constraint on Texts

    Directory of Open Access Journals (Sweden)

    C. Sutarsyah

    2008-01-01

    Full Text Available This case study was carried out in the English Education Department of State University of Malang. The aim of the study was to identify and describe the vocabulary in the reading text and to determine whether the text is useful for reading skill development. A descriptive qualitative design was applied to obtain the data. For this purpose, some available computer programs were used to find the description of vocabulary in the texts. It was found that the 20 texts containing 7,945 words are dominated by low frequency words which account for 16.97% of the words in the texts. The high frequency words occurring in the texts were dominated by function words. In the case of word levels, it was found that the texts have a very limited number of words from the GSL (General Service List of English Words) (West, 1953). The proportion of the first 1,000 words of the GSL only accounts for 44.6%. The data also show that the texts contain too large a proportion of words which are not in the three levels (the first 2,000 and UWL). These words account for 26.44% of the running words in the texts. It is believed that the constraints are due to the selection of the texts, which are made of a series of short, unrelated texts. This kind of text is subject to the accumulation of low frequency words, especially content words, and a limited number of words from the GSL. It could also defeat the development of students' reading skills and vocabulary enrichment.
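
    The coverage percentages reported above are proportions of running words (tokens) that belong to a given word list. The sketch below shows that calculation under simplifying assumptions: a naive regular-expression tokenizer, no lemmatization or word-family grouping, and a tiny hypothetical word list standing in for the GSL. It is not the software used in the study.

```python
import re


def coverage(text, word_list):
    """Return the percentage of running words in text that appear in word_list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = sum(token in word_list for token in tokens)
    return 100.0 * known / len(tokens)


# Hypothetical stand-in for a high-frequency word list such as the GSL.
high_frequency = {"the", "on", "sat"}
sample = "The cat sat on the mat"
print(round(coverage(sample, high_frequency), 2))  # 4 of 6 running words are listed
```

    Real vocabulary profilers usually match word families rather than surface forms, so their figures would differ from this token-level count.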

  14. Topography and Data Mining Based Methods for Improving Satellite Precipitation in Mountainous Areas of China

    Directory of Open Access Journals (Sweden)

    Ting Xia

    2015-07-01

    Full Text Available Topography is a significant factor influencing the spatial distribution of precipitation. This study developed a new methodology to evaluate and calibrate the Tropical Rainfall Measuring Mission Multi-satellite Precipitation Analysis (TMPA) products by merging geographic and topographic information. In the proposed method, firstly, the consistency rule was introduced to evaluate the fitness of satellite rainfall with measurements on the grids with and without ground gauges. Secondly, in order to improve the consistency rate of satellite rainfall, genetic programming was introduced to mine the relationship between the gauge rainfall and location, elevation and TMPA rainfall. The proof experiment and analysis for the mean annual satellite precipitation from 2001–2012, 3B43 (V7) of the TMPA rainfall product, was carried out in eight mountainous areas of China. The result shows that the proposed method is significant and efficient both for the assessment and improvement of satellite precipitation. It is found that the satellite rainfall consistency rates in the gauged and ungauged grids are different in the study area. In addition, the mined correlation of location-elevation-TMPA rainfall can noticeably improve the satellite precipitation, both in the context of the new criterion of the consistency rate and the existing criteria such as Bias and RMSD. The proposed method is also efficient for correcting the monthly and mean monthly rainfall of 3B43 and 3B42RT.
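
    The two existing criteria mentioned above can be sketched as follows. Definitions of Bias vary between studies; this sketch assumes relative bias (total satellite minus total gauge, divided by total gauge) and the usual root-mean-square difference, and the abstract does not say which variants the authors used. All rainfall values are hypothetical.

```python
import math


def relative_bias(satellite, gauge):
    """(sum of satellite - sum of gauge) / sum of gauge; 0 means no overall bias."""
    return (sum(satellite) - sum(gauge)) / sum(gauge)


def rmsd(satellite, gauge):
    """Root-mean-square difference between paired satellite and gauge values."""
    squared = [(s - g) ** 2 for s, g in zip(satellite, gauge)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical mean annual rainfall (mm) at three gauged grid cells.
satellite_mm = [110.0, 90.0, 130.0]
gauge_mm = [100.0, 100.0, 100.0]
print(relative_bias(satellite_mm, gauge_mm))  # positive: satellite overestimates overall
print(rmsd(satellite_mm, gauge_mm))
```

    Bias captures systematic over- or underestimation, while RMSD also penalizes errors that cancel out in the total, which is why the two are usually reported together.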

  15. Groundwater Mixing Process Identification in Deep Mines Based on Hydrogeochemical Property Analysis

    Directory of Open Access Journals (Sweden)

    Bo Liu

    2016-12-01

    Full Text Available Karst collapse columns, as a potential water passageway for mine water inrush, are always considered a critical problem for the development of deep mining techniques. This study aims to identify the mixing process of groundwater deriving from two different limestone karst-fissure aquifer systems. Based on analysis of mining groundwater hydrogeochemical properties, the hydraulic connection between the karst-fissure objective aquifer systems was revealed. In this paper, a Piper diagram was used to calculate the mixing ratios at different sampling points in the aquifer systems, and the PHREEQC Interactive model (Version 2.5, USGS, Reston, VA, USA, 2001) was applied to modify the mixing ratios and model the water–rock interactions during the mixing processes. The analysis results show that the highest mixing ratio is 0.905 in the C12 borehole that is located nearest to the #2 karst collapse column, and the mixing ratio decreases with the increase of the distance from the #2 karst collapse column. This demonstrates that groundwater of the two aquifers mixed through the passage of the #2 karst collapse column. As a result, the proposed Piper-PHREEQC based method can provide accurate identification of karst collapse columns’ water conductivity, and can be applied to practical applications.
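
    The idea of a mixing ratio can be illustrated with simple two-end-member mass balance. This minimal sketch assumes a single conservative tracer (one that does not react with the rock) and solves C_mix = f * C_A + (1 - f) * C_B for f. The paper instead derives ratios from a Piper diagram and refines them with PHREEQC, so this is a simplified stand-in, and the concentrations below are hypothetical.

```python
def mixing_ratio(c_mix, c_a, c_b):
    """Fraction f of end member A in a mixture of two waters A and B.

    Solves c_mix = f * c_a + (1 - f) * c_b for a conservative tracer.
    """
    if c_a == c_b:
        raise ValueError("end members must differ in tracer concentration")
    return (c_mix - c_b) / (c_a - c_b)


# Hypothetical tracer concentrations (mg/L) in aquifer A, aquifer B, and a borehole sample.
fraction_a = mixing_ratio(c_mix=80.0, c_a=100.0, c_b=20.0)
print(fraction_a)  # share of aquifer A water in the sample
```

    A ratio outside the range 0 to 1 signals that the tracer is not conservative or that a third water source is involved, which is one reason geochemical modelling is used to correct such estimates.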

  16. An association rule mining-based framework for understanding lifestyle risk behaviors.

    Directory of Open Access Journals (Sweden)

    So Hyun Park

    Full Text Available OBJECTIVES: This study investigated the prevalence and patterns of lifestyle risk behaviors in Korean adults. METHODS: We utilized data from the Fourth Korea National Health and Nutrition Examination Survey for 14,833 adults (>20 years of age. We used association rule mining to analyze patterns of lifestyle risk behaviors by characterizing non-adherence to public health recommendations related to the Alameda 7 health behaviors. The study variables were current smoking, heavy drinking, physical inactivity, obesity, inadequate sleep, breakfast skipping, and frequent snacking. RESULTS: Approximately 72% of Korean adults exhibited two or more lifestyle risk behaviors. Among women, current smoking, obesity, and breakfast skipping were associated with inadequate sleep. Among men, breakfast skipping with additional risk behaviors such as physical inactivity, obesity, and inadequate sleep was associated with current smoking. Current smoking with additional risk behaviors such as inadequate sleep or breakfast skipping was associated with physical inactivity. CONCLUSION: Lifestyle risk behaviors are intercorrelated in Korea. Information on patterns of lifestyle risk behaviors could assist in planning interventions targeted at multiple behaviors simultaneously.

  17. Dictionaries for text production

    DEFF Research Database (Denmark)

    Fuertes-Olivera, Pedro; Bergenholtz, Henning

    2018-01-01

    Dictionaries for Text Production are information tools that are designed and constructed for helping users to produce (i.e. encode) texts, both oral and written texts. These can be broadly divided into two groups: (a) specialized text production dictionaries, i.e., dictionaries that only offer...... a small amount of lexicographic data, most or all of which are typically used in a production situation, e.g. synonym dictionaries, grammar and spelling dictionaries, collocation dictionaries, concept dictionaries such as the Longman Language Activator, which is advertised as the World’s First Production...... Dictionary; (b) general text production dictionaries, i.e., dictionaries that offer all or most of the lexicographic data that are typically used in a production situation. A review of existing production dictionaries reveals that there are many specialized text production dictionaries but only a few general...

  18. Data-mining Based Detection of Glaciers: Quantifying the Extent of Alpine Valley Glaciation

    Directory of Open Access Journals (Sweden)

    Wei Luo

    2015-07-01

    Full Text Available The extent of glaciation in alpine valleys often gives clues to past climates, plate movement, mountain landforms, bedrock geology and more. However, without field investigation, the degree to which a valley was affected by a glacier has been difficult to assess. We developed a model that uses quantitative parameters derived from digital elevation model (DEM) data to predict whether a glacier was likely present in an alpine valley. The model's inputs are mainly derived from the basin hypsometry, and a new parameter termed the Hypothetical Basin Equilibrium Elevation (HBEE), which is based on the equilibrium elevation altitude (ELA) of a glacier. We used data mining techniques that comb through large data sets to find patterns for classification and prediction as the basis for the model. Four classifiers were utilized, and each was tested with two different training set/test data ratios of nearly 150 basins that were previously delineated as fully- or non-glaciated. The classifiers had a predictive accuracy of up to 90% with none falling below 72%. Two of the classifiers, classification tree and naïve-Bayes, have graphical outputs that visually describe the classification process, predictive results, and in the naïve-Bayes case, the relative effectiveness towards the model of each attribute. In all scenarios, the HBEE was found to be an accurate predictor for the model. The model can be applied to any area where glaciation may have occurred, but is particularly useful in areas where the valley is inaccessible for detailed field investigation.
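
    One widely used hypsometric summary that can be derived from DEM cells is the hypsometric integral, often approximated by the elevation-relief ratio (mean - min) / (max - min). The sketch below computes that approximation; the abstract does not state which hypsometric parameters the model actually uses, and the HBEE is not defined there, so neither is reproduced here. The elevations are hypothetical.

```python
def hypsometric_integral(elevations):
    """Approximate the hypsometric integral by the elevation-relief ratio.

    elevations is a sequence of DEM cell elevations within one basin.
    Values near 1 indicate most of the basin lies at high elevation.
    """
    lowest = min(elevations)
    highest = max(elevations)
    if highest == lowest:
        raise ValueError("basin has no relief")
    mean = sum(elevations) / len(elevations)
    return (mean - lowest) / (highest - lowest)


# Hypothetical DEM cell elevations (m) for a small basin.
basin = [1000.0, 1200.0, 1400.0, 2000.0]
print(hypsometric_integral(basin))
```

    A per-basin value like this is the kind of numeric attribute that classifiers such as a classification tree or naïve Bayes can take as input.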

  19. Instant Sublime Text starter

    CERN Document Server

    Haughee, Eric

    2013-01-01

    A starter which teaches the basic tasks to be performed with Sublime Text with the necessary practical examples and screenshots. This book requires only basic knowledge of the Internet and basic familiarity with any one of the three major operating systems, Windows, Linux, or Mac OS X. However, as Sublime Text 2 is primarily a text editor for writing software, many of the topics discussed will be specifically relevant to software development. That being said, the Sublime Text 2 Starter is also suitable for someone without a programming background who may be looking to learn one of the tools of

  20. A data mining based model for selecting type of treatment for kidney stone patients

    Directory of Open Access Journals (Sweden)

    Sepehri MM

    2009-09-01

    Full Text Available Background: Data mining as a multidisciplinary field is rooted in fields such as statistics, mathematics, computer science and artificial intelligence and has been gaining momentum in scientific, managerial, and executive applications in health care. Data mining can be defined as the automated extraction of valuable, practical and hidden knowledge and information from large data. Applying data mining in medical records and data is of utmost importance for health care givers and providers and brings vital and valuable outcomes. Data mining can help doctors come up with better recommendations and plans for treatment which actually in many respects have significant impact on patients' life and satisfaction. In this paper we have proposed and utilized data mining methods to extract hidden information in medical records of pelvis stone patients with ureteral stone. We have tried to design a decision support system model to be applicable for selecting type of treatment for these groups of patients. Methods: We gathered the needed information from Shahid Hashemi Nejad hospital. In this research we have used a decision tree as a data mining tool for selecting suitable treatment for patients with ureteral stone. This

  1. Linguistics in Text Interpretation

    DEFF Research Database (Denmark)

    Togeby, Ole

    2011-01-01

    A model for how text interpretation proceeds from what is pronounced, through what is said to what is communicated, and definition of the concepts 'presupposition' and 'implicature'....

  2. LocText

    DEFF Research Database (Denmark)

    Cejuela, Juan Miguel; Vinchurkar, Shrikant; Goldberg, Tatyana

    2018-01-01

    trees and was trained and evaluated on a newly improved LocTextCorpus. Combined with an automatic named-entity recognizer, LocText achieved high precision (P = 86%±4). After completing development, we mined the latest research publications for three organisms: human (Homo sapiens), budding yeast...

  3. The Perfect Text.

    Science.gov (United States)

    Russo, Ruth

    1998-01-01

    A chemistry teacher describes the elements of the ideal chemistry textbook. The perfect text is focused and helps students draw a coherent whole out of the myriad fragments of information and interpretation. The text would show chemistry as the central science necessary for understanding other sciences and would also root chemistry firmly in the…

  4. Text 2 Mind Map

    OpenAIRE

    Iona, John

    2017-01-01

    This is a review of the web resource 'Text 2 Mind Map' www.Text2MindMap.com. It covers what the resource is and how it might be used in library and education contexts, in particular by school librarians.

  5. Text File Comparator

    Science.gov (United States)

    Kotler, R. S.

    1983-01-01

    File Comparator program IFCOMP is a text file comparator for IBM OS/VS-compatible systems. IFCOMP accepts as input two text files and produces a listing of differences in pseudo-update form. IFCOMP is very useful in monitoring changes made to software at the source code level.
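
    The record above does not specify IFCOMP's "pseudo-update form", so the following is only a rough sketch of the general idea (two text files in, a listing of differences out) using the unified-diff format from Python's standard difflib module as a stand-in. The function name diff_listing and the sample inputs are hypothetical.

```python
import difflib


def diff_listing(old_text, new_text, old_name="old", new_name="new"):
    """Return a unified-diff listing of the differences between two texts.

    Note: unified diff is used here as a stand-in; it is not IFCOMP's
    pseudo-update form, which the abstract does not describe.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_name, tofile=new_name, lineterm=""
        )
    )


if __name__ == "__main__":
    # Hypothetical source snippets differing in one line.
    before = "LOAD A\nADD B\nSTORE C\n"
    after = "LOAD A\nADD D\nSTORE C\n"
    for line in diff_listing(before, after):
        print(line)
```

    Lines prefixed with "-" appear only in the first text and lines prefixed with "+" only in the second, which is the kind of information needed to monitor source-level changes.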

  6. Text and ideology: text-oriented discourse analysis

    Directory of Open Access Journals (Sweden)

    Maria Eduarda Gonçalves Peixoto

    2018-04-01

    Full Text Available The article aims to contribute to the understanding of the connection between text and ideology articulated by the text-oriented analysis of discourse (ADTO). Based on the reflections of Fairclough (1989, 2001, 2003) and Fairclough and Chouliaraki (1999), the debate presents the social ontology that ADTO uses to base its conception of social life as an open system and textually mediated; the article then explains the chronological-narrative development of the main critical theories of ideology, by virtue of which ADTO organizes the assumptions that underpin the particular use it makes of the term. Finally, the discussion presents the main aspects of the connection between text and ideology, offering a conceptual framework that can contribute to the domain of the theme according to a critical discourse analysis approach.

  7. EST: Evading Scientific Text.

    Science.gov (United States)

    Ward, Jeremy

    2001-01-01

    Examines chemical engineering students' attitudes to text and other parts of English language textbooks. A questionnaire was administered to a group of undergraduates. Results reveal one way students get around the problem of textbook reading. (Author/VWL)

  8. nal Sesotho texts

    African Journals Online (AJOL)

    with literary texts written in indigenous South African languages. The project ... Homi Bhabha uses the words of Salman Rushdie to underline the fact that new .... I could not conceptualise an African-language-to-African-language dictionary. An.

  9. Plagiarism in Academic Texts

    Directory of Open Access Journals (Sweden)

    Marta Eugenia Rojas-Porras

    2012-08-01

    Full Text Available The ethical and social responsibility of citing the sources in a scientific or artistic work is undeniable. This paper explores, in a preliminary way, academic plagiarism in its various forms. It includes findings based on a forensic analysis. The purpose of this paper is to raise awareness on the importance of considering these details when writing and publishing a text. Hopefully, this analysis may put the issue under discussion.

  10. Machine Translation from Text

    Science.gov (United States)

    Habash, Nizar; Olive, Joseph; Christianson, Caitlin; McCary, John

    Machine translation (MT) from text, the topic of this chapter, is perhaps the heart of the GALE project. Beyond being a well defined application that stands on its own, MT from text is the link between the automatic speech recognition component and the distillation component. The focus of MT in GALE is on translating from Arabic or Chinese to English. The three languages represent a wide range of linguistic diversity and make the GALE MT task rather challenging and exciting.

  11. Emotion detection from text

    Science.gov (United States)

    Ramalingam, V. V.; Pandian, A.; Jaiswal, Abhijeet; Bhatia, Nikhar

    2018-04-01

    This paper presents a novel method based on concept of Machine Learning for Emotion Detection using various algorithms of Support Vector Machine and major emotions described are linked to the Word-Net for enhanced accuracy. The approach proposed plays a promising role to augment the Artificial Intelligence in the near future and could be vital in optimization of Human-Machine Interface.

  12. UNPUBLISHED TEXTS / INEDITI

    African Journals Online (AJOL)

    (Pretoria). Her research interests concentrate – within the broader area of. Italian literature – on the re-interpretation and re-invention of myth by modern and contemporary writers from a Jungian archetypal perspective which privileges an interdisciplinary approach involving literature, anthropology and depth psychology.

  13. TEXT Energy Storage System

    International Nuclear Information System (INIS)

    Weldon, W.F.; Rylander, H.G.; Woodson, H.H.

    1977-01-01

    The Texas Experimental Tokamak (TEXT) Energy Storage System, designed by the Center for Electromechanics (CEM), consists of four 50 MJ, 125 V homopolar generators and their auxiliaries and is designed to power the toroidal and poloidal field coils of TEXT on a two-minute duty cycle. The four 50 MJ generators connected in series were chosen because they represent the minimum cost configuration and also represent a minimal scale up from the successful 5.0 MJ homopolar generator designed, built, and operated by the CEM.

  14. New mathematical cuneiform texts

    CERN Document Server

    Friberg, Jöran

    2016-01-01

    This monograph presents in great detail a large number of both unpublished and previously published Babylonian mathematical texts in the cuneiform script. It is a continuation of the work A Remarkable Collection of Babylonian Mathematical Texts (Springer 2007) written by Jöran Friberg, the leading expert on Babylonian mathematics. Focussing on the big picture, Friberg explores in this book several Late Babylonian arithmetical and metro-mathematical table texts from the sites of Babylon, Uruk and Sippar, collections of mathematical exercises from four Old Babylonian sites, as well as a new text from Early Dynastic/Early Sargonic Umma, which is the oldest known collection of mathematical exercises. A table of reciprocals from the end of the third millennium BC, differing radically from well-documented but younger tables of reciprocals from the Neo-Sumerian and Old-Babylonian periods, as well as a fragment of a Neo-Sumerian clay tablet showing a new type of a labyrinth are also discussed. The material is presen...

  15. The Emar Lexical Texts

    NARCIS (Netherlands)

    Gantzert, Merijn

    2011-01-01

    This four-part work provides a philological analysis and a theoretical interpretation of the cuneiform lexical texts found in the Late Bronze Age city of Emar, in present-day Syria. These word and sign lists, commonly dated to around 1100 BC, were almost all found in the archive of a single school.

  16. Text Induced Spelling Correction

    NARCIS (Netherlands)

    Reynaert, M.W.C.

    2004-01-01

    We present TISC, a language-independent and context-sensitive spelling checking and correction system designed to facilitate the automatic removal of non-word spelling errors in large corpora. Its lexicon is derived from a very large corpus of raw text, without supervision, and contains word

  17. Texts and Readers.

    Science.gov (United States)

    Iser, Wolfgang

    1980-01-01

    Notes that, since fictional discourse need not reflect prevailing systems of meaning and norms or values, readers gain detachment from their own presuppositions; by constituting and formulating text-sense, readers are constituting and formulating their own cognition and becoming aware of the operations for doing so. (FL)

  18. Documents and legal texts

    International Nuclear Information System (INIS)

    2017-01-01

    This section treats of the following documents and legal texts: 1 - Belgium, 29 June 2014: Act amending the Act of 22 July 1985 on Third-Party Liability in the Field of Nuclear Energy; 2 - Belgium, 7 December 2016: Act amending the Act of 22 July 1985 on Third-Party Liability in the Field of Nuclear Energy

  19. Strategy as Texts

    DEFF Research Database (Denmark)

    Obed Madsen, Søren

    This article shows empirically how managers translate a strategy plan at an individual level. By analysing how managers in three organizations translate strategies, it identifies that the translation happens in two steps: First, the managers decipher the strategy by coding the different parts of the strategy into four categories. Second, the managers produce new texts based on the original strategy document by using four different ways of translation models. The study’s findings contribute to three areas. Firstly, it shows that translation is more than a sociological process. It is also a craftsmanship that requires knowledge and skills, which unfortunately seems to be overlooked in both the literature and in practice. Secondly, it shows that even though a strategy text is in singular, the translation makes strategy plural. Thirdly, the article proposes a way to open up the black box of what...

  20. Reasoning with Annotations of Texts

    OpenAIRE

    Ma , Yue; Lévy , François; Ghimire , Sudeep

    2011-01-01

    International audience; Linguistic and semantic annotations are important features for text-based applications. However, achieving and maintaining a good quality of a set of annotations is known to be a complex task. Many ad hoc approaches have been developed to produce various types of annotations, while comparing those annotations to improve their quality is still rare. In this paper, we propose a framework in which both linguistic and domain information can cooperate to reason with annotat...

  1. Automatic text summarization

    CERN Document Server

    Torres Moreno, Juan Manuel

    2014-01-01

    This new textbook examines the motivations and the different algorithms for automatic document summarization (ADS). We survey the recent state of the art. The book shows the main problems of ADS, difficulties and the solutions provided by the community. It presents recent advances in ADS, as well as current applications and trends. The approaches are statistical, linguistic and symbolic. Several examples are included in order to clarify the theoretical concepts. The books currently available in the area of Automatic Document Summarization are not recent. Powerful algorithms have been develop

  2. Text Analysis: Critical Component of Planning for Text-Based Discussion Focused on Comprehension of Informational Texts

    Science.gov (United States)

    Kucan, Linda; Palincsar, Annemarie Sullivan

    2018-01-01

    This investigation focuses on a tool used in a reading methods course to introduce reading specialist candidates to text analysis as a critical component of planning for text-based discussions. Unlike planning that focuses mainly on important text content or information, a text analysis approach focuses both on content and how that content is…

  3. Reading Authentic Texts

    DEFF Research Database (Denmark)

    Balling, Laura Winther

    2013-01-01

    Most research on cognates has focused on words presented in isolation that are easily defined as cognate between L1 and L2. In contrast, this study investigates what counts as cognate in authentic texts and how such cognates are read. Participants with L1 Danish read news articles in their highly proficient L2, English, while their eye-movements were monitored. The experiment shows a cognate advantage for morphologically simple words, but only when cognateness is defined relative to translation equivalents that are appropriate in the context. For morphologically complex words, a cognate disadvantage... word predictability indexed by the conditional probability of each word....

  4. Documents and legal texts

    International Nuclear Information System (INIS)

    2016-01-01

    This section treats of the following documents and legal texts: 1 - Brazil: Law No. 13,260 of 16 March 2016 (To regulate the provisions of item XLIII of Article 5 of the Federal Constitution on terrorism, dealing with investigative and procedural provisions and redefining the concept of a terrorist organisation; and amends Laws No. 7,960 of 21 December 1989 and No. 12,850 of 2 August 2013); 2 - India: The Atomic Energy (Amendment) Act, 2015; Department Of Atomic Energy Notification (Civil Liability for Nuclear Damage); 3 - Japan: Act on Subsidisation, etc. for Nuclear Damage Compensation Funds following the implementation of the Convention on Supplementary Compensation for Nuclear Damage

  5. Journalistic Text Production

    DEFF Research Database (Denmark)

    Haugaard, Rikke Hartmann

    , a multiple case study investigated three professional text producers’ practices as they unfolded in their natural setting at the Spanish newspaper, El Mundo. • Results indicate that journalists’ revisions are related to form markedly more often than to content. • Results suggest two writing phases serving...... at the Spanish newspaper, El Mundo, in Madrid. The study applied a combination of quantitative and qualitative methods, i.e. keystroke logging, participant observation and retrospective interview. Results indicate that journalists’ revisions are related to form markedly more often than to content (approx. three...

  6. Text Mining in Organizational Research.

    Science.gov (United States)

    Kobayashi, Vladimer B; Mol, Stefan T; Berkers, Hannah A; Kismihók, Gábor; Den Hartog, Deanne N

    2018-07-01

    Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies.
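
    As an illustration of technique (b), distance and similarity computing, the sketch below compares two short texts by the cosine similarity of their raw term-count vectors. It is a minimal example under simplifying assumptions (whitespace tokenization, no stop-word removal or weighting) and is not taken from the article; the function names and the sample job-vacancy phrases are hypothetical.

```python
import math
from collections import Counter


def term_counts(text):
    """Lower-case the text, split on whitespace, and count each term."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    shared_terms = a.keys() & b.keys()
    dot = sum(a[t] * b[t] for t in shared_terms)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical job-vacancy snippets.
    vacancy_1 = term_counts("data analyst with statistics skills")
    vacancy_2 = term_counts("analyst with strong statistics background")
    print(cosine_similarity(vacancy_1, vacancy_2))
```

    Values near 1 indicate texts that use the same terms in similar proportions; values near 0 indicate little shared vocabulary. Pairwise similarities of this kind are a common input to clustering.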

  7. Weitere Texte physiognomischen Inhalts

    Directory of Open Access Journals (Sweden)

    Böck, Barbara

    2004-12-01

    Full Text Available The present article offers the editio princeps of three cuneiform texts, held in the British Museum (London) and the Vorderasiatisches Museum (Berlin), belonging to the Akkadian handbook of omens drawn from the physical appearance as well as the morals and behaviour of man. The book, comprising up to 27 chapters with more than 100 omens each, was entitled in antiquity Alamdimmû ('form, figure'). The edition of the three cuneiform tablets thus completes the author's monographic study on the ancient Mesopotamian divinatory discipline of physiognomy (Die babylonisch-assyrische Morphoskopie (Wien 2000) [=AfO Beih. 27]).

  8. Utah Text Retrieval Project

    Energy Technology Data Exchange (ETDEWEB)

    Hollaar, L A

    1983-10-01

    The Utah Text Retrieval project seeks well-engineered solutions to the implementation of large, inexpensive, rapid text information retrieval systems. The project has three major components. Perhaps the best known is the work on the specialized processors, particularly search engines, necessary to achieve the desired performance and cost. The other two concern the user interface to the system and the system's internal structure. The work on user interface development is not only concentrating on the syntax and semantics of the query language, but also on the overall environment the system presents to the user. Environmental enhancements include convenient ways to browse through retrieved documents, access to other information retrieval systems through gateways supporting a common command interface, and interfaces to word processing systems. The system's internal structure is based on a high-level data communications protocol linking the user interface, index processor, search processor, and other system modules. This allows them to be easily distributed in a multi- or specialized-processor configuration. It also allows new modules, such as a knowledge-based query reformulator, to be added. 15 references.

  9. An automated approach for finding variable-constant pairing bugs

    DEFF Research Database (Denmark)

    Lawall, Julia; Lo, David

    2010-01-01

    program-analysis and data-mining based approach to identify the uses of named constants and to identify anomalies in these uses. We have applied our approach to a recent version of the Linux kernel and have found a number of bugs affecting both correctness and software maintenance. Many of these bugs have been validated by the Linux developers....

  10. Documents and legal texts

    International Nuclear Information System (INIS)

    2013-01-01

    This section reprints a selection of recently published legislative texts and documents: - Russian Federation: Federal Law No.170 of 21 November 1995 on the use of atomic energy, Adopted by the State Duma on 20 October 1995; - Uruguay: Law No.19.056 On the Radiological Protection and Safety of Persons, Property and the Environment (4 January 2013); - Japan: Third Supplement to Interim Guidelines on Determination of the Scope of Nuclear Damage resulting from the Accident at the Tokyo Electric Power Company Fukushima Daiichi and Daini Nuclear Power Plants (concerning Damages related to Rumour-Related Damage in the Agriculture, Forestry, Fishery and Food Industries), 30 January 2013; - France and the United States: Joint Statement on Liability for Nuclear Damage (Aug 2013); - Franco-Russian Nuclear Power Declaration (1 November 2013)

  11. Systematic characterizations of text similarity in full text biomedical publications.

    Science.gov (United States)

    Sun, Zhaohui; Errami, Mounir; Long, Tara; Renard, Chris; Choradia, Nishant; Garner, Harold

    2010-09-15

    Computational methods have been used to find duplicate biomedical publications in MEDLINE. Full text articles are becoming increasingly available, yet the similarities among them have not been systematically studied. Here, we quantitatively investigated the full text similarity of biomedical publications in PubMed Central. 72,011 full text articles from PubMed Central (PMC) were parsed to generate three different datasets: full texts, sections, and paragraphs. Text similarity comparisons were performed on these datasets using the text similarity algorithm eTBLAST. We measured the frequency of similar text pairs and compared it among different datasets. We found that high abstract similarity can be used to predict high full text similarity with a specificity of 20.1% (95% CI [17.3%, 23.1%]) and sensitivity of 99.999%. Abstract similarity and full text similarity have a moderate correlation (Pearson correlation coefficient: -0.423) when the similarity ratio is above 0.4. Among pairs of articles in PMC, method sections are found to be the most repetitive (frequency of similar pairs, methods: 0.029, introduction: 0.0076, results: 0.0043). In contrast, among a set of manually verified duplicate articles, results are the most repetitive sections (frequency of similar pairs, results: 0.94, methods: 0.89, introduction: 0.82). Repetition of introduction and methods sections is more likely to be committed by the same authors (odds of a highly similar pair having at least one shared author, introduction: 2.31, methods: 1.83, results: 1.03). There is also significantly more similarity in pairs of review articles than in pairs containing one review and one nonreview paper (frequency of similar pairs: 0.0167 and 0.0023, respectively). While quantifying abstract similarity is an effective approach for finding duplicate citations, a comprehensive full text analysis is necessary to uncover all potential duplicate citations in the scientific literature and is helpful when
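
    For readers unfamiliar with the two screening measures quoted in this abstract, the sketch below shows how sensitivity and specificity are computed from confusion-matrix counts. The counts in the example are invented for illustration and are not the study's data; the function names are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of truly similar pairs that the screen flags: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of truly dissimilar pairs that the screen leaves alone: TN / (TN + FP)."""
    return true_negatives / (true_negatives + false_positives)


if __name__ == "__main__":
    # Invented counts: a screen that misses almost nothing but raises many false alarms.
    print(sensitivity(true_positives=99, false_negatives=1))
    print(specificity(true_negatives=20, false_positives=80))
```

    A screen with very high sensitivity and low specificity, as reported above for abstract similarity, rarely misses a highly similar full-text pair but flags many pairs that turn out not to be similar.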

  12. Interconnectedness und digitale Texte

    Directory of Open Access Journals (Sweden)

    Detlev Doherr

    2013-04-01

    Full Text Available Summary: The multimedia information services on the Internet are becoming ever more extensive and comprehensive, and libraries are also digitizing documents that exist only in printed form and placing them online. These documents can be found through online document management systems or search engines and then provided in common formats such as PDF. This article examines how the Humboldt Digital Library (HDL) works, which for more than ten years has made documents by Alexander von Humboldt freely available on the Web in English translation. Unlike a digital library, however, it does not merely provide digitized documents as scans or PDFs; it makes the text itself available in interlinked form. The system thus resembles an information system more than a digital library, which is also reflected in the available functions for finding texts in different versions and translations, comparing paragraphs of different documents, or displaying images in their context. The development of dynamic hyperlinks based on the individual text paragraphs of Humboldt's works, in the form of media assets, makes it possible to use the Google Maps programming interface for navigation by geography as well as by text content. Going beyond the service of a digital library, the HDL offers the prototype of a multidimensional information system that works with dynamic structures and enables extensive thematic analyses and comparisons.

  13. Documents and legal texts

    International Nuclear Information System (INIS)

    2015-01-01

    This section treats of the following Documents and legal texts: 1 - Canada: Nuclear Liability and Compensation Act (An Act respecting civil liability and compensation for damage in case of a nuclear incident, repealing the Nuclear Liability Act and making consequential amendments to other acts); 2 - Japan: Act on Compensation for Nuclear Damage (The purpose of this act is to protect persons suffering from nuclear damage and to contribute to the sound development of the nuclear industry by establishing a basic system regarding compensation in case of nuclear damage caused by reactor operation etc.); Act on Indemnity Agreements for Compensation of Nuclear Damage; 3 - Slovak Republic: Act on Civil Liability for Nuclear Damage and on its Financial Coverage and on Changes and Amendments to Certain Laws (This Act regulates: a) The civil liability for nuclear damage incurred in the causation of a nuclear incident, b) The scope of powers of the Nuclear Regulatory Authority (hereinafter only as the 'Authority') in relation to the application of this Act, c) The competence of the National Bank of Slovakia in relation to the supervised financial market entities in the financial coverage of liability for nuclear damage; and d) The penalties for violation of this Act)

  14. Documents and legal texts

    International Nuclear Information System (INIS)

    2014-01-01

    This section of the Bulletin presents the recently published documents and legal texts sorted by country: - Brazil: Resolution No. 169 of 30 April 2014. - Japan: Act Concerning Exceptions to Interruption of Prescription Pertaining to Use of Settlement Mediation Procedures by the Dispute Reconciliation Committee for Nuclear Damage Compensation in relation to Nuclear Damage Compensation Disputes Pertaining to the Great East Japan Earthquake (Act No. 32 of 5 June 2013); Act Concerning Measures to Achieve Prompt and Assured Compensation for Nuclear Damage Arising from the Nuclear Plant Accident following the Great East Japan Earthquake and Exceptions to the Extinctive Prescription, etc. of the Right to Claim Compensation for Nuclear Damage (Act No. 97 of 11 December 2013); Fourth Supplement to Interim Guidelines on Determination of the Scope of Nuclear Damage Resulting from the Accident at the Tokyo Electric Power Company Fukushima Daiichi and Daini Nuclear Power Plants (Concerning Damages Associated with the Prolongation of Evacuation Orders, etc.); Outline of 'Fourth Supplement to Interim Guidelines (Concerning Damages Associated with the Prolongation of Evacuation Orders, etc.)'. - OECD Nuclear Energy Agency: Decision and Recommendation of the Steering Committee Concerning the Application of the Paris Convention to Nuclear Installations in the Process of Being Decommissioned; Joint Declaration on the Security of Supply of Medical Radioisotopes. - United Arab Emirates: Federal Decree No. (51) of 2014 Ratifying the Convention on Supplementary Compensation for Nuclear Damage; Ratification of the Federal Supreme Council of Federal Decree No. (51) of 2014 Ratifying the Convention on Supplementary Compensation for Nuclear Damage

  15. Automated analysis of instructional text

    Energy Technology Data Exchange (ETDEWEB)

    Norton, L.M.

    1983-05-01

    The development of a capability for automated processing of natural language text is a long-range goal of artificial intelligence. This paper discusses an investigation into the issues involved in the comprehension of descriptive, as opposed to illustrative, textual material. The comprehension process is viewed as the conversion of knowledge from one representation into another. The proposed target representation consists of statements of the prolog language, which can be interpreted both declaratively and procedurally, much like production rules. A computer program has been written to model in detail some ideas about this process. The program successfully analyzes several heavily edited paragraphs adapted from an elementary textbook on programming, automatically synthesizing as a result of the analysis a working Prolog program which, when executed, can parse and interpret let commands in the basic language. The paper discusses the motivations and philosophy of the project, the many kinds of prerequisite knowledge which are necessary, and the structure of the text analysis program. A sentence-by-sentence account of the analysis of the sample text is presented, describing the syntactic and semantic processing which is involved. The paper closes with a discussion of lessons learned from the project, possible alternative approaches, and possible extensions for future work. The entire project is presented as illustrative of the nature and complexity of the text analysis process, rather than as providing definitive or optimal solutions to any aspects of the task. 12 references.

  16. A quick survey of text categorization algorithms

    Directory of Open Access Journals (Sweden)

    Dan MUNTEANU

    2007-12-01

    Full Text Available This paper contains an overview of basic formulations and approaches to text classification. This paper surveys the algorithms used in text categorization: handcrafted rules, decision trees, decision rules, on-line learning, linear classifier, Rocchio’s algorithm, k Nearest Neighbor (kNN), Support Vector Machines (SVM).
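
    To make one of the listed algorithms concrete, here is a minimal sketch of a Rocchio-style classifier in its simplest form: each category is represented by the centroid of its training documents, and a new text is assigned to the category whose centroid is most similar by cosine. This simplification ignores the negative-example term of the full Rocchio formula, uses raw term counts rather than tf-idf weights, and is not taken from the surveyed paper; all names and training texts are hypothetical.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Bag-of-words term counts (lower-cased, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labelled_docs):
    """Average the term vectors of each category's training documents."""
    sums = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in labelled_docs:
        sums[label].update(vectorize(text))
        doc_counts[label] += 1
    return {
        label: {t: w / doc_counts[label] for t, w in vec.items()}
        for label, vec in sums.items()
    }


def classify(text, centroids):
    """Return the category whose centroid is closest to the text."""
    query = vectorize(text)
    return max(centroids, key=lambda label: cosine(query, centroids[label]))


if __name__ == "__main__":
    training = [
        ("team wins football match", "sport"),
        ("player scores goal", "sport"),
        ("stock market falls", "finance"),
        ("bank raises interest rates", "finance"),
    ]
    model = train_centroids(training)
    print(classify("football team match", model))
```

    A kNN classifier would differ mainly in comparing the query with every training document rather than with one centroid per category.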

  17. Text

    International Nuclear Information System (INIS)

    Anon.

    2009-01-01

    The purpose of this act is to safeguard against the dangers and harmful effects of radioactive waste and to contribute to public safety and environmental protection by laying down requirements for the safe and efficient management of radioactive waste. The act covers definitions, its interrelation with other legislation, the responsibilities of the state and local governments, the responsibilities of radioactive waste management companies and generators, the formulation of the basic plan for the control of radioactive waste, radioactive waste management (with public information, financing and part of spent fuel management), the Korea Radioactive Waste Management Corporation (business activities, budget), the establishment of a radioactive waste fund to secure the financial resources required for radioactive waste management, and penalties in case of improper operation of radioactive waste management. (N.C.)

  18. New Historicism: Text and Context

    Directory of Open Access Journals (Sweden)

    Violeta M. Vesić

    2016-02-01

    Full Text Available During most of the twentieth century history was seen as a phenomenon outside of literature that guaranteed the veracity of literary interpretation. History was unique and it functioned as a basis for reading literary works. During the seventies of the twentieth century there occurred a change of attitude towards history in American literary theory, and there appeared a new theoretical approach which soon became known as New Historicism. Since its inception, New Historicism has been identified with the study of Renaissance and Romanticism, but nowadays it has been increasingly involved in other literary trends. Although there are great differences in the arguments and practices among the various representatives of this school, New Historicism has clearly recognizable features and many new historicists will agree with the statement of Walter Cohen that New Historicism, when it appeared in the eighties, represented something quite new in reference to the studies of theory, criticism and history (Cohen 1987, 33). Theoretical connection with Bakhtin, Foucault and Marx is clear, as well as a kind of uneasy tie with deconstruction and the work of Paul de Man. At the center of this approach is a renewed interest in the study of literary works in the light of historical and political circumstances in which they were created. Foucault encouraged readers to begin to move literary texts and to link them with discourses and representations that are not literary, as well as to examine the sociological aspects of the texts in order to take part in the social struggles of today. The study of literary works using New Historicism is the study of politics, history, culture and circumstances in which these works were created. With regard to one of the main facts at the center of this criticism, that history cannot be viewed objectively and that reality can only be understood through a cultural context that reveals the work, re-reading and interpretation of

  19. Teaching Text Structure: Examining the Affordances of Children's Informational Texts

    Science.gov (United States)

    Jones, Cindy D.; Clark, Sarah K.; Reutzel, D. Ray

    2016-01-01

    This study investigated the affordances of informational texts to serve as model texts for teaching text structure to elementary school children. Content analysis of a random sampling of children's informational texts from top publishers was conducted on text structure organization and on the inclusion of text features as signals of text…

  20. Text Mining for Protein Docking.

    Directory of Open Access Journals (Sweden)

    Varsha D Badal

    2015-12-01

    Full Text Available The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound
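
    The filtering step described in this abstract (bag-of-words features fed to a trained linear model that separates relevant from irrelevant abstracts) can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: a simple perceptron is swapped in for the Support Vector Machine to keep the example short and dependency-free, and the example sentences, labels and function names are all invented for this illustration.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Map a text to word counts; word order is discarded."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def score(weights, bias, feats):
    """Linear decision value for one bag-of-words feature vector."""
    return bias + sum(weights[t] * c for t, c in feats.items())


def train_perceptron(docs, labels, epochs=20):
    """Train a linear classifier; labels are +1 (relevant) or -1 (irrelevant)."""
    weights, bias = Counter(), 0
    for _ in range(epochs):
        mistakes = 0
        for doc, y in zip(docs, labels):
            feats = bag_of_words(doc)
            if y * score(weights, bias, feats) <= 0:
                # Misclassified: move the weights towards the correct side.
                for term, count in feats.items():
                    weights[term] += y * count
                bias += y
                mistakes += 1
        if mistakes == 0:  # every training document is classified correctly
            break
    return weights, bias


def is_relevant(model, text):
    weights, bias = model
    return score(weights, bias, bag_of_words(text)) > 0


# Hypothetical training sentences standing in for retrieved abstracts.
DOCS = [
    "residue arg45 binds the interface of the complex",
    "mutation of interface residue abolishes binding",
    "expression levels were measured in mouse tissue",
    "the gene is expressed in liver tissue",
]
LABELS = [1, 1, -1, -1]

model = train_perceptron(DOCS, LABELS)
kept = [d for d in ["interface residue binds", "measured in liver tissue"]
        if is_relevant(model, d)]
print(kept)
```

    A margin-based model such as an SVM would normally generalize better than a perceptron on real abstracts; the sketch only shows how the bag-of-words representation plugs into a linear filter.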

  1. Important Text Characteristics for Early-Grades Text Complexity

    Science.gov (United States)

    Fitzgerald, Jill; Elmore, Jeff; Koons, Heather; Hiebert, Elfrieda H.; Bowen, Kimberly; Sanford-Moore, Eleanor E.; Stenner, A. Jackson

    2015-01-01

    The Common Core set a standard for all children to read increasingly complex texts throughout schooling. The purpose of the present study was to explore text characteristics specifically in relation to early-grades text complexity. Three hundred fifty primary-grades texts were selected and digitized. Twenty-two text characteristics were identified…

  2. Text Manipulation Techniques and Foreign Language Composition.

    Science.gov (United States)

    Walker, Ronald W.

    1982-01-01

    Discusses an approach to teaching second language composition which emphasizes (1) careful analysis of model texts from a limited, but well-defined perspective and (2) the application of text manipulation techniques developed by the word processing industry to student compositions. (EKN)

  3. Text analysis methods, text analysis apparatuses, and articles of manufacture

    Science.gov (United States)

    Whitney, Paul D; Willse, Alan R; Lopresti, Charles A; White, Amanda M

    2014-10-28

    Text analysis methods, text analysis apparatuses, and articles of manufacture are described according to some aspects. In one aspect, a text analysis method includes accessing information indicative of data content of a collection of text comprising a plurality of different topics, using a computing device, analyzing the information indicative of the data content, and using results of the analysis, identifying a presence of a new topic in the collection of text.

  4. Classroom Texting in College Students

    Science.gov (United States)

    Pettijohn, Terry F.; Frazier, Erik; Rieser, Elizabeth; Vaughn, Nicholas; Hupp-Wilds, Bobbi

    2015-01-01

    A 21-item survey on texting in the classroom was given to 235 college students. Overall, 99.6% of students owned a cellphone and 98% texted daily. Of the 138 students who texted in the classroom, most texted friends or significant others, and indicated that the reason for classroom texting was boredom or work. Students who texted sent a mean of 12.21…

  5. Observation of [Formula: see text] and [Formula: see text] decays.

    Science.gov (United States)

    Aaij, R; Adeva, B; Adinolfi, M; Ajaltouni, Z; Akar, S; Albrecht, J; Alessio, F; Alexander, M; Ali, S; Alkhazov, G; Alvarez Cartelle, P; Alves, A A; Amato, S; Amerio, S; Amhis, Y; An, L; Anderlini, L; Andreassi, G; Andreotti, M; Andrews, J E; Appleby, R B; Archilli, F; d'Argent, P; Arnau Romeu, J; Artamonov, A; Artuso, M; Aslanides, E; Auriemma, G; Baalouch, M; Babuschkin, I; Bachmann, S; Back, J J; Badalov, A; Baesso, C; Baker, S; Baldini, W; Barlow, R J; Barschel, C; Barsuk, S; Barter, W; Baszczyk, M; Batozskaya, V; Batsukh, B; Battista, V; Bay, A; Beaucourt, L; Beddow, J; Bedeschi, F; Bediaga, I; Bel, L J; Bellee, V; Belloli, N; Belous, K; Belyaev, I; Ben-Haim, E; Bencivenni, G; Benson, S; Benton, J; Berezhnoy, A; Bernet, R; Bertolin, A; Betancourt, C; Betti, F; Bettler, M-O; van Beuzekom, M; Bezshyiko, Ia; Bifani, S; Billoir, P; Bird, T; Birnkraut, A; Bitadze, A; Bizzeti, A; Blake, T; Blanc, F; Blouw, J; Blusk, S; Bocci, V; Boettcher, T; Bondar, A; Bondar, N; Bonivento, W; Bordyuzhin, I; Borgheresi, A; Borghi, S; Borisyak, M; Borsato, M; Bossu, F; Boubdir, M; Bowcock, T J V; Bowen, E; Bozzi, C; Braun, S; Britsch, M; Britton, T; Brodzicka, J; Buchanan, E; Burr, C; Bursche, A; Buytaert, J; Cadeddu, S; Calabrese, R; Calvi, M; Calvo Gomez, M; Camboni, A; Campana, P; Campora Perez, D H; Capriotti, L; Carbone, A; Carboni, G; Cardinale, R; Cardini, A; Carniti, P; Carson, L; Carvalho Akiba, K; Casse, G; Cassina, L; Castillo Garcia, L; Cattaneo, M; Cauet, Ch; Cavallero, G; Cenci, R; Charles, M; Charpentier, Ph; Chatzikonstantinidis, G; Chefdeville, M; Chen, S; Cheung, S-F; Chobanova, V; Chrzaszcz, M; Cid Vidal, X; Ciezarek, G; Clarke, P E L; Clemencic, M; Cliff, H V; Closier, J; Coco, V; Cogan, J; Cogneras, E; Cogoni, V; Cojocariu, L; Collazuol, G; Collins, P; Comerma-Montells, A; Contu, A; Cook, A; Coombs, G; Coquereau, S; Corti, G; Corvo, M; Costa Sobral, C M; Couturier, B; Cowan, G A; Craik, D C; Crocombe, A; Cruz Torres, M; Cunliffe, S; Currie, R; D'Ambrosio, C; 
Da Cunha Marinho, F; Dall'Occo, E; Dalseno, J; David, P N Y; Davis, A; De Aguiar Francisco, O; De Bruyn, K; De Capua, S; De Cian, M; De Miranda, J M; De Paula, L; De Serio, M; De Simone, P; Dean, C-T; Decamp, D; Deckenhoff, M; Del Buono, L; Demmer, M; Dendek, A; Derkach, D; Deschamps, O; Dettori, F; Dey, B; Di Canto, A; Dijkstra, H; Dordei, F; Dorigo, M; Dosil Suárez, A; Dovbnya, A; Dreimanis, K; Dufour, L; Dujany, G; Dungs, K; Durante, P; Dzhelyadin, R; Dziurda, A; Dzyuba, A; Déléage, N; Easo, S; Ebert, M; Egede, U; Egorychev, V; Eidelman, S; Eisenhardt, S; Eitschberger, U; Ekelhof, R; Eklund, L; Ely, S; Esen, S; Evans, H M; Evans, T; Falabella, A; Farley, N; Farry, S; Fay, R; Fazzini, D; Ferguson, D; Fernandez Prieto, A; Ferrari, F; Ferreira Rodrigues, F; Ferro-Luzzi, M; Filippov, S; Fini, R A; Fiore, M; Fiorini, M; Firlej, M; Fitzpatrick, C; Fiutowski, T; Fleuret, F; Fohl, K; Fontana, M; Fontanelli, F; Forshaw, D C; Forty, R; Franco Lima, V; Frank, M; Frei, C; Fu, J; Furfaro, E; Färber, C; Gallas Torreira, A; Galli, D; Gallorini, S; Gambetta, S; Gandelman, M; Gandini, P; Gao, Y; Garcia Martin, L M; García Pardiñas, J; Garra Tico, J; Garrido, L; Garsed, P J; Gascon, D; Gaspar, C; Gavardi, L; Gazzoni, G; Gerick, D; Gersabeck, E; Gersabeck, M; Gershon, T; Ghez, Ph; Gianì, S; Gibson, V; Girard, O G; Giubega, L; Gizdov, K; Gligorov, V V; Golubkov, D; Golutvin, A; Gomes, A; Gorelov, I V; Gotti, C; Govorkova, E; Grabalosa Gándara, M; Graciani Diaz, R; Granado Cardoso, L A; Graugés, E; Graverini, E; Graziani, G; Grecu, A; Griffith, P; Grillo, L; Gruberg Cazon, B R; Grünberg, O; Gushchin, E; Guz, Yu; Gys, T; Göbel, C; Hadavizadeh, T; Hadjivasiliou, C; Haefeli, G; Haen, C; Haines, S C; Hall, S; Hamilton, B; Han, X; Hansmann-Menzemer, S; Harnew, N; Harnew, S T; Harrison, J; Hatch, M; He, J; Head, T; Heister, A; Hennessy, K; Henrard, P; Henry, L; Hernando Morata, J A; van Herwijnen, E; Heß, M; Hicheur, A; Hill, D; Hombach, C; Hopchev, H; Hulsbergen, W; Humair, T; Hushchyn, 
M; Hussain, N; Hutchcroft, D; Idzik, M; Ilten, P; Jacobsson, R; Jaeger, A; Jalocha, J; Jans, E; Jawahery, A; Jiang, F; John, M; Johnson, D; Jones, C R; Joram, C; Jost, B; Jurik, N; Kandybei, S; Kanso, W; Karacson, M; Kariuki, J M; Karodia, S; Kecke, M; Kelsey, M; Kenyon, I R; Kenzie, M; Ketel, T; Khairullin, E; Khanji, B; Khurewathanakul, C; Kirn, T; Klaver, S; Klimaszewski, K; Koliiev, S; Kolpin, M; Komarov, I; Koopman, R F; Koppenburg, P; Kosmyntseva, A; Kozachuk, A; Kozeiha, M; Kravchuk, L; Kreplin, K; Kreps, M; Krokovny, P; Kruse, F; Krzemien, W; Kucewicz, W; Kucharczyk, M; Kudryavtsev, V; Kuonen, A K; Kurek, K; Kvaratskheliya, T; Lacarrere, D; Lafferty, G; Lai, A; Lanfranchi, G; Langenbruch, C; Latham, T; Lazzeroni, C; Le Gac, R; van Leerdam, J; Lees, J-P; Leflat, A; Lefrançois, J; Lefèvre, R; Lemaitre, F; Lemos Cid, E; Leroy, O; Lesiak, T; Leverington, B; Li, Y; Likhomanenko, T; Lindner, R; Linn, C; Lionetto, F; Liu, B; Liu, X; Loh, D; Longstaff, I; Lopes, J H; Lucchesi, D; Lucio Martinez, M; Luo, H; Lupato, A; Luppi, E; Lupton, O; Lusiani, A; Lyu, X; Machefert, F; Maciuc, F; Maev, O; Maguire, K; Malde, S; Malinin, A; Maltsev, T; Manca, G; Mancinelli, G; Manning, P; Maratas, J; Marchand, J F; Marconi, U; Marin Benito, C; Marino, P; Marks, J; Martellotti, G; Martin, M; Martinelli, M; Martinez Santos, D; Martinez Vidal, F; Martins Tostes, D; Massacrier, L M; Massafferri, A; Matev, R; Mathad, A; Mathe, Z; Matteuzzi, C; Mauri, A; Maurin, B; Mazurov, A; McCann, M; McCarthy, J; McNab, A; McNulty, R; Meadows, B; Meier, F; Meissner, M; Melnychuk, D; Merk, M; Merli, A; Michielin, E; Milanes, D A; Minard, M-N; Mitzel, D S; Mogini, A; Molina Rodriguez, J; Monroy, I A; Monteil, S; Morandin, M; Morawski, P; Mordà, A; Morello, M J; Moron, J; Morris, A B; Mountain, R; Muheim, F; Mulder, M; Mussini, M; Müller, D; Müller, J; Müller, K; Müller, V; Naik, P; Nakada, T; Nandakumar, R; Nandi, A; Nasteva, I; Needham, M; Neri, N; Neubert, S; Neufeld, N; Neuner, M; Nguyen, A D; 
Nguyen, T D; Nguyen-Mau, C; Nieswand, S; Niet, R; Nikitin, N; Nikodem, T; Novoselov, A; O'Hanlon, D P; Oblakowska-Mucha, A; Obraztsov, V; Ogilvy, S; Oldeman, R; Onderwater, C J G; Otalora Goicochea, J M; Otto, A; Owen, P; Oyanguren, A; Pais, P R; Palano, A; Palombo, F; Palutan, M; Panman, J; Papanestis, A; Pappagallo, M; Pappalardo, L L; Parker, W; Parkes, C; Passaleva, G; Pastore, A; Patel, G D; Patel, M; Patrignani, C; Pearce, A; Pellegrino, A; Penso, G; Pepe Altarelli, M; Perazzini, S; Perret, P; Pescatore, L; Petridis, K; Petrolini, A; Petrov, A; Petruzzo, M; Picatoste Olloqui, E; Pietrzyk, B; Pikies, M; Pinci, D; Pistone, A; Piucci, A; Playfer, S; Plo Casasus, M; Poikela, T; Polci, F; Poluektov, A; Polyakov, I; Polycarpo, E; Pomery, G J; Popov, A; Popov, D; Popovici, B; Poslavskii, S; Potterat, C; Price, E; Price, J D; Prisciandaro, J; Pritchard, A; Prouve, C; Pugatch, V; Puig Navarro, A; Punzi, G; Qian, W; Quagliani, R; Rachwal, B; Rademacker, J H; Rama, M; Ramos Pernas, M; Rangel, M S; Raniuk, I; Ratnikov, F; Raven, G; Redi, F; Reichert, S; Dos Reis, A C; Remon Alepuz, C; Renaudin, V; Ricciardi, S; Richards, S; Rihl, M; Rinnert, K; Rives Molina, V; Robbe, P; Rodrigues, A B; Rodrigues, E; Rodriguez Lopez, J A; Rodriguez Perez, P; Rogozhnikov, A; Roiser, S; Rollings, A; Romanovskiy, V; Romero Vidal, A; Ronayne, J W; Rotondo, M; Rudolph, M S; Ruf, T; Ruiz Valls, P; Saborido Silva, J J; Sadykhov, E; Sagidova, N; Saitta, B; Salustino Guimaraes, V; Sanchez Mayordomo, C; Sanmartin Sedes, B; Santacesaria, R; Santamarina Rios, C; Santimaria, M; Santovetti, E; Sarti, A; Satriano, C; Satta, A; Saunders, D M; Savrina, D; Schael, S; Schellenberg, M; Schiller, M; Schindler, H; Schlupp, M; Schmelling, M; Schmelzer, T; Schmidt, B; Schneider, O; Schopper, A; Schubert, K; Schubiger, M; Schune, M-H; Schwemmer, R; Sciascia, B; Sciubba, A; Semennikov, A; Sergi, A; Serra, N; Serrano, J; Sestini, L; Seyfert, P; Shapkin, M; Shapoval, I; Shcheglov, Y; Shears, T; Shekhtman, L; 
Shevchenko, V; Siddi, B G; Silva Coutinho, R; Silva de Oliveira, L; Simi, G; Simone, S; Sirendi, M; Skidmore, N; Skwarnicki, T; Smith, E; Smith, I T; Smith, J; Smith, M; Snoek, H; Sokoloff, M D; Soler, F J P; Souza De Paula, B; Spaan, B; Spradlin, P; Sridharan, S; Stagni, F; Stahl, M; Stahl, S; Stefko, P; Stefkova, S; Steinkamp, O; Stemmle, S; Stenyakin, O; Stevenson, S; Stoica, S; Stone, S; Storaci, B; Stracka, S; Straticiuc, M; Straumann, U; Sun, L; Sutcliffe, W; Swientek, K; Syropoulos, V; Szczekowski, M; Szumlak, T; T'Jampens, S; Tayduganov, A; Tekampe, T; Tellarini, G; Teubert, F; Thomas, E; van Tilburg, J; Tilley, M J; Tisserand, V; Tobin, M; Tolk, S; Tomassetti, L; Tonelli, D; Topp-Joergensen, S; Toriello, F; Tournefier, E; Tourneur, S; Trabelsi, K; Traill, M; Tran, M T; Tresch, M; Trisovic, A; Tsaregorodtsev, A; Tsopelas, P; Tully, A; Tuning, N; Ukleja, A; Ustyuzhanin, A; Uwer, U; Vacca, C; Vagnoni, V; Valassi, A; Valat, S; Valenti, G; Vallier, A; Vazquez Gomez, R; Vazquez Regueiro, P; Vecchi, S; van Veghel, M; Velthuis, J J; Veltri, M; Veneziano, G; Venkateswaran, A; Vernet, M; Vesterinen, M; Viaud, B; Vieira, D; Vieites Diaz, M; Viemann, H; Vilasis-Cardona, X; Vitti, M; Volkov, V; Vollhardt, A; Voneki, B; Vorobyev, A; Vorobyev, V; Voß, C; de Vries, J A; Vázquez Sierra, C; Waldi, R; Wallace, C; Wallace, R; Walsh, J; Wang, J; Ward, D R; Wark, H M; Watson, N K; Websdale, D; Weiden, A; Whitehead, M; Wicht, J; Wilkinson, G; Wilkinson, M; Williams, M; Williams, M P; Williams, M; Williams, T; Wilson, F F; Wimberley, J; Wishahi, J; Wislicki, W; Witek, M; Wormser, G; Wotton, S A; Wraight, K; Wyllie, K; Xie, Y; Xing, Z; Xu, Z; Yang, Z; Yin, H; Yu, J; Yuan, X; Yushchenko, O; Zarebski, K A; Zavertyaev, M; Zhang, L; Zhang, Y; Zhang, Y; Zhelezov, A; Zheng, Y; Zhokhov, A; Zhu, X; Zhukov, V; Zucchelli, S

    2017-01-01

    The decays [Formula: see text] and [Formula: see text] are observed for the first time using a data sample corresponding to an integrated luminosity of 3.0 fb[Formula: see text], collected by the LHCb experiment in proton-proton collisions at the centre-of-mass energies of 7 and 8[Formula: see text]. The branching fractions relative to that of [Formula: see text] are measured to be [Formula: see text] where the first uncertainties are statistical and the second are systematic.

  6. Mining the Text: 34 Text Features that Can Ease or Obstruct Text Comprehension and Use

    Science.gov (United States)

    White, Sheida

    2012-01-01

    This article presents 34 characteristics of texts and tasks ("text features") that can make continuous (prose), noncontinuous (document), and quantitative texts easier or more difficult for adolescents and adults to comprehend and use. The text features were identified by examining the assessment tasks and associated texts in the national…

  7. Unsupervised information extraction by text segmentation

    CERN Document Server

    Cortez, Eli

    2013-01-01

    A new unsupervised approach to the problem of Information Extraction by Text Segmentation (IETS) is proposed, implemented and evaluated herein. The authors' approach relies on information available on pre-existing data to learn how to associate segments in the input string with attributes of a given domain relying on a very effective set of content-based features. The effectiveness of the content-based features is also exploited to directly learn from test data structure-based features, with no previous human-driven training, a feature unique to the presented approach. Based on the approach, a

  8. From Text to Political Positions: Text analysis across disciplines

    NARCIS (Netherlands)

    Kaal, A.R.; Maks, I.; van Elfrinkhof, A.M.E.

    2014-01-01

    From Text to Political Positions addresses cross-disciplinary innovation in political text analysis for party positioning. Drawing on political science, computational methods and discourse analysis, it presents a diverse collection of analytical models including pure quantitative and

  9. Text mining from ontology learning to automated text processing applications

    CERN Document Server

    Biemann, Chris

    2014-01-01

    This book comprises a set of articles that specify the methodology of text mining, describe the creation of lexical resources in the framework of text mining and use text mining for various tasks in natural language processing (NLP). The analysis of large amounts of textual data is a prerequisite to build lexical resources such as dictionaries and ontologies and also has direct applications in automated text processing in fields such as history, healthcare and mobile applications, just to name a few. This volume gives an update in terms of the recent gains in text mining methods and reflects

  10. Informational Text and the CCSS

    Science.gov (United States)

    Aspen Institute, 2012

    2012-01-01

    What constitutes an informational text covers a broad swath of different types of texts. Biographies & memoirs, speeches, opinion pieces & argumentative essays, and historical, scientific or technical accounts of a non-narrative nature are all included in what the Common Core State Standards (CCSS) envisions as informational text. Also included…

  11. The Only Safe SMS Texting Is No SMS Texting.

    Science.gov (United States)

    Toth, Cheryl; Sacopulos, Michael J

    2015-01-01

    Many physicians and practice staff use short messaging service (SMS) text messaging to communicate with patients. But SMS text messaging is unencrypted, insecure, and does not meet HIPAA requirements. In addition, the short and abbreviated nature of text messages creates opportunities for misinterpretation, and can negatively impact patient safety and care. Until recently, asking patients to sign a statement that they understand and accept these risks--as well as having policies, device encryption, and cyber insurance in place--would have been enough to mitigate the risk of using SMS text in a medical practice. But new trends and policies have made SMS text messaging unsafe under any circumstance. This article explains these trends and policies, as well as why only secure texting or secure messaging should be used for physician-patient communication.

  12. Predicting Prosody from Text for Text-to-Speech Synthesis

    CERN Document Server

    Rao, K Sreenivasa

    2012-01-01

    Predicting Prosody from Text for Text-to-Speech Synthesis covers the specific aspects of prosody, mainly focusing on how to predict the prosodic information from linguistic text, and then how to exploit the predicted prosodic knowledge for various speech applications. Author K. Sreenivasa Rao discusses proposed methods along with state-of-the-art techniques for the acquisition and incorporation of prosodic knowledge for developing speech systems. Positional, contextual and phonological features are proposed for representing the linguistic and production constraints of the sound units present in the text. This book is intended for graduate students and researchers working in the area of speech processing.

  13. Monitoring interaction and collective text production through text mining

    Directory of Open Access Journals (Sweden)

    Macedo, Alexandra Lorandi

    2014-04-01

    Full Text Available This article presents the Concepts Network tool, developed using text mining technology. The main objective of this tool is to extract and relate terms of greatest incidence from a text and exhibit the results in the form of a graph. The Network was implemented in the Collective Text Editor (CTE), which is an online tool that allows the production of texts in synchronized or non-synchronized forms. This article describes the application of the Network both in texts produced collectively and texts produced in a forum. The purpose of the tool is to offer support to the teacher in managing the high volume of data generated in the process of interaction amongst students and in the construction of the text. Specifically, the aim is to facilitate the teacher’s job by allowing him/her to process data in a shorter time than is currently demanded. The results suggest that the Concepts Network can aid the teacher, as it provides indicators of the quality of the text produced. Moreover, messages posted in forums can be analyzed without their content necessarily having to be pre-read.
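
    The idea of extracting the most frequent terms and relating them in a graph can be sketched as below. This is only one plausible reading of the abstract, not the Concepts Network implementation: the stopword list, the choice of sentence-level co-occurrence as the relation between terms, and all names are assumptions made for this example.

```python
import re
from collections import Counter
from itertools import combinations

# Small illustrative stopword list (an assumption, not from the article).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}


def concept_graph(text, top_n=3):
    """Return (term frequencies, weighted edges between the top_n terms).

    Two terms are linked when they appear in the same sentence; the edge
    weight counts how many sentences contain both.
    """
    sentences = [s for s in re.split(r"[.!?]+", text.lower()) if s.strip()]
    tokenized = [
        [w for w in re.findall(r"[a-z]+", s) if w not in STOPWORDS]
        for s in sentences
    ]
    freq = Counter(w for sent in tokenized for w in sent)
    top_terms = {w for w, _ in freq.most_common(top_n)}

    edges = Counter()
    for sent in tokenized:
        present = sorted(set(sent) & top_terms)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return freq, edges


# Hypothetical forum-style text, invented for illustration.
TEXT = ("Students write the text together. "
        "The teacher reads the text. "
        "Students ask the teacher.")

freq, edges = concept_graph(TEXT)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b} (weight {weight})")
```

    The resulting edge list can be handed to any graph-drawing tool; the nodes are the high-incidence terms and the weights indicate how often two terms occur together.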

  14. Text recycling: acceptable or misconduct?

    Science.gov (United States)

    Harriman, Stephanie; Patel, Jigisha

    2014-08-16

    Text recycling, also referred to as self-plagiarism, is the reproduction of an author's own text from a previous publication in a new publication. Opinions on the acceptability of this practice vary, with some viewing it as acceptable and efficient, and others as misleading and unacceptable. In light of the lack of consensus, journal editors often have difficulty deciding how to act upon the discovery of text recycling. In response to these difficulties, we have created a set of guidelines for journal editors on how to deal with text recycling. In this editorial, we discuss some of the challenges of developing these guidelines, and how authors can avoid undisclosed text recycling.

  15. TEXT DEIXIS IN NARRATIVE SEQUENCES

    Directory of Open Access Journals (Sweden)

    Josep Rivera

    2007-06-01

    Full Text Available This study looks at demonstrative descriptions, regarding them as text-deictic procedures which contribute to weave discourse reference. Text deixis is thought of as a metaphorical referential device which maps the ground of utterance onto the text itself. Demonstrative expressions with textual antecedent-triggers, considered as the most important text-deictic units, are identified in a narrative corpus consisting of J. M. Barrie’s Peter Pan and its translation into Catalan. Some linguistic and discourse variables related to DemNPs are analysed to characterise adequately text deixis. It is shown that this referential device is usually combined with abstract nouns, thus categorising and encapsulating (non-nominal) complex discourse entities as nouns, while performing a referential cohesive function by means of the text deixis + general noun type of lexical cohesion.

  16. Text against Text: Counterbalancing the Hegemony of Assessment.

    Science.gov (United States)

    Cosgrove, Cornelius

    A study examined whether composition specialists can counterbalance the potential privileging of the assessment perspective, or of self-appointed interpreters of that perspective, through the study of assessment discourse as text. Fourteen assessment texts were examined, most of them journal articles and most of them featuring the common…

  17. SparkText: Biomedical Text Mining on Big Data Framework.

    Directory of Open Access Journals (Sweden)

    Zhan Ye

    Full Text Available Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.
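
    To make the classification step concrete, here is a minimal single-machine sketch of a multinomial Naïve Bayes text classifier with add-one smoothing, one of the three model families named in the abstract. It does not use Apache Spark or Cassandra and is not the SparkText code; the four toy "abstracts", their labels and the function names are assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens (so 'brca1' survives)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train_nb(docs, labels):
    """Collect per-class document counts, per-class word counts and the vocabulary."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    tokens = [t for t in tokenize(text) if t in vocab]  # ignore unseen words
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        log_prob = math.log(n_docs / total_docs)
        for t in tokens:
            log_prob += math.log((word_counts[label][t] + 1)
                                 / (total_words + len(vocab)))
        if log_prob > best_score:
            best_label, best_score = label, log_prob
    return best_label


# Hypothetical miniature corpus, invented for illustration.
DOCS = [
    "brca1 mutation in breast tumor tissue",
    "mammography screening for breast carcinoma",
    "psa levels in prostate tumor patients",
    "smoking and lung tumor incidence",
]
LABELS = ["breast", "breast", "prostate", "lung"]

model = train_nb(DOCS, LABELS)
print(predict_nb(model, "mammography and breast tumor"))
print(predict_nb(model, "psa levels prostate"))
```

    In a distributed setting the counting in train_nb is the part that parallelizes naturally, since per-class word counts from separate partitions can simply be summed.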

  18. Financial Statement Fraud Detection using Text Mining

    OpenAIRE

    Rajan Gupta; Nasib Singh Gill

    2013-01-01

    Data mining techniques have been used enormously by the researchers’ community in detecting financial statement fraud. Most of the research in this direction has used the numbers (quantitative information) i.e. financial ratios present in the financial statements for detecting fraud. There is very little or no research on the analysis of text such as auditor’s comments or notes present in published reports. In this study we propose a text mining approach for detecting financial statement frau...

  19. Knowledge Representation in Travelling Texts

    DEFF Research Database (Denmark)

    Mousten, Birthe; Locmele, Gunta

    2014-01-01

    Today, information travels fast. Texts travel, too. In a corporate context, the question is how to manage which knowledge elements should travel to a new language area or market and in which form? The decision to let knowledge elements travel or not travel highly depends on the limitation...... and the purpose of the text in a new context as well as on predefined parameters for text travel. For texts used in marketing and in technology, the question is whether culture-bound knowledge representation should be domesticated or kept as foreign elements, or should be mirrored or moulded—or should not travel...... at all! When should semantic and pragmatic elements in a text be replaced and by which other elements? The empirical basis of our work is marketing and technical texts in English, which travel into the Latvian and Danish markets, respectively....

  20. Texting while driving: is speech-based text entry less risky than handheld text entry?

    Science.gov (United States)

    He, J; Chaparro, A; Nguyen, B; Burge, R J; Crandall, J; Chaparro, B; Ni, R; Cao, S

    2014-11-01

    Research indicates that using a cell phone to talk or text while maneuvering a vehicle impairs driving performance. However, few published studies directly compare the distracting effects of texting using a hands-free (i.e., speech-based interface) versus handheld cell phone, which is an important issue for legislation, automotive interface design and driving safety training. This study compared the effect of speech-based versus handheld text entries on simulated driving performance by asking participants to perform a car following task while controlling the duration of a secondary text-entry task. Results showed that both speech-based and handheld text entries impaired driving performance relative to the drive-only condition by causing more variation in speed and lane position. Handheld text entry also increased the brake response time and increased variation in headway distance. Text entry using a speech-based cell phone was less detrimental to driving performance than handheld text entry. Nevertheless, the speech-based text entry task still significantly impaired driving compared to the drive-only condition. These results suggest that speech-based text entry disrupts driving, but reduces the level of performance interference compared to text entry with a handheld device. In addition, the difference in the distraction effect caused by speech-based and handheld text entry is not simply due to the difference in task duration. Copyright © 2014 Elsevier Ltd. All rights reserved.

  1. SparkText: Biomedical Text Mining on Big Data Framework

    Science.gov (United States)

    He, Karen Y.; Wang, Kai

    2016-01-01

    Background: Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. Results: In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. Conclusions: This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research. PMID: 27685652

  2. SparkText: Biomedical Text Mining on Big Data Framework.

    Science.gov (United States)

    Ye, Zhan; Tafti, Ahmad P; He, Karen Y; Wang, Kai; He, Max M

    Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.

  3. Text Mining Applications and Theory

    CERN Document Server

    Berry, Michael W

    2010-01-01

    Text Mining: Applications and Theory presents the state-of-the-art algorithms for text mining from both the academic and industrial perspectives.  The contributors span several countries and scientific domains: universities, industrial corporations, and government laboratories, and demonstrate the use of techniques from machine learning, knowledge discovery, natural language processing and information retrieval to design computational models for automated text analysis and mining. This volume demonstrates how advancements in the fields of applied mathematics, computer science, machine learning

  4. Approach

    Directory of Open Access Journals (Sweden)

    Guido Pinto Aguirre

    2008-01-01

    Full Text Available The purpose of this paper is to investigate and re-estimate the effects of breastfeeding patterns, women's health and nutritional status, and energy expenditure on the duration of the return of postpartum fertility (that is, the return of postpartum menstruation), using all the relevant information in the longitudinal study of the Institute of Nutrition of Central America and Panama and a more suitable estimation procedure (hazard models). The data come from the Longitudinal Study carried out in Guatemala between 1967 and 1979. This article uses a multistate hazard model that recognizes different pathways and states in the process of the return of postpartum fertility. The model rests on the existence of five states (full breastfeeding, partial breastfeeding, weaning, infant mortality, and menstruation). It also explicitly includes maternal nutrition and women's energy expenditure as strategic elements of the model. The study found that the effects of breastfeeding patterns, maternal nutrition, and women's work patterns (energy expenditure) on fertility in rural areas of Guatemala are strong and significant. The contribution of this article is to show that applying multistate hazard models yields estimates that are consistent with hypotheses relating breastfeeding patterns, maternal nutritional status, and external maternal stressors to processes that accelerate (or decelerate) the return of normal menstrual cycles.

  5. Impact Factors of Online Customer Reviews Usefulness: A Text Semantics Approach

    Institute of Scientific and Technical Information of China (English)

    陈江涛; 张金隆; 张亚军

    2012-01-01

    To address the generally low quality of online product reviews and the lack of an effective mechanism for guiding reviewers, this paper takes product reviews from www.Amazon.cn as its subject and explores the factors that influence review usefulness on the basis of text content, using text mining and empirical research methods. Taking the mobile phone as a typical product, the authors find that consumers pay attention to the handset's system responsiveness, sound quality, navigation, and typing experience; want to know details of accessories such as the battery and charger; and value the seller's delivery, return and exchange, warranty, and invoice services. Reviews whose text contains this information are more useful to later customers.

  6. Figure text extraction in biomedical literature.

    Directory of Open Access Journals (Sweden)

    Daehyun Kim

    2011-01-01

    Full Text Available Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine (http://figuresearch.askHERMES.org) to allow bioscientists to access figures efficiently. Since text frequently appears in figures, automatically extracting such text may assist the task of mining information from figures. Little research, however, has been conducted exploring text extraction from biomedical figures. We first evaluated an off-the-shelf Optical Character Recognition (OCR) tool on its ability to extract text from figures appearing in biomedical full-text articles. We then developed a Figure Text Extraction Tool (FigTExT) to improve the performance of the OCR tool for figure text extraction through the use of three innovative components: image preprocessing, character recognition, and text correction. We first developed image preprocessing to enhance image quality and to improve text localization. Then we adapted the off-the-shelf OCR tool on the improved text localization for character recognition. Finally, we developed and evaluated a novel text correction framework by taking advantage of figure-specific lexicons. The evaluation on 382 figures (9,643 figure texts in total) randomly selected from PubMed Central full-text articles shows that FigTExT performed with 84% precision, 98% recall, and 90% F1-score for text localization and with 62.5% precision, 51.0% recall and 56.2% F1-score for figure text extraction. When limiting figure texts to those judged by domain experts to be important content, FigTExT performed with 87.3% precision, 68.8% recall, and 77% F1-score. FigTExT significantly improved the performance of the off-the-shelf OCR tool we used, which on its own performed with 36.6% precision, 19.3% recall, and 25.3% F1-score for
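
    The F1-scores quoted in this abstract are harmonic means of precision and recall. The short sketch below recomputes them from the reported precision and recall figures; the function and dictionary names are illustrative only, and the results agree with the reported F1 values to within rounding.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs reported in the abstract, written as fractions.
REPORTED = {
    "text localization": (0.84, 0.98),
    "figure text extraction": (0.625, 0.510),
    "important figure text only": (0.873, 0.688),
    "off-the-shelf OCR alone": (0.366, 0.193),
}

for name, (p, r) in REPORTED.items():
    print(f"{name}: F1 = {100 * f1_score(p, r):.1f}%")
```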

  7. English Metafunction Analysis in Chemistry Text: Characterization of Scientific Text

    Directory of Open Access Journals (Sweden)

    Ahmad Amin Dalimunte, M.Hum

    2013-09-01

    Full Text Available The objectives of this research are to identify which Metafunctions are applied in a chemistry text and how they characterize a scientific text. The research was conducted by applying content analysis. The data were a twelve-paragraph chemistry text, collected by a documentary technique. The document was read and analyzed to find the Metafunctions. The data were analyzed in several steps: identifying the types of process, counting the processes, categorizing and counting the cohesion devices, classifying the types of modulation and determining modality value, and finally counting the sentences and clauses and scoring the grammatical intricacy index. The findings show that the Material process (71 of 100) is used most, and circumstance of spatial location (26 of 56) is more dominant than the others. Modality (5) is used sparingly in order to avoid subjectivity. Impersonality is implied through the limited use of reference, either pronouns (7) or demonstratives (7); conjunctions (60) are applied to develop ideas; and the total number of clauses (109) is far greater than the total number of sentences (40), which results in a high grammatical intricacy index. The Metafunctions found indicate that the chemistry text fulfils the characteristics of a scientific or academic text, which truly reflects it as a natural science.

  8. Text Genres in Information Organization

    Science.gov (United States)

    Nahotko, Marek

    2016-01-01

    Introduction: Text genres used by so-called information organizers in the processes of information organization in information systems were explored in this research. Method: The research employed text genre socio-functional analysis. Five genre groups in information organization were distinguished. Every genre group used in information…

  9. Strategies for Translating Vocative Texts

    Directory of Open Access Journals (Sweden)

    Olga COJOCARU

    2014-12-01

    Full Text Available The paper deals with the linguistic and cultural elements of vocative texts and the techniques used in translating them, giving some examples of texts that are typically vocative (i.e. advertisements and instructions for use). Semantic and communicative strategies are popular in translation studies, and each has its own advantages and disadvantages in translating vocative texts. The advantage of semantic translation is that it takes more account of the aesthetic value of the SL text, while communicative translation attempts to render the exact contextual meaning of the original text in such a way that both content and language are readily acceptable and comprehensible to the readership. The focus is on the strategies used in translating vocative texts, strategies that highlight and introduce a cultural context to the target audience in order to achieve their overall purpose, that is, to sell or to persuade the reader to behave in a certain way. To that end, a number of advertisements from the cosmetics industry and the electronic gadgets field were selected for analysis. The aim is to gather insights into vocative text translation and to create new perspectives on this field of research, now considered a process of innovation and diversion, especially in areas as important as economy and marketing.

  10. EXPLORING STUDENTS' DIFFICULTIES IN READING ACADEMIC TEXTS

    Directory of Open Access Journals (Sweden)

    Ira Ernawati

    2017-04-01

    Full Text Available Academic texts play an important role for university students. However, those texts are considered difficult. This study is intended to investigate students' difficulties in reading academic texts. A qualitative approach was employed, with a case study design. The participants were ten fifth-semester students from the CLS: EE (Classroom Language and Strategy: Explaining and Exemplifying) class, selected by purposive sampling. The data were gathered from students' journal reflections, observation, and interviews. The findings show that the students encountered reading difficulties arising from textual factors, namely vocabulary, comprehending specific information, text organization, and grammar, and from human factors, including background knowledge, mood, laziness, and time constraints.

  11. Using ontology network structure in text mining.

    Science.gov (United States)

    Berndt, Donald J; McCart, James A; Luther, Stephen L

    2010-11-13

    Statistical text mining treats documents as bags of words, with a focus on term frequencies within documents and across document collections. Unlike natural language processing (NLP) techniques that rely on an engineered vocabulary or a full-featured ontology, statistical approaches do not make use of domain-specific knowledge. The freedom from biases can be an advantage, but at the cost of ignoring potentially valuable knowledge. The approach proposed here investigates a hybrid strategy based on computing graph measures of term importance over an entire ontology and injecting the measures into the statistical text mining process. As a starting point, we adapt existing search engine algorithms such as PageRank and HITS to determine term importance within an ontology graph. The graph-theoretic approach is evaluated using a smoking data set from the i2b2 National Center for Biomedical Computing, cast as a simple binary classification task for categorizing smoking-related documents, demonstrating consistent improvements in accuracy.
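
    The hybrid strategy described above can be sketched in a few lines: compute PageRank over an ontology graph, then use the scores to reweight bag-of-words counts. The toy ontology, the edge direction (specific term to more general term), and the `1 + importance` weighting rule are all assumptions made for illustration; the abstract does not say how the measures are injected into the mining process.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {node: [linked nodes]}."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for t in nodes:
                    new_rank[t] += damping * rank[node] / n
        rank = new_rank
    return rank


def weight_term_counts(term_counts, importance):
    """Boost raw term counts by ontology importance; terms outside the ontology are unchanged."""
    return {
        term: count * (1.0 + importance.get(term, 0.0))
        for term, count in term_counts.items()
    }


# Hypothetical mini-ontology: each term links to its more general parent.
ONTOLOGY = {
    "cigarette": ["tobacco"],
    "cigar": ["tobacco"],
    "tobacco": ["substance"],
    "nicotine": ["substance"],
}
RANK = pagerank(ONTOLOGY)
WEIGHTED = weight_term_counts({"tobacco": 2, "patient": 3}, RANK)
print(sorted(RANK.items(), key=lambda kv: -kv[1]))
print(WEIGHTED)
```

    With edges pointing toward parents, general concepts accumulate rank; reversing the edges would instead favour specific terms, so the direction is a modelling choice rather than a given.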

  12. Linguistic Dating of Biblical Texts

    DEFF Research Database (Denmark)

    Ehrensvärd, Martin Gustaf

    2003-01-01

    For two centuries, scholars have pointed to consistent differences in the Hebrew of certain biblical texts and interpreted these differences as reflecting the date of composition of the texts. Until the 1980s, this was quite uncontroversial as the linguistic findings largely confirmed the chronology of the texts established by other means: the Hebrew of Genesis-2 Kings was judged to be early and that of Esther, Daniel, Ezra, Nehemiah, and Chronicles to be late. In the current debate where revisionists have questioned the traditional dating, linguistic arguments in the dating of texts have come more into focus. The study critically examines some linguistic arguments adduced to support the traditional position, and reviewing the arguments it points to weaknesses in the linguistic dating of EBH texts to pre-exilic times. When viewing the linguistic evidence in isolation it will be clear...

  13. Stemming Malay Text and Its Application in Automatic Text Categorization

    Science.gov (United States)

    Yasukawa, Michiko; Lim, Hui Tian; Yokoo, Hidetoshi

    In Malay language, there are no conjugations and declensions and affixes have important grammatical functions. In Malay, the same word may function as a noun, an adjective, an adverb, or, a verb, depending on its position in the sentence. Although extensively simple root words are used in informal conversations, it is essential to use the precise words in formal speech or written texts. In Malay, to make sentences clear, derivative words are used. Derivation is achieved mainly by the use of affixes. There are approximately a hundred possible derivative forms of a root word in written language of the educated Malay. Therefore, the composition of Malay words may be complicated. Although there are several types of stemming algorithms available for text processing in English and some other languages, they cannot be used to overcome the difficulties in Malay word stemming. Stemming is the process of reducing various words to their root forms in order to improve the effectiveness of text processing in information systems. It is essential to avoid both over-stemming and under-stemming errors. We have developed a new Malay stemmer (stemming algorithm) for removing inflectional and derivational affixes. Our stemmer uses a set of affix rules and two types of dictionaries: a root-word dictionary and a derivative-word dictionary. The use of set of rules is aimed at reducing the occurrence of under-stemming errors, while that of the dictionaries is believed to reduce the occurrence of over-stemming errors. We performed an experiment to evaluate the application of our stemmer in text mining software. For the experiment, text data used were actual web pages collected from the World Wide Web to demonstrate the effectiveness of our Malay stemming algorithm. The experimental results showed that our stemmer can effectively increase the precision of the extracted Boolean expressions for text categorization.
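
    The combination of affix rules and dictionaries described in this abstract can be illustrated with a very small sketch. The word lists and affix lists below are tiny hypothetical samples, and the control flow is a simplification written for this example, not the authors' stemmer. A stripped form is accepted only when it appears in the root-word dictionary, which is how the dictionary guards against over-stemming (for instance, cutting "makan" down to "mak").

```python
# Tiny illustrative dictionaries (hypothetical samples, not the authors' data).
ROOTS = {"makan", "baca", "main", "ajar"}
DERIVATIVES = {"belajar": "ajar", "pelajar": "ajar"}  # irregular forms mapped directly

# Longer affixes come first so they are tried before their shorter prefixes.
PREFIXES = ["meng", "mem", "men", "per", "ber", "ter", "me", "di", "ke", "se", "pe"]
SUFFIXES = ["kan", "nya", "an", "i"]


def stem(word):
    """Return the root of a word, or the word unchanged if no rule yields a known root."""
    word = word.lower()
    if word in ROOTS:
        return word  # already a root: never strip further
    if word in DERIVATIVES:
        return DERIVATIVES[word]
    for prefix in [""] + PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in [""] + SUFFIXES:
            if suffix and not rest.endswith(suffix):
                continue
            candidate = rest[: len(rest) - len(suffix)] if suffix else rest
            if candidate in ROOTS:
                return candidate  # accept only dictionary-confirmed roots
    return word


for w in ["makanan", "membaca", "permainan", "dimakan", "belajar", "komputer"]:
    print(w, "->", stem(w))
```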

  14. Anomaly Detection with Text Mining

    Data.gov (United States)

    National Aeronautics and Space Administration — Many existing complex space systems have a significant amount of historical maintenance and problem data bases that are stored in unstructured text forms. The...

  15. Social Studies: Texts and Supplements.

    Science.gov (United States)

    Curriculum Review, 1979

    1979-01-01

    This review of selected social studies texts, series, and supplements, mainly for the secondary level, includes a special section examining eight titles on warfare and terrorism for grades 4-12. (SJL)

  16. Chapter 16: text mining for translational bioinformatics.

    Science.gov (United States)

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  17. GPU-Accelerated Text Mining

    International Nuclear Information System (INIS)

    Cui, X.; Mueller, F.; Zhang, Y.; Potok, Thomas E.

    2009-01-01

    Accelerating hardware devices represent a novel promise for improving performance in many problem domains, but it is not clear which accelerators suit which domains. While there is no room in general-purpose processor design to significantly increase the processor frequency, developers are instead resorting to multi-core chips that duplicate conventional computing capabilities on a single die. Yet accelerators offer more radical designs with a much higher level of parallelism and novel programming environments. The present work assesses the viability of text mining on CUDA. Text mining is one of the key concepts that has become prominent as an effective means to index the Internet, but its applications range beyond this scope and extend to providing document similarity metrics, the subject of this work. We have developed and optimized text search algorithms for GPUs to exploit their potential for massive data processing. We discuss the algorithmic challenges of parallelization for text search problems on GPUs and demonstrate the potential of these devices in experiments by reporting significant speedups. Our study may be one of the first to assess more complex text search problems for suitability for GPU devices, and it may also be one of the first to exploit and report on the use of atomic instructions, which have recently become available in NVIDIA devices.
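
    For readers unfamiliar with document similarity metrics, the following CPU-only sketch computes cosine similarity between term-frequency vectors. It is not the authors' CUDA implementation, and the tokenizer and function names are assumptions made for illustration. Each entry of the similarity matrix depends only on its own pair of documents, which is the kind of independence a parallel device can exploit.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term frequencies from whitespace tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(docs):
    """All-pairs similarities; every cell can be computed independently."""
    vectors = [term_vector(d) for d in docs]
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


DOCS = ["gpu text mining", "text mining on gpu devices", "solar energy concepts"]
for row in similarity_matrix(DOCS):
    print([round(x, 3) for x in row])
```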

  18. Comprehending text in literature class

    Directory of Open Access Journals (Sweden)

    Purić Daliborka S.

    2016-01-01

    Full Text Available The paper discusses the problem of understanding a text and the contribution of methodological apparatus in the reader book to comprehension of a text being read in junior classes of elementary school. By using the technique of content analysis from methodological apparatuses in eight reader books for the fourth grade of elementary school, approved for usage in 2014/2015 academic year, and surveying 350 teachers in 33 elementary schools and 11 administrative districts in the Republic of Serbia we examined: (a) to what extent the Serbian language text book contents enable junior students to understand a literary text; (b) to what extent teachers accept the suggestions offered in the textbook for preparing literature teaching. The results show that a large number of suggestions relate to reading comprehension, but some of categories of understanding are unevenly distributed in the methodological apparatus. On the other hand, the majority of teachers use the methodological apparatus given in a textbook for preparing classes, not only the textbook he or she selected for teaching but also other textbooks for the same grade.

  19. A Guide Text or Many Texts? "That is the Question"

    Directory of Open Access Journals (Sweden)

    Delgado de Valencia Sonia

    2001-08-01

    Full Text Available The use of supplementary materials in the classroom has always been an essential part of the teaching and learning process. To restrict our teaching to the scope of one single textbook means falling behind the advances of knowledge, in any area and context. Young learners appreciate any new and varied support that expands their knowledge of the world: diaries, letters, panels, free texts, magazines, short stories, poems or literary excerpts, and articles taken from the Internet are materials that will allow learners to share more and work more collaboratively. In this article we deal with some of these materials and with the criteria for selecting, adapting, and creating materials that may be of interest to the learner and that may promote reading and writing processes. Since no text can entirely satisfy the needs of students and teachers, the creativity of both parties will be necessary to improve the quality of teaching through the adequate use and adaptation of supplementary materials.

  20. Individual Profiling Using Text Analysis

    Science.gov (United States)

    2016-04-15

    AFRL-AFOSR-UK-TR-2016-0011 Individual Profiling using Text Analysis 140333 Mark Stevenson, University of Sheffield, Department of Psychology. Report type: Final. Dates covered: 15 Sep 2014 to 14 Sep 2015. ... consisted of collections of tweets for a number of Twitter users whose gender, age and personality scores are known. The task was to construct some system

  1. Identifying issue frames in text.

    Directory of Open Access Journals (Sweden)

    Eyal Sagi

    Full Text Available Framing, the effect of context on cognitive processes, is a prominent topic of research in psychology and public opinion research. Research on framing has traditionally relied on controlled experiments and manually annotated document collections. In this paper we present a method that allows for quantifying the relative strengths of competing linguistic frames based on corpus analysis. This method requires little human intervention and can therefore be efficiently applied to large bodies of text. We demonstrate its effectiveness by tracking changes in the framing of terror over time and comparing the framing of abortion by Democrats and Republicans in the U.S.

  2. Finding text in color images

    Science.gov (United States)

    Zhou, Jiangying; Lopresti, Daniel P.; Tasdizen, Tolga

    1998-04-01

    In this paper, we consider the problem of locating and extracting text from WWW images. A previous algorithm based on color clustering and connected components analysis works well as long as the color of each character is relatively uniform and the typography is fairly simple. It breaks down quickly, however, when these assumptions are violated. In this paper, we describe more robust techniques for dealing with this challenging problem. We present an improved color clustering algorithm that measures similarity based on both RGB and spatial proximity. Layout analysis is also incorporated to handle more complex typography. These changes significantly enhance the performance of our text detection procedure.

  3. Multilingual text induced spelling correction

    NARCIS (Netherlands)

    Reynaert, M.W.C.

    2004-01-01

    We present TISC, a multilingual, language-independent and context-sensitive spelling checking and correction system designed to facilitate the automatic removal of non-word spelling errors in large corpora. Its lexicon is derived from raw text corpora, without supervision, and contains word unigrams

  4. Solar Concepts: A Background Text.

    Science.gov (United States)

    Gorham, Jonathan W.

    This text is designed to provide teachers, students, and the general public with an overview of key solar energy concepts. Various energy terms are defined and explained. Basic thermodynamic laws are discussed. Alternative energy production is described in the context of the present energy situation. Described are the principal contemporary solar…

  5. FTP: Full-Text Publishing?

    Science.gov (United States)

    Jul, Erik

    1992-01-01

    Describes the use of file transfer protocol (FTP) on the INTERNET computer network and considers its use as an electronic publishing system. The differing electronic formats of text files are discussed; the preparation and access of documents are described; and problems are addressed, including a lack of consistency. (LRW)

  6. Quality Inspection of Printed Texts

    DEFF Research Database (Denmark)

    Pedersen, Jesper Ballisager; Nasrollahi, Kamal; Moeslund, Thomas B.

    2016-01-01

    -folded: for customers of the printing and verification system, the overall grade is used to verify if the text is of sufficient quality, while for the printer's manufacturer, the detailed character/symbol grades and quality measurements are used for the improvement and optimization of the printing task. The proposed system...

  7. Chinese legal texts – Quantitative Description

    Directory of Open Access Journals (Sweden)

    Ľuboš GAJDOŠ

    2017-06-01

    Full Text Available The aim of the paper is to provide a quantitative description of legal Chinese. This study adopts a corpus-based approach and reports basic statistical parameters of legal texts in Chinese, namely sentence length, the proportions of the parts of speech, etc. The research is conducted on the Chinese monolingual corpus Hanku. The paper also discusses issues of statistical data processing across various corpora, e.g. tokenisation and part-of-speech tagging, and their relevance to the study of register variation.

  8. Rhetorical structure theory and text analysis

    Science.gov (United States)

    Mann, William C.; Matthiessen, Christian M. I. M.; Thompson, Sandra A.

    1989-11-01

    Recent research on text generation has shown that there is a need for stronger linguistic theories that tell in detail how texts communicate. The prevailing theories are very difficult to compare, and it is also very difficult to see how they might be combined into stronger theories. To make comparison and combination a bit more approachable, we have created a book which is designed to encourage comparison. A dozen different authors or teams, all experienced in discourse research, are given exactly the same text to analyze. The text is an appeal for money by a lobbying organization in Washington, DC. It informs, stimulates and manipulates the reader in a fascinating way. The joint analysis is far more insightful than any one team's analysis alone. This paper is our contribution to the book. Rhetorical Structure Theory (RST), the focus of this paper, is a way to account for the functional potential of text, its capacity to achieve the purposes of speakers and produce effects in hearers. It also shows a way to distinguish coherent texts from incoherent ones, and identifies consequences of text structure.

  9. ℓ2 Optimized predictive image coding with ℓ∞ bound.

    Science.gov (United States)

    Chuah, Sceuchin; Dumitrescu, Sorina; Wu, Xiaolin

    2013-12-01

    In many scientific, medical, and defense applications of image/video compression, an ℓ∞ error bound is required. However, pure ℓ∞-optimized image coding, colloquially known as near-lossless image coding, is prone to structured errors such as contours and speckles if the bit rate is not sufficiently high; moreover, most of the previous ℓ∞-based image coding methods suffer from poor rate control. In contrast, the ℓ2 error metric aims for average fidelity and hence preserves the subtlety of smooth waveforms better than the ℓ∞ error metric and it offers fine granularity in rate control, but pure ℓ2-based image coding methods (e.g., JPEG 2000) cannot bound individual errors as the ℓ∞-based methods can. This paper presents a new compression approach to retain the benefits and circumvent the pitfalls of the two error metrics. A common approach of near-lossless image coding is to embed into a DPCM prediction loop a uniform scalar quantizer of residual errors. The said uniform scalar quantizer is replaced, in the proposed new approach, by a set of context-based ℓ2-optimized quantizers. The optimization criterion is to minimize a weighted sum of the ℓ2 distortion and the entropy while maintaining a strict ℓ∞ error bound. The resulting method obtains good rate-distortion performance in both ℓ2 and ℓ∞ metrics and also increases the rate granularity. Compared with JPEG 2000, the new method not only guarantees lower ℓ∞ error for all bit rates, but also it achieves higher PSNR for relatively high bit rates.
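
    The baseline this abstract mentions, a uniform scalar quantizer inside a DPCM prediction loop, can be sketched for a one-dimensional integer signal as follows. This illustrates only the common near-lossless scheme, not the proposed context-based quantizers; the previous-sample predictor and all function names are assumptions made for the example. With quantizer step 2*delta + 1, every reconstructed sample differs from the original by at most delta.

```python
def quantize_residual(e, delta):
    """Uniform scalar quantizer with step 2*delta + 1; reconstruction error is at most delta."""
    step = 2 * delta + 1
    if e >= 0:
        return (e + delta) // step
    return -((-e + delta) // step)


def dpcm_encode(samples, delta):
    """Return quantizer indices; prediction uses the decoder-side reconstruction."""
    step = 2 * delta + 1
    indices = []
    prev = 0  # predictor state, kept identical to the decoder's
    for x in samples:
        q = quantize_residual(x - prev, delta)
        indices.append(q)
        prev += q * step  # reconstructed sample, as the decoder will see it
    return indices


def dpcm_decode(indices, delta):
    """Rebuild the signal by accumulating dequantized residuals."""
    step = 2 * delta + 1
    out = []
    prev = 0
    for q in indices:
        prev += q * step
        out.append(prev)
    return out


SIGNAL = [10, 12, 15, 15, 9, 200, 198, 0]
DELTA = 2
DECODED = dpcm_decode(dpcm_encode(SIGNAL, DELTA), DELTA)
print(DECODED)
print(max(abs(a - b) for a, b in zip(SIGNAL, DECODED)))
```

    Because the encoder predicts from its own reconstruction rather than from the original samples, quantization errors do not accumulate along the signal, which is what keeps the per-sample bound strict.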

  10. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases the accuracy at the same time. The test example is classified using a simpler and smaller model. The training examples in a particular cluster share a common vocabulary. At the time of clustering, we do not take into account the labels of the training examples. After the clusters have been created, the classifier is trained on each cluster, which has reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...

  11. Linguistic dating of biblical texts

    DEFF Research Database (Denmark)

    Young, Ian; Rezetko, Robert; Ehrensvärd, Martin Gustaf

    Since the beginning of critical scholarship biblical texts have been dated using linguistic evidence. In recent years this has become a controversial topic, especially with the publication of Ian Young (ed.), Biblical Hebrew: Studies in Chronology and Typology (2003). However, until now there has been no introduction and comprehensive study of the field. Volume 1 introduces the field of linguistic dating of biblical texts, particularly to intermediate and advanced students of biblical Hebrew who have a reasonable background in the language, having completed at least an introductory course... in this volume are: What is it that makes Archaic Biblical Hebrew archaic, Early Biblical Hebrew early, and Late Biblical Hebrew late? Does linguistic typology, i.e. different linguistic characteristics, convert easily and neatly into linguistic chronology, i.e. different historical origins? A large amount...

  12. Text as an Autopoietic System

    DEFF Research Database (Denmark)

    Nicolaisen, Maria Skou

    2016-01-01

    The aim of the present research article is to discuss the possibilities and limitations in addressing text as an autopoietic system. The theory of autopoiesis originated in the field of biology in order to explain the dynamic processes entailed in sustaining living organisms at cellular level. Th....... By comparing the biological with the textual account of autopoietic agency, the end conclusion is that a newly derived concept of sociopoiesis might be better suited for discussing the architecture of textual systems....

  13. The TEXT upgrade vertical interferometer

    International Nuclear Information System (INIS)

    Hallock, G.A.; Gartman, M.L.; Li, W.; Chiang, K.; Shin, S.; Castles, R.L.; Chatterjee, R.; Rahman, A.S.

    1992-01-01

    A far-infrared interferometer has been installed on TEXT upgrade to obtain electron density profiles. The primary system views the plasma vertically through a set of large (60-cm radial × 7.62-cm toroidal) diagnostic ports. A 1-cm channel spacing (59 channels total) and fast electronic time response are used to provide high resolution for radial profiles and perturbation experiments. Initial operation of the vertical system was obtained late in 1991, with six operating channels.

  14. An Elementary Approach to Thinking under Uncertainty: A Prototype Text

    Science.gov (United States)

    1982-10-01

    bank employee 3 5𔄀" to 5󈧏" bank president 4 6’ to 6𔃿" C basketball player ( NBA ) over 6𔃿" 21 beautician 22 ... Fred could not decide whether the...that there are 23 National Basketball Association ( NBA ) teams, each of whom carries a roster of 12 players, for a total of 276 NBA players in the United...128 CHAPTER 11: TWO DEMONSTRATIONS .. .. .. .. .. .. . . . . . . . . . 131 The Basketball Player and the

  15. The Balinese Unicode Text Processing

    Directory of Open Access Journals (Sweden)

    Imam Habibi

    2009-06-01

    Full Text Available In principle, the computer only recognizes numbers as the representation of a character. Therefore, there are many encoding systems to allocate these numbers, although not all characters are covered. In Europe, even a single language needs more than one encoding system. Hence, a new encoding system known as Unicode has been established to overcome this problem. Unicode provides a unique id for each character, which does not depend on platform, program, or language. The Unicode standard has been adopted by a number of industry vendors, such as Apple, HP, IBM, JustSystem, Microsoft, Oracle, SAP, Sun, Sybase, and Unisys. In addition, language standards and modern information exchanges such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, and WML make use of Unicode as an official tool for implementing ISO/IEC 10646. There are four things to do for Balinese script: the algorithms for transliteration, searching, sorting, and word boundary analysis (spell checking). To verify the correctness of the algorithms, some applications were built. These applications can run on Linux/Windows OS platforms using J2SDK 1.5 and the J2ME WTK2 library. The input and output of the algorithms/applications are character sequences obtained from keyboard input and external files. This research produces a module or library which is able to process Balinese text based on the Unicode standard. The outcomes of this research are the ability, skill, and mastery of: 1. The Unicode standard (21-bit) as a substitute for ASCII (7-bit) and ISO8859-1 (8-bit) as the former default character sets in many applications. 2. The Balinese Unicode text processing algorithm. 3. The experience of working with and learning from an international team that consists of the foremost experts in the area: Michael Everson (Ireland), Peter Constable (Microsoft US), I Made Suatjana, and Ida Bagus Adi Sudewa.
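
    The idea of a platform-independent id per character can be illustrated with a minimal Python sketch. This is not the Java library described in the abstract. It assumes the Balinese block spans U+1B00 to U+1B7F, as assigned in the Unicode Standard, and the helper names is_balinese and code_points are hypothetical.

```python
# Illustrative sketch only; helper names are hypothetical, not from the paper.
BALINESE_START = 0x1B00  # first code point of the Unicode Balinese block
BALINESE_END = 0x1B7F    # last code point of the block


def is_balinese(ch: str) -> bool:
    """Return True if the single character ch lies in the Balinese block."""
    return BALINESE_START <= ord(ch) <= BALINESE_END


def code_points(text: str) -> list:
    """List each character's code point in the usual U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


sample = "\u1B13\u1B05a"  # two Balinese letters followed by Latin 'a'
print(code_points(sample))
print([is_balinese(ch) for ch in sample])

# Naive ordering by code point; real Balinese sorting would need
# language-specific collation rules such as those the paper develops.
print(code_points("".join(sorted(sample))))
```

    Sorting by raw code point is used here only because it needs no extra data. It is likely to differ from the dictionary order a Balinese reader expects.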

  16. Text mining by Tsallis entropy

    Science.gov (United States)

    Jamaati, Maryam; Mehri, Ali

    2018-01-01

    Long-range correlations between the elements of natural languages enable them to convey very complex information. Complex structure of human language, as a manifestation of natural languages, motivates us to apply nonextensive statistical mechanics in text mining. Tsallis entropy appropriately ranks the terms' relevance to document subject, taking advantage of their spatial correlation length. We apply this statistical concept as a new powerful word ranking metric in order to extract keywords of a single document. We carry out an experimental evaluation, which shows capability of the presented method in keyword extraction. We find that, Tsallis entropy has reliable word ranking performance, at the same level of the best previous ranking methods.
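
    As a rough illustration of the quantity involved, the sketch below computes the Tsallis entropy S_q = (1 - sum_i p_i^q) / (q - 1) of a discrete distribution. The abstract does not say how the distribution for each word is built, so the segment-based occupancy_distribution helper, the choice q = 2 and the toy token list are all assumptions made here for illustration, not the authors' method.

```python
import math
from collections import Counter


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q = (1 - sum(p_i**q)) / (q - 1) of a discrete distribution."""
    if abs(q - 1.0) < 1e-12:
        # In the limit q -> 1 the Tsallis entropy reduces to Shannon entropy (in nats).
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


def occupancy_distribution(tokens, word, parts):
    """Fraction of the occurrences of `word` in each of `parts` equal text segments.

    Hypothetical helper: one simple way to turn word positions into a distribution.
    """
    positions = [i for i, t in enumerate(tokens) if t == word]
    if not positions:
        return []
    counts = Counter(min(i * parts // len(tokens), parts - 1) for i in positions)
    return [counts.get(k, 0) / len(positions) for k in range(parts)]


# Toy document: "entropy" is clustered at the start, "the" is spread out.
tokens = ["entropy", "entropy", "the", "x", "the", "y", "the", "z"]
for word in ("entropy", "the"):
    dist = occupancy_distribution(tokens, word, parts=4)
    print(word, dist, tsallis_entropy(dist, q=2))
```

    In this sketch a word concentrated in a single segment has entropy 0, while a word spread evenly over several segments scores higher. How such scores map to keyword rank is left to the paper.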

  17. Biased limiter experiments on text

    International Nuclear Information System (INIS)

    Phillips, P.E.; Wootton, A.J.; Rowan, W.L.; Ritz, C.P.; Rhodes, T.L.; Bengtson, R.D.; Hodge, W.L.; Durst, R.D.; McCool, S.C.; Richards, B.; Gentle, K.W.; Schoch, P.; Forster, J.C.; Hickok, R.L.; Evans, T.E.

    1987-01-01

    Experiments using an electrically biased limiter have been performed on the Texas Experimental Tokamak (TEXT). A small movable limiter is inserted past the main poloidal ring limiter (which is electrically connected to the vacuum vessel) and biased at V_Lim with respect to it. The floating potential, plasma potential and shear layer position can be controlled. With |V_Lim| ≥ 50 V the plasma density increases. For V_Lim > 0 the results obtained are inconclusive. Variation of V_Lim changes the electrostatic turbulence, which may explain the observed total flux changes. (orig.)

  18. Dynamic Chemical Model for $\text{H}_2$/$\text{O}_2$ Combustion Developed Through a Community Workflow

    KAUST Repository

    Oreluk, James

    2018-01-30

    Elementary-reaction models for $\text{H}_2$/$\text{O}_2$ combustion were evaluated and optimized through a collaborative workflow, establishing accuracy and characterizing uncertainties. Quantitative findings were the optimized model, the importance of $\text{H}_2 + \text{O}_2(1\Delta) = \text{H} + \text{HO}_2$ in high-pressure flames, and the inconsistency of certain low-temperature shock-tube data. The workflow described here is proposed to be even more important because the approach and publicly available cyberinfrastructure allows future community development of evolving improvements. The workflow steps applied here were to develop an initial reaction set using Burke et al. [2012], Burke et al. [2013], Sellevag et al. [2009], and Konnov [2015]; test it for thermodynamic and kinetics consistency and plausibility against other sets in the literature; assign estimated uncertainties where not stated in the sources; select key data targets (

  19. Dynamic Chemical Model for $\text{H}_2$/$\text{O}_2$ Combustion Developed Through a Community Workflow

    KAUST Repository

    Oreluk, James; Needham, Craig D.; Baskaran, Sathya; Sarathy, Mani; Burke, Michael P.; West, Richard H.; Frenklach, Michael; Westmoreland, Phillip R.

    2018-01-01

    Elementary-reaction models for $\text{H}_2$/$\text{O}_2$ combustion were evaluated and optimized through a collaborative workflow, establishing accuracy and characterizing uncertainties. Quantitative findings were the optimized model, the importance of $\text{H}_2 + \text{O}_2(1\Delta) = \text{H} + \text{HO}_2$ in high-pressure flames, and the inconsistency of certain low-temperature shock-tube data. The workflow described here is proposed to be even more important because the approach and publicly available cyberinfrastructure allows future community development of evolving improvements. The workflow steps applied here were to develop an initial reaction set using Burke et al. [2012], Burke et al. [2013], Sellevag et al. [2009], and Konnov [2015]; test it for thermodynamic and kinetics consistency and plausibility against other sets in the literature; assign estimated uncertainties where not stated in the sources; select key data targets (

  20. Transfer Learning beyond Text Classification

    Science.gov (United States)

    Yang, Qiang

    Transfer learning is a new machine learning and data mining framework that allows the training and test data to come from different distributions or feature spaces. We can find many novel applications of machine learning and data mining where transfer learning is necessary. While much has been done in transfer learning in text classification and reinforcement learning, there has been a lack of documented success stories of novel applications of transfer learning in other areas. In this invited article, I will argue that transfer learning is in fact quite ubiquitous in many real world applications. In this article, I will illustrate this point through an overview of a broad spectrum of applications of transfer learning that range from collaborative filtering to sensor based location estimation and logical action model learning for AI planning. I will also discuss some potential future directions of transfer learning.

  1. Difficulties in translation of socio-political texts

    Directory of Open Access Journals (Sweden)

    Артур Нарманович Мамедов

    2013-12-01

    Full Text Available Because Russian socio-political texts belong to the publicistic style, translating them calls for a functional approach in order to find the most adequate linguistic means of conveying the pragmatic meaning of the source text. Intralinguistic meaning may be only partly preserved in the interpretation of German texts. Lexical and grammatical transformations help preserve the semantic-syntactic structure in the target text, so that the translation achieves the same communicative effect as the source text.

  2. ERRORS AND DIFFICULTIES IN TRANSLATING LEGAL TEXTS

    Directory of Open Access Journals (Sweden)

    Camelia, CHIRILA

    2014-11-01

    Full Text Available Nowadays the accurate translation of legal texts has become highly important as the mistranslation of a passage in a contract, for example, could lead to lawsuits and loss of money. Consequently, the translation of legal texts to other languages faces many difficulties and only professional translators specialised in legal translation should deal with the translation of legal documents and scholarly writings. The purpose of this paper is to analyze translation from three perspectives: translation quality, errors and difficulties encountered in translating legal texts and consequences of such errors in professional translation. First of all, the paper points out the importance of performing a good and correct translation, which is one of the most important elements to be considered when discussing translation. Furthermore, the paper presents an overview of the errors and difficulties in translating texts and of the consequences of errors in professional translation, with applications to the field of law. The paper is also an approach to the differences between languages (English and Romanian) that can hinder comprehension for those who have embarked upon the difficult task of translation. The research method that I have used to achieve the objectives of the paper was the content analysis of various Romanian and foreign authors' works.

  3. A programmed text in statistics

    CERN Document Server

    Hine, J

    1975-01-01

    Exercises for Section 2: Physical sciences and engineering; Biological sciences; Social sciences. Solutions to Exercises, Section 1: Physical sciences and engineering; Biological sciences; Social sciences. Solutions to Exercises, Section 2: Physical sciences and engineering; Biological sciences; Social sciences. Tables: χ² tests involving variances; χ² one-tailed tests; χ² two-tailed tests; F-distribution. Preface: This project started some years ago when the Nuffield Foundation kindly gave a grant for writing a programmed text to use with service courses in statistics. The work was carried out by Mrs. Joan Hine and Professor G. B. Wetherill at Bath University, together with some other help from time to time by colleagues at Bath University and elsewhere. Testing was done at various colleges and universities, and some helpful comments were received, but we particularly mention King Edwards School, Bath, who provided some sixth formers as 'guinea pigs' for the fir...

  4. The Interplay of Text, Meaning and Practice

    DEFF Research Database (Denmark)

    Kärreman, Dan; Levay, Charlotta

    2017-01-01

    Context: The study of discourses (i.e. verbal interactions or written accounts) is increasingly used in social sciences to gain insight into issues connected to discourse, such as meanings, behaviours and actions. This paper situates discourse analysis in medical education, based on a framework...... settings, with a particular focus on the field of medical education. Methods: The study is based on a literature analysis of discourse analysis approaches published in Medical Education. Results: Findings suggest that empirical studies through discourse analysis can be heuristically understood in terms...... of the links between text, practices and meaning. Conclusions: Discourse analysis provides a more strongly supported argument when it is possible to defend claims on three levels: practice, using observational data; meaning, using ethnographic data, and text, using conversational and textual data....

  5. Text Messaging for Addiction: A Review

    Science.gov (United States)

    Keoleian, Victoria; Polcin, Douglas; Galloway, Gantt P.

    2015-01-01

    Individuals seeking treatment for addiction often experience barriers due to cost, lack of local treatment resources, or either school or work schedule conflicts. Text messaging-based addiction treatment is inexpensive and has the potential to be widely accessible in real time. We conducted a comprehensive literature review identifying 11 published randomized controlled trials (RCTs) evaluating text messaging-based interventions for tobacco smoking, 4 studies for reducing alcohol consumption, 1 pilot study in former methamphetamine (MA) users, and 1 study based on qualitative interviews with cannabis users. Abstinence outcome results in RCTs of smokers willing to make a quit attempt have been positive overall in the short term and as far out as at 6 and 12 months. Studies aimed at reducing alcohol consumption have been promising. More data are needed to evaluate the feasibility, acceptability, and efficacy of this approach for other substance use problems. PMID:25950596

  6. INNER DIALOGICITY OF MEDICAL SCIENTIFIC TEXTS

    Directory of Open Access Journals (Sweden)

    Efremova Nataliya Vladimirovna

    2015-06-01

    Full Text Available The author studies inner dialogicity as an integral property of a scientist's thinking activity, a way of developing a scientific idea, one of the cognitive and discursive mechanisms by which new knowledge is formed, crystallized and dementalised in a text, and a way of searching for truth. This approach to dialogicity in the study of a scientific text makes it possible to analyze the cogitative processes at work in human consciousness and cognitive activity, to fully understand the stated scientific concept, to define the author's pragmatic strategies, and to enter his reflexive world. Based on the medical scientific texts of N.M. Amosov and F. G. Uglov, famous scientists in the field of cardiac surgery, it is established that traces of inner dialogicity in the scientists' textual space actualize the origin of new knowledge, the change of the author's semantic positions, and his ability to reflect on, compare and analyze his own thoughts and actions and to assess himself and the features of his thinking process. These are realized in the logic of presenting the scientific concept, in the explanation of concepts and terms when judging the points of view of contemporaries and predecessors, of the scientist's adherents and opponents, and also in the orientation to the addressee's presupposition and the activation of his cogitative activity. Linguistic, discursive and verbal analysis singles out the impact on the addressee and his mental activity.

  7. Empirical Studies On Machine Learning Based Text Classification Algorithms

    OpenAIRE

    Shweta C. Dharmadhikari; Maya Ingle; Parag Kulkarni

    2011-01-01

    Automatic classification of text documents has become an important research issue nowadays. Proper classification of text documents requires information retrieval, machine learning and natural language processing (NLP) techniques. Our aim is to focus on important approaches to automatic text classification based on machine learning techniques, viz. supervised, unsupervised and semi-supervised. In this paper we present a review of various text classification approaches under machine learning paradig...

  8. Making School Development Credible. Text, Context, Irony

    Directory of Open Access Journals (Sweden)

    Mats Börjesson

    2012-01-01

    Full Text Available

    The article argues for the importance of an open, reflexive-methodological approach when switching between studying text, context and researcher activity. Close linguistic analysis can benefit from being linked with the researcher’s contextualisation of his empirical material as well as with more distanced readings. The more specific starting point for this article is that school development, like other similar terms such as school improvement and the like, makes use of linguistic building blocks with which whole narratives about today’s and tomorrow’s schools can be constructed. The subject of the study is a short text issued by the Swedish Schools Inspectorate (Skolinspektionen. Government language changes according to the authorities’ role in society and their own definitions of their functions, and an important aspect here is the legitimacy of the authorities’ texts. By means of various kinds of close linguistic analysis, the above-mentioned text is studied with regard to choice of categories, hierarchies of modalisation and the rhetorical effects of different types of formulations in a broader political-social landscape. The article concludes with a reflective discussion on the relationship between government language and irony as a stylistic device – a device that is based on the results of the close empirical analysis.[i]



    [i] The article is part of the project “School Development as Narrative”, funded by the Swedish Research Council. The author would like to thank the two reviewers for very valuable comments.

  9. Text Readability and Intuitive Simplification: A Comparison of Readability Formulas

    Science.gov (United States)

    Crossley, Scott A.; Allen, David B.; McNamara, Danielle S.

    2011-01-01

    Texts are routinely simplified for language learners with authors relying on a variety of approaches and materials to assist them in making the texts more comprehensible. Readability measures are one such tool that authors can use when evaluating text comprehensibility. This study compares the Coh-Metrix Second Language (L2) Reading Index, a…

  10. Intergeneric Derivation: on the Genealogy of an LSP text

    DEFF Research Database (Denmark)

    Askehave, Inger; Kastberg, Peter

    2001-01-01

    is derived from another text or to establish what aspects of the text have been derived, one must gain control over external variables that are not easily controllable. In our approach, we suggest a method that - while controlling external variables - is designed to isolate a suitable text corpus. Contrary...

  11. Helios: Understanding Solar Evolution Through Text Analytics

    Energy Technology Data Exchange (ETDEWEB)

    Randazzese, Lucien [SRI International, Menlo Park, CA (United States)

    2016-12-02

    This proof-of-concept project focused on developing, testing, and validating a range of bibliometric, text analytic, and machine-learning based methods to explore the evolution of three photovoltaic (PV) technologies: Cadmium Telluride (CdTe), Dye-Sensitized solar cells (DSSC), and Multi-junction solar cells. The analytical approach to the work was inspired by previous work by the same team to measure and predict the scientific prominence of terms and entities within specific research domains. The goal was to create tools that could assist domain-knowledgeable analysts in investigating the history and path of technological developments in general, with a focus on analyzing step-function changes in performance, or “breakthroughs,” in particular. The text-analytics platform developed during this project was dubbed Helios. The project relied on computational methods for analyzing large corpora of technical documents. For this project we ingested technical documents from the following sources into Helios: Thomson Scientific Web of Science (papers), the U.S. Patent & Trademark Office (patents), the U.S. Department of Energy (technical documents), the U.S. National Science Foundation (project funding summaries), and a hand curated set of full-text documents from Thomson Scientific and other sources.

  12. Lexical Density Of English Reading Texts For Senior High School

    OpenAIRE

    Nesia, Bersyebah Herljimsi; Ginting, Siti Aisah

    2014-01-01

    This study deals with the lexical density especially the lexical items of English reading texts in the textbook for senior high school. The objectives of the study are to find out the lexical density especially the lexical items which formed in the reading texts of Look Ahead textbook and the type of genre which has the highest lexical density of the reading texts. This study was conducted by descriptive method with qualitative approach. The data of this research were the English reading text...
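
    The abstract does not state the formula used. One common definition of lexical density is the percentage of lexical (content) words among all running words, and the sketch below follows that definition. The short FUNCTION_WORDS list and the lexical_density name are assumptions for illustration; a real study would use a fuller word list or a part-of-speech tagger.

```python
import re

# Tiny illustrative set of grammatical (function) words; an assumption,
# not the word list used in the study.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "in", "on", "at", "to",
    "for", "with", "is", "are", "was", "were", "be", "it", "this", "that",
    "he", "she", "they", "we", "you", "i",
}


def lexical_density(text: str) -> float:
    """Percentage of words in `text` that are not in FUNCTION_WORDS."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    lexical = [w for w in words if w not in FUNCTION_WORDS]
    return 100.0 * len(lexical) / len(words)


print(lexical_density("The cat sat on the mat"))  # 3 content words out of 6 -> 50.0
```

    Comparing this percentage across the texts of each genre would give the kind of ranking the study reports, under the stated assumptions.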

  13. Chemical-text hybrid search engines.

    Science.gov (United States)

    Zhou, Yingyao; Zhou, Bin; Jiang, Shumei; King, Frederick J

    2010-01-01

    As the amount of chemical literature increases, it is critical that researchers be enabled to accurately locate documents related to a particular aspect of a given compound. Existing solutions, based on text and chemical search engines alone, suffer from the inclusion of "false negative" and "false positive" results, and cannot accommodate the diverse repertoire of formats currently available for chemical documents. To address these concerns, we developed an approach called Entity-Canonical Keyword Indexing (ECKI), which converts a chemical entity embedded in a data source into its canonical keyword representation prior to being indexed by text search engines. We implemented ECKI using Microsoft Office SharePoint Server Search, and the resultant hybrid search engine not only supported complex mixed chemical and keyword queries but also was applied to both intranet and Internet environments. We envision that the adoption of ECKI will empower researchers to pose more complex search questions that were not readily attainable previously and to obtain answers at much improved speed and accuracy.

  14. Text summarization as a decision support aid

    Directory of Open Access Journals (Sweden)

    Workman T

    2012-05-01

    Full Text Available Abstract Background PubMed data potentially can provide decision support information, but PubMed was not exclusively designed to be a point-of-care tool. Natural language processing applications that summarize PubMed citations hold promise for extracting decision support information. The objective of this study was to evaluate the efficiency of a text summarization application called Semantic MEDLINE, enhanced with a novel dynamic summarization method, in identifying decision support data. Methods We downloaded PubMed citations addressing the prevention and drug treatment of four disease topics. We then processed the citations with Semantic MEDLINE, enhanced with the dynamic summarization method. We also processed the citations with a conventional summarization method, as well as with a baseline procedure. We evaluated the results using clinician-vetted reference standards built from recommendations in a commercial decision support product, DynaMed. Results For the drug treatment data, Semantic MEDLINE enhanced with dynamic summarization achieved average recall and precision scores of 0.848 and 0.377, while conventional summarization produced 0.583 average recall and 0.712 average precision, and the baseline method yielded average recall and precision values of 0.252 and 0.277. For the prevention data, Semantic MEDLINE enhanced with dynamic summarization achieved average recall and precision scores of 0.655 and 0.329. The baseline technique resulted in recall and precision scores of 0.269 and 0.247. No conventional Semantic MEDLINE method accommodating summarization for prevention exists. Conclusion Semantic MEDLINE with dynamic summarization outperformed conventional summarization in terms of recall, and outperformed the baseline method in both recall and precision. This new approach to text summarization demonstrates potential in identifying decision support data for multiple needs.
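
    The recall and precision scores reported above compare a system's output with a reference standard. A minimal sketch of that comparison follows; the function name precision_recall and the placeholder item names are hypothetical and are not data from the study.

```python
def precision_recall(retrieved, reference):
    """Return (precision, recall) of `retrieved` items against a `reference` standard."""
    retrieved, reference = set(retrieved), set(reference)
    true_positives = len(retrieved & reference)
    # Precision: share of retrieved items that are in the reference standard.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of reference items that the system retrieved.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Placeholder identifiers, invented purely for illustration.
system_output = {"drug_a", "drug_b", "drug_c", "drug_d"}
reference_standard = {"drug_a", "drug_b", "drug_e"}
precision, recall = precision_recall(system_output, reference_standard)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

    A method that returns many candidates tends to gain recall at the cost of precision, which matches the pattern of the dynamic versus conventional summarization scores quoted in the abstract.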

  15. Research on Illustrations in Text: Issues and Perspectives.

    Science.gov (United States)

    Duchastel, Philippe C.

    1980-01-01

    Explores the problems of research on the effects of illustrations in text and other teaching materials. Several research frameworks are described, and a functional approach is suggested as a method of improvement. (BK)

  16. Mining consumer health vocabulary from community-generated text.

    Science.gov (United States)

    Vydiswaran, V G Vinod; Mei, Qiaozhu; Hanauer, David A; Zheng, Kai

    2014-01-01

    Community-generated text corpora can be a valuable resource to extract consumer health vocabulary (CHV) and link them to professional terminologies and alternative variants. In this research, we propose a pattern-based text-mining approach to identify pairs of CHV and professional terms from Wikipedia, a large text corpus created and maintained by the community. A novel measure, leveraging the ratio of frequency of occurrence, was used to differentiate consumer terms from professional terms. We empirically evaluated the applicability of this approach using a large data sample consisting of MedLine abstracts and all posts from an online health forum, MedHelp. The results show that the proposed approach is able to identify synonymous pairs and label the terms as either consumer or professional term with high accuracy. We conclude that the proposed approach provides great potential to produce a high quality CHV to improve the performance of computational applications in processing consumer-generated health text.
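
    The abstract does not define its frequency-ratio measure, so the following is only one plausible reading: compare how often a term occurs in consumer-written text with how often it occurs in professional text, and call the member of a synonym pair with the higher ratio the consumer term. The function names, the add-one smoothing and the toy counts are all assumptions made here.

```python
def consumer_score(term, consumer_counts, professional_counts, smoothing=1.0):
    """Smoothed ratio of a term's relative frequency in consumer vs professional text."""
    consumer_total = sum(consumer_counts.values())
    professional_total = sum(professional_counts.values())
    consumer_freq = (consumer_counts.get(term, 0) + smoothing) / (consumer_total + smoothing)
    professional_freq = (professional_counts.get(term, 0) + smoothing) / (
        professional_total + smoothing
    )
    return consumer_freq / professional_freq


def label_pair(term_a, term_b, consumer_counts, professional_counts):
    """Return (consumer_term, professional_term) for a pair of synonymous terms."""
    score_a = consumer_score(term_a, consumer_counts, professional_counts)
    score_b = consumer_score(term_b, consumer_counts, professional_counts)
    return (term_a, term_b) if score_a >= score_b else (term_b, term_a)


# Invented toy counts standing in for a health forum and for abstracts.
forum_counts = {"heart attack": 90, "myocardial infarction": 10}
abstract_counts = {"heart attack": 20, "myocardial infarction": 80}
print(label_pair("myocardial infarction", "heart attack", forum_counts, abstract_counts))
```

    Smoothing keeps the ratio finite when a term is missing from one corpus. Whether the original measure does something similar is not stated in the abstract.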

  17. Doing Mathematics with Purpose: Mathematical Text Types

    Science.gov (United States)

    Dostal, Hannah M.; Robinson, Richard

    2018-01-01

    Mathematical literacy includes learning to read and write different types of mathematical texts as part of purposeful mathematical meaning making. Thus in this article, we describe how learning to read and write mathematical texts (proof text, algorithmic text, algebraic/symbolic text, and visual text) supports the development of students'…

  18. The socio-demographics of texting

    DEFF Research Database (Denmark)

    Ling, Richard; Bertel, Troels Fibæk; Sundsøy, Pål

    2012-01-01

    Who texts, and with whom do they text? This article examines the use of texting using metered traffic data from a large dataset (nearly 400 million anonymous text messages). We ask 1) How much do different age groups use mobile phone based texting (SMS)? 2) How wide is the circle of texting...

  19. Machine printed text and handwriting identification in noisy document images.

    Science.gov (United States)

    Zheng, Yefeng; Li, Huiping; Doermann, David

    2004-03-01

    In this paper, we address the problem of the identification of text in noisy document images. We are especially focused on segmenting and distinguishing between handwriting and machine printed text because: 1) Handwriting in a document often indicates corrections, additions, or other supplemental information that should be treated differently from the main content and 2) the segmentation and recognition techniques required for machine printed and handwritten text are significantly different. A novel aspect of our approach is that we treat noise as a separate class and model noise based on selected features. Trained Fisher classifiers are used to identify machine printed text and handwriting from noise and we further exploit context to refine the classification. A Markov Random Field-based (MRF) approach is used to model the geometrical structure of the printed text, handwriting, and noise to rectify misclassifications. Experimental results show that our approach is robust and can significantly improve page segmentation in noisy document collections.

  20. A text-mining system for extracting metabolic reactions from full-text articles.

    Science.gov (United States)

    Czarnecki, Jan; Nobeli, Irene; Smith, Adrian M; Shepherd, Adrian J

    2012-07-23

    Increasingly biological text mining research is focusing on the extraction of complex relationships relevant to the construction and curation of biological networks and pathways. However, one important category of pathway - metabolic pathways - has been largely neglected. Here we present a relatively simple method for extracting metabolic reaction information from free text that scores different permutations of assigned entities (enzymes and metabolites) within a given sentence based on the presence and location of stemmed keywords. This method extends an approach that has proved effective in the context of the extraction of protein-protein interactions. When evaluated on a set of manually-curated metabolic pathways using standard performance criteria, our method performs surprisingly well. Precision and recall rates are comparable to those previously achieved for the well-known protein-protein interaction extraction task. We conclude that automated metabolic pathway construction is more tractable than has often been assumed, and that (as in the case of protein-protein interaction extraction) relatively simple text-mining approaches can prove surprisingly effective. It is hoped that these results will provide an impetus to further research and act as a useful benchmark for judging the performance of more sophisticated methods that are yet to be developed.

  1. Terminology extraction from medical texts in Polish.

    Science.gov (United States)

    Marciniak, Małgorzata; Mykowiecka, Agnieszka

    2014-01-01

    Hospital documents contain free text describing the most important facts relating to patients and their illnesses. These documents are written in specific language containing medical terminology related to hospital treatment. Their automatic processing can help in verifying the consistency of hospital documentation and obtaining statistical data. To perform this task we need information on the phrases we are looking for. At the moment, clinical Polish resources are sparse. The existing terminologies, such as Polish Medical Subject Headings (MeSH), do not provide sufficient coverage for clinical tasks. It would be helpful therefore if it were possible to automatically prepare, on the basis of a data sample, an initial set of terms which, after manual verification, could be used for the purpose of information extraction. Using a combination of linguistic and statistical methods for processing over 1200 children's hospital discharge records, we obtained a list of single and multiword terms used in hospital discharge documents written in Polish. The phrases are ordered according to their presumed importance in domain texts measured by the frequency of use of a phrase and the variety of its contexts. The evaluation showed that the automatically identified phrases cover about 84% of terms in domain texts. At the top of the ranked list, only 4% out of 400 terms were incorrect while out of the final 200, 20% of expressions were either not domain related or syntactically incorrect. We also observed that 70% of the obtained terms are not included in the Polish MeSH. Automatic terminology extraction can give results which are of a quality high enough to be taken as a starting point for building domain related terminological dictionaries or ontologies. This approach can be useful for preparing terminological resources for very specific subdomains for which no relevant terminologies already exist. The evaluation performed showed that none of the tested ranking procedures were

  2. The Holy Text and Violence : Levinas and Fundamentalism

    NARCIS (Netherlands)

    Poorthuis, Marcel; Breitlin, Andris; Bremmers, Chris; Cools, Arthur

    2015-01-01

    Levinas' rejection of a historical-critical approach to sacred texts, as well as his depreciation of Spinoza's view of the Bible, might bring him close to fundamentalism. A thorough analysis is necessary to demonstrate the essential differences.

  3. Bengali text summarization by sentence extraction

    OpenAIRE

    Sarkar, Kamal

    2012-01-01

    Text summarization is the process of producing an abstract or a summary by selecting a significant portion of the information from one or more texts. In an automatic text summarization process, a text is given to the computer and the computer returns a shorter, less redundant extract or abstract of the original text(s). Many techniques have been developed for summarizing English text(s), but very few attempts have been made at Bengali text summarization. This paper presents a method for Bengali ...

  4. SIAM 2007 Text Mining Competition dataset

    Data.gov (United States)

    National Aeronautics and Space Administration — Subject Area: Text Mining Description: This is the dataset used for the SIAM 2007 Text Mining competition. This competition focused on developing text mining...

  5. Measurement of [Formula: see text] polarisation in [Formula: see text] collisions at [Formula: see text] = 7 TeV.

    Science.gov (United States)

    Aaij, R; Adeva, B; Adinolfi, M; Affolder, A; Ajaltouni, Z; Albrecht, J; Alessio, F; Alexander, M; Ali, S; Alkhazov, G; Alvarez Cartelle, P; Alves, A A; Amato, S; Amerio, S; Amhis, Y; An, L; Anderlini, L; Anderson, J; Andreassen, R; Andreotti, M; Andrews, J E; Appleby, R B; Aquines Gutierrez, O; Archilli, F; Artamonov, A; Artuso, M; Aslanides, E; Auriemma, G; Baalouch, M; Bachmann, S; Back, J J; Badalov, A; Balagura, V; Baldini, W; Barlow, R J; Barschel, C; Barsuk, S; Barter, W; Batozskaya, V; Bauer, Th; Bay, A; Beddow, J; Bedeschi, F; Bediaga, I; Belogurov, S; Belous, K; Belyaev, I; Ben-Haim, E; Bencivenni, G; Benson, S; Benton, J; Berezhnoy, A; Bernet, R; Bettler, M-O; van Beuzekom, M; Bien, A; Bifani, S; Bird, T; Bizzeti, A; Bjørnstad, P M; Blake, T; Blanc, F; Blouw, J; Blusk, S; Bocci, V; Bondar, A; Bondar, N; Bonivento, W; Borghi, S; Borgia, A; Borsato, M; Bowcock, T J V; Bowen, E; Bozzi, C; Brambach, T; van den Brand, J; Bressieux, J; Brett, D; Britsch, M; Britton, T; Brook, N H; Brown, H; Bursche, A; Busetto, G; Buytaert, J; Cadeddu, S; Calabrese, R; Callot, O; Calvi, M; Calvo Gomez, M; Camboni, A; Campana, P; Campora Perez, D; Carbone, A; Carboni, G; Cardinale, R; Cardini, A; Carranza-Mejia, H; Carson, L; Carvalho Akiba, K; Casse, G; Cassina, L; Castillo Garcia, L; Cattaneo, M; Cauet, Ch; Cenci, R; Charles, M; Charpentier, Ph; Cheung, S-F; Chiapolini, N; Chrzaszcz, M; Ciba, K; Cid Vidal, X; Ciezarek, G; Clarke, P E L; Clemencic, M; Cliff, H V; Closier, J; Coca, C; Coco, V; Cogan, J; Cogneras, E; Collins, P; Comerma-Montells, A; Contu, A; Cook, A; Coombes, M; Coquereau, S; Corti, G; Corvo, M; Counts, I; Couturier, B; Cowan, G A; Craik, D C; Cruz Torres, M; Cunliffe, S; Currie, R; D'Ambrosio, C; Dalseno, J; David, P; David, P N Y; Davis, A; De Bruyn, K; De Capua, S; De Cian, M; De Miranda, J M; De Paula, L; De Silva, W; De Simone, P; Decamp, D; Deckenhoff, M; Del Buono, L; Déléage, N; Derkach, D; Deschamps, O; Dettori, F; Di Canto, A; Dijkstra, H; 
Donleavy, S; Dordei, F; Dorigo, M; Dosil Suárez, A; Dossett, D; Dovbnya, A; Dupertuis, F; Durante, P; Dzhelyadin, R; Dziurda, A; Dzyuba, A; Easo, S; Egede, U; Egorychev, V; Eidelman, S; Eisenhardt, S; Eitschberger, U; Ekelhof, R; Eklund, L; El Rifai, I; Elsasser, Ch; Esen, S; Evans, T; Falabella, A; Färber, C; Farinelli, C; Farry, S; Ferguson, D; Fernandez Albor, V; Ferreira Rodrigues, F; Ferro-Luzzi, M; Filippov, S; Fiore, M; Fiorini, M; Firlej, M; Fitzpatrick, C; Fiutowski, T; Fontana, M; Fontanelli, F; Forty, R; Francisco, O; Frank, M; Frei, C; Frosini, M; Fu, J; Furfaro, E; Gallas Torreira, A; Galli, D; Gandelman, M; Gandini, P; Gao, Y; Garofoli, J; Garra Tico, J; Garrido, L; Gaspar, C; Gauld, R; Gavardi, L; Gersabeck, E; Gersabeck, M; Gershon, T; Ghez, Ph; Gianelle, A; Giani, S; Gibson, V; Giubega, L; Gligorov, V V; Göbel, C; Golubkov, D; Golutvin, A; Gomes, A; Gordon, H; Gotti, C; Grabalosa Gándara, M; Graciani Diaz, R; Granado Cardoso, L A; Graugés, E; Graziani, G; Grecu, A; Greening, E; Gregson, S; Griffith, P; Grillo, L; Grünberg, O; Gui, B; Gushchin, E; Guz, Yu; Gys, T; Hadjivasiliou, C; Haefeli, G; Haen, C; Haines, S C; Hall, S; Hamilton, B; Hampson, T; Han, X; Hansmann-Menzemer, S; Harnew, N; Harnew, S T; Harrison, J; Hartmann, T; He, J; Head, T; Heijne, V; Hennessy, K; Henrard, P; Henry, L; Hernando Morata, J A; van Herwijnen, E; Heß, M; Hicheur, A; Hill, D; Hoballah, M; Hombach, C; Hulsbergen, W; Hunt, P; Hussain, N; Hutchcroft, D; Hynds, D; Iakovenko, V; Idzik, M; Ilten, P; Jacobsson, R; Jaeger, A; Jalocha, J; Jans, E; Jaton, P; Jawahery, A; Jezabek, M; Jing, F; John, M; Johnson, D; Jones, C R; Joram, C; Jost, B; Jurik, N; Kaballo, M; Kandybei, S; Kanso, W; Karacson, M; Karbach, T M; Kelsey, M; Kenyon, I R; Ketel, T; Khanji, B; Khurewathanakul, C; Klaver, S; Kochebina, O; Kolpin, M; Komarov, I; Koopman, R F; Koppenburg, P; Korolev, M; Kozlinskiy, A; Kravchuk, L; Kreplin, K; Kreps, M; Krocker, G; Krokovny, P; Kruse, F; Kucharczyk, M; Kudryavtsev, V; 
Kurek, K; Kvaratskheliya, T; La Thi, V N; Lacarrere, D; Lafferty, G; Lai, A; Lambert, D; Lambert, R W; Lanciotti, E; Lanfranchi, G; Langenbruch, C; Latham, T; Lazzeroni, C; Le Gac, R; van Leerdam, J; Lees, J-P; Lefèvre, R; Leflat, A; Lefrançois, J; Leo, S; Leroy, O; Lesiak, T; Leverington, B; Li, Y; Liles, M; Lindner, R; Linn, C; Lionetto, F; Liu, B; Liu, G; Lohn, S; Longstaff, I; Longstaff, I; Lopes, J H; Lopez-March, N; Lowdon, P; Lu, H; Lucchesi, D; Luisier, J; Luo, H; Lupato, A; Luppi, E; Lupton, O; Machefert, F; Machikhiliyan, I V; Maciuc, F; Maev, O; Malde, S; Manca, G; Mancinelli, G; Manzali, M; Maratas, J; Marchand, J F; Marconi, U; Marino, P; Märki, R; Marks, J; Martellotti, G; Martens, A; Martín Sánchez, A; Martinelli, M; Martinez Santos, D; Martinez Vidal, F; Martins Tostes, D; Massafferri, A; Matev, R; Mathe, Z; Matteuzzi, C; Mazurov, A; McCann, M; McCarthy, J; McNab, A; McNulty, R; McSkelly, B; Meadows, B; Meier, F; Meissner, M; Merk, M; Milanes, D A; Minard, M-N; Molina Rodriguez, J; Monteil, S; Moran, D; Morandin, M; Morawski, P; Mordà, A; Morello, M J; Moron, J; Mountain, R; Muheim, F; Müller, K; Muresan, R; Muster, B; Naik, P; Nakada, T; Nandakumar, R; Nasteva, I; Needham, M; Neri, N; Neubert, S; Neufeld, N; Neuner, M; Nguyen, A D; Nguyen, T D; Nguyen-Mau, C; Nicol, M; Niess, V; Niet, R; Nikitin, N; Nikodem, T; Novoselov, A; Oblakowska-Mucha, A; Obraztsov, V; Oggero, S; Ogilvy, S; Okhrimenko, O; Oldeman, R; Onderwater, G; Orlandea, M; Otalora Goicochea, J M; Owen, P; Oyanguren, A; Pal, B K; Palano, A; Palombo, F; Palutan, M; Panman, J; Papanestis, A; Pappagallo, M; Parkes, C; Parkinson, C J; Passaleva, G; Patel, G D; Patel, M; Patrignani, C; Pazos Alvarez, A; Pearce, A; Pellegrino, A; Penso, G; Pepe Altarelli, M; Perazzini, S; Perez Trigo, E; Perret, P; Perrin-Terrin, M; Pescatore, L; Pesen, E; Petridis, K; Petrolini, A; Picatoste Olloqui, E; Pietrzyk, B; Pilař, T; Pinci, D; Pistone, A; Playfer, S; Plo Casasus, M; Polci, F; Polok, G; Poluektov, A; 
Polycarpo, E; Popov, A; Popov, D; Popovici, B; Potterat, C; Powell, A; Prisciandaro, J; Pritchard, A; Prouve, C; Pugatch, V; Puig Navarro, A; Punzi, G; Qian, W; Rachwal, B; Rademacker, J H; Rakotomiaramanana, B; Rama, M; Rangel, M S; Raniuk, I; Rauschmayr, N; Raven, G; Redford, S; Reichert, S; Reid, M M; Dos Reis, A C; Ricciardi, S; Richards, A; Rinnert, K; Rives Molina, V; Roa Romero, D A; Robbe, P; Rodrigues, A B; Rodrigues, E; Rodriguez Perez, P; Roiser, S; Romanovsky, V; Romero Vidal, A; Rotondo, M; Rouvinet, J; Ruf, T; Ruffini, F; Ruiz, H; Ruiz Valls, P; Sabatino, G; Saborido Silva, J J; Sagidova, N; Sail, P; Saitta, B; Salustino Guimaraes, V; Sanchez Mayordomo, C; Sanmartin Sedes, B; Santacesaria, R; Santamarina Rios, C; Santovetti, E; Sapunov, M; Sarti, A; Satriano, C; Satta, A; Savrie, M; Savrina, D; Schiller, M; Schindler, H; Schlupp, M; Schmelling, M; Schmidt, B; Schneider, O; Schopper, A; Schune, M-H; Schwemmer, R; Sciascia, B; Sciubba, A; Seco, M; Semennikov, A; Senderowska, K; Sepp, I; Serra, N; Serrano, J; Sestini, L; Seyfert, P; Shapkin, M; Shapoval, I; Shcheglov, Y; Shears, T; Shekhtman, L; Shevchenko, V; Shires, A; Silva Coutinho, R; Simi, G; Sirendi, M; Skidmore, N; Skwarnicki, T; Smith, N A; Smith, E; Smith, E; Smith, J; Smith, M; Snoek, H; Sokoloff, M D; Soler, F J P; Soomro, F; Souza, D; Souza De Paula, B; Spaan, B; Sparkes, A; Spinella, F; Spradlin, P; Stagni, F; Stahl, S; Steinkamp, O; Stenyakin, O; Stevenson, S; Stoica, S; Stone, S; Storaci, B; Stracka, S; Straticiuc, M; Straumann, U; Stroili, R; Subbiah, V K; Sun, L; Sutcliffe, W; Swientek, K; Swientek, S; Syropoulos, V; Szczekowski, M; Szczypka, P; Szilard, D; Szumlak, T; T'Jampens, S; Teklishyn, M; Tellarini, G; Teodorescu, E; Teubert, F; Thomas, C; Thomas, E; van Tilburg, J; Tisserand, V; Tobin, M; Tolk, S; Tomassetti, L; Tonelli, D; Topp-Joergensen, S; Torr, N; Tournefier, E; Tourneur, S; Tran, M T; Tresch, M; Tsaregorodtsev, A; Tsopelas, P; Tuning, N; Ubeda Garcia, M; Ukleja, A; 
Ustyuzhanin, A; Uwer, U; Vagnoni, V; Valenti, G; Vallier, A; Vazquez Gomez, R; Vazquez Regueiro, P; Vázquez Sierra, C; Vecchi, S; Velthuis, J J; Veltri, M; Veneziano, G; Vesterinen, M; Viaud, B; Vieira, D; Vieites Diaz, M; Vilasis-Cardona, X; Vollhardt, A; Volyanskyy, D; Voong, D; Vorobyev, A; Vorobyev, V; Voß, C; Voss, H; de Vries, J A; Waldi, R; Wallace, C; Wallace, R; Walsh, J; Wandernoth, S; Wang, J; Ward, D R; Watson, N K; Webber, A D; Websdale, D; Whitehead, M; Wicht, J; Wiedner, D; Wiggers, L; Wilkinson, G; Williams, M P; Williams, M; Wilson, F F; Wimberley, J; Wishahi, J; Wislicki, W; Witek, M; Wormser, G; Wotton, S A; Wright, S; Wu, S; Wyllie, K; Xie, Y; Xing, Z; Xu, Z; Yang, Z; Yuan, X; Yushchenko, O; Zangoli, M; Zavertyaev, M; Zhang, F; Zhang, L; Zhang, W C; Zhang, Y; Zhelezov, A; Zhokhov, A; Zhong, L; Zvyagin, A

    The polarisation of prompt [Formula: see text] mesons is measured by performing an angular analysis of [Formula: see text] decays using proton-proton collision data, corresponding to an integrated luminosity of 1.0[Formula: see text], collected by the LHCb detector at a centre-of-mass energy of 7 TeV. The polarisation is measured in bins of transverse momentum [Formula: see text] and rapidity [Formula: see text] in the kinematic region [Formula: see text] and [Formula: see text], and is compared to theoretical models. No significant polarisation is observed.

  6. MANAGING THE TRANSLATION OF ECONOMIC TEXTS

    Directory of Open Access Journals (Sweden)

    Pop Anamaria Mirabela

    2012-12-01

    Full Text Available Theoretically, translation may pass as science; practically, it seems closer to art. Translation is a challenging activity requiring a set of abilities and posing a number of difficulties that appear during the translation process. This paper investigates the extent to which sub-technical vocabulary can constitute a problem for Romanian students of economics reading in English, by looking at the translations produced as independent or pair work during English classes and analyzing the various errors which may have appeared. The demands of efficient business communication have increased in the past few decades because of rising international trade, increased migration, globalization, the recognition of linguistic minorities, and the expansion of the mass media and technology. All this led us to approach the topic of translation, which is a job that requires skills, the stages of research necessary for transfer into the target language, training, experience and a good sense of languages. The paper defines the theoretical issues and terminology (translation, types of translation, economic texts) and then focuses on presenting the practical work carried out by second-year students throughout the academic year. Considering that only 28% of the entire European population can read English, and that even fewer people in South America and Asia can, it is obvious that effective communication of business matters relies on an accurate understanding of terminology. Economics is a field of knowledge in accelerated scientific and technological development. As there is a permanent and ever-increasing need to update their knowledge quickly, economists read and learn directly in the original language of the publication and stick to it in daily usage, including conferences, scientific events and articles written in Romanian. Besides properly researching the markets, finding distribution channels, and dealing with legal

  7. The text plan concept: contributions to the writing planning process

    Directory of Open Access Journals (Sweden)

    Ana Lúcia Tinoco Cabral

    2013-12-01

    Full Text Available Students at different levels, ranging from the early grades up to PhD, face problems in both comprehension and text production. This paper focuses on the text plan concept according to the DTA (Discourse Text Analysis) approach, i.e., a principle of organization that allows students to put the production intention into practice as well as to arrange text information while producing, and that is responsible for the text's compositional structure (Adam, 2008). The study analyzes the relation between the text plan and the writing planning process, in which the former provides the latter with theoretical support. To develop this research, the study covers some issues related to the reading skill, analyzes an argumentative text in terms of its textual plan, and presents some reflections on the writing process, focusing on the relation between textual plan and the writing planning process.

  8. Examining Text Complexity in the Early Grades

    Science.gov (United States)

    Fitzgerald, Jill; Elmore, Jeff; Hiebert, Elfrieda H.; Koons, Heather H.; Bowen, Kimberly; Sanford-Moore, Eleanor E.; Stenner, A. Jackson

    2016-01-01

    The Common Core raises the stature of texts to new heights, creating a hubbub. The fuss is especially messy at the early grades, where children are expected to read more complex texts than in the past. But early-grades teachers have been given little actionable guidance about text complexity. The authors recently examined early-grades texts to…

  9. Text mining in livestock animal science: introducing the potential of text mining to animal sciences.

    Science.gov (United States)

    Sahadevan, S; Hofmann-Apitius, M; Schellander, K; Tesfaye, D; Fluck, J; Friedrich, C M

    2012-10-01

    In biological research, establishing the prior art by searching and collecting information already present in the domain has equal importance as the experiments done. To obtain a complete overview about the relevant knowledge, researchers mainly rely on 2 major information sources: i) various biological databases and ii) scientific publications in the field. The major difference between the 2 information sources is that information from databases is available, typically well structured and condensed. The information content in scientific literature is vastly unstructured; that is, dispersed among the many different sections of scientific text. The traditional method of information extraction from scientific literature occurs by generating a list of relevant publications in the field of interest and manually scanning these texts for relevant information, which is very time consuming. It is more than likely that in using this "classical" approach the researcher misses some relevant information mentioned in the literature or has to go through biological databases to extract further information. Text mining and named entity recognition methods have already been used in human genomics and related fields as a solution to this problem. These methods can process and extract information from large volumes of scientific text. Text mining is defined as the automatic extraction of previously unknown and potentially useful information from text. Named entity recognition (NER) is defined as the method of identifying named entities (names of real world objects; for example, gene/protein names, drugs, enzymes) in text. In animal sciences, text mining and related methods have been briefly used in murine genomics and associated fields, leaving behind other fields of animal sciences, such as livestock genomics. The aim of this work was to develop an information retrieval platform in the livestock domain focusing on livestock publications and the recognition of relevant data from

  10. SAW Classification Algorithm for Chinese Text Classification

    OpenAIRE

    Xiaoli Guo; Huiyu Sun; Tiehua Zhou; Ling Wang; Zhaoyang Qu; Jiannan Zang

    2015-01-01

    Given the explosive growth of data, the increasing volume of text places higher demands on the performance of text categorization, demands that existing classification methods cannot satisfy. Based on a study of existing text classification technology and semantics, this paper puts forward a SAW (Structural Auxiliary Word) algorithm oriented toward Chinese text classification. The algorithm uses the special space effect of Chinese text where words...

  11. Extractive text summarization system to aid data extraction from full text in systematic review development.

    Science.gov (United States)

    Bui, Duy Duc An; Del Fiol, Guilherme; Hurdle, John F; Jonnalagadda, Siddhartha

    2016-12-01

    Extracting data from publication reports is a standard process in systematic review (SR) development. However, the data extraction process still relies too much on manual effort which is slow, costly, and subject to human error. In this study, we developed a text summarization system aimed at enhancing productivity and reducing errors in the traditional data extraction process. We developed a computer system that used machine learning and natural language processing approaches to automatically generate summaries of full-text scientific publications. The summaries at the sentence and fragment levels were evaluated in finding common clinical SR data elements such as sample size, group size, and PICO values. We compared the computer-generated summaries with human written summaries (title and abstract) in terms of the presence of necessary information for the data extraction as presented in the Cochrane review's study characteristics tables. At the sentence level, the computer-generated summaries covered more information than humans do for systematic reviews (recall 91.2% vs. 83.8%, p<0.001). They also had a better density of relevant sentences (precision 59% vs. 39%, p<0.001). At the fragment level, the ensemble approach combining rule-based, concept mapping, and dictionary-based methods performed better than individual methods alone, achieving an 84.7% F-measure. Computer-generated summaries are potential alternative information sources for data extraction in systematic review development. Machine learning and natural language processing are promising approaches to the development of such an extractive summarization system. Copyright © 2016 Elsevier Inc. All rights reserved.
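
    The evaluation figures quoted above (recall, precision, F-measure) follow the standard information-retrieval definitions. As a minimal sketch, assuming sentences are identified by hypothetical integer IDs and that a set of "relevant" sentence IDs is available as a gold standard, the metrics could be computed as follows. The function names are illustrative and are not taken from the authors' system.

```python
def precision_recall(selected, relevant):
    """Return (precision, recall) for a set of selected items
    against a gold-standard set of relevant items."""
    selected, relevant = set(selected), set(relevant)
    true_positives = len(selected & relevant)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.
    beta = 1 gives the balanced F1 score."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


if __name__ == "__main__":
    # Hypothetical sentence IDs: the summary picked 1, 2, 3, 4;
    # the gold standard marks 2, 3, 5 as relevant.
    p, r = precision_recall({1, 2, 3, 4}, {2, 3, 5})
    print(p, r, f_measure(p, r))
```

    The same `f_measure` function can be applied to the sentence-level precision and recall quoted in the abstract; the abstract itself reports an F-measure only for the fragment-level ensemble.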

  12. Text analysis with R for students of literature

    CERN Document Server

    Jockers, Matthew L

    2014-01-01

    Text Analysis with R for Students of Literature is written with students and scholars of literature in mind but will be applicable to other humanists and social scientists wishing to extend their methodological tool kit to include quantitative and computational approaches to the study of text. Computation provides access to information in text that we simply cannot gather using traditional qualitative methods of close reading and human synthesis. Text Analysis with R for Students of Literature provides a practical introduction to computational text analysis using the open source programming language R. R is extremely popular throughout the sciences and because of its accessibility, R is now used increasingly in other research areas. Readers begin working with text right away and each chapter works through a new technique or process such that readers gain a broad exposure to core R procedures and a basic understanding of the possibilities of computational text analysis at both the micro and macro scale. Each c...

  13. VideoSET: Video Summary Evaluation through Text

    OpenAIRE

    Yeung, Serena; Fathi, Alireza; Fei-Fei, Li

    2014-01-01

    In this paper we present VideoSET, a method for Video Summary Evaluation through Text that can evaluate how well a video summary is able to retain the semantic information contained in its original video. We observe that semantics is most easily expressed in words, and develop a text-based approach for the evaluation. Given a video summary, a text representation of the video summary is first generated, and an NLP-based metric is then used to measure its semantic distance to ground-truth text ...

  14. A Proposed Arabic Handwritten Text Normalization Method

    Directory of Open Access Journals (Sweden)

    Tarik Abu-Ain

    2014-11-01

    Full Text Available Text normalization is an important technique in document image analysis and recognition. It consists of many preprocessing stages, which include slope correction, text padding, skew correction, and straightening the writing line. In this respect, text normalization has an important role in many procedures such as text segmentation, feature extraction and character recognition. In the present article, a new method for text baseline detection, straightening, and slant correction for Arabic handwritten texts is proposed. The method comprises a set of sequential steps: first, component segmentation is done, followed by thinning of the text components; then, the direction features of the skeletons are extracted, and the candidate baseline regions are determined. After that, the correct baseline region is selected, and finally, the baselines of all components are aligned with the writing line. The experiments are conducted on the IFN/ENIT benchmark Arabic dataset. The results show that the proposed method has promising and encouraging performance.

  15. Partition of Ni between olivine and sulfide: the effect of temperature, $f_{\mathrm{O}_2}$ and $f_{\mathrm{S}_2}$

    Science.gov (United States)

    Fleet, M. E.; Macrae, N. D.

    1987-03-01

    The experimental distribution coefficient for Ni/Fe exchange between olivine and monosulfide ($K_{D3}$) is 35.6±1.1 at 1385 °C, $f_{\mathrm{O}_2} = 10^{-8.87}$, $f_{\mathrm{S}_2} = 10^{-1.02}$, and olivine of composition Fo96 to Fo92. These are the physicochemical conditions appropriate to hypothesized sulfur-saturated komatiite magma. The present experiments equilibrated natural olivine grains with sulfide-oxide liquid in the presence of a (Mg, Fe)-alumino-silicate melt. By a variety of different experimental procedures, $K_{D3}$ is shown to be essentially constant at about 30 to 35 in the temperature range 900 to 1400 °C, for olivine of composition Fo97 to Fo0, monosulfide composition with up to 70 mol. % NiS, and a wide range of $f_{\mathrm{O}_2}$ and $f_{\mathrm{S}_2}$.

  16. Arabic text classification using Polynomial Networks

    Directory of Open Access Journals (Sweden)

    Mayy M. Al-Tahrawi

    2015-10-01

    Full Text Available In this paper, an Arabic statistical learning-based text classification system has been developed using Polynomial Neural Networks. Polynomial Networks have recently been applied to English text classification, but they have never been used for Arabic text classification. In this research, we investigate the performance of Polynomial Networks in classifying Arabic texts. Experiments are conducted on a dataset widely used in Arabic text classification: the Al-Jazeera News dataset. We chose this dataset to enable direct comparison of the Polynomial Networks classifier with other well-known classifiers evaluated on it in the Arabic text classification literature. The experimental results show that the Polynomial Networks classifier is competitive with state-of-the-art algorithms in the field of Arabic text classification.

  17. Comprehension challenges in the fourth grade: The roles of text cohesion, text genre, and readers’ prior knowledge

    Directory of Open Access Journals (Sweden)

    Danielle S. McNamara

    2011-07-01

    Full Text Available We examined young readers’ comprehension as a function of text genre (narrative, science), text cohesion (high, low), and readers’ abilities (reading decoding skills and world knowledge). The overarching purpose of this study was to contribute to our understanding of the fourth grade slump. Children in grade 4 read four texts, including one high and one low cohesion text from each genre. Comprehension of each text was assessed with 12 multiple-choice questions and free and cued recall. Comprehension was enhanced by increased knowledge: high knowledge readers showed better comprehension than low knowledge readers and narratives were comprehended better than science texts. Interactions between readers’ knowledge levels and text characteristics indicated that the children showed larger effects of knowledge for science than for narrative texts, and those with more knowledge better understood the low cohesion, narrative texts, showing a reverse cohesion effect. Decoding skill benefited comprehension, but effects of text genre and cohesion depended less on decoding skill than prior knowledge. Overall, the study indicates that the fourth grade slump is at least partially attributable to the emergence of complex dependencies between the nature of the text and the reader’s prior knowledge. The results also suggested that simply adding cohesion cues, and not explanatory information, is not likely to be sufficient for young readers as an approach to improving comprehension of challenging texts.

  18. Comprehension challenges in the fourth grade: The roles of text cohesion, text genre, and readers’ prior knowledge

    Directory of Open Access Journals (Sweden)

    Danielle S. McNAMARA

    2011-11-01

    Full Text Available We examined young readers’ comprehension as a function of text genre (narrative, science), text cohesion (high, low), and readers’ abilities (reading decoding skills and world knowledge). The overarching purpose of this study was to contribute to our understanding of the fourth grade slump. Children in grade 4 read four texts, including one high and one low cohesion text from each genre. Comprehension of each text was assessed with 12 multiple-choice questions and free and cued recall. Comprehension was enhanced by increased knowledge: high knowledge readers showed better comprehension than low knowledge readers and narratives were comprehended better than science texts. Interactions between readers’ knowledge levels and text characteristics indicated that the children showed larger effects of knowledge for science than for narrative texts, and those with more knowledge better understood the low cohesion, narrative texts, showing a reverse cohesion effect. Decoding skill benefited comprehension, but effects of text genre and cohesion depended less on decoding skill than prior knowledge. Overall, the study indicates that the fourth grade slump is at least partially attributable to the emergence of complex dependencies between the nature of the text and the reader’s prior knowledge. The results also suggested that simply adding cohesion cues, and not explanatory information, is not likely to be sufficient for young readers as an approach to improving comprehension of challenging texts.

  19. Multimodal Diversity of Postmodernist Fiction Text

    Directory of Open Access Journals (Sweden)

    U. I. Tykha

    2016-12-01

    Full Text Available The article is devoted to the analysis of structural and functional manifestations of multimodal diversity in postmodernist fiction texts. Multimodality is defined as the coexistence of more than one semiotic mode within a certain context. Multimodal texts feature a diversity of semiotic modes in the communication and development of their narrative. Such experimental texts subvert conventional patterns by introducing various semiotic resources – verbal or non-verbal.

  20. Systematic text condensation: a strategy for qualitative analysis.

    Science.gov (United States)

    Malterud, Kirsti

    2012-12-01

    To present background, principles, and procedures for a strategy for qualitative analysis called systematic text condensation and discuss this approach compared with related strategies. Giorgi's psychological phenomenological analysis is the point of departure and inspiration for systematic text condensation. The basic elements of Giorgi's method and the elaboration of these in systematic text condensation are presented, followed by a detailed description of procedures for analysis according to systematic text condensation. Finally, similarities and differences compared with other frequently applied methods for qualitative analysis are identified, as the foundation of a discussion of strengths and limitations of systematic text condensation. Systematic text condensation is a descriptive and explorative method for thematic cross-case analysis of different types of qualitative data, such as interview studies, observational studies, and analysis of written texts. The method represents a pragmatic approach, although inspired by phenomenological ideas, and various theoretical frameworks can be applied. The procedure consists of the following steps: 1) total impression - from chaos to themes; 2) identifying and sorting meaning units - from themes to codes; 3) condensation - from code to meaning; 4) synthesizing - from condensation to descriptions and concepts. Similarities and differences comparing systematic text condensation with other frequently applied qualitative methods regarding thematic analysis, theoretical methodological framework, analysis procedures, and taxonomy are discussed. Systematic text condensation is a strategy for analysis developed from traditions shared by most of the methods for analysis of qualitative data. The method offers the novice researcher a process of intersubjectivity, reflexivity, and feasibility, while maintaining a responsible level of methodological rigour.

  1. Learning From Short Text Streams With Topic Drifts.

    Science.gov (United States)

    Li, Peipei; He, Lu; Wang, Haiyan; Hu, Xuegang; Zhang, Yuhong; Li, Lei; Wu, Xindong

    2017-09-18

    Short text streams such as search snippets and micro blogs have become popular on the Web with the emergence of social media. Unlike traditional text streams, these data are characterized by short length, weak signal, high volume, high velocity, topic drift, etc. Short text stream classification is hence a very challenging and significant task. However, this challenge has received little attention from the research community. Therefore, a new feature extension approach is proposed for short text stream classification with the help of a large-scale semantic network obtained from a Web corpus. It is built on an incremental ensemble classification model for efficiency. First, more semantic contexts based on the senses of terms in short texts are introduced to compensate for data sparsity using the open semantic network, in which all terms are disambiguated by their semantics to reduce the impact of noise. Second, a concept cluster-based topic drift detection method is proposed to effectively track hidden topic drifts. Finally, extensive studies demonstrate that, compared to several well-known concept drift detection methods for data streams, our approach can detect topic drifts effectively, and that it handles short text streams effectively while remaining as efficient as several state-of-the-art short text classification approaches.

  2. Rational kernels for Arabic Root Extraction and Text Classification

    Directory of Open Access Journals (Sweden)

    Attia Nehar

    2016-04-01

    Full Text Available In this paper, we address the problems of Arabic Text Classification and root extraction using transducers and rational kernels. We introduce a new root extraction approach based on the use of Arabic patterns (Pattern Based Stemmer). Transducers are used to model these patterns, and root extraction is done without relying on any dictionary. Using transducers for extracting roots, documents are transformed into finite state transducers. This document representation allows us to use and explore rational kernels as a framework for Arabic Text Classification. Root extraction experiments are conducted on three word collections and yield 75.6% accuracy. Classification experiments are done on the Saudi Press Agency dataset, and N-gram kernels are tested with different values of N. Accuracy and F1 reach 90.79% and 62.93%, respectively. These results show that our approach, when compared with other approaches, is promising, especially in terms of accuracy and F1.
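    The abstract does not spell out the kernel it uses. As a rough illustration only, a common form of the N-gram kernel sums, over every n-gram shared by two sequences, the product of its counts in each. The sketch below applies that idea to plain character strings rather than to the finite state transducers used in the paper; the function names are hypothetical.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count every contiguous character n-gram in `text`."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_kernel(x, y, n):
    """Sum, over n-grams shared by x and y, of the product of their counts."""
    cx, cy = ngram_counts(x, n), ngram_counts(y, n)
    # Counter returns 0 for n-grams missing from y, so they add nothing.
    return sum(count * cy[gram] for gram, count in cx.items())


if __name__ == "__main__":
    # Try several values of N, as the abstract describes doing.
    for n in (1, 2, 3):
        print(n, ngram_kernel("kataba", "kitab", n))
```

    Larger values of N make the kernel stricter, since longer substrings must match exactly; which N works best is an empirical question that the abstract leaves to the experiments.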

  3. Youth Texting: Help or Hindrance to Literacy?

    Science.gov (United States)

    Zebroff, Dmitri

    2018-01-01

    An extensive amount of research has been performed in recent years into the widespread practice of text messaging in youth. As part of this broad area of research, the associations between youth texting and literacy have been investigated in a variety of contexts. A comprehensive, semi-systematic review of the literature into texting and literacy…

  4. Choices of texts for literary education

    DEFF Research Database (Denmark)

    Skyggebjerg, Anna Karlskov

    This paper charts the general implications of the choice of texts for literature teaching in the Danish school system, especially in Grades 8 and 9. It will analyze and discuss the premises of the choice of texts, and the possibilities of a certain choice of text in a concrete classroom situation...

  5. Effects of Text Messaging on Academic Performance

    OpenAIRE

    Barks Amanda; Searight H. Russell; Ratwik Susan

    2011-01-01

    University students frequently send and receive cellular phone text messages during classroom instruction. Cognitive psychology research indicates that multi-tasking is frequently associated with performance cost. However, university students often have considerable experience with electronic multi-tasking and may believe that they can devote necessary attention to a classroom lecture while sending and receiving text messages. In the current study, university students who used text messaging were ...

  6. Text-Picture Relations in Cooking Instructions

    NARCIS (Netherlands)

    van der Sluis, Ielka; Leito, Shadira; Redeker, Gisela; Bunt, Harry

    2016-01-01

    Like many other instructions, recipes on packages with ready-to-use ingredients for a dish combine a series of pictures with short text paragraphs. The information presentation in such multimodal instructions can be compact (either text or picture) and/or cohesive (text and picture). In an

  7. Academic Journal Embargoes and Full Text Databases.

    Science.gov (United States)

    Brooks, Sam

    2003-01-01

    Documents the reasons for embargoes of academic journals in full text databases (i.e., publisher-imposed delays on the availability of full text content) and provides insight regarding common misconceptions. Tables present data on selected journals covering a cross-section of subjects and publishers and comparing two full text business databases.…

  8. Inclusion in the Workplace - Text Version | NREL

    Science.gov (United States)

    This is the text version for the Inclusion: Leading by Example video. I'm Martin Keller. I'm the NREL of the laboratory. Another very important element in inclusion is diversity. Because if we have a

  9. Effects of Text Messaging on Academic Performance

    Directory of Open Access Journals (Sweden)

    Barks Amanda

    2011-12-01

    Full Text Available University students frequently send and receive cellular phone text messages during classroom instruction. Cognitive psychology research indicates that multi-tasking is frequently associated with performance cost. However, university students often have considerable experience with electronic multi-tasking and may believe that they can devote necessary attention to a classroom lecture while sending and receiving text messages. In the current study, university students who used text messaging were randomly assigned to one of two conditions: 1. a group that sent and received text messages during a lecture or, 2. a group that did not engage in text messaging during the lecture. Participants who engaged in text messaging demonstrated significantly poorer performance on a test covering lecture content compared with the group that did not send and receive text messages. Participants exhibiting higher levels of text messaging skill had significantly lower test scores than participants who were less proficient at text messaging. It is hypothesized that in terms of retention of lecture material, more frequent task shifting by those with greater text messaging proficiency contributed to poorer performance. Overall, the findings do not support the view, held by many university students, that this form of multitasking has little effect on the acquisition of lecture content. Results provide empirical support for teachers and professors who ban text messaging in the classroom.

  10. The artists' text as work of art

    NARCIS (Netherlands)

    van Rijn, I.A.M.J.

    2017-01-01

    Artists’ texts are texts written and produced by visual artists. Their number increasing since the 2000s, it becomes important to clarify their obscure relationship to art institutions. Analysing and comparing four different artists’ texts on a textual level, this research proposes an alternative to

  11. An Embedded Application for Degraded Text Recognition

    Directory of Open Access Journals (Sweden)

    Thillou Céline

    2005-01-01

    Full Text Available This paper describes a mobile device which tries to give the blind or visually impaired access to text information. Three key technologies are required for this system: text detection, optical character recognition, and speech synthesis. Blind users and the mobile environment imply two strong constraints. First, pictures will be taken without control on camera settings and a priori information on text (font or size) and background. The second issue is to link several techniques together with an optimal compromise between computational constraints and recognition efficiency. We will present the overall description of the system from text detection to OCR error correction.

  12. The Instructional Text like a Textual Genre

    Directory of Open Access Journals (Sweden)

    Adiane Fogali Marinello

    2011-07-01

    Full Text Available This article analyses the instructional text as a textual genre and is part of the research called Reading and text production from the textual genre perspective, done at Universidade de Caxias do Sul, Campus Universitário da Região dos Vinhedos. Firstly, some theoretical assumptions about textual genre are presented, then, the instructional text is characterized. After that an instructional text is analyzed and, finally, some activities related to reading and writing of the mentioned genre directed to High School and University students are suggested.

  13. Text segmentation in degraded historical document images

    Directory of Open Access Journals (Sweden)

    A.S. Kavitha

    2016-07-01

    Full Text Available Text segmentation from degraded historical Indus script images helps an Optical Character Recognizer (OCR) to achieve good recognition rates for Indus scripts; however, it is challenging due to the complex background in such images. In this paper, we present a new method for segmenting text and non-text in Indus documents based on the fact that text components are less cursive compared to non-text ones. To achieve this, we propose a new combination of Sobel and Laplacian for enhancing degraded low contrast pixels. Then the proposed method generates skeletons for text components in enhanced images to reduce computational burdens, which in turn helps in studying component structures efficiently. We propose to study the cursiveness of components based on branch information to remove false text components. The proposed method introduces the nearest neighbor criterion for grouping components in the same line, which results in clusters. Furthermore, the proposed method classifies these clusters into text and non-text clusters based on characteristics of text components. We evaluate the proposed method on a large dataset containing varieties of images. The results are compared with the existing methods to show that the proposed method is effective in terms of recall and precision.

  14. The nuclear modification of charged particles in Pb-Pb at $\sqrt{\text{s}_\text{NN}} = \text{5.02}\,\text{TeV}$ measured with ALICE

    CERN Document Server

    Gronefeld, Julius

    2016-09-21

    The study of inclusive charged-particle production in heavy-ion collisions provides insights into the density of the medium and the energy-loss mechanisms. The observed suppression of high-$\textit{p}_\text{T}$ yield is generally attributed to energy loss of partons as they propagate through a deconfined state of quarks and gluons - Quark-Gluon Plasma (QGP) - predicted by QCD. Such measurements allow the characterization of the QGP by comparison with models. In these proceedings, results on high-$\textit{p}_\text{T}$ particle production measured by ALICE in Pb-Pb collisions at $\sqrt{\text{s}_\text{NN}}\, = 5.02\ \rm{TeV}$ as well as in pp at $\sqrt{\text{s}}\,=5.02\ \rm{TeV}$ are presented for the first time. The nuclear modification factors ($\text{R}_\text{AA}$) in Pb-Pb collisions are presented and compared with model calculations.
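    The abstract does not define the nuclear modification factor. As background, the definition commonly used in heavy-ion physics is given below; the notation is an assumption and is not taken from these proceedings.

$$
R_{\text{AA}}(p_{\text{T}}) = \frac{1}{\langle T_{\text{AA}} \rangle} \, \frac{\mathrm{d}^2 N_{\text{ch}}^{\text{AA}} / \mathrm{d}\eta \, \mathrm{d}p_{\text{T}}}{\mathrm{d}^2 \sigma_{\text{ch}}^{\text{pp}} / \mathrm{d}\eta \, \mathrm{d}p_{\text{T}}}
$$

    Here the numerator is the charged-particle yield per Pb-Pb collision, the denominator is the charged-particle cross section in pp collisions at the same energy, and $\langle T_{\text{AA}} \rangle$ is the average nuclear overlap function, which scales the pp reference by the number of binary nucleon-nucleon collisions. With this normalization, $R_{\text{AA}} = 1$ at high $p_{\text{T}}$ would indicate no nuclear modification, and values below 1 correspond to the suppression described above.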

  15. Gender Analysis On Islamic Texts: A Study On Its Accuracy

    Directory of Open Access Journals (Sweden)

    Muchammad Ichsan

    2014-06-01

    Full Text Available The gender equality movement is spreading all over the world, including in Indonesia, where Muslim gender activists have made great efforts to ensure gender fairness and equality among people. One of their efforts is emphasizing the urgency of reinterpreting Islamic texts. They insist on the reinterpretation of Islamic texts based on gender perspective and analysis due to the existence of many Islamic texts that trespass the principles of gender equality and fairness they have been fighting for. This paper aims at examining the accuracy of using gender perspective as a tool for analyzing Islamic texts. It is found that using gender perspective and analysis for reinterpreting Islamic texts is not in line with the Islamic principles and will only produce laws and points of view which deviate from Islamic teachings. To reach the goals of this study, a descriptive-analytical approach is employed.

  16. Intertextuality: On the use of the Bible in mystical texts

    Directory of Open Access Journals (Sweden)

    Kees Waaijman

    2010-11-01

    Full Text Available This article discussed the use of the Bible in mystical texts by focusing on intertextuality as a literary approach which analyses the intersection of texts. It investigated how mystical texts, as phenotexts, relate to the Bible as archetext: firstly, the intertextual relations affect the surface of the text in a mono-causal way and secondly, they govern the production of meaning reciprocally. The article also discussed forms of intersection (quotations, collage, allusions and reproduction before it analysed the three intertextual strategies producing meaning: participation, detachment and change or rearrangement. Finally, six functions and dimensions of meaning were delineated in the intertextual dynamic between the Bible and the mystical texts. In these the Bible serves as an authoritative framework for argumentation, as a guide and blueprint of the mystical way, as a vocabulary of mystical experience, as an initiation into the divine infinity, as the place of mystical transformation in love and as the articulation of transformation in glory.

  17. Discovering gene annotations in biomedical text databases

    Directory of Open Access Journals (Sweden)

    Ozsoyoglu Gultekin

    2008-03-01

    Full Text Available Abstract Background: Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results: In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns" and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision of 78% at a recall of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion: GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme with respect to "exact matching" with the advantage of locating approximate
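    The abstract reports precision and recall separately. To compare such an operating point with systems that report a single score, the two can be combined into the F1 measure, their harmonic mean. The sketch below is only an illustration of that arithmetic; the resulting F1 (roughly 0.68) is derived here and is not a figure reported by the authors.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Operating point quoted in the abstract: 78% precision at 61% recall.
    print(f1_score(0.78, 0.61))
```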

  18. Arabic text preprocessing for the natural language processing applications

    International Nuclear Information System (INIS)

    Awajan, A.

    2007-01-01

    A new approach for processing vowelized and unvowelized Arabic texts in order to prepare them for Natural Language Processing (NLP) purposes is described. The developed approach is rule-based and made up of four phases: text tokenization, word light stemming, word's morphological analysis and text annotation. The first phase preprocesses the input text in order to isolate the words and represent them in a formal way. The second phase applies a light stemmer in order to extract the stem of each word by eliminating the prefixes and suffixes. The third phase is a rule-based morphological analyzer that determines the root and the morphological pattern for each extracted stem. The last phase produces an annotated text where each word is tagged with its morphological attributes. The preprocessor presented in this paper is capable of dealing with vowelized and unvowelized words, and provides the input words along with relevant linguistic information needed by different applications. It is designed to be used with different NLP applications such as machine translation, text summarization, text correction, information retrieval and automatic vowelization of Arabic Text. (author)
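    The first two phases described above (tokenization and light stemming) can be sketched as follows. This is a minimal illustration, not the paper's rule set: the affix lists, the minimum stem length, and the function names are all assumptions chosen for the example, and the morphological analysis and annotation phases are omitted.

```python
import re

# Illustrative affix lists (assumptions, not the rules used in the paper).
# Longer prefixes come first so that they are tried before their substrings.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "و"]
SUFFIXES = ["ات", "ون", "ين", "ها", "ة"]
MIN_STEM = 3  # never shorten a word below three letters

# Short-vowel marks (fathatan .. sukun); removing them lets vowelized and
# unvowelized spellings of the same word produce the same token.
DIACRITICS = re.compile(r"[\u064B-\u0652]")


def tokenize(text):
    """Phase 1: strip diacritics, then keep runs of Arabic letters."""
    plain = DIACRITICS.sub("", text)
    return re.findall(r"[\u0621-\u064A]+", plain)


def light_stem(word):
    """Phase 2: remove at most one listed prefix and one listed suffix."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM:
            word = word[:-len(suffix)]
            break
    return word


if __name__ == "__main__":
    for token in tokenize("ذهب المعلمون الى المدرسة."):
        print(token, light_stem(token))
```

    A light stemmer of this kind only trims affixes; recovering the root and the morphological pattern is the job of the third, rule-based phase that the abstract describes.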

  19. Populating the Semantic Web by Macro-reading Internet Text

    Science.gov (United States)

    Mitchell, Tom M.; Betteridge, Justin; Carlson, Andrew; Hruschka, Estevam; Wang, Richard

    A key question regarding the future of the semantic web is "how will we acquire structured information to populate the semantic web on a vast scale?" One approach is to enter this information manually. A second approach is to take advantage of pre-existing databases, and to develop common ontologies, publishing standards, and reward systems to make this data widely accessible. We consider here a third approach: developing software that automatically extracts structured information from unstructured text present on the web. We also describe preliminary results demonstrating that machine learning algorithms can learn to extract tens of thousands of facts to populate a diverse ontology, with imperfect but reasonably good accuracy.

  20. Relating interesting quantitative time series patterns with text events and text features

    Science.gov (United States)

    Wanner, Franz; Schreck, Tobias; Jentner, Wolfgang; Sharalieva, Lyubka; Keim, Daniel A.

    2013-12-01

    In many application areas, the key to successful data analysis is the integrated analysis of heterogeneous data. One example is the financial domain, where time-dependent and highly frequent quantitative data (e.g., trading volume and price information) and textual data (e.g., economic and political news reports) need to be considered jointly. Data analysis tools need to support an integrated analysis, which allows studying the relationships between textual news documents and quantitative properties of the stock market price series. In this paper, we describe a workflow and tool that allow a flexible formation of hypotheses about text features and their combinations, which reflect quantitative phenomena observed in stock data. To support such an analysis, we combine the analysis steps of frequent quantitative and text-oriented data using an existing a-priori method. First, based on heuristics we extract interesting intervals and patterns in large time series data. The visual analysis supports the analyst in exploring parameter combinations and their results. The identified time series patterns are then input for the second analysis step, in which all identified intervals of interest are analyzed for frequent patterns co-occurring with financial news. An a-priori method supports the discovery of such sequential temporal patterns. Then, various text features like the degree of sentence nesting, noun phrase complexity, the vocabulary richness, etc. are extracted from the news to obtain meta patterns. Meta patterns are defined by a specific combination of text features which significantly differ from the text features of the remaining news data. Our approach combines a portfolio of visualization and analysis techniques, including time-, cluster- and sequence visualization and analysis functionality. We provide two case studies, showing the effectiveness of our combined quantitative and textual analysis workflow. The workflow can also be generalized to other
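    The abstract names an a-priori method without describing it. Assuming it refers to the classic Apriori idea (an itemset can only be frequent if all of its subsets are frequent), the co-occurrence step could be sketched as below. This is plain frequent-itemset mining, not the sequential variant the authors mention, and the labels in the usage example are invented for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets seen in at least
    `min_support` transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates are the individual items.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-item candidates, then prune
        # every candidate that has an infrequent k-item subset.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


if __name__ == "__main__":
    # Hypothetical data: one set per interval of interest, mixing a
    # time-series pattern label with terms from news in that interval.
    intervals = [
        {"price_drop", "earnings", "lawsuit"},
        {"price_drop", "earnings"},
        {"price_rise", "merger"},
        {"price_drop", "lawsuit"},
    ]
    for itemset, support in sorted(apriori(intervals, 2).items(), key=str):
        print(sorted(itemset), support)
```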

  1. Adaptive Text Entry for Mobile Devices

    DEFF Research Database (Denmark)

    Proschowsky, Morten Smidt

    The reduced size of many mobile devices makes it difficult to enter text with them. The text entry methods are often slow or complicated to use. This affects the performance and user experience of all applications and services on the device. This work introduces new easy-to-use text entry methods...... for mobile devices and a framework for adaptive context-aware language models. Based on analysis of current text entry methods, the requirements to the new text entry methods are established. Transparent User guided Prediction (TUP) is a text entry method for devices with one dimensional touch input. It can...... be touch sensitive wheels, sliders or similar input devices. The interaction design of TUP is done with a combination of high level task models and low level models of human motor behaviour. Three prototypes of TUP are designed and evaluated by more than 30 users. Observations from the evaluations are used...

  2. Planning Multisentential English Text Using Communicative Acts

    Science.gov (United States)

    1990-12-01

    Composition, Vol. XI in series Advances in Discourse Processing, Alex Publishing Corporation. de Joia , A. and Stenton, A. 1980. Terms in Linguistics: A Guide to...investigate how attentional constraints relate to text planning and linguistic realization. 14 SUBJECT TE1MS I I N& De OF PAGES Natural Language Generation...surface form? Page I 4. What is the relation of communicative intentions to text structure and surface form? 5. What effects can texts be designed to have

  3. Text Mining of Supreme Administrative Court Jurisdictions

    OpenAIRE

    Feinerer, Ingo; Hornik, Kurt

    2007-01-01

    Within the last decade text mining, i.e., extracting sensitive information from text corpora, has become a major factor in business intelligence. The automated textual analysis of law corpora is highly valuable because of its impact on a company's legal options and the raw amount of available jurisdiction. The study of supreme court jurisdiction and international law corpora is equally important due to its effects on business sectors. In this paper we use text mining methods to investigate Au...

  4. Science and Technology Text Mining Basic Concepts

    National Research Council Canada - National Science Library

    Losiewicz, Paul

    2003-01-01

    ...). It then presents some of the most widely used data and text mining techniques, including clustering and classification methods, such as nearest neighbor, relational learning models, and genetic...

  5. Using Unlabeled Data to Improve Text Classification

    National Research Council Canada - National Science Library

    Nigam, Kamal P

    2001-01-01

    .... This dissertation demonstrates that supervised learning algorithms that use a small number of labeled examples and many inexpensive unlabeled examples can create high-accuracy text classifiers...

  6. Social network cross-media searching and mining based on user intention

    Institute of Scientific and Technical Information of China (English)

    崔婉秋; 杜军平; 周南; 梁美玉

    2017-01-01

    With the popularity of online social networks, users not only have higher requirements for speed and real-time performance of information acquisition but also increased demand for personalized and accurate searching. To improve the quality of the search engine and accuracy of the result list, it is necessary to deeply mine the search intentions of the users. This paper summarizes the current situation in precise cross-media searching and mining based on user search intentions. We focus on multi-modal information perceptions based on an online social network knowledge graph, deep semantic learning and analysis of cross-media data for user search intention matching, and precise online social network searching and mining based on users' search intentions. Finally, future research problems and possible challenges are discussed.

  7. Creating texts an introduction to the study of composition

    CERN Document Server

    Nash, Walter

    2014-01-01

    Creating Texts emphasises a practical approach to composition and enables students to understand what is involved in the creation of a text and to learn from the practice of other writers. Extensively rewritten and updated from Walter Nash's earlier volume, Designs in Prose, attention is paid to the general theory of composition, in both traditional and original terms, so that students are made familiar with the basic resources of composition, in grammar and in the lexicon. The essence of every chapter is the discussion of examples of text, sometimes devised by the authors

  8. Classifying Written Texts Through Rhythmic Features

    NARCIS (Netherlands)

    Balint, Mihaela; Dascalu, Mihai; Trausan-Matu, Stefan

    2016-01-01

    Rhythm analysis of written texts focuses on literary analysis and it mainly considers poetry. In this paper we investigate the relevance of rhythmic features for categorizing texts in prosaic form pertaining to different genres. Our contribution is threefold. First, we define a set of rhythmic

  9. Text comprehension strategy instruction with poor readers

    NARCIS (Netherlands)

    Van den Bos, K.P.; Aarnoudse, C.C.; Brand-Gruwel, S.

    1998-01-01

    The goal of this study was to investigate the effects of teaching text comprehension strategies to children with decoding and reading comprehension problems and with a poor or normal listening ability. Two experiments are reported. Four text comprehension strategies, viz., question generation,

  10. Teachers' Texts in Culturally Responsive Teaching

    Science.gov (United States)

    Kesler, Ted

    2011-01-01

    In this paper, the author shares three teaching stories that demonstrate the social, cultural, political, and historical factors of all texts in specific interpretive communities. The author shows how the texts that comprised his curriculum constructed particular subject positions that inevitably included some students but marginalized and…

  11. Readability Revisited? The Implications of Text Complexity

    Science.gov (United States)

    Wray, David; Janan, Dahlia

    2013-01-01

    The concept of readability has had a variable history, moving from a position where it was considered as a very important topic for those responsible for producing texts and matching those texts to the abilities and needs of learners, to its current declining visibility in the education literature. Some important work has been coming from the USA…

  12. Tipster Text Phase 2 Architecture Design

    Science.gov (United States)

    1996-06-19

    TIPSTER Text Phase II Architecture Design Version 2.1p 19 June 1996 Ralph Grishman New York University grishman@cs.nyu.edu and the TIPSTER...

  13. Using Digital Texts to Promote Fluent Reading

    Science.gov (United States)

    Thoermer, Andrea; Williams, Lunetta

    2012-01-01

    Fluency is a critical skill of adept readers. As listening to read alouds and performing Readers Theatre scripts are two prevalent strategies that can increase students' fluency skills, this article provides suggestions in using these strategies with digital texts through free, online resources. Digital texts can be accessed using a desktop,…

  14. Interest, Inferences, and Learning from Texts

    Science.gov (United States)

    Clinton, Virginia; van den Broek, Paul

    2012-01-01

    Topic interest and learning from texts have been found to be positively associated with each other. However, the reason for this positive association is not well understood. The purpose of this study is to examine a cognitive process, inference generation, that could explain the positive association between interest and learning from texts. In…

  15. Text Fabric: What, How, and Why

    NARCIS (Netherlands)

    Erwich, C.M.; Kingham, Cody

    Text-Fabric (TF) is a promising new framework for the Eep Talstra Center for Bible and Computer corpus plus (linguistic) annotations. TF is a Python 3.x software package that provides scientific, accessible and reproducible ways of processing Biblical Hebrew text data. It also allows sharing the

  16. An Intelligent System For Arabic Text Categorization

    NARCIS (Netherlands)

    Syiam, M.M.; Tolba, Mohamed F.; Fayed, Z.T.; Abdel-Wahab, Mohamed S.; Ghoniemy, Said A.; Habib, Mena Badieh

    Text Categorization (classification) is the process of classifying documents into a predefined set of categories based on their content. In this paper, an intelligent Arabic text categorization system is presented. Machine learning algorithms are used in this system. Many algorithms for stemming and

  17. Flexible frontiers for text division into rows

    Directory of Open Access Journals (Sweden)

    Dan L. Lacrămă

    2009-01-01

    Full Text Available This paper presents an original solution for flexible division of hand-written text into rows. Unlike the standard procedure, the proposed method avoids cutting off the extensions of isolated characters and reduces the recognition error rate in the final stage.

  18. Lexical Information in Memory for Text.

    Science.gov (United States)

    Hayes-Roth, Barbara

    Cued-recall and two-alternative, forced-choice recognition measures were used to evaluate subjects' retention of the specific wordings of studied texts. Results obtained after 10-minute and 24 hour retention intervals suggest that the studied wordings of texts are functional components of their memory representations. Theories that assume…

  19. Undergraduates' Text Messaging Language and Literacy Skills

    Science.gov (United States)

    Grace, Abbie; Kemp, Nenagh; Martin, Frances Heritage; Parrila, Rauno

    2014-01-01

    Research investigating whether people's literacy skill is being affected by the use of text messaging language has produced largely positive results for children, but mixed results for adults. We asked 150 undergraduate university students in Western Canada and 86 in South Eastern Australia to supply naturalistic text messages and to complete…

  20. Language Skills in Classical Chinese Text Comprehension

    Science.gov (United States)

    Lau, Kit-ling

    2018-01-01

    This study used both quantitative and qualitative methods to explore the role of lower- and higher-level language skills in classical Chinese (CC) text comprehension. A CC word and sentence translation test, text comprehension test, and questionnaire were administered to 393 Secondary Four students; and 12 of these were randomly selected to…

  1. Text Structure and Retention of Prose.

    Science.gov (United States)

    Zimmer, John W.

    1985-01-01

    The effects of text structure were studied using two kinds of reading materials: a standard text with headings and illustrations, as well as a nonstructured manuscript. The manuscript readers scored higher on delayed tests, generated more relevant ideas, and wrote better essays both immediately and after a delay. (Author/GDC)

  2. The Weaknesses of Full-Text Searching

    Science.gov (United States)

    Beall, Jeffrey

    2008-01-01

    This paper provides a theoretical critique of the deficiencies of full-text searching in academic library databases. Because full-text searching relies on matching words in a search query with words in online resources, it is an inefficient method of finding information in a database. This matching fails to retrieve synonyms, and it also retrieves…
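    The synonym problem described above is easy to reproduce. The sketch below is a toy illustration, not anything proposed in the paper: `full_text_search` matches query words literally, while `expanded_search` consults a small, hypothetical synonym table of the kind a controlled vocabulary might supply. The documents and the table are invented for the example.

```python
def full_text_search(query, documents):
    """Return ids of documents that contain every query word verbatim."""
    terms = query.lower().split()
    return [
        doc_id for doc_id, text in documents.items()
        if all(term in text.lower().split() for term in terms)
    ]


def expanded_search(query, documents, synonyms):
    """Like full_text_search, but a query word also matches its synonyms."""
    hits = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        if all(words & ({term} | synonyms.get(term, set()))
               for term in query.lower().split()):
            hits.append(doc_id)
    return hits


if __name__ == "__main__":
    docs = {
        "a": "automobile safety standards",
        "b": "car safety standards",
    }
    table = {"car": {"automobile"}}  # hypothetical synonym table
    print(full_text_search("car safety", docs))        # misses document "a"
    print(expanded_search("car safety", docs, table))  # finds both documents
```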

  3. Application of LSP Texts in Translator Training

    Science.gov (United States)

    Ilynska, Larisa; Smirnova, Tatjana; Platonova, Marina

    2017-01-01

    The paper presents discussion of the results of extensive empirical research into efficient methods of educating and training translators of LSP (language for special purposes) texts. The methodology is based on using popular LSP texts in the respective fields as one of the main media for translator training. The aim of the paper is to investigate…

  4. Modeling text with generalizable Gaussian mixtures

    DEFF Research Database (Denmark)

    Hansen, Lars Kai; Sigurdsson, Sigurdur; Kolenda, Thomas

    2000-01-01

    We apply and discuss generalizable Gaussian mixture (GGM) models for text mining. The model automatically adapts model complexity for a given text representation. We show that the generalizability of these models depends on the dimensionality of the representation and the sample size. We discuss...

  5. Text mining for the biocuration workflow.

    Science.gov (United States)

    Hirschman, Lynette; Burns, Gully A P C; Krallinger, Martin; Arighi, Cecilia; Cohen, K Bretonnel; Valencia, Alfonso; Wu, Cathy H; Chatr-Aryamontri, Andrew; Dowell, Karen G; Huala, Eva; Lourenço, Anália; Nash, Robert; Veuthey, Anne-Lise; Wiegers, Thomas; Winter, Andrew G

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on 'Text Mining for the BioCuration Workflow' at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed many variabilities. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provides a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community.

  6. A text in Romani from 1622

    DEFF Research Database (Denmark)

    Bakker, Peter

    2015-01-01

    This is a reprint of a 2012 article: A new old text in Romani: Lord's Prayer, 1622. International Journal of Romani Language and Culture 2 (2011): 193-212.

  7. Where Full-Text Is Viable.

    Science.gov (United States)

    Cotton, P. L.

    1987-01-01

    Defines two types of online databases: source, referring to those intended to be complete in themselves, whether full-text or abstracts; and bibliographic, meaning those that are not complete. Predictions are made about the future growth rate of these two types of databases, as well as full-text versus abstract databases. (EM)

  8. The Medline/full-text research project.

    Science.gov (United States)

    McKinin, E J; Sievert, M; Johnson, E D; Mitchell, J A

    1991-05-01

    This project was designed to test the relative efficacy of index terms and full-text for the retrieval of documents in those MEDLINE journals for which full-text searching was also available. The full-text files used were MEDIS from Mead Data Central and CCML from BRS Information Technologies. One hundred clinical medical topics were searched in these two files as well as the MEDLINE file to accumulate the necessary data. It was found that full-text identified significantly more relevant articles than did the indexed file, MEDLINE. The full-text searches, however, lacked the precision of searches done in the indexed file. Most relevant items missed in the full-text files, but identified in MEDLINE, were missed because the searcher failed to account for some aspect of natural language, used a logical or positional operator that was too restrictive, or included a concept which was implied, but not expressed in the natural language. Very few of the unique relevant full-text citations would have been retrieved by title or abstract alone. Finally, as of July, 1990 the more current issue of a journal was just as likely to appear in MEDLINE as in one of the full-text files.

  9. Ontology Assisted Formal Specification Extraction from Text

    Directory of Open Access Journals (Sweden)

    Andreea Mihis

    2010-12-01

    Full Text Available In the field of knowledge processing, ontologies are the most important means. They make it possible for the computer to better understand natural language and to make judgments. In this paper, a method which uses ontologies in the semi-automatic extraction of formal specifications from a natural language text is proposed.

  10. NOTICING HYBRID RECASTS IN TEXT CHAT

    Directory of Open Access Journals (Sweden)

    Mark J. Oliver

    2016-12-01

    Full Text Available This study examined ten EFL learners’ noticing of the corrective nature of a form of text-based SCMC (text chat) feedback that combined a recast of a grammatical error with metalinguistic information. The feedback, termed a hybrid recast, was provided by a native-speaker interlocutor during two text chat activities: a spot-the-difference and picture-ordering task. Data was collected in two ways: analysis of task-based dyadic text chat interaction in which uptake was used as an indicator of learner noticing, and a post-task questionnaire containing questions that identified evidence of learner noticing. Interaction analysis showed that learners responded to almost two thirds of the hybrid recasts with uptake. In addition, every learner provided evidence that they had correctly perceived at least some of the hybrid recasts as corrective in their post-task questionnaire responses.

  11. Frontiers of biomedical text mining: current progress

    Science.gov (United States)

    Zweigenbaum, Pierre; Demner-Fushman, Dina; Yu, Hong; Cohen, Kevin B.

    2008-01-01

    It is now almost 15 years since the publication of the first paper on text mining in the genomics domain, and decades since the first paper on text mining in the medical domain. Enormous progress has been made in the areas of information retrieval, evaluation methodologies and resource construction. Some problems, such as abbreviation-handling, can essentially be considered solved problems, and others, such as identification of gene mentions in text, seem likely to be solved soon. However, a number of problems at the frontiers of biomedical text mining continue to present interesting challenges and opportunities for great improvements and interesting research. In this article we review the current state of the art in biomedical text mining or ‘BioNLP’ in general, focusing primarily on papers published within the past year. PMID:17977867

  12. Learning from text benefits from enactment.

    Science.gov (United States)

    Cutica, Ilaria; Ianì, Francesco; Bucciarelli, Monica

    2014-10-01

    Classical studies on enactment have highlighted the beneficial effects of gestures performed in the encoding phase on memory for words and sentences, for both adults and children. In the present investigation, we focused on the role of enactment for learning from scientific texts among primary-school children. We assumed that enactment would favor the construction of a mental model of the text, and we verified the derived predictions that gestures at the time of encoding would result in greater numbers of correct recollections and discourse-based inferences at recall, as compared to no gestures (Exp. 1), and in a bias to confound paraphrases of the original text with the verbatim text in a recognition test (Exp. 2). The predictions were confirmed; hence, we argue in favor of a theoretical framework that accounts for the beneficial effects of enactment on memory for texts.

  13. EU external relations law : text, cases and materials

    NARCIS (Netherlands)

    Van Vooren, Bart; Wessel, Ramses A.

    2014-01-01

    This major new textbook for students in European law uses a text, cases and materials approach to explore the law, politics, policy and practice of EU external relations, and navigates the complex questions at the interface of these areas. The subject is explored by explaining major constitutional

  14. Effectiveness of Conceptual Change Texts: A Meta Analysis

    Science.gov (United States)

    Armagan, Fulya Öner; Keskin, Melike Özer; Akin, Beril Salman

    2017-01-01

    The purpose of this study was to determine the overall effectiveness of conceptual change texts (CCTs) on academic achievement and to find out if effectiveness was related to some characteristics of the study. It followed a meta-analysis research approach. 42 published and unpublished studies, published between 1995 and 2010, and 42 experiment…

  15. Text mining of web-based medical content

    CERN Document Server

    Neustein, Amy

    2014-01-01

    Text Mining of Web-Based Medical Content examines web mining for extracting useful information that can be used for treating and monitoring the healthcare of patients. This work provides methodological approaches to designing mapping tools that exploit data found in social media postings. Specific linguistic features of medical postings are analyzed vis-a-vis available data extraction tools for culling useful information.

  16. A survey of text clustering techniques used for web mining

    Directory of Open Access Journals (Sweden)

    Dan MUNTEANU

    2005-12-01

    Full Text Available This paper contains an overview of basic formulations and approaches to clustering. Then it presents two important clustering paradigms: a bottom-up agglomerative technique, which collects similar documents into larger and larger groups, and a top-down partitioning technique, which divides a corpus into topic-oriented partitions.
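
    The bottom-up agglomerative paradigm described above can be illustrated with a minimal sketch. It is not taken from the surveyed paper: the Jaccard similarity, the average-link merge rule, the function names and the toy documents are all assumptions chosen for brevity.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def agglomerative(docs, k):
    """Start with one cluster per document and repeatedly merge the
    most similar pair of clusters until only k clusters remain."""
    token_sets = [set(d.lower().split()) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def similarity(c1, c2):
        # Average-link: mean pairwise similarity between cluster members.
        pairs = [(i, j) for i in c1 for j in c2]
        total = sum(jaccard(token_sets[i], token_sets[j]) for i, j in pairs)
        return total / len(pairs)

    while len(clusters) > k:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: similarity(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical toy corpus: two finance documents and two sports documents.
docs = [
    "stock market prices fall",
    "market prices rise on stock news",
    "football team wins match",
    "team loses football match",
]
clusters = agglomerative(docs, 2)
print(clusters)
```

    A top-down partitioning method would instead start from the whole corpus and split it into topic-oriented groups; that variant is not shown here.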

  17. Text-Based MOOing in Educational Practice: Experiences of Disinhibition

    Science.gov (United States)

    Chester, Andrea

    2006-01-01

    Purpose: The purpose of this paper is to describe educational MOOs--MUD, object-oriented (text-based, network-accessible virtual environments) and explore how teaching and learning in such a context impacts on students' inhibitions. Design/methodology/approach: Students enrolled in a course on the psychology of cyberspace interacted for 12 weeks…

  18. Intertextuality and Dialogic Interaction in Students' Online Text Construction

    Science.gov (United States)

    Ronan, Briana

    2015-01-01

    This study examines the online writing practices of adolescent emergent bilinguals through the mediating lenses of dialogic interaction and intertextuality. Using a multimodal discourse analysis approach, the study traces how three students develop online academic texts through intertextual moves that traverse modal boundaries. The analysis…

  19. A Network Text Analysis of David Ayer’s Fury

    Directory of Open Access Journals (Sweden)

    Starling David Hunter

    2015-12-01

    Full Text Available Network Text Analysis (NTA) involves the creation of networks of words and/or concepts from linguistic data. Its key insight is that the position of words and concepts in a text network provides vital clues to the central and underlying themes of the text as a whole. Recent research has relied on inductive approaches to identify these themes. In this study we demonstrate a deductive approach that we apply to the screenplay of the 2014 World War II-era film Fury. Specifically, we first use genre expectations theory to establish prior expectations as to the key themes associated with war films. We then empirically test whether words and concepts associated with the most influentially-positioned nodes are consistent with themes common to the war-film genre. As predicted, we find that words and concepts associated with the least constrained nodes in the text network were significantly more likely to be associated with the war, action, and biography genres and significantly less likely to be associated with the mystery, science-fiction, fantasy, and film-noir genres. Keywords: content analysis, text analysis, network text analysis, semantic network analysis, film studies, screenplay, screenwriting, war movies, World War II, tanks

  20. Automatic Amharic text news classification: A neural networks ...

    African Journals Online (AJOL)

    School of Computing and Electrical Engineering, Institute of Technology, Bahir Dar University, Bahir Dar ... The study is on classification of Amharic news automatically using neural networks approach. Learning Vector ... INTRODUCTION.

  1. Statistical text classifier to detect specific type of medical incidents.

    Science.gov (United States)

    Wong, Zoie Shui-Yee; Akiyama, Masanori

    2013-01-01

    WHO Patient Safety has focused on increasing the coherence and expressiveness of patient safety classification through the International Classification for Patient Safety (ICPS). Text classification and statistical approaches have been shown to be successful in identifying safety problems in the aviation industry using incident text information. It has been challenging to comprehend the taxonomy of medical incidents in a structured manner. Independent reporting mechanisms for patient safety incidents have been established in the UK, Canada, Australia, Japan, Hong Kong, etc. This research demonstrates the potential to construct statistical text classifiers to detect specific types of medical incidents using incident text data. An illustrative example for classifying look-alike sound-alike (LASA) medication incidents using structured text from 227 advisories related to medication errors from Global Patient Safety Alerts (GPSA) is shown in this poster presentation. The classifier was built using a logistic regression model. The ROC curve and the AUC value indicated that this is a satisfactorily good model.
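
    The combination of a logistic regression text classifier with an AUC check can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the bag-of-words features, the stochastic gradient training loop, the learning rate and the six incident texts are all invented placeholders, and the AUC is computed on the training texts only to keep the example short.

```python
import math


def featurize(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.1, epochs=300):
    """Fit logistic regression weights by stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def auc(scores, labels):
    """Area under the ROC curve: the probability that a random positive
    is scored above a random negative (ties count one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy incident reports; label 1 = look-alike sound-alike (LASA).
texts = [
    "patient given hydroxyzine instead of hydralazine similar names",
    "look alike packaging caused wrong drug selection",
    "sound alike drug names confused at pharmacy",
    "patient fell from bed during night shift",
    "infusion pump alarm failed during surgery",
    "delayed lab result reported to ward",
]
labels = [1, 1, 1, 0, 0, 0]

vocab = sorted({tok for t in texts for tok in t.lower().split()})
X = [featurize(t, vocab) for t in texts]
w, b = train_logreg(X, labels)
scores = [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]
auc_value = auc(scores, labels)
print(round(auc_value, 3))
```

    In practice the AUC would be estimated on held-out reports rather than on the training texts.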

  2. Verbal-Visual Intertextuality: How do Multisemiotic Texts Dialogue?

    Directory of Open Access Journals (Sweden)

    Leonardo Mozdzenski

    2013-11-01

    Full Text Available The objective of this work is to understand how multisemiotic texts interact with each other to produce meanings, observing the complex intertextual relations among genres from various artistic and/or audiovisual fields. Therefore, I initially present a brief review of the literature on intertextuality, critically discussing how leading scholars address this issue. Then I argue that it is necessary to understand intertextuality in an integral and non-discretized way through a typological continuum of relationships between verbal-visual texts. Thus, I develop a model for understanding this phenomenon by means of a graph in which two continua intertwine: the representation of intertextuality through form (Implicitness/Explicitness) and function (Approach/Distance) of the quoted voice assumed in communicative situations. To test the model, four music video clips of American singer Madonna were selected so we can verify how music video texts rely on other texts to build their discourses and evoked identities.

  3. A general framework for time series data mining based on event analysis: application to the medical domains of electroencephalography and stabilometry.

    Science.gov (United States)

    Lara, Juan A; Lizcano, David; Pérez, Aurora; Valente, Juan P

    2014-10-01

    There are now domains where information is recorded over a period of time, leading to sequences of data known as time series. In many domains, like medicine, time series analysis requires focusing on certain regions of interest, known as events, rather than analyzing the whole time series. In this paper, we propose a framework for knowledge discovery in both one-dimensional and multidimensional time series containing events. We show how our approach can be used to classify medical time series by means of a process that identifies events in time series, generates time series reference models of representative events and compares two time series by analyzing the events they have in common. We have applied our framework to time series generated in the areas of electroencephalography (EEG) and stabilometry. Framework performance was evaluated in terms of classification accuracy, and the results confirmed that the proposed schema has potential for classifying EEG and stabilometric signals. The proposed framework is useful for discovering knowledge from medical time series containing events, such as stabilometric and electroencephalographic time series. These results would be equally applicable to other medical domains generating iconographic time series, such as, for example, electrocardiography (ECG). Copyright © 2014 Elsevier Inc. All rights reserved.

  4. Advances in Text Mining and Visualization for Precision Medicine.

    Science.gov (United States)

    Gonzalez-Hernandez, Graciela; Sarker, Abeed; O'Connor, Karen; Greene, Casey; Liu, Hongfang

    2018-01-01

    According to the National Institutes of Health (NIH), precision medicine is "an emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment, and lifestyle for each person." Although the text mining community has explored this realm for some years, the official endorsement and funding launched in 2015 with the Precision Medicine Initiative are beginning to bear fruit. This session sought to elicit participation of researchers with strong background in text mining and/or visualization who are actively collaborating with bench scientists and clinicians for the deployment of integrative approaches in precision medicine that could impact scientific discovery and advance the vision of precision medicine as a universal, accessible approach at the point of care.

  5. LITURGICAL TEXT IN RUSSIAN LITERATURE. PROBLEM STATEMENT

    Directory of Open Access Journals (Sweden)

    Avetis Serezhaevich Seropyan

    2012-11-01

    Full Text Available The article analyses artistic expressions of liturgical language in the literary text and its interaction with the Holy Tradition. Many Russian authors knew the liturgical text well. Studying it reveals the crucial meaning of the Gospel and liturgical texts (as part of the Holy Tradition) for Russian literature. Authors saw the essence of every phenomenon in the word for it, and the nature of God in His name. Some ideas and sayings of the authors and their characters find their sources in liturgical texts. The article focuses on liturgical sources of some characters' commemorations and invocations, as well as poetical topics of the symbolists, Dostoevsky's famous dictum on beauty which will save the world (The Idiot), etc. Deciphering this liturgical code will help us learn and comprehend the hidden endless meaning of a literary text. The specific feature of Russian literature is its pursuit of the spiritual liturgical exploration of the world, an exploration when truth takes shape and thus becomes real in both literary text and history.

  6. Application of LSP texts in translator training

    Directory of Open Access Journals (Sweden)

    Larisa Ilynska

    2017-06-01

    Full Text Available The paper presents discussion of the results of extensive empirical research into efficient methods of educating and training translators of LSP (language for special purposes) texts. The methodology is based on using popular LSP texts in the respective fields as one of the main media for translator training. The aim of the paper is to investigate the efficiency of this methodology in developing thematic, linguistic and cultural competences of the students, following Bloom’s revised taxonomy and European Master in Translation Network (EMT) translator training competences. The methodology has been tested on the students of a professional Master study programme called Technical Translation implemented by the Institute of Applied Linguistics, Riga Technical University, Latvia. The group of students included representatives of different nationalities, translating from English into Latvian, Russian and French. Analysis of popular LSP texts provides an opportunity to structure student background knowledge and expand it to account for linguistic innovation. Application of popular LSP texts instead of purely technical or scientific texts characterised by neutral style and rigid genre conventions provides an opportunity for student translators to develop advanced text processing and decoding skills, to develop awareness of expressive resources of the source and target languages and to develop understanding of socio-pragmatic language use.

  7. Figure-associated text summarization and evaluation.

    Directory of Open Access Journals (Sweden)

    Balaji Polepalli Ramesh

    Full Text Available Biomedical literature incorporates millions of figures, which are a rich and important knowledge resource for biomedical researchers. Scientists need access to the figures and the knowledge they represent in order to validate research findings and to generate new hypotheses. By themselves, these figures are nearly always incomprehensible to both humans and machines and their associated texts are therefore essential for full comprehension. The associated text of a figure, however, is scattered throughout its full-text article and contains redundant information content. In this paper, we report the continued development and evaluation of several figure summarization systems, the FigSum+ systems, that automatically identify associated texts, remove redundant information, and generate a text summary for every figure in an article. Using a set of 94 annotated figures selected from 19 different journals, we conducted an intrinsic evaluation of FigSum+. We evaluate the performance by precision, recall, F1, and ROUGE scores. The best FigSum+ system is based on an unsupervised method, achieving F1 score of 0.66 and ROUGE-1 score of 0.97. The annotated data is available at figshare.com (http://figshare.com/articles/Figure_Associated_Text_Summarization_and_Evaluation/858903).
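
    For readers unfamiliar with the metrics named above, the following sketch shows one common way to compute them. It is not the FigSum+ evaluation code: the abstract does not say which ROUGE-1 variant was used, so the recall-oriented unigram overlap below is an assumption, as are the function names and the toy inputs.

```python
from collections import Counter


def precision_recall_f1(selected, relevant):
    """Set-based precision, recall and F1 for selected sentence ids."""
    selected, relevant = set(selected), set(relevant)
    true_pos = len(selected & relevant)
    precision = true_pos / len(selected) if selected else 0.0
    recall = true_pos / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def rouge1_recall(candidate, reference):
    """Share of reference unigrams that also occur in the candidate,
    with counts clipped so a repeated word is not over-credited."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(count, cand[tok]) for tok, count in ref.items())
    return overlap / sum(ref.values()) if ref else 0.0


# Hypothetical example: sentence ids picked by a system vs. an annotator.
p, r, f1 = precision_recall_f1(selected=[1, 2, 4], relevant=[2, 4, 5, 6])
rouge = rouge1_recall(
    "the figure shows protein expression",
    "figure shows increased protein expression levels",
)
print(round(p, 3), round(r, 3), round(f1, 3), round(rouge, 3))
```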

  8. Text-Filled Stacked Area Graphs

    DEFF Research Database (Denmark)

    Kraus, Martin

    2011-01-01

    -filled stacked area graphs; i.e., graphs that feature stacked areas that are filled with small-typed text. Since these graphs allow for computing the text layout automatically, it is possible to include large amounts of textual detail with very little effort. We discuss the most important challenges and some...... solutions for the design of text-filled stacked area graphs with the help of an exemplary visualization of the genres, publication years, and titles of a database of several thousand PC games....

  9. NOTICING AND TEXT-BASED CHAT

    Directory of Open Access Journals (Sweden)

    Chun Lai

    2006-09-01

    Full Text Available This study examined the capacity of text-based online chat to promote learners’ noticing of their problematic language productions and of the interactional feedback from their interlocutors. In this study, twelve ESL learners formed six mixed-proficiency dyads. The same dyads worked on two spot-the-difference tasks, one via online chat and the other through face-to-face conversation. Stimulated recall sessions were held subsequently to identify instances of noticing. It was found that text-based online chat promotes noticing more than face-to-face conversations, especially in terms of learners’ noticing of their own linguistic mistakes.

  10. Assessing semantic similarity of texts - Methods and algorithms

    Science.gov (United States)

    Rozeva, Anna; Zerkova, Silvia

    2017-12-01

    Assessing the semantic similarity of texts is an important part of different text-related applications like educational systems, information retrieval, text summarization, etc. This task is performed by sophisticated analysis, which implements text-mining techniques. Text mining involves several pre-processing steps, which yield a structured representative model of the documents in a corpus by extracting and selecting the features that characterize their content. Generally the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at the syntactical and semantic levels. An important text-mining method and similarity measure is latent semantic analysis (LSA). It reduces the dimensionality of the document vector space and better captures the text semantics. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurrence is examined. The algorithm for obtaining the vector representation of words and their corresponding latent concepts in a reduced multidimensional space, as well as the similarity calculation, are presented.
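
    The vector-based model and the similarity calculation mentioned above can be illustrated with a small sketch. It covers only raw term-frequency vectors and cosine similarity; the dimensionality reduction that LSA adds (a truncated singular value decomposition of the term-document matrix) is deliberately left out. The function names and the three sample texts are assumptions.

```python
import math
from collections import Counter


def tf_vector(text):
    """Sparse term-frequency vector: token -> count."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


a = tf_vector("text mining extracts features from text")
b = tf_vector("text mining selects features")
c = tf_vector("football match result")

sim_ab = cosine(a, b)
sim_ac = cosine(a, c)
print(round(sim_ab, 4), sim_ac)
```

    Because this sketch compares surface tokens only, two texts that use different words for the same concept score zero; capturing that kind of relatedness is the motivation for the latent-concept space that LSA builds.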

  11. Beyond Readability: Investigating Coherence of Clinical Text for Consumers

    Science.gov (United States)

    Hetzel, Scott; Dalrymple, Prudence; Keselman, Alla

    2011-01-01

    Background: A basic tenet of consumer health informatics is that understandable health resources empower the public. Text comprehension holds great promise for helping to characterize consumer problems in understanding health texts. The need for efficient ways to assess consumer-oriented health texts and the availability of computationally supported tools led us to explore the effect of various text characteristics on readers’ understanding of health texts, as well as to develop novel approaches to assessing these characteristics. Objective: The goal of this study was to compare the impact of two different approaches to enhancing readability, and three interventions, on individuals’ comprehension of short, complex passages of health text. Methods: Participants were 80 university staff, faculty, or students. Each participant was asked to “retell” the content of two health texts: one a clinical trial in the domain of diabetes mellitus, and the other typical Visit Notes. These texts were transformed for the intervention arms of the study. Two interventions provided terminology support via (1) standard dictionary or (2) contextualized vocabulary definitions. The third intervention provided coherence improvement. We assessed participants’ comprehension of the clinical texts through propositional analysis, an open-ended questionnaire, and analysis of the number of errors made. Results: For the clinical trial text, the effect of text condition was not significant in any of the comparisons, suggesting no differences in recall, despite the varying levels of support (P = .84). For the Visit Note, however, the difference in the median total propositions recalled between the Coherent and the (Original + Dictionary) conditions was significant (P = .04). This suggests that participants in the Coherent condition recalled more of the original Visit Notes content than did participants in the Original and the Dictionary conditions combined. However, no difference was seen

  12. Building Fluency through the Phrased Text Lesson

    Science.gov (United States)

    Rasinski, Timothy; Yildirim, Kasim; Nageldinger, James

    2012-01-01

    This Teaching Tip article explores the importance of phrasing while reading. It also presents an instructional intervention strategy for helping students develop greater proficiency in reading with phrases that reflect the meaning of the text.

  13. Punctuation effects in English and Esperanto texts

    Science.gov (United States)

    Ausloos, M.

    2010-07-01

    A statistical physics study of punctuation effects on sentence lengths is presented for written texts: Alice in Wonderland and Through a Looking Glass. The translation of the first text into Esperanto is also considered as a test for the role of punctuation in defining a style, and for contrasting natural and artificial, but written, languages. Several log-log plots of the sentence-length-rank relationship are presented for the major punctuation marks. Different power laws are observed with characteristic exponents. The exponent can take a value much less than unity (ca. 0.50 or 0.30) depending on how a sentence is defined. The texts are also mapped into time series based on the word frequencies. The quantitative differences between the original and translated texts are very minute, at the exponent level. It is argued that sentences seem to be more reliable than word distributions in discussing an author's style.
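
    The sentence-length-rank relationship can be sketched in a few lines. This is an illustrative reading of the abstract rather than the author's procedure: the set of delimiter characters that defines a "sentence", the least-squares fit on the log-log values, the function names and the short sample text are all assumptions.

```python
import math
import re


def sentence_lengths(text, delimiters=".!?"):
    """Split text at the given punctuation marks and return the word
    count of every non-empty segment."""
    parts = re.split("[" + re.escape(delimiters) + "]", text)
    return [len(p.split()) for p in parts if p.split()]


def rank_length_pairs(lengths):
    """Pair each length with its rank (1 = longest sentence)."""
    ordered = sorted(lengths, reverse=True)
    return list(enumerate(ordered, start=1))


def loglog_slope(pairs):
    """Least-squares slope of log(length) against log(rank); for a
    power law, its magnitude estimates the exponent."""
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(length) for _, length in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Changing the delimiter set changes what counts as a sentence.
sample = "She ran, then stopped. He waited; nobody came."
by_full_stop = sentence_lengths(sample)
by_all_marks = sentence_lengths(sample, delimiters=".;,")

pairs = rank_length_pairs([7, 4, 9, 2])
slope = loglog_slope(pairs)
print(by_full_stop, by_all_marks, pairs)
```

    Running the same functions with different delimiter sets mirrors the abstract's point that the fitted exponent depends on how a sentence is defined.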

  14. MORPHOLOGICAL STRATEGIES IN TEXT MESSAGING AMONG ...

    African Journals Online (AJOL)

    Text messaging is the application of abridged morphological forms in order ... the emergence of the Global System for Mobile Communication (GSM) in the world. ... Our thesis statement is that these morphological patterns as used in SMS are ...

  15. The Relationship between Paraphrasing and Text Analysis

    Directory of Open Access Journals (Sweden)

    María Luisa Cepeda Islas

    2013-04-01

    Full Text Available Given the importance of paraphrasing in the process of comprehension for college students, this study assessed the level of text analysis and paraphrasing in the responses of a sample of senior students in the psychology degree program. We selected a group of freshmen in the Psychology course, who were asked to answer a questionnaire and to summarize an empirical article. The results showed that participants had a low level of text analysis and, at the same time, low levels of paraphrasing. Verbatim copying of the text predominated. The results suggest some possibilities for structuring a training workshop on not only paraphrasing but also the analysis of text.

  16. Illustrations in Text: A Retentional Role.

    Science.gov (United States)

    Duchastel, Philippe C.

    1981-01-01

    Describes the results of a study of the retentional role of illustrations in a text and their effect on enhancing long-term memory with 15-year-old secondary school students. Seven references are listed. (CHC)

  17. Figures of thought mathematics and mathematical texts

    CERN Document Server

    Reed, David

    2003-01-01

    Examines the ways in which mathematical works can be read as texts, examines their textual strategies and demonstrates that such readings provide a rich source of philosophical debate regarding mathematics.

  18. Strategies to Increase Accuracy in Text Classification

    NARCIS (Netherlands)

    D. Blommesteijn (Dennis)

    2014-01-01

    Text classification via supervised learning involves various steps from processing raw data, features extraction to training and validating classifiers. Within these steps implementation decisions are critical to the resulting classifier accuracy. This paper contains a report of the

  19. Handwriting segmentation of unconstrained Oriya text

    Indian Academy of Sciences (India)

    Segmentation of handwritten text into lines, words and characters .... We now discuss here some terms relating to water reservoirs that will be used in feature ..... is found. Next, based on the touching position, reservoir base-area points, ...

  20. Nigel: A Systemic Grammar for Text Generation.

    Science.gov (United States)

    1983-02-01

    presumed. Basic references on the systemic framework include [Berry 75, Berry 77, Halliday 76a, Halliday 76b, Hudson 76, Halliday 81, de Joia 80... Edinburgh, 1979. [de Joia 80] de Joia, A., and A. Stanton, Terms in Systemic Linguistics, Batsford Academic and Educational, Ltd., London, 1980. ... 1 A Grammar for Text Generation--The Challenge ... 1.2 A Grammar for Text Generation--The Design

  1. Text document classification based on mixture models

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana; Malík, Antonín

    2004-01-01

    Roč. 40, č. 3 (2004), s. 293-304 ISSN 0023-5954 R&D Projects: GA AV ČR IAA2075302; GA ČR GA102/03/0049; GA AV ČR KSK1019101 Institutional research plan: CEZ:AV0Z1075907 Keywords : text classification * text categorization * multinomial mixture model Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.224, year: 2004

  2. Preserved Network Metrics across Translated Texts

    Science.gov (United States)

    Cabatbat, Josephine Jill T.; Monsanto, Jica P.; Tapang, Giovanni A.

    2014-09-01

    Co-occurrence language networks based on Bible translations and the Universal Declaration of Human Rights (UDHR) translations in different languages were constructed and compared with random text networks. Among the considered network metrics, the network size, N, the normalized betweenness centrality (BC), and the average k-nearest neighbors, knn, were found to be the most preserved across translations. Moreover, similar frequency distributions of co-occurring network motifs were observed for translated texts networks.
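
    As a rough illustration of two of the metrics named in this abstract, the sketch below builds a word co-occurrence network from a token list and computes the network size N and the average nearest-neighbour degree (knn). It assumes that two words are linked when they appear next to each other; the abstract does not state the paper's exact construction, and the function names are hypothetical.

```python
from collections import defaultdict


def cooccurrence_network(tokens):
    """Undirected network linking words that are adjacent in the text (assumed rule)."""
    adj = defaultdict(set)
    for a, b in zip(tokens, tokens[1:]):
        if a != b:  # ignore self-loops from repeated words
            adj[a].add(b)
            adj[b].add(a)
    return adj


def network_size(adj):
    """Number of nodes N in the network."""
    return len(adj)


def average_neighbor_degree(adj):
    """Mean over nodes of the average degree of each node's neighbours (knn)."""
    per_node = [
        sum(len(adj[v]) for v in nbrs) / len(nbrs)
        for nbrs in adj.values()
        if nbrs
    ]
    return sum(per_node) / len(per_node) if per_node else 0.0


net = cooccurrence_network("a b a c".split())
print(network_size(net), average_neighbor_degree(net))
```

    Comparing these values for a translated text and for a shuffled copy of the same tokens would mirror, in miniature, the comparison with random text networks described above.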

  3. Text mining for the biocuration workflow

    Science.gov (United States)

    Hirschman, Lynette; Burns, Gully A. P. C; Krallinger, Martin; Arighi, Cecilia; Cohen, K. Bretonnel; Valencia, Alfonso; Wu, Cathy H.; Chatr-Aryamontri, Andrew; Dowell, Karen G.; Huala, Eva; Lourenço, Anália; Nash, Robert; Veuthey, Anne-Lise; Wiegers, Thomas; Winter, Andrew G.

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert-curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on ‘Text Mining for the BioCuration Workflow’ at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed many variabilities. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provides a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community. PMID:22513129

  4. Text Entry by Gazing and Smiling

    Directory of Open Access Journals (Sweden)

    Outi Tuisku

    2013-01-01

    Full Text Available Face Interface is a wearable prototype that combines the use of voluntary gaze direction and facial activations, for pointing and selecting objects on a computer screen, respectively. The aim was to investigate the functionality of the prototype for entering text. First, three on-screen keyboard layout designs were developed and tested (n=10) to find a layout that would be more suitable for text entry with the prototype than the traditional QWERTY layout. The task was to enter one word ten times with each of the layouts by pointing at letters with gaze and selecting them by smiling. Subjective ratings showed that a layout with large keys on the edge and small keys near the center of the keyboard was rated as the most enjoyable, clearest, and most functional. Second, using this layout, the aim of the second experiment (n=12) was to compare entering text with Face Interface to entering text with a mouse. The results showed that the text entry rate was 20 characters per minute (cpm) for Face Interface and 27 cpm for the mouse. For Face Interface, the keystrokes per character (KSPC) value was 1.1 and the minimum string distance (MSD) error rate was 0.12. These values compare especially well with other similar techniques.
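
    The two text-entry metrics reported above can be sketched as follows. This is a minimal illustration that assumes the definitions commonly used in the text-entry literature (KSPC as keystrokes divided by transcribed characters, MSD error rate as the edit distance divided by the longer string's length); the abstract does not give the paper's exact formulas, and the function names are hypothetical.

```python
def msd(a: str, b: str) -> int:
    """Minimum string distance (Levenshtein distance) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def msd_error_rate(presented: str, transcribed: str) -> float:
    """Edit distance normalised by the longer string (assumed definition)."""
    longest = max(len(presented), len(transcribed))
    return msd(presented, transcribed) / longest if longest else 0.0


def kspc(input_stream: str, transcribed: str) -> float:
    """Keystrokes per character: all keystrokes, including corrections, per output character."""
    return len(input_stream) / len(transcribed)


# "<" stands for a backspace keystroke in this toy input stream.
print(kspc("helo<lo", "hello"))
print(msd_error_rate("hello", "helo"))
```

    A KSPC of 1.0 means no corrective keystrokes were needed, so the reported 1.1 would correspond to roughly one extra keystroke per ten characters under this definition.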

  5. Monolingual accounting dictionaries for EFL text production

    Directory of Open Access Journals (Sweden)

    Sandro Nielsen

    2006-10-01

    Full Text Available Monolingual accounting dictionaries are important for producing financial reporting texts in English in an international setting, because of the lack of specialised bilingual dictionaries. As the intended user groups have different factual and linguistic competences, they require specific types of information. By identifying and analysing the users' factual and linguistic competences, user needs, use-situations and the stages involved in producing accounting texts in English as a foreign language, lexicographers will have a sound basis for designing the optimal English accounting dictionary for EFL text production. The monolingual accounting dictionary needs to include information about UK, US and international accounting terms, their grammatical properties, their potential for being combined with other words in collocations, phrases and sentences in order to meet user requirements. Data items that deal with these aspects are necessary for the international user group as they produce subject-field specific and register-specific texts in a foreign language, and the data items are relevant for the various stages in text production: draft writing, copyediting, stylistic editing and proofreading.

  6. Inspiration and the Texts of the Bible

    Directory of Open Access Journals (Sweden)

    Dirk Buchner

    1997-12-01

    Full Text Available This article seeks to explore what the inspired text of the Old Testament was as it existed for the New Testament authors, particularly for the author of the book of Hebrews. A quick look at the facts makes it clear that there was, at the time, more than one 'inspired' text; among these were the Septuagint and the Masoretic Text, 'to name but two'. The latter eventually gained ascendancy, which is why it forms the basis of our translated Old Testament today. Yet we have to ask: what do we make of that other text that was the inspired Bible to the early Church, especially to the writer of the book of Hebrews, who ignored the Masoretic text? This article will take a brief look at some suggestions for a doctrine of inspiration that keeps up with the facts of Scripture. Allied to this, the article is something of a bibliographical study of recent developments in textual research following the discovery of the Dead Sea scrolls.

  7. Ancient medical texts, modern reading problems

    Directory of Open Access Journals (Sweden)

    Maria Carlota Rosa

    2006-12-01

    Full Text Available The word tradition has a very specific meaning in linguistics: the passing down of a text, which may have been completed or corrected by different copyists at different times, when the concept of authorship was not the same as it is today. When reading an ancient text the word tradition must be in the reader's mind. To discuss one of the problems an ancient text poses to its modern readers, this work deals with one of the first printed medical texts in Portuguese, the Regimento proueytoso contra ha pestenença, and draws a parallel between it and two related texts, A moche profitable treatise against the pestilence, and the Recopilaçam das cousas que conuem guardar se no modo de preseruar à Cidade de Lixboa E os sãos, & curar os que esteuerem enfermos de Peste. The problems which arise out of the textual structure of those books show how difficult it is to establish a tradition of another type, the medical tradition. The linguistic study of the innumerable medieval plague treatises may throw light on the continuities and on the disruptions of the so-called hippocratic-galenical medical tradition.

  8. Text mining resources for the life sciences.

    Science.gov (United States)

    Przybyła, Piotr; Shardlow, Matthew; Aubin, Sophie; Bossy, Robert; Eckart de Castilho, Richard; Piperidis, Stelios; McNaught, John; Ananiadou, Sophia

    2016-01-01

    Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable: those that have the crucial ability to share information, enabling smooth integration and reusability. © The Author(s) 2016. Published by Oxford University Press.

  9. Text mining resources for the life sciences

    Science.gov (United States)

    Shardlow, Matthew; Aubin, Sophie; Bossy, Robert; Eckart de Castilho, Richard; Piperidis, Stelios; McNaught, John; Ananiadou, Sophia

    2016-01-01

    Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable—those that have the crucial ability to share information, enabling smooth integration and reusability. PMID:27888231

  10. Figure-associated text summarization and evaluation.

    Science.gov (United States)

    Polepalli Ramesh, Balaji; Sethi, Ricky J; Yu, Hong

    2015-01-01

    Biomedical literature incorporates millions of figures, which are a rich and important knowledge resource for biomedical researchers. Scientists need access to the figures and the knowledge they represent in order to validate research findings and to generate new hypotheses. By themselves, these figures are nearly always incomprehensible to both humans and machines and their associated texts are therefore essential for full comprehension. The associated text of a figure, however, is scattered throughout its full-text article and contains redundant information content. In this paper, we report the continued development and evaluation of several figure summarization systems, the FigSum+ systems, that automatically identify associated texts, remove redundant information, and generate a text summary for every figure in an article. Using a set of 94 annotated figures selected from 19 different journals, we conducted an intrinsic evaluation of FigSum+. We evaluate the performance by precision, recall, F1, and ROUGE scores. The best FigSum+ system is based on an unsupervised method, achieving F1 score of 0.66 and ROUGE-1 score of 0.97. The annotated data is available at figshare.com (http://figshare.com/articles/Figure_Associated_Text_Summarization_and_Evaluation/858903).
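
    The evaluation measures named in this abstract can be sketched in a few lines. The example below assumes that precision, recall and F1 are computed over sets of selected versus annotated sentence IDs, and that ROUGE-1 is taken as unigram recall against a reference summary; the abstract does not say which ROUGE variant or tokenisation the authors used, so both choices and all function names are illustrative assumptions.

```python
from collections import Counter


def precision_recall_f1(selected: set, relevant: set):
    """Set-based precision, recall and F1 (e.g. over sentence IDs)."""
    tp = len(selected & relevant)
    precision = tp / len(selected) if selected else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def rouge1_recall(candidate: str, reference: str) -> float:
    """Clipped unigram overlap divided by the number of reference unigrams."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(count, cand[tok]) for tok, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0


print(precision_recall_f1({1, 2, 3, 4}, {2, 3, 5}))
print(rouge1_recall("the cat sat", "the cat sat on the mat"))
```

    Because ROUGE-1 recall rewards any shared vocabulary, it can be much higher than sentence-level F1 for the same system, which is consistent with the gap between the two scores reported above.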

  11. Texting preferences in a Paediatric residency.

    Science.gov (United States)

    Draper, Lauren; Kuklinski, Cadence; Ladley, Amy; Adamson, Greg; Broom, Matthew

    2017-12-01

    Text messaging is ubiquitous among residents, but remains an underused educational tool. Though feasibility has been demonstrated, evidence of its ability to improve standardised test scores and provide insight on resident texting preferences is lacking. The authors set out to evaluate: (1) satisfaction with a hybrid question-and-answer (Q&A) texting format; and (2) pre-/post-paediatric in-training exam (ITE) performance. A prospective study with paediatrics and internal medicine-paediatrics residents. Residents were divided into subgroups: adolescent medicine (AM) and developmental medicine (DM). Messages were derived from ITE questions and sent Monday-Friday with a 20 per cent variance in messages specific to the sub-group. Residents completed surveys gauging perceptions of the programme, and pre- and post-programme ITE scores were analysed. Forty-one residents enrolled and 32 (78%) completed a post-programme survey. Of those, 21 (66%) preferred a Q&A format with an immediate text response versus information-only texts. The percentage change in ITE scores between 2013 and 2014 was significant. Comparing subgroups, there was no significant difference between the percentage change in ITE scores. Neither group performed significantly better on either the adolescent or developmental sections of the ITE. Conclusions: Overall, participants improved their ITE scores, but no improvement was seen in the targeted subgroups on the exam. Although Q&A texts are preferred by residents, further assessment is required to assess the effect on educational outcomes. © 2017 John Wiley & Sons Ltd and The Association for the Study of Medical Education.

  12. 重新發現教科「書」的歷程:從物質文化看教科書的潛在課程 The Trajectory of Rediscovering the Text-BOOK: Approaching the Hidden Curriculum of the Textbook from a Material Culturist Perspective

    Directory of Open Access Journals (Sweden)

    彭秉權 Ping-Chuan Peng

    2018-03-01

    Full Text Available In 2005 the author taught a course on hidden curriculum at a university in northern Taiwan. A term project recounting students' love-hate relationship with textbooks while growing up unwittingly revealed the limits of existing critical approaches in textbook studies. This prompted the author to reconsider the existence of the "book" as more than a vehicle for words made with traditional printing technology: it is also an everyday necessity in students' social practice and learning. Over the following ten years, the author re-examined the effect of the book on learning and socialization from the perspective of material culture. This article is a flashback. It begins with the author's search for the theoretical meaning of the material across the critical tradition in the sociology of education, from classical theory through cultural studies, postmodernism, and poststructuralism to recent posthumanist thought; although these treat the material in very different ways, none diminishes its importance. It then draws on some of these theories to recount and reflect on the events of that year, completing a long-delayed response. The article hopes to open up applications of the material culture approach for researchers and educators interested in textbooks, hidden curriculum, and youth subcultures, as well as learning, educational technology, policy, and curriculum and pedagogy.

  13. Learners misperceive the benefits of redundant text in multimedia learning.

    Science.gov (United States)

    Fenesi, Barbara; Kim, Joseph A

    2014-01-01

    Research on metacognition has consistently demonstrated that learners fail to endorse instructional designs that produce benefits to memory, and often prefer designs that actually impair comprehension. Unlike previous studies in which learners were only exposed to a single multimedia design, the current study used a within-subjects approach to examine whether exposure to both redundant text and non-redundant text multimedia presentations improved learners' metacognitive judgments about presentation styles that promote better understanding. A redundant text multimedia presentation containing narration paired with verbatim on-screen text (Redundant) was contrasted with two non-redundant text multimedia presentations: (1) narration paired with images and minimal text (Complementary) or (2) narration paired with minimal text (Sparse). Learners watched presentation pairs of either Redundant + Complementary, or Redundant + Sparse. Results demonstrate that Complementary and Sparse presentations produced highest overall performance on the final comprehension assessment, but the Redundant presentation produced highest perceived understanding and engagement ratings. These findings suggest that learners misperceive the benefits of redundant text, even after direct exposure to a non-redundant, effective presentation.

  14. Benchmarking infrastructure for mutation text mining.

    Science.gov (United States)

    Klein, Artjom; Riazanov, Alexandre; Hindle, Matthew M; Baker, Christopher Jo

    2014-02-25

    Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption.

  15. Benchmarking infrastructure for mutation text mining

    Science.gov (United States)

    2014-01-01

    Background: Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. Results: We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. Conclusion: We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption. PMID:24568600

  16. n-Gram-Based Text Compression

    Directory of Open Access Journals (Sweden)

    Vu H. Nguyen

    2016-01-01

    Full Text Available We propose an efficient method for compressing Vietnamese text using n-gram dictionaries. It has a significant compression ratio in comparison with those of state-of-the-art methods on the same dataset. Given a text, the proposed method first splits it into n-grams and then encodes them based on n-gram dictionaries. In the encoding phase, we use a sliding window with a size that ranges from bigrams to five-grams to obtain the best encoding stream. Each n-gram is encoded by two to four bytes, depending on its corresponding n-gram dictionary. We collected a 2.5 GB text corpus from some Vietnamese news agencies to build n-gram dictionaries from unigrams to five-grams, obtaining dictionaries with a total size of 12 GB. In order to evaluate our method, we collected a testing set of 10 different text files with different sizes. The experimental results indicate that our method achieves a compression ratio of around 90% and outperforms state-of-the-art methods.
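
    To make the dictionary-based encoding idea concrete, the sketch below encodes a token stream by greedy longest match against per-n dictionaries, trying five-grams first and falling back to shorter n-grams. It is only a toy under stated assumptions: the greedy strategy, the integer codes, and the literal escape for unknown tokens are illustrative choices, not the authors' algorithm or byte layout.

```python
def build_dictionaries(corpus_tokens, max_n=5):
    """One dictionary per n, mapping each n-gram seen in the corpus to an integer code."""
    dictionaries = {n: {} for n in range(1, max_n + 1)}
    for n in range(1, max_n + 1):
        for i in range(len(corpus_tokens) - n + 1):
            gram = tuple(corpus_tokens[i:i + n])
            dictionaries[n].setdefault(gram, len(dictionaries[n]))
    return dictionaries


def encode(tokens, dictionaries, max_n=5):
    """Greedy longest match: emit (n, code) pairs, or (0, token) for unknown tokens."""
    out, i = [], 0
    while i < len(tokens):
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            code = dictionaries[n].get(tuple(tokens[i:i + n]))
            if code is not None:
                out.append((n, code))
                i += n
                break
        else:  # no dictionary contains an n-gram starting here
            out.append((0, tokens[i]))
            i += 1
    return out


def decode(encoded, dictionaries):
    """Invert encode() by looking codes up in the reversed dictionaries."""
    inverse = {n: {code: gram for gram, code in d.items()}
               for n, d in dictionaries.items()}
    tokens = []
    for n, value in encoded:
        if n == 0:
            tokens.append(value)
        else:
            tokens.extend(inverse[n][value])
    return tokens


dicts = build_dictionaries("a b c d e f".split())
stream = encode("a b c d e x f".split(), dicts)
print(stream)
print(decode(stream, dicts))
```

    The compression gain in such a scheme comes from replacing a long run of words with one short code, which is why larger dictionaries of longer n-grams trade memory for a better ratio.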

  17. Keyword Extraction from Arabic Legal Texts

    Science.gov (United States)

    Rammal, Mahmoud; Bahsoun, Zeinab; Al Achkar Jabbour, Mona

    2015-01-01

    Purpose: The purpose of this paper is to apply local grammar (LG) to develop an indexing system which automatically extracts keywords from titles of Lebanese official journals. Design/methodology/approach: To build LG for our system, the first word that plays the determinant role in understanding the meaning of a title is analyzed and grouped as…

  18. Deep Belief Networks Based Toponym Recognition for Chinese Text

    Directory of Open Access Journals (Sweden)

    Shu Wang

    2018-06-01

    Full Text Available In Geographical Information Systems, geo-coding is used for the task of mapping from implicitly geo-referenced data to explicitly geo-referenced coordinates. At present, an enormous amount of implicitly geo-referenced information is hidden in unstructured text, e.g., Wikipedia, social data and news. Toponym recognition is the foundation of mining this useful geo-referenced information by identifying words as toponyms in text. In this paper, we propose an adapted toponym recognition approach based on a deep belief network (DBN) by exploring two key issues: word representation and model interpretation. A Skip-Gram model is used in the word representation process to represent words with contextual information that is ignored by current word representation models. We then determine the core hyper-parameters of the DBN model by illustrating the relationship between the performance and the hyper-parameters, e.g., vector dimensionality, DBN structures and probability thresholds. The experiments evaluate the performance of the Skip-Gram model implemented by the Word2Vec open-source tool, determine stable hyper-parameters and compare our approach with a conditional random field (CRF) based approach. The experimental results show that the DBN model outperforms the CRF model with a smaller corpus. When the corpus size is large enough, their statistical metrics converge. However, their recognition results show differences and complementarity on different kinds of toponyms. More importantly, combining their results can directly improve the performance of toponym recognition relative to their individual performances. It seems that the scale of the corpus has an obvious effect on the performance of toponym recognition. Generally, there is no adequate tagged corpus on specific toponym recognition tasks, especially in the era of Big Data. In conclusion, we believe that the DBN-based approach is a promising and powerful method to extract geo
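
    The Skip-Gram word representation mentioned above is trained on (center word, context word) pairs drawn from a window around each token. The sketch below shows only that pair-generation step, with a hypothetical function name and an assumed symmetric window of two tokens; it does not reproduce the Word2Vec tool or the authors' settings.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


print(skipgram_pairs(["west", "lake", "in", "hangzhou"], window=1))
```

    A Skip-Gram model then learns vectors that make each center word predictive of its context words, so words that occur in similar contexts, such as place names, end up with similar vectors.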

  19. Multilingual access to full text databases

    International Nuclear Information System (INIS)

    Fluhr, C.; Radwan, K.

    1990-05-01

    Many full-text databases are available in only one language; others contain documents in different languages. Even if the user is able to understand the language of the documents in the database, it could be easier for him to express his need in his own language. For databases containing documents in different languages, it is simpler to formulate the query in one language only and to retrieve documents in different languages. This paper presents the developments and the first experiments of multilingual search, applied to the French-English pair, for text data in the nuclear field, based on the SPIRIT system. After recalling the general problems of searching full-text databases with queries formulated in natural language, we present the methods used to reformulate the queries and show how they can be expanded for multilingual search. The first results on data in the nuclear field are presented (AFCEN norms and INIS abstracts). 4 refs

  20. Monolingual accounting dictionaries for EFL text production

    DEFF Research Database (Denmark)

    Nielsen, Sandro

    2006-01-01

    Monolingual accounting dictionaries are important for producing financial reporting texts in English in an international setting, because of the lack of specialised bilingual dictionaries. As the intended user groups have different factual and linguistic competences, they require specific types of information. By identifying and analysing the users' factual and linguistic competences, user needs, use-situations and the stages involved in producing accounting texts in English as a foreign language, lexicographers will have a sound basis for designing the optimal English accounting dictionary for EFL text production. The monolingual accounting dictionary needs to include information about UK, US and international accounting terms, their grammatical properties, their potential for being combined with other words in collocations, phrases and sentences in order to meet user requirements. Data items...

  1. Monolingual Accounting Dictionaries for EFL Text Production

    DEFF Research Database (Denmark)

    Nielsen, Sandro

    2009-01-01

    Monolingual accounting dictionaries are important for producing financial reporting texts in English in an international setting, because of the lack of specialised bilingual dictionaries. As the intended user groups have different factual and linguistic competences, they require specific types of information. By identifying and analysing the users' factual and linguistic competences, user needs, use-situations and the stages involved in producing accounting texts in English as a foreign language, lexicographers will have a sound basis for designing the optimal English accounting dictionary for EFL text production. The monolingual accounting dictionary needs to include information about UK, US and international accounting terms, their grammatical properties, their potential for being combined with other words in collocations, phrases and sentences in order to meet user requirements. Data items...

  2. Runaway electrons in TEXT-U

    International Nuclear Information System (INIS)

    Freeman, M.R.

    1994-01-01

    Runaway electrons have long been studied in tokamak plasmas. The previous results regarding runaway electrons and the detection of hard x-rays are reviewed. The hard x-ray energy on TEXT-U is measured and the scaling of energy with electron density, n_e, is noted. This scaling suggests a runaway source term that scales roughly as n_e / 1. The results indicate that runaways are created throughout the discharges. An upper bound for χ_e due to magnetic fluctuations was found to be 0.0343 m²/s. This is an order of magnitude too low to explain the thermal transport in TEXT, implying that electrostatic fluctuations are important in thermal transport in TEXT.

  3. No More Provincialism: Art and Text

    Directory of Open Access Journals (Sweden)

    Heather Barker

    2010-11-01

    Full Text Available This essay discusses the writing and personalities surrounding the 1981 establishment of the Australian art magazine, Art & Text, and traces its progression under Paul Taylor’s editorship up to his relocation to New York. During this period, Art & Text published Taylor’s own essays and, more importantly, those of other writers and artists — Meaghan Morris, Paul Foss, Philip Brophy, Imants Tillers, Rex Butler, Edward Colless — all articulating a consistent and complex postmodern position. The magazine’s founder and editor, Paul Taylor, personified the shattering impact of postmodernism upon the Australian art world as well as postmodernism’s limitations. Taylor facilitated a new theoretical framework for the discussion of Australian art, one that continues to dominate the internationalist aspirations of Australian art writers. He produced temporarily convincing solutions to problems that earlier critics had wrestled with unsuccessfully, in particular the twin problems of provincialism, and the relationship of Australian to international art.

  4. Text mining patents for biomedical knowledge.

    Science.gov (United States)

    Rodriguez-Esteban, Raul; Bundschus, Markus

    2016-06-01

    Biomedical text mining of scientific knowledge bases, such as Medline, has received much attention in recent years. Given that text mining is able to automatically extract biomedical facts that revolve around entities such as genes, proteins, and drugs, from unstructured text sources, it is seen as a major enabler to foster biomedical research and drug discovery. In contrast to the biomedical literature, research into the mining of biomedical patents has not reached the same level of maturity. Here, we review existing work and highlight the associated technical challenges that emerge from automatically extracting facts from patents. We conclude by outlining potential future directions in this domain that could help drive biomedical research and drug discovery. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. WYLBUR reference manual. [For interactive text editing

    Energy Technology Data Exchange (ETDEWEB)

    Krupp, R.F.; Messina, P.C.; Peavler, J.M.; Schustack, S.; Starai, T.

    1977-04-01

    WYLBUR is a system for manipulating various kinds of text, such as computer programs, manuscripts, letters, forms, articles, or reports. Its on-line interactive text-editing capabilities allow the user to create, change, and correct text, and to search and display it. WYLBUR also has facilities for job submission and retrieval from remote terminals that make it possible for a user to inquire about the status of any job in the system, cancel jobs that are executing or awaiting execution, reroute output, raise job priority, or get information on the backlog of batch jobs. WYLBUR also has excellent recovery capabilities and a fast response time. This manual describes the WYLBUR version currently used at ANL. It is intended primarily as a reference manual; thus, examples of WYLBUR commands are kept to a minimum. (RWR)

  6. Mining biological networks from full-text articles.

    Science.gov (United States)

    Czarnecki, Jan; Shepherd, Adrian J

    2014-01-01

    The study of biological networks is playing an increasingly important role in the life sciences. Many different kinds of biological system can be modelled as networks; perhaps the most important examples are protein-protein interaction (PPI) networks, metabolic pathways, gene regulatory networks, and signalling networks. Although much useful information is easily accessible in public databases, a lot of extra relevant data lies scattered in numerous published papers. Hence there is a pressing need for automated text-mining methods capable of extracting such information from full-text articles. Here we present practical guidelines for constructing a text-mining pipeline from existing code and software components capable of extracting PPI networks from full-text articles. This approach can be adapted to tackle other types of biological network.

  7. Texting literacies as social practices among older women

    Directory of Open Access Journals (Sweden)

    Dyers, Charlyn

    2014-12-01

    Full Text Available While many studies on mobile messaging have tended to focus on the communicative practices of the urban young, this paper considers the role of mobile messaging (also called texting) both as a social practice and as a form of literacy enhancement among a group of older working-class women between the ages of 50 and 80 in a Cape Town township. The paper examines how these women, with little or no formal education, acquire this form of literacy, as well as the purposes for which they use texting. It also explores how this form of late-modern communication is adding to four of their existing or developing literacies: text, numeracy, visual and personal. The paper therefore adopts a multiliteracies approach within the context of portable literacies.

  8. Quantum mechanics a comprehensive text for chemistry

    CERN Document Server

    Arora, Kishor

    2010-01-01

    This book contains 14 chapters. The text includes the inadequacy of classical mechanics and covers basic and fundamental concepts of quantum mechanics, including concepts of translational, vibrational, rotational and electronic energies, an introduction to concepts of angular momenta, approximate methods and their application, concepts related to electron spin, symmetry concepts and quantum mechanics, and ultimately the book features the theories of chemical bonding and the use of software in quantum mechanics. The text of the book is presented in a lucid manner with ample examples and illustrations wherever

  9. Ancient Indian Astronomy in Introductory Texts

    Science.gov (United States)

    Narahari Achar, B. N.

    1997-10-01

    It is customary in introductory survey courses in astronomy to devote some time to the history of astronomy. In the available textbooks only the Greek contribution receives any attention. Apart from pictures of Stonehenge and Chichen Itza, contributions from Babylon and China are sometimes mentioned. Hardly any account is given of ancient Indian astronomy. Even when something is mentioned, it is incomplete or incorrect or both. Examples are given from several textbooks currently available. An attempt is made to correct this situation by sketching the contributions from the earliest astronomy of India, namely Vedaanga Jyotisha.

  10. Radioprotection and radiotherapy: new regulatory texts

    International Nuclear Information System (INIS)

    Cosset, J.M.

    1998-01-01

    This article reviews the radiation protection of workers in radiotherapy centers. The different texts are explained. These texts (international and European) aim to reinforce the protection of personnel working in radiotherapy services, to reduce as far as possible the deterministic and stochastic effects on organs outside the irradiated volumes, and to avoid severe accidents. Radiotherapists have to keep in mind that treatments must be clearly justified and optimized as far as reasonably achievable. (N.C.)

  11. Texting Styles and Information Change of SMS Text Messages in Filipino

    Science.gov (United States)

    Cabatbat, Josephine Jill T.; Tapang, Giovanni A.

    2013-02-01

    We identify the different styles of texting in Filipino short message service (SMS) texts and analyze the change in unigram and bigram frequencies due to these styles. Style preference vectors for sample texts were calculated and used to identify the style combination used by an average individual. The change in Shannon entropy of the SMS text is explained in light of a coding process.
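    The entropy comparison described in this record can be illustrated with a minimal sketch. It assumes character-level n-grams and uses two made-up sample strings (a full spelling and an invented shortened spelling); the function names are hypothetical, and nothing here reproduces the authors' corpus, style vectors, or coding model.

```python
import math
from collections import Counter


def ngram_counts(text, n):
    """Count character n-grams (n=1: unigrams, n=2: bigrams)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def shannon_entropy(counts):
    """Shannon entropy in bits of the empirical n-gram distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Illustrative strings only (assumptions, not data from the study).
full_form = "kumusta ka na"
texting_form = "kmusta k n"

for label, sample in (("full", full_form), ("texting", texting_form)):
    h1 = shannon_entropy(ngram_counts(sample, 1))
    h2 = shannon_entropy(ngram_counts(sample, 2))
    print(f"{label}: unigram entropy={h1:.3f} bits, bigram entropy={h2:.3f} bits")
```

    Comparing the two entropies for the same message written in different styles gives a rough sense of how a texting style shifts the frequency distribution.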

  12. Knowledge Revision Processes in Refutation Texts

    Science.gov (United States)

    Kendeou, Panayiota; Walsh, Erinn K.; Smith, Emily R.; O'Brien, Edward J.

    2014-01-01

    In the present set of experiments, we systematically examined the processes that occur while reading texts designed to refute and explain commonsense beliefs that reside in readers' long-term memory. In Experiment 1 (n = 36), providing readers with a refutation-plus-explanation of a commonsense belief was sufficient to significantly reduce…

  13. Sleep Habits and Nighttime Texting among Adolescents

    Science.gov (United States)

    Garmy, Pernilla; Ward, Teresa M.

    2018-01-01

    The aim of this study was to examine sleep habits (i.e., bedtimes and rising times) and their association with nighttime text messaging in 15- to 17-year-old adolescents. This cross-sectional study analyzed data from a web-based survey of adolescent students attending secondary schools in southern Sweden (N = 278, 50% female). Less than 8 hr of…

  14. Handwriting segmentation of unconstrained Oriya text

    Indian Academy of Sciences (India)

    Based on vertical projection profiles and structural features of Oriya characters, text lines are segmented into words. For character segmentation, at first, the isolated and connected (touching) characters in a word are detected. Using structural, topological and water reservoir concept-based features, characters of the word ...
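    The word-segmentation step based on vertical projection profiles can be sketched as follows. This is a simplified illustration that assumes a binarized text-line image given as a list of rows (1 = ink, 0 = background) and a hypothetical `min_gap` threshold; it does not include the structural, topological, or water reservoir features the record mentions.

```python
def vertical_projection(line_image):
    """Sum of ink pixels in each column of a binarized text-line image."""
    return [sum(column) for column in zip(*line_image)]


def segment_words(line_image, min_gap=2):
    """Split a line into (start, end) column ranges, end exclusive.

    A run of at least `min_gap` empty columns is treated as a word gap;
    shorter runs are assumed to be gaps between characters of one word.
    """
    profile = vertical_projection(line_image)
    segments = []
    start = None  # first column of the word currently being scanned
    gap = 0       # length of the current run of empty columns
    for x, ink in enumerate(profile):
        if ink > 0:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                segments.append((start, x - gap + 1))
                start = None
                gap = 0
    if start is not None:
        segments.append((start, len(profile) - gap))
    return segments


# Toy 2-row "image": two ink blobs separated by three empty columns.
toy_line = [
    [1, 1, 0, 0, 0, 1, 0, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0, 1],
]
print(segment_words(toy_line, min_gap=2))
```

    Lowering `min_gap` to 1 would also split at the single empty column inside the second blob, which is why the threshold has to be tuned to the script and writing style.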

  15. The Readability of an Unreadable Text.

    Science.gov (United States)

    Gordon, Robert M.

    1980-01-01

    The Dale-Chall Readability Formula and the Fry Readability Graph were used to analyze passages of Plato's "Parmenides," a notoriously difficult literary piece. The readability levels of the text ranged from fourth to eighth grade (Dale-Chall) and from sixth to tenth grade (Fry), indicating the limitations of the readability tests. (DF)
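    For readers unfamiliar with the first of these measures, the sketch below computes a Dale-Chall raw score in its commonly cited form: 0.1579 times the percentage of difficult words plus 0.0496 times the average sentence length, with 3.6365 added when more than 5% of the words are difficult. The record does not say which variant of the formula was used, so this form is an assumption. The familiar-word set passed in is a tiny hypothetical stand-in for the real Dale-Chall list, and the tokenization is deliberately crude.

```python
import re


def dale_chall_raw_score(n_words, n_difficult, n_sentences):
    """Dale-Chall raw score from word, difficult-word and sentence counts."""
    pct_difficult = 100.0 * n_difficult / n_words
    avg_sentence_length = n_words / n_sentences
    score = 0.1579 * pct_difficult + 0.0496 * avg_sentence_length
    if pct_difficult > 5.0:
        score += 3.6365  # adjustment used in the commonly cited form
    return score


def score_text(text, familiar_words):
    """Score a passage; words not in `familiar_words` count as difficult."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    difficult = sum(1 for w in words if w not in familiar_words)
    return dale_chall_raw_score(len(words), difficult, len(sentences))


# Hypothetical mini word list and passage, for illustration only.
familiar = {"the", "one", "is", "not"}
print(round(score_text("The one is. The one is not.", familiar), 4))
```

    Because the formula only sees word familiarity and sentence length, a passage built from short, common words scores as easy however abstract its argument, which is consistent with the limitation the record reports.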

  16. Validation Study of Waray Text Readability Instrument

    Science.gov (United States)

    Oyzon, Voltaire Q.; Corrales, Juven B.; Estardo, Wilfredo M., Jr.

    2015-01-01

    In 2012 the Leyte Normal University developed computer software, modelled after the Spache Readability Formula (1953) made for English, to help rank texts; it can be used by teachers or research groups in selecting appropriate reading materials to support the DepEd's MTB-MLE program in Region VIII, in the Philippines. However,…

  17. The Pelindaba text and its previous

    International Nuclear Information System (INIS)

    Adeniji, O.

    1996-01-01

    The main body of the Treaty, the preamble, articles 1-22, and the map are reproduced in this issue in the section "Documentation Relating to Disarmament and International Security". The complete text, including annexes and protocols, is contained in document A/50/426.

  18. n-Gram-Based Text Compression

    Science.gov (United States)

    Duong, Hieu N.; Snasel, Vaclav

    2016-01-01

    We propose an efficient method for compressing Vietnamese text using n-gram dictionaries. It has a significant compression ratio in comparison with those of state-of-the-art methods on the same dataset. Given a text, the proposed method first splits it into n-grams and then encodes them based on n-gram dictionaries. In the encoding phase, we use a sliding window with a size that ranges from bigram to five-gram to obtain the best encoding stream. Each n-gram is encoded by two to four bytes, depending on its corresponding n-gram dictionary. We collected a 2.5 GB text corpus from several Vietnamese news agencies to build n-gram dictionaries from unigram to five-gram, obtaining dictionaries with a total size of 12 GB. In order to evaluate our method, we collected a testing set of 10 text files of different sizes. The experimental results indicate that our method achieves a compression ratio of around 90% and outperforms state-of-the-art methods. PMID:27965708
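    The dictionary-based encoding idea can be sketched in a few lines. This is a simplified illustration, not the authors' method: it assumes word-level n-grams, uses greedy longest-match instead of their sliding-window search for the best stream, and emits (n, code) pairs rather than a two-to-four-byte format. All function names and the toy corpus are hypothetical.

```python
def build_dictionaries(corpus_sentences, max_n=5):
    """Map every word n-gram (n = 1..max_n) in the corpus to an integer code."""
    dicts = {n: {} for n in range(1, max_n + 1)}
    for sentence in corpus_sentences:
        words = sentence.split()
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                gram = tuple(words[i:i + n])
                dicts[n].setdefault(gram, len(dicts[n]))
    return dicts


def encode(text, dicts, max_n=5):
    """Greedy encoding: at each position take the longest n-gram in a dictionary."""
    words = text.split()
    tokens = []
    i = 0
    while i < len(words):
        for n in range(min(max_n, len(words) - i), 0, -1):
            gram = tuple(words[i:i + n])
            if gram in dicts[n]:
                tokens.append((n, dicts[n][gram]))
                i += n
                break
        else:
            # A real compressor would need an escape code for unknown words.
            raise KeyError(f"word not in unigram dictionary: {words[i]!r}")
    return tokens


def decode(tokens, dicts):
    """Invert the dictionaries and rebuild the word sequence."""
    inverse = {n: {code: gram for gram, code in d.items()} for n, d in dicts.items()}
    return " ".join(word for n, code in tokens for word in inverse[n][code])


toy_dicts = build_dictionaries(["a b c d e f"])
toy_tokens = encode("a b c d e f", toy_dicts)
print(toy_tokens)
print(decode(toy_tokens, toy_dicts))
```

    The compression gain comes from replacing several words with a single short code, so the ratio depends heavily on how well the dictionaries cover the text being compressed.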

  19. Using Text Models In Diagnostic Tasks.

    Directory of Open Access Journals (Sweden)

    Korostil Yuriy

    2015-09-01

    Full Text Available This paper develops a method for solving diagnostic tasks for complex technical objects (STO), based on using text models (TMi) to describe the functioning of an STO. A TMi model is a text description, in normalized form, of all fragments of the STO functioning process. The TMi description is formed using semantic vocabularies of different types, which are generated from information about all aspects of STO construction and functioning. Such an interpretation description is the subject area for STO diagnostic tasks. Malfunctions and deviations of the STO functioning process from an established functioning mode are detected by analyzing semantic parameters of the text description of the process, in order to determine semantic anomalies that occur in the descriptions of the process as well as in the descriptions of its fragments. Semantic anomalies occur when values of semantic parameters go beyond their established limits.

  20. Historical Text Comprehension Reflective Tutorial Dialogue System

    Science.gov (United States)

    Grigoriadou, Maria; Tsaganou, Grammatiki; Cavoura, Theodora

    2005-01-01

    The Reflective Tutorial Dialogue System (ReTuDiS) is a system for learner modelling historical text comprehension through reflective dialogue. The system infers learners' cognitive profiles and constructs their learner models. Based on the learner model the system plans the appropriate--personalized for learners--reflective tutorial dialogue in…

  1. There is a Text in 'The Balloon'

    DEFF Research Database (Denmark)

    Elias, Camelia

    2009-01-01

    From the Introduction: Camelia Elias' "There is a Text in 'The Balloon': Donald Barthelme's Allegorical Flights" provides its reader with a much-needed and useful distinction between fantasy and the fantastic: "whereas fantasy in critical discourse can be aligned with allegory, in which a supernatu...

  2. Studies of electron cyclotron emission on TEXT

    International Nuclear Information System (INIS)

    Gandy, R.F.

    1990-07-01

    The Auburn University electron cyclotron emission (ECE) system has made many significant contributions to the TEXT experimental program during the past five years. Contributions include electron temperature information used in the following areas of study: electron cyclotron heating (ECH), pellet injection, and impurity/energy transport. Details of the role which the Auburn ECE system has played will now be discussed.

  3. The Cultural Content of Business Spanish Texts.

    Science.gov (United States)

    Grosse, Christine Uber; Uber, David

    A study examined eight business Spanish textbooks for cultural content by looking at commonly appearing cultural topics and themes, presentation of cultural information, activities and techniques used to promote cultural understanding, and incorporation of authentic materials. The texts were evenly divided among beginning, intermediate, and…

  4. Modeling statistical properties of written text.

    Directory of Open Access Journals (Sweden)

    M Angeles Serrano

    Full Text Available Written text is one of the fundamental manifestations of human language, and the study of its universal regularities can give clues about how our brains process information and how we, as a society, organize and share it. Among these regularities, only Zipf's law has been explored in depth. Other basic properties, such as the existence of bursts of rare words in specific documents, have only been studied independently of each other and mainly by descriptive models. As a consequence, there is a lack of understanding of linguistic processes as complex emergent phenomena. Beyond Zipf's law for word frequencies, here we focus on burstiness, Heaps' law describing the sublinear growth of vocabulary size with the length of a document, and the topicality of document collections, which encode correlations within and across documents absent in random null models. We introduce and validate a generative model that explains the simultaneous emergence of all these patterns from simple rules. As a result, we find a connection between the bursty nature of rare words and the topical organization of texts and identify dynamic word ranking and memory across documents as key mechanisms explaining the non-trivial organization of written text. Our research can have broad implications and practical applications in computer science, cognitive science and linguistics.
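    Heaps' law, mentioned above, says that vocabulary size grows sublinearly with document length, roughly V(n) ≈ K·n^β with β < 1. The sketch below measures a vocabulary growth curve and estimates K and β by ordinary least squares in log-log space. It only illustrates the empirical regularity under those assumptions; it is not the generative model the record introduces, and the function names and toy token sequence are hypothetical.

```python
import math


def vocabulary_growth(tokens):
    """Number of distinct tokens seen after each position in the sequence."""
    seen = set()
    growth = []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth


def fit_heaps(growth):
    """Least-squares fit of log V = log K + beta * log n; returns (K, beta)."""
    xs = [math.log(n) for n in range(1, len(growth) + 1)]
    ys = [math.log(v) for v in growth]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    beta = cov / var
    return math.exp(mean_y - beta * mean_x), beta


# Toy token sequence with repeated words (illustrative only).
toy_tokens = "a b a c a b d a c e a b".split()
toy_growth = vocabulary_growth(toy_tokens)
k, beta = fit_heaps(toy_growth)
print(toy_growth)
print(f"K={k:.3f}, beta={beta:.3f}")
```

    On real corpora the fit is usually done on long documents, since the first few tokens dominate a short curve like this toy one.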

  5. Assessing Assessment Texts: Where Is Planning?

    Science.gov (United States)

    Fives, Helenrose; Barnes, Nicole; Dacey, Charity; Gillis, Anna

    2016-01-01

    We conducted a content analysis of 27 assessment textbooks to determine how assessment planning was framed in texts for preservice teachers. We identified eight assessment planning themes: alignment, assessment purpose and types, reliability and validity, writing goals and objectives, planning specific assessments, unpacking, overall assessment…

  6. "The Politics of Location": Text as Opposition.

    Science.gov (United States)

    Moreno, Renee

    Eduardo Galeano's "Memory of Fire: Genesis" raises a number of questions concerning the "politics of location," a term that may be defined as the intersections, tensions, and complications that people of color bring to space and what space means in terms of hierarchies and power, racial and gender stratifications. Text can also…

  7. AUTHENTIC TEXTS FOR CRITICAL READING ACTIVITIES

    Directory of Open Access Journals (Sweden)

    Ila Amalia

    2016-03-01

    Full Text Available This action research aimed to promote critical reading (“thinking” while reading) skills among students using authentic materials. It also aimed to reveal the students’ perception of using critical reading skills in reading activities. Nineteen English Education Department students who took the Reading IV class participated in this project. There were three cycles, in which three different critical reading strategies were applied. The authentic materials were taken from newspaper and internet articles. The results revealed that the use of critical reading strategies along with authentic materials improved students’ critical reading skills, as seen from the improvement in each cycle: the students’ critical reading skill was 54% (fair) in cycle 1, improved to 68% (average) in cycle 2, and reached 82% (good) in cycle 3. In addition, based on the critical reading skill criteria, the students’ critical reading skill improved from 40% (nearly meet) to 80% (exceed). Meanwhile, the students’ perception questionnaire showed that 63% of students agreed that the critical reading activity using authentic text could improve critical thinking, and 58% agreed that doing critical reading activity could improve reading comprehension. The results imply that the use of authentic texts can improve students’ critical reading skills if they are taught by performing rather than by lecturing. Selectively choosing various strategies and materials can trigger students’ activeness in responding to a text, which eventually shapes their critical reading skills.

  8. Database citation in full text biomedical articles.

    Science.gov (United States)

    Kafkas, Şenay; Kim, Jee-Hyub; McEntyre, Johanna R

    2013-01-01

    Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires that there are structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high-throughput of this text-mining pipeline makes this activity possible both accurately and at low cost, which will allow the development of new integrated data services.

  9. Selecting Full-Text Undergraduate Periodicals Databases.

    Science.gov (United States)

    Still, Julie M.; Kassabian, Vibiana

    1999-01-01

    Examines how libraries and librarians can compare full-text general periodical indices, using ProQuest Direct, Periodical Abstracts (via Ovid), and EBSCOhost as examples. Explores breadth and depth of coverage; manipulation of results (email/download/print); ease of use (searching); and indexing quirks. (AEF)

  10. The Impact of Texting on Comprehension

    Directory of Open Access Journals (Sweden)

    Jamal K. M. Ali

    2015-07-01

    Full Text Available This paper presents a study of the effects of texting on English language comprehension. The authors believe that English used in texting causes a lack of comprehension for English speakers, learners, and texters. Wei, Xian-hai and Jiang (2008:3) declare, “In Netspeak, there are some newly-created vocabularies, which people cannot comprehend them either from their partial pronunciation or from their figures.” Crystal (2007:23) claims, “variation causes problems of comprehension and acceptability. If you speak or write differently from the way I do, we may fail to understand each other.” In this paper, the authors administered a questionnaire at Aligarh Muslim University to ninety respondents from five different Faculties and four different levels. To measure respondents’ comprehension of English texting, the authors gave the respondents abbreviations used by texters and asked them to write the full forms of the abbreviations. The authors found that many abbreviations were not understood, which suggested that most of the respondents did not understand and did not use these abbreviations.

  11. Neogeography: The Treasure of User Volunteered Text

    NARCIS (Netherlands)

    Habib, Mena Badieh

    Neogeography is the combination of user generated data and experiences with mapping technologies. This poster presents a research project to extract valuable structured information with a geographic component from unstructured user generated text in wikis, forums, or SMSs. The project intends to

  12. Prayer in Qumran texts. A brief introduction

    Directory of Open Access Journals (Sweden)

    Zdzisław J. Kapera

    2011-03-01

    Full Text Available Of some three hundred literary texts found in the caves of the Judaean Desert and those close to Khirbet Qumran, 56 are various pieces of poetry and liturgy. Seven specific groups have been distinguished among them: 1. Liturgy on sunshine and sunset and on specific days; 2. Liturgy on specific ceremonies of the community; 3. Eschatological prayers; 4. Magic texts; 5. Collections of psalms (including pseudepigrapha; 6. Thanksgiving hymns; 7. Prose prayers. The issue of how the Qumranians were praying is here briefly touched upon. Then there is a description of morning and evening prayers, Sabbath prayers, specific liturgy of the annual ceremony of entering the New Covenant, the Hodayot (Thanksgiving Hymns, pseudepigraphic Psalms (like Ps 151, and the eschatological prayers. The introduction ends with a summary evaluation of the role of the texts in reconstructing the historical development of the Jewish prayer of the late Second Temple period. The need to study the relationship of the Qumran prayers with the early Christian prayers is also briefly discussed.

  13. Rubrics and Exemplars in Text-Conferencing

    Science.gov (United States)

    Zahara, Allan

    2005-01-01

    The author draws on his K-12 teaching experiences in analyzing the strengths and weaknesses of asynchronous, text-based conferencing in online education. Issues relating to Web-based versus client-driven systems in computer-mediated conferencing (CMC) are examined. The paper also discusses pedagogical and administrative implications of choosing a…

  14. The Challenges of Qualitatively Coding Ancient Texts

    Science.gov (United States)

    Slingerland, Edward; Chudek, Maciej

    2012-01-01

    We respond to several important and valid concerns about our study ("The Prevalence of Folk Dualism in Early China," "Cognitive Science" 35: 997-1007) by Klein and Klein, defending our interpretation of our data. We also argue that, despite the undeniable challenges involved in qualitatively coding texts from ancient cultures,…

  15. CONAN : Text Mining in the Biomedical Domain

    NARCIS (Netherlands)

    Malik, R.

    2006-01-01

    This thesis is about text mining: extracting important information from literature. In recent years, the number of biomedical articles and journals has been growing exponentially. Scientists might not find the information they want because of the large number of publications. Therefore a system was

  16. Teaching life writing texts in Europe : Introduction

    NARCIS (Netherlands)

    Mreijen, Anne-Marie

    2015-01-01

    Although courses on auto/biography and life writing are taught at different universities in Europe, and elements of contemporary life writing issues are addressed in different disciplines like sociology and history, life writing courses, as described in Teaching Life Writing Texts, are certainly not

  17. Measurement of the [Formula: see text] meson lifetime using [Formula: see text] decays.

    Science.gov (United States)

    Aaij, R; Adeva, B; Adinolfi, M; Affolder, A; Ajaltouni, Z; Albrecht, J; Alessio, F; Alexander, M; Ali, S; Alkhazov, G; Cartelle, P Alvarez; Alves, A A; Amato, S; Amerio, S; Amhis, Y; Anderlini, L; Anderson, J; Andreassen, R; Andreotti, M; Andrews, J E; Appleby, R B; Gutierrez, O Aquines; Archilli, F; Artamonov, A; Artuso, M; Aslanides, E; Auriemma, G; Baalouch, M; Bachmann, S; Back, J J; Badalov, A; Balagura, V; Baldini, W; Barlow, R J; Barschel, C; Barsuk, S; Barter, W; Batozskaya, V; Bauer, Th; Bay, A; Beddow, J; Bedeschi, F; Bediaga, I; Belogurov, S; Belous, K; Belyaev, I; Ben-Haim, E; Bencivenni, G; Benson, S; Benton, J; Berezhnoy, A; Bernet, R; Bettler, M-O; van Beuzekom, M; Bien, A; Bifani, S; Bird, T; Bizzeti, A; Bjørnstad, P M; Blake, T; Blanc, F; Blouw, J; Blusk, S; Bocci, V; Bondar, A; Bondar, N; Bonivento, W; Borghi, S; Borgia, A; Borsato, M; Bowcock, T J V; Bowen, E; Bozzi, C; Brambach, T; van den Brand, J; Bressieux, J; Brett, D; Britsch, M; Britton, T; Brook, N H; Brown, H; Bursche, A; Busetto, G; Buytaert, J; Cadeddu, S; Calabrese, R; Callot, O; Calvi, M; Calvo Gomez, M; Camboni, A; Campana, P; Campora Perez, D; Carbone, A; Carboni, G; Cardinale, R; Cardini, A; Carranza-Mejia, H; Carson, L; Carvalho Akiba, K; Casse, G; Castillo Garcia, L; Cattaneo, M; Cauet, Ch; Cenci, R; Charles, M; Charpentier, Ph; Cheung, S-F; Chiapolini, N; Chrzaszcz, M; Ciba, K; Cid Vidal, X; Ciezarek, G; Clarke, P E L; Clemencic, M; Cliff, H V; Closier, J; Coca, C; Coco, V; Cogan, J; Cogneras, E; Collins, P; Comerma-Montells, A; Contu, A; Cook, A; Coombes, M; Coquereau, S; Corti, G; Counts, I; Couturier, B; Cowan, G A; Craik, D C; Cruz Torres, M; Cunliffe, S; Currie, R; D'Ambrosio, C; Dalseno, J; David, P; David, P N Y; Davis, A; De Bonis, I; De Bruyn, K; De Capua, S; De Cian, M; De Miranda, J M; De Paula, L; De Silva, W; De Simone, P; Decamp, D; Deckenhoff, M; Del Buono, L; Déléage, N; Derkach, D; Deschamps, O; Dettori, F; Di Canto, A; Dijkstra, H; Donleavy, S; Dordei, F; 
Dorigo, M; Dorosz, P; Dosil Suárez, A; Dossett, D; Dovbnya, A; Dupertuis, F; Durante, P; Dzhelyadin, R; Dziurda, A; Dzyuba, A; Easo, S; Egede, U; Egorychev, V; Eidelman, S; Eisenhardt, S; Eitschberger, U; Ekelhof, R; Eklund, L; El Rifai, I; Elsasser, Ch; Falabella, A; Färber, C; Farinelli, C; Farry, S; Ferguson, D; Fernandez Albor, V; Ferreira Rodrigues, F; Ferro-Luzzi, M; Filippov, S; Fiore, M; Fiorini, M; Fitzpatrick, C; Fontana, M; Fontanelli, F; Forty, R; Francisco, O; Frank, M; Frei, C; Frosini, M; Furfaro, E; Gallas Torreira, A; Galli, D; Gandelman, M; Gandini, P; Gao, Y; Garofoli, J; Garra Tico, J; Garrido, L; Gaspar, C; Gauld, R; Gersabeck, E; Gersabeck, M; Gershon, T; Ghez, Ph; Gianelle, A; Gibson, V; Giubega, L; Gligorov, V V; Göbel, C; Golubkov, D; Golutvin, A; Gomes, A; Gordon, H; Grabalosa Gándara, M; Graciani Diaz, R; Granado Cardoso, L A; Graugés, E; Graziani, G; Grecu, A; Greening, E; Gregson, S; Griffith, P; Grillo, L; Grünberg, O; Gui, B; Gushchin, E; Guz, Yu; Gys, T; Hadjivasiliou, C; Haefeli, G; Haen, C; Hafkenscheid, T W; Haines, S C; Hall, S; Hamilton, B; Hampson, T; Hansmann-Menzemer, S; Harnew, N; Harnew, S T; Harrison, J; Hartmann, T; He, J; Head, T; Heijne, V; Hennessy, K; Henrard, P; Hernando Morata, J A; van Herwijnen, E; Heß, M; Hicheur, A; Hill, D; Hoballah, M; Hombach, C; Hulsbergen, W; Hunt, P; Huse, T; Hussain, N; Hutchcroft, D; Hynds, D; Iakovenko, V; Idzik, M; Ilten, P; Jacobsson, R; Jaeger, A; Jans, E; Jaton, P; Jawahery, A; Jing, F; John, M; Johnson, D; Jones, C R; Joram, C; Jost, B; Jurik, N; Kaballo, M; Kandybei, S; Kanso, W; Karacson, M; Karbach, T M; Kenyon, I R; Ketel, T; Khanji, B; Khurewathanakul, C; Klaver, S; Kochebina, O; Komarov, I; Koopman, R F; Koppenburg, P; Korolev, M; Kozlinskiy, A; Kravchuk, L; Kreplin, K; Kreps, M; Krocker, G; Krokovny, P; Kruse, F; Kucharczyk, M; Kudryavtsev, V; Kurek, K; Kvaratskheliya, T; La Thi, V N; Lacarrere, D; Lafferty, G; Lai, A; Lambert, D; Lambert, R W; Lanciotti, E; Lanfranchi, G; 
Langenbruch, C; Latham, T; Lazzeroni, C; Le Gac, R; van Leerdam, J; Lees, J-P; Lefèvre, R; Leflat, A; Lefrançois, J; Leo, S; Leroy, O; Lesiak, T; Leverington, B; Li, Y; Liles, M; Lindner, R; Linn, C; Lionetto, F; Liu, B; Liu, G; Lohn, S; Longstaff, I; Lopes, J H; Lopez-March, N; Lowdon, P; Lu, H; Lucchesi, D; Luisier, J; Luo, H; Luppi, E; Lupton, O; Machefert, F; Machikhiliyan, I V; Maciuc, F; Maev, O; Malde, S; Manca, G; Mancinelli, G; Manzali, M; Maratas, J; Marconi, U; Marino, P; Märki, R; Marks, J; Martellotti, G; Martens, A; Martín Sánchez, A; Martinelli, M; Martinez Santos, D; Martins Tostes, D; Massafferri, A; Matev, R; Mathe, Z; Matteuzzi, C; Mazurov, A; McCann, M; McCarthy, J; McNab, A; McNulty, R; McSkelly, B; Meadows, B; Meier, F; Meissner, M; Merk, M; Milanes, D A; Minard, M-N; Molina Rodriguez, J; Monteil, S; Moran, D; Morandin, M; Morawski, P; Mordà, A; Morello, M J; Mountain, R; Mous, I; Muheim, F; Müller, K; Muresan, R; Muryn, B; Muster, B; Naik, P; Nakada, T; Nandakumar, R; Nasteva, I; Needham, M; Neubert, S; Neufeld, N; Nguyen, A D; Nguyen, T D; Nguyen-Mau, C; Nicol, M; Niess, V; Niet, R; Nikitin, N; Nikodem, T; Novoselov, A; Oblakowska-Mucha, A; Obraztsov, V; Oggero, S; Ogilvy, S; Okhrimenko, O; Oldeman, R; Onderwater, G; Orlandea, M; Otalora Goicochea, J M; Owen, P; Oyanguren, A; Pal, B K; Palano, A; Palutan, M; Panman, J; Papanestis, A; Pappagallo, M; Pappalardo, L; Parkes, C; Parkinson, C J; Passaleva, G; Patel, G D; Patel, M; Patrignani, C; Pavel-Nicorescu, C; Pazos Alvarez, A; Pearce, A; Pellegrino, A; Penso, G; Pepe Altarelli, M; Perazzini, S; Perez Trigo, E; Perret, P; Perrin-Terrin, M; Pescatore, L; Pesen, E; Pessina, G; Petridis, K; Petrolini, A; Picatoste Olloqui, E; Pietrzyk, B; Pilař, T; Pinci, D; Pistone, A; Playfer, S; Plo Casasus, M; Polci, F; Polok, G; Poluektov, A; Polycarpo, E; Popov, A; Popov, D; Popovici, B; Potterat, C; Powell, A; Prisciandaro, J; Pritchard, A; Prouve, C; Pugatch, V; Puig Navarro, A; Punzi, G; Qian, W; 
Rachwal, B; Rademacker, J H; Rakotomiaramanana, B; Rama, M; Rangel, M S; Raniuk, I; Rauschmayr, N; Raven, G; Redford, S; Reichert, S; Reid, M M; Dos Reis, A C; Ricciardi, S; Richards, A; Rinnert, K; Rives Molina, V; Roa Romero, D A; Robbe, P; Roberts, D A; Rodrigues, A B; Rodrigues, E; Rodriguez Perez, P; Roiser, S; Romanovsky, V; Romero Vidal, A; Rotondo, M; Rouvinet, J; Ruf, T; Ruffini, F; Ruiz, H; Ruiz Valls, P; Sabatino, G; Saborido Silva, J J; Sagidova, N; Sail, P; Saitta, B; Salustino Guimaraes, V; Sanmartin Sedes, B; Santacesaria, R; Santamarina Rios, C; Santovetti, E; Sapunov, M; Sarti, A; Satriano, C; Satta, A; Savrie, M; Savrina, D; Schiller, M; Schindler, H; Schlupp, M; Schmelling, M; Schmidt, B; Schneider, O; Schopper, A; Schune, M-H; Schwemmer, R; Sciascia, B; Sciubba, A; Seco, M; Semennikov, A; Senderowska, K; Sepp, I; Serra, N; Serrano, J; Seyfert, P; Shapkin, M; Shapoval, I; Shcheglov, Y; Shears, T; Shekhtman, L; Shevchenko, O; Shevchenko, V; Shires, A; Silva Coutinho, R; Simi, G; Sirendi, M; Skidmore, N; Skwarnicki, T; Smith, N A; Smith, E; Smith, E; Smith, J; Smith, M; Snoek, H; Sokoloff, M D; Soler, F J P; Soomro, F; Souza, D; Souza De Paula, B; Spaan, B; Sparkes, A; Spinella, F; Spradlin, P; Stagni, F; Stahl, S; Steinkamp, O; Stevenson, S; Stoica, S; Stone, S; Storaci, B; Stracka, S; Straticiuc, M; Straumann, U; Stroili, R; Subbiah, V K; Sun, L; Sutcliffe, W; Swientek, S; Syropoulos, V; Szczekowski, M; Szczypka, P; Szilard, D; Szumlak, T; T'Jampens, S; Teklishyn, M; Tellarini, G; Teodorescu, E; Teubert, F; Thomas, C; Thomas, E; van Tilburg, J; Tisserand, V; Tobin, M; Tolk, S; Tomassetti, L; Tonelli, D; Topp-Joergensen, S; Torr, N; Tournefier, E; Tourneur, S; Tran, M T; Tresch, M; Tsaregorodtsev, A; Tsopelas, P; Tuning, N; Ubeda Garcia, M; Ukleja, A; Ustyuzhanin, A; Uwer, U; Vagnoni, V; Valenti, G; Vallier, A; Vazquez Gomez, R; Vazquez Regueiro, P; Vázquez Sierra, C; Vecchi, S; Velthuis, J J; Veltri, M; Veneziano, G; Vesterinen, M; Viaud, B; 
Vieira, D; Vilasis-Cardona, X; Vollhardt, A; Volyanskyy, D; Voong, D; Vorobyev, A; Vorobyev, V; Voß, C; Voss, H; de Vries, J A; Waldi, R; Wallace, C; Wallace, R; Wandernoth, S; Wang, J; Ward, D R; Watson, N K; Webber, A D; Websdale, D; Whitehead, M; Wicht, J; Wiechczynski, J; Wiedner, D; Wiggers, L; Wilkinson, G; Williams, M P; Williams, M; Wilson, F F; Wimberley, J; Wishahi, J; Wislicki, W; Witek, M; Wormser, G; Wotton, S A; Wright, S; Wu, S; Wyllie, K; Xie, Y; Xing, Z; Yang, Z; Yuan, X; Yushchenko, O; Zangoli, M; Zavertyaev, M; Zhang, F; Zhang, L; Zhang, W C; Zhang, Y; Zhelezov, A; Zhokhov, A; Zhong, L; Zvyagin, A

    The lifetime of the [Formula: see text] meson is measured using semileptonic decays having a [Formula: see text] meson and a muon in the final state. The data, corresponding to an integrated luminosity of [Formula: see text], are collected by the LHCb detector in [Formula: see text] collisions at a centre-of-mass energy of 8 TeV. The measured lifetime is [Formula: see text], where the first uncertainty is statistical and the second is systematic.

  18. Text Mining the History of Medicine.

    Science.gov (United States)

    Thompson, Paul; Batista-Navarro, Riza Theresa; Kontonatsios, Georgios; Carter, Jacob; Toon, Elizabeth; McNaught, John; Timmermann, Carsten; Worboys, Michael; Ananiadou, Sophia

    2016-01-01

    Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestions of synonyms of user-entered query terms, exploration of different concepts mentioned within search results or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, owing to differences and evolutions in vocabulary, terminology, language structure and style, compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid-19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible semantically-oriented search system. The novel resources are available for research purposes, while

  19. Working with Data: Discovering Knowledge through Mining and Analysis; Systematic Knowledge Management and Knowledge Discovery; Text Mining; Methodological Approach in Discovering User Search Patterns through Web Log Analysis; Knowledge Discovery in Databases Using Formal Concept Analysis; Knowledge Discovery with a Little Perspective.

    Science.gov (United States)

    Qin, Jian; Jurisica, Igor; Liddy, Elizabeth D.; Jansen, Bernard J; Spink, Amanda; Priss, Uta; Norton, Melanie J.

    2000-01-01

    These six articles discuss knowledge discovery in databases (KDD). Topics include data mining; knowledge management systems; applications of knowledge discovery; text and Web mining; text mining and information retrieval; user search patterns through Web log analysis; concept analysis; data collection; and data structure inconsistency. (LRW)

  20. Bollywood Movie Corpus for Text, Images and Videos

    OpenAIRE

    Madaan, Nishtha; Mehta, Sameep; Saxena, Mayank; Aggarwal, Aditi; Agrawaal, Taneea S; Malhotra, Vrinda

    2017-01-01

    In the past few years, several data-sets have been released for text and images. We present an approach to create the data-set for use in detecting and removing gender bias from text. We also include a set of challenges we have faced while creating this corpus. In this work, we have worked with movie data from Wikipedia plots and movie trailers from YouTube. Our Bollywood Movie corpus contains 4000 movies extracted from Wikipedia and 880 trailers extracted from YouTube which were released from 1...

  1. Morpheme matching based text tokenization for a scarce resourced language.

    Science.gov (United States)

    Rehman, Zobia; Anwar, Waqas; Bajwa, Usama Ijaz; Xuan, Wang; Chaoying, Zhou

    2013-01-01

    Text tokenization is a fundamental pre-processing step for almost all information processing applications. This task is nontrivial for scarce-resourced languages such as Urdu, as space is used inconsistently between words. In this paper a morpheme matching based approach is proposed for Urdu text tokenization, along with some other algorithms to solve the additional issues of boundary detection of compound words, affixation, reduplication, names and abbreviations. The study achieved 97.28% precision, 93.71% recall, and 95.46% F1-measure when tokenizing a corpus of 57000 words using a morpheme list with 6400 entries.
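
    As a minimal sketch of dictionary-driven segmentation (not the authors' algorithm, which also handles compounds, affixation, reduplication, names and abbreviations), the Python below performs greedy forward longest-match segmentation against a morpheme list and re-derives the reported F1-measure from the stated precision and recall. The function names and the toy Latin-script morpheme list are assumptions made for illustration only.

```python
def longest_match_tokenize(text, morphemes):
    """Greedy forward longest-match segmentation of unspaced text."""
    max_len = max(len(m) for m in morphemes)
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character
        # when nothing in the (hypothetical) morpheme list matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in morphemes or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Toy morpheme list, invented for this example.
toy_morphemes = {"text", "mining", "min"}
print(longest_match_tokenize("textmining", toy_morphemes))
# Consistency check on the figures quoted in the abstract.
print(round(f1_score(97.28, 93.71), 2))
```

    Greedy matching can mis-segment when a long entry overlaps the start of the next word, which is one reason a practical tokenizer would need the additional boundary rules the abstract mentions.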

  2. Text mining improves prediction of protein functional sites.

    Directory of Open Access Journals (Sweden)

    Karin M Verspoor

    Full Text Available We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions.

  3. Text Character Extraction Implementation from Captured Handwritten Image to Text Conversion using Template Matching Technique

    Directory of Open Access Journals (Sweden)

    Barate Seema

    2016-01-01

    Full Text Available Images contain various types of useful information that should be extracted whenever required. Various algorithms and methods have been proposed to extract text from a given image, so that a user can access the text in any image. Variations in text may occur because of differences in size, style, orientation and alignment of text, and low image contrast and composite backgrounds make text extraction difficult. If we develop an application that extracts and recognizes those texts accurately in real time, then it can be applied to many important applications such as document analysis, vehicle license plate extraction and text-based image indexing, many of which have become realities in recent years. To overcome the above problems we develop an application that converts the image into text by using algorithms such as bounding box, HSV model, blob analysis, template matching and template generation.

  4. Domain-independent information extraction in unstructured text

    Energy Technology Data Exchange (ETDEWEB)

    Irwin, N.H. [Sandia National Labs., Albuquerque, NM (United States). Software Surety Dept.

    1996-09-01

    Extracting information from unstructured text has become an important research area in recent years due to the large amount of text now electronically available. This status report describes the findings and work done during the second year of a two-year Laboratory Directed Research and Development Project. Building on the first-year's work of identifying important entities, this report details techniques used to group words into semantic categories and to output templates containing selective document content. Using word profiles and category clustering derived during a training run, the time-consuming knowledge-building task can be avoided. Though the output still lacks in completeness when compared to systems with domain-specific knowledge bases, the results do look promising. The two approaches are compatible and could complement each other within the same system. Domain-independent approaches retain appeal as a system that adapts and learns will soon outpace a system with any amount of a priori knowledge.

  5. DEEP LEARNING MODEL FOR BILINGUAL SENTIMENT CLASSIFICATION OF SHORT TEXTS

    Directory of Open Access Journals (Sweden)

    Y. B. Abdullin

    2017-01-01

    Full Text Available Sentiment analysis of short texts such as Twitter messages and comments in news portals is challenging due to the lack of contextual information. We propose a deep neural network model that uses bilingual word embeddings to effectively solve the sentiment classification problem for a given pair of languages. We apply our approach to two corpora of two different language pairs: English-Russian and Russian-Kazakh. We show how to train a classifier in one language and predict in another. Our approach achieves 73% accuracy for English and 74% accuracy for Russian. For Kazakh sentiment analysis, we propose a baseline method that achieves 60% accuracy, and a method to learn bilingual embeddings from a large unlabeled corpus using bilingual word pairs.

  6. Cell Phoning and Texting While Driving

    Directory of Open Access Journals (Sweden)

    Judy Honoria Rosaire Telemaque

    2015-07-01

    Full Text Available A qualitative phenomenological study was conducted on the consequences of cell phone use while operating a vehicle. We discussed why talking and texting on cell phones are so popular through the analysis of our interviews with police officers, driving instructors, and parents of teens and young adults. The participants came from central, northeastern, northwestern, and southeastern Connecticut. All had been exposed to the effects of the cell phone usage problem. The study reached a point of theoretical saturation or redundancy at which the analysis no longer resulted in new themes. We concluded that the discoveries revealed the necessity for education, expansion of technology, and additional driver education preparation, which may provide a path for leadership to help solve the problem.

  7. Can An Evolutionary Process Create English Text?

    Energy Technology Data Exchange (ETDEWEB)

    Bailey, David H.

    2008-10-29

    Critics of the conventional theory of biological evolution have asserted that while natural processes might result in some limited diversity, nothing fundamentally new can arise from 'random' evolution. In response, biologists such as Richard Dawkins have demonstrated that a computer program can generate a specific short phrase via evolution-like iterations starting with random gibberish. While such demonstrations are intriguing, they are flawed in that they have a fixed, pre-specified future target, whereas in real biological evolution there is no fixed future target, but only a complicated 'fitness landscape'. In this study, a significantly more sophisticated evolutionary scheme is employed to produce text segments reminiscent of a Charles Dickens novel. The aggregate size of these segments is larger than the computer program and the input Dickens text, even when comparing compressed data (as a measure of information content).
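
    The fixed-target demonstration that the abstract criticizes can be sketched in a few lines of Python. This is a generic cumulative-selection loop in the spirit of the Dawkins example, not the more sophisticated scheme used in the study; the target phrase, population size, mutation rate and seed are all illustrative assumptions.

```python
import random
import string

ALPHABET = string.ascii_uppercase + " "


def fitness(candidate, target):
    """Number of positions at which candidate already matches target."""
    return sum(a == b for a, b in zip(candidate, target))


def mutate(parent, rate, rng):
    """Copy parent, replacing each character with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent
    )


def evolve(target, offspring=100, rate=0.05, seed=0, max_generations=100_000):
    """Return (final_string, generations_used) for a fixed-target search."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in target)
    generations = 0
    while parent != target and generations < max_generations:
        brood = [mutate(parent, rate, rng) for _ in range(offspring)]
        # Keep the parent in the pool so fitness never decreases.
        parent = max(brood + [parent], key=lambda s: fitness(s, target))
        generations += 1
    return parent, generations


final, used = evolve("TEXT MINING")
print(final, used)
```

    Note that `fitness` compares every candidate against the pre-specified target, which is exactly the feature the abstract identifies as unlike real biological evolution.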

  8. Resonant island divertor experiments on TEXT

    International Nuclear Information System (INIS)

    deGrassie, J.S.; Evans, T.E.; Jackson, G.L.

    1988-09-01

    The first experimental tests of the resonant island divertor (RID) concept have been carried out on the Texas Experimental Tokamak (TEXT). Modular perturbation coils produce static resonant magnetic fields at the tokamak boundary. The resulting magnetic islands are used to guide heat and particle fluxes around a small scoop limiter head. An enhancement in the limiter collection efficiency over the nonisland operation, as evidenced by enhanced neutral density within the limiter head, of up to a factor of 4 is obtained. This enhancement is larger than one would expect given the measured magnitude of the cross-field particle transport in TEXT. It is proposed that electrostatic perturbations occur which enhance the ion convection rate around the islands. Preliminary experiments utilizing electron cyclotron heating (ECH) in conjunction with RID operation have also been performed. 6 refs., 3 figs

  9. Text-based CAPTCHAs over the years

    Science.gov (United States)

    Chow, Y. W.; Susilo, W.

    2017-11-01

    The notion of CAPTCHAs has been around for more than two decades. Since their introduction, CAPTCHAs have become a ubiquitous part of the Internet. Over the years, research on various aspects of CAPTCHAs has evolved and different design principles have emerged. This article discusses text-based CAPTCHAs in terms of their fundamental requirements, namely, security and usability. Practicality necessitates that humans must be able to correctly solve CAPTCHA challenges, while at the same time automated computer programs should have difficulty solving the challenges. This article also presents alternative paradigms to text-based CAPTCHA design that have been examined in previous work. With the advances in techniques to defeat CAPTCHAs, the future of automated Turing tests is an open question.

  10. Ordinary differential equations: a graduate text

    CERN Document Server

    Bhamra, K S

    2015-01-01

    ORDINARY DIFFERENTIAL EQUATIONS: A Graduate Text presents a systematic and comprehensive introduction to ODEs for graduate and postgraduate students. Differential inequalities, Gronwall's inequality, Nagumo's theorems, Osgood's criteria and applications of first-order differential equations are treated systematically and in greater depth. The book discusses qualitative and quantitative aspects of Sturm-Liouville problems, Green's functions, integral equations and the Laplace transform, and is supported by a number of worked-out examples in each lesson to make the concepts clear. Considerable stress is laid on stability theory, especially Lyapunov and Poincaré stability theory. Numerous figures in various lessons (in particular lessons dealing with stability theory) have been added to clarify the key concepts in DE theory. The treatment of nonlinear oscillations in conservative and Hamiltonian systems highlights the basic nature of the systems considered. The lesson on perturbation techniques deals in fairly d...

  11. Stemming of Slovenian library science texts

    Directory of Open Access Journals (Sweden)

    Polona Vilar

    2002-01-01

    Full Text Available The theme of the article is the preparation of a stemming algorithm for Slovenian library science texts. The procedure consisted of three phases: learning, testing and evaluation. The preparation of the Optimal stemmer for Slovenian texts from the field of library science is presented, along with its testing and comparison with two other stemmers for the Slovenian language: the Popovič stemmer and the Generic stemmer. A corpus of 790,000 words from the field of library science was used for learning. Lists of stems, word endings and stop-words were built. In the testing phase, the component parts of the algorithm were tested on an additional corpus of 167,000 words. In the evaluation phase, a comparison of the three stemmers processing the same word corpus was made. The results of each stemmer were compared with an intellectually prepared control result of the stemming of the corpus. It consisted of groups of semantically connected words with no errors. Understemming was especially monitored – the number of stems for semantically connected words, produced by an algorithm. The results were statistically processed with the Kruskal-Wallis test. The Optimal stemmer produced the best results. It matched best with the reference results and also gave the smallest number of stems for one semantic meaning. The Popovič stemmer followed closely. The Generic stemmer proved to be the least accurate. The procedures described in the thesis can represent a platform for the development of the tools for automatic indexing and retrieval for library science texts in the Slovenian language.
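
    The understemming measure described above (the number of distinct stems an algorithm produces for one group of semantically connected words) can be illustrated with a short Python sketch. The suffix list, the minimum stem length and the example word group are assumptions chosen for illustration; they are not the lists built in the study, and the stripping rule is a generic longest-suffix heuristic rather than any of the three stemmers compared.

```python
def strip_suffix(word, suffixes, min_stem=3):
    """Remove the longest matching suffix, keeping at least min_stem letters."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


def stems_per_group(groups, suffixes):
    """Map each group name to the number of distinct stems its words produce."""
    return {
        name: len({strip_suffix(w, suffixes) for w in words})
        for name, words in groups.items()
    }


# Hypothetical control group: inflected forms of one Slovenian noun.
control_groups = {
    "library": ["knjižnica", "knjižnice", "knjižnico", "knjižnicami"],
}
toy_suffixes = {"a", "e", "o", "ami"}

print(stems_per_group(control_groups, toy_suffixes))  # one stem per group is ideal
print(stems_per_group(control_groups, set()))         # no stemming: every form differs
```

    A count above one for a group signals understemming, so lower counts across the control groups correspond to the behaviour the abstract reports for the best-performing stemmer.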

  12. Reading an ESL Writer’s Text

    Directory of Open Access Journals (Sweden)

    Paul Kei Matsuda

    2011-03-01

    Full Text Available This paper focuses on reading as a central act of communication in the tutorial session. Writing center tutors without extensive experience reading writing by second language writers may have difficulty getting past the many differences in surface-level features, organization, and rhetorical moves. After exploring some of the sources of these differences in writing, the authors present strategies that writing tutors can use to work effectively with second language writers.

  13. Logistic regression: a self-learning text

    CERN Document Server

    Kleinbaum, David G

    1994-01-01

    This textbook provides students and professionals in the health sciences with a presentation of the use of logistic regression in research. The text is self-contained, and designed to be used either in class or as a tool for self-study. It arises from the author's many years of experience teaching this material and the notes on which it is based have been extensively used throughout the world.

  14. HPTA: High-Performance Text Analytics

    OpenAIRE

    Vandierendonck, Hans; Murphy, Karen; Arif, Mahwish; Nikolopoulos, Dimitrios S.

    2017-01-01

    One of the main targets of data analytics is unstructured data, which primarily involves textual data. High-performance processing of textual data is non-trivial. We present the HPTA library for high-performance text analytics. The library helps programmers to map textual data to a dense numeric representation, which can be handled more efficiently. HPTA encapsulates three performance optimizations: (i) efficient memory management for textual data, (ii) parallel computation on associative dat...
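
    The idea of mapping textual data to a dense numeric representation can be sketched as follows. This is a plain-Python illustration of the general technique only; it does not use or reproduce the HPTA library's API, and the function names and whitespace tokenization are assumptions.

```python
def build_vocabulary(documents):
    """Assign each distinct token a consecutive integer id, in first-seen order."""
    vocab = {}
    for doc in documents:
        for token in doc.lower().split():
            # len(vocab) is evaluated before insertion, so ids stay dense: 0, 1, 2, ...
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(document, vocab):
    """Replace known tokens with their ids; unknown tokens are skipped."""
    return [vocab[t] for t in document.lower().split() if t in vocab]


corpus = ["text mining of text", "mining data"]
vocab = build_vocabulary(corpus)
print(vocab)
print(encode("Text data mining", vocab))
```

    Because the ids are consecutive integers, later stages can index arrays directly instead of hashing strings repeatedly.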

  15. Revising strategies for different text types

    OpenAIRE

    Roussey, JY; Piolat, A; Guercin, F

    1990-01-01

    Forty-eight children and forty-eight adults of contrasting degrees of expertise made a series of corrections in order to improve a text (narrative or description) in which three within-statement errors and three between-statement errors had been inserted. Subjects used a simplified word processor (SCRIPREV) which recorded all movements of linguistic units. The purpose of this research was to study revising strategies by examining the correction-sequencing procedures implemented by these subje...

  16. Resource Lean and Portable Automatic Text Summarization

    OpenAIRE

    Hassel, Martin

    2007-01-01

    Today, with digitally stored information available in abundance, even for many minor languages, this information must by some means be filtered and extracted in order to avoid drowning in it. Automatic summarization is one such technique, where a computer summarizes a longer text to a shorter non-redundant form. Apart from the major languages of the world there are a lot of languages for which large bodies of data aimed at language technology research to a high degree are lacking. There migh...

  17. Knowledge Based Understanding of Radiology Text

    OpenAIRE

    Ranum, David L.

    1988-01-01

    A data acquisition tool which will extract pertinent diagnostic information from radiology reports has been designed and implemented. Pertinent diagnostic information is defined as that clinical data which is used by the HELP medical expert system. The program uses a memory based semantic parsing technique to “understand” the text. Moreover, the memory structures and lexicon necessary to perform this action are automatically generated from the diagnostic knowledge base by using a special purp...

  18. Texting your way to healthier eating?

    DEFF Research Database (Denmark)

    Pedersen, Susanne; Grønhøj, Alice; Thøgersen, John

    2016-01-01

    This study investigates the effects of a feedback intervention employing text messaging during 11 weeks on adolescents’ behavior, self-efficacy and outcome expectations regarding fruit and vegetable intake. A pre- and post-survey was completed by 1488 adolescents school-wise randomly allocated...... than 10% experienced a significant drop in outcome expectations. The findings suggest that participants’ active engagement in an intervention is crucial to its success. Implications for health-promoting interventions are discussed....

  19. Text Detection and Pose Estimation for a Reading Robot

    OpenAIRE

    Bulacu, Marius; Ezaki, Nobuo; Schomaker, Lambert

    2008-01-01

    One very important advantage of using CoCos for text detection is that they naturally allow the analysis to take place across scales. In this approach, scale does not represent such a problematic issue because the CoCo extraction process is scale independent. CoCos give a prompt, but rather imperfect, hold to the structures present in the image and CoCo selection

  20. Extracting BI-RADS Features from Portuguese Clinical Texts.

    Science.gov (United States)

    Nassif, Houssam; Cunha, Filipe; Moreira, Inês C; Cruz-Correia, Ricardo; Sousa, Eliana; Page, David; Burnside, Elizabeth; Dutra, Inês

    2012-01-01

    In this work we build the first BI-RADS parser for Portuguese free texts, modeled after existing approaches to extract BI-RADS features from English medical records. Our concept finder uses a semantic grammar based on the BI-RADS lexicon and on iterative transferred expert knowledge. We compare the performance of our algorithm to manual annotation by a specialist in mammography. Our results show that our parser's performance is comparable to the manual method.

  1. Text mining effectively scores and ranks the literature for improving chemical-gene-disease curation at the comparative toxicogenomics database.

    Directory of Open Access Journals (Sweden)

    Allan Peter Davis

    Full Text Available The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) is a public resource that curates interactions between environmental chemicals and gene products, and their relationships to diseases, as a means of understanding the effects of environmental chemicals on human health. CTD provides a triad of core information in the form of chemical-gene, chemical-disease, and gene-disease interactions that are manually curated from scientific articles. To increase the efficiency, productivity, and data coverage of manual curation, we have leveraged text mining to help rank and prioritize the triaged literature. Here, we describe our text-mining process that computes and assigns each article a document relevancy score (DRS), wherein a high DRS suggests that an article is more likely to be relevant for curation at CTD. We evaluated our process by first text mining a corpus of 14,904 articles triaged for seven heavy metals (cadmium, cobalt, copper, lead, manganese, mercury, and nickel). Based upon initial analysis, a representative subset corpus of 3,583 articles was then selected from this corpus and sent to five CTD biocurators for review. The resulting curation of these 3,583 articles was analyzed for a variety of parameters, including article relevancy, novel data content, interaction yield rate, mean average precision, and biological and toxicological interpretability. We show that for all measured parameters, the DRS is an effective indicator for scoring and improving the ranking of literature for the curation of chemical-gene-disease information at CTD. Here, we demonstrate how fully incorporating text mining-based DRS scoring into our curation pipeline enhances manual curation by prioritizing more relevant articles, thereby increasing data content, productivity, and efficiency.
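
    Average precision, one of the parameters named above, measures how well a score-ranked list places relevant articles near the top. The Python sketch below ranks articles by a relevancy score and computes average precision for the resulting order; the article identifiers, scores and relevance labels are invented for illustration and are not CTD data, and the scoring itself is taken as given rather than reproduced.

```python
def rank_by_score(scored_articles):
    """Sort (article_id, score, is_relevant) tuples by descending score."""
    return sorted(scored_articles, key=lambda item: item[1], reverse=True)


def average_precision(ranked_relevance):
    """Average of precision@k over the ranks k that hold a relevant article."""
    hits, precisions = 0, []
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical articles: (id, relevancy score, judged relevant by a curator).
articles = [("a", 0.2, False), ("b", 0.9, True), ("c", 0.5, True), ("d", 0.7, False)]
ranked = rank_by_score(articles)
print([article_id for article_id, _, _ in ranked])
print(average_precision([relevant for _, _, relevant in ranked]))
```

    Mean average precision is then the mean of this value over several ranked lists, for example one list per chemical, although the abstract does not state how the lists were grouped.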

  2. The interjection in old Romanian texts

    Directory of Open Access Journals (Sweden)

    Margareta Manu Magda

    2017-09-01

    Full Text Available The paper tries to identify the special problems posed by the study of the interjection, based on the examination of a corpus of texts from old Romanian (1600–1780), with reference to texts from modern Romanian. We observe how certain interjectional formations have acquired, through diachronic expansion, new grammatical, semantic and pragmatic values. The structure of the paper is the following: the introduction (§1) summarizes the author’s position on the status of the interjection category at a morphosyntactic, semantic and pragmatic level (§1.1) and on the relation between different linguistic structures and their grammaticalization / pragmaticalization process (§1.2). The second section (§2) refers to the specific routes followed by the evolution of the various categories of the analysed interjections, from old Romanian to modern Romanian: the presentatives adecă, iată, ni (§2.1), the hortatives haide, ni (§2.2), the addressing particles bre, măi (§2.3), the connectors with demarcation signal function adevăr, amin (§2.4). The third section (§3) has as its objective the description of a species of delocutive derivation, illustrated in Romanian by the lexicalized semantic variants of the secondary interjection Doamne!. The study concludes with several final considerations regarding the results of the research (§4).

  3. Speech Act Classification of German Advertising Texts

    Directory of Open Access Journals (Sweden)

    Артур Нарманович Мамедов

    2015-12-01

    Full Text Available This paper uses the theory of speech acts and the underlying concept of pragmalinguistics to determine the types of speech acts and their classification in German printed advertising texts. We ascertain that the advertising of cars and accessories, household appliances and computer equipment, watches, fancy goods, food, pharmaceuticals, and financial, insurance and legal services, as well as airline advertising, is dominated by a pragmatic principle based on demonstrating information about the benefits of a product or service. This influences the frequent usage of certain speech acts. The dominant form of exposure is to inform the recipient-user about the characteristics of the advertised product. This information is foregrounded by means of stylistic and syntactic constructions specific to the advertisement (participial constructions, appositional constructions), which help to emphasize certain notional components within the framework of the advertising text. Stylistic and syntactic devices of reduction (parceling constructions) convey the author's idea. Other means such as repetitions, enumerations, etc. are used by the advertiser to strengthen his selling power. The advertiser focuses the attention of the consumer on the characteristics of the product, seeking to convince him or her of the utility of the product and to influence his or her buying behavior.

  4. Effects of music on memory for text.

    Science.gov (United States)

    Purnell-Webb, Patricia; Speelman, Craig P

    2008-06-01

    Previous research has suggested that the use of song can facilitate recall of text. This study examined the effect of repetition of a melody across verses, familiarity with the melody, rhythm, and other structural processing hypotheses to explain this phenomenon. Two experiments were conducted, each with 100 participants recruited from undergraduate Psychology programs (44 men, 156 women, M age = 28.5 yr., SD = 9.4). In Exp. 1, participants learned a four-verse ballad in one of five encoding conditions (familiar melody, unfamiliar melody, unknown rhythm, known rhythm, and spoken). Exp. 2 assessed the effect of familiarity in rhythm-only conditions and of pre-exposure with a previously unfamiliar melody. Measures taken were number of verbatim words recalled and number of lines produced with correct syllabic structure. Analysis indicated that rhythm, with or without musical accompaniment, can facilitate recall of text, suggesting that rhythm may provide a schematic frame to which text can be attached. Similarly, familiarity with the rhythm or melody facilitated recall. Findings are discussed in terms of integration and dual-processing theories.

  5. Facilitating text reading in posterior cortical atrophy.

    Science.gov (United States)

    Yong, Keir X X; Rajdev, Kishan; Shakespeare, Timothy J; Leff, Alexander P; Crutch, Sebastian J

    2015-07-28

    We report (1) the quantitative investigation of text reading in posterior cortical atrophy (PCA), and (2) the effects of 2 novel software-based reading aids that result in dramatic improvements in the reading ability of patients with PCA. Reading performance, eye movements, and fixations were assessed in patients with PCA and typical Alzheimer disease and in healthy controls (experiment 1). Two reading aids (single- and double-word) were evaluated based on the notion that reducing the spatial and oculomotor demands of text reading might support reading in PCA (experiment 2). Mean reading accuracy in patients with PCA was significantly worse (57%) compared with both patients with typical Alzheimer disease (98%) and healthy controls (99%); spatial aspects of passages were the primary determinants of text reading ability in PCA. Both aids led to considerable gains in reading accuracy (PCA mean reading accuracy: single-word reading aid = 96%; individual patient improvement range: 6%-270%) and self-rated measures of reading. Data suggest a greater efficiency of fixations and eye movements under the single-word reading aid in patients with PCA. These findings demonstrate how neurologic characterization of a neurodegenerative syndrome (PCA) and detailed cognitive analysis of an important everyday skill (reading) can combine to yield aids capable of supporting important everyday functional abilities. This study provides Class III evidence that for patients with PCA, 2 software-based reading aids (single-word and double-word) improve reading accuracy. © 2015 American Academy of Neurology.

  6. PEDANT: Parallel Texts in Göteborg

    Directory of Open Access Journals (Sweden)

    Daniel Ridings

    2012-09-01

    Full Text Available

    The article presents the status of the PEDANT project with parallel corpora at the Language Bank at Göteborg University. The solutions for access to the corpus data are presented. Access is provided by way of the internet and standard applications and SGML-aware programming tools. The SGML format for encoding translation pairs is also outlined. The methods allow working with everything from plain text to texts densely encoded with linguistic information.

  7. Facilitating text reading in posterior cortical atrophy

    Science.gov (United States)

    Rajdev, Kishan; Shakespeare, Timothy J.; Leff, Alexander P.; Crutch, Sebastian J.

    2015-01-01

    Objective: We report (1) the quantitative investigation of text reading in posterior cortical atrophy (PCA), and (2) the effects of 2 novel software-based reading aids that result in dramatic improvements in the reading ability of patients with PCA. Methods: Reading performance, eye movements, and fixations were assessed in patients with PCA and typical Alzheimer disease and in healthy controls (experiment 1). Two reading aids (single- and double-word) were evaluated based on the notion that reducing the spatial and oculomotor demands of text reading might support reading in PCA (experiment 2). Results: Mean reading accuracy in patients with PCA was significantly worse (57%) compared with both patients with typical Alzheimer disease (98%) and healthy controls (99%); spatial aspects of passages were the primary determinants of text reading ability in PCA. Both aids led to considerable gains in reading accuracy (PCA mean reading accuracy: single-word reading aid = 96%; individual patient improvement range: 6%–270%) and self-rated measures of reading. Data suggest a greater efficiency of fixations and eye movements under the single-word reading aid in patients with PCA. Conclusions: These findings demonstrate how neurologic characterization of a neurodegenerative syndrome (PCA) and detailed cognitive analysis of an important everyday skill (reading) can combine to yield aids capable of supporting important everyday functional abilities. Classification of evidence: This study provides Class III evidence that for patients with PCA, 2 software-based reading aids (single-word and double-word) improve reading accuracy. PMID:26138948

  8. Three Writers of Arabic Texts in Yogyakarta

    Directory of Open Access Journals (Sweden)

    Muhamad Murtadlo

    2015-02-01

    Full Text Available This study examines the use of the Arabic alphabet in religious literature in Yogyakarta. It is a case study of three writers of religious texts who use the Arabic alphabet in the southern part of Central Java (Yogyakarta): Asrori Ahmad (Magelang), Ali Maksum (Yogyakarta), and Ahmad Mujab Mahalli (Bantul). The study concludes that religious texts in the Arabic alphabet in southern Java have mostly been written in Arabic Pegon, and that only a few people wrote in the Arabic language. The transmission of Arabic Pegon to Yogyakarta is thought to have come from the north coast of Java, especially from Lasem / East Java. Arabic language teaching in the pesantrens still focuses mostly on reading, communication, and understanding, and is not oriented toward writing skills. The presence of international journals initiated by Islamic religious colleges and the Arabic translation efforts of certain institutions offer an opportunity to strengthen the use of the Arabic alphabet in Indonesia.

  9. Sentiment topic mining based on comment tags

    Science.gov (United States)

    Zhang, Daohai; Liu, Xue; Li, Juan; Fan, Mingyue

    2018-03-01

    With the development of e-commerce, large numbers of tag-based comments are generated, and extracting valuable information from these comment tags has become an important part of business management decisions. This study takes HUAWEI mobile phone tags as an example and applies sentiment analysis and LDA topic mining. The first step is data preprocessing and topic mining of the comment tags. The comment tags are then classified by sentiment. Finally, the comments are mined again to analyze the distribution of emotional themes under each sentiment class. The results show that the HUAWEI mobile phone offers a good user experience in terms of fluency, cost performance, appearance, etc. Meanwhile, the company should pay more attention to independent research and development and to product design and development. In addition, battery and speed performance should be enhanced.

  10. Visualizing the semantic content of large text databases using text maps

    Science.gov (United States)

    Combs, Nathan

    1993-01-01

    A methodology for generating text map representations of the semantic content of text databases is presented. Text maps provide a graphical metaphor for conceptualizing and visualizing the contents and data interrelationships of large text databases. Described are a set of experiments conducted against the TIPSTER corpora of Wall Street Journal articles. These experiments provide an introduction to current work in the representation and visualization of documents by way of their semantic content.

  11. Learning from Conflicting Texts: The Role of Intertextual Conflict Resolution in Between-Text Integration

    Science.gov (United States)

    Kobayashi, Keiichi

    2015-01-01

    The present study examined the effect of intertextual conflict resolution on learning from conflicting texts. In two experiments, participants read sets of two texts under the condition of being encouraged either to resolve a conflict between the texts' arguments (the resolution condition) or to comprehend the arguments (the comprehension…

  12. Text Skimming: The Process and Effectiveness of Foraging through Text under Time Pressure

    Science.gov (United States)

    Duggan, Geoffrey B.; Payne, Stephen J.

    2009-01-01

    Is skim reading effective? How do readers allocate their attention selectively? The authors report 3 experiments that use expository texts and allow readers only enough time to read half of each document. Experiment 1 found that, relative to reading half the text, skimming improved memory for important ideas from a text but did not improve memory…

  13. Interview als Text vs. Interview als Interaktion

    Directory of Open Access Journals (Sweden)

    Arnulf Deppermann

    2013-09-01

    Full Text Available The interview remains the most popular method of data collection in the social sciences. Its appeal lies in the economy of data collection, comparability, and the possibility of gaining insight into areas of practice and historical-biographical dimensions that are hardly accessible to direct observation. At the same time, there is a growing body of criticism that questions its capabilities by pointing to the limited reach of respondents' ability to explicate, the reactivity of the data collection, or the difference between action and reports about action. The article distinguishes between approaches that understand the interview as text and those that understand it as interaction. On the text understanding, interviews are analyzed with regard to content and treated as access to a pre-existing social or psychological reality. The interaction understanding, by contrast, views interviews as situated practice in which interviewers and respondents jointly produce social structures of meaning in the here and now. Using ubiquitous phenomena of interview interaction (questions, answers, and the self-positioning of interviewers and respondents), practices of interactive-performative action in the interview are presented. Their relevance for the constitution of the interview and their epistemic potential for interview analysis are demonstrated. The article argues for investigating the interactive constitution of interviews empirically and for taking it into account consistently in methodology. URN: http://nbn-resolving.de/urn:nbn:de:0114-fqs1303131

  14. Layout-aware text extraction from full-text PDF of scientific articles

    Directory of Open Access Journals (Sweden)

    Ramakrishnan Cartic

    2012-05-01

    Full Text Available Background: The Portable Document Format (PDF) is the most commonly used file format for online scientific publications. The absence of effective means to extract text from these PDF files in a layout-aware manner presents a significant challenge for developers of biomedical text mining or biocuration informatics systems that use published literature as an information source. In this paper we introduce the ‘Layout-Aware PDF Text Extraction’ (LA-PDFText) system to facilitate accurate extraction of text from PDF files of research articles for use in text mining applications. Results: Our paper describes the construction and performance of an open source system that extracts text blocks from PDF-formatted full-text research articles and classifies them into logical units based on rules that characterize specific sections. The LA-PDFText system focuses only on the textual content of the research articles and is meant as a baseline for further experiments into more advanced extraction methods that handle multi-modal content, such as images and graphs. The system works in a three-stage process: (1) detecting contiguous text blocks using spatial layout processing to locate and identify blocks of contiguous text, (2) classifying text blocks into rhetorical categories using a rule-based method and (3) stitching classified text blocks together in the correct order, resulting in the extraction of text from section-wise grouped blocks. We show that our system can identify text blocks and classify them into rhetorical categories with Precision = 0.96%, Recall = 0.89% and F1 = 0.91%. We also present an evaluation of the accuracy of the block detection algorithm used in step 2. Additionally, we have compared the accuracy of the text extracted by LA-PDFText to the text from the Open Access subset of PubMed Central. We then compared this accuracy with that of the text extracted by the PDF2Text system, commonly used to extract text from PDF

  15. Methods for Mining and Summarizing Text Conversations

    CERN Document Server

    Carenini, Giuseppe; Murray, Gabriel

    2011-01-01

    Due to the Internet Revolution, human conversational data -- in written forms -- are accumulating at a phenomenal rate. At the same time, improvements in speech technology enable many spoken conversations to be transcribed. Individuals and organizations engage in email exchanges, face-to-face meetings, blogging, texting and other social media activities. The advances in natural language processing provide ample opportunities for these "informal documents" to be analyzed and mined, thus creating numerous new and valuable applications. This book presents a set of computational methods

  16. CCM: A Text Classification Method by Clustering

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    In this paper, a new Cluster based Classification Model (CCM) for suspicious email detection and other text classification tasks is presented. Comparative experiments of the proposed model against traditional classification models and the boosting algorithm are also discussed. Experimental results show that the CCM outperforms traditional classification models as well as the boosting algorithm for the task of suspicious email detection on a terrorism-domain email dataset and for topic categorization on the Reuters-21578 and 20 Newsgroups datasets. The overall finding is that applying a cluster based...

  17. Intertext: On Connecting Text in the Building Process

    DEFF Research Database (Denmark)

    Christensen, Lars Rune

    2015-01-01

    Actors in the building process are critically dependent on a corpus of written text that draws the distributed work tasks together. This paper introduces, on the basis of a field study, the concepts of corpus, intertext and intertextuality to the analysis of text in cooperative work practice. This paper shows that actors in the building process create intertext (connections) between complementary texts, in a particular situation and for a particular task. This has an integrating effect on the building process. Several types of intertextuality, including the complementary type, the intratextual type and the mediated type, may constitute the intertext of a particular task. By employing the concepts of corpus, intertext and intertextuality with respect to the study of the building process, this paper outlines an approach to the investigation of text in cooperative work.

  18. Text Mining Improves Prediction of Protein Functional Sites

    Science.gov (United States)

    Cohn, Judith D.; Ravikumar, Komandur E.

    2012-01-01

    We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions. PMID:22393388
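
    The text-mining step described above (extracting residue mentions from the literature) can be illustrated with a minimal sketch. This is not the LEAP-FS implementation: the regular expression, the function name extract_residue_mentions and the example sentence are assumptions made purely for illustration, and a real system would need to handle one-letter codes, full residue names and many false positives.

```python
import re

# Three-letter codes of the 20 standard amino acids.
AMINO_ACIDS = (
    "Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|"
    "Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val"
)

# Hypothetical pattern (an assumption, not taken from the paper):
# a three-letter code, an optional hyphen or space, then a sequence position.
RESIDUE_PATTERN = re.compile(rf"\b({AMINO_ACIDS})[- ]?(\d+)\b")


def extract_residue_mentions(text):
    """Return the sorted, de-duplicated (amino_acid, position) pairs found in text."""
    mentions = set()
    for match in RESIDUE_PATTERN.finditer(text):
        mentions.add((match.group(1), int(match.group(2))))
    return sorted(mentions)


# Illustrative sentence, written for this sketch.
sentence = "The catalytic triad comprises Ser195, His-57 and Asp 102."
print(extract_residue_mentions(sentence))
```

    Matching is case-sensitive on purpose, so that ordinary words such as "his" followed by a number are less likely to be picked up as histidine mentions.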

  19. Enhancing biomedical text summarization using semantic relation extraction.

    Directory of Open Access Journals (Sweden)

    Yue Shang

    Full Text Available Automatic text summarization for a biomedical concept can help researchers get the key points of a topic efficiently from a large amount of biomedical literature. In this paper, we present a method for generating a text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: (1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. (2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. (3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate a text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and that our results are better than those of the MEAD system, a well-known tool for text summarization.

  20. OntoGene web services for biomedical text mining.

    Science.gov (United States)

    Rinaldi, Fabio; Clematide, Simon; Marques, Hernani; Ellendorff, Tilia; Romacker, Martin; Rodriguez-Esteban, Raul

    2014-01-01

    Text mining services are rapidly becoming a crucial component of various knowledge management pipelines, for example in the process of database curation, or for exploration and enrichment of biomedical data within the pharmaceutical industry. Traditional architectures, based on monolithic applications, do not offer sufficient flexibility for a wide range of use case scenarios, and therefore open architectures, as provided by web services, are attracting increased interest. We present an approach towards providing advanced text mining capabilities through web services, using a recently proposed standard for textual data interchange (BioC). The web services leverage a state-of-the-art platform for text mining (OntoGene) which has been tested in several community-organized evaluation challenges, with top-ranked results in several of them.

  1. A Study on Text-Score Disagreement in Online Reviews

    DEFF Research Database (Denmark)

    Fazzolari, Michela; Cozza, Vittoria; Petrocchi, Marinella

    2017-01-01

    ...expressing different sentiments may feature the same score (and vice-versa), and (2) detecting and analyzing the mismatches between the review content and the actual score may benefit both service providers and consumers, by highlighting specific factors of satisfaction (and dissatisfaction) in texts. To prove the intuitions, we adopt sentiment analysis techniques and we concentrate on hotel reviews, to find polarity mismatches therein. In particular, we first train a text classifier with a set of annotated hotel reviews, taken from the Booking website. Then, we analyze a large dataset, with around 160k... between the text polarity and the score, we find that, on a scale of five stars, those reviews ranked with middle scores include a mixture of positive and negative aspects. The approach proposed here, besides acting as a polarity detector, provides an effective selection of reviews, on an initial very large...

  2. Understanding iconic texts from Spanish-literature course in pre-university

    Directory of Open Access Journals (Sweden)

    Marialina Ana García Escobio

    2014-09-01

    Full Text Available This article addresses the development of comprehension skills, one of the essential objectives of mother-tongue teaching. Understanding the image as an iconic sign requires treating it as a system of meaning, but also maintaining its particular difference from purely denotative structures, especially from the model of all semiotics: the linguistic sign. In this process of understanding it is necessary to teach students to decode texts in different codes and to place them in more complex communicative situations. The present proposal is therefore new: a way to understand iconic texts through the teaching of Spanish and Literature in pre-university, so that iconic communication becomes common to all subjects. The proposal also adopts the historical and cultural approach of Vigotsky and his followers, and the developing didactics presented by Castellanos and others.

  3. On the origin of long-range correlations in texts.

    Science.gov (United States)

    Altmann, Eduardo G; Cristadoro, Giampaolo; Esposti, Mirko Degli

    2012-07-17

    The complexity of human interactions with social and natural phenomena is mirrored in the way we describe our experiences through natural language. In order to retain and convey such high-dimensional information, the statistical properties of our linguistic output have to be highly correlated in time. One example is the robust observation, still largely not understood, of correlations on arbitrarily long scales in literary texts. In this paper we explain how long-range correlations flow from highly structured linguistic levels down to the building blocks of a text (words, letters, etc.). By combining calculations and data analysis we show that correlations take the form of a bursty sequence of events once we approach the semantically relevant topics of the text. The mechanisms we identify are fairly general and can be equally applied to other hierarchical settings.

  4. Measuring complexity with multifractals in texts. Translation effects

    International Nuclear Information System (INIS)

    Ausloos, M.

    2012-01-01

    Highlights: ► Two texts in English and one in Esperanto are transformed into 6 time series. ► D(q) and f(α) of such (and shuffled) time series are obtained. ► A model for text construction is presented based on a parametrized Cantor set. ► The model parameters can also be used when examining machine translated texts. ► Suggested extensions to higher dimensions: in 2D image analysis and on hypertexts. - Abstract: Should quality be almost synonymous with complexity? Measuring quality appears audacious, even very subjective. It is proposed here to use a multifractal approach in order to quantify quality through complexity measures. A one-dimensional system is examined. It is known that (all) written texts can be one-dimensional nonlinear maps. Thus, several written texts by the same author are considered, together with their translation into an unusual language, Esperanto, and, as a baseline, their corresponding shuffled versions. Different one-dimensional time series can be used: e.g. (i) one based on word lengths, (ii) the other based on word frequencies; both are used for studying, comparing and discussing the map structure. It is shown that a variety in style can be measured through the D(q) and f(α) curves characterizing multifractal objects. This makes it possible to observe, on the one hand, whether natural and artificial languages significantly influence the writing and the translation, and, on the other, whether one author’s texts differ technically from each other. In fact, the f(α) curves of the original texts are similar to each other, but the translated text shows marked differences. However, in each case the f(α) curves are far from being parabolic, in contrast to the shuffled texts. Moreover, the Esperanto text has more extreme values. Criteria are thereby suggested for estimating text quality, as if the text were a time series only. A model is introduced in order to substantiate the findings: it consists in considering a text as a random Cantor set
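
    As a rough illustration of how a text can be turned into the two kinds of one-dimensional series mentioned in this abstract (word lengths and word frequencies), consider the sketch below. The abstract does not give the exact definitions, so the tokenization rule, the function names and the choice of "frequency" as the number of occurrences of each word in the whole text are assumptions of this sketch, not the author's procedure. The multifractal analysis itself (D(q), f(α)) is not attempted here.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words made only of letters (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def word_length_series(text):
    """Series (i): the length, in letters, of each successive word."""
    return [len(word) for word in tokenize(text)]


def word_frequency_series(text):
    """Series (ii): for each successive word, its count in the whole text."""
    words = tokenize(text)
    counts = Counter(words)
    return [counts[word] for word in words]


# Illustrative sentence, written for this sketch.
sample = "The cat chased the mouse"
print(word_length_series(sample))
print(word_frequency_series(sample))
```

    The letter-only pattern is used so that accented letters, such as those of Esperanto, are kept inside words rather than splitting them.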

  5. Texte, Mathématiques, Philosophie et Sujet

    Directory of Open Access Journals (Sweden)

    Jean-Michel Salanskis

    2004-04-01

    Full Text Available Two reflections are carried out in this article. The first tries to assess the relationship between philosophy and its textualization in light of the relationship between mathematics and its textualization, at three levels: (1) trying to draw, from the ways in which the mathematical text exceeds its logical form, lessons about the relevance and viability of reducing the philosophical text to its logical form; (2) posing the problem of an externalist study of the philosophical text in light of the particular difficulties raised by the externalist approach to the mathematical text; (3) examining the hybridization of the philosophical and the mathematical in certain texts. The second deals with a particular aspect of textualization: the intervention of the marker of the subject of enunciation (« Je », "I") in philosophical texts.

  6. Computational text analysis and reading comprehension exam complexity towards automatic text classification

    CERN Document Server

    Liontou, Trisevgeni

    2014-01-01

    This book delineates a range of linguistic features that characterise the reading texts used at the B2 (Independent User) and C1 (Proficient User) levels of the Greek State Certificate of English Language Proficiency exams in order to help define text difficulty per level of competence. In addition, it examines whether specific reader variables influence test takers' perceptions of reading comprehension difficulty. The end product is a Text Classification Profile per level of competence and a formula for automatically estimating text difficulty and assigning levels to texts consistently and re

  7. Layout-aware text extraction from full-text PDF of scientific articles.

    Science.gov (United States)

    Ramakrishnan, Cartic; Patnia, Abhishek; Hovy, Eduard; Burns, Gully Apc

    2012-05-28

    The Portable Document Format (PDF) is the most commonly used file format for online scientific publications. The absence of effective means to extract text from these PDF files in a layout-aware manner presents a significant challenge for developers of biomedical text mining or biocuration informatics systems that use published literature as an information source. In this paper we introduce the 'Layout-Aware PDF Text Extraction' (LA-PDFText) system to facilitate accurate extraction of text from PDF files of research articles for use in text mining applications. Our paper describes the construction and performance of an open source system that extracts text blocks from PDF-formatted full-text research articles and classifies them into logical units based on rules that characterize specific sections. The LA-PDFText system focuses only on the textual content of the research articles and is meant as a baseline for further experiments into more advanced extraction methods that handle multi-modal content, such as images and graphs. The system works in a three-stage process: (1) detecting contiguous text blocks using spatial layout processing to locate and identify blocks of contiguous text, (2) classifying text blocks into rhetorical categories using a rule-based method and (3) stitching classified text blocks together in the correct order, resulting in the extraction of text from section-wise grouped blocks. We show that our system can identify text blocks and classify them into rhetorical categories with Precision = 0.96%, Recall = 0.89% and F1 = 0.91%. We also present an evaluation of the accuracy of the block detection algorithm used in step 2. Additionally, we have compared the accuracy of the text extracted by LA-PDFText to the text from the Open Access subset of PubMed Central. We then compared this accuracy with that of the text extracted by the PDF2Text system, commonly used to extract text from PDF. Finally, we discuss preliminary error analysis for

  8. Validating presupposed versus focused text information.

    Science.gov (United States)

    Singer, Murray; Solar, Kevin G; Spear, Jackie

    2017-04-01

    There is extensive evidence that readers continually validate discourse accuracy and congruence, but that they may also overlook conspicuous text contradictions. Validation may be thwarted when the inaccurate ideas are embedded sentence presuppositions. In four experiments, we examined readers' validation of presupposed ("given") versus new text information. Throughout, a critical concept, such as a truck versus a bus, was introduced early in a narrative. Later, a character stated or thought something about the truck, which therefore matched or mismatched its antecedent. Furthermore, truck was presented as either given or new information. Mismatch target reading times uniformly exceeded the matching ones by similar magnitudes for given and new concepts. We obtained this outcome using different grammatical constructions and with different antecedent-target distances. In Experiment 4, we examined only given critical ideas, but varied both their matching and the main verb's factivity (e.g., factive know vs. nonfactive think). The Match × Factivity interaction closely resembled that previously observed for new target information (Singer, 2006). Thus, readers can successfully validate given target information. Although contemporary theories tend to emphasize either deficient or successful validation, both types of theory can accommodate the discourse and reader variables that may regulate validation.

  9. Robust keyword retrieval method for OCRed text

    Science.gov (United States)

    Fujii, Yusaku; Takebe, Hiroaki; Tanaka, Hiroshi; Hotta, Yoshinobu

    2011-01-01

    Document management systems have become important because of the growing popularity of electronic filing of documents and scanning of books, magazines, manuals, etc., through a scanner or a digital camera, for storage or reading on a PC or an electronic book. Text information acquired by optical character recognition (OCR) is usually added to the electronic documents for document retrieval. Since texts generated by OCR generally include character recognition errors, robust retrieval methods have been introduced to overcome this problem. In this paper, we propose a retrieval method that is robust against both character segmentation and recognition errors. In the proposed method, the insertion of noise characters and dropping of characters in the keyword retrieval enables robustness against character segmentation errors, and character substitution in the keyword of the recognition candidate for each character in OCR or any other character enables robustness against character recognition errors. The recall rate of the proposed method was 15% higher than that of the conventional method. However, the precision rate was 64% lower.
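
    The general idea of tolerating inserted, dropped and substituted characters when searching OCRed text can be sketched with standard approximate substring matching (edit distance against any substring of the text). This is a generic textbook technique swapped in for illustration, not the method proposed in the paper, which works with the OCR recognition candidates; the function names and the max_errors parameter are assumptions of this sketch.

```python
def best_match_distance(keyword, text):
    """Smallest edit distance between keyword and any substring of text."""
    # prev[j] is the cost of aligning the keyword prefix handled so far with
    # some substring of text that ends at position j. An empty prefix matches
    # anywhere at zero cost, which lets a match start at any position.
    prev = [0] * (len(text) + 1)
    for i, key_char in enumerate(keyword, start=1):
        curr = [i]  # i keyword characters against no text at all
        for j, text_char in enumerate(text, start=1):
            substitution = 0 if key_char == text_char else 1
            curr.append(min(
                prev[j - 1] + substitution,  # match, or substituted character
                prev[j] + 1,                 # keyword character missing in text
                curr[j - 1] + 1,             # extra (noise) character in text
            ))
        prev = curr
    return min(prev)


def contains_keyword(keyword, text, max_errors=1):
    """True if keyword occurs in text with at most max_errors edit operations."""
    return best_match_distance(keyword, text) <= max_errors


# Illustrative OCR-style errors, written for this sketch.
print(contains_keyword("retrieval", "document retrieva1 system"))  # substitution
print(contains_keyword("retrieval", "document retr ieval system"))  # inserted space
print(contains_keyword("retrieval", "document retieval system"))    # dropped letter
```

    Raising max_errors finds more damaged occurrences of a keyword but also admits more false hits, which mirrors the recall and precision trade-off reported in the abstract.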

  10. Marketing the sun: Text of presentations

    International Nuclear Information System (INIS)

    1995-04-01

    The title conference and the texts of the presentations give insight into the activities and developments in the field of solar energy and into the parties involved. The increased interest in the application of solar energy is reflected in the subjects dealt with, which are aimed more at the market than at research. Three parallel sessions and one plenary session were held. In each parallel session 20 presentations were given. Session one covered market development of thermal solar energy, new housing construction in series, minimal energy houses, new solar water heaters, and technical and economical options in sunny countries. Session two covered policy, maintenance and renovation of existing houses, solar energy and planning, grid-connected photovoltaic systems, and perspectives and experiences in sunny countries. Session three covered solar cells, autonomous photovoltaic systems, future developments of thermal solar energy, solar architecture, and the market for photovoltaic systems in sunny countries. In the plenary session 3 presentations were held on the market for solar energy in newly built houses. Apart from the text of the session presentations, 43 posters are presented on the subjects thermal solar energy (8), solar cells (9), autonomous photovoltaic systems (2), grid-connected PV systems (14), policy (2), integration (6), and sunny countries (1). tabs., figs., refs

  11. Text and Subject Position after Althusser

    Directory of Open Access Journals (Sweden)

    Antony Easthope

    1994-01-01

    Full Text Available Althusser's achievement is that he redefined Marxism. He reconceptualizes history and totality in terms of different times, construes knowledge as the outcome of a process of construction, and interprets subjectivity as an effect of ideology and unconscious processes. Unfortunately, Althusser's functionalist view of ideology claims that the subject recognizes itself as a subject because it duplicates (reflects) an absolute subject. However, Lacan's notion of the mirror stage remedies this fault. Lacan's subject always misrecognizes itself in a process of contradiction that threatens the stability of any given social order. Moreover, unlike Foucault's subject, which is limited in that subjectivity is folded back into a vaguely expanded notion of "power," this revised Althusserian subject allows careful reading of texts. The critic does not simply read against the grain; he or she exposes the multiple points of identification offered the reader. For example, Wordsworth's "The Solitary Reaper" installs the reader in multiple positions: a devotee of high culture and the national canon, a lover of the verbal signifier and its play, a consumer of confessional discourse, and a masculine "I" desiring a laboring, singing woman.

  12. General description of magnetic fluctuations in TEXT

    International Nuclear Information System (INIS)

    Kim, Y.J.

    1989-01-01

    The magnetic fluctuations in TEXT (R = 1 m, a = 0.26 m, ohmically heated tokamak with a full poloidal limiter) have been extensively measured with magnetic probes in the shadow of the limiter with an instrumental range of f -1 (m [...] $B_p^{\mathrm{rms}}$ (f > 50 kHz) at the limiter radius is found to be of order $10^{-5}$ T, which is too small to produce significant transport directly. Over the range of discharge parameters in TEXT, $B_p^{\mathrm{rms}}$ (f > 50 kHz) is observed to have a strong $q_a$ dependence ($q_a^{-2.2}$) and also a density dependence ($n_{e0}^{-0.8}$). Furthermore, the magnetic fluctuations show a significant correlation with edge electrostatic density fluctuations measured by Langmuir probe inside the limiter radius, and extending along magnetic field lines. Phase variation of the correlated components suggests $k_\parallel / k_\perp \sim 0.005$. $B_p^{\mathrm{rms}}$ (f > 50 kHz) is also found to depend little on the parallel electric field $E_\parallel$. Magnetic fluctuations in both low and high frequency ranges have been characterized by their response to gas puffing, pellet injection, impurity injection, and the effect of an ergodic magnetic limiter. The behavior of magnetic fluctuations with electron cyclotron resonance heating (ECRH) has also been investigated in detail

  13. El manual como texto (Schoolbook as text)

    Directory of Open Access Journals (Sweden)

    Agustín Escolano Benito

    2012-12-01

    Full Text Available This paper discusses the question of identifying a coursebook as a specific text genre in the context of classical and modern manualistics, situating the analysis within the traditional school culture and the digital revolution era, under a historiographical and theoretical perspective. It also covers the birth and initial development of manualistics as an intellectual and academic field and its contributions to the definition of the schoolbook identity.

  14. Challenges in the interpretation of lyric texts

    Directory of Open Access Journals (Sweden)

    Buljan-Legati Ivana

    2016-01-01

    Full Text Available It may be possible to find the right path to answering how poetry has been disappearing over the centuries, losing its purpose in the ever greater void of outer space, and how it has turned from a common and welcome social activity into a phenomenon that will have to leave its fellow townspeople because of enormous suspicion of the communal language, the world view of the majority and the material world, provided that one first makes at least a rough reconstruction of the sense and nature of the continual changes in the poetic mechanism (a more detailed overview would extend the paper enormously), as well as of the changes in style and in the reception of poetry, since each choice of a possible linguistic system in a particular historical period soon heralded its own boundary line. From a popular, entertaining and educational genre that served as a transparent means of social communication, bringing the individual into a community by generating stable certainty and giving him a sense of control over his own destiny and meaning, lyric poetry will proportionally outgrow the aesthetic dimensions of its texts (which will subsequently substitute the foretoken of literacy), becoming less comparable and surmountable, and in certain periods an almost nontransferable artistic view. In such circumstances, the public will start to have less understanding and tolerance for its 'weaknesses'.

  15. VisualUrText: A Text Analytics Tool for Unstructured Textual Data

    Science.gov (United States)

    Zainol, Zuraini; Jaymes, Mohd T. H.; Nohuddin, Puteri N. E.

    2018-05-01

    The amount of unstructured text on the Internet is growing tremendously. Text repositories come from Web 2.0, business intelligence and social networking applications. It is also believed that 80-90% of future data growth will be in the form of unstructured text databases that may potentially contain interesting patterns and trends. Text Mining is a well-known technique for discovering interesting patterns and trends, which constitute non-trivial knowledge, from massive unstructured text data. Text Mining covers multidisciplinary fields involving information retrieval (IR), text analysis, natural language processing (NLP), data mining, machine learning, statistics and computational linguistics. This paper discusses the development of a text analytics tool that is proficient in extracting, processing and analyzing unstructured text data and in visualizing the cleaned text data in multiple forms such as Document Term Matrix (DTM), Frequency Graph, Network Analysis Graph, Word Cloud and Dendrogram. This tool, VisualUrText, is developed to assist students and researchers in extracting interesting patterns and trends in document analyses.
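
    As an illustration of the Document Term Matrix mentioned in this abstract, the following minimal sketch builds one from two short strings. It is not VisualUrText's implementation: the function names, the tiny stop-word list and the sample documents are all assumptions made for the example.

```python
from collections import Counter
import re

# Tiny illustrative stop-word list (an assumption, not the tool's list).
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def document_term_matrix(docs):
    """Return (vocab, matrix), where matrix[i][j] counts term vocab[j] in docs[i]."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix

# Hypothetical sample documents.
docs = ["Text mining finds patterns in text.", "Mining the data finds trends."]
vocab, dtm = document_term_matrix(docs)
```

    Each row of `dtm` is one document and each column one vocabulary term; views such as a frequency graph or a word cloud can be derived from the column totals.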

  16. Mobile text messaging solutions for obesity prevention

    Science.gov (United States)

    Akopian, David; Jayaram, Varun; Aaleswara, Lakshmipathi; Esfahanian, Moosa; Mojica, Cynthia; Parra-Medina, Deborah; Kaghyan, Sahak

    2011-02-01

    Cellular telephony has become a bright example of the co-evolution of human society and information technology. This trend has also been reflected in health care and health promotion projects that included cell phones in the data collection and communication chain. While many successful projects have been realized, a review of phone-based data collection techniques reveals that the existing technologies do not completely address health promotion research needs. The paper presents approaches that close this gap by extending existing versatile platforms. The messaging systems are designed for health-promotion research to prevent obesity and obesity-related health disparities among low-income Latino adolescent girls. Messaging and polling mechanisms are used to communicate and automatically process response data for the target constituency. Preliminary survey data provide insight into phone availability and technology perception for the study group.

  17. Text Mining Effectively Scores and Ranks the Literature for Improving Chemical-Gene-Disease Curation at the Comparative Toxicogenomics Database

    Science.gov (United States)

    Johnson, Robin J.; Lay, Jean M.; Lennon-Hopkins, Kelley; Saraceni-Richards, Cynthia; Sciaky, Daniela; Murphy, Cynthia Grondin; Mattingly, Carolyn J.

    2013-01-01

    The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) is a public resource that curates interactions between environmental chemicals and gene products, and their relationships to diseases, as a means of understanding the effects of environmental chemicals on human health. CTD provides a triad of core information in the form of chemical-gene, chemical-disease, and gene-disease interactions that are manually curated from scientific articles. To increase the efficiency, productivity, and data coverage of manual curation, we have leveraged text mining to help rank and prioritize the triaged literature. Here, we describe our text-mining process that computes and assigns each article a document relevancy score (DRS), wherein a high DRS suggests that an article is more likely to be relevant for curation at CTD. We evaluated our process by first text mining a corpus of 14,904 articles triaged for seven heavy metals (cadmium, cobalt, copper, lead, manganese, mercury, and nickel). Based upon initial analysis, a representative subset corpus of 3,583 articles was then selected from this corpus and sent to five CTD biocurators for review. The resulting curation of these 3,583 articles was analyzed for a variety of parameters, including article relevancy, novel data content, interaction yield rate, mean average precision, and biological and toxicological interpretability. We show that for all measured parameters, the DRS is an effective indicator for scoring and improving the ranking of literature for the curation of chemical-gene-disease information at CTD. Here, we demonstrate how fully incorporating text mining-based DRS scoring into our curation pipeline enhances manual curation by prioritizing more relevant articles, thereby increasing data content, productivity, and efficiency. PMID:23613709
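
    The abstract does not say how the DRS itself is computed, so the sketch below covers only the two steps it does name: ranking articles by a score and evaluating the ranking with average precision. The article records, their `drs` values and the relevance labels are invented for illustration, and the function names are assumptions.

```python
def rank_by_score(articles):
    """Sort articles so that the highest document relevancy score comes first."""
    return sorted(articles, key=lambda a: a["drs"], reverse=True)

def average_precision(ranked_relevance):
    """Average of precision@k taken at each rank k that holds a relevant item."""
    hits = 0
    total = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

# Hypothetical articles: scores and curator judgements are made up.
articles = [
    {"id": "A", "drs": 90, "relevant": True},
    {"id": "B", "drs": 10, "relevant": False},
    {"id": "C", "drs": 50, "relevant": True},
    {"id": "D", "drs": 70, "relevant": False},
]
ranked = rank_by_score(articles)
ap = average_precision([a["relevant"] for a in ranked])
```

    Mean average precision is then the mean of such values over several ranked lists, for instance one list per chemical, which is an assumption about how the lists might be grouped.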

  18. Scaling laws for TEXT plasma profiles

    International Nuclear Information System (INIS)

    McCool, S.C.; Bravenec, R.V.; Chen, J.Y.; Foster, M.S.; Li, W.L.; Ouroura, A.; Phillips, P.E.; Richards, B.; Wenzel, K.W.; Zhang, Z.M.

    1994-01-01

    Regression analysis has been performed on a number of measured profiles including temperature and density vs. nominal macroscopic operating parameters for TEXT tokamak (pre-upgrade) ohmic plasmas. The resulting simple empirical model has enabled the authors to quickly approximate profiles of electron temperature and density, ion temperature, and soft x-ray brightness, as well as the scalar quantities: total radiated power, q=1 radius, sawtooth period and amplitude, and energy confinement time as a power law of toroidal field, plasma current, chord average density, and fueling gas atomic weight. The model profiles are only applicable to the plasma interior, i.e. within the limiter radius. In most cases the predicted model profiles are within the experimental error bars of measured profiles and are more accurate at predicting profile variation for small operating parameter changes than the measured profiles
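
    A power-law scaling of the kind described here, y = C · x1^a1 · x2^a2 · ..., becomes linear after taking logarithms, so it can be fitted by ordinary least squares on log y. The sketch below assumes that approach, which the abstract does not confirm, and uses synthetic data with invented exponents rather than TEXT measurements. The names `fit_power_law` and `solve_linear` are hypothetical.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_power_law(X, y):
    """Fit y = C * prod(x_k ** a_k) by least squares in log space.

    Returns (C, [a_1, ..., a_k]). All inputs must be positive.
    """
    rows = [[1.0] + [math.log(v) for v in x] for x in X]
    logy = [math.log(v) for v in y]
    p = len(rows[0])
    # Normal equations (A^T A) coef = A^T log(y).
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Atb = [sum(r[i] * t for r, t in zip(rows, logy)) for i in range(p)]
    coef = solve_linear(AtA, Atb)
    return math.exp(coef[0]), coef[1:]

# Synthetic operating points (toroidal field, plasma current); values are invented.
X = [(1.0, 100.0), (2.0, 100.0), (2.0, 200.0), (3.0, 150.0), (1.5, 300.0)]
y = [2.0 * b ** 0.5 * i ** -1.0 for b, i in X]
C, exponents = fit_power_law(X, y)
```

    With noise-free synthetic data the fit recovers the prefactor and exponents used to generate `y`; with measured data the residuals in log space would indicate how well a single power law describes the profiles.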

  19. Politeness Strategies Used in Text Messaging

    Directory of Open Access Journals (Sweden)

    Shahrzad Eshghinejad

    2016-03-01

    Full Text Available One aspect of short message service (SMS) communication through a cell phone is the use of politeness strategies. As it is extensively argued that females are more polite language users, the present study sought to describe the strategies used by male and female English as a foreign language (EFL) learners and to find out whether there is any significant difference between the two groups in the use of positive and negative politeness strategies in sending SMS messages to their professors, considering that there is an asymmetric power relation and social distance between them. To this end, a corpus of 300 L1 (Persian) and L2 (English) request messages was compiled. Results of qualitative and quantitative data analysis showed no significant difference between the two groups. The results of the study have implications for politeness research.

  20. Text book of dose calculation for operators

    International Nuclear Information System (INIS)

    Aoyagi, Haruki; Gonda, Kozo

    1979-07-01

    This is a textbook of dose calculation for the operators of the reprocessing factory of the Power Reactor and Nuclear Fuel Development Corporation. The radiations considered are beta rays and gamma rays. The method used is a point attenuation kernel integral method. Radiation sources are treated as assemblies of point sources. The dose from each point source is calculated, and the total dose is then obtained by integration over all sources. Attenuation is calculated by considering the attenuation owing to distance and the absorption by absorbers. A build-up factor is introduced to correct for scattered gamma rays. The build-up factor is given in a table for various scatterers. The operators are able to calculate dose by themselves. The results of the integral calculation, expressed with formulas, are given in graphs. (Kato, T.)
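
    The calculation outlined in this abstract can be sketched as a sum over point sources, each attenuated by inverse-square distance and exponential absorption, with a tabulated build-up factor correcting for scattered gamma rays. The formula below is the standard point-kernel form and is assumed rather than taken from the textbook. The build-up table values, the conversion constant `k` and all function names are hypothetical.

```python
import math

# Hypothetical build-up table: (shield thickness in mean free paths, build-up factor).
# Illustrative values only, not taken from the textbook.
BUILDUP_TABLE = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (4.0, 5.0)]

def buildup(mfp):
    """Linearly interpolate the table; clamp outside its range."""
    if mfp <= BUILDUP_TABLE[0][0]:
        return BUILDUP_TABLE[0][1]
    for (x0, b0), (x1, b1) in zip(BUILDUP_TABLE, BUILDUP_TABLE[1:]):
        if mfp <= x1:
            return b0 + (b1 - b0) * (mfp - x0) / (x1 - x0)
    return BUILDUP_TABLE[-1][1]

def point_dose(strength, distance, mu, thickness, k=1.0):
    """Dose rate from one point source behind a slab absorber.

    strength: source strength, distance: source-to-detector distance,
    mu: linear attenuation coefficient, thickness: absorber thickness,
    k: flux-to-dose conversion constant (assumed, units left abstract).
    """
    mfp = mu * thickness
    geometric = 1.0 / (4.0 * math.pi * distance ** 2)
    return k * strength * buildup(mfp) * math.exp(-mfp) * geometric

def total_dose(sources, mu, thickness, k=1.0):
    """Sum contributions of (strength, distance) point sources."""
    return sum(point_dose(s, r, mu, thickness, k) for s, r in sources)

# Two hypothetical point sources of equal strength at distances 1 and 2.
sources = [(4.0 * math.pi, 1.0), (4.0 * math.pi, 2.0)]
unshielded = total_dose(sources, mu=0.5, thickness=0.0)
shielded = total_dose(sources, mu=0.5, thickness=2.0)
```

    An extended source would be approximated by many such points; making the point set finer turns the sum into the integral the abstract refers to.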