WorldWideScience

Sample records for science-related informational texts

  1. Text Maps: Helping Students Navigate Informational Texts.

    Science.gov (United States)

    Spencer, Brenda H.

    2003-01-01

    Notes that a text map is an instructional approach designed to help students gain fluency in reading content area materials. Discusses how the goal is to teach students about the important features of the material and how the maps can be used to build new understandings. Presents the procedures for preparing and using a text map. (SG)

  2. Text Genres in Information Organization

    Science.gov (United States)

    Nahotko, Marek

    2016-01-01

    Introduction: Text genres used by so-called information organizers in the processes of information organization in information systems were explored in this research. Method: The research employed text genre socio-functional analysis. Five genre groups in information organization were distinguished. Every genre group used in information…

  3. Informational Text and the CCSS

    Science.gov (United States)

    Aspen Institute, 2012

    2012-01-01

    What constitutes an informational text covers a broad swath of different types of texts. Biographies & memoirs, speeches, opinion pieces & argumentative essays, and historical, scientific or technical accounts of a non-narrative nature are all included in what the Common Core State Standards (CCSS) envisions as informational text. Also included…

  4. Teaching Text Structure: Examining the Affordances of Children's Informational Texts

    Science.gov (United States)

    Jones, Cindy D.; Clark, Sarah K.; Reutzel, D. Ray

    2016-01-01

    This study investigated the affordances of informational texts to serve as model texts for teaching text structure to elementary school children. Content analysis of a random sampling of children's informational texts from top publishers was conducted on text structure organization and on the inclusion of text features as signals of text…

  5. Unsupervised information extraction by text segmentation

    CERN Document Server

    Cortez, Eli

    2013-01-01

    A new unsupervised approach to the problem of Information Extraction by Text Segmentation (IETS) is proposed, implemented and evaluated herein. The authors' approach relies on information available on pre-existing data to learn how to associate segments in the input string with attributes of a given domain relying on a very effective set of content-based features. The effectiveness of the content-based features is also exploited to directly learn from test data structure-based features, with no previous human-driven training, a feature unique to the presented approach. Based on the approach, a

  6. Validating presupposed versus focused text information.

    Science.gov (United States)

    Singer, Murray; Solar, Kevin G; Spear, Jackie

    2017-04-01

    There is extensive evidence that readers continually validate discourse accuracy and congruence, but that they may also overlook conspicuous text contradictions. Validation may be thwarted when the inaccurate ideas are embedded sentence presuppositions. In four experiments, we examined readers' validation of presupposed ("given") versus new text information. Throughout, a critical concept, such as a truck versus a bus, was introduced early in a narrative. Later, a character stated or thought something about the truck, which therefore matched or mismatched its antecedent. Furthermore, truck was presented as either given or new information. Mismatch target reading times uniformly exceeded the matching ones by similar magnitudes for given and new concepts. We obtained this outcome using different grammatical constructions and with different antecedent-target distances. In Experiment 4, we examined only given critical ideas, but varied both their matching and the main verb's factivity (e.g., factive know vs. nonfactive think). The Match × Factivity interaction closely resembled that previously observed for new target information (Singer, 2006). Thus, readers can successfully validate given target information. Although contemporary theories tend to emphasize either deficient or successful validation, both types of theory can accommodate the discourse and reader variables that may regulate validation.

  7. Lexical Information in Memory for Text.

    Science.gov (United States)

    Hayes-Roth, Barbara

    Cued-recall and two-alternative, forced-choice recognition measures were used to evaluate subjects' retention of the specific wordings of studied texts. Results obtained after 10-minute and 24 hour retention intervals suggest that the studied wordings of texts are functional components of their memory representations. Theories that assume…

  8. Text Analysis: Critical Component of Planning for Text-Based Discussion Focused on Comprehension of Informational Texts

    Science.gov (United States)

    Kucan, Linda; Palincsar, Annemarie Sullivan

    2018-01-01

    This investigation focuses on a tool used in a reading methods course to introduce reading specialist candidates to text analysis as a critical component of planning for text-based discussions. Unlike planning that focuses mainly on important text content or information, a text analysis approach focuses both on content and how that content is…

  9. Addressing Information Proliferation: Applications of Information Extraction and Text Mining

    Science.gov (United States)

    Li, Jingjing

    2013-01-01

    The advent of the Internet and the ever-increasing capacity of storage media have made it easy to store, deliver, and share enormous volumes of data, leading to a proliferation of information on the Web, in online libraries, on news wires, and almost everywhere in our daily lives. Since our ability to process and absorb this information remains…

  10. Mining knowledge from text repositories using information extraction ...

    Indian Academy of Sciences (India)

    Information extraction (IE); text mining; text repositories; knowledge discovery from .... general purpose English words. However ... of precision and recall, as extensive experimentation is required due to lack of public tagged corpora. 4. Mining ...

  11. Integrating conflicting information from multiple texts: Effects of prior attitudes and text format

    NARCIS (Netherlands)

    Van Strien, Johan; Brand-Gruwel, Saskia; Boshuizen, Els

    2011-01-01

    Van Strien, J. L. H., Brand-Gruwel, S., & Boshuizen, H. P. A. (2011, August). Integrating conflicting information from multiple texts: Effects of prior attitudes and text format. Round table session presented at the Junior Researchers pre-conference of the biannual meeting of the European

  12. Teaching Scientific Metaphors through Informational Text Read-Alouds

    Science.gov (United States)

    Barnes, Erica M.; Oliveira, Alandeom W.

    2018-01-01

    Elementary students are expected to use various features of informational texts to build knowledge in the content areas. In science informational texts, scientific metaphors are commonly used to make sense of complex and invisible processes. Although elementary students may be familiar with literary metaphors as used in narratives, they may be…

  13. Comprehension and Analysis of Information in Text: I. Construction and Evaluation of Brief Texts.

    Science.gov (United States)

    Kozminsky, Ely; And Others

    This report describes a series of studies designed to construct and validate a set of text materials necessary to the pursuance of a long-term research project on information analysis and integration in semantically rich, naturalistic domains, primarily in the domain of the stock market. The methods and results of six separate experiments on…

  14. Finding Text Information in the Ocean of Electronic Documents

    Energy Technology Data Exchange (ETDEWEB)

    Medvick, Patricia A.; Calapristi, Augustin J.

    2003-02-05

    Information management in natural resources has become an overwhelming task. A massive amount of electronic documents and data is now available for making informed decisions. The problem is finding the relevant information to support the decision-making process. Determining gaps in knowledge in order to propose new studies or to determine which proposals to fund for maximum potential is a time-consuming and difficult task. Additionally, available data stores are increasing in complexity; they now may include not only text and numerical data, but also images, sounds, and video recordings. Information visualization specialists at Pacific Northwest National Laboratory (PNNL) have software tools for exploring electronic data stores and for discovering and exploiting relationships within data sets. These provide capabilities for unstructured text explorations, the use of data signatures (a compact format for the essence of a set of scientific data) for visualization (Wong et al. 2000), visualizations for multiple query results (Havre et al. 2001), and others (http://www.pnl.gov/infoviz). We will focus on IN-SPIRE, an MS Windows version of PNNL’s SPIRE (Spatial Paradigm for Information Retrieval and Exploration). IN-SPIRE was developed to help information analysts find and discover information in huge masses of text documents.

  15. Information Gain Based Dimensionality Selection for Classifying Text Documents

    Energy Technology Data Exchange (ETDEWEB)

    Dumidu Wijayasekara; Milos Manic; Miles McQueen

    2013-06-01

    Selecting the optimal dimensions for various knowledge extraction applications is an essential component of data mining. Dimensionality selection techniques are utilized in classification applications to increase the classification accuracy and reduce the computational complexity. In text classification, where the dimensionality of the dataset is extremely high, dimensionality selection is even more important. This paper presents a novel, genetic algorithm based methodology, for dimensionality selection in text mining applications that utilizes information gain. The presented methodology uses information gain of each dimension to change the mutation probability of chromosomes dynamically. Since the information gain is calculated a priori, the computational complexity is not affected. The presented method was tested on a specific text classification problem and compared with conventional genetic algorithm based dimensionality selection. The results show an improvement of 3% in the true positives and 1.6% in the true negatives over conventional dimensionality selection methods.
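
    The information gain that the abstract above relies on can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes binary features (a term is present or absent in a document), and the `mutation_probabilities` scaling and its `base_rate` parameter are hypothetical, since the abstract does not say how gain is mapped to mutation probability.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_present, labels):
    """Information gain of a binary feature with respect to the class labels."""
    with_term = [y for x, y in zip(feature_present, labels) if x]
    without_term = [y for x, y in zip(feature_present, labels) if not x]
    total = len(labels)
    # Entropy that remains after splitting the documents on the feature.
    conditional = (
        (len(with_term) / total) * entropy(with_term)
        + (len(without_term) / total) * entropy(without_term)
    )
    return entropy(labels) - conditional


def mutation_probabilities(gains, base_rate=0.05):
    """Hypothetical mapping: dimensions with lower gain get a higher mutation probability."""
    top = max(gains) or 1.0  # avoid dividing by zero when every gain is 0
    return [base_rate * (1.0 - g / top) + 0.1 * base_rate for g in gains]
```

    Because the gains depend only on the training data, they can be computed once before the genetic search starts, which is consistent with the abstract's remark that the computational complexity is not affected.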

  16. Text

    International Nuclear Information System (INIS)

    Anon.

    2009-01-01

    The purpose of this act is to safeguard against the dangers and harmful effects of radioactive waste and to contribute to public safety and environmental protection by laying down requirements for the safe and efficient management of radioactive waste. The act covers definitions, interrelation with other legislation, responsibilities of the state and local governments, responsibilities of radioactive waste management companies and generators, formulation of the basic plan for the control of radioactive waste, radioactive waste management (with public information, financing and part of spent fuel management), the Korea Radioactive Waste Management Corporation (business activities, budget), establishment of a radioactive waste fund in order to secure the financial resources required for radioactive waste management, and penalties in case of improper operation of radioactive waste management. (N.C.)

  17. Eye movements during the recollection of text information reflect content rather than the text itself

    DEFF Research Database (Denmark)

    Traub, Franziska; Johansson, Roger; Holmqvist, Kenneth

    Several studies have reported that spontaneous eye movements occur when visuospatial information is recalled from memory. Such gazes closely reflect the content and spatial relations from the original scene layout (e.g., Johansson et al., 2012). However, when someone has originally read a scene description, the memory of the physical layout of the text itself might compete with the memory of the spatial arrangement of the described scene. The present study was designed to address this fundamental issue by having participants read scene descriptions that were manipulated to be either congruent... Recollection was performed orally while gazing at a blank screen. Results demonstrate that participants' gaze patterns during recall more closely reflect the spatial layout of the scene than the physical locations of the text. Memory data provide evidence that mental models representing either the situation...

  18. Make It Real: Strategies for Success with Informational Texts.

    Science.gov (United States)

    Hoyt, Linda

    This book provides a practical classroom guide to unlocking the treasures of informational texts. It also aims to demonstrate that reading and writing nonfiction can overcome the gender gap, allowing girls and boys to share interests in any subject from bugs and magnets to gardens and cake baking. It explains the use of a range of instructional…

  19. Access to information technology and willingness to receive text ...

    African Journals Online (AJOL)

    Over the past decade, new technologies and methods of communication have ... To determine access to information technology and willingness to receive short message service (SMS) text message reminders for childhood immunisation .... Table 1 shows the attitude of the mothers towards reminders for immunisations.

  20. Access to information technology and willingness to receive text ...

    African Journals Online (AJOL)

    Background. Effective communication is imperative for the delivery and receipt of adequate health care services. Aim. To determine access to information technology and willingness to receive short message service (SMS) text message reminders for childhood immunisation services among mothers in Lagos, Nigeria.

  1. STYLISTIC FEATURES OF ADVERTISING TEXTS OF INFORMATIVE AND COMPARATIVE TYPES

    Directory of Open Access Journals (Sweden)

    Poddubskaya, O.N.

    2016-06-01

    The relevance of this article is related to the fact that nowadays advertising has a very strong impact both on the consumer market, political and cultural life of society, and on the language and its development as a system. Advertising has given rise to the development of a special set of stylistic features of a text, formed under the influence of reviving advertising traditions in the Russian language and under the active impact of energetic and pushy European advertising. The purpose of this study is to explore stylistic features of informative and comparative advertising texts. The object of research is Russian-language advertising in printed media and on television. At the end of the article, we draw conclusions about the groups of language means used for different stylistic devices in informative and comparative advertising texts. Analysis of stylistic features of modern informative and comparative advertising texts can be of great interest to specialists in the field of theoretical studies of modern advertising.

  2. Information Retrieval and Text Mining Technologies for Chemistry.

    Science.gov (United States)

    Krallinger, Martin; Rabal, Obdulia; Lourenço, Anália; Oyarzabal, Julen; Valencia, Alfonso

    2017-06-28

    Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.

  3. PaperBLAST: Text Mining Papers for Information about Homologs.

    Science.gov (United States)

    Price, Morgan N; Arkin, Adam P

    2017-01-01

    Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources (Swiss-Prot, GeneRIF, and EcoCyc) that link protein sequences to scientific articles. PaperBLAST's database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. PaperBLAST is available at http://papers.genomics.lbl.gov/. Importance: With the recent explosion of genome sequencing data, there are now millions of uncharacterized proteins. If a scientist becomes interested in one of these proteins, it can be very difficult to find information as to its likely function. Often a protein whose sequence is similar, and which is likely to have a similar function, has been studied already, but this information is not available in any database. To help find articles about similar proteins, PaperBLAST searches the full text of scientific articles for protein identifiers or gene identifiers, and it links these articles to protein sequences. Then, given a protein of interest, it can quickly find similar proteins in its database by using standard software (BLAST), and it can show snippets of text from relevant papers. We hope that PaperBLAST will make it easier for biologists to predict proteins' functions.
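
    The idea of searching full text for gene or protein identifiers and showing snippets can be illustrated with a small sketch. It is not PaperBLAST's code: the function name `find_snippets`, the in-memory `articles` dictionary, and the `width` parameter are all assumptions made for illustration, and the real system works against EuropePMC and sequence databases rather than plain strings.

```python
import re


def find_snippets(articles, identifiers, width=40):
    """Return (article_id, identifier, snippet) for each whole-word identifier mention.

    `articles` maps a hypothetical article id to its full text.
    """
    hits = []
    for article_id, text in articles.items():
        for ident in identifiers:
            # Require non-word characters on both sides so that a longer
            # identifier containing this one is not reported as a match.
            pattern = re.compile(r"(?<!\w)" + re.escape(ident) + r"(?!\w)")
            for match in pattern.finditer(text):
                start = max(0, match.start() - width)
                end = min(len(text), match.end() + width)
                hits.append((article_id, ident, text[start:end].strip()))
    return hits
```

    Linking each hit to a protein sequence, and then finding similar sequences with BLAST, is the step the abstract describes next; that step is outside the scope of this sketch.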

  4. PaperBLAST: Text Mining Papers for Information about Homologs

    International Nuclear Information System (INIS)

    Price, Morgan N.; Arkin, Adam P.

    2017-01-01

    Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources (Swiss-Prot, GeneRIF, and EcoCyc) that link protein sequences to scientific articles. PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. With the recent explosion of genome sequencing data, there are now millions of uncharacterized proteins. If a scientist becomes interested in one of these proteins, it can be very difficult to find information as to its likely function. Often a protein whose sequence is similar, and which is likely to have a similar function, has been studied already, but this information is not available in any database. To help find articles about similar proteins, PaperBLAST searches the full text of scientific articles for protein identifiers or gene identifiers, and it links these articles to protein sequences. Then, given a protein of interest, it can quickly find similar proteins in its database by using standard software (BLAST), and it can show snippets of text from relevant papers. We hope that PaperBLAST will make it easier for biologists to predict proteins’ functions.

  5. NAMED ENTITY RECOGNITION FROM BIOMEDICAL TEXT -AN INFORMATION EXTRACTION TASK

    Directory of Open Access Journals (Sweden)

    N. Kanya

    2016-07-01

    Biomedical Text Mining targets the extraction of significant information from biomedical archives. Bio TM encompasses Information Retrieval (IR) and Information Extraction (IE). Information retrieval retrieves the relevant biomedical literature documents from repositories such as PubMed and MEDLINE, based on a search query. The IR process ends with the generation of a corpus of the relevant documents retrieved from the publication databases based on the query. The IE task includes preprocessing of the documents, Named Entity Recognition (NER) from the documents, and relationship extraction. This process uses natural language processing, data mining techniques, and machine learning algorithms. The preprocessing task includes tokenization, stop word removal, shallow parsing, and part-of-speech tagging. The NER phase involves recognition of well-defined objects such as genes, proteins, or cell lines. This process leads to the next phase, the extraction of relationships (IE). The work was based on the machine learning algorithm Conditional Random Field (CRF).
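
    The first two preprocessing steps named above, tokenization and stop word removal, can be sketched with the standard library alone. The stop word list here is a tiny illustrative assumption, and the later steps (shallow parsing, part-of-speech tagging, and CRF-based NER) are not shown because they need a trained model.

```python
import re

# Tiny illustrative stop word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "of", "and", "in", "a", "is", "to", "by"}


def tokenize(text):
    """Split text into alphanumeric tokens, keeping hyphenated terms whole."""
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)


def remove_stop_words(tokens):
    """Drop tokens whose lowercase form is in the stop word list."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def preprocess(text):
    """Tokenize, then remove stop words, preserving the original casing."""
    return remove_stop_words(tokenize(text))
```

    Casing is preserved on purpose in this sketch, since capitalization is often a useful feature when recognizing gene and protein names.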

  6. Knowledge Dictionary for Information Extraction on the Arabic Text Data

    Directory of Open Access Journals (Sweden)

    Wahyu Jauharis Saputra

    2013-04-01

    Information extraction is an early stage of textual data analysis. It is required to obtain information from textual data that can be used for analysis processes such as classification and categorization. Textual data is strongly influenced by its language. Arabic is gaining significant attention in many studies because the Arabic language is very different from others and, in contrast to other languages, tools and research on Arabic are still lacking. The information extracted using the knowledge dictionary is a concept of expression. A knowledge dictionary is usually constructed manually by an expert, which takes a long time and is specific to a single problem. This paper proposes a method for automatically building a knowledge dictionary. The knowledge dictionary is formed by classifying sentences that have the same concept, assuming that they will have a high similarity value. The extracted concepts can be used as features for subsequent computational processes such as classification or categorization. The dataset used in this paper was an Arabic text dataset. The extraction result was tested using a decision tree classification engine; the highest precision obtained was 71.0% and the highest recall was 75.0%.

  7. ONTOGRABBING: Extracting Information from Texts Using Generative Ontologies

    DEFF Research Database (Denmark)

    Nilsson, Jørgen Fischer; Szymczak, Bartlomiej Antoni; Jensen, P.A.

    2009-01-01

    We describe principles for extracting information from texts using a so-called generative ontology in combination with syntactic analysis. Generative ontologies are introduced as semantic domains for natural language phrases. Generative ontologies extend ordinary finite ontologies with rules for producing recursively shaped terms representing the ontological content (ontological semantics) of NL noun phrases and other phrases. We focus here on achieving a robust, often only partial, ontology-driven parsing of and ascription of semantics to a sentence in the text corpus. The aim of the ontological analysis is primarily to identify paraphrases, thereby achieving a search functionality beyond mere keyword search with synsets. We further envisage use of the generative ontology as a phrase-based rather than word-based browser into text corpora.

  8. Domain-independent information extraction in unstructured text

    Energy Technology Data Exchange (ETDEWEB)

    Irwin, N.H. [Sandia National Labs., Albuquerque, NM (United States). Software Surety Dept.

    1996-09-01

    Extracting information from unstructured text has become an important research area in recent years due to the large amount of text now electronically available. This status report describes the findings and work done during the second year of a two-year Laboratory Directed Research and Development Project. Building on the first year's work of identifying important entities, this report details techniques used to group words into semantic categories and to output templates containing selective document content. Using word profiles and category clustering derived during a training run, the time-consuming knowledge-building task can be avoided. Though the output still lacks in completeness when compared to systems with domain-specific knowledge bases, the results do look promising. The two approaches are compatible and could complement each other within the same system. Domain-independent approaches retain appeal as a system that adapts and learns will soon outpace a system with any amount of a priori knowledge.

  9. Texting Styles and Information Change of SMS Text Messages in Filipino

    Science.gov (United States)

    Cabatbat, Josephine Jill T.; Tapang, Giovanni A.

    2013-02-01

    We identify the different styles of texting in Filipino short message service (SMS) texts and analyze the change in unigram and bigram frequencies due to these styles. Style preference vectors for sample texts were calculated and used to identify the style combination used by an average individual. The change in Shannon entropy of the SMS text is explained in light of a coding process.
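
    The Shannon entropy of unigram and bigram distributions mentioned above can be computed as in the following sketch. It assumes character-level n-grams, which the abstract does not specify, and the helper names are illustrative rather than taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def char_ngrams(text, n):
    """All overlapping character n-grams of `text` (n=1 unigrams, n=2 bigrams)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def entropy_change(original, texted, n=1):
    """Entropy of the texted message minus entropy of the original, for n-grams of size n."""
    return shannon_entropy(char_ngrams(texted, n)) - shannon_entropy(char_ngrams(original, n))
```

    Calling `entropy_change` on a fully spelled message and its shortened SMS form gives one number per n-gram size, which is the kind of quantity the abstract interprets in terms of a coding process.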

  10. Automated Extraction of Substance Use Information from Clinical Texts.

    Science.gov (United States)

    Wang, Yan; Chen, Elizabeth S; Pakhomov, Serguei; Arsoniadis, Elliot; Carter, Elizabeth W; Lindemann, Elizabeth; Sarkar, Indra Neil; Melton, Genevieve B

    2015-01-01

    Within clinical discourse, social history (SH) includes important information about substance use (alcohol, drug, and nicotine use) as key risk factors for disease, disability, and mortality. In this study, we developed and evaluated a natural language processing (NLP) system for automated detection of substance use statements and extraction of substance use attributes (e.g., temporal and status) based on Stanford Typed Dependencies. The developed NLP system leveraged linguistic resources and domain knowledge from a multi-site social history study, Propbank and the MiPACQ corpus. The system attained F-scores of 89.8, 84.6 and 89.4 respectively for alcohol, drug, and nicotine use statement detection, as well as average F-scores of 82.1, 90.3, 80.8, 88.7, 96.6, and 74.5 respectively for extraction of attributes. Our results suggest that NLP systems can achieve good performance when augmented with linguistic resources and domain knowledge when applied to a wide breadth of substance use free text clinical notes.
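
    For readers unfamiliar with the metric, the sketch below shows how an F-score is computed from counts of true positives, false positives, and false negatives. It assumes the reported values are the standard F-measure expressed as percentages; the abstract does not state the exact variant, and the `beta` parameter is included only for generality.

```python
def f_score(true_positives, false_positives, false_negatives, beta=1.0):
    """F-measure from raw counts; beta=1.0 gives the balanced F1 score."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    # Weighted harmonic mean of precision and recall.
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

    Multiplying the result by 100 gives a value on the same scale as the scores quoted in the abstract.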

  11. Evaluating Text-Based Information on the World Wide Web

    Science.gov (United States)

    Wopereis, Iwan G. J. H.; van Merrienboer, Jeroen J. G.

    2011-01-01

    This special section contributes to an inclusive cognitive model of information problem solving (IPS) activity, touches briefly IPS learning, and brings to the notice methodological pitfalls related to uncovering IPS processes. Instead of focusing on the IPS process as a whole, the contributing articles turn their attention to what is regarded the…

  12. TXTGate: profiling gene groups with text-based information

    DEFF Research Database (Denmark)

    Glenisson, P.; Coessens, B.; Van Vooren, S.

    2004-01-01

    We implemented a framework called TXTGate that combines literature indices of selected public biological resources in a flexible text-mining system designed towards the analysis of groups of genes. By means of tailored vocabularies, term- as well as gene-centric views are offered on selected textual...

  13. Context and Domain Knowledge Enhanced Entity Spotting in Informal Text

    Science.gov (United States)

    Gruhl, Daniel; Nagarajan, Meena; Pieper, Jan; Robson, Christine; Sheth, Amit

    This paper explores the application of restricted relationship graphs (RDF) and statistical NLP techniques to improve named entity annotation in challenging Informal English domains. We validate our approach using on-line forums discussing popular music. Named entity annotation is particularly difficult in this domain because it is characterized by a large number of ambiguous entities, such as the Madonna album "Music" or Lily Allen's pop hit "Smile".

  14. Text de-identification for privacy protection: a study of its impact on clinical text information content.

    Science.gov (United States)

    Meystre, Stéphane M; Ferrández, Óscar; Friedlin, F Jeffrey; South, Brett R; Shen, Shuying; Samore, Matthew H

    2014-08-01

    As more and more electronic clinical information is becoming easier to access for secondary uses such as clinical research, approaches that enable faster and more collaborative research while protecting patient privacy and confidentiality are becoming more important. Clinical text de-identification offers such advantages but is typically a tedious manual process. Automated Natural Language Processing (NLP) methods can alleviate this process, but their impact on subsequent uses of the automatically de-identified clinical narratives has only barely been investigated. In the context of a larger project to develop and investigate automated text de-identification for Veterans Health Administration (VHA) clinical notes, we studied the impact of automated text de-identification on clinical information in a stepwise manner. Our approach started with a high-level assessment of clinical notes informativeness and formatting, and ended with a detailed study of the overlap of select clinical information types and Protected Health Information (PHI). To investigate the informativeness (i.e., document type information, select clinical data types, and interpretation or conclusion) of VHA clinical notes, we used five different existing text de-identification systems. The informativeness was only minimally altered by these systems while formatting was only modified by one system. To examine the impact of de-identification on clinical information extraction, we compared counts of SNOMED-CT concepts found by an open source information extraction application in the original (i.e., not de-identified) version of a corpus of VHA clinical notes, and in the same corpus after de-identification. Only about 1.2-3% fewer SNOMED-CT concepts were found in de-identified versions of our corpus, and many of these concepts were PHI that was erroneously identified as clinical information. To study this impact in more detail and assess how generalizable our findings were, we examined the overlap between

  15. Extracting of implicit information in English advertising texts with phonetic and lexical-morphological means

    Directory of Open Access Journals (Sweden)

    Traikovskaya Natalya Petrovna

    2015-12-01

    The article deals with the phonetic and lexical-morphological language means involved in extracting implicit information from English-language advertising texts aimed at men and women. The phonetic means of the English language are not the basis for implying information in advertising texts. Lexical and morphological means act as markers of relevant information and activate the implicit information in advertising texts.

  16. Informational Text Comprehension: Its Challenges and How Collaborative Strategic Reading Can Help

    Science.gov (United States)

    McCown, Margaret Averill; Thomason, Gina B.

    2014-01-01

    With the increased emphasis on informational text with Common Core State Standards and the difficulty many students have with this type of text, this study examined the effects of Collaborative Strategic Reading (CSR) on informational text comprehension and metacognitive awareness of fifth grade students. Participating students included a…

  17. Automatically Tracing Information Flow of Vulnerability and Cyber-Attack Information through Text Strings

    National Research Council Canada - National Science Library

    Rowe, Neil C; Sjoberg, Eric; Adams, Paige

    2008-01-01

    ... of it. We are developing data mining techniques to track the flow of such information by comparing important information-security Web sites, alert messages, and strings in packets to find similar words and sentences...

  18. Tagline: Information Extraction for Semi-Structured Text Elements in Medical Progress Notes

    Science.gov (United States)

    Finch, Dezon Kile

    2012-01-01

    Text analysis has become an important research activity in the Department of Veterans Affairs (VA). Statistical text mining and natural language processing have been shown to be very effective for extracting useful information from medical documents. However, neither of these techniques is effective at extracting the information stored in…

  19. Information of the public in an emergency and preparation of text blocks

    International Nuclear Information System (INIS)

    Miska, H.

    1997-01-01

    In addition to the advance information, the EU also demands and regulates the information in an emergency. A prompt dissemination of the required news is facilitated by text blocks which can be prepared and harmonised with neighbouring administrations. Detailed texts can be compiled only once a press center has been established. (orig.)

  20. Selecting Information to Answer Questions: Strategic Individual Differences when Searching Texts

    Science.gov (United States)

    Cerdan, Raquel; Gilabert, Ramiro; Vidal-Abarca, Eduardo

    2011-01-01

    The purpose of the study was to explore students' selection of information strategies in a task-oriented reading situation. 72 secondary school students read two texts and answered six questions per text, three of which were manipulated to induce a misleading matching between the wording of the question and distracting pieces of information in the…

  1. Children’s comprehension of informational text: Reading, engaging, and learning

    Directory of Open Access Journals (Sweden)

    Linda BAKER

    2011-11-01

    The Reading, Engaging, and Learning project (REAL) investigated whether a classroom intervention that enhanced young children's experience with informational books would increase reading achievement and engagement. Participants attended schools serving low income neighborhoods with 86% African American enrollment. The longitudinal study spanned second through fourth grades. Treatment conditions were: (1) Text Infusion/Reading for Learning Instruction -- students were given greater access to informational books in their classroom libraries and in reading instruction; (2) Text Infusion Alone -- the same books were provided but teachers were not asked to alter their instruction; (3) Traditional Instruction -- students experienced business as usual in the classroom. Children were assessed each year on measures of reading and reading engagement, and classroom instructional practices were observed. On most measures, the informational text infusion intervention did not yield differential growth over time. However, the results inform efforts to increase children’s facility with informational text in the early years in order to improve reading comprehension.

  2. Have Recommended Book Lists Changed to Reflect Current Expectations for Informational Text in K-3 Classrooms?

    Science.gov (United States)

    Dreher, Mariam Jean; Kletzien, Sharon B.

    2016-01-01

    Despite both longstanding and recent calls for more informational text in K-3 classrooms, research indicates that narrative text remains in the majority for read alouds, classroom libraries, and instruction, thus limiting children's opportunity to experience the demands of expository text. Because national associations' recommended book lists are…

  3. Fourth and fifth grade Latino(a) students making meaning of scientific informational texts

    Science.gov (United States)

    Croce, Keri-Anne

    Using a socio-psycholinguistic perspective of literacy and a social-semiotic analysis of texts, this study investigates how six students made meaning of informational texts. The students came to school from a variety of English and Spanish language backgrounds. The research question being asked was 'How do Latino(a) fourth and fifth grade students make meaning of English informational texts?' Miscue analysis was used as a tool to investigate how students who have been labeled non-struggling readers by their classroom teacher and are from various language backgrounds approached five informational texts. In order to investigate students' responses to the nature of informational texts, this dissertation draws on commonly occurring structures within texts. Primary data collected included read alouds and retellings of five texts, retrospective miscue analysis, and interviews with six participant students. Two of these participants are discussed within this dissertation. Secondary data included classroom observations and teacher interviews. This study proposes that non-native speakers may use scientific concept placeholders as they transact with informational texts. The use of scientific concept placeholders by a reader indicates that the reader is engaged in the meaning making process and possesses evolving scientific knowledge about a phenomenon. The findings suggest that Latino(a) students' understandings of English informational texts is influenced not only by a student's language development but also (1) the nature of the text; (2) the reading strategies that a student uses, such as the use of placeholders; (3) the influence of the researcher during the aided retelling. This study contributes methodological tools to assess English language learners' reading. The conclusions presented within this study also support the idea that students from a variety of language backgrounds slightly altered their reliance on certain cuing systems as they encountered various sub

  4. Effects of Surrounding Information and Line Length on Text Comprehension from the Web

    Directory of Open Access Journals (Sweden)

    Jess McMullin

    2002-02-01

    The World Wide Web (Web) is becoming a popular medium for transmission of information and online learning. We need to understand how people comprehend information from the Web to design Web sites that maximize the acquisition of information. We examined two features of Web page design that are easily modified by developers, namely line length and the amount of surrounding information, or whitespace. Undergraduate university student participants read text and answered comprehension questions on the Web. Comprehension was affected by whitespace; participants had better comprehension for information surrounded by whitespace than for information surrounded by meaningless information. Participants were not affected by line length. These findings demonstrate that reading from the Web is not the same as reading print and have implications for instructional Web design.

  5. Representing nested semantic information in a linear string of text using XML.

    Science.gov (United States)

    Krauthammer, Michael; Johnson, Stephen B; Hripcsak, George; Campbell, David A; Friedman, Carol

    2002-01-01

    XML has been widely adopted as an important data interchange language. The structure of XML enables sharing of data elements with variable degrees of nesting as long as the elements are grouped in a strict tree-like fashion. This requirement potentially restricts the usefulness of XML for marking up written text, which often includes features that do not properly nest within other features. We encountered this problem while marking up medical text with structured semantic information from a Natural Language Processor. Traditional approaches to this problem separate the structured information from the actual text mark up. This paper introduces an alternative solution, which tightly integrates the semantic structure with the text. The resulting XML markup preserves the linearity of the medical texts and can therefore be easily expanded with additional types of information.
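    The nesting constraint described in this abstract can be illustrated with a minimal Python sketch. It is not the markup scheme the paper introduces; it only shows the "traditional" stand-off idea the abstract mentions, in which annotations are kept apart from the text as character offsets. The `Span` class, the labels, and the sample sentence are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A stand-off annotation: a label plus character offsets into the text."""
    label: str
    start: int  # inclusive
    end: int    # exclusive


def properly_nested(a: Span, b: Span) -> bool:
    """True if the spans could both be XML elements in one tree,
    i.e. they are disjoint or one fully contains the other."""
    disjoint = a.end <= b.start or b.end <= a.start
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return disjoint or a_in_b or b_in_a


# Hypothetical sentence and annotations (not taken from the paper).
text = "pain in left arm and leg"
finding = Span("finding", 0, 16)   # "pain in left arm"
region = Span("bodyloc", 8, 24)    # "left arm and leg"
side = Span("side", 8, 12)         # "left"

# The first two spans overlap without nesting, so inline XML tags for
# both would cross; stand-off offsets can still represent them.
for span in (finding, region, side):
    print(span.label, "->", text[span.start:span.end])
print(properly_nested(finding, region))
print(properly_nested(side, finding))
```

    Keeping offsets separate from the text sidesteps the tree constraint, at the cost of loosening the link between markup and text that the paper's integrated approach aims to preserve.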

  6. Lithium NLP: A System for Rich Information Extraction from Noisy User Generated Text on Social Media

    OpenAIRE

    Bhargava, Preeti; Spasojevic, Nemanja; Hu, Guoning

    2017-01-01

    In this paper, we describe the Lithium Natural Language Processing (NLP) system - a resource-constrained, high-throughput and language-agnostic system for information extraction from noisy user generated text on social media. Lithium NLP extracts a rich set of information including entities, topics, hashtags and sentiment from text. We discuss several real world applications of the system currently incorporated in Lithium products. We also compare our system with existing commercial and acad...

  7. Validation and Comprehension of Text Information: Two Sides of the Same Coin

    Science.gov (United States)

    Richter, Tobias

    2015-01-01

    In psychological research, the comprehension of linguistic information and the knowledge-based assessment of its validity are often regarded as two separate stages of information processing. Recent findings in psycholinguistics and text comprehension research call this two-stage model into question. In particular, validation can affect…

  8. Semi-supervised probabilistics approach for normalising informal short text messages

    CSIR Research Space (South Africa)

    Modupe, A

    2017-03-01

    The growing use of informal social text messages on Twitter is one of the known sources of big data. These types of messages are noisy and frequently rife with acronyms, slang, grammatical errors and non-standard words causing grief for natural...

  9. Text Processing of Domain-Related Information for Individuals with High and Low Domain Knowledge.

    Science.gov (United States)

    Spilich, George J.; And Others

    1979-01-01

    The way in which previously acquired knowledge affects the processing of new domain-related information was investigated. Text processing was studied in two groups differing in knowledge of the domain of baseball. A knowledge structure for the domain was constructed, and text propositions were classified. (SW)

  10. A Comparison of Two Strategies for Teaching Third Graders to Summarize Information Text

    Science.gov (United States)

    Dromsky, Ann Marie

    2011-01-01

    Summarizing text is one of the most effective comprehension strategies (National Institute of Child Health and Human Development, 2000) and an effective way to learn from information text (Dole, Duffy, Roehler, & Pearson, 1991; Pressley & Woloshyn, 1995). In addition, much research supports the explicit instruction of such strategies as…

  11. Schematizing and Processing Informational Texts with Mind Maps in Fifth and Sixth Grade

    Science.gov (United States)

    Merchie, Emmelien; Van Keer, Hilde

    2013-01-01

    From the age of 11-13, children start to spend increasingly more time on learning from texts. The need arises to support them in dealing with this text information and engaging them in self-regulated learning (SRL). This study is embedded within the cognitive component of SRL and focuses on mind mapping as a promising organizational learning…

  12. An Introduction to Topic Modeling as an Unsupervised Machine Learning Way to Organize Text Information

    Science.gov (United States)

    Snyder, Robin M.

    2015-01-01

    The field of topic modeling has become increasingly important over the past few years. Topic modeling is an unsupervised machine learning way to organize text (or image or DNA, etc.) information such that related pieces of text can be identified. This paper/session will present/discuss the current state of topic modeling, why it is important, and…

  13. METHODS OF TEXT INFORMATION CLASSIFICATION ON THE BASIS OF ARTIFICIAL NEURAL AND SEMANTIC NETWORKS

    Directory of Open Access Journals (Sweden)

    L. V. Serebryanaya

    2016-01-01

    The article covers the use of a perceptron, a Hopfield artificial neural network, and a semantic network for the classification of text information. Network training algorithms are studied. A backpropagation algorithm for the perceptron network and a convergence algorithm for the Hopfield network are implemented. On the basis of the proposed models and algorithms, automatic text classification software is developed and its operation results are evaluated.
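    As a rough illustration of perceptron-based text classification, the sketch below trains a single-layer perceptron on bag-of-words features with the classic perceptron update rule. This is a deliberately simpler stand-in, not the article's implementation: the article's own network and training algorithm are not specified in this abstract, and the function names and toy training sentences here are hypothetical.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words features: lowercase token -> count."""
    return Counter(text.lower().split())


def score(weights, bias, feats):
    """Linear score of a feature bag under the current weights."""
    return bias + sum(weights.get(word, 0.0) * n for word, n in feats.items())


def train_perceptron(samples, epochs=10):
    """Classic perceptron rule; labels are +1 or -1."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in samples:
            feats = featurize(text)
            predicted = 1 if score(weights, bias, feats) > 0 else -1
            if predicted != label:
                # Move the decision boundary toward the misclassified sample.
                for word, n in feats.items():
                    weights[word] = weights.get(word, 0.0) + label * n
                bias += label
    return weights, bias


def predict(weights, bias, text):
    return 1 if score(weights, bias, featurize(text)) > 0 else -1


# Hypothetical toy data: +1 = machine learning, -1 = sport.
samples = [
    ("neural network training algorithm", 1),
    ("network weights converge during training", 1),
    ("football match final score", -1),
    ("team wins the football cup", -1),
]
weights, bias = train_perceptron(samples)
print(predict(weights, bias, "training a neural network"))
print(predict(weights, bias, "football score"))
```

    A multilayer network trained by backpropagation would replace the single weight vector with hidden layers and gradient-based updates; the feature extraction step would stay much the same.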

  14. Scene text detection by leveraging multi-channel information and local context

    Science.gov (United States)

    Wang, Runmin; Qian, Shengyou; Yang, Jianfeng; Gao, Changxin

    2018-03-01

    As important information carriers, texts play significant roles in many applications. However, text detection in unconstrained scenes is a challenging problem due to cluttered backgrounds, varied appearances, uneven illumination, etc. In this paper, an approach based on multi-channel information and local context is proposed to detect texts in natural scenes. Because character candidate detection plays a vital role in a text detection system, Maximally Stable Extremal Regions (MSERs) and a graph-cut based method are integrated to obtain the character candidates by leveraging multi-channel image information. A cascaded false positive elimination mechanism is constructed from the perspectives of the character and the text line, respectively. Since local context information is very valuable, this information is used to retrieve missing characters and boost text detection performance. Experimental results on two benchmark datasets, i.e., the ICDAR 2011 dataset and the ICDAR 2013 dataset, demonstrate that the proposed method achieves state-of-the-art performance.

  15. A Review Paper On Exploring Text Link And Spacial-Temporal Information In Social Media Networks

    Directory of Open Access Journals (Sweden)

    Dr. Mamta Madan

    2015-03-01

    The objective of this paper is to review the literature on methods for mining knowledge from social media by taking advantage of embedded heterogeneous information. Specifically, we review different types of mining frameworks that provide useful information from networks with heterogeneous data types, including text, spatial-temporal, and data association (link) information. First, we discuss link mining to study the link structure with respect to social media (SM). Second, we summarize the various text mining models. Third, we review spatial as well as temporal models for extracting or detecting frequent related topics from SM. Fourth, we try to identify a few improved models that take advantage of link, textual, temporal, and spatial information, which motivates the discovery of progressive principles and fresh methodologies for data mining (DM) in social media networks (SMNs).

  16. Undergraduate female science-related career choices: A phenomenological study

    Science.gov (United States)

    Curry, Kathy S.

    This qualitative phenomenological study used a modified Groenewald's five steps method with semi-structured, recorded, and transcribed interviews to focus on the underrepresentation of females in science-related careers. The study explored the lived experiences of a purposive sample of 25 senior female college students attending a college in Macon, Georgia. Ten major themes emerged from the research study that included (a) journey to a science-related career; (b) realization of career interest; (c) family support; (d) society's role; (e) professors' treatment of students; (f) lack of mentors and models; (g) gender and career success; (h) females and other disadvantages in science-related careers; (i) rewards of the journey; and (j) advice for the journey. The three minor themes identified were (a) decision-making; (b) career awareness; and (c) guidance. The key findings revealed that females pursuing a science degree or subsequent science-related career shared their experience with other females interested in science as a career choice, dealt with barriers standing in the way of their personal goals, lacked role models, and received little or no support from family and friends. The study findings may offer information to female college students interested in pursuing science-related careers and further foundational research on gender disparities in career choice.

  17. Pre- and in-Service Teachers Reading and Discussing Informational Texts

    Directory of Open Access Journals (Sweden)

    Theresa A. Deeney

    2016-05-01

    This study investigates U.S. elementary (kindergarten-Grade 6, ages 5-12) pre- and in-service teachers’ discussions of informational texts to understand current practices and identify needs with respect to how teachers support students in building knowledge from complex informational text as specified in the grade-level expectations of the Common Core State Standards adopted in many U.S. states. Transcripts and reflections from 17 in-service and 31 pre-service teachers’ informational text discussions were analyzed for teachers’ focus on the text, background knowledge, and text/background knowledge. In addition, transcripts were analyzed for the types of text ideas teachers targeted (details/main ideas), the comprehension demands placed on students, how teachers used follow-up moves to encourage higher level thinking, and how teachers use transcripts of their discussions to analyze and critique their own practice. Findings suggest that both pre- and in-service teachers draw heavily on students’ background knowledge and text details in their questioning; but differences exist in how pre- and in-service teachers use follow-up responses to promote knowledge building. Findings also suggest that both pre- and in-service teachers can use their transcripts to recognize areas of need, and offer themselves suggestions to better support students’ understanding. Implications are offered for teacher education and professional development.

  18. Using text mining techniques to extract phenotypic information from the PhenoCHF corpus.

    Science.gov (United States)

    Alnazzawi, Noha; Thompson, Paul; Batista-Navarro, Riza; Ananiadou, Sophia

    2015-01-01

    Phenotypic information locked away in unstructured narrative text presents significant barriers to information accessibility, both for clinical practitioners and for computerised applications used for clinical research purposes. Text mining (TM) techniques have previously been applied successfully to extract different types of information from text in the biomedical domain. They have the potential to be extended to allow the extraction of information relating to phenotypes from free text. To stimulate the development of TM systems that are able to extract phenotypic information from text, we have created a new corpus (PhenoCHF) that is annotated by domain experts with several types of phenotypic information relating to congestive heart failure. To ensure that systems developed using the corpus are robust to multiple text types, it integrates text from heterogeneous sources, i.e., electronic health records (EHRs) and scientific articles from the literature. We have developed several different phenotype extraction methods to demonstrate the utility of the corpus, and tested these methods on a further corpus, i.e., ShARe/CLEF 2013. Evaluation of our automated methods showed that PhenoCHF can facilitate the training of reliable phenotype extraction systems, which are robust to variations in text type. These results have been reinforced by evaluating our trained systems on the ShARe/CLEF corpus, which contains clinical records of various types. Like other studies within the biomedical domain, we found that solutions based on conditional random fields produced the best results, when coupled with a rich feature set. PhenoCHF is the first annotated corpus aimed at encoding detailed phenotypic information. The unique heterogeneous composition of the corpus has been shown to be advantageous in the training of systems that can accurately extract phenotypic information from a range of different text types. 
Although the scope of our annotation is currently limited to a single

  19. The Distribution of the Informative Intensity of the Text in Terms of its Structure (On Materials of the English Texts in the Mining Sphere)

    Science.gov (United States)

    Znikina, Ludmila; Rozhneva, Elena

    2017-11-01

    The article deals with the distribution of informative intensity of the English-language scientific text based on its structural features contributing to the process of formalization of the scientific text and the preservation of the adequacy of the text with derived semantic information in relation to the primary. Discourse analysis is built on specific compositional and meaningful examples of scientific texts taken from the mining field. It also analyzes the adequacy of the translation of foreign texts into another language, the relationships between elements of linguistic systems, the degree of a formal conformance, translation with the specific objectives and information needs of the recipient. Some key words and ideas are emphasized in the paragraphs of the English-language mining scientific texts. The article gives the characteristic features of the structure of paragraphs of technical text and examples of constructions in English scientific texts based on a mining theme with the aim to explain the possible ways of their adequate translation.

  20. An improved algorithm for information hiding based on features of Arabic text: A Unicode approach

    Directory of Open Access Journals (Sweden)

    A.A. Mohamed

    2014-07-01

    Steganography is the practice of hiding secret information in a cover medium so that other individuals fail to realize its existence. Because text files contain little data redundancy compared with other carrier files, text steganography is a difficult problem to solve. In this paper, we propose a new, promising steganographic algorithm for Arabic text based on features of Arabic text. The focus is on a more secure algorithm and a high carrier capacity. Our extensive experiments using the proposed algorithm resulted in a high capacity of the carrier media. The embedding capacity ratio of the proposed algorithm is high. In addition, our algorithm can resist traditional attack methods since it keeps the changes in the carrier text as minimal as possible.

  1. Using WordNet to Complement Training Information in Text Categorization

    OpenAIRE

    Rodriguez, Manuel de Buenaga; Hidalgo, Jose Maria Gomez; Agudo, Belen Diaz

    1997-01-01

    Automatic Text Categorization (TC) is a complex and useful task for many natural language applications, and is usually performed through the use of a set of manually classified documents, a training collection. We suggest the utilization of additional resources like lexical databases to increase the amount of information that TC systems make use of, and thus, to improve their performance. Our approach integrates WordNet information with two training approaches through the Vector Space Model. ...

  2. Representing nested semantic information in a linear string of text using XML.

    OpenAIRE

    Krauthammer, Michael; Johnson, Stephen B.; Hripcsak, George; Campbell, David A.; Friedman, Carol

    2002-01-01

    XML has been widely adopted as an important data interchange language. The structure of XML enables sharing of data elements with variable degrees of nesting as long as the elements are grouped in a strict tree-like fashion. This requirement potentially restricts the usefulness of XML for marking up written text, which often includes features that do not properly nest within other features. We encountered this problem while marking up medical text with structured semantic information from a N...

  3. Developing resources for sentiment analysis of informal Arabic text in social media

    OpenAIRE

    Itani, Maher; Roast, Chris; Al-Khayatt, Samir

    2017-01-01

    Natural Language Processing (NLP) applications such as text categorization, machine translation, sentiment analysis, etc., need annotated corpora and lexicons to check quality and performance. This paper describes the development of resources for sentiment analysis specifically for Arabic text in social media. A distinctive feature of the corpora and lexicons developed are that they are determined from informal Arabic that does not conform to grammatical or spelling standards. We refer to Ara...

  4. LanguageNet: A Novel Framework for Processing Unstructured Text Information

    DEFF Research Database (Denmark)

    Qureshi, Pir Abdul Rasool; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    In this paper we present LanguageNet, a novel framework for processing unstructured text information from human generated content. The state of the art information processing frameworks have some shortcomings: modeled in generalized form, trained on fixed (limited) data sets, and leaving the specialization necessary for information consolidation to the end users. The proposed framework is the first major attempt to address these shortcomings. LanguageNet provides extended support of graphical methods contributing added value to the capabilities of information processing. We discuss the benefits of the framework and compare it with the available state of the art. We also describe how the framework improves the information gathering process and contributes towards building systems with better performance in the domain of Open Source Intelligence.

  5. Ontology-based retrieval of bio-medical information based on microarray text corpora

    DEFF Research Database (Denmark)

    Hansen, Kim Allan; Zambach, Sine; Have, Christian Theil

    are exponentially growing, the text corpora are sparse and inconsistent in spite of attempts to standardize the format. Ordinary keyword search may in some cases be insufficient to find relevant information and the potential benefit of using a semantic approach in this context has only been investigated to a limited...

  6. Fixed versus dynamic co-occurrence windows in TextRank term weights for information retrieval

    DEFF Research Database (Denmark)

    Lu, Wei; Cheng, Qikai; Lioma, Christina

    2012-01-01

    iteratively is a score for each vertex, i.e. a term weight, that can be used for information retrieval (IR) just like conventional term frequency based term weights. So far, when computing TextRank term weights over co-occurrence graphs, the window of term co-occurrence is always fixed. This work departs from...

  7. Network and Ensemble Enabled Entity Extraction in Informal Text (NEEEEIT) final report

    Energy Technology Data Exchange (ETDEWEB)

    Kegelmeyer, Philip W. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Shead, Timothy M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Dunlavy, Daniel M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2013-09-01

    This SAND report summarizes the activities and outcomes of the Network and Ensemble Enabled Entity Extraction in Informal Text (NEEEEIT) LDRD project, which addressed improving the accuracy of conditional random fields for named entity recognition through the use of ensemble methods.

  8. The Relationship between Teacher Attitudes toward the Common Core State Standards and Informational Text

    Science.gov (United States)

    Estruch, Marcie Jane

    2018-01-01

    This study sought to determine the relationship between teachers' attitudes toward the Common Core State Standards and three predetermined factors. These factors were (1) teachers' attitudes toward the practicality of pedagogical shift three, balancing informational and literary texts, (2) teachers' attitudes toward school support with the…

  9. Using Digital Think-Alouds to Build Comprehension of Online Informational Texts

    Science.gov (United States)

    White, Amber

    2016-01-01

    The ability to navigate and comprehend online informational text is essential for 21st century learners. It requires orchestrating a mix of old and new reading strategies--and it's easy for teachers and students to feel overwhelmed! This column describes a way of combining student screencasting with Reciprocal Teaching to help students--and…

  10. A proposal for a drug information database and text templates for generating package inserts

    Directory of Open Access Journals (Sweden)

    Okuya R

    2013-07-01

    Ryo Okuya (1), Masaomi Kimura (2), Michiko Ohkura (2), Fumito Tsuchiya (3). (1) Graduate School of Engineering and Science, (2) Faculty of Engineering, Shibaura Institute of Technology, Tokyo; (3) School of Pharmacy, International University of Health and Welfare, Tokyo, Japan.

    Abstract: To prevent prescription errors caused by information systems, a database to store complete and accurate drug information in a user-friendly format is needed. In previous studies, the primary method for obtaining data stored in a database is to extract drug information from package inserts by employing pattern matching or more sophisticated methods such as text mining. However, it is difficult to obtain a complete database because there is no strict rule concerning expressions used to describe drug information in package inserts. The authors' strategy was to first build a database and then automatically generate package inserts by embedding data in the database using templates. To create this database, the support of pharmaceutical companies to input accurate data is required. It is expected that this system will work, because these companies can earn merit for newly developed drugs to decrease the effort to create package inserts from scratch. This study designed the table schemata for the database and text templates to generate the package inserts. To handle the variety of drug-specific information in the package inserts, this information in drug composition descriptions was replaced with labels and the replacement descriptions utilizing cluster analysis were analyzed. To improve the method by which frequently repeated ingredient information and/or supplementary information are stored, the method was modified by introducing repeat tags in the templates to indicate repetition and improving the insertion of data into the database. The validity of this method was confirmed by inputting the drug information described in existing package inserts and checking that the method could

  11. SERVICES OF FULL-TEXT SEARCHING IN A DISTRIBUTED INFORMATION ENVIRONMENT (PROJECT HUMANITARIANA)

    Directory of Open Access Journals (Sweden)

    S. K. Lyapin

    2015-01-01

    Full Text Available Problem statement. We justify the applicability of full-text search services in both universal and specialized (in terms of resource base) digital libraries for extracting and analyzing context knowledge in the humanities. The architecture and services of a virtual information and resource center for extracting knowledge from humanities texts, generated by the «Humanitariana» project, are described. The functional integration of resources and services for full-text search is described for a distributed, decentralized environment organized in an Internet/Intranet architecture under the control of the client (user browser), which accesses a variety of independent servers. An algorithm for implementing a distributed full-text query is described. Methods. Frequency-ranked and paragraph-oriented full-text queries are combined: the former are used for the preliminary analysis of the subject area or a combination product (explication of the "vertical" context, or macro context), the latter for the explication of the "horizontal" context, or micro context, within a copyright paragraph. The results of the frequency-ranked queries are used to compile paragraph-oriented queries. Results. The results of textual research are shown for the topics "The question of fact in Russian philosophy" and "The question of loneliness in Russian philosophy and culture". About 50 pieces of context knowledge, drawn from a total resource base of about 2,500 full-text resources, have been explicated and briefly described for further expert investigation. Practical significance. The proposed technology (advanced full-text searching services in a distributed information environment) can be used for the information support of research and education in the humanities, for functional integration of the resources and services of various organizations, and for carrying out interdisciplinary research.

  12. Harmony between the information contained in the text and figures of Brazilian companies’ annual reports

    Directory of Open Access Journals (Sweden)

    Marcelo Sanches Pagliarussi

    2015-03-01

    Full Text Available The purpose of this study is to analyze the harmony between information provided in the narrative sections of annual reports and the corporate financial performance. The sample consisted of 120 companies listed on BM&FBovespa (Brazilian Stock Exchange) in 2009: 60 with greater positive variation and 60 with greater negative variation in net accounting profit. Keywords relating to three central topics were selected: profitability, growth and management. Through content analysis, the meaning of these keywords in the reports was evaluated (quantitative positive, qualitative positive, quantitative negative and qualitative negative). Once the frequencies were obtained, two logistic regressions were performed to compare the text to the numbers, one on the raw frequency of words and another by a weighting of terms. The results indicate that, in the information linked to the topic of profitability, the text is harmonious with the numbers. In the information relating to growth, harmony is partial. In addition, when it comes to information linked to management, there is conflict between the narrative sections and corporate performance. Finally, it was found that the more subjective the information, the greater the conflict.

  13. The effects of two health information texts on patient recognition memory: a randomized controlled trial.

    Science.gov (United States)

    Freed, Erin; Long, Debra; Rodriguez, Tonantzin; Franks, Peter; Kravitz, Richard L; Jerant, Anthony

    2013-08-01

    To compare the effects of two health information texts on patient recognition memory, a key aspect of comprehension. Randomized controlled trial (N=60), comparing the effects of experimental and control colorectal cancer (CRC) screening texts on recognition memory, measured using a statement recognition test, accounting for response bias (score range -0.91 to 5.34). The experimental text had a lower Flesch-Kincaid reading grade level (7.4 versus 9.6), was more focused on addressing screening barriers, and employed more comparative tables than the control text. Recognition memory was higher in the experimental group (2.54 versus 1.09, t=-3.63, P=0.001), including after adjustment for age, education, and health literacy (β=0.42, 95% CI: 0.17, 0.68, P=0.001), and in analyses limited to persons with college degrees (β=0.52, 95% CI: 0.18, 0.86, P=0.004) or no self-reported health literacy problems (β=0.39, 95% CI: 0.07, 0.71, P=0.02). An experimental CRC screening text improved recognition memory, including among patients with high education and self-assessed health literacy. CRC screening texts comparable to our experimental text may be warranted for all screening-eligible patients, if such texts improve screening uptake. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
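    The reading grade levels reported in this record (7.4 versus 9.6) are Flesch-Kincaid scores. As a rough illustration of how such a score is computed, the sketch below applies the standard Flesch-Kincaid grade-level formula to word, sentence, and syllable counts. The function name and the sample counts are hypothetical and are not taken from the study; counting syllables in real text needs a separate heuristic or dictionary, which is deliberately left out here.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level from raw counts (hypothetical helper)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    # Longer sentences and more syllables per word both raise the grade level.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Made-up counts for illustration only: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(round(grade, 2))
```

    Shortening sentences or choosing words with fewer syllables lowers the result, which is the usual route to a lower grade level.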

  14. Text Mining for Information Systems Researchers: An Annotated Topic Modeling Tutorial

    DEFF Research Database (Denmark)

    Debortoli, Stefan; Müller, Oliver; Junglas, Iris

    2016-01-01

    It is estimated that more than 80 percent of today’s data is stored in unstructured form (e.g., text, audio, image, video); and much of it is expressed in rich and ambiguous natural language. Traditionally, the analysis of natural language has prompted the use of qualitative data analysis approaches, such as manual coding. Yet, the size of text data sets obtained from the Internet makes manual analysis virtually impossible. In this tutorial, we discuss the challenges encountered when applying automated text-mining techniques in information systems research. In particular, we showcase the use of probabilistic... researchers, this tutorial provides some guidance for conducting text mining studies on their own and for evaluating the quality of others.

  15. "Really," "Not Possible," "I Can't Believe It": Exploring Informational Text in Literature Circles

    Science.gov (United States)

    Barone, Diane; Barone, Rebecca

    2016-01-01

    Fifth graders' interpretations of nonfiction or informational text were explored. Each literature circle group read and responded to informational text. Discoveries included that students' conversations and written responses were closely connected to text and that students created multimodal responses.

  16. A Comparative Analysis of Information Hiding Techniques for Copyright Protection of Text Documents

    Directory of Open Access Journals (Sweden)

    Milad Taleby Ahvanooey

    2018-01-01

    Full Text Available With the ceaseless usage of the web and other online services, copying, sharing, and transmitting digital media over the Internet have become remarkably simple. Since text is one of the main available data sources and the most widely used digital medium on the Internet, a significant part of websites, books, articles, daily papers, and so on is just plain text. Therefore, copyright protection of plain texts remains an issue that must be improved in order to provide proof of ownership and obtain the desired accuracy. During the last decade, digital watermarking and steganography techniques have been used as alternatives to prevent tampering, distortion, and media forgery and also to protect both copyright and authentication. This paper presents a comparative analysis of information hiding techniques, especially those that focus on modifying the structure and content of digital texts. Herein, the characteristics of various text watermarking and text steganography techniques are highlighted along with their applications. In addition, various types of attacks are described and their effects are analyzed in order to highlight the advantages and weaknesses of current techniques. Finally, some guidelines and directions are suggested for future work.

  17. The Distribution of the Informative Intensity of the Text in Terms of its Structure (On Materials of the English Texts in the Mining Sphere)

    Directory of Open Access Journals (Sweden)

    Znikina Ludmila

    2017-01-01

    Full Text Available The article deals with the distribution of informative intensity of the English-language scientific text based on its structural features contributing to the process of formalization of the scientific text and the preservation of the adequacy of the text with derived semantic information in relation to the primary. Discourse analysis is built on specific compositional and meaningful examples of scientific texts taken from the mining field. It also analyzes the adequacy of the translation of foreign texts into another language, the relationships between elements of linguistic systems, the degree of a formal conformance, translation with the specific objectives and information needs of the recipient. Some key words and ideas are emphasized in the paragraphs of the English-language mining scientific texts. The article gives the characteristic features of the structure of paragraphs of technical text and examples of constructions in English scientific texts based on a mining theme with the aim to explain the possible ways of their adequate translation.

  18. Automated Text Markup for Information Retrieval from an Electronic Textbook of Infectious Disease

    Science.gov (United States)

    Berrios, Daniel C.; Kehler, Andrew; Kim, David K.; Yu, Victor L.; Fagan, Lawrence M.

    1998-01-01

    The information needs of practicing clinicians frequently require textbook or journal searches. Making these sources available in electronic form improves the speed of these searches, but precision (i.e., the fraction of relevant to total documents retrieved) remains low. Improving the traditional keyword search by transforming search terms into canonical concepts does not improve search precision greatly. Kim et al. have designed and built a prototype system (MYCIN II) for computer-based information retrieval from a forthcoming electronic textbook of infectious disease. The system requires manual indexing by experts in the form of complex text markup. However, this mark-up process is time consuming (about 3 person-hours to generate, review, and transcribe the index for each of 218 chapters). We have designed and implemented a system to semiautomate the markup process. The system, information extraction for semiautomated indexing of documents (ISAID), uses query models and existing information-extraction tools to provide support for any user, including the author of the source material, to mark up tertiary information sources quickly and accurately.
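    Precision, as defined in the abstract above, is the fraction of retrieved documents that are relevant. A minimal sketch of how it is computed, together with recall and the F-measure that other records in this list report, is given below. The function name and the chapter identifiers are hypothetical and serve only as an illustration; they do not come from the MYCIN II or ISAID systems.

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for one query (hypothetical helper)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Made-up example: four chapters retrieved, three chapters actually relevant.
p, r, f1 = precision_recall_f1(
    retrieved=["ch1", "ch2", "ch3", "ch4"],
    relevant=["ch2", "ch4", "ch7"],
)
print(p, r, f1)
```

    In this toy query two of the four retrieved chapters are relevant, so precision is one half, while two of the three relevant chapters are found, so recall is two thirds.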

  19. EnvMine: A text-mining system for the automatic extraction of contextual information

    Directory of Open Access Journals (Sweden)

    de Lorenzo Victor

    2010-06-01

    Full Text Available Abstract Background: For ecological studies, it is crucial to count on adequate descriptions of the environments and samples being studied. Such a description must be done in terms of their physicochemical characteristics, allowing a direct comparison between different environments that would be difficult to do otherwise. Also the characterization must include the precise geographical location, to make possible the study of geographical distributions and biogeographical patterns. Currently, there is no schema for annotating these environmental features, and these data have to be extracted from textual sources (published articles). So far, this had to be performed by manual inspection of the corresponding documents. To facilitate this task, we have developed EnvMine, a set of text-mining tools devoted to retrieving contextual information (physicochemical variables and geographical locations) from textual sources of any kind. Results: EnvMine is capable of retrieving the physicochemical variables cited in the text, by means of the accurate identification of their associated units of measurement. In this task, the system achieves a recall (percentage of items retrieved) of 92% with less than 1% error. Also a Bayesian classifier was tested for distinguishing parts of the text describing environmental characteristics from others dealing with, for instance, experimental settings. Regarding the identification of geographical locations, the system takes advantage of existing databases such as GeoNames to achieve 86% recall with 92% precision. The identification of a location includes also the determination of its exact coordinates (latitude and longitude), thus allowing the calculation of distance between the individual locations. Conclusion: EnvMine is a very efficient method for extracting contextual information from different text sources, like published articles or web pages. This tool can help in determining the precise location and physicochemical

  20. SAIL: Summation-bAsed Incremental Learning for Information-Theoretic Text Clustering.

    Science.gov (United States)

    Cao, Jie; Wu, Zhiang; Wu, Junjie; Xiong, Hui

    2013-04-01

    Information-theoretic clustering aims to exploit information-theoretic measures as the clustering criteria. A common practice on this topic is the so-called Info-Kmeans, which performs K-means clustering with KL-divergence as the proximity function. While expert efforts on Info-Kmeans have shown promising results, a remaining challenge is to deal with high-dimensional sparse data such as text corpora. Indeed, it is possible that the centroids contain many zero-value features for high-dimensional text vectors, which leads to infinite KL-divergence values and creates a dilemma in assigning objects to centroids during the iteration process of Info-Kmeans. To meet this challenge, in this paper, we propose a Summation-bAsed Incremental Learning (SAIL) algorithm for Info-Kmeans clustering. Specifically, by using an equivalent objective function, SAIL replaces the computation of KL-divergence by the incremental computation of Shannon entropy. This can avoid the zero-feature dilemma caused by the use of KL-divergence. To improve the clustering quality, we further introduce the variable neighborhood search scheme and propose the V-SAIL algorithm, which is then accelerated by a multithreaded scheme in PV-SAIL. Our experimental results on various real-world text collections have shown that, with SAIL as a booster, the clustering performance of Info-Kmeans can be significantly improved. Also, V-SAIL and PV-SAIL indeed help improve the clustering quality at a lower cost of computation.
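    The zero-feature dilemma described above can be reproduced with a few lines of code. The sketch below is not the SAIL algorithm itself; it only shows (a) that KL-divergence becomes infinite when a centroid has a zero where a document does not, and (b) a standard identity under which the summed KL-divergence to a cluster mean equals an expression built only from Shannon entropies. Whether the paper uses exactly this form of the objective is an assumption, and all names and the toy vectors are hypothetical.

```python
import math


def kl_divergence(p, q):
    """D(p || q) in nats; infinite if q is zero where p is positive."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # the term 0 * log(0 / q) is taken to be 0
        if qi == 0.0:
            return math.inf  # the zero-feature dilemma
        total += pi * math.log(pi / qi)
    return total


def entropy(p):
    """Shannon entropy in nats, skipping zero entries."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


def mean(vectors):
    """Component-wise mean of equally weighted vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


# Two toy "documents" as normalized term distributions (made-up values).
docs = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]

# (a) Using the first document as a centroid gives an infinite divergence.
zero_feature = kl_divergence(docs[1], docs[0])

# (b) Against the cluster mean, the KL sum equals an entropy-only expression.
m = mean(docs)
kl_sum = sum(kl_divergence(d, m) for d in docs)
entropy_form = len(docs) * entropy(m) - sum(entropy(d) for d in docs)
print(zero_feature, kl_sum, entropy_form)
```

    Because the entropy form never divides by a centroid component, it stays finite even for sparse vectors, which fits the abstract's claim that computing Shannon entropy avoids the zero-feature dilemma.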

  1. Text mining of cancer-related information: review of current status and future directions.

    Science.gov (United States)

    Spasić, Irena; Livsey, Jacqueline; Keane, John A; Nenadić, Goran

    2014-09-01

    This paper reviews the research literature on text mining (TM) with the aim to find out (1) which cancer domains have been the subject of TM efforts, (2) which knowledge resources can support TM of cancer-related information and (3) to what extent systems that rely on knowledge and computational methods can convert text data into useful clinical information. These questions were used to determine the current state of the art in this particular strand of TM and suggest future directions in TM development to support cancer research. A review of the research on TM of cancer-related information was carried out. A literature search was conducted on the Medline database as well as IEEE Xplore and ACM digital libraries to address the interdisciplinary nature of such research. The search results were supplemented with the literature identified through Google Scholar. A range of studies have proven the feasibility of TM for extracting structured information from clinical narratives such as those found in pathology or radiology reports. In this article, we provide a critical overview of the current state of the art for TM related to cancer. The review highlighted a strong bias towards symbolic methods, e.g. named entity recognition (NER) based on dictionary lookup and information extraction (IE) relying on pattern matching. The F-measure of NER ranges between 80% and 90%, while that of IE for simple tasks is in the high 90s. To further improve the performance, TM approaches need to deal effectively with idiosyncrasies of the clinical sublanguage such as non-standard abbreviations as well as a high degree of spelling and grammatical errors. This requires a shift from rule-based methods to machine learning following the success of similar trends in biological applications of TM. Machine learning approaches require large training datasets, but clinical narratives are not readily available for TM research due to privacy and confidentiality concerns. This issue remains the main

  2. Examining the Reading of Informational Text in 4th Grade Class and Its Relation with Students' Reading Performance

    Science.gov (United States)

    Li, Dan; Beecher, Constance; Cho, Byeong-Young

    2018-01-01

    Being proficient in independently reading and writing complex informational text has become a need for college and career success. While there is a great deal of agreement on the importance of the reading of informational text in early grades and teachers are encouraged to increase the amount of reading of informational text in early grades, few…

  3. A Synthesis of Research on Informational Text Reading Interventions for Elementary Students With Learning Disabilities.

    Science.gov (United States)

    Ciullo, Stephen; Lo, Yu-Ling Sabrina; Wanzek, Jeanne; Reed, Deborah K

    2016-01-01

    This research synthesis was conducted to understand the effectiveness of interventions designed to improve learning from informational text for students with learning disabilities in elementary school (K-5). The authors identified 18 studies through a comprehensive search. The interventions were evaluated to determine treatment effects and to understand implementation and methodological variables that influenced outcomes. Moderate to large effect sizes on researcher-developed measures for cognitive strategy interventions were reported. Interventions that utilized graphic organizers as study guides to support social studies learning were also associated with improved outcomes. The findings are considered within the context of limited implementation of standardized measures. The authors extend findings from previous research by reporting a paucity of interventions to enhance higher-level cognitive and comprehension skills. The majority of reviewed studies targeted fact acquisition and main idea identification, and overall encouraging findings were noted for these skills. Implications for future research are discussed. © Hammill Institute on Disabilities 2014.

  4. Decision-making in information seeking on texts: an eye-fixation-related potentials investigation

    Science.gov (United States)

    Frey, Aline; Ionescu, Gelu; Lemaire, Benoit; López-Orozco, Francisco; Baccino, Thierry; Guérin-Dugué, Anne

    2013-01-01

    Reading on a web page is known to be nonlinear, and people need to make fast decisions about whether or not to stop reading. In such a context, reading and decision-making processes are intertwined, and this experiment attempts to separate them through electrophysiological patterns provided by the Eye-Fixation-Related Potentials (EFRPs) technique. We conducted an experiment in which EFRPs were recorded while participants read blocks of text that were semantically highly related, moderately related, and unrelated to a given goal. Participants had to decide as fast as possible whether the text was related or not to the semantic goal given at a prior stage. Decision making (stopping information search) may occur when the paragraph is highly related to the goal (positive decision) or when it is unrelated to the goal (negative decision). EFRPs were analyzed on and around typical eye fixations: either on words belonging to the goal (target), subjected to a high rate of positive decisions, or on low frequency unrelated words (incongruent), subjected to a high rate of negative decisions. In both cases, we found specific EFRP patterns (amplitude peaking between 51 and 120 ms after fixation onset) spreading out on the next words following the goal word and the second fixation after an incongruent word, in parietal and occipital areas. We interpreted these results as delayed late components (P3b and N400), reflecting the decision to stop information searching. Indeed, we show a clear spill-over effect showing that the effect on word N spread out on word N + 1 and N + 2. PMID:23966913

  5. Decision-making in information seeking on texts: an eye-fixation-related potentials investigation.

    Science.gov (United States)

    Frey, Aline; Ionescu, Gelu; Lemaire, Benoit; López-Orozco, Francisco; Baccino, Thierry; Guérin-Dugué, Anne

    2013-01-01

    Reading on a web page is known to be nonlinear, and people need to make fast decisions about whether or not to stop reading. In such a context, reading and decision-making processes are intertwined, and this experiment attempts to separate them through electrophysiological patterns provided by the Eye-Fixation-Related Potentials (EFRPs) technique. We conducted an experiment in which EFRPs were recorded while participants read blocks of text that were semantically highly related, moderately related, and unrelated to a given goal. Participants had to decide as fast as possible whether the text was related or not to the semantic goal given at a prior stage. Decision making (stopping information search) may occur when the paragraph is highly related to the goal (positive decision) or when it is unrelated to the goal (negative decision). EFRPs were analyzed on and around typical eye fixations: either on words belonging to the goal (target), subjected to a high rate of positive decisions, or on low frequency unrelated words (incongruent), subjected to a high rate of negative decisions. In both cases, we found specific EFRP patterns (amplitude peaking between 51 and 120 ms after fixation onset) spreading out on the next words following the goal word and the second fixation after an incongruent word, in parietal and occipital areas. We interpreted these results as delayed late components (P3b and N400), reflecting the decision to stop information searching. Indeed, we show a clear spill-over effect showing that the effect on word N spread out on word N + 1 and N + 2.

  6. Text Mining to inform construction of Earth and Environmental Science Ontologies

    Science.gov (United States)

    Schildhauer, M.; Adams, B.; Rebich Hespanha, S.

    2013-12-01

    There is a clear need for better semantic representation of Earth and environmental concepts, to facilitate more effective discovery and re-use of information resources relevant to scientists doing integrative research. In order to develop general-purpose Earth and environmental science ontologies, however, it is necessary to represent concepts and relationships that span usage across multiple disciplines and scientific specialties. Traditional knowledge modeling through ontologies utilizes expert knowledge but inevitably favors the particular perspectives of the ontology engineers, as well as the domain experts who interacted with them. This often leads to ontologies that lack robust coverage of synonymy, while also missing important relationships among concepts that can be extremely useful for working scientists to be aware of. In this presentation we will discuss methods we have developed that utilize statistical topic modeling on a large corpus of Earth and environmental science articles, to expand coverage and disclose relationships among concepts in the Earth sciences. For our work we collected a corpus of over 121,000 abstracts from many of the top Earth and environmental science journals. We performed latent Dirichlet allocation topic modeling on this corpus to discover a set of latent topics, which consist of terms that commonly co-occur in abstracts. We match terms in the topics to concept labels in existing ontologies to reveal gaps, and we examine which terms are commonly associated in natural language discourse, to identify relationships that are important to formally model in ontologies. Our text mining methodology uncovers significant gaps in the content of some popular existing ontologies, and we show how, through a workflow involving human interpretation of topic models, we can bootstrap ontologies to have much better coverage and richer semantics. Because we base our methods directly on what working scientists are communicating about their

  7. Can Music Foster Learning – Effects of Different Text Modalities on Learning and Information Retrieval

    OpenAIRE

    Lehmann, Janina A. M.; Seufert, Tina

    2018-01-01

    This study investigates the possibilities of fostering learning based on differences in recall and comprehension after learning with texts which were presented in one of three modalities: either in a spoken, written, or sung version. All three texts differ regarding their processing, especially when considering working memory. Overall, we assume the best recall performance after learning with the written text and the best comprehension performance after learning with the sung text, respective...

  8. Processing and memory of information presented in narrative or expository texts.

    Science.gov (United States)

    Wolfe, Michael B W; Woodwyk, Joshua M

    2010-09-01

    Previous research suggests that narrative and expository texts differ in the extent to which they prompt students to integrate to-be-learned content with relevant prior knowledge during comprehension. We expand on previous research by examining on-line processing and representation in memory of to-be-learned content that is embedded in narrative or expository texts. We are particularly interested in how differences in the use of relevant prior knowledge lead to differences in terms of levels of discourse representation (textbase vs. situation model). A total of 61 university undergraduates participated in Expt 1, and 160 in Expt 2. In Expt 1, subjects thought out loud while comprehending circulatory system content embedded in a narrative or expository text, followed by free recall of text content. In Expt 2, subjects read silently and completed a sentence recognition task to assess memory. In Expt 1, subjects made more associations to prior knowledge while reading the expository text, and recalled more content. Content recall was also correlated with amount of relevant prior knowledge for subjects who read the expository text but not the narrative text. In Expt 2, subjects reading the expository text (compared to the narrative text) had a weaker textbase representation of the to-be-learned content, but a marginally stronger situation model. Results suggest that in terms of to-be-learned content, expository texts trigger students to utilize relevant prior knowledge more than narrative texts.

  9. Can Music Foster Learning – Effects of Different Text Modalities on Learning and Information Retrieval

    Directory of Open Access Journals (Sweden)

    Janina A. M. Lehmann

    2018-01-01

    Full Text Available This study investigates the possibilities of fostering learning based on differences in recall and comprehension after learning with texts which were presented in one of three modalities: either in a spoken, written, or sung version. All three texts differ regarding their processing, especially when considering working memory. Overall, we assume the best recall performance after learning with the written text and the best comprehension performance after learning with the sung text, respectively, compared to both other text modalities. We also analyzed whether the melody of the sung material functions as a mnemonic aid for the learners in the sung text condition. If melody and text of the sung version are closely linked, presentation of the melody during the post-test phase could foster text retrieval. 108 students either learned from a sung text performed by a professional singer, a printed text, or the same text read out loud. Half of the participants worked on the post-test while listening to the melody used for the musical learning material and the other half did not listen to a melody. The written learning modality led to significantly better recall than with the spoken (d = 0.97) or sung text (d = 0.78). However, comprehension after learning with the sung modality was significantly superior compared to when learning with the written learning modality (d = 0.40). Reading leads to more focus on details, which is required to answer recall questions, while listening fosters a general understanding of the text, leading to higher levels of comprehension. Listening to the melody during the post-test phase negatively affected comprehension, irrespective of the modality during the learning phase. This can be explained by the seductive detail effect, as listening to the melody during the post-test phase may distract learners from their main task. In closing, theoretical and practical implications are discussed.

  10. Can Music Foster Learning - Effects of Different Text Modalities on Learning and Information Retrieval.

    Science.gov (United States)

    Lehmann, Janina A M; Seufert, Tina

    2017-01-01

    This study investigates the possibilities of fostering learning based on differences in recall and comprehension after learning with texts which were presented in one of three modalities: either in a spoken, written, or sung version. All three texts differ regarding their processing, especially when considering working memory. Overall, we assume the best recall performance after learning with the written text and the best comprehension performance after learning with the sung text, respectively, compared to both other text modalities. We also analyzed whether the melody of the sung material functions as a mnemonic aid for the learners in the sung text condition. If melody and text of the sung version are closely linked, presentation of the melody during the post-test phase could foster text retrieval. 108 students either learned from a sung text performed by a professional singer, a printed text, or the same text read out loud. Half of the participants worked on the post-test while listening to the melody used for the musical learning material and the other half did not listen to a melody. The written learning modality led to significantly better recall than with the spoken (d = 0.97) or sung text (d = 0.78). However, comprehension after learning with the sung modality was significantly superior compared to when learning with the written learning modality (d = 0.40). Reading leads to more focus on details, which is required to answer recall questions, while listening fosters a general understanding of the text, leading to higher levels of comprehension. Listening to the melody during the post-test phase negatively affected comprehension, irrespective of the modality during the learning phase. This can be explained by the seductive detail effect, as listening to the melody during the post-test phase may distract learners from their main task. In closing, theoretical and practical implications are discussed.

  11. Text mining scientific papers: a survey on FCA-based information retrieval research

    NARCIS (Netherlands)

    Poelmans, J.; Ignatov, D.I.; Viaene, S.; Dedene, G.; Kuznetsov, S.O.

    2012-01-01

    Formal Concept Analysis (FCA) is an unsupervised clustering technique, and many scientific papers are devoted to applying FCA in Information Retrieval (IR) research. We collected 103 papers published between 2003 and 2009 which mention FCA and information retrieval in the abstract, title or keywords.

  12. Standard Chinese: A Modular Approach. Student Text. Module 1: Orientation; Module 2: Biographic Information.

    Science.gov (United States)

    Defense Language Inst., Monterey, CA.

    Texts in spoken Standard Chinese were developed to improve and update Chinese materials to reflect current usage in Beijing and Taipei. The focus is on communicating in Chinese in practical situations, and the texts summarize and supplement tapes. The overall course is organized into 10 situational modules, student workbooks, and resource modules.…

  13. Using Multiple Sources of Information in Establishing Text Complexity. Reading Research Report. #11.03

    Science.gov (United States)

    Hiebert, Elfrieda H.

    2011-01-01

    A focus of the Common Core State Standards/English Language Arts (CCSS/ELA) is that students become increasingly more capable with complex text over their school careers. This focus has redirected attention to the measurement of text complexity. Although CCSS/ELA suggests multiple criteria for this task, the standards offer a single measure of…

  14. Reaching rural women: breast cancer prevention information seeking behaviors and interest in Internet, cell phone, and text use.

    Science.gov (United States)

    Kratzke, Cynthia; Wilson, Susan; Vilchis, Hugo

    2013-02-01

    The purpose of this study was to examine the breast cancer prevention information seeking behaviors among rural women, the prevalence of Internet, cell, and text use, and interest in receiving breast cancer prevention information via cell and text messages. While growing literature for breast cancer information sources supports the use of the Internet, little is known about breast cancer prevention information seeking behaviors among rural women and mobile technology. Using a cross-sectional study design, data were collected using a survey. McGuire's Input-Output Model was used as the framework. Self-reported data were obtained from a convenience sample of 157 women with a mean age of 60 (SD = 12.12) at a rural New Mexico imaging center. Common interpersonal information sources were doctors, nurses, and friends and common channel information sources were television, magazines, and Internet. Overall, 87% used cell phones, 20% had an interest in receiving cell phone breast cancer prevention messages, 47% used text messaging, 36% had an interest in receiving text breast cancer prevention messages, and 37% had an interest in receiving mammogram reminder text messages. Bivariate analysis revealed significant differences between age, income, and race/ethnicity and use of cell phones or text messaging. There were no differences between age and receiving text messages or text mammogram reminders. Assessment of health information seeking behaviors is important for community health educators to target populations for program development. Future research may identify additional socio-cultural differences.

  15. National Waste Terminal Storage Program information meeting, December 7-8, 1976. [Slides only, no text]

    Energy Technology Data Exchange (ETDEWEB)

    1976-12-06

    Volume II of the report comprises copies of the slides from the talks presented at the second session of the National Waste Terminal Storage Program information meeting. This session was devoted to geologic studies. (LK)

  16. Impact of Metadata on Full-text Information Retrieval Performance: An Experimental Research on a Small Scale Turkish Corpus

    Directory of Open Access Journals (Sweden)

    Çağdaş Çapkın

    2016-12-01

    Information institutions use text-based information retrieval systems to store, index and retrieve metadata, full-text, or both metadata and full-text (hybrid) contents. The aim of this research was to evaluate the impact of these contents on information retrieval performance. For this purpose, metadata (MIR), full-text (FIR) and hybrid (HIR) content information retrieval systems were developed with the default Lucene information retrieval model for a small scale Turkish corpus. In order to evaluate the performance of these three systems, “precision - recall” and “normalized recall” tests were conducted. Experimental findings showed that there were no significant differences between MIR and FIR in mean average precision (MAP) performance. On the other hand, MAP performance of HIR was significantly higher in comparison to MIR and FIR. When information retrieval performance was evaluated as user-centered, the “normalized recall” performances of MIR and HIR were significantly higher than FIR. Additionally, there were no significant differences between the systems in retrieved relevant document means. Processing different types of contents such as metadata and full-text had some advantages and disadvantages for information retrieval systems in terms of term management. The advantages were brought together in hybrid content processing (HIR), and information retrieval performance improved.
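
    Mean average precision (MAP), the measure used above to compare the three systems, can be sketched as follows. This is a minimal illustration of the standard definition, not the study's evaluation code; the document identifiers and relevance judgments are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of the precision values at each rank where a relevant document appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranking, relevant set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical rankings for two queries (illustration only).
runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
    (["d5", "d6"], {"d6"}),
]
print(round(mean_average_precision(runs), 4))
```

    Because each relevant hit is weighted by its rank, MAP rewards systems that place relevant documents near the top of the result list, not only those that retrieve them somewhere.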

  17. The Role of Domain and System Knowledge on Text Comprehension and Information Search in Hypermedia

    Science.gov (United States)

    Waniek, Jacqueline; Schafer, Thomas

    2009-01-01

    The goal of this study was to examine the role of domain and system knowledge on learner performance in reading and information search in hypermedia. Previous studies have shown that prior knowledge is an important individual factor for effective hypermedia use. However, current research lacks a full understanding of how these two aspects of prior…

  18. Surveillance in the Information Age: Text Quantification, Anomaly Detection, and Empirical Evaluation

    Science.gov (United States)

    Lu, Hsin-Min

    2010-01-01

    Deep penetration of personal computers, data communication networks, and the Internet has created a massive platform for data collection, dissemination, storage, and retrieval. Large amounts of textual data are now available at a very low cost. Valuable information, such as consumer preferences, new product developments, trends, and opportunities,…

  19. Exploiting Structure and Conventions of Movie Scripts for Information Retrieval and Text Mining

    DEFF Research Database (Denmark)

    Jhala, Arnav

    2008-01-01

    Movie scripts are documents that describe the story, stage direction for actors and camera, and dialogue. Script writers, directors, and cinematographers have standardized the format and language that is used in script writing. Scripts contain a wealth of information about narrative patterns, cha...

  20. Differences among college women for breast cancer prevention acquired information-seeking, desired apps and texts, and daughter-initiated information to mothers.

    Science.gov (United States)

    Kratzke, Cynthia; Amatya, Anup; Vilchis, Hugo

    2014-04-01

    The purpose of this study was to examine among college women acquired breast cancer prevention information-seeking, desired apps and texts, and information given to mothers. Using a cross-sectional study, a survey was administered to college women at a southwestern university. College women (n = 546) used the Internet (44 %) for active breast cancer prevention information-seeking and used the Internet (74 %), magazines (69 %), and television (59 %) for passive information receipt. Over half of the participants desired breast cancer prevention apps (54 %) and texts (51 %). Logistic regression analyses revealed predictors for interest to receive apps were ethnicity (Hispanic), lower self-efficacy, actively seeking online information, and older age and predictors for interest to receive texts were lower self-efficacy and higher university level. Eighteen percent of college women (n = 99) reported giving information to mothers and reported in an open-ended item the types of information given to mothers. Predictors for giving information to mothers were actively and passively seeking online information, breast self-exam practice, and higher university level. Screenings were the most frequent types of information given to mothers. Breast cancer prevention information using apps, texts, or Internet and daughter-initiated information for mothers should be considered in health promotion targeting college students or young women in communities. Future research is needed to examine the quality of apps, texts, and online information and cultural differences for breast cancer prevention sources.

  1. Semantic Web and Contextual Information: Semantic Network Analysis of Online Journalistic Texts

    Science.gov (United States)

    Lim, Yon Soo

    This study examines why contextual information is important to actualize the idea of semantic web, based on a case study of a socio-political issue in South Korea. For this study, semantic network analyses were conducted regarding English-language based 62 blog posts and 101 news stories on the web. The results indicated the differences of the meaning structures between blog posts and professional journalism as well as between conservative journalism and progressive journalism. From the results, this study ascertains empirical validity of current concerns about the practical application of the new web technology, and discusses how the semantic web should be developed.

  2. Text, pictures or animations in instructions for use : a validation of different media for specific types of information

    NARCIS (Netherlands)

    Westendorp, P.H.; Ensink, T.; Sauer, C.

    1996-01-01

    The Bieger and Glock taxonomy of information types is applied to test the relative effectiveness of text, pictures and animation in on-line help systems. On the basis of this taxonomy seven versions of an on-line help system for telephones were designed, varying text, picture and animation for the

  3. The Effects of Literacy Support Tools on the Comprehension of Informational e-Books and Print-Based Text

    Science.gov (United States)

    Herman, Heather A.

    2017-01-01

    This mixed methods research explores the effects of literacy support tools to support comprehension strategies when reading informational e-books and print-based text with 14 first-grade students. This study focused on the following comprehension strategies: annotating connections, annotating "I wonders," and looking back in the text.…

  4. Comparison of Two Different Presentations of Graphic Organizers in Recalling Information in Expository Texts with Intellectually Disabled Students

    Science.gov (United States)

    Ozmen, Ruya Guzel

    2011-01-01

    The purpose of this study was to compare the effectiveness of two different presentations of graphic organizers on recalling information from compare/contrast text which is a kind of expository text in intellectually disabled students. The first presentation included graphic organizers which were presented before reading whereas in the second…

  5. Executive functions and social information processing in adolescents with severe behavior problems.

    Science.gov (United States)

    Van Nieuwenhuijzen, M; Van Rest, M M; Embregts, P J C M; Vriens, A; Oostermeijer, S; Van Bokhoven, I; Matthys, W

    2017-02-01

    One tradition in research for explaining aggression and antisocial behavior has focused on social information processing (SIP). Aggression and antisocial behavior have also been studied from the perspective of executive functions (EFs), the higher-order cognitive abilities that affect other cognitive processes, such as social cognitive processes. The main goal of the present study is to provide insight into the relation between EFs and SIP in adolescents with severe behavior problems. Because of the hierarchical relation between EFs and SIP, we examined EFs as predictors of SIP. We hypothesized that, first, focused attention predicts encoding and interpretation, second, inhibition predicts interpretation, response generation, evaluation, and selection, and third, working memory predicts response generation and selection. The participants consisted of 94 respondents living in residential facilities aged 12-20 years, all showing behavior problems in the clinical range according to care staff. EFs were assessed using subtests from the Amsterdam Neuropsychological Test battery. Focused attention was measured by the Flanker task, inhibition by the GoNoGo task, and working memory by the Visual Spatial Sequencing task. SIP was measured by video vignettes and a structured interview. The results indicate that positive evaluation of aggressive responses is predicted by impaired inhibition and selection of aggressive responses by a combination of impaired focused attention and inhibition. It is concluded that different components of EFs as higher-order cognitive abilities affect SIP.

  6. Slovene specialized text corpus of Library and Information Science – An advanced lexicographic tool for library terminology research

    OpenAIRE

    Kanič, Ivan

    2013-01-01

    To support research in the field of library and information science terminology and dictionary construction in the Slovene language, a specialized text corpus has been designed and constructed. The corpus has reached 3.6 million words extracted from 625 Slovene technical and scientific texts of the field. It supports a variety of specialized search methods, display of search results, and their statistical computation. The web-based application is in open public access.

  7. The Effects of Collaborative Strategic Reading on Informational Text Comprehension and Metacognitive Awareness of Fifth Grade Students

    Science.gov (United States)

    McCown, Margaret Averill

    2013-01-01

    This study examined the effects of Collaborative Strategic Reading (CSR) on informational text comprehension and metacognitive awareness of fifth grade students. This study tested the theories of metacognition and social cognition with a focus on self-regulation and self-efficacy. Participating students included a heterogeneous mix of regular…

  8. Explicit Instruction of Graphic Organizers as an Informational Text Reading Comprehension Strategy: Third-Grade Students' Strategies and Perceptions

    Science.gov (United States)

    Fealy, Erin Marie

    2010-01-01

    The purpose of this case study research was to explore the effects of explicit instruction of graphic organizers to support students' understandings of informational text. An additional purpose was to investigate students' perceptions of using graphic organizers as a comprehension strategy. Using case study methodology, this study occurred…

  9. Students' Consideration of Source Information during the Reading of Multiple Texts and Its Effect on Intertextual Conflict Resolution

    Science.gov (United States)

    Kobayashi, Keiichi

    2014-01-01

    This study investigated students' spontaneous use of source information for the resolution of conflicts between texts. One-hundred fifty-four undergraduate students read two conflicting explanations concerning the relationship between blood type and personality under two conditions: either one explanation with a higher credibility source and…

  10. Teaching Matters: Is There a Text in This Class? E-readers, E-books, and Information Literacy

    Directory of Open Access Journals (Sweden)

    Janelle M. Zauha

    2012-04-01

    This column explores current issues with e-books and e-readers in academic classrooms. It suggests ways the academic library can explore and meet the information literacy needs of students, faculty, and staff who are using these new devices or seeing them in use in their classrooms.

  11. AMPLIFICATION AND COMPRESSION OF THE TEXT AND ITS TITLE AS A MEANS OF CONVEYING THE INFORMATION STRUCTURE

    Directory of Open Access Journals (Sweden)

    Buyanova, E.V.

    2017-03-01

    This article takes stock of the basic notions of information structure. There are two communicative goals to satisfy: making the information conveyed by the discourse easier for the reader/hearer to understand, and indicating what the enunciator considers to be the most important. When translating from one language into another, the information structure in most cases remains unchanged. However, the text in the target language may not always be completely clear to the new recipient for a number of reasons, such as social and national differences between speakers of the two languages, or lack of realia in the target language. In this case the information structure needs extension in the form of descriptions, definitions, and commentaries. This results either in amplification of the text in the target language or in its compression. The present work is based on an analysis of papers from American and British journals and periodicals. The article also deals with the peculiarities of the metaphor as a means of broader text compression in the titles of newspaper articles.

  12. Science Informational Trade Books: An Exploration of Text-based Practices and Interactions in a First-grade Classroom

    Science.gov (United States)

    Schreier, Virginia A.

    Although scholars have long advocated the use of informational texts in the primary grades, gaps and inconsistencies in research have produced conflicting reports on how teachers used these texts in the primary curriculum, and how primary students dealt with them during instruction and on their own (e.g., Saul & Dieckman, 2005). Thus, to add to research on informational texts in the primary grades, the purpose of this study was to examine: (a) a first-grade teacher's use of science informational trade books (SITBs) in her classroom, (b) the ways students responded to her instruction, and (c) how students interacted with these texts. My study was guided by a sociocultural perspective (e.g., Bakhtin, 1981; Vygotsky, 1978), providing me a lens to examine participants during naturally occurring social practices in the classroom, mediated by language and other symbolic tools. Data were collected by means of 28 observations, 6 semi-structured interviews, 21 unstructured interviews, and 26 documents over the course of 10 weeks. Three themes were generated from the data to provide insight into the teacher's and students' practices and interactions with SITBs. First, the first-grade teacher used SITBs as teaching tools during guided conversations around the text to scaffold students' understanding of specialized vocabulary, science concepts, and text features. Her instruction with SITBs included shared reading lessons, interactive read-alouds and learning activities during two literacy/science units. However, there was limited use of SITBs during the rest of her reading program, in which she demonstrated a preference for narrative. Second, students responded to instruction by participating in guided conversations around the text, in which they used prior knowledge, shared ideas, and visual representations (e.g., illustrations, diagrams, labels, and captions) to actively make meaning of the text. Third, students interacted with SITBs on their own to make sense of science, in

  13. Extraction of Pluvial Flood Relevant Volunteered Geographic Information (VGI) by Deep Learning from User Generated Texts and Photos

    Directory of Open Access Journals (Sweden)

    Yu Feng

    2018-01-01

    In recent years, pluvial floods caused by extreme rainfall events have occurred frequently. Especially in urban areas, they lead to serious damage and endanger the citizens’ safety. Therefore, real-time information about such events is desirable. With the increasing popularity of social media platforms, such as Twitter or Instagram, information provided by voluntary users becomes a valuable source for emergency response. Many applications have been built for disaster detection and flood mapping using crowdsourcing. Most of the applications so far have merely used keyword filtering or classical language processing methods to identify disaster relevant documents based on user generated texts. As the reliability of social media information is often under criticism, the precision of information retrieval plays a significant role for further analyses. Thus, in this paper, high quality eyewitnesses of rainfall and flooding events are retrieved from social media by applying deep learning approaches on user generated texts and photos. Subsequently, events are detected through spatiotemporal clustering and visualized together with these high quality eyewitnesses in a web map application. Analyses and case studies are conducted during flooding events in Paris, London and Berlin.

  14. Calling, texting, and searching for information while riding a motorcycle: A study of university students in Vietnam.

    Science.gov (United States)

    Truong, Long T; De Gruyter, Chris; Nguyen, Hang T T

    2017-08-18

    The objective of this study was to investigate the prevalence of calling, texting, and searching for information while riding a motorcycle among university students and the influences of sociodemographic characteristics, social norms, and risk perceptions on these behaviors. Students at 2 university campuses in Hanoi and Ho Chi Minh City, the 2 largest cities in Vietnam, were invited to participate in an anonymous online survey. Data collection was conducted during March and May 2016. There were 741 respondents, of whom nearly 90% of students (665) were motorcycle riders. Overall prevalence of mobile phone use while riding is 80.9% (95% confidence interval [CI], 77.9-83.9%) with calling having a higher level of prevalence than texting or searching for information while riding: 74% (95% CI, 70.7-77.3%) vs. 51.7% (95% CI, 47.9-55.5%) and 49.9% (95% CI, 46.1-53.7%), respectively. Random parameter ordered probit modeling results indicate that mobile phone use while riding is associated with gender, motorcycle license duration, perceived crash risk, perceived risk of mobile phone snatching, and perceptions of friends' mobile phone use while riding. Mobile phone use while riding a motorcycle is highly prevalent among university students. Educational programs should focus on the crash and economic risk of all types of mobile phone use while riding, including calling, texting, and searching for information. In addition, they should consider targeting the influence of social norms and peers on mobile phone use while riding.
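
    The 95% confidence intervals quoted above can be approximated with a normal-approximation (Wald) interval for a proportion. The sketch below rests on two assumptions not stated in the abstract: that the authors used this method, and that the denominator is the 665 motorcycle riders. Under those assumptions the computed bounds agree with the reported ones to one decimal place.

```python
from math import sqrt


def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a proportion (z=1.96 for 95%)."""
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


N_RIDERS = 665  # assumed denominator: the motorcycle riders among respondents

# Prevalence figures as reported in the abstract.
reported = {
    "any mobile phone use": 0.809,
    "calling": 0.74,
    "texting": 0.517,
    "searching for information": 0.499,
}

for behavior, p_hat in reported.items():
    low, high = wald_interval(p_hat, N_RIDERS)
    print(f"{behavior}: {p_hat:.1%} (95% CI, {low:.1%}-{high:.1%})")
```

    The Wald interval is adequate here because the sample is large and the proportions are not close to 0 or 1; for small samples or extreme proportions, a Wilson interval is usually preferred.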

  15. Comparing data accuracy between structured abstracts and full-text journal articles: implications in their use for informing clinical decisions.

    Science.gov (United States)

    Fontelo, Paul; Gavino, Alex; Sarmiento, Raymond Francis

    2013-12-01

    The abstract is the most frequently read section of a research article. The use of 'Consensus Abstracts', a clinician-oriented web application formatted for mobile devices to search MEDLINE/PubMed, for informing clinical decisions was proposed recently; however, inaccuracies between abstracts and the full-text article have been shown. Efforts have been made to improve quality. We compared data in 60 recent-structured abstracts and full-text articles from six highly read medical journals. Data inaccuracies were identified and then classified as either clinically significant or not significant. Data inaccuracies were observed in 53.33% of articles ranging from 3.33% to 45% based on the IMRAD format sections. The Results section showed the highest discrepancies (45%) although these were deemed to be mostly not significant clinically except in one. The two most common discrepancies were mismatched numbers or percentages (11.67%) and numerical data or calculations found in structured abstracts but not mentioned in the full text (40%). There was no significant relationship between journals and the presence of discrepancies (Fisher's exact p value =0.3405). Although we found a high percentage of inaccuracy between structured abstracts and full-text articles, these were not significant clinically. The inaccuracies do not seem to affect the conclusion and interpretation overall. Structured abstracts appear to be informative and may be useful to practitioners as a resource for guiding clinical decisions.

  16. Assessing Unmet Information Needs of Breast Cancer Survivors: Exploratory Study of Online Health Forums Using Text Classification and Retrieval.

    Science.gov (United States)

    McRoy, Susan; Rastegar-Mojarad, Majid; Wang, Yanshan; Ruddy, Kathryn J; Haddad, Tufia C; Liu, Hongfang

    2018-05-15

    Patient education materials given to breast cancer survivors may not be a good fit for their information needs. Needs may change over time, be forgotten, or be misreported, for a variety of reasons. An automated content analysis of survivors' postings to online health forums can identify expressed information needs over a span of time and be repeated regularly at low cost. Identifying these unmet needs can guide improvements to existing education materials and the creation of new resources. The primary goals of this project are to assess the unmet information needs of breast cancer survivors from their own perspectives and to identify gaps between information needs and current education materials. This approach employs computational methods for content modeling and supervised text classification to data from online health forums to identify explicit and implicit requests for health-related information. Potential gaps between needs and education materials are identified using techniques from information retrieval. We provide a new taxonomy for the classification of sentences in online health forum data. 260 postings from two online health forums were selected, yielding 4179 sentences for coding. After annotation of data and training alternative one-versus-others classifiers, a random forest-based approach achieved F1 scores from 66% (Other, dataset2) to 90% (Medical, dataset1) on the primary information types. 136 expressions of need were used to generate queries to indexed education materials. Upon examination of the best two pages retrieved for each query, 12% (17/136) of queries were found to have relevant content by all coders, and 33% (45/136) were judged to have relevant content by at least one. Text from online health forums can be analyzed effectively using automated methods. 
Our analysis confirms that breast cancer survivors have many information needs that are not covered by the written documents they typically receive, as our results suggest that at most

  17. Text analysis of radiation information in newspaper articles headlines and internet contents after the Fukushima Nuclear Power Plant accident

    International Nuclear Information System (INIS)

    Kanda, Reiko; Tsuji, Satsuki; Yonehara, Hidenori

    2014-01-01

    In general, the press is considered to have amplified the level of the public's anxiety and perception of risk. In the present study, we analyzed newspaper article headlines and Internet contents that were released from March 11, 2011 to January 31, 2012 using text mining techniques. The aim is to reveal the particular characteristics of the information propagated regarding the Fukushima NPP Accident. The article headlines of the newspapers with the largest circulation were chosen for analysis, and contents of Internet media were chosen based on the number of times they were linked or retweeted. According to our text mining analysis, newspapers frequently reported the 'measurement, investigation and examination' of radiation/radioactive materials caused by the Fukushima Accident, and this information might have been spread selectively via social media. On the other hand, words related to health effects of radiation exposure (i.e., cancer, hereditary effects) were rare in newspaper headlines. Instead, words like 'anxiety' and 'safe' were often used to convey the degree of health effects. Particularly in March of 2011, the concept of 'danger' was used frequently in newspaper headlines. These indirect characterizations of the situation may have contributed more or less to the misunderstanding of the health effects and to the enhanced perception of risk felt by the public. In conclusion, no evidence was found to suggest that newspapers or Internet media users released sensational information that increased the health anxiety of readers throughout the period of analysis. (author)

  18. A Study on Environmental Research Trends Using Text-Mining Method - Focus on Spatial information and ICT -

    Science.gov (United States)

    Lee, M. J.; Oh, K. Y.; Joung-ho, L.

    2016-12-01

    Recently, much research in various fields has analysed the interaction between entities by text-mining analysis. In this paper, we aimed to quantitatively analyse research trends in the area of environmental research relating to either spatial information or ICT (Information and Communications Technology) by text-mining analysis. To do this, we applied a low-dimensional embedding method, clustering analysis, and association rules to find meaningful associative patterns among key words that frequently appeared in the articles. As the authors suppose that KCI (Korea Citation Index) articles reflect academic demands, a total of 1228 KCI articles published from 1996 to 2015 were reviewed and analysed by the text-mining method. First, we retrieved KCI articles from the NDSL (National Discovery for Science Leaders) site. We then pre-processed the key words selected from their abstracts and classified them into separable sectors. We investigated the appearance rates and association rules of key words for articles in the two fields: spatial information and ICT. In order to detect historic trends, the analysis was conducted separately for four periods: 1996-2000, 2001-2005, 2006-2010, 2011-2015. These analyses were conducted using the R software. As a result, we confirmed that environmental research relating to spatial information mainly focused on such fields as `GIS(35%)', `Remote-Sensing(25%)', `environmental theme map(15.7%)'. Next, `ICT technology(23.6%)', `ICT service(5.4%)', `mobile(24%)', `big data(10%)', `AI(7%)' are primarily emerging from environmental research relating to ICT. Thus, from the analysis results, this paper asserts that research trends and academic progress are well-structured to review recent spatial information and ICT technology, and that the outcomes of the analysis can be adequate guidelines for establishing environmental policies and strategies. 
KEY WORDS: Big data, Text-mining, Environmental research, Spatial-information, ICT Acknowledgements: The

  19. KID - an algorithm for fast and efficient text mining used to automatically generate a database containing kinetic information of enzymes

    Directory of Open Access Journals (Sweden)

    Schomburg Dietmar

    2010-07-01

    Background: The amount of available biological information is rapidly increasing and the focus of biological research has moved from single components to networks and even larger projects aiming at the analysis, modelling and simulation of biological networks as well as large scale comparison of cellular properties. It is therefore essential that biological knowledge is easily accessible. However, most information is contained in the written literature in an unstructured way, so that methods for the systematic extraction of knowledge directly from the primary literature have to be deployed. Description: Here we present a text mining algorithm for the extraction of kinetic information such as KM, Ki, kcat etc. as well as associated information such as enzyme names, EC numbers, ligands, organisms, localisations, pH and temperatures. Using this rule- and dictionary-based approach, it was possible to extract 514,394 kinetic parameters of 13 categories (KM, Ki, kcat, kcat/KM, Vmax, IC50, S0.5, Kd, Ka, t1/2, pI, nH, specific activity, Vmax/KM) from about 17 million PubMed abstracts and combine them with other data in the abstract. A manual verification of approx. 1,000 randomly chosen results yielded a recall between 51% and 84% and a precision ranging from 55% to 96%, depending on the category searched. The results were stored in a database and are available as "KID the KInetic Database" via the internet. Conclusions: The presented algorithm delivers a considerable amount of information and may therefore help to accelerate the research and the automated analysis required for today's systems biology approaches. The database obtained by analysing PubMed abstracts may be a valuable help in the field of chemical and biological kinetics. It is completely based upon text mining and therefore complements manually curated databases. The database is available at http://kid.tu-bs.de. 
The source code of the algorithm is provided under the GNU General Public

  20. TCMGeneDIT: a database for associated traditional Chinese medicine, gene and disease information using text mining

    Directory of Open Access Journals (Sweden)

    Chen Hsin-Hsi

    2008-10-01

    Full Text Available Abstract Background Traditional Chinese Medicine (TCM), a complementary and alternative medical system in Western countries, has been used to treat various diseases over thousands of years in East Asian countries. In recent years, many herbal medicines were found to exhibit a variety of effects through regulating a wide range of gene expressions or protein activities. As available TCM data continue to accumulate rapidly, there is an urgent need to explore these resources systematically, so as to effectively utilize the large volume of literature. Methods TCM, gene, disease, biological pathway and protein-protein interaction information were collected from public databases. For association discovery, the TCM names, gene names, disease names, TCM ingredients and effects were used to annotate the literature corpus obtained from PubMed. The concept to mine entity associations was based on hypothesis testing and collocation analysis. The annotated corpus was processed with natural language processing tools and rule-based approaches were applied to the sentences for extracting the relations between TCM effecters and effects. Results We developed a database, TCMGeneDIT, to provide association information about TCMs, genes, diseases, TCM effects and TCM ingredients mined from a vast amount of biomedical literature. Integrated protein-protein interaction and biological pathway information is also available for exploring the regulations of genes associated with TCM curative effects. In addition, the transitive relationships among genes, TCMs and diseases could be inferred through the shared intermediates. Furthermore, TCMGeneDIT is useful in understanding the possible therapeutic mechanisms of TCMs via gene regulations and deducing synergistic or antagonistic contributions of the prescription components to the overall therapeutic effects. The database is now available at http://tcm.lifescience.ntu.edu.tw/. 
Conclusion TCMGeneDIT is a unique database

  1. Systematically extracting metal- and solvent-related occupational information from free-text responses to lifetime occupational history questionnaires.

    Science.gov (United States)

    Friesen, Melissa C; Locke, Sarah J; Tornow, Carina; Chen, Yu-Cheng; Koh, Dong-Hee; Stewart, Patricia A; Purdue, Mark; Colt, Joanne S

    2014-06-01

    Lifetime occupational history (OH) questionnaires often use open-ended questions to capture detailed information about study participants' jobs. Exposure assessors use this information, along with responses to job- and industry-specific questionnaires, to assign exposure estimates on a job-by-job basis. An alternative approach is to use information from the OH responses and the job- and industry-specific questionnaires to develop programmable decision rules for assigning exposures. As a first step in this process, we developed a systematic approach to extract the free-text OH responses and convert them into standardized variables that represented exposure scenarios. Our study population comprised 2408 subjects, reporting 11991 jobs, from a case-control study of renal cell carcinoma. Each subject completed a lifetime OH questionnaire that included verbatim responses, for each job, to open-ended questions including job title, main tasks and activities (task), tools and equipment used (tools), and chemicals and materials handled (chemicals). Based on a review of the literature, we identified exposure scenarios (occupations, industries, tasks/tools/chemicals) expected to involve possible exposure to chlorinated solvents, trichloroethylene (TCE) in particular, lead, and cadmium. We then used a SAS macro to review the information reported by study participants to identify jobs associated with each exposure scenario; this was done using previously coded standardized occupation and industry classification codes, and a priori lists of associated key words and phrases related to possibly exposed tasks, tools, and chemicals. Exposure variables representing the occupation, industry, and task/tool/chemicals exposure scenarios were added to the work history records of the study respondents. Our identification of possibly TCE-exposed scenarios in the OH responses was compared to an expert's independently assigned probability ratings to evaluate whether we missed identifying

  2. The role of readability in effective health communication: an experiment using a Japanese health information text on chronic suppurative otitis media.

    Science.gov (United States)

    Sakai, Yukiko

    2013-09-01

    This study identifies the most significant readability factors and examines ways of improving and evaluating Japanese health information text in terms of ease of reading and understanding. Six different Japanese texts were prepared based on an original short text written by a medical doctor for a hospital web site intended for laypersons regarding chronic suppurative otitis media. Four were revised for a single readability factor (syntax, vocabulary, or text structure) and two were modified in all three factors. Using a web-based survey, 270 high school students read one of the seven texts, including the original, completed two kinds of comprehension tests, and answered questions on their impressions of the text's readability. Significantly higher comprehension test scores were shown in the true or false test for a mixed text that presented important information first for better text structure. They were also found in the cloze test for a text using common vocabulary and a cohesive mixed text. Vocabulary could be a critical single readability factor when presumably combined with better text structure. Using multiple evaluation methods can help assess comprehensive readability. The findings on improvement and evaluation methods of readability can be applied to support effective health communication. © 2013 The authors. Health Information and Libraries Journal © 2013 Health Libraries Group Health Information and Libraries Journal.

  3. About the role of stylistic and syntactic devices of expansion in the informational complex of dicteme of a German advertising text

    Directory of Open Access Journals (Sweden)

    Артур Нарманович Мамедов

    2012-12-01

    Full Text Available The article highlights stylistic and syntactic devices of expansion, which act as compositional means, vary the normative syntactic structure of an advertising text, and contribute to sense formation, creating conditions for realizing the advertiser’s intent. By means of these language elements, which express an invariant tactical sense, the advertiser consciously expands and/or complicates the informative complex of the dicteme, an acting text unit, transmitting superfluous impressive information together with factual information. The combination of factual and impressive information activates both the rational and emotional perceptual channels of the prospective consumer and intensifies the positioning of the advertised article.

  4. Computerized extraction of information on the quality of diabetes care from free text in electronic patient records of general practitioners

    NARCIS (Netherlands)

    Voorham, Jaco; Denig, Petra

    2007-01-01

    Objective: This study evaluated a computerized method for extracting numeric clinical measurements related to diabetes care from free text in electronic patient records (EPR) of general practitioners. Design and Measurements: Accuracy of this number-oriented approach was compared to manual chart

  5. Text Mining.

    Science.gov (United States)

    Trybula, Walter J.

    1999-01-01

    Reviews the state of research in text mining, focusing on newer developments. The intent is to describe the disparate investigations currently included under the term text mining and provide a cohesive structure for these efforts. A summary of research identifies key organizations responsible for pushing the development of text mining. A section…

  6. Dealing with Uncertainty: Readers' Memory for and Use of Conflicting Information from Science Texts as Function of Presentation Format and Source Expertise

    Science.gov (United States)

    Stadtler, Marc; Scharrer, Lisa; Brummernhenrich, Benjamin; Bromme, Rainer

    2013-01-01

    Past research has shown that readers often fail to notice conflicts in text. In our present study we investigated whether accessing information from multiple documents instead of a single document might alleviate this problem by motivating readers to integrate information. We further tested whether this effect would be moderated by source…

  7. When Bitcoin encounters information in an online forum: Using text mining to analyse user opinions and predict value fluctuation.

    Directory of Open Access Journals (Sweden)

    Young Bin Kim

    Full Text Available Bitcoin is an online currency that is used worldwide to make online payments. It has consequently become an investment vehicle in itself and is traded in a way similar to other open currencies. The ability to predict the price fluctuation of Bitcoin would therefore facilitate future investment and payment decisions. In order to predict the price fluctuation of Bitcoin, we analyse the comments posted in the Bitcoin online forum. Unlike most research on Bitcoin-related online forums, which is limited to simple sentiment analysis and does not pay sufficient attention to note-worthy user comments, our approach involved extracting keywords from Bitcoin-related user comments posted on the online forum with the aim of analytically predicting the price and extent of transaction fluctuation of the currency. The effectiveness of the proposed method is validated based on Bitcoin online forum data ranging over a period of 2.8 years from December 2013 to September 2016.

  8. The Deviser sequence: a new type of informative text from the choral interviews of MARCA.com

    Directory of Open Access Journals (Sweden)

    Daniel BARREDO IBÁÑEZ

    2014-10-01

    Full Text Available The development of the Internet has brought the emergence of new forms of discourse and, therefore, new journalistic forms: new cybergenres (such as the choral interviews in MARCA.com), which take advantage of the technical and ideological substrate of the net. In this article we focus on the morphological aspects of these participative forms, and then show the evolution of a case of a new type of journalism, linked to participatory journalism, that we have named deviser to define a kind of communication which starts from a non-professional transmitter. Thanks to the choral interviews these transmissions are being spread in the mass media and the social networks, and after a polyphasic trip they end up melted into the cultural heritage. In our ethnographic analysis, we have observed in the choral interviews vestiges of a pure journalism, a journalism which was not necessarily contaminated by organizational issues, and in general a journalistic positioning which tends to abolish hierarchies (spelling, structural) in favour of greater horizontality or transcendence, in what some theorists have defined as “heterarchical regimes” (Bruns, 2006, p. 6) or cybercommunism (Barbrook, 2000).

  9. Text-Based Language Teaching and the Analysis of Tasks Presented in English Course Books for Students of Information Technology and Computing

    Directory of Open Access Journals (Sweden)

    Valerija Marina

    2011-04-01

    Full Text Available The paper describes the essential features of a connected text, helping to raise learners’ awareness of its structure and organization and improve their reading comprehension skills. Classroom applications of various approaches to handling texts and text-based activities are also discussed and their main advantages and disadvantages are outlined. Tasks based on text transformation and reconstruction found in the course books of English for students of computing and information technology are analysed and their types are determined. The efficiency of the tasks is determined by considering the experience of the authors gained in using text-based assignments provided in these course books with the students of the above specialities. Some problems encountered in classroom application of the considered text-based tasks are also outlined.

  10. Contextual Text Mining

    Science.gov (United States)

    Mei, Qiaozhu

    2009-01-01

    With the dramatic growth of text information, there is an increasing need for powerful text mining systems that can automatically discover useful knowledge from text. Text is generally associated with all kinds of contextual information. Those contexts can be explicit, such as the time and the location where a blog article is written, and the…

  11. The Development of Strategies for the Assignment of Semantic Information to Unknown Lexemes in Text. Lenguas para Objetivos Especificas (Languages for Special Purposes), No. 5.

    Science.gov (United States)

    Alderson, Charles; Alvarez, Guadalupe

    An English for Special Purposes (ESP) course being developed aims to give the students a series of techniques to help them handle vocabulary in a text, and teach them strategies for identifying meaning in context. Traditional strategies, such as the study of morphology, use of grammatical information, and exercises in dictionary usage, are…

  12. A randomized trial of computer-based communications using imagery and text information to alter representations of heart disease risk and motivate protective behaviour.

    Science.gov (United States)

    Lee, Tarryn J; Cameron, Linda D; Wünsche, Burkhard; Stevens, Carey

    2011-02-01

    Advances in web-based animation technologies provide new opportunities to develop graphic health communications for dissemination throughout communities. We developed imagery and text contents of brief, computer-based programmes about heart disease risk, with both imagery and text contents guided by the common-sense model (CSM) of self-regulation. The imagery depicts a three-dimensional, beating heart tailored to user-specific information. A 2 × 2 × 4 factorial design was used to manipulate concrete imagery (imagery vs. no imagery) and conceptual information (text vs. no text) about heart disease risk in prevention-oriented programmes and assess changes in representations and behavioural motivations from baseline to 2 days, 2 weeks, and 4 weeks post-intervention. Sedentary young adults (N= 80) were randomized to view one of four programmes: imagery plus text, imagery only, text only, or control. Participants completed measures of risk representations, worry, and physical activity and healthy diet intentions and behaviours at baseline, 2 days post-intervention (except behaviours), and 2 weeks (intentions and behaviours only) and 4 weeks later. The imagery contents increased representational beliefs and mental imagery relating to heart disease, worry, and intentions at post-intervention. Increases in sense of coherence (understanding of heart disease) and worry were sustained after 1 month. The imagery contents also increased healthy diet efforts after 2 weeks. The text contents increased beliefs about causal factors, mental images of clogged arteries, and worry at post-intervention, and increased physical activity 2 weeks later and sense of coherence 1 month later. The CSM-based programmes induced short-term changes in risk representations and behaviour motivation. The combination of CSM-based text and imagery appears to be most effective in instilling risk representations that motivate protective behaviour. ©2010 The British Psychological Society.

  13. Breast cancer prevention information seeking behavior and interest on cell phone and text use: a cross-sectional study in Malaysia.

    Science.gov (United States)

    Akhtari-Zavare, Mehrnoosh; Ghanbari-Baghestan, Abbas; Latiff, Latiffah A; Khaniki, Hadi

    2015-01-01

    Breast cancer is the most common cancer and the second principal cause of cancer deaths among women worldwide, including Malaysia. This study focused on media choice and attempted to determine the communication channels mostly used and preferred by women in seeking information and knowledge about breast cancer. A cross-sectional study was carried out to examine the breast cancer prevention information seeking behavior among 450 students at one private university in Malaysia. The mean age of respondents was 25±4.3 years. Common interpersonal information sources were doctors, friends, and nurses and common channel information sources were television, brochure, and internet. Overall, 89.9% used cell phones, 46.1% had an interest in receiving cell phone breast cancer prevention messages, 73.9% used text messaging, and 36.7% had an interest in receiving text breast cancer prevention messages. Bivariate analysis revealed significant differences among age, education, nationality and use of cell phones. Assessment of health information seeking behavior is important for community health educators to target populations for program development.

  14. The relationship of mentoring on middle school girls' science-related attitudes

    Science.gov (United States)

    Clark, Lynette M.

    This quantitative study examined the science-related attitudes of middle school girls who attended a science-focused mentoring program and those of middle school girls who attended a traditional mentoring program. Theories related to this study include social cognitive theory, cognitive development theory, and possible selves' theory. These theories emphasize social and learning experiences that may impact the science-related attitudes of middle school girls. The research questions examined the science-related attitudes of middle school girls who participate in a science-related mentoring program. The hypotheses suggested that there are significant differences that exist between the attitudes of middle school female participants in a science-related mentoring program and female participants in a traditional mentoring program. The quantitative data were collected through a survey entitled the Test of Science-Related Attitudes (TOSRA) which measures science-related attitudes. The population of interest for this study is 11-15 year old middle school girls of various racial and socio-economic backgrounds. The sample groups for the study were middle school girls participating in either a science-focused mentoring program or a traditional mentoring program. Results of the study indicated that no significant difference existed between the science-related attitudes of middle school girls in a science-related mentoring program and the attitudes of those in a traditional mentoring program. The practical implications for examining the concerns of the study would be further investigations to increase middle school girls' science-related attitudes.

  15. What Technology Skills Do Developers Need? A Text Analysis of Job Listings in Library and Information Science (LIS) from Jobs.code4lib.org

    Directory of Open Access Journals (Sweden)

    Monica Maceli

    2015-09-01

    Full Text Available Technology plays an indisputably vital role in library and information science (LIS) work; this rapidly moving landscape can create challenges for practitioners and educators seeking to keep pace with such change. In pursuit of building our understanding of currently sought technology competencies in developer-oriented positions within LIS, this paper reports the results of a text analysis of a large collection of job listings culled from the Code4lib jobs website. Beginning over a decade ago as a popular mailing list covering the intersection of technology and library work, the Code4lib organization's current offerings include a website that collects and organizes LIS-related technology job listings. The results of the text analysis of this dataset suggest the currently vital technology skills and concepts that existing and aspiring practitioners may target in their continuing education as developers.

  16. Advantages of combined touch screen technology and text hyperlink for the pathology grossing manual: a simple approach to access instructive information in biohazardous environments.

    Science.gov (United States)

    Qu, Zhenhong; Ghorbani, Rhonda P; Li, Hongyan; Hunter, Robert L; Hannah, Christina D

    2007-03-01

    Gross examination, encompassing description, dissection, and sampling, is a complex task and an essential component of surgical pathology. Because of the complexity of the task, standardized protocols to guide the gross examination often become a bulky manual that is difficult to use. This problem is further compounded by the high specimen volume and biohazardous nature of the task. As a result, such a manual is often underused, leading to errors that are potentially harmful and time consuming to correct, a common chronic problem affecting many pathology laboratories. To combat this problem, we have developed a simple method that incorporates complex text and graphic information of a typical procedure manual and yet allows easy access to any intended instructive information in the manual. The method uses the Object-Linking-and-Embedding function of Microsoft Word (Microsoft, Redmond, WA) to establish hyperlinks among different contents, and then it uses touch screen technology to facilitate navigation through the manual on a computer screen installed at the cutting bench with no need for a physical keyboard or a mouse. It takes less than 4 seconds to reach any intended information in the manual by 3 to 4 touches on the screen. A 3-year follow-up study shows that this method has increased use of the manual and has improved the quality of gross examination. The method is simple and can be easily tailored to different formats of instructive information, allowing flexible organization, easy access, and quick navigation. Increased compliance with instructive information reduces errors at the grossing bench and improves work efficiency.

  17. The Perfect Text.

    Science.gov (United States)

    Russo, Ruth

    1998-01-01

    A chemistry teacher describes the elements of the ideal chemistry textbook. The perfect text is focused and helps students draw a coherent whole out of the myriad fragments of information and interpretation. The text would show chemistry as the central science necessary for understanding other sciences and would also root chemistry firmly in the…

  18. Secondary school students' perceptions of working life skills in science-related careers

    Science.gov (United States)

    Salonen, Anssi; Hartikainen-Ahia, Anu; Hense, Jonathan; Scheersoi, Annette; Keinonen, Tuula

    2017-07-01

    School students demonstrate a lack of interest in choosing science studies and science-related careers. To better understand the underlying reasons, this study aims to examine secondary school students' perceptions of working life skills and how these perceptions relate to the skills of the twenty-first century. The participants in this study were 144 Finnish 7th graders (aged 13-14 years). Using a questionnaire and qualitative content analysis, we examined their perceptions of working life skills in 'careers in science' and 'careers with science'. Results reveal that although students have a great deal of knowledge about working life skills, it is often just stereotyped. Sector-specific knowledge and skills were highlighted in particular but skills related to society, organisation, time and higher order thinking, were often omitted. Results also indicate that students do not associate 'careers in science' with creativity, innovation, collaboration or technology and ICT skills. Conversely, according to the students, these careers demand more sector-specific knowledge and responsibility than 'careers with science'. We conclude that students need more wide-ranging information about scientific careers and the competencies demanded; such information can be acquired by e.g. interacting with professionals and their real working life problems.

  19. Female adolescents' perceptions, beliefs, motivations, and attitudes in the negotiation of science texts

    Science.gov (United States)

    Bennett, Camille

    This study was an investigation of female adolescents' perceptions, attitudes, and beliefs towards science and reading science-related texts. Three surveys were used to collect data from 253 middle school students in Grade 7 and Grade 8 and six interviews were conducted with students. The interviews allowed a deeper analysis of the value students placed on science and on reading science-related texts. The quantitative data were collected through the following surveys: Test of Science Related Attitudes, Motivation for Reading Informational Books in School adapted, and Metacognitive Awareness Reading Strategies Inventory adapted. The purpose of the surveys was to provide a comprehensive picture of students' self-reported perceptions, attitudes, and beliefs towards science and the motivation to engage. Literacy processes and practices make engagement and learning in science possible; however, intrinsic motivation and cognitive strategies are critical influential components that educators cannot overlook. The female adolescents in this study expressed greater competence when involved in learning science through inquiry experimentation integrated with literacy presented in different formats.

  20. Dictionaries for text production

    DEFF Research Database (Denmark)

    Fuertes-Olivera, Pedro; Bergenholtz, Henning

    2018-01-01

    Dictionaries for Text Production are information tools that are designed and constructed for helping users to produce (i.e. encode) texts, both oral and written texts. These can be broadly divided into two groups: (a) specialized text production dictionaries, i.e., dictionaries that only offer a small amount of lexicographic data, most or all of which are typically used in a production situation, e.g. synonym dictionaries, grammar and spelling dictionaries, collocation dictionaries, concept dictionaries such as the Longman Language Activator, which is advertised as the World’s First Production Dictionary; (b) general text production dictionaries, i.e., dictionaries that offer all or most of the lexicographic data that are typically used in a production situation. A review of existing production dictionaries reveals that there are many specialized text production dictionaries but only a few general...

  1. Using Gloss to Help Fifth and Sixth Graders Comprehend Social Studies Text: An Informal Study of a Learning Aid. Working Paper No. 295.

    Science.gov (United States)

    Witte, Pauline

    A two-part study examined the effectiveness of glossing (writing comments or questions in text to improve comprehension) when students use it in social studies texts in combination with discussions and other activities. Students were divided into two groups, one of which learned glossing while the other engaged in assigned workbook activities.…

  2. Text analysis methods, text analysis apparatuses, and articles of manufacture

    Science.gov (United States)

    Whitney, Paul D; Willse, Alan R; Lopresti, Charles A; White, Amanda M

    2014-10-28

    Text analysis methods, text analysis apparatuses, and articles of manufacture are described according to some aspects. In one aspect, a text analysis method includes accessing information indicative of data content of a collection of text comprising a plurality of different topics, using a computing device, analyzing the information indicative of the data content, and using results of the analysis, identifying a presence of a new topic in the collection of text.

  3. Directed Activities Related to Text: Text Analysis and Text Reconstruction.

    Science.gov (United States)

    Davies, Florence; Greene, Terry

    This paper describes Directed Activities Related to Text (DART), procedures that were developed and are used in the Reading for Learning Project at the University of Nottingham (England) to enhance learning from texts and that fall into two broad categories: (1) text analysis procedures, which require students to engage in some form of analysis of…

  4. A Customizable Text Classifier for Text Mining

    Directory of Open Access Journals (Sweden)

    Yun-liang Zhang

    2007-12-01

    Full Text Available Text mining deals with complex and unstructured texts. Usually a particular collection of texts that is specific to one or more domains is necessary. We have developed a customizable text classifier for users to mine the collection automatically. It derives from the sentence category of the HNC theory and corresponding techniques. It can start with a few texts, and it can adjust automatically or be adjusted by the user. The user can also control the number of domains chosen and decide the standard with which to choose the texts based on demand and abundance of materials. The performance of the classifier varies with the user's choice.

  5. Secondary School Students' Perceptions of Working Life Skills in Science-Related Careers

    Science.gov (United States)

    Salonen, Anssi; Hartikainen-Ahia, Anu; Hense, Jonathan; Scheersoi, Annette; Keinonen, Tuula

    2017-01-01

    School students demonstrate a lack of interest in choosing science studies and science-related careers. To better understand the underlying reasons, this study aims to examine secondary school students' perceptions of working life skills and how these perceptions relate to the skills of the twenty-first century. The participants in this study were…

  6. Using Picture and Text Schedules to Inform Children: Effects on Distress and Pain during Needle-Related Procedures in Nitrous Oxide Sedation

    Directory of Open Access Journals (Sweden)

    Merja Vantaa Benjaminsson

    2015-01-01

    Full Text Available During hospital visits, children often undergo examinations and treatments that may involve an experience of pain and distress that is also connected to the staff’s treatment. The United Nations Convention on the Rights of Persons with Disabilities advocates the use of Universal Design. One way of implementing this idea within paediatric nursing is to increase the use of pictorial supports, and the few studies that have been published show promising results. The aim of this study was to compare two groups of children before and after implementing an intervention that included staff instruction and the use of pictorial support. The support consisted of a visual schedule with pictures and text, used both in preparation for and during the hospital visit. One hundred children aged 5–15 (50 children during the preinterventional data collection and 50 children postinterventionally) reported pain intensity and distress during needle-related procedures in nitrous oxide sedation. The results showed that the intervention had a positive effect in significantly lowering the level of preprocedural distress. Pain intensity was also lowered, although the reduction did not reach statistical significance. This confirms other positive research results on the use of visual supports within paediatric care, a topic that has to be further studied.

  7. Informe

    Directory of Open Access Journals (Sweden)

    Egon Lichetenberger

    1950-10-01

    Full Text Available Report by Dr. Egon Lichetenberger to the Governing Council of the Faculty on the specialization course in Pathological Anatomy sponsored by the Kellogg Foundation (Department of Pathology).

  8. How far can a translator go from the original text? Comparison and analysis of two translations of "The Happy Prince," regarding lexical choice, information flow and gender

    Directory of Open Access Journals (Sweden)

    Ana Cristina Ostermann

    2012-02-01

    Full Text Available In this article I compare two translations of Oscar Wilde's short story "The Happy Prince," considering three specific aspects: lexical choice, information flow, and gender. The aim of the study is to demonstrate how, in this particular case, the strategies a translator uses can bring the translation closer to, and/or move it further from, not only the original text but also the target audience. As can be seen throughout the article, many of these choices, especially those related to gender, cause what could be considered "damage to the translation" and generate further modifications that the translator is then compelled to make. Equally significant are the lexical choices that restrict a text (originally intended for both adult and child readers) to the mature reader with a more refined vocabulary.

  9. Utah Text Retrieval Project

    Energy Technology Data Exchange (ETDEWEB)

    Hollaar, L A

    1983-10-01

    The Utah Text Retrieval project seeks well-engineered solutions to the implementation of large, inexpensive, rapid text information retrieval systems. The project has three major components. Perhaps the best known is the work on the specialized processors, particularly search engines, necessary to achieve the desired performance and cost. The other two concern the user interface to the system and the system's internal structure. The work on user interface development is not only concentrating on the syntax and semantics of the query language, but also on the overall environment the system presents to the user. Environmental enhancements include convenient ways to browse through retrieved documents, access to other information retrieval systems through gateways supporting a common command interface, and interfaces to word processing systems. The system's internal structure is based on a high-level data communications protocol linking the user interface, index processor, search processor, and other system modules. This allows them to be easily distributed in a multi- or specialized-processor configuration. It also allows new modules, such as a knowledge-based query reformulator, to be added. 15 references.

  10. Everyday science & science every day: Science-related talk & activities across settings

    Science.gov (United States)

    Zimmerman, Heather

    To understand the development of science-related thinking, acting, and learning in middle childhood, I studied youth in schools, homes, and other neighborhood settings over a three-year period. The research goal was to analyze how multiple everyday experiences influence children's participation in science-related practices and their thinking about science and scientists. Ethnographic and interaction analysis methodologies were used to study the cognition and social interactions of the children as they participated in activities with peers, family, and teachers (n=128). Interviews and participant self-documentation protocols elucidated the participants' understandings of science. An Everyday Expertise (Bell et al., 2006) theoretical framework was employed to study the development of science understandings on three analytical planes: individual learner, social groups, and societal/community resources. Findings came from a cross-case analysis of urban science learners and from two within-case analyses of girls' science-related practices as they transitioned from elementary to middle school. Results included: (1) children participated actively in science across settings, including in their homes as well as in schools, (2) children's interests in science were not always aligned to the school science content, pedagogy, or school structures for participation, yet children found ways to engage with science despite these differences through crafting multiple pathways into science, (3) urban parents were active supporters of STEM-related learning environments through brokering access to social and material resources, (4) the youth often found science in their daily activities that formal education did not make use of, and (5) children's involvement with science-related practices can be developed into design principles to reach youth in culturally relevant ways.

  11. Interconnectedness und digitale Texte

    Directory of Open Access Journals (Sweden)

    Detlev Doherr

    2013-04-01

    Full Text Available Abstract: Multimedia information services on the Internet are becoming ever larger and more comprehensive, and libraries are also digitizing documents that exist only in print and putting them online. These documents can be found through online document management systems or search engines and then provided in common formats such as PDF. This article examines how the Humboldt Digital Library (HDL) works; for more than ten years it has made documents by Alexander von Humboldt available on the Web in English translation, free of charge. Unlike a digital library, however, it does not merely provide digitized documents as scans or PDFs; it makes the text itself available in interlinked form. The system therefore resembles an information system more than a digital library, which is also reflected in the available functions for finding texts in different versions and translations, comparing paragraphs of different documents, and displaying images in their context. The development of dynamic hyperlinks based on the individual text paragraphs of Humboldt's works, in the form of media assets, makes it possible to use the Google Maps programming interface for navigation by geography as well as by text content. Going beyond the service of a digital library, the HDL offers the prototype of a multidimensional information system that works with dynamic structures and enables extensive thematic analyses and comparisons. Summary: The multimedia information services on the Internet are becoming more and more comprehensive; even printed documents are digitized and republished as digital Web documents by the libraries. Those digital files can be found by search engines or management tools and provided as files in usual formats as

  12. Text-Fabric

    NARCIS (Netherlands)

    Roorda, Dirk

    2016-01-01

    Text-Fabric is a Python3 package for Text plus Annotations. It provides a data model, a text file format, and a binary format for (ancient) text plus (linguistic) annotations. The emphasis of this all is on: data processing; sharing data; and contributing modules. A defining characteristic is that

  13. XML and Free Text.

    Science.gov (United States)

    Riggs, Ken Roger

    2002-01-01

    Discusses problems with marking free text, text that is either natural language or semigrammatical but unstructured, that prevent well-formed XML from marking text for readily available meaning. Proposes a solution to mark meaning in free text that is consistent with the intended simplicity of XML versus SGML. (Author/LRW)

  14. Text Mining Applications and Theory

    CERN Document Server

    Berry, Michael W

    2010-01-01

    Text Mining: Applications and Theory presents the state-of-the-art algorithms for text mining from both the academic and industrial perspectives.  The contributors span several countries and scientific domains: universities, industrial corporations, and government laboratories, and demonstrate the use of techniques from machine learning, knowledge discovery, natural language processing and information retrieval to design computational models for automated text analysis and mining. This volume demonstrates how advancements in the fields of applied mathematics, computer science, machine learning

  15. Predicting Prosody from Text for Text-to-Speech Synthesis

    CERN Document Server

    Rao, K Sreenivasa

    2012-01-01

    Predicting Prosody from Text for Text-to-Speech Synthesis covers the specific aspects of prosody, mainly focusing on how to predict the prosodic information from linguistic text, and then how to exploit the predicted prosodic knowledge for various speech applications. Author K. Sreenivasa Rao discusses proposed methods along with state-of-the-art techniques for the acquisition and incorporation of prosodic knowledge for developing speech systems. Positional, contextual and phonological features are proposed for representing the linguistic and production constraints of the sound units present in the text. This book is intended for graduate students and researchers working in the area of speech processing.

  16. E-text

    DEFF Research Database (Denmark)

    Finnemann, Niels Ole

    2018-01-01

    text can be defined by taking as point of departure the digital format in which everything is represented in the binary alphabet. While the notion of text, in most cases, lends itself to be independent of medium and embodiment, it is also often tacitly assumed that it is, in fact, modeled around...... the print medium, rather than written text or speech. In late 20th century, the notion of text was subject to increasing criticism as in the question raised within literary text theory: is there a text in this class? At the same time, the notion was expanded by including extra linguistic sign modalities...

  17. Texting on the Move

    Science.gov (United States)

    ... text. What's the Big Deal? The problem is multitasking. No matter how young and agile we are, ... on something other than the road. In fact, driving while texting (DWT) can be more dangerous than ...

  18. Text Coherence in Translation

    Science.gov (United States)

    Zheng, Yanping

    2009-01-01

    In the thesis a coherent text is defined as a continuity of senses of the outcome of combining concepts and relations into a network composed of knowledge space centered around main topics. And the author maintains that in order to obtain the coherence of a target language text from a source text during the process of translation, a translator can…

  19. Effective and responsible teaching of climate change in Earth Science-related disciplines

    Science.gov (United States)

    Robinson, Z. P.; Greenhough, B. J.

    2009-04-01

    Climate change is a core topic within Earth Science-related courses. This vast topic covers a wide array of different aspects that could be covered, from past climatic change across a vast range of scales to environmental (and social and economic) impacts of future climatic change and strategies for reducing anthropogenic climate change. The Earth Science disciplines play a crucial role in our understanding of past, present and future climate change and the Earth system, as well as in developing strategies and technological solutions to achieve sustainability. However, an increased knowledge of the occurrence and causes of past (natural) climate changes can lead to a lessened concern and sense of urgency and responsibility amongst students in relation to anthropogenic causes of climatic change. Two concepts integral to the teaching of climate change are those of scientific uncertainty and complexity, yet an emphasis on these concepts can lead to scepticism about future predictions and a further loss of sense of urgency. The requirement to understand the nature of scientific uncertainty and to think and move between different scales, in particular relating an increased knowledge of longer timescale climatic change to recent (industrialised) climate change, is clearly an area of troublesome knowledge that affects students' sense of responsibility towards their role in achieving a sustainable society. Study of the attitudes of university students in a UK HE institution on a range of Earth Science-related programmes highlights a range of different attitudes in the student body towards the subject of climate change. Students express varied amounts of 'climate change saturation' resulting from both media and curriculum coverage, a range of views relating to the significance of humans to the global climate and a range of opinions about the relevance of environmental citizenship to their degree programme. Climate change is therefore a challenging

  20. SparkText: Biomedical Text Mining on Big Data Framework.

    Directory of Open Access Journals (Sweden)

    Zhan Ye

    Full Text Available Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.

  1. Vocabulary Constraint on Texts

    Directory of Open Access Journals (Sweden)

    C. Sutarsyah

    2008-01-01

    Full Text Available This case study was carried out in the English Education Department of State University of Malang. The aim of the study was to identify and describe the vocabulary in the reading texts and to determine whether the texts are useful for reading skill development. A descriptive qualitative design was applied to obtain the data. For this purpose, some available computer programs were used to describe the vocabulary in the texts. It was found that the 20 texts, containing 7,945 words, are dominated by low frequency words, which account for 16.97% of the words in the texts. The high frequency words occurring in the texts were dominated by function words. In the case of word levels, it was found that the texts have a very limited number of words from the GSL (General Service List of English Words) (West, 1953). The proportion of the first 1,000 words of the GSL only accounts for 44.6%. The data also show that the texts contain too large a proportion of words which are not in the three levels (the first 2,000 and the UWL). These words account for 26.44% of the running words in the texts. It is believed that the constraints are due to the selection of the texts, which are made up of a series of short, unrelated texts. This kind of text is subject to the accumulation of low frequency words, especially content words, and contains a limited number of words from the GSL. It could also hinder the development of students' reading skills and vocabulary enrichment.
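
    The percentages reported above are shares of running words (tokens) that fall into each word list. A minimal sketch of that kind of coverage profile follows, assuming a simple regex tokenizer and tiny toy word lists; the function name `coverage_profile` and the lists are hypothetical stand-ins, not the GSL, the UWL, or the programs used in the study.

```python
import re
from collections import Counter

def coverage_profile(text, word_lists):
    """Return the percentage of running words covered by each word list.

    word_lists maps a label to a set of lowercase words. Each token is
    counted under the first list that contains it, or under "off-list".
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        label = next(
            (name for name, words in word_lists.items() if token in words),
            "off-list",
        )
        counts[label] += 1
    total = len(tokens)
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}

# Toy word lists standing in for real frequency bands (hypothetical data).
toy_lists = {
    "first-1000": {"the", "cat", "sat", "on", "a"},
    "second-1000": {"mat"},
}
profile = coverage_profile("The cat sat on the mat near a ziggurat", toy_lists)
for label, share in sorted(profile.items()):
    print(f"{label}: {share:.2f}%")
```

    With real word lists, a high "off-list" share would correspond to the kind of vocabulary constraint the abstract describes.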

  2. Knowledge Representation in Travelling Texts

    DEFF Research Database (Denmark)

    Mousten, Birthe; Locmele, Gunta

    2014-01-01

    Today, information travels fast. Texts travel, too. In a corporate context, the question is how to manage which knowledge elements should travel to a new language area or market and in which form? The decision to let knowledge elements travel or not travel highly depends on the limitation...... and the purpose of the text in a new context as well as on predefined parameters for text travel. For texts used in marketing and in technology, the question is whether culture-bound knowledge representation should be domesticated or kept as foreign elements, or should be mirrored or moulded—or should not travel...... at all! When should semantic and pragmatic elements in a text be replaced and by which other elements? The empirical basis of our work is marketing and technical texts in English, which travel into the Latvian and Danish markets, respectively....

  3. Instant Sublime Text starter

    CERN Document Server

    Haughee, Eric

    2013-01-01

    A starter which teaches the basic tasks to be performed with Sublime Text with the necessary practical examples and screenshots. This book requires only basic knowledge of the Internet and basic familiarity with any one of the three major operating systems, Windows, Linux, or Mac OS X. However, as Sublime Text 2 is primarily a text editor for writing software, many of the topics discussed will be specifically relevant to software development. That being said, the Sublime Text 2 Starter is also suitable for someone without a programming background who may be looking to learn one of the tools of

  4. SparkText: Biomedical Text Mining on Big Data Framework

    Science.gov (United States)

    He, Karen Y.; Wang, Kai

    2016-01-01

    Background: Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. Results: In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. Conclusions: This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research. PMID:27685652

  5. SparkText: Biomedical Text Mining on Big Data Framework.

    Science.gov (United States)

    Ye, Zhan; Tafti, Ahmad P; He, Karen Y; Wang, Kai; He, Max M

    Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.

  6. Linguistics in Text Interpretation

    DEFF Research Database (Denmark)

    Togeby, Ole

    2011-01-01

    A model for how text interpretation proceeds from what is pronounced, through what is said to what is communicated, and definition of the concepts 'presupposition' and 'implicature'.

  7. LocText

    DEFF Research Database (Denmark)

    Cejuela, Juan Miguel; Vinchurkar, Shrikant; Goldberg, Tatyana

    2018-01-01

    trees and was trained and evaluated on a newly improved LocTextCorpus. Combined with an automatic named-entity recognizer, LocText achieved high precision (P = 86%±4). After completing development, we mined the latest research publications for three organisms: human (Homo sapiens), budding yeast...

  8. Systematic text condensation

    DEFF Research Database (Denmark)

    Malterud, Kirsti

    2012-01-01

    To present background, principles, and procedures for a strategy for qualitative analysis called systematic text condensation and discuss this approach compared with related strategies.

  9. Text 2 Mind Map

    OpenAIRE

    Iona, John

    2017-01-01

    This is a review of the web resource 'Text 2 Mind Map' www.Text2MindMap.com. It covers what the resource is and how it might be used in library and education contexts, in particular by school librarians.

  10. Text File Comparator

    Science.gov (United States)

    Kotler, R. S.

    1983-01-01

    The File Comparator program, IFCOMP, is a text file comparator for IBM OS/VS-compatible systems. IFCOMP accepts two text files as input and produces a listing of differences in pseudo-update form. IFCOMP is very useful in monitoring changes made to software at the source code level.
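
    The general idea of comparing two text files line by line and listing the differences can be sketched with Python's standard `difflib` module. This is only an illustration under stated assumptions: the output is a unified diff, not IFCOMP's pseudo-update form, and the helper name `compare_texts` is hypothetical.

```python
import difflib

def compare_texts(old_text, new_text, old_name="old", new_name="new"):
    """Return the unified diff between two texts as a list of lines."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile=old_name,
            tofile=new_name,
            lineterm="",  # input lines carry no newlines, so emit none
        )
    )

old = "line one\nline two\nline three"
new = "line one\nline 2\nline three"
diff = compare_texts(old, new)
print("\n".join(diff))
```

    Removed lines are prefixed with `-` and added lines with `+`; identical inputs yield an empty list, which makes the helper easy to use as a change monitor for source files.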

  11. Salton and Buckley’s Landmark Research in Experimental Text Information Retrieval. A Review of: Salton, G., & Buckley, C. (1990). Improving retrieval performance by relevance feedback. Journal of the American Society for Information Science, 41(4), 288–297.

    Directory of Open Access Journals (Sweden)

    Christine F. Marton

    2011-01-01

    Full Text Available Objectives – To compare the performance of the vector space model and the probabilistic weighting model of relevance feedback for the overall purpose of determining the most useful relevance feedback procedures. The amount of improvement that can be obtained from searching several test document collections with only one feedback iteration of each relevance feedback model was measured. Design – The experimental design consisted of 72 different tests: 2 different relevance feedback methods, each with 6 permutations, on 6 test document collections of various sizes. A residual collection method was utilized to ascertain the “true advantage provided by the relevance feedback process” (Salton & Buckley, 1990, p. 293). Setting – Department of Computer Science at Cornell University. Subjects – Six test document collections. Methods – Relevance feedback is an effective technique for query modification that provides significant improvement in search performance. Relevance feedback entails both “term reweighting,” the modification of term weights based on term use in retrieved relevant and non-relevant documents, and “query expansion,” which is the addition of new terms from relevant documents retrieved (Harman, 1992). Salton and Buckley (1990) evaluated two established relevance feedback models based on the vector space model (a spatial model) and the probabilistic model, respectively. Harman (1992) describes the two key differences between these competing models of relevance feedback. [The vector space model merges] document vectors and original query vectors. This automatically reweights query terms by adding the weights from the actual occurrence of those query terms in the relevant documents, and subtracting the weights of those terms occurring in the non-relevant documents. Queries are automatically expanded by adding all the terms not in the original query that are in the relevant documents and non-relevant documents. They are expanded
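
    The vector space feedback step described above (add weights from relevant documents, subtract weights from non-relevant ones) is commonly written in the Rocchio form. The sketch below is a minimal illustration under assumptions: the function name `rocchio`, the coefficients `alpha`, `beta`, and `gamma`, and the choice to drop terms whose weight is not positive are illustrative conventions, not the settings tested by Salton and Buckley.

```python
from collections import defaultdict

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    """One iteration of vector space relevance feedback.

    query and each document are dicts mapping a term to its weight.
    Terms from relevant documents are added (query expansion and
    reweighting); terms from non-relevant documents are subtracted.
    """
    new_query = defaultdict(float)
    for term, weight in query.items():
        new_query[term] += alpha * weight
    for doc in relevant:
        for term, weight in doc.items():
            new_query[term] += beta * weight / len(relevant)
    for doc in nonrelevant:
        for term, weight in doc.items():
            new_query[term] -= gamma * weight / len(nonrelevant)
    # Assumed convention: keep only terms with a positive weight.
    return {term: w for term, w in new_query.items() if w > 0}

query = {"retrieval": 1.0}
relevant_docs = [{"retrieval": 1.0, "feedback": 1.0}]
nonrelevant_docs = [{"retrieval": 1.0, "database": 1.0}]
updated = rocchio(query, relevant_docs, nonrelevant_docs)
print(updated)
```

    In this toy run the query gains the new term "feedback" from the relevant document, while "database", seen only in the non-relevant document, ends with a negative weight and is dropped.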

  12. Zum Bildungspotenzial biblischer Texte

    Directory of Open Access Journals (Sweden)

    Theis, Joachim

    2017-11-01

    Full Text Available Biblical education as a holistic process goes far beyond biblical learning. It must be understood as a lifelong process in which both biblical texts and those who seek to understand them act, each appropriating its counterpart in a dialogical way. The recipient’s horizon of understanding is not an empty room to be filled by the text alone, nor is the text dead material that can only be examined cognitively. The recipient discovers the meaning of the biblical text by recomposing it through existential appropriation, so the text is brought to life in each individual reality. Scientific insights, subjective structures, and the community of those seeking understanding must all be included to avoid potential one-sidedness. Unfortunately, a particular negative association very often obscures the approach to the Bible: biblical work as part of religious education still appears in a cognitively oriented form that respects neither the vitality and sovereignty of the biblical texts nor the students’ desire for meaning. Moreover, the Bible is misused for teaching moral terms or for pontificating. Such pitfalls can be avoided by biblical didactics conceived as empowerment didactics. Respecting the sovereignty of biblical texts, these didactics assist those seeking understanding with their individuation by opening up the texts with a focus on the reader’s otherness. Thus both the text and the recipient become subjects in a dialogue. The approach of Biblical-Enabling-Didactics lets the Bible become, ever anew, a book of life. Understood from within their hermeneutics, empowerment didactics could be raised to the principle of biblical didactics in general and grow into an essential element of holistic education.

  13. download full text

    African Journals Online (AJOL)

    Dale E. Zand (1997) argues that People once stood in awe of electricity, until ... in today's information-driven organizations: knowledge, trust, and power. ..... people's culture and resistance to anti-corruption efforts constitute the firmly fixed load.

  14. EST: Evading Scientific Text.

    Science.gov (United States)

    Ward, Jeremy

    2001-01-01

    Examines chemical engineering students' attitudes to text and other parts of English language textbooks. A questionnaire was administered to a group of undergraduates. Results reveal one way students get around the problem of textbook reading. (Author/VWL)

  15. nal Sesotho texts

    African Journals Online (AJOL)

    with literary texts written in indigenous South African languages. The project ... Homi Bhabha uses the words of Salman Rushdie to underline the fact that new .... I could not conceptualise an African-language-to-African-language dictionary. An.

  16. Plagiarism in Academic Texts

    Directory of Open Access Journals (Sweden)

    Marta Eugenia Rojas-Porras

    2012-08-01

    Full Text Available The ethical and social responsibility of citing the sources in a scientific or artistic work is undeniable. This paper explores, in a preliminary way, academic plagiarism in its various forms. It includes findings based on a forensic analysis. The purpose of this paper is to raise awareness on the importance of considering these details when writing and publishing a text. Hopefully, this analysis may put the issue under discussion.

  17. Machine Translation from Text

    Science.gov (United States)

    Habash, Nizar; Olive, Joseph; Christianson, Caitlin; McCary, John

    Machine translation (MT) from text, the topic of this chapter, is perhaps the heart of the GALE project. Beyond being a well defined application that stands on its own, MT from text is the link between the automatic speech recognition component and the distillation component. The focus of MT in GALE is on translating from Arabic or Chinese to English. The three languages represent a wide range of linguistic diversity and make the GALE MT task rather challenging and exciting.

  18. TEXT Energy Storage System

    International Nuclear Information System (INIS)

    Weldon, W.F.; Rylander, H.G.; Woodson, H.H.

    1977-01-01

    The Texas Experimental Tokamak (TEXT) Energy Storage System, designed by the Center for Electromechanics (CEM), consists of four 50 MJ, 125 V homopolar generators and their auxiliaries and is designed to power the toroidal and poloidal field coils of TEXT on a two-minute duty cycle. The four 50 MJ generators connected in series were chosen because they represent the minimum cost configuration and also represent a minimal scale up from the successful 5.0 MJ homopolar generator designed, built, and operated by the CEM.

  19. Text document classification

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana

    No. 62 (2005), pp. 53-54. ISSN 0926-4981. R&D Projects: GA AV ČR IAA2075302; GA AV ČR KSK1019101; GA MŠk 1M0572. Institutional research plan: CEZ:AV0Z10750506. Keywords: document representation; categorization; classification. Subject RIV: BD - Theory of Information

  20. New mathematical cuneiform texts

    CERN Document Server

    Friberg, Jöran

    2016-01-01

    This monograph presents in great detail a large number of both unpublished and previously published Babylonian mathematical texts in the cuneiform script. It is a continuation of the work A Remarkable Collection of Babylonian Mathematical Texts (Springer 2007) written by Jöran Friberg, the leading expert on Babylonian mathematics. Focussing on the big picture, Friberg explores in this book several Late Babylonian arithmetical and metro-mathematical table texts from the sites of Babylon, Uruk and Sippar, collections of mathematical exercises from four Old Babylonian sites, as well as a new text from Early Dynastic/Early Sargonic Umma, which is the oldest known collection of mathematical exercises. A table of reciprocals from the end of the third millennium BC, differing radically from well-documented but younger tables of reciprocals from the Neo-Sumerian and Old-Babylonian periods, as well as a fragment of a Neo-Sumerian clay tablet showing a new type of a labyrinth are also discussed. The material is presen...

  1. The Emar Lexical Texts

    NARCIS (Netherlands)

    Gantzert, Merijn

    2011-01-01

    This four-part work provides a philological analysis and a theoretical interpretation of the cuneiform lexical texts found in the Late Bronze Age city of Emar, in present-day Syria. These word and sign lists, commonly dated to around 1100 BC, were almost all found in the archive of a single school.

  2. Text Induced Spelling Correction

    NARCIS (Netherlands)

    Reynaert, M.W.C.

    2004-01-01

    We present TISC, a language-independent and context-sensitive spelling checking and correction system designed to facilitate the automatic removal of non-word spelling errors in large corpora. Its lexicon is derived from a very large corpus of raw text, without supervision, and contains word

  3. Texts and Readers.

    Science.gov (United States)

    Iser, Wolfgang

    1980-01-01

    Notes that, since fictional discourse need not reflect prevailing systems of meaning and norms or values, readers gain detachment from their own presuppositions; by constituting and formulating text-sense, readers are constituting and formulating their own cognition and becoming aware of the operations for doing so. (FL)

  4. Documents and legal texts

    International Nuclear Information System (INIS)

    2017-01-01

    This section covers the following documents and legal texts: 1 - Belgium, 29 June 2014: Act amending the Act of 22 July 1985 on Third-Party Liability in the Field of Nuclear Energy; 2 - Belgium, 7 December 2016: Act amending the Act of 22 July 1985 on Third-Party Liability in the Field of Nuclear Energy

  5. Strategy as Texts

    DEFF Research Database (Denmark)

    Obed Madsen, Søren

    This article shows empirically how managers translate a strategy plan at an individual level. By analysing how managers in three organizations translate strategies, it identifies that the translation happens in two steps: First, the managers decipher the strategy by coding the different parts of the strategy into four categories. Second, the managers produce new texts based on the original strategy document by using four different ways of translation models. The study’s findings contribute to three areas. Firstly, it shows that translation is more than a sociological process. It is also a craftsmanship that requires knowledge and skills, which unfortunately seems to be overlooked in both the literature and in practice. Secondly, it shows that even though a strategy text is in singular, the translation makes strategy plural. Thirdly, the article proposes a way to open up the black box of what...

  6. Text Mining for Protein Docking.

    Directory of Open Access Journals (Sweden)

    Varsha D Badal

    2015-12-01

    The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound

  7. Stemming Malay Text and Its Application in Automatic Text Categorization

    Science.gov (United States)

    Yasukawa, Michiko; Lim, Hui Tian; Yokoo, Hidetoshi

    In the Malay language, there are no conjugations or declensions, and affixes have important grammatical functions. The same word may function as a noun, an adjective, an adverb, or a verb, depending on its position in the sentence. Although simple root words are used extensively in informal conversation, precise words are essential in formal speech and written texts. To make sentences clear, Malay uses derivative words, and derivation is achieved mainly through affixes. A root word has approximately a hundred possible derivative forms in the written language of the educated Malay, so the composition of Malay words can be complicated. Although several types of stemming algorithms are available for text processing in English and some other languages, they cannot overcome the difficulties of Malay word stemming. Stemming is the process of reducing various words to their root forms in order to improve the effectiveness of text processing in information systems, and it is essential to avoid both over-stemming and under-stemming errors. We have developed a new Malay stemmer (stemming algorithm) for removing inflectional and derivational affixes. Our stemmer uses a set of affix rules and two types of dictionaries: a root-word dictionary and a derivative-word dictionary. The set of rules is aimed at reducing the occurrence of under-stemming errors, while the dictionaries are believed to reduce the occurrence of over-stemming errors. We performed an experiment to evaluate the application of our stemmer in text mining software. The text data used in the experiment were actual web pages collected from the World Wide Web. The experimental results showed that our stemmer can effectively increase the precision of the extracted Boolean expressions for text categorization.
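    The combination of affix rules with a root-word dictionary and a derivative-word dictionary can be illustrated with a minimal sketch. It is not the authors' algorithm: the word lists, the affix lists, and the function name `stem` are hypothetical stand-ins chosen only to show how a dictionary check can guard against over-stemming while the rules handle affixed forms.

```python
# Illustrative sketch only. The dictionaries and affix rules below are tiny,
# hypothetical stand-ins, not the resources used in the paper.
ROOT_WORDS = {"ajar", "makan", "baca", "main"}
DERIVATIVE_WORDS = {"pelajaran": "ajar"}  # derivative word -> root word
PREFIXES = ["mem", "men", "ber", "pe", "di"]
SUFFIXES = ["kan", "an", "i"]


def stem(word: str) -> str:
    """Return the root of `word`, or the word unchanged if no root is found."""
    word = word.lower()
    # A known root is never stripped further (guards against over-stemming,
    # e.g. "makan" must not lose its final "kan").
    if word in ROOT_WORDS:
        return word
    # Irregular derivatives are resolved by direct lookup.
    if word in DERIVATIVE_WORDS:
        return DERIVATIVE_WORDS[word]
    # Try every prefix/suffix combination and keep candidates that are roots.
    candidates = []
    for prefix in [""] + PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in [""] + SUFFIXES:
            if suffix and not rest.endswith(suffix):
                continue
            candidate = rest[:len(rest) - len(suffix)] if suffix else rest
            if candidate in ROOT_WORDS:
                candidates.append(candidate)
    # Prefer the longest dictionary-confirmed root; otherwise leave the word as-is.
    return max(candidates, key=len) if candidates else word


if __name__ == "__main__":
    for w in ["membaca", "makanan", "bermain", "dimakan", "pelajaran", "buku"]:
        print(w, "->", stem(w))
```

    In this sketch a word is only reduced when the result is confirmed by the root-word dictionary, so an unknown word such as "buku" is returned unchanged instead of being stripped by a rule that happens to match.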

  8. Reading Authentic Texts

    DEFF Research Database (Denmark)

    Balling, Laura Winther

    2013-01-01

    Most research on cognates has focused on words presented in isolation that are easily defined as cognate between L1 and L2. In contrast, this study investigates what counts as cognate in authentic texts and how such cognates are read. Participants with L1 Danish read news articles in their highly proficient L2, English, while their eye-movements were monitored. The experiment shows a cognate advantage for morphologically simple words, but only when cognateness is defined relative to translation equivalents that are appropriate in the context. For morphologically complex words, a cognate disadvantage ... word predictability indexed by the conditional probability of each word.

  9. Documents and legal texts

    International Nuclear Information System (INIS)

    2016-01-01

    This section treats of the following documents and legal texts: 1 - Brazil: Law No. 13,260 of 16 March 2016 (To regulate the provisions of item XLIII of Article 5 of the Federal Constitution on terrorism, dealing with investigative and procedural provisions and redefining the concept of a terrorist organisation; and amends Laws No. 7,960 of 21 December 1989 and No. 12,850 of 2 August 2013); 2 - India: The Atomic Energy (Amendment) Act, 2015; Department Of Atomic Energy Notification (Civil Liability for Nuclear Damage); 3 - Japan: Act on Subsidisation, etc. for Nuclear Damage Compensation Funds following the implementation of the Convention on Supplementary Compensation for Nuclear Damage

  10. Journalistic Text Production

    DEFF Research Database (Denmark)

    Haugaard, Rikke Hartmann

    ... a multiple case study investigated three professional text producers’ practices as they unfolded in their natural setting at the Spanish newspaper El Mundo. Results indicate that journalists’ revisions are related to form markedly more often than to content. Results suggest two writing phases serving ... at the Spanish newspaper El Mundo in Madrid. The study applied a combination of quantitative and qualitative methods, i.e. keystroke logging, participant observation and retrospective interview. Results indicate that journalists’ revisions are related to form markedly more often than to content (approx. three...

  11. Weitere Texte physiognomischen Inhalts

    Directory of Open Access Journals (Sweden)

    Böck, Barbara

    2004-12-01

    The present article offers the edition of three cuneiform texts belonging to the Akkadian handbook of omens drawn from the physical appearance as well as the morals and behaviour of man. The book, comprising up to 27 chapters with more than 100 omens each, was entitled Alamdimmû in antiquity. The edition of the three cuneiform tablets thus completes the author's monographic study on the ancient Mesopotamian divinatory discipline of physiognomy (Die babylonisch-assyrische Morphoskopie, Wien 2000 [= AfO Beih. 27]).

    This article presents the editio princeps of three cuneiform texts held in the British Museum (London) and the Vorderasiatisches Museum (Berlin), which belong to the Assyro-Babylonian book of physiognomic omens. This book, originally entitled Alamdimmû ('form, figure'), consists of 27 chapters, each with more than a hundred omens written in Akkadian. The three texts thus complete the author's monographic study on the divinatory discipline of physiognomy in the ancient Near East (Die babylonisch-assyrische Morphoskopie, Wien 2000 [= AfO Beih. 27]).

  12. Reasoning with Annotations of Texts

    OpenAIRE

    Ma, Yue; Lévy, François; Ghimire, Sudeep

    2011-01-01

    Linguistic and semantic annotations are important features for text-based applications. However, achieving and maintaining a good quality of a set of annotations is known to be a complex task. Many ad hoc approaches have been developed to produce various types of annotations, while comparing those annotations to improve their quality is still rare. In this paper, we propose a framework in which both linguistic and domain information can cooperate to reason with annotat...

  13. Documents and legal texts

    International Nuclear Information System (INIS)

    2013-01-01

    This section reprints a selection of recently published legislative texts and documents: - Russian Federation: Federal Law No.170 of 21 November 1995 on the use of atomic energy, Adopted by the State Duma on 20 October 1995; - Uruguay: Law No.19.056 On the Radiological Protection and Safety of Persons, Property and the Environment (4 January 2013); - Japan: Third Supplement to Interim Guidelines on Determination of the Scope of Nuclear Damage resulting from the Accident at the Tokyo Electric Power Company Fukushima Daiichi and Daini Nuclear Power Plants (concerning Damages related to Rumour-Related Damage in the Agriculture, Forestry, Fishery and Food Industries), 30 January 2013; - France and the United States: Joint Statement on Liability for Nuclear Damage (Aug 2013); - Franco-Russian Nuclear Power Declaration (1 November 2013)

  14. Documents and legal texts

    International Nuclear Information System (INIS)

    2015-01-01

    This section treats of the following Documents and legal texts: 1 - Canada: Nuclear Liability and Compensation Act (An Act respecting civil liability and compensation for damage in case of a nuclear incident, repealing the Nuclear Liability Act and making consequential amendments to other acts); 2 - Japan: Act on Compensation for Nuclear Damage (The purpose of this act is to protect persons suffering from nuclear damage and to contribute to the sound development of the nuclear industry by establishing a basic system regarding compensation in case of nuclear damage caused by reactor operation etc.); Act on Indemnity Agreements for Compensation of Nuclear Damage; 3 - Slovak Republic: Act on Civil Liability for Nuclear Damage and on its Financial Coverage and on Changes and Amendments to Certain Laws (This Act regulates: a) The civil liability for nuclear damage incurred in the causation of a nuclear incident, b) The scope of powers of the Nuclear Regulatory Authority (hereinafter only as the 'Authority') in relation to the application of this Act, c) The competence of the National Bank of Slovakia in relation to the supervised financial market entities in the financial coverage of liability for nuclear damage; and d) The penalties for violation of this Act)

  15. Documents and legal texts

    International Nuclear Information System (INIS)

    2014-01-01

    This section of the Bulletin presents the recently published documents and legal texts sorted by country: - Brazil: Resolution No. 169 of 30 April 2014. - Japan: Act Concerning Exceptions to Interruption of Prescription Pertaining to Use of Settlement Mediation Procedures by the Dispute Reconciliation Committee for Nuclear Damage Compensation in relation to Nuclear Damage Compensation Disputes Pertaining to the Great East Japan Earthquake (Act No. 32 of 5 June 2013); Act Concerning Measures to Achieve Prompt and Assured Compensation for Nuclear Damage Arising from the Nuclear Plant Accident following the Great East Japan Earthquake and Exceptions to the Extinctive Prescription, etc. of the Right to Claim Compensation for Nuclear Damage (Act No. 97 of 11 December 2013); Fourth Supplement to Interim Guidelines on Determination of the Scope of Nuclear Damage Resulting from the Accident at the Tokyo Electric Power Company Fukushima Daiichi and Daini Nuclear Power Plants (Concerning Damages Associated with the Prolongation of Evacuation Orders, etc.); Outline of 'Fourth Supplement to Interim Guidelines (Concerning Damages Associated with the Prolongation of Evacuation Orders, etc.)'. - OECD Nuclear Energy Agency: Decision and Recommendation of the Steering Committee Concerning the Application of the Paris Convention to Nuclear Installations in the Process of Being Decommissioned; Joint Declaration on the Security of Supply of Medical Radioisotopes. - United Arab Emirates: Federal Decree No. (51) of 2014 Ratifying the Convention on Supplementary Compensation for Nuclear Damage; Ratification of the Federal Supreme Council of Federal Decree No. (51) of 2014 Ratifying the Convention on Supplementary Compensation for Nuclear Damage

  16. Bengali text summarization by sentence extraction

    OpenAIRE

    Sarkar, Kamal

    2012-01-01

    Text summarization is a process that produces an abstract or a summary by selecting a significant portion of the information from one or more texts. In an automatic text summarization process, a text is given to the computer and the computer returns a shorter, less redundant extract or abstract of the original text(s). Many techniques have been developed for summarizing English text(s), but very few attempts have been made at Bengali text summarization. This paper presents a method for Bengali ...

  17. The Effect of a Text Messaging Based HIV Prevention Program on Sexual Minority Male Youths: A National Evaluation of Information, Motivation and Behavioral Skills in a Randomized Controlled Trial of Guy2Guy.

    Science.gov (United States)

    Ybarra, Michele L; Liu, Weiwei; Prescott, Tonya L; Phillips, Gregory; Mustanski, Brian

    2018-04-25

    There is a paucity of literature documenting how the constructs of the Information-Motivation-Behavioral Skills (IMB) model are affected by exposure to technology-based HIV prevention programs. Guy2Guy, based on the IMB model, is the first comprehensive HIV prevention program delivered via text messaging and tested nationally among sexual minority adolescent males. Between June and November 2014, 302 14-18 year old gay, bisexual, and/or queer cisgender males were recruited across the US on Facebook and enrolled in a randomized controlled trial testing Guy2Guy versus an attention-matched control program. Among sexually inexperienced youth, those in the intervention were more than three times as likely to be in the "High motivation" group at follow-up as control youth (aOR = 3.13; P value = 0.04). The intervention effect was not significant when examined separately for those who were sexually active. HIV information did not significantly vary by experimental arm at 3 months post-intervention end, nor did behavioral skills for condom use or abstinence vary. The increase in motivation to engage in HIV preventive behavior for adolescent males with no prior sexual experience is promising, highlighting the need to tailor HIV prevention according to past sexual experience. The behavioral skills that were measured may not have reflected those most emphasized in the content (e.g., how to use lubrication to reduce risk and increase pleasure), which may explain the lack of detected intervention impact. ClinicalTrials.gov ID# NCT02113956.

  18. Information

    International Nuclear Information System (INIS)

    Boyard, Pierre.

    1981-01-01

    The fear of nuclear energy, and more particularly of radioactive wastes, is analyzed in its sociological context. Everybody agrees on the need for information; information is available, but its dissemination is a problem. Public reactions are analyzed, and journalists, scientists and teachers have a role to play.

  19. What Makes You Tick? An Empirical Study of Space Science Related Social Media Communications Using Machine Learning

    Science.gov (United States)

    Hwong, Y. L.; Oliver, C.; Van Kranendonk, M. J.

    2016-12-01

    The rise of social media has transformed the way the public engages with scientists and science organisations. 'Retweet', 'Like', 'Share' and 'Comment' are a few ways users engage with messages on Twitter and Facebook, two of the most popular social media platforms. Despite the availability of big data from these digital footprints, research into social media science communication is scant. This paper presents the results of an empirical study into the processes and outcomes of space science related social media communications using machine learning. The study is divided into two main parts. The first part is dedicated to the use of supervised learning methods to investigate the features of highly engaging messages, e.g. highly retweeted tweets and shared Facebook posts. It is hypothesised that these messages contain certain psycholinguistic features that are unique to the field of space science. We built a predictive model to forecast the engagement levels of social media posts. By using four feature sets (n-grams, psycholinguistics, grammar and social media), we were able to achieve prediction accuracies in the vicinity of 90% using three supervised learning algorithms (Naive Bayes, linear classifier and decision tree). We conducted the same experiments on social media messages from three other fields (politics, business and non-profit) and discovered several features that are exclusive to space science communications: anger, authenticity, hashtags, visual descriptions and a tentative tone. The second part of the study focuses on the extraction of topics from a corpus of texts using topic modelling. This part of the study is exploratory in nature and uses an unsupervised method called Latent Dirichlet Allocation (LDA) to uncover previously unknown topics within a large body of documents. Preliminary results indicate a strong potential of topic model algorithms to automatically uncover themes hidden within social media chatter on space related issues, with

  20. Text mining by Tsallis entropy

    Science.gov (United States)

    Jamaati, Maryam; Mehri, Ali

    2018-01-01

    Long-range correlations between the elements of natural languages enable them to convey very complex information. The complex structure of human language, as a manifestation of natural languages, motivates us to apply nonextensive statistical mechanics in text mining. Tsallis entropy appropriately ranks the terms' relevance to the document subject, taking advantage of their spatial correlation length. We apply this statistical concept as a powerful new word ranking metric in order to extract the keywords of a single document. We carry out an experimental evaluation, which shows the capability of the presented method in keyword extraction. We find that Tsallis entropy has reliable word ranking performance, at the same level as the best previous ranking methods.
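    For readers unfamiliar with the quantity named in this abstract, the standard Tsallis entropy of a discrete distribution is S_q = (1 - sum_i p_i^q) / (q - 1), which reduces to the Shannon entropy as q approaches 1. The sketch below computes only this general definition. It does not reproduce the paper's word ranking procedure, and the function name `tsallis_entropy` and the example distributions are illustrative assumptions.

```python
import math


def tsallis_entropy(probabilities, q):
    """Tsallis entropy S_q of a discrete probability distribution.

    Uses the standard definition S_q = (1 - sum(p_i ** q)) / (q - 1).
    For q == 1 the Shannon entropy (natural logarithm) is returned,
    which is the limit of S_q as q approaches 1.
    """
    positive = [p for p in probabilities if p > 0]  # zero-probability terms contribute nothing
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in positive)
    return (1.0 - sum(p ** q for p in positive)) / (q - 1.0)


if __name__ == "__main__":
    uniform = [0.25, 0.25, 0.25, 0.25]
    peaked = [0.97, 0.01, 0.01, 0.01]
    for q in (0.5, 1.0, 2.0):
        print(q, tsallis_entropy(uniform, q), tsallis_entropy(peaked, q))
```

    How a distribution is derived from a word's occurrences in a document, and which value of q is used, are choices made in the paper that this sketch does not attempt to guess.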

  1. The Effects of Background Music on Middle School Students' Recognition of Expository Text Information

    Institute of Scientific and Technical Information of China (English)

    刘明; 张裕鼎; 张立春

    2012-01-01

    With 221 first-year students from one middle school as participants, this study examined how background music of different types and sound pressure levels affects middle school students' recognition of expository text information. The participants came from five classes of Grade 1 and had a similar level in their Chinese course. The results were as follows. (1) The main effect of music type was significant, the main effect of sound pressure level was not, and their interaction was significant: at the high sound pressure level, music type produced significant differences in recognition scores, whereas at the low sound pressure level it did not. (2) At the high sound pressure level, different types of background music affected recognition scores to different degrees. Compared with the no-music condition, classical music significantly improved scores; pop music with Chinese lyrics and pop music with Japanese lyrics both produced significant interference; and pop music without lyrics had no significant effect on recognition. In addition, pop music with Chinese lyrics produced the largest interference, although the difference from pop music with Japanese lyrics was not significant.

  2. An Embedded Application for Degraded Text Recognition

    Directory of Open Access Journals (Sweden)

    Thillou Céline

    2005-01-01

    This paper describes a mobile device which tries to give the blind or visually impaired access to text information. Three key technologies are required for this system: text detection, optical character recognition, and speech synthesis. Blind users and the mobile environment imply two strong constraints. First, pictures will be taken without control over camera settings and without a priori information on the text (font or size) and background. The second issue is to link several techniques together with an optimal compromise between computational constraints and recognition efficiency. We will present the overall description of the system from text detection to OCR error correction.

  3. Text-Picture Relations in Cooking Instructions

    NARCIS (Netherlands)

    van der Sluis, Ielka; Leito, Shadira; Redeker, Gisela; Bunt, Harry

    2016-01-01

    Like many other instructions, recipes on packages with ready-to-use ingredients for a dish combine a series of pictures with short text paragraphs. The information presentation in such multimodal instructions can be compact (either text or picture) and/or cohesive (text and picture). In an

  4. Important Text Characteristics for Early-Grades Text Complexity

    Science.gov (United States)

    Fitzgerald, Jill; Elmore, Jeff; Koons, Heather; Hiebert, Elfrieda H.; Bowen, Kimberly; Sanford-Moore, Eleanor E.; Stenner, A. Jackson

    2015-01-01

    The Common Core set a standard for all children to read increasingly complex texts throughout schooling. The purpose of the present study was to explore text characteristics specifically in relation to early-grades text complexity. Three hundred fifty primary-grades texts were selected and digitized. Twenty-two text characteristics were identified…

  5. The Balinese Unicode Text Processing

    Directory of Open Access Journals (Sweden)

    Imam Habibi

    2009-06-01

    In principle, the computer only recognizes numbers as the representation of a character. Therefore, there are many encoding systems to allocate these numbers, although not all characters are covered. In Europe, a single language may even need more than one encoding system. Hence, a new encoding system known as Unicode has been established to overcome this problem. Unicode provides a unique ID for each character, which does not depend on platform, program, or language. The Unicode standard has been adopted by a number of industry vendors, such as Apple, HP, IBM, JustSystem, Microsoft, Oracle, SAP, Sun, Sybase, and Unisys. In addition, language standards and modern information exchanges such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, and WML make use of Unicode as an official tool for implementing ISO/IEC 10646. Four tasks are addressed for the Balinese script: algorithms for transliteration, searching, sorting, and word boundary analysis (spell checking). To verify the correctness of the algorithms, some applications were built. These applications can run on the Linux/Windows OS platform using J2SDK 1.5 and the J2ME WTK2 library. The input and output of the algorithms/applications are character sequences obtained from keyboard input and external files. This research produces a module or library that is able to process Balinese text based on the Unicode standard. The output of this research is the ability, skill, and mastery of: 1. the Unicode standard (21-bit) as a substitute for ASCII (7-bit) and ISO8859-1 (8-bit) as the former default character set in many applications; 2. the Balinese Unicode text processing algorithm; 3. an experience of working with and learning from an international team that consists of the foremost experts in the area: Michael Everson (Ireland), Peter Constable (Microsoft, US), I Made Suatjana, and Ida Bagus Adi Sudewa.
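    The idea that every character has a platform-independent number can be shown with a short sketch. It is written in Python for brevity and is unrelated to the Java library described in the abstract. The Balinese block range U+1B00 to U+1B7F comes from the Unicode Standard; the helper names `is_balinese` and `describe` are illustrative assumptions.

```python
# Illustrative sketch, not the module described in the abstract.
# The Balinese block in the Unicode Standard spans U+1B00..U+1B7F.
BALINESE_BLOCK = range(0x1B00, 0x1B80)


def is_balinese(ch: str) -> bool:
    """True if the single character `ch` lies in the Balinese Unicode block."""
    return ord(ch) in BALINESE_BLOCK


def describe(ch: str):
    """Return the code point label and the UTF-8 bytes (hex) of one character."""
    code_point = f"U+{ord(ch):04X}"
    utf8_bytes = " ".join(f"{b:02x}" for b in ch.encode("utf-8"))
    return code_point, utf8_bytes


if __name__ == "__main__":
    for ch in ["a", "\u1b05", "\u1b13"]:
        print(describe(ch), "Balinese" if is_balinese(ch) else "not Balinese")
    # Code points fit in 21 bits: the largest valid value is U+10FFFF.
    print((0x10FFFF).bit_length())
```

    A 7-bit ASCII character such as "a" occupies one byte in UTF-8, while a Balinese character occupies three, yet both are identified by a single code point regardless of platform or program.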

  6. Üstverinin Tam-Metin Bilgi Erişim Performansı Üzerindeki Etkisi: Küçük Ölçekli Türkçe Külliyat Üzerinde Deneysel Bir Araştırma / Impact of Metadata on Full-text Information Retrieval Performance: An Experimental Research on a Small Scale Turkish Corpus

    OpenAIRE

    Çapkın, Çağdaş

    2016-01-01

    Information institutions use text-based information retrieval systems to store, index and retrieve metadata, full-text, or both metadata and full-text (hybrid) contents. The aim of this research was to evaluate impact of these contents on information retrieval performance. For this purpose, metadata (MIR), full-text (FIR) and hybrid (HIR) content information retrieval systems were developed with default Lucene information retrieval model for a small scale Turkish corpus. In order to evaluate ...

  7. Classroom Texting in College Students

    Science.gov (United States)

    Pettijohn, Terry F.; Frazier, Erik; Rieser, Elizabeth; Vaughn, Nicholas; Hupp-Wilds, Bobbi

    2015-01-01

    A 21-item survey on texting in the classroom was given to 235 college students. Overall, 99.6% of students owned a cellphone and 98% texted daily. Of the 138 students who texted in the classroom, most texted friends or significant others, and indicated that the reason for classroom texting was boredom or work. Students who texted sent a mean of 12.21…

  8. CONAN : Text Mining in the Biomedical Domain

    NARCIS (Netherlands)

    Malik, R.

    2006-01-01

    This thesis is about text mining: extracting important information from the literature. In recent years, the number of biomedical articles and journals has been growing exponentially. Scientists might not find the information they want because of the large number of publications. Therefore a system was

  9. Observation of [Formula: see text] and [Formula: see text] decays.

    Science.gov (United States)

    Aaij, R; Adeva, B; Adinolfi, M; Ajaltouni, Z; Akar, S; Albrecht, J; Alessio, F; Alexander, M; Ali, S; Alkhazov, G; Alvarez Cartelle, P; Alves, A A; Amato, S; Amerio, S; Amhis, Y; An, L; Anderlini, L; Andreassi, G; Andreotti, M; Andrews, J E; Appleby, R B; Archilli, F; d'Argent, P; Arnau Romeu, J; Artamonov, A; Artuso, M; Aslanides, E; Auriemma, G; Baalouch, M; Babuschkin, I; Bachmann, S; Back, J J; Badalov, A; Baesso, C; Baker, S; Baldini, W; Barlow, R J; Barschel, C; Barsuk, S; Barter, W; Baszczyk, M; Batozskaya, V; Batsukh, B; Battista, V; Bay, A; Beaucourt, L; Beddow, J; Bedeschi, F; Bediaga, I; Bel, L J; Bellee, V; Belloli, N; Belous, K; Belyaev, I; Ben-Haim, E; Bencivenni, G; Benson, S; Benton, J; Berezhnoy, A; Bernet, R; Bertolin, A; Betancourt, C; Betti, F; Bettler, M-O; van Beuzekom, M; Bezshyiko, Ia; Bifani, S; Billoir, P; Bird, T; Birnkraut, A; Bitadze, A; Bizzeti, A; Blake, T; Blanc, F; Blouw, J; Blusk, S; Bocci, V; Boettcher, T; Bondar, A; Bondar, N; Bonivento, W; Bordyuzhin, I; Borgheresi, A; Borghi, S; Borisyak, M; Borsato, M; Bossu, F; Boubdir, M; Bowcock, T J V; Bowen, E; Bozzi, C; Braun, S; Britsch, M; Britton, T; Brodzicka, J; Buchanan, E; Burr, C; Bursche, A; Buytaert, J; Cadeddu, S; Calabrese, R; Calvi, M; Calvo Gomez, M; Camboni, A; Campana, P; Campora Perez, D H; Capriotti, L; Carbone, A; Carboni, G; Cardinale, R; Cardini, A; Carniti, P; Carson, L; Carvalho Akiba, K; Casse, G; Cassina, L; Castillo Garcia, L; Cattaneo, M; Cauet, Ch; Cavallero, G; Cenci, R; Charles, M; Charpentier, Ph; Chatzikonstantinidis, G; Chefdeville, M; Chen, S; Cheung, S-F; Chobanova, V; Chrzaszcz, M; Cid Vidal, X; Ciezarek, G; Clarke, P E L; Clemencic, M; Cliff, H V; Closier, J; Coco, V; Cogan, J; Cogneras, E; Cogoni, V; Cojocariu, L; Collazuol, G; Collins, P; Comerma-Montells, A; Contu, A; Cook, A; Coombs, G; Coquereau, S; Corti, G; Corvo, M; Costa Sobral, C M; Couturier, B; Cowan, G A; Craik, D C; Crocombe, A; Cruz Torres, M; Cunliffe, S; Currie, R; D'Ambrosio, C; 
Da Cunha Marinho, F; Dall'Occo, E; Dalseno, J; David, P N Y; Davis, A; De Aguiar Francisco, O; De Bruyn, K; De Capua, S; De Cian, M; De Miranda, J M; De Paula, L; De Serio, M; De Simone, P; Dean, C-T; Decamp, D; Deckenhoff, M; Del Buono, L; Demmer, M; Dendek, A; Derkach, D; Deschamps, O; Dettori, F; Dey, B; Di Canto, A; Dijkstra, H; Dordei, F; Dorigo, M; Dosil Suárez, A; Dovbnya, A; Dreimanis, K; Dufour, L; Dujany, G; Dungs, K; Durante, P; Dzhelyadin, R; Dziurda, A; Dzyuba, A; Déléage, N; Easo, S; Ebert, M; Egede, U; Egorychev, V; Eidelman, S; Eisenhardt, S; Eitschberger, U; Ekelhof, R; Eklund, L; Ely, S; Esen, S; Evans, H M; Evans, T; Falabella, A; Farley, N; Farry, S; Fay, R; Fazzini, D; Ferguson, D; Fernandez Prieto, A; Ferrari, F; Ferreira Rodrigues, F; Ferro-Luzzi, M; Filippov, S; Fini, R A; Fiore, M; Fiorini, M; Firlej, M; Fitzpatrick, C; Fiutowski, T; Fleuret, F; Fohl, K; Fontana, M; Fontanelli, F; Forshaw, D C; Forty, R; Franco Lima, V; Frank, M; Frei, C; Fu, J; Furfaro, E; Färber, C; Gallas Torreira, A; Galli, D; Gallorini, S; Gambetta, S; Gandelman, M; Gandini, P; Gao, Y; Garcia Martin, L M; García Pardiñas, J; Garra Tico, J; Garrido, L; Garsed, P J; Gascon, D; Gaspar, C; Gavardi, L; Gazzoni, G; Gerick, D; Gersabeck, E; Gersabeck, M; Gershon, T; Ghez, Ph; Gianì, S; Gibson, V; Girard, O G; Giubega, L; Gizdov, K; Gligorov, V V; Golubkov, D; Golutvin, A; Gomes, A; Gorelov, I V; Gotti, C; Govorkova, E; Grabalosa Gándara, M; Graciani Diaz, R; Granado Cardoso, L A; Graugés, E; Graverini, E; Graziani, G; Grecu, A; Griffith, P; Grillo, L; Gruberg Cazon, B R; Grünberg, O; Gushchin, E; Guz, Yu; Gys, T; Göbel, C; Hadavizadeh, T; Hadjivasiliou, C; Haefeli, G; Haen, C; Haines, S C; Hall, S; Hamilton, B; Han, X; Hansmann-Menzemer, S; Harnew, N; Harnew, S T; Harrison, J; Hatch, M; He, J; Head, T; Heister, A; Hennessy, K; Henrard, P; Henry, L; Hernando Morata, J A; van Herwijnen, E; Heß, M; Hicheur, A; Hill, D; Hombach, C; Hopchev, H; Hulsbergen, W; Humair, T; Hushchyn, 
M; Hussain, N; Hutchcroft, D; Idzik, M; Ilten, P; Jacobsson, R; Jaeger, A; Jalocha, J; Jans, E; Jawahery, A; Jiang, F; John, M; Johnson, D; Jones, C R; Joram, C; Jost, B; Jurik, N; Kandybei, S; Kanso, W; Karacson, M; Kariuki, J M; Karodia, S; Kecke, M; Kelsey, M; Kenyon, I R; Kenzie, M; Ketel, T; Khairullin, E; Khanji, B; Khurewathanakul, C; Kirn, T; Klaver, S; Klimaszewski, K; Koliiev, S; Kolpin, M; Komarov, I; Koopman, R F; Koppenburg, P; Kosmyntseva, A; Kozachuk, A; Kozeiha, M; Kravchuk, L; Kreplin, K; Kreps, M; Krokovny, P; Kruse, F; Krzemien, W; Kucewicz, W; Kucharczyk, M; Kudryavtsev, V; Kuonen, A K; Kurek, K; Kvaratskheliya, T; Lacarrere, D; Lafferty, G; Lai, A; Lanfranchi, G; Langenbruch, C; Latham, T; Lazzeroni, C; Le Gac, R; van Leerdam, J; Lees, J-P; Leflat, A; Lefrançois, J; Lefèvre, R; Lemaitre, F; Lemos Cid, E; Leroy, O; Lesiak, T; Leverington, B; Li, Y; Likhomanenko, T; Lindner, R; Linn, C; Lionetto, F; Liu, B; Liu, X; Loh, D; Longstaff, I; Lopes, J H; Lucchesi, D; Lucio Martinez, M; Luo, H; Lupato, A; Luppi, E; Lupton, O; Lusiani, A; Lyu, X; Machefert, F; Maciuc, F; Maev, O; Maguire, K; Malde, S; Malinin, A; Maltsev, T; Manca, G; Mancinelli, G; Manning, P; Maratas, J; Marchand, J F; Marconi, U; Marin Benito, C; Marino, P; Marks, J; Martellotti, G; Martin, M; Martinelli, M; Martinez Santos, D; Martinez Vidal, F; Martins Tostes, D; Massacrier, L M; Massafferri, A; Matev, R; Mathad, A; Mathe, Z; Matteuzzi, C; Mauri, A; Maurin, B; Mazurov, A; McCann, M; McCarthy, J; McNab, A; McNulty, R; Meadows, B; Meier, F; Meissner, M; Melnychuk, D; Merk, M; Merli, A; Michielin, E; Milanes, D A; Minard, M-N; Mitzel, D S; Mogini, A; Molina Rodriguez, J; Monroy, I A; Monteil, S; Morandin, M; Morawski, P; Mordà, A; Morello, M J; Moron, J; Morris, A B; Mountain, R; Muheim, F; Mulder, M; Mussini, M; Müller, D; Müller, J; Müller, K; Müller, V; Naik, P; Nakada, T; Nandakumar, R; Nandi, A; Nasteva, I; Needham, M; Neri, N; Neubert, S; Neufeld, N; Neuner, M; Nguyen, A D; 
Nguyen, T D; Nguyen-Mau, C; Nieswand, S; Niet, R; Nikitin, N; Nikodem, T; Novoselov, A; O'Hanlon, D P; Oblakowska-Mucha, A; Obraztsov, V; Ogilvy, S; Oldeman, R; Onderwater, C J G; Otalora Goicochea, J M; Otto, A; Owen, P; Oyanguren, A; Pais, P R; Palano, A; Palombo, F; Palutan, M; Panman, J; Papanestis, A; Pappagallo, M; Pappalardo, L L; Parker, W; Parkes, C; Passaleva, G; Pastore, A; Patel, G D; Patel, M; Patrignani, C; Pearce, A; Pellegrino, A; Penso, G; Pepe Altarelli, M; Perazzini, S; Perret, P; Pescatore, L; Petridis, K; Petrolini, A; Petrov, A; Petruzzo, M; Picatoste Olloqui, E; Pietrzyk, B; Pikies, M; Pinci, D; Pistone, A; Piucci, A; Playfer, S; Plo Casasus, M; Poikela, T; Polci, F; Poluektov, A; Polyakov, I; Polycarpo, E; Pomery, G J; Popov, A; Popov, D; Popovici, B; Poslavskii, S; Potterat, C; Price, E; Price, J D; Prisciandaro, J; Pritchard, A; Prouve, C; Pugatch, V; Puig Navarro, A; Punzi, G; Qian, W; Quagliani, R; Rachwal, B; Rademacker, J H; Rama, M; Ramos Pernas, M; Rangel, M S; Raniuk, I; Ratnikov, F; Raven, G; Redi, F; Reichert, S; Dos Reis, A C; Remon Alepuz, C; Renaudin, V; Ricciardi, S; Richards, S; Rihl, M; Rinnert, K; Rives Molina, V; Robbe, P; Rodrigues, A B; Rodrigues, E; Rodriguez Lopez, J A; Rodriguez Perez, P; Rogozhnikov, A; Roiser, S; Rollings, A; Romanovskiy, V; Romero Vidal, A; Ronayne, J W; Rotondo, M; Rudolph, M S; Ruf, T; Ruiz Valls, P; Saborido Silva, J J; Sadykhov, E; Sagidova, N; Saitta, B; Salustino Guimaraes, V; Sanchez Mayordomo, C; Sanmartin Sedes, B; Santacesaria, R; Santamarina Rios, C; Santimaria, M; Santovetti, E; Sarti, A; Satriano, C; Satta, A; Saunders, D M; Savrina, D; Schael, S; Schellenberg, M; Schiller, M; Schindler, H; Schlupp, M; Schmelling, M; Schmelzer, T; Schmidt, B; Schneider, O; Schopper, A; Schubert, K; Schubiger, M; Schune, M-H; Schwemmer, R; Sciascia, B; Sciubba, A; Semennikov, A; Sergi, A; Serra, N; Serrano, J; Sestini, L; Seyfert, P; Shapkin, M; Shapoval, I; Shcheglov, Y; Shears, T; Shekhtman, L; 
Shevchenko, V; Siddi, B G; Silva Coutinho, R; Silva de Oliveira, L; Simi, G; Simone, S; Sirendi, M; Skidmore, N; Skwarnicki, T; Smith, E; Smith, I T; Smith, J; Smith, M; Snoek, H; Sokoloff, M D; Soler, F J P; Souza De Paula, B; Spaan, B; Spradlin, P; Sridharan, S; Stagni, F; Stahl, M; Stahl, S; Stefko, P; Stefkova, S; Steinkamp, O; Stemmle, S; Stenyakin, O; Stevenson, S; Stoica, S; Stone, S; Storaci, B; Stracka, S; Straticiuc, M; Straumann, U; Sun, L; Sutcliffe, W; Swientek, K; Syropoulos, V; Szczekowski, M; Szumlak, T; T'Jampens, S; Tayduganov, A; Tekampe, T; Tellarini, G; Teubert, F; Thomas, E; van Tilburg, J; Tilley, M J; Tisserand, V; Tobin, M; Tolk, S; Tomassetti, L; Tonelli, D; Topp-Joergensen, S; Toriello, F; Tournefier, E; Tourneur, S; Trabelsi, K; Traill, M; Tran, M T; Tresch, M; Trisovic, A; Tsaregorodtsev, A; Tsopelas, P; Tully, A; Tuning, N; Ukleja, A; Ustyuzhanin, A; Uwer, U; Vacca, C; Vagnoni, V; Valassi, A; Valat, S; Valenti, G; Vallier, A; Vazquez Gomez, R; Vazquez Regueiro, P; Vecchi, S; van Veghel, M; Velthuis, J J; Veltri, M; Veneziano, G; Venkateswaran, A; Vernet, M; Vesterinen, M; Viaud, B; Vieira, D; Vieites Diaz, M; Viemann, H; Vilasis-Cardona, X; Vitti, M; Volkov, V; Vollhardt, A; Voneki, B; Vorobyev, A; Vorobyev, V; Voß, C; de Vries, J A; Vázquez Sierra, C; Waldi, R; Wallace, C; Wallace, R; Walsh, J; Wang, J; Ward, D R; Wark, H M; Watson, N K; Websdale, D; Weiden, A; Whitehead, M; Wicht, J; Wilkinson, G; Wilkinson, M; Williams, M; Williams, M P; Williams, M; Williams, T; Wilson, F F; Wimberley, J; Wishahi, J; Wislicki, W; Witek, M; Wormser, G; Wotton, S A; Wraight, K; Wyllie, K; Xie, Y; Xing, Z; Xu, Z; Yang, Z; Yin, H; Yu, J; Yuan, X; Yushchenko, O; Zarebski, K A; Zavertyaev, M; Zhang, L; Zhang, Y; Zhang, Y; Zhelezov, A; Zheng, Y; Zhokhov, A; Zhu, X; Zhukov, V; Zucchelli, S

    2017-01-01

    The decays [Formula: see text] and [Formula: see text] are observed for the first time using a data sample corresponding to an integrated luminosity of 3.0 fb[Formula: see text], collected by the LHCb experiment in proton-proton collisions at the centre-of-mass energies of 7 and 8[Formula: see text]. The branching fractions relative to that of [Formula: see text] are measured to be [Formula: see text], where the first uncertainties are statistical and the second are systematic.

  10. The Weaknesses of Full-Text Searching

    Science.gov (United States)

    Beall, Jeffrey

    2008-01-01

    This paper provides a theoretical critique of the deficiencies of full-text searching in academic library databases. Because full-text searching relies on matching words in a search query with words in online resources, it is an inefficient method of finding information in a database. This matching fails to retrieve synonyms, and it also retrieves…

  11. Mining the Text: 34 Text Features that Can Ease or Obstruct Text Comprehension and Use

    Science.gov (United States)

    White, Sheida

    2012-01-01

    This article presents 34 characteristics of texts and tasks ("text features") that can make continuous (prose), noncontinuous (document), and quantitative texts easier or more difficult for adolescents and adults to comprehend and use. The text features were identified by examining the assessment tasks and associated texts in the national…

  12. Text Mining of Supreme Administrative Court Jurisdictions

    OpenAIRE

    Feinerer, Ingo; Hornik, Kurt

    2007-01-01

    Within the last decade text mining, i.e., extracting sensitive information from text corpora, has become a major factor in business intelligence. The automated textual analysis of law corpora is highly valuable because of its impact on a company's legal options and the raw amount of available jurisdiction. The study of supreme court jurisdiction and international law corpora is equally important due to its effects on business sectors. In this paper we use text mining methods to investigate Au...

  13. From Text to Political Positions: Text analysis across disciplines

    NARCIS (Netherlands)

    Kaal, A.R.; Maks, I.; van Elfrinkhof, A.M.E.

    2014-01-01

    From Text to Political Positions addresses cross-disciplinary innovation in political text analysis for party positioning. Drawing on political science, computational methods and discourse analysis, it presents a diverse collection of analytical models including pure quantitative and

  14. Text mining from ontology learning to automated text processing applications

    CERN Document Server

    Biemann, Chris

    2014-01-01

    This book comprises a set of articles that specify the methodology of text mining, describe the creation of lexical resources in the framework of text mining and use text mining for various tasks in natural language processing (NLP). The analysis of large amounts of textual data is a prerequisite to build lexical resources such as dictionaries and ontologies and also has direct applications in automated text processing in fields such as history, healthcare and mobile applications, just to name a few. This volume gives an update in terms of the recent gains in text mining methods and reflects

  15. Working with text tools, techniques and approaches for text mining

    CERN Document Server

    Tourte, Gregory J L

    2016-01-01

    Text mining tools and technologies have long been a part of the repository world, where they have been applied to a variety of purposes, from pragmatic aims to support tools. Research areas as diverse as biology, chemistry, sociology and criminology have seen effective use made of text mining technologies. Working With Text collects a subset of the best contributions from the 'Working with text: Tools, techniques and approaches for text mining' workshop, alongside contributions from experts in the area. Text mining tools and technologies in support of academic research include supporting research on the basis of a large body of documents, facilitating access to and reuse of extant work, and bridging between the formal academic world and areas such as traditional and social media. Jisc have funded a number of projects, including NaCTem (the National Centre for Text Mining) and the ResDis programme. Contents are developed from workshop submissions and invited contributions, including: Legal considerations in te...

  16. The Only Safe SMS Texting Is No SMS Texting.

    Science.gov (United States)

    Toth, Cheryl; Sacopulos, Michael J

    2015-01-01

    Many physicians and practice staff use short messaging service (SMS) text messaging to communicate with patients. But SMS text messaging is unencrypted, insecure, and does not meet HIPAA requirements. In addition, the short and abbreviated nature of text messages creates opportunities for misinterpretation, and can negatively impact patient safety and care. Until recently, asking patients to sign a statement that they understand and accept these risks--as well as having policies, device encryption, and cyber insurance in place--would have been enough to mitigate the risk of using SMS text in a medical practice. But new trends and policies have made SMS text messaging unsafe under any circumstance. This article explains these trends and policies, as well as why only secure texting or secure messaging should be used for physician-patient communication.

  17. Monitoring interaction and collective text production through text mining

    Directory of Open Access Journals (Sweden)

    Macedo, Alexandra Lorandi

    2014-04-01

    This article presents the Concepts Network tool, developed using text mining technology. The main objective of this tool is to extract and relate the terms of greatest incidence in a text and to exhibit the results in the form of a graph. The Network was implemented in the Collective Text Editor (CTE), which is an online tool that allows the production of texts in synchronized or non-synchronized forms. This article describes the application of the Network both in texts produced collectively and texts produced in a forum. The purpose of the tool is to offer support to the teacher in managing the high volume of data generated in the process of interaction amongst students and in the construction of the text. Specifically, the aim is to facilitate the teacher’s job by allowing him/her to process data in a shorter time than is currently demanded. The results suggest that the Concepts Network can aid the teacher, as it provides indicators of the quality of the text produced. Moreover, messages posted in forums can be analyzed without their content necessarily having to be pre-read.
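
The idea of relating the most frequent terms of a text in a graph can be illustrated with a minimal sketch. This is not the Concepts Network implementation; it assumes a simple reading in which two frequent terms are linked whenever they occur in the same sentence. The function name `concept_network`, the stop-word list, and the `top_k` parameter are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for", "on", "by", "with"}


def concept_network(text: str, top_k: int = 5):
    """Return (term counts, edge counts) for the top_k most frequent terms.

    An edge (a, b) counts the sentences in which both terms appear.
    This co-occurrence rule is an assumption, not the published method.
    """
    # Split into sentences on terminal punctuation, then into lowercase words.
    sentences = [s for s in re.split(r"[.!?]+", text.lower()) if s.strip()]
    tokenized = [
        [w for w in re.findall(r"[a-z]+", s) if w not in STOP_WORDS]
        for s in sentences
    ]

    # Terms of greatest incidence: the top_k most frequent words.
    counts = Counter(w for sent in tokenized for w in sent)
    top_terms = {w for w, _ in counts.most_common(top_k)}

    # Relate terms: one edge increment per sentence containing both terms.
    edges = Counter()
    for sent in tokenized:
        present = sorted(set(sent) & top_terms)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1

    return {t: counts[t] for t in top_terms}, edges


if __name__ == "__main__":
    sample = (
        "Students write text. Students discuss text in the forum. "
        "The forum helps students."
    )
    nodes, edges = concept_network(sample, top_k=3)
    print(nodes)
    print(dict(edges))
```

The node and edge counts returned here could be passed to any graph-drawing library; sentence-level co-occurrence is only one of several plausible ways to relate terms.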

  18. Text recycling: acceptable or misconduct?

    Science.gov (United States)

    Harriman, Stephanie; Patel, Jigisha

    2014-08-16

    Text recycling, also referred to as self-plagiarism, is the reproduction of an author's own text from a previous publication in a new publication. Opinions on the acceptability of this practice vary, with some viewing it as acceptable and efficient, and others as misleading and unacceptable. In light of the lack of consensus, journal editors often have difficulty deciding how to act upon the discovery of text recycling. In response to these difficulties, we have created a set of guidelines for journal editors on how to deal with text recycling. In this editorial, we discuss some of the challenges of developing these guidelines, and how authors can avoid undisclosed text recycling.

  19. Text mining resources for the life sciences.

    Science.gov (United States)

    Przybyła, Piotr; Shardlow, Matthew; Aubin, Sophie; Bossy, Robert; Eckart de Castilho, Richard; Piperidis, Stelios; McNaught, John; Ananiadou, Sophia

    2016-01-01

    Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable, that is, those that have the crucial ability to share information, enabling smooth integration and reusability. © The Author(s) 2016. Published by Oxford University Press.

  20. Text mining resources for the life sciences

    Science.gov (United States)

    Shardlow, Matthew; Aubin, Sophie; Bossy, Robert; Eckart de Castilho, Richard; Piperidis, Stelios; McNaught, John; Ananiadou, Sophia

    2016-01-01

    Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable—those that have the crucial ability to share information, enabling smooth integration and reusability. PMID:27888231

  1. TEXT DEIXIS IN NARRATIVE SEQUENCES

    Directory of Open Access Journals (Sweden)

    Josep Rivera

    2007-06-01

    This study looks at demonstrative descriptions, regarding them as text-deictic procedures which contribute to weaving discourse reference. Text deixis is thought of as a metaphorical referential device which maps the ground of utterance onto the text itself. Demonstrative expressions with textual antecedent-triggers, considered as the most important text-deictic units, are identified in a narrative corpus consisting of J. M. Barrie’s Peter Pan and its translation into Catalan. Some linguistic and discourse variables related to DemNPs are analysed to characterise text deixis adequately. It is shown that this referential device is usually combined with abstract nouns, thus categorising and encapsulating (non-nominal) complex discourse entities as nouns, while performing a referential cohesive function by means of the text deixis + general noun type of lexical cohesion.

  2. Text against Text: Counterbalancing the Hegemony of Assessment.

    Science.gov (United States)

    Cosgrove, Cornelius

    A study examined whether composition specialists can counterbalance the potential privileging of the assessment perspective, or of self-appointed interpreters of that perspective, through the study of assessment discourse as text. Fourteen assessment texts were examined, most of them journal articles and most of them featuring the common…

  3. Texting while driving: is speech-based text entry less risky than handheld text entry?

    Science.gov (United States)

    He, J; Chaparro, A; Nguyen, B; Burge, R J; Crandall, J; Chaparro, B; Ni, R; Cao, S

    2014-11-01

    Research indicates that using a cell phone to talk or text while maneuvering a vehicle impairs driving performance. However, few published studies directly compare the distracting effects of texting using a hands-free (i.e., speech-based interface) versus handheld cell phone, which is an important issue for legislation, automotive interface design and driving safety training. This study compared the effect of speech-based versus handheld text entries on simulated driving performance by asking participants to perform a car following task while controlling the duration of a secondary text-entry task. Results showed that both speech-based and handheld text entries impaired driving performance relative to the drive-only condition by causing more variation in speed and lane position. Handheld text entry also increased the brake response time and increased variation in headway distance. Text entry using a speech-based cell phone was less detrimental to driving performance than handheld text entry. Nevertheless, the speech-based text entry task still significantly impaired driving compared to the drive-only condition. These results suggest that speech-based text entry disrupts driving, but reduces the level of performance interference compared to text entry with a handheld device. In addition, the difference in the distraction effect caused by speech-based and handheld text entry is not simply due to the difference in task duration. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. Active Learning for Text Classification

    OpenAIRE

    Hu, Rong

    2011-01-01

    Text classification approaches are used extensively to solve real-world challenges. The success or failure of text classification systems hangs on the datasets used to train them; without a good dataset it is impossible to build a quality system. This thesis examines the applicability of active learning in text classification for the rapid and economical creation of labelled training data. Four main contributions are made in this thesis. First, we present two novel selection strategies to cho...

  5. Text segmentation in degraded historical document images

    Directory of Open Access Journals (Sweden)

    A.S. Kavitha

    2016-07-01

    Text segmentation from degraded historical Indus script images helps an Optical Character Recognizer (OCR) to achieve good recognition rates for Indus scripts; however, it is challenging due to the complex background in such images. In this paper, we present a new method for segmenting text and non-text in Indus documents based on the fact that text components are less cursive than non-text ones. To achieve this, we propose a new combination of Sobel and Laplacian operators for enhancing degraded low-contrast pixels. The proposed method then generates skeletons for text components in the enhanced images to reduce the computational burden, which in turn helps in studying component structures efficiently. We propose to study the cursiveness of components based on branch information to remove false text components. The proposed method introduces the nearest neighbor criterion for grouping components in the same line, which results in clusters. Furthermore, the proposed method classifies these clusters into text and non-text clusters based on characteristics of text components. We evaluate the proposed method on a large dataset containing a variety of images. The results are compared with existing methods to show that the proposed method is effective in terms of recall and precision.

  6. Answering Questions from Oceanography Texts: Learner, Task and Text Characteristics.

    Science.gov (United States)

    1987-09-15

    course and the seventh (HS) was the teaching assistant for the course. Students completed a background questionnaire dealing with academic information...language skills and study habits. Table 1 provides a summary of the most pertinent information from this questionnaire. The teaching assistant and three...20. Celce-Murcia, M., & Larsen-Freeman, D. (1983). The grammar book: An ESL/EFL teacher’s course. Rowley, MA: Newbury House. Chafe, W. L. (1985

  7. Text and ideology: text-oriented discourse analysis

    Directory of Open Access Journals (Sweden)

    Maria Eduarda Gonçalves Peixoto

    2018-04-01

    The article aims to contribute to the understanding of the connection between text and ideology articulated by the text-oriented analysis of discourse (ADTO). Based on the reflections of Fairclough (1989, 2001, 2003) and Fairclough and Chouliaraki (1999), the debate presents the social ontology that ADTO uses to base its conception of social life as an open system and textually mediated; the article then explains the chronological-narrative development of the main critical theories of ideology, by virtue of which ADTO organizes the assumptions that underpin the particular use it makes of the term. Finally, the discussion presents the main aspects of the connection between text and ideology, offering a conceptual framework that can contribute to the domain of the theme according to a critical discourse analysis approach.

  8. Resource Lean and Portable Automatic Text Summarization

    OpenAIRE

    Hassel, Martin

    2007-01-01

    Today, with digitally stored information available in abundance, even for many minor languages, this information must by some means be filtered and extracted in order to avoid drowning in it. Automatic summarization is one such technique, where a computer summarizes a longer text to a shorter non-redundant form. Apart from the major languages of the world there are a lot of languages for which large bodies of data aimed at language technology research to a high degree are lacking. There migh...

  9. Knowledge Based Understanding of Radiology Text

    OpenAIRE

    Ranum, David L.

    1988-01-01

    A data acquisition tool which will extract pertinent diagnostic information from radiology reports has been designed and implemented. Pertinent diagnostic information is defined as that clinical data which is used by the HELP medical expert system. The program uses a memory based semantic parsing technique to “understand” the text. Moreover, the memory structures and lexicon necessary to perform this action are automatically generated from the diagnostic knowledge base by using a special purp...

  10. Financial Statement Fraud Detection using Text Mining

    OpenAIRE

    Rajan Gupta; Nasib Singh Gill

    2013-01-01

    Data mining techniques have been used enormously by the researchers’ community in detecting financial statement fraud. Most of the research in this direction has used the numbers (quantitative information) i.e. financial ratios present in the financial statements for detecting fraud. There is very little or no research on the analysis of text such as auditor’s comments or notes present in published reports. In this study we propose a text mining approach for detecting financial statement frau...

  11. English Metafunction Analysis in Chemistry Text: Characterization of Scientific Text

    Directory of Open Access Journals (Sweden)

    Ahmad Amin Dalimunte, M.Hum

    2013-09-01

    The objectives of this research are to identify which Metafunctions are applied in a chemistry text and how they characterize a scientific text. It was conducted by applying content analysis. The data for this research was a twelve-paragraph chemistry text. The data were collected by applying a documentary technique. The document was read and analyzed to find the Metafunctions. The data were analyzed by the following procedures: identifying the types of process, counting the number of processes, categorizing and counting the cohesion devices, classifying the types of modulation and determining modality value, and finally counting the number of sentences and clauses and then scoring the grammatical intricacy index. The findings of the research show that the Material process (71 of 100) is the most used, and that circumstance of spatial location (26 of 56) is more dominant than the others. Modality (5) is used less in order to avoid subjectivity. Impersonality is implied through little use of reference, either pronouns (7) or demonstratives (7); conjunctions (60) are applied to develop ideas; and the total number of clauses (109) is much greater than the total number of sentences (40), which results in a high grammatical intricacy index. The Metafunctions found indicate that the chemistry text has fulfilled the characteristics of a scientific or academic text, which truly reflects it as a natural science.
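
The counts reported in this abstract can be turned into the ratios they imply. The sketch below assumes the common definition of grammatical intricacy as the number of clauses divided by the number of sentences; the abstract does not state its formula, so this definition and the function name `grammatical_intricacy` are assumptions made for illustration.

```python
def grammatical_intricacy(clauses: int, sentences: int) -> float:
    """Clauses per sentence (assumed definition of the intricacy index)."""
    if sentences <= 0:
        raise ValueError("sentences must be a positive count")
    return clauses / sentences


if __name__ == "__main__":
    # Counts taken from the abstract above.
    print(f"intricacy index: {grammatical_intricacy(109, 40):.3f}")
    print(f"material processes: {71 / 100:.0%} of all processes")
    print(f"spatial location: {26 / 56:.1%} of all circumstances")
```

Under that assumption, the 109 clauses and 40 sentences correspond to an average of a little under three clauses per sentence.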

  12. Strategies for Translating Vocative Texts

    Directory of Open Access Journals (Sweden)

    Olga COJOCARU

    2014-12-01

    The paper deals with the linguistic and cultural elements of vocative texts and the techniques used in translating them by giving some examples of texts that are typically vocative (i.e. advertisements and instructions for use). Semantic and communicative strategies are popular in translation studies and each of them has its own advantages and disadvantages in translating vocative texts. The advantage of semantic translation is that it takes more account of the aesthetic value of the SL text, while communicative translation attempts to render the exact contextual meaning of the original text in such a way that both content and language are readily acceptable and comprehensible to the readership. Focus is laid on the strategies used in translating vocative texts, strategies that highlight and introduce a cultural context to the target audience, in order to achieve their overall purpose, that is to sell or persuade the reader to behave in a certain way. Thus, in order to do that, a number of advertisements from the field of cosmetics industry and electronic gadgets were selected for analysis. The aim is to gather insights into vocative text translation and to create new perspectives on this field of research, now considered a process of innovation and diversion, especially in areas as important as economy and marketing.

  13. Systematic characterizations of text similarity in full text biomedical publications.

    Science.gov (United States)

    Sun, Zhaohui; Errami, Mounir; Long, Tara; Renard, Chris; Choradia, Nishant; Garner, Harold

    2010-09-15

    Computational methods have been used to find duplicate biomedical publications in MEDLINE. Full text articles are becoming increasingly available, yet the similarities among them have not been systematically studied. Here, we quantitatively investigated the full text similarity of biomedical publications in PubMed Central. 72,011 full text articles from PubMed Central (PMC) were parsed to generate three different datasets: full texts, sections, and paragraphs. Text similarity comparisons were performed on these datasets using the text similarity algorithm eTBLAST. We measured the frequency of similar text pairs and compared it among different datasets. We found that high abstract similarity can be used to predict high full text similarity with a specificity of 20.1% (95% CI [17.3%, 23.1%]) and sensitivity of 99.999%. Abstract similarity and full text similarity have a moderate correlation (Pearson correlation coefficient: -0.423) when the similarity ratio is above 0.4. Among pairs of articles in PMC, method sections are found to be the most repetitive (frequency of similar pairs, methods: 0.029, introduction: 0.0076, results: 0.0043). In contrast, among a set of manually verified duplicate articles, results are the most repetitive sections (frequency of similar pairs, results: 0.94, methods: 0.89, introduction: 0.82). Repetition of introduction and methods sections is more likely to be committed by the same authors (odds of a highly similar pair having at least one shared author, introduction: 2.31, methods: 1.83, results: 1.03). There is also significantly more similarity in pairs of review articles than in pairs containing one review and one nonreview paper (frequency of similar pairs: 0.0167 and 0.0023, respectively). While quantifying abstract similarity is an effective approach for finding duplicate citations, a comprehensive full text analysis is necessary to uncover all potential duplicate citations in the scientific literature and is helpful when

  14. Figure text extraction in biomedical literature.

    Directory of Open Access Journals (Sweden)

    Daehyun Kim

    2011-01-01

    Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine (http://figuresearch.askHERMES.org) to allow bioscientists to access figures efficiently. Since text frequently appears in figures, automatically extracting such text may assist the task of mining information from figures. Little research, however, has been conducted exploring text extraction from biomedical figures. We first evaluated an off-the-shelf Optical Character Recognition (OCR) tool on its ability to extract text from figures appearing in biomedical full-text articles. We then developed a Figure Text Extraction Tool (FigTExT) to improve the performance of the OCR tool for figure text extraction through the use of three innovative components: image preprocessing, character recognition, and text correction. We first developed image preprocessing to enhance image quality and to improve text localization. Then we adapted the off-the-shelf OCR tool on the improved text localization for character recognition. Finally, we developed and evaluated a novel text correction framework by taking advantage of figure-specific lexicons. The evaluation on 382 figures (9,643 figure texts in total) randomly selected from PubMed Central full-text articles shows that FigTExT performed with 84% precision, 98% recall, and 90% F1-score for text localization and with 62.5% precision, 51.0% recall and 56.2% F1-score for figure text extraction. When limiting figure texts to those judged by domain experts to be important content, FigTExT performed with 87.3% precision, 68.8% recall, and 77% F1-score. FigTExT significantly improved the performance of the off-the-shelf OCR tool we used, which on its own performed with 36.6% precision, 19.3% recall, and 25.3% F1-score for
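
The F1-scores quoted in this abstract follow from the reported precision and recall through the standard harmonic-mean formula. The short sketch below recomputes them as a consistency check; the helper name `f1_score` and the dictionary labels are chosen here for illustration and are not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract above.
reported = {
    "text localization": (0.84, 0.98),
    "figure text extraction": (0.625, 0.510),
    "important content only": (0.873, 0.688),
    "off-the-shelf OCR alone": (0.366, 0.193),
}

if __name__ == "__main__":
    for setting, (p, r) in reported.items():
        print(f"{setting}: F1 = {f1_score(p, r):.3f}")
```

Each recomputed value rounds to the figure given in the abstract (90%, 56.2%, 77% and 25.3%).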

  15. Linguistic Dating of Biblical Texts

    DEFF Research Database (Denmark)

    Ehrensvärd, Martin Gustaf

    2003-01-01

    For two centuries, scholars have pointed to consistent differences in the Hebrew of certain biblical texts and interpreted these differences as reflecting the date of composition of the texts. Until the 1980s, this was quite uncontroversial as the linguistic findings largely confirmed the chronology of the texts established by other means: the Hebrew of Genesis-2 Kings was judged to be early and that of Esther, Daniel, Ezra, Nehemiah, and Chronicles to be late. In the current debate where revisionists have questioned the traditional dating, linguistic arguments in the dating of texts have come more into focus. The study critically examines some linguistic arguments adduced to support the traditional position, and reviewing the arguments it points to weaknesses in the linguistic dating of EBH texts to pre-exilic times. When viewing the linguistic evidence in isolation it will be clear...

  16. Biomarker Identification Using Text Mining

    Directory of Open Access Journals (Sweden)

    Hui Li

    2012-01-01

    Full Text Available Identifying molecular biomarkers has become one of the important tasks for scientists to assess the different phenotypic states of cells or organisms correlated to the genotypes of diseases from large-scale biological data. In this paper, we propose a text-mining-based method to discover biomarkers from PubMed. First, we construct a database based on a dictionary, and then we use a finite state machine to identify the biomarkers. Our method of text mining provides a highly reliable approach to discover the biomarkers in the PubMed database.
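
    The abstract does not describe the dictionary or the finite state machine in detail. As a hedged illustration of the general idea only, the sketch below compiles dictionary terms into a token-level trie, which behaves as a deterministic finite state machine, and scans text for the longest match at each position. The class name TermFSM, the tokenization rule, and the sample terms are assumptions made for this example, not the authors' implementation.

```python
from typing import Dict, List, Tuple


class TermFSM:
    """Token-level trie over dictionary terms, used as a deterministic FSM."""

    def __init__(self, terms: List[str]) -> None:
        self.transitions: List[Dict[str, int]] = [{}]  # state id -> {token: next state}
        self.accepting: Dict[int, str] = {}            # state id -> dictionary term
        for term in terms:
            state = 0
            for token in term.lower().split():
                nxt = self.transitions[state].get(token)
                if nxt is None:
                    nxt = len(self.transitions)
                    self.transitions.append({})
                    self.transitions[state][token] = nxt
                state = nxt
            self.accepting[state] = term

    def find(self, text: str) -> List[Tuple[int, str]]:
        """Return (token index, term) for the longest match starting at each position."""
        # Simplistic tokenization (an assumption): split on whitespace, trim punctuation.
        tokens = [t.strip(".,;:()").lower() for t in text.split()]
        hits: List[Tuple[int, str]] = []
        i = 0
        while i < len(tokens):
            state, j, last = 0, i, None
            while j < len(tokens) and tokens[j] in self.transitions[state]:
                state = self.transitions[state][tokens[j]]
                j += 1
                if state in self.accepting:
                    last = (j, self.accepting[state])  # remember the longest match so far
            if last is None:
                i += 1
            else:
                hits.append((i, last[1]))
                i = last[0]  # resume scanning after the matched term
        return hits


# Illustrative dictionary entries; a real system would load a curated lexicon.
fsm = TermFSM(["PSA", "prostate specific antigen", "HER2"])
print(fsm.find("Elevated prostate specific antigen (PSA) and HER2 levels."))
```

    Preferring the longest match keeps a multi-word term such as "prostate specific antigen" from being reported as a shorter overlapping entry.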

  17. Figure-associated text summarization and evaluation.

    Directory of Open Access Journals (Sweden)

    Balaji Polepalli Ramesh

    Full Text Available Biomedical literature incorporates millions of figures, which are a rich and important knowledge resource for biomedical researchers. Scientists need access to the figures and the knowledge they represent in order to validate research findings and to generate new hypotheses. By themselves, these figures are nearly always incomprehensible to both humans and machines and their associated texts are therefore essential for full comprehension. The associated text of a figure, however, is scattered throughout its full-text article and contains redundant information content. In this paper, we report the continued development and evaluation of several figure summarization systems, the FigSum+ systems, that automatically identify associated texts, remove redundant information, and generate a text summary for every figure in an article. Using a set of 94 annotated figures selected from 19 different journals, we conducted an intrinsic evaluation of FigSum+. We evaluate the performance by precision, recall, F1, and ROUGE scores. The best FigSum+ system is based on an unsupervised method, achieving F1 score of 0.66 and ROUGE-1 score of 0.97. The annotated data is available at figshare.com (http://figshare.com/articles/Figure_Associated_Text_Summarization_and_Evaluation/858903).

  18. Anomaly Detection with Text Mining

    Data.gov (United States)

    National Aeronautics and Space Administration — Many existing complex space systems have a significant amount of historical maintenance and problem data bases that are stored in unstructured text forms. The...

  19. Social Studies: Texts and Supplements.

    Science.gov (United States)

    Curriculum Review, 1979

    1979-01-01

    This review of selected social studies texts, series, and supplements, mainly for the secondary level, includes a special section examining eight titles on warfare and terrorism for grades 4-12. (SJL)

  20. NOTICING HYBRID RECASTS IN TEXT CHAT

    Directory of Open Access Journals (Sweden)

    Mark J. Oliver

    2016-12-01

    Full Text Available This study examined ten EFL learners’ noticing of the corrective nature of a form of text-based SCMC (text chat) feedback that combined a recast of a grammatical error with metalinguistic information. The feedback, termed a hybrid recast, was provided by a native-speaker interlocutor during two text chat activities: a spot-the-difference and picture-ordering task. Data was collected in two ways: analysis of task-based dyadic text chat interaction in which uptake was used as an indicator of learner noticing, and a post-task questionnaire containing questions that identified evidence of learner noticing. Interaction analysis showed that learners responded to almost two thirds of the hybrid recasts with uptake. In addition, every learner provided evidence that they had correctly perceived at least some of the hybrid recasts as corrective in their post-task questionnaire responses.

  1. EXPLORING STUDENTS’ DIFFICULTIES IN READING ACADEMIC TEXTS

    Directory of Open Access Journals (Sweden)

    Ira Ernawati

    2017-04-01

    Full Text Available Academic texts play an important role for university students. However, those texts are considered difficult. This study is intended to investigate students’ difficulties in reading academic texts. The qualitative approach was employed in this study. The design was a case study. The participants were ten students from the fifth semester of the CLS: EE (Classroom Language and Strategy: Explaining and Exemplifying) class who were selected by using purposive sampling. The data were gathered from students’ journal reflections, observation, and interview. The findings show that the students encountered reading difficulties in the area of textual factors, namely vocabulary, comprehending specific information, text organization, and grammar, and of human factors, including background knowledge, mood, laziness, and time constraint.

  2. Frontiers of biomedical text mining: current progress

    Science.gov (United States)

    Zweigenbaum, Pierre; Demner-Fushman, Dina; Yu, Hong; Cohen, Kevin B.

    2008-01-01

    It is now almost 15 years since the publication of the first paper on text mining in the genomics domain, and decades since the first paper on text mining in the medical domain. Enormous progress has been made in the areas of information retrieval, evaluation methodologies and resource construction. Some problems, such as abbreviation-handling, can essentially be considered solved problems, and others, such as identification of gene mentions in text, seem likely to be solved soon. However, a number of problems at the frontiers of biomedical text mining continue to present interesting challenges and opportunities for great improvements and interesting research. In this article we review the current state of the art in biomedical text mining or ‘BioNLP’ in general, focusing primarily on papers published within the past year. PMID:17977867

  3. Text Character Extraction Implementation from Captured Handwritten Image to Text Conversion using Template Matching Technique

    Directory of Open Access Journals (Sweden)

    Barate Seema

    2016-01-01

    Full Text Available Images contain various types of useful information that should be extracted whenever required. Various algorithms and methods have been proposed to extract text from a given image so that the user can access the text in any image. Variations in text size, style, orientation, and alignment, together with low image contrast and composite backgrounds, make text extraction difficult. An application that extracts and recognizes such text accurately in real time can be applied to many important tasks, such as document analysis, vehicle license plate extraction, and text-based image indexing, and many such applications have become realities in recent years. To overcome the above problems, we develop an application that converts an image into text by using algorithms such as bounding box, HSV model, blob analysis, template matching, and template generation.
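
    Template matching, one of the algorithms named in this record, can be sketched as comparing a segmented character bitmap against stored templates and choosing the best-scoring one. The example below is a toy illustration under stated assumptions: glyphs are already segmented and scaled to the template size, the score is the fraction of agreeing pixels, and the 3x3 templates are made up for the demonstration. It is not the authors' implementation.

```python
from typing import Dict, List

Bitmap = List[str]  # rows of '#' (ink) and '.' (background), all of equal size


def match_score(glyph: Bitmap, template: Bitmap) -> float:
    """Fraction of pixel positions on which glyph and template agree."""
    total = sum(len(row) for row in template)
    same = sum(
        g == t
        for g_row, t_row in zip(glyph, template)
        for g, t in zip(g_row, t_row)
    )
    return same / total


def recognize(glyph: Bitmap, templates: Dict[str, Bitmap]) -> str:
    """Return the label of the template with the highest match score."""
    return max(templates, key=lambda label: match_score(glyph, templates[label]))


# Hypothetical 3x3 templates; real templates would be larger and learned from samples.
templates = {
    "I": [".#.", ".#.", ".#."],
    "L": ["#..", "#..", "###"],
}

noisy_l = ["#..", "#..", "##."]  # an 'L' with one missing pixel
print(recognize(noisy_l, templates))
```

    A pixel-agreement score is the simplest choice; normalized cross-correlation is a common alternative when contrast varies between captures.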

  4. Text Mining in Organizational Research.

    Science.gov (United States)

    Kobayashi, Vladimer B; Mol, Stefan T; Berkers, Hannah A; Kismihók, Gábor; Den Hartog, Deanne N

    2018-07-01

    Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies.

  5. Studies of electron cyclotron emission on TEXT

    International Nuclear Information System (INIS)

    Gandy, R.F.

    1990-07-01

    The Auburn University electron cyclotron emission (ECE) system has made many significant contributions to the TEXT experimental program during the past five years. Contributions include electron temperature information used in the following areas of study: electron cyclotron heating (ECH), pellet injection, and impurity/energy transport. Details of the role which the Auburn ECE system has played will now be discussed.

  6. The Cultural Content of Business Spanish Texts.

    Science.gov (United States)

    Grosse, Christine Uber; Uber, David

    A study examined eight business Spanish textbooks for cultural content by looking at commonly appearing cultural topics and themes, presentation of cultural information, activities and techniques used to promote cultural understanding, and incorporation of authentic materials. The texts were evenly divided among beginning, intermediate, and…

  7. Neogeography: The Treasure of User Volunteered Text

    NARCIS (Netherlands)

    Habib, Mena Badieh

    Neogeography is the combination of user generated data and experiences with mapping technologies. This poster presents a research project to extract valuable structured information with a geographic component from unstructured user generated text in wikis, forums, or SMSs. The project intends to

  8. Monolingual accounting dictionaries for EFL text production

    DEFF Research Database (Denmark)

    Nielsen, Sandro

    2006-01-01

    Monolingual accounting dictionaries are important for producing financial reporting texts in English in an international setting, because of the lack of specialised bilingual dictionaries. As the intended user groups have different factual and linguistic competences, they require specific types of information. By identifying and analysing the users' factual and linguistic competences, user needs, use-situations and the stages involved in producing accounting texts in English as a foreign language, lexicographers will have a sound basis for designing the optimal English accounting dictionary for EFL text production. The monolingual accounting dictionary needs to include information about UK, US and international accounting terms, their grammatical properties, their potential for being combined with other words in collocations, phrases and sentences in order to meet user requirements. Data items...

  9. Monolingual Accounting Dictionaries for EFL Text Production

    DEFF Research Database (Denmark)

    Nielsen, Sandro

    2009-01-01

    Monolingual accounting dictionaries are important for producing financial reporting texts in English in an international setting, because of the lack of specialised bilingual dictionaries. As the intended user groups have different factual and linguistic competences, they require specific types of information. By identifying and analysing the users' factual and linguistic competences, user needs, use-situations and the stages involved in producing accounting texts in English as a foreign language, lexicographers will have a sound basis for designing the optimal English accounting dictionary for EFL text production. The monolingual accounting dictionary needs to include information about UK, US and international accounting terms, their grammatical properties, their potential for being combined with other words in collocations, phrases and sentences in order to meet user requirements. Data items...

  10. GPU-Accelerated Text Mining

    International Nuclear Information System (INIS)

    Cui, X.; Mueller, F.; Zhang, Y.; Potok, Thomas E.

    2009-01-01

    Accelerating hardware devices represent a novel promise for improving the performance for many problem domains, but it is not clear which accelerators are suitable for which domains. While there is no room in general-purpose processor design to significantly increase the processor frequency, developers are instead resorting to multi-core chips duplicating conventional computing capabilities on a single die. Yet, accelerators offer more radical designs with a much higher level of parallelism and novel programming environments. This present work assesses the viability of text mining on CUDA. Text mining is one of the key concepts that has become prominent as an effective means to index the Internet, but its applications range beyond this scope and extend to providing document similarity metrics, the subject of this work. We have developed and optimized text search algorithms for GPUs to exploit their potential for massive data processing. We discuss the algorithmic challenges of parallelization for text search problems on GPUs and demonstrate the potential of these devices in experiments by reporting significant speedups. Our study may be one of the first to assess more complex text search problems for suitability for GPU devices, and it may also be one of the first to exploit and report on the use of atomic instructions, which have recently become available in NVIDIA devices.
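
    The abstract does not say which document similarity metric was computed on the GPU. As a CPU-side point of reference only, the sketch below computes cosine similarity over raw term-frequency vectors, one widely used metric. The function name and the whitespace tokenization are assumptions for this example.

```python
import math
from collections import Counter


def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine of the angle between the term-frequency vectors of two documents."""
    tf_a = Counter(doc_a.lower().split())
    tf_b = Counter(doc_b.lower().split())
    # Only terms present in both documents contribute to the dot product.
    dot = sum(tf_a[term] * tf_b[term] for term in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(count * count for count in tf_a.values()))
    norm_b = math.sqrt(sum(count * count for count in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


print(cosine_similarity("text mining on gpu devices", "gpu text search"))
```

    Each document pair can be scored independently of every other pair, which is the kind of data parallelism that makes such metrics candidates for GPU execution.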

  11. Comprehending text in literature class

    Directory of Open Access Journals (Sweden)

    Purić Daliborka S.

    2016-01-01

    Full Text Available The paper discusses the problem of understanding a text and the contribution of methodological apparatus in the reader book to comprehension of a text being read in junior classes of elementary school. By using the technique of content analysis from methodological apparatuses in eight reader books for the fourth grade of elementary school, approved for use in the 2014/2015 academic year, and surveying 350 teachers in 33 elementary schools and 11 administrative districts in the Republic of Serbia, we examined: (a) to what extent the Serbian language textbook contents enable junior students to understand a literary text; (b) to what extent teachers accept the suggestions offered in the textbook for preparing literature teaching. The results show that a large number of suggestions relate to reading comprehension, but some categories of understanding are unevenly distributed in the methodological apparatus. On the other hand, the majority of teachers use the methodological apparatus given in a textbook for preparing classes, not only the textbook he or she selected for teaching but also other textbooks for the same grade.

  12. Augmenting Oracle Text with the UMLS for enhanced searching of free-text medical reports.

    Science.gov (United States)

    Ding, Jing; Erdal, Selnur; Dhaval, Rakesh; Kamal, Jyoti

    2007-10-11

    The intrinsic complexity of free-text medical reports imposes great challenges for information retrieval systems. We have developed a prototype search engine for retrieving clinical reports that leverages the powerful indexing and querying capabilities of Oracle Text, and the rich biomedical domain knowledge and semantic structures that are captured in the UMLS Metathesaurus.

  13. A Guide Text or Many Texts? “That is the Question”

    Directory of Open Access Journals (Sweden)

    Delgado de Valencia Sonia

    2001-08-01

    Full Text Available The use of supplementary materials in the classroom has always been an essential part of the teaching and learning process. To restrict our teaching to the scope of one single textbook means to fall behind the advances of knowledge, in any area and context. Young learners appreciate any new and varied support that expands their knowledge of the world: diaries, letters, panels, free texts, magazines, short stories, poems or literary excerpts, and articles taken from the Internet are materials that will allow learners to share more and work more collaboratively. In this article we are going to deal with some of these materials and with the criteria to select, adapt, and create them so that they may be of interest to the learner and may promote reading and writing processes. Since no text can entirely satisfy the needs of students and teachers, the creativity of both parties will be necessary to improve the quality of teaching through the adequate use and adaptation of supplementary materials.

  14. Monolingual accounting dictionaries for EFL text production

    Directory of Open Access Journals (Sweden)

    Sandro Nielsen

    2006-10-01

    Full Text Available Monolingual accounting dictionaries are important for producing financial reporting texts in English in an international setting, because of the lack of specialised bilingual dictionaries. As the intended user groups have different factual and linguistic competences, they require specific types of information. By identifying and analysing the users' factual and linguistic competences, user needs, use-situations and the stages involved in producing accounting texts in English as a foreign language, lexicographers will have a sound basis for designing the optimal English accounting dictionary for EFL text production. The monolingual accounting dictionary needs to include information about UK, US and international accounting terms, their grammatical properties, their potential for being combined with other words in collocations, phrases and sentences in order to meet user requirements. Data items that deal with these aspects are necessary for the international user group as they produce subject-field specific and register-specific texts in a foreign language, and the data items are relevant for the various stages in text production: draft writing, copyediting, stylistic editing and proofreading.

  15. Figure-associated text summarization and evaluation.

    Science.gov (United States)

    Polepalli Ramesh, Balaji; Sethi, Ricky J; Yu, Hong

    2015-01-01

    Biomedical literature incorporates millions of figures, which are a rich and important knowledge resource for biomedical researchers. Scientists need access to the figures and the knowledge they represent in order to validate research findings and to generate new hypotheses. By themselves, these figures are nearly always incomprehensible to both humans and machines and their associated texts are therefore essential for full comprehension. The associated text of a figure, however, is scattered throughout its full-text article and contains redundant information content. In this paper, we report the continued development and evaluation of several figure summarization systems, the FigSum+ systems, that automatically identify associated texts, remove redundant information, and generate a text summary for every figure in an article. Using a set of 94 annotated figures selected from 19 different journals, we conducted an intrinsic evaluation of FigSum+. We evaluate the performance by precision, recall, F1, and ROUGE scores. The best FigSum+ system is based on an unsupervised method, achieving F1 score of 0.66 and ROUGE-1 score of 0.97. The annotated data is available at figshare.com (http://figshare.com/articles/Figure_Associated_Text_Summarization_and_Evaluation/858903).

  16. The Information Content of Picture-Text Assembly Instructions.

    Science.gov (United States)

    1982-03-01


  17. Context and Structure in Automated Full-Text Information Access

    Science.gov (United States)

    1994-04-29

    Meisei, Makayo, Nitsuko and Tamura, all of Japan; Goldstar, Samsung and OPC of South Korea, and Sun Moon Star of Taiwan; AT&T says the practices have...IN MALAYSIA [ ... ] Another example topic description is shown below: Topic 034 <dom> Domain: Science and Technology <title>Topic: Entities Involved In

  18. Defense Technical Information Center Free Text Experiment - Technical Report File.

    Science.gov (United States)

    1981-10-01

    but also to minimize "no hits." For the purpose of this test the National Library of Medicine (NLM) Stop Word List (appendix D) was used as a basis for...Search @SRTAB@ black cats end 8. Synonyms should be considered as part of the search strategy: Example: MARIJUANA, MARIHUANA, POT, GRASS, WEED, MARY

  19. Defense Technical Information Center Free Text Experiment - Management Data Bases.

    Science.gov (United States)

    1981-10-01

    that can be retrieved directly from the inverted file. In other online systems, such as the National Library of Medicine/MEDLARS or the Systems...should be considered during search strategy formulation. EXAMPLE: Marijuana, Marihuana, Pot, Grass, Weed, Mary Jane 8. Foreign spellings should be

  20. Individual Profiling Using Text Analysis

    Science.gov (United States)

    2016-04-15

    AFRL-AFOSR-UK-TR-2016-0011, Individual Profiling using Text Analysis, 140333. Mark Stevenson, University of Sheffield, Department of Psychology. Final report; dates covered: 15 Sep 2014 to 14 Sep 2015. ...consisted of collections of tweets for a number of Twitter users whose gender, age and personality scores are known. The task was to construct some system

  1. Identifying issue frames in text.

    Directory of Open Access Journals (Sweden)

    Eyal Sagi

    Full Text Available Framing, the effect of context on cognitive processes, is a prominent topic of research in psychology and public opinion research. Research on framing has traditionally relied on controlled experiments and manually annotated document collections. In this paper we present a method that allows for quantifying the relative strengths of competing linguistic frames based on corpus analysis. This method requires little human intervention and can therefore be efficiently applied to large bodies of text. We demonstrate its effectiveness by tracking changes in the framing of terror over time and comparing the framing of abortion by Democrats and Republicans in the U.S.

  2. Finding text in color images

    Science.gov (United States)

    Zhou, Jiangying; Lopresti, Daniel P.; Tasdizen, Tolga

    1998-04-01

    In this paper, we consider the problem of locating and extracting text from WWW images. A previous algorithm based on color clustering and connected components analysis works well as long as the color of each character is relatively uniform and the typography is fairly simple. It breaks down quickly, however, when these assumptions are violated. In this paper, we describe more robust techniques for dealing with this challenging problem. We present an improved color clustering algorithm that measures similarity based on both RGB and spatial proximity. Layout analysis is also incorporated to handle more complex typography. These changes significantly enhance the performance of our text detection procedure.
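
    The abstract states that similarity is measured on both RGB and spatial proximity but gives no formula. One simple way to combine the two, shown here purely as an assumption, is a weighted Euclidean distance over pixel color and position. The spatial_weight parameter and the Pixel layout are hypothetical and are not taken from the paper.

```python
import math
from typing import Tuple

Pixel = Tuple[int, int, int, int, int]  # (x, y, r, g, b); layout chosen for this sketch


def combined_distance(p: Pixel, q: Pixel, spatial_weight: float = 0.5) -> float:
    """Distance that grows with both color difference and spatial separation."""
    spatial_sq = sum((a - b) ** 2 for a, b in zip(p[:2], q[:2]))
    color_sq = sum((a - b) ** 2 for a, b in zip(p[2:], q[2:]))
    # spatial_weight is a hypothetical tuning knob trading position against color.
    return math.sqrt(color_sq + spatial_weight * spatial_sq)


near = combined_distance((10, 10, 200, 30, 30), (12, 10, 205, 28, 33))
far = combined_distance((10, 10, 200, 30, 30), (90, 70, 205, 28, 33))
print(near < far)  # same color difference, but the second pair is farther apart
```

    Under a measure of this kind, two pixels of similar color on opposite sides of an image are less likely to be merged into one cluster than two adjacent pixels of the same colors.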

  3. The Medline/full-text research project.

    Science.gov (United States)

    McKinin, E J; Sievert, M; Johnson, E D; Mitchell, J A

    1991-05-01

    This project was designed to test the relative efficacy of index terms and full-text for the retrieval of documents in those MEDLINE journals for which full-text searching was also available. The full-text files used were MEDIS from Mead Data Central and CCML from BRS Information Technologies. One hundred clinical medical topics were searched in these two files as well as the MEDLINE file to accumulate the necessary data. It was found that full-text identified significantly more relevant articles than did the indexed file, MEDLINE. The full-text searches, however, lacked the precision of searches done in the indexed file. Most relevant items missed in the full-text files, but identified in MEDLINE, were missed because the searcher failed to account for some aspect of natural language, used a logical or positional operator that was too restrictive, or included a concept which was implied, but not expressed in the natural language. Very few of the unique relevant full-text citations would have been retrieved by title or abstract alone. Finally, as of July, 1990 the more current issue of a journal was just as likely to appear in MEDLINE as in one of the full-text files.

  4. Schema-Based Text Comprehension

    Science.gov (United States)

    Ensar, Ferhat

    2015-01-01

    Schema is one of the most common terms used for classifying and constructing knowledge. Therefore, a schema is a pre-planned set of concepts. It usually contains social information and is used to represent chain of events, perceptions, situations, relationships and even objects. For example, Kant initially defines the idea of schema as some…

  5. Multilingual text induced spelling correction

    NARCIS (Netherlands)

    Reynaert, M.W.C.

    2004-01-01

    We present TISC, a multilingual, language-independent and context-sensitive spelling checking and correction system designed to facilitate the automatic removal of non-word spelling errors in large corpora. Its lexicon is derived from raw text corpora, without supervision, and contains word unigrams

  6. Automated analysis of instructional text

    Energy Technology Data Exchange (ETDEWEB)

    Norton, L.M.

    1983-05-01

    The development of a capability for automated processing of natural language text is a long-range goal of artificial intelligence. This paper discusses an investigation into the issues involved in the comprehension of descriptive, as opposed to illustrative, textual material. The comprehension process is viewed as the conversion of knowledge from one representation into another. The proposed target representation consists of statements of the Prolog language, which can be interpreted both declaratively and procedurally, much like production rules. A computer program has been written to model in detail some ideas about this process. The program successfully analyzes several heavily edited paragraphs adapted from an elementary textbook on programming, automatically synthesizing as a result of the analysis a working Prolog program which, when executed, can parse and interpret LET commands in the BASIC language. The paper discusses the motivations and philosophy of the project, the many kinds of prerequisite knowledge which are necessary, and the structure of the text analysis program. A sentence-by-sentence account of the analysis of the sample text is presented, describing the syntactic and semantic processing which is involved. The paper closes with a discussion of lessons learned from the project, possible alternative approaches, and possible extensions for future work. The entire project is presented as illustrative of the nature and complexity of the text analysis process, rather than as providing definitive or optimal solutions to any aspects of the task. 12 references.

  7. Solar Concepts: A Background Text.

    Science.gov (United States)

    Gorham, Jonathan W.

    This text is designed to provide teachers, students, and the general public with an overview of key solar energy concepts. Various energy terms are defined and explained. Basic thermodynamic laws are discussed. Alternative energy production is described in the context of the present energy situation. Described are the principal contemporary solar…

  8. FTP: Full-Text Publishing?

    Science.gov (United States)

    Jul, Erik

    1992-01-01

    Describes the use of file transfer protocol (FTP) on the INTERNET computer network and considers its use as an electronic publishing system. The differing electronic formats of text files are discussed; the preparation and access of documents are described; and problems are addressed, including a lack of consistency. (LRW)

  9. Quality Inspection of Printed Texts

    DEFF Research Database (Denmark)

    Pedersen, Jesper Ballisager; Nasrollahi, Kamal; Moeslund, Thomas B.

    2016-01-01

    -folded: for customers of the printing and verification system, the overall grade is used to verify if the text is of sufficient quality, while for the printer's manufacturer, the detailed character/symbol grades and quality measurements are used for the improvement and optimization of the printing task. The proposed system...

  10. Text mining in livestock animal science: introducing the potential of text mining to animal sciences.

    Science.gov (United States)

    Sahadevan, S; Hofmann-Apitius, M; Schellander, K; Tesfaye, D; Fluck, J; Friedrich, C M

    2012-10-01

    In biological research, establishing the prior art by searching and collecting information already present in the domain has equal importance as the experiments done. To obtain a complete overview about the relevant knowledge, researchers mainly rely on 2 major information sources: i) various biological databases and ii) scientific publications in the field. The major difference between the 2 information sources is that information from databases is available, typically well structured and condensed. The information content in scientific literature is vastly unstructured; that is, dispersed among the many different sections of scientific text. The traditional method of information extraction from scientific literature occurs by generating a list of relevant publications in the field of interest and manually scanning these texts for relevant information, which is very time consuming. It is more than likely that in using this "classical" approach the researcher misses some relevant information mentioned in the literature or has to go through biological databases to extract further information. Text mining and named entity recognition methods have already been used in human genomics and related fields as a solution to this problem. These methods can process and extract information from large volumes of scientific text. Text mining is defined as the automatic extraction of previously unknown and potentially useful information from text. Named entity recognition (NER) is defined as the method of identifying named entities (names of real world objects; for example, gene/protein names, drugs, enzymes) in text. In animal sciences, text mining and related methods have been briefly used in murine genomics and associated fields, leaving behind other fields of animal sciences, such as livestock genomics. The aim of this work was to develop an information retrieval platform in the livestock domain focusing on livestock publications and the recognition of relevant data from

  11. The text of the amended Protocol to the Agreement between the Kingdom of Swaziland and the International Atomic Energy Agency for the Application of Safeguards in Connection with the Treaty on the Non-Proliferation of Nuclear Weapons, is reproduced in this document for the information of all Member States of the Agency

    International Nuclear Information System (INIS)

    2010-01-01

    The text of the amended Protocol to the Agreement between the Kingdom of Swaziland and the International Atomic Energy Agency for the Application of Safeguards in Connection with the Treaty on the Non-Proliferation of Nuclear Weapons, is reproduced in this document for the information of all Member States of the Agency.

  12. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases the accuracy at the same time. The test example is classified using a simpler and smaller model. The training examples in a particular cluster share a common vocabulary. At the time of clustering, we do not take into account the labels of the training examples. After the clusters have been created, the classifier is trained on each cluster having reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...

  13. Linguistic dating of biblical texts

    DEFF Research Database (Denmark)

    Young, Ian; Rezetko, Robert; Ehrensvärd, Martin Gustaf

    Since the beginning of critical scholarship biblical texts have been dated using linguistic evidence. In recent years this has become a controversial topic, especially with the publication of Ian Young (ed.), Biblical Hebrew: Studies in Chronology and Typology (2003). However, until now there has been no introduction and comprehensive study of the field. Volume 1 introduces the field of linguistic dating of biblical texts, particularly to intermediate and advanced students of biblical Hebrew who have a reasonable background in the language, having completed at least an introductory course... in this volume are: What is it that makes Archaic Biblical Hebrew archaic, Early Biblical Hebrew early, and Late Biblical Hebrew late? Does linguistic typology, i.e. different linguistic characteristics, convert easily and neatly into linguistic chronology, i.e. different historical origins? A large amount...

  14. Text as an Autopoietic System

    DEFF Research Database (Denmark)

    Nicolaisen, Maria Skou

    2016-01-01

    The aim of the present research article is to discuss the possibilities and limitations in addressing text as an autopoietic system. The theory of autopoiesis originated in the field of biology in order to explain the dynamic processes entailed in sustaining living organisms at cellular level. Th....... By comparing the biological with the textual account of autopoietic agency, the end conclusion is that a newly derived concept of sociopoiesis might be better suited for discussing the architecture of textual systems....

  15. The TEXT upgrade vertical interferometer

    International Nuclear Information System (INIS)

    Hallock, G.A.; Gartman, M.L.; Li, W.; Chiang, K.; Shin, S.; Castles, R.L.; Chatterjee, R.; Rahman, A.S.

    1992-01-01

    A far-infrared interferometer has been installed on TEXT upgrade to obtain electron density profiles. The primary system views the plasma vertically through a set of large (60-cm radial × 7.62-cm toroidal) diagnostic ports. A 1-cm channel spacing (59 channels total) and fast electronic time response are used to provide high resolution for radial profiles and perturbation experiments. Initial operation of the vertical system was obtained late in 1991, with six operating channels.

  16. Benchmarking infrastructure for mutation text mining.

    Science.gov (United States)

    Klein, Artjom; Riazanov, Alexandre; Hindle, Matthew M; Baker, Christopher Jo

    2014-02-25

    Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption.
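    The performance metrics mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the SPARQL-based infrastructure the authors describe: the function name, the representation of an annotation as a (document id, mutation string) pair, and the example values are all assumptions made for illustration.

```python
def precision_recall_f1(gold, predicted):
    """Compare predicted annotations with gold-standard ones (exact match)."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical annotations: (document id, mutation mention).
gold = {("doc1", "A123T"), ("doc1", "G45D"), ("doc2", "L10P")}
predicted = {("doc1", "A123T"), ("doc2", "L10P"), ("doc2", "K7R")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

    Exact set matching is the simplest possible comparison; a real benchmark would probably also need to handle partial matches and normalization of mutation mentions.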

  17. Benchmarking infrastructure for mutation text mining

    Science.gov (United States)

    2014-01-01

    Background: Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. Results: We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. Conclusion: We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption. PMID: 24568600

  18. Rhetorical structure theory and text analysis

    Science.gov (United States)

    Mann, William C.; Matthiessen, Christian M. I. M.; Thompson, Sandra A.

    1989-11-01

    Recent research on text generation has shown that there is a need for stronger linguistic theories that tell in detail how texts communicate. The prevailing theories are very difficult to compare, and it is also very difficult to see how they might be combined into stronger theories. To make comparison and combination a bit more approachable, we have created a book which is designed to encourage comparison. A dozen different authors or teams, all experienced in discourse research, are given exactly the same text to analyze. The text is an appeal for money by a lobbying organization in Washington, DC. It informs, stimulates and manipulates the reader in a fascinating way. The joint analysis is far more insightful than any one team's analysis alone. This paper is our contribution to the book. Rhetorical Structure Theory (RST), the focus of this paper, is a way to account for the functional potential of text, its capacity to achieve the purposes of speakers and produce effects in hearers. It also shows a way to distinguish coherent texts from incoherent ones, and identifies consequences of text structure.

  19. Middle school children's game playing preferences: Case studies of children's experiences playing and critiquing science-related educational games

    Science.gov (United States)

    Joseph, Dolly Rebecca Doran

    The playing of computer games is one of the most popular non-school activities of children, particularly boys, and is often the entry point to greater facility with and use of other computer applications. Children are learning skills as they play, but what they learn often does not generalize beyond application to that and other similar games. Nevertheless, games have the potential to develop in students the knowledge and skills described by national and state educational standards. This study focuses upon middle-school aged children, and how they react to and respond to computer games designed for entertainment and educational purposes, within the context of science learning. Through qualitative, case study methodology, the game play, evaluation, and modification experiences of four diverse middle-school-aged students in summer camps are analyzed. The inquiry focused on determining the attributes of computer games that appeal to middle school students, the aspects of science that appeal to middle school children, and ultimately, how science games might be designed to appeal to middle school children. Qualitative data analysis led to the development of a method for describing players' activity modes during game play, rather than the conventional methods that describe game characteristics. These activity modes are used to describe the game design preferences of the participants. Recommendations are also made in the areas of functional, aesthetic, and character design and for the design of educational games. Middle school students may find the topical areas of forensics, medicine, and the environment to be of most interest; designing games in and across these topic areas has the potential for encouraging voluntary science-related play. Finally, when including children in game evaluation and game design activities, results suggest the value of providing multiple types of activities in order to encourage the full participation of all children.

  20. Attitudes toward Science: Measurement and Psychometric Properties of the Test of Science-Related Attitudes for Its Use in Spanish-Speaking Classrooms

    Science.gov (United States)

    Navarro, Marianela; Förster, Carla; González, Caterina; González-Pose, Paulina

    2016-01-01

    Understanding attitudes toward science and measuring them remain two major challenges for science teaching. This article reviews the concept of attitudes toward science and their measurement. It subsequently analyzes the psychometric properties of the "Test of Science-Related Attitudes" (TOSRA), such as its construct validity, its…

  1. Adolescents' Motivation to Select an Academic Science-Related Career: The Role of School Factors, Individual Interest, and Science Self-Concept

    Science.gov (United States)

    Taskinen, Päivi H.; Schütte, Kerstin; Prenzel, Manfred

    2013-01-01

    Many researchers consider a lacking interest in science and the students' belief that science is too demanding as major reasons why young people do not strive for science-related careers. In this article, we first delineated a theoretical framework to investigate the importance of interest, self-concept, and school factors regarding students'…

  2. Biased limiter experiments on text

    International Nuclear Information System (INIS)

    Phillips, P.E.; Wootton, A.J.; Rowan, W.L.; Ritz, C.P.; Rhodes, T.L.; Bengtson, R.D.; Hodge, W.L.; Durst, R.D.; McCool, S.C.; Richards, B.; Gentle, K.W.; Schoch, P.; Forster, J.C.; Hickok, R.L.; Evans, T.E.

    1987-01-01

    Experiments using an electrically biased limiter have been performed on the Texas Experimental Tokamak (TEXT). A small movable limiter is inserted past the main poloidal ring limiter (which is electrically connected to the vacuum vessel) and biased at V_Lim with respect to it. The floating potential, plasma potential and shear layer position can be controlled. With |V_Lim| ≥ 50 V the plasma density increases. For V_Lim [...] V_Lim > 0 the results obtained are inconclusive. Variation of V_Lim changes the electrostatic turbulence, which may explain the observed total flux changes. (orig.)

  3. New Historicism: Text and Context

    Directory of Open Access Journals (Sweden)

    Violeta M. Vesić

    2016-02-01

    During most of the twentieth century history was seen as a phenomenon outside of literature that guaranteed the veracity of literary interpretation. History was unique and it functioned as a basis for reading literary works. During the seventies of the twentieth century there occurred a change of attitude towards history in American literary theory, and there appeared a new theoretical approach which soon became known as New Historicism. Since its inception, New Historicism has been identified with the study of Renaissance and Romanticism, but nowadays it has been increasingly involved in other literary trends. Although there are great differences in the arguments and practices among the various representatives of this school, New Historicism has clearly recognizable features and many new historicists will agree with the statement of Walter Cohen that New Historicism, when it appeared in the eighties, represented something quite new in reference to the studies of theory, criticism and history (Cohen 1987, 33). Theoretical connection with Bakhtin, Foucault and Marx is clear, as well as a kind of uneasy tie with deconstruction and the work of Paul de Man. At the center of this approach is a renewed interest in the study of literary works in the light of historical and political circumstances in which they were created. Foucault encouraged readers to begin to move literary texts and to link them with discourses and representations that are not literary, as well as to examine the sociological aspects of the texts in order to take part in the social struggles of today. The study of literary works using New Historicism is the study of politics, history, culture and circumstances in which these works were created. With regard to one of the main facts located at the center of the criticism, that history cannot be viewed objectively and that reality can only be understood through a cultural context that reveals the work, re-reading and interpretation of

  4. Text Mining the History of Medicine.

    Science.gov (United States)

    Thompson, Paul; Batista-Navarro, Riza Theresa; Kontonatsios, Georgios; Carter, Jacob; Toon, Elizabeth; McNaught, John; Timmermann, Carsten; Worboys, Michael; Ananiadou, Sophia

    2016-01-01

    Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestions of synonyms of user-entered query terms, exploration of different concepts mentioned within search results or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, owing to differences and evolutions in vocabulary, terminology, language structure and style, compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid 19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible semantically-oriented search system. The novel resources are available for research purposes, while

  5. Speech Act Classification of German Advertising Texts

    Directory of Open Access Journals (Sweden)

    Артур Нарманович Мамедов

    2015-12-01

    This paper uses the theory of speech acts and the underlying concept of pragmalinguistics to determine the types of speech acts and their classification in German printed advertising texts. We ascertain that the advertising of cars and accessories, household appliances and computer equipment, watches, fancy goods, food, pharmaceuticals, and financial, insurance and legal services, as well as airline advertising, is dominated by a pragmatic principle, which is based on demonstrating information about the benefits of a product or service. This influences the frequent usage of certain speech acts. The dominant form of exposure is to inform the recipient-user about the characteristics of the advertised product. This information is foregrounded by means of stylistic and syntactic constructions specific to the advertisement (participial constructions, appositional constructions), which help to emphasize certain notional components within the framework of the advertising text. Stylistic and syntactic devices of reduction (parceling constructions) convey the author's idea. Other means, such as repetitions and enumerations, are used by the advertiser to strengthen selling power. The advertiser focuses the attention of the consumer on the characteristics of the product, seeking to convince him or her of the utility of the product and to influence his or her buying behavior.

  6. Transfer Learning beyond Text Classification

    Science.gov (United States)

    Yang, Qiang

    Transfer learning is a new machine learning and data mining framework that allows the training and test data to come from different distributions or feature spaces. We can find many novel applications of machine learning and data mining where transfer learning is necessary. While much has been done in transfer learning in text classification and reinforcement learning, there has been a lack of documented success stories of novel applications of transfer learning in other areas. In this invited article, I will argue that transfer learning is in fact quite ubiquitous in many real world applications. In this article, I will illustrate this point through an overview of a broad spectrum of applications of transfer learning that range from collaborative filtering to sensor based location estimation and logical action model learning for AI planning. I will also discuss some potential future directions of transfer learning.

  7. WYLBUR reference manual. [For interactive text editing]

    Energy Technology Data Exchange (ETDEWEB)

    Krupp, R.F.; Messina, P.C.; Peavler, J.M.; Schustack, S.; Starai, T.

    1977-04-01

    WYLBUR is a system for manipulating various kinds of text, such as computer programs, manuscripts, letters, forms, articles, or reports. Its on-line interactive text-editing capabilities allow the user to create, change, and correct text, and to search and display it. WYLBUR also has facilities for job submission and retrieval from remote terminals that make it possible for a user to inquire about the status of any job in the system, cancel jobs that are executing or awaiting execution, reroute output, raise job priority, or get information on the backlog of batch jobs. WYLBUR also has excellent recovery capabilities and a fast response time. This manual describes the WYLBUR version currently used at ANL. It is intended primarily as a reference manual; thus, examples of WYLBUR commands are kept to a minimum. (RWR)

  8. VisualUrText: A Text Analytics Tool for Unstructured Textual Data

    Science.gov (United States)

    Zainol, Zuraini; Jaymes, Mohd T. H.; Nohuddin, Puteri N. E.

    2018-05-01

    The growth in the amount of unstructured text on the Internet is tremendous. Text repositories come from Web 2.0, business intelligence and social networking applications. It is also believed that 80-90% of future data growth will be in the form of unstructured text databases that may potentially contain interesting patterns and trends. Text Mining is a well-known technique for discovering interesting patterns and trends, which are non-trivial knowledge, in massive unstructured text data. Text Mining covers multidisciplinary fields involving information retrieval (IR), text analysis, natural language processing (NLP), data mining, machine learning, statistics and computational linguistics. This paper discusses the development of a text analytics tool that is proficient in extracting, processing and analyzing unstructured text data and in visualizing cleaned text data in multiple forms such as Document Term Matrix (DTM), Frequency Graph, Network Analysis Graph, Word Cloud and Dendrogram. This tool, VisualUrText, is developed to assist students and researchers in extracting interesting patterns and trends in document analyses.
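    As an illustration of one of the output forms named above, the following is a minimal Python sketch of building a Document Term Matrix with the standard library only. It does not reproduce VisualUrText; the function name, the tokenization rule (lowercase alphabetic runs) and the sample documents are assumptions.

```python
import re
from collections import Counter


def document_term_matrix(docs):
    """Return (vocabulary, rows); rows[i][j] counts vocabulary[j] in docs[i]."""
    # Assumed cleaning step: lowercase and keep alphabetic runs only.
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in docs]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[term] for term in vocabulary])
    return vocabulary, rows


docs = ["Text mining finds patterns", "Mining text data"]
vocabulary, matrix = document_term_matrix(docs)
print(vocabulary)
for row in matrix:
    print(row)
```

    Each row is one document and each column one term, so the same matrix can feed a frequency graph or a clustering step that produces a dendrogram.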

  9. A programmed text in statistics

    CERN Document Server

    Hine, J

    1975-01-01

    Exercises for Section 2: Physical sciences and engineering; Biological sciences; Social sciences. Solutions to Exercises, Section 1: Physical sciences and engineering; Biological sciences; Social sciences. Solutions to Exercises, Section 2: Physical sciences and engineering; Biological sciences; Social sciences. Tables: χ² tests involving variances; χ² one-tailed tests; χ² two-tailed tests; F-distribution. Preface: This project started some years ago when the Nuffield Foundation kindly gave a grant for writing a programmed text to use with service courses in statistics. The work was carried out by Mrs. Joan Hine and Professor G. B. Wetherill at Bath University, together with some other help from time to time by colleagues at Bath University and elsewhere. Testing was done at various colleges and universities, and some helpful comments were received, but we particularly mention King Edwards School, Bath, who provided some sixth formers as 'guinea pigs' for the fir...

  10. Information Space, Information Field, Information Environment

    Directory of Open Access Journals (Sweden)

    Victor Ya. Tsvetkov

    2014-08-01

    The article analyzes information space, information field and information environment. It shows that information space can be natural or artificial; that an information field is a substantive and processual object and articulates the space property; and that an information environment is concerned with some object, acts as the surrounding in relation to it and is considered with regard to it. This makes it possible to define information environment as a subset of information space, which is its passive description. Information environment can also be defined as a subset of information field, which corresponds to its active description.

  11. Practical text mining and statistical analysis for non-structured text data applications

    CERN Document Server

    Miner, Gary; Hill, Thomas; Nisbet, Robert; Delen, Dursun

    2012-01-01

    The world contains an unimaginably vast amount of digital information which is getting ever vaster ever more rapidly. This makes it possible to do many things that previously could not be done: spot business trends, prevent diseases, combat crime and so on. Managed well, the textual data can be used to unlock new sources of economic value, provide fresh insights into science and hold governments to account. As the Internet expands and our natural capacity to process the unstructured text that it contains diminishes, the value of text mining for information retrieval and search will increase d

  12. Layout-aware text extraction from full-text PDF of scientific articles

    Directory of Open Access Journals (Sweden)

    Ramakrishnan Cartic

    2012-05-01

    Background: The Portable Document Format (PDF) is the most commonly used file format for online scientific publications. The absence of effective means to extract text from these PDF files in a layout-aware manner presents a significant challenge for developers of biomedical text mining or biocuration informatics systems that use published literature as an information source. In this paper we introduce the 'Layout-Aware PDF Text Extraction' (LA-PDFText) system to facilitate accurate extraction of text from PDF files of research articles for use in text mining applications. Results: Our paper describes the construction and performance of an open source system that extracts text blocks from PDF-formatted full-text research articles and classifies them into logical units based on rules that characterize specific sections. The LA-PDFText system focuses only on the textual content of the research articles and is meant as a baseline for further experiments into more advanced extraction methods that handle multi-modal content, such as images and graphs. The system works in a three-stage process: (1) Detecting contiguous text blocks using spatial layout processing to locate and identify blocks of contiguous text, (2) Classifying text blocks into rhetorical categories using a rule-based method and (3) Stitching classified text blocks together in the correct order resulting in the extraction of text from section-wise grouped blocks. We show that our system can identify text blocks and classify them into rhetorical categories with Precision = 0.96% Recall = 0.89% and F1 = 0.91%. We also present an evaluation of the accuracy of the block detection algorithm used in step 2. Additionally, we have compared the accuracy of the text extracted by LA-PDFText to the text from the Open Access subset of PubMed Central. We then compared this accuracy with that of the text extracted by the PDF2Text system, commonly used to extract text from PDF

  13. Text Categorization Using Weight Adjusted k-Nearest Neighbor Classification

    National Research Council Canada - National Science Library

    Han, Euihong; Karypis, George; Kumar, Vipin

    1999-01-01

    .... The authors present a nearest neighbor classification scheme for text categorization in which the importance of discriminating words is learned using mutual information and weight adjustment techniques...

  14. Drawing on Text Features for Reading Comprehension and Composing

    Science.gov (United States)

    Risko, Victoria J.; Walker-Dalhouse, Doris

    2011-01-01

    Students read multiple-genre texts such as graphic novels, poetry, brochures, digitized texts with videos, and informational and narrative texts. Features such as overlapping illustrations and implied cause-and-effect relationships can affect students' comprehension. Teaching with these texts and drawing attention to organizational features hold…

  15. Texting preferences in a Paediatric residency.

    Science.gov (United States)

    Draper, Lauren; Kuklinski, Cadence; Ladley, Amy; Adamson, Greg; Broom, Matthew

    2017-12-01

    Text messaging is ubiquitous among residents, but remains an underused educational tool. Though feasibility has been demonstrated, evidence of its ability to improve standardised test scores and provide insight on resident texting preferences is lacking. The authors set out to evaluate: (1) satisfaction with a hybrid question-and-answer (Q&A) texting format; and (2) pre-/post-paediatric in-training exam (ITE) performance. A prospective study with paediatrics and internal medicine-paediatrics residents. Residents were divided into subgroups: adolescent medicine (AM) and developmental medicine (DM). Messages were derived from ITE questions and sent Monday-Friday with a 20 per cent variance in messages specific to the sub-group. Residents completed surveys gauging perceptions of the programme, and pre- and post-programme ITE scores were analysed. Forty-one residents enrolled and 32 (78%) completed a post-programme survey. Of those, 21 (66%) preferred a Q&A format with an immediate text response versus information-only texts. The percentage change in ITE scores between 2013 and 2014 was significant. Comparing subgroups, there was no significant difference between the percentage change in ITE scores. Neither group performed significantly better on either the adolescent or developmental sections of the ITE. Conclusions: Overall, participants improved their ITE scores, but no improvement was seen in the targeted subgroups on the exam. Although Q&A texts are preferred by residents, further assessment is required to assess the effect on educational outcomes. © 2017 John Wiley & Sons Ltd and The Association for the Study of Medical Education.

  16. Layout-aware text extraction from full-text PDF of scientific articles.

    Science.gov (United States)

    Ramakrishnan, Cartic; Patnia, Abhishek; Hovy, Eduard; Burns, Gully Apc

    2012-05-28

    The Portable Document Format (PDF) is the most commonly used file format for online scientific publications. The absence of effective means to extract text from these PDF files in a layout-aware manner presents a significant challenge for developers of biomedical text mining or biocuration informatics systems that use published literature as an information source. In this paper we introduce the 'Layout-Aware PDF Text Extraction' (LA-PDFText) system to facilitate accurate extraction of text from PDF files of research articles for use in text mining applications. Our paper describes the construction and performance of an open source system that extracts text blocks from PDF-formatted full-text research articles and classifies them into logical units based on rules that characterize specific sections. The LA-PDFText system focuses only on the textual content of the research articles and is meant as a baseline for further experiments into more advanced extraction methods that handle multi-modal content, such as images and graphs. The system works in a three-stage process: (1) Detecting contiguous text blocks using spatial layout processing to locate and identify blocks of contiguous text, (2) Classifying text blocks into rhetorical categories using a rule-based method and (3) Stitching classified text blocks together in the correct order resulting in the extraction of text from section-wise grouped blocks. We show that our system can identify text blocks and classify them into rhetorical categories with Precision = 0.96% Recall = 0.89% and F1 = 0.91%. We also present an evaluation of the accuracy of the block detection algorithm used in step 2. Additionally, we have compared the accuracy of the text extracted by LA-PDFText to the text from the Open Access subset of PubMed Central. We then compared this accuracy with that of the text extracted by the PDF2Text system, commonly used to extract text from PDF. Finally, we discuss preliminary error analysis for
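    The three-stage process described in this abstract (detect blocks, classify them, stitch them together) can be sketched in Python as follows. This is a toy illustration and not LA-PDFText itself: the `Line` record, the vertical-gap threshold, the heading list and the single classification rule are all hypothetical, and the input is assumed to be text lines whose page and vertical position have already been extracted from the PDF.

```python
from dataclasses import dataclass


@dataclass
class Line:
    page: int
    y: float   # distance from the top of the page (assumed unit: points)
    text: str


def detect_blocks(lines, max_gap=15.0):
    """Stage 1: group lines into contiguous blocks by vertical proximity."""
    blocks, current = [], []
    for line in sorted(lines, key=lambda l: (l.page, l.y)):
        new_block = current and (
            line.page != current[-1].page or line.y - current[-1].y > max_gap
        )
        if new_block:
            blocks.append(current)
            current = []
        current.append(line)
    if current:
        blocks.append(current)
    return blocks


# Hypothetical rule set: a one-line block matching a heading opens a section.
SECTION_HEADINGS = {"abstract", "introduction", "methods", "results", "discussion"}


def classify_blocks(blocks):
    """Stage 2: label each block with the section it belongs to."""
    labelled, category = [], "front_matter"
    for block in blocks:
        first = block[0].text.strip().lower()
        if len(block) == 1 and first in SECTION_HEADINGS:
            category = first
            continue
        labelled.append((category, block))
    return labelled


def stitch(labelled):
    """Stage 3: join the blocks of each section in reading order."""
    sections = {}
    for category, block in labelled:
        sections.setdefault(category, []).append(" ".join(l.text for l in block))
    return {category: " ".join(parts) for category, parts in sections.items()}


lines = [
    Line(1, 10, "A Paper Title"),
    Line(1, 50, "Abstract"),
    Line(1, 70, "We extract text."),
    Line(1, 80, "It works."),
    Line(1, 120, "Methods"),
    Line(1, 140, "Rules classify blocks."),
]
sections = stitch(classify_blocks(detect_blocks(lines)))
print(sections)
```

    Keeping the stages as separate functions mirrors the staged design in the abstract and lets each stage be evaluated on its own, as the authors do for block detection.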

  17. Can An Evolutionary Process Create English Text?

    Energy Technology Data Exchange (ETDEWEB)

    Bailey, David H.

    2008-10-29

    Critics of the conventional theory of biological evolution have asserted that while natural processes might result in some limited diversity, nothing fundamentally new can arise from 'random' evolution. In response, biologists such as Richard Dawkins have demonstrated that a computer program can generate a specific short phrase via evolution-like iterations starting with random gibberish. While such demonstrations are intriguing, they are flawed in that they have a fixed, pre-specified future target, whereas in real biological evolution there is no fixed future target, but only a complicated 'fitness landscape'. In this study, a significantly more sophisticated evolutionary scheme is employed to produce text segments reminiscent of a Charles Dickens novel. The aggregate size of these segments is larger than the computer program and the input Dickens text, even when comparing compressed data (as a measure of information content).
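    The fixed-target demonstration that the abstract attributes to Dawkins can be sketched in a few lines of Python. This is the simple scheme the abstract criticizes, not the more sophisticated one used in the study; the target phrase is the one from Dawkins's well-known "weasel" program, while the population size, mutation rate, seed and function names are assumptions chosen for illustration.

```python
import random
import string

ALPHABET = string.ascii_uppercase + " "


def fitness(candidate, target):
    """Number of positions at which the candidate matches the fixed target."""
    return sum(a == b for a, b in zip(candidate, target))


def mutate(parent, rate, rng):
    """Copy the parent, replacing each character with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent
    )


def evolve(target, offspring=100, rate=0.05, seed=0, max_generations=10000):
    """Start from random gibberish and keep the fittest string each generation."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in target)
    for generation in range(max_generations):
        if parent == target:
            return parent, generation
        children = [mutate(parent, rate, rng) for _ in range(offspring)]
        # The parent stays in the pool, so fitness never decreases.
        parent = max(children + [parent], key=lambda c: fitness(c, target))
    return parent, max_generations


phrase, generations = evolve("METHINKS IT IS LIKE A WEASEL")
print(phrase, "after", generations, "generations")
```

    The `fitness` function makes the objection in the abstract concrete: selection is scored against a pre-specified target string, which has no counterpart in biological evolution.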

  18. Methods for Mining and Summarizing Text Conversations

    CERN Document Server

    Carenini, Giuseppe; Murray, Gabriel

    2011-01-01

    Due to the Internet Revolution, human conversational data -- in written forms -- are accumulating at a phenomenal rate. At the same time, improvements in speech technology enable many spoken conversations to be transcribed. Individuals and organizations engage in email exchanges, face-to-face meetings, blogging, texting and other social media activities. The advances in natural language processing provide ample opportunities for these "informal documents" to be analyzed and mined, thus creating numerous new and valuable applications. This book presents a set of computational methods

  19. Using Text Models In Diagnostic Tasks.

    Directory of Open Access Journals (Sweden)

    Korostil Yuriy

    2015-09-01

    This paper develops a method for solving diagnostic tasks for complex technical objects (STO) using text models (TMi) that describe how an STO functions. A TMi model is a text description, in normalized form, of all fragments of the STO functioning process. The TMi description is formed using semantic vocabularies of different types, which are generated from information about all aspects of STO construction and functioning. This interpretive description serves as the subject area for STO diagnostic tasks. Malfunctions and deviations of the STO functioning process from its established mode are detected by analyzing the semantic parameters of the text description of that process, in order to find semantic anomalies in the description of the process as a whole and in the descriptions of its fragments. A semantic anomaly occurs when the value of a semantic parameter goes beyond its established limits.

  20. Modeling statistical properties of written text.

    Directory of Open Access Journals (Sweden)

    M Angeles Serrano

    Written text is one of the fundamental manifestations of human language, and the study of its universal regularities can give clues about how our brains process information and how we, as a society, organize and share it. Among these regularities, only Zipf's law has been explored in depth. Other basic properties, such as the existence of bursts of rare words in specific documents, have only been studied independently of each other and mainly by descriptive models. As a consequence, there is a lack of understanding of linguistic processes as complex emergent phenomena. Beyond Zipf's law for word frequencies, here we focus on burstiness, Heaps' law describing the sublinear growth of vocabulary size with the length of a document, and the topicality of document collections, which encode correlations within and across documents absent in random null models. We introduce and validate a generative model that explains the simultaneous emergence of all these patterns from simple rules. As a result, we find a connection between the bursty nature of rare words and the topical organization of texts and identify dynamic word ranking and memory across documents as key mechanisms explaining the nontrivial organization of written text. Our research can have broad implications and practical applications in computer science, cognitive science and linguistics.
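    Two of the regularities named in this abstract are easy to measure directly. The sketch below is an illustration only, not the generative model the authors introduce. It computes the vocabulary growth curve that Heaps' law describes (usually written as V(n) ≈ K·n^β with 0 < β < 1) and the rank-frequency table that Zipf's law describes. The function names and the toy sentence are assumptions made for the example.

```python
from collections import Counter


def vocabulary_growth(tokens):
    """Return V(n): the number of distinct words among the first n tokens."""
    seen = set()
    growth = []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth


def rank_frequency(tokens):
    """Return (rank, word, count) triples, most frequent word first."""
    counts = Counter(tokens)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


tokens = "the cat sat on the mat and the dog sat on the log".split()
growth = vocabulary_growth(tokens)
ranks = rank_frequency(tokens)
```

    On real corpora, plotting `growth` against token position on log-log axes gives a rough view of the sublinear trend. The toy sentence here is far too short to show it and only demonstrates the bookkeeping.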

  1. The Role of Intrinsic Motivation in the Pursuit of Health Science-Related Careers among Youth from Underrepresented Low Socioeconomic Populations.

    Science.gov (United States)

    Boekeloo, Bradley O; Jones, Chandria; Bhagat, Krishna; Siddiqui, Junaed; Wang, Min Qi

    2015-10-01

    A more diverse health science-related workforce including more underrepresented race/ethnic minorities, especially from low socioeconomic backgrounds, is needed to address health disparities in the USA. To increase such diversity, programs must facilitate youth interest in pursuing a health science-related career (HSRC). Minority youth from low socioeconomic families may focus on the secondary gains of careers, such as high income and status, given their low socioeconomic backgrounds. On the other hand, self-determination theory suggests that it is the intrinsic characteristics of careers which are most likely to sustain pursuit of an HSRC and lead to job satisfaction. Intrinsic and extrinsic motivation for pursuing an HSRC (defined in this study as health professional, health scientist, and medical doctor) was examined in a cohort of youth from the 10th to 12th grade from 2011 to 2013. The sample was from low-income area high schools, had a B- or above grade point average at baseline, and was predominantly: African American (65.7 %) or Hispanic (22.9 %), female (70.1 %), and children of foreign-born parents (64.7 %). In longitudinal general estimating equations, intrinsic motivation (but not extrinsic motivation) consistently predicted intention to pursue an HSRC. This finding provides guidance as to which youth and which qualities of HSRCs might deserve particular attention in efforts to increase diversity in the health science-related workforce.

  2. A STUDY OF TEXT MINING METHODS, APPLICATIONS, AND TECHNIQUES

    OpenAIRE

    R. Rajamani & S. Saranya

    2017-01-01

    Data mining is used to extract useful information from large amounts of data. It is used to implement and solve different types of research problems. Research areas within data mining include text mining, web mining, image mining, sequential pattern mining, spatial mining, medical mining, multimedia mining, structure mining and graph mining. Text mining, also referred to as text data mining, is also called knowledge discovery in text (KDT) or intelligent text analysis. T...

  3. Reading Aloud Expository Text to First- and Second-Graders: A Comparison of the Effects on Comprehension of During- and After-Reading Questioning

    Science.gov (United States)

    Heisey, Natalie Denise

    2009-01-01

    The purpose of this study was to compare the effects of questioning "during" a read-aloud and questioning "after" a read-aloud, using science-related informational tradebooks with first- and second-graders. Three thematically-related tradebooks were used, each portraying a scientist involved in authentic investigation. Students in two first/second…

  4. Text summarization as a decision support aid

    Directory of Open Access Journals (Sweden)

    Workman T

    2012-05-01

    Background: PubMed data potentially can provide decision support information, but PubMed was not exclusively designed to be a point-of-care tool. Natural language processing applications that summarize PubMed citations hold promise for extracting decision support information. The objective of this study was to evaluate the efficiency of a text summarization application called Semantic MEDLINE, enhanced with a novel dynamic summarization method, in identifying decision support data. Methods: We downloaded PubMed citations addressing the prevention and drug treatment of four disease topics. We then processed the citations with Semantic MEDLINE, enhanced with the dynamic summarization method. We also processed the citations with a conventional summarization method, as well as with a baseline procedure. We evaluated the results using clinician-vetted reference standards built from recommendations in a commercial decision support product, DynaMed. Results: For the drug treatment data, Semantic MEDLINE enhanced with dynamic summarization achieved average recall and precision scores of 0.848 and 0.377, while conventional summarization produced 0.583 average recall and 0.712 average precision, and the baseline method yielded average recall and precision values of 0.252 and 0.277. For the prevention data, Semantic MEDLINE enhanced with dynamic summarization achieved average recall and precision scores of 0.655 and 0.329. The baseline technique resulted in recall and precision scores of 0.269 and 0.247. No conventional Semantic MEDLINE method accommodating summarization for prevention exists. Conclusion: Semantic MEDLINE with dynamic summarization outperformed conventional summarization in terms of recall, and outperformed the baseline method in both recall and precision. This new approach to text summarization demonstrates potential in identifying decision support data for multiple needs.
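    The recall and precision figures above compare a system's output with a reference standard. A minimal sketch of that comparison follows, assuming both sides can be reduced to sets of comparable items. The function name and the placeholder item labels are hypothetical, and the study's actual matching procedure against DynaMed is not described in this abstract.

```python
def precision_recall(retrieved, reference):
    """Compare system output with a reference standard, both given as iterables."""
    retrieved, reference = set(retrieved), set(reference)
    true_positives = len(retrieved & reference)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Placeholder labels, not data from the study.
system_output = ["item_a", "item_b", "item_c", "item_d"]
reference_standard = ["item_a", "item_b", "item_e"]
precision, recall = precision_recall(system_output, reference_standard)
```

    A summarizer that returns many candidates tends to raise recall at the cost of precision, which matches the trade-off reported between the dynamic and conventional methods for the drug treatment data.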

  5. Pedoinformatics Approach to Soil Text Analytics

    Science.gov (United States)

    Furey, J.; Seiter, J.; Davis, A.

    2017-12-01

    The several extant schemas for the classification of soils rely on differing criteria, but the major soil science taxonomies, including the United States Department of Agriculture (USDA) and the international harmonized World Reference Base for Soil Resources systems, are based principally on inferred pedogenic properties. These taxonomies largely result from compiled individual observations of soil morphologies within soil profiles, and the vast majority of this pedologic information is contained in qualitative text descriptions. We present text mining analyses of hundreds of gigabytes of parsed text and other data in the digitally available USDA soil taxonomy documentation, the Soil Survey Geographic (SSURGO) database, and the National Cooperative Soil Survey (NCSS) soil characterization database. These analyses implemented IPython calls to Gensim modules for topic modelling, with latent semantic indexing completed down to the lowest taxon level (soil series) paragraphs. Via a custom extension of the Natural Language Toolkit (NLTK), approximately one percent of the USDA soil series descriptions were used to train a classifier for the remainder of the documents, essentially by treating soil science words as comprising a novel language. While location-specific descriptors at the soil series level are amenable to geomatics methods, unsupervised clustering of the occurrence of other soil science words did not closely follow the usual hierarchy of soil taxa. We present preliminary phrasal analyses that may account for some of these effects.

  6. PEDANT: Parallel Texts in Göteborg

    Directory of Open Access Journals (Sweden)

    Daniel Ridings

    2012-09-01

    The article presents the status of the PEDANT project with parallel corpora at the Language Bank at Göteborg University. The solutions for access to the corpus data are presented. Access is provided by way of the internet and standard applications and SGML-aware programming tools. The SGML format for encoding translation pairs is also outlined. The methods allow working with everything from plain text to texts densely encoded with linguistic information.

  7. Linking Video and Text via Representations of Narrative

    OpenAIRE

    Salway, Andrew; Graham, Mike; Tomadaki, Eleftheria; Xu, Yan

    2003-01-01

    The ongoing TIWO project is investigating the synthesis of language technologies, like information extraction and corpus-based text analysis, video data modeling and knowledge representation. The aim is to develop a computational account of how video and text can be integrated by representations of narrative in multimedia systems. The multimedia domain is that of film and audio description – an emerging text type that is produced specifically to be informative about the events and objects dep...

  8. Social Media Text Classification by Enhancing Well-Formed Text Trained Model

    Directory of Open Access Journals (Sweden)

    Phat Jotikabukkana

    2016-09-01

    Social media are a powerful communication tool in our era of digital information. The large amount of user-generated data is a useful novel source of data, even though it is not easy to extract the treasures from this vast and noisy trove. Since classification is an important part of text mining, many techniques have been proposed to classify this kind of information. We developed an effective technique of social media text classification by semi-supervised learning utilizing an online news source consisting of well-formed text. The computer first automatically extracts news categories, well-categorized by publishers, as classes for topic classification. A bag of words taken from news articles provides the initial keywords related to their category in the form of word vectors. The principal task is to retrieve a set of new productive keywords. Term Frequency-Inverse Document Frequency (TF-IDF) weighting and the Word Article Matrix (WAM) are used as the main methods. A modification of WAM is recomputed until it becomes the most effective model for social media text classification. The key success factor was enhancing our model with effective keywords from social media. A promising result of 99.50% accuracy was achieved, with more than 98.5% of Precision, Recall, and F-measure after updating the model three times.
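    The TF-IDF weighting mentioned above can be sketched with the standard library alone. The abstract does not say which TF-IDF variant the authors used, and the Word Article Matrix is not reproduced here, so this example assumes the plain form: relative term frequency multiplied by log(N / document frequency). The function name and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Return one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: the number of documents in which each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        term_counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return weights


# Two toy "news articles" standing in for publisher-defined categories.
docs = [
    ["election", "vote", "minister"],
    ["match", "goal", "vote"],
]
weights = tf_idf(docs)
```

    A term that appears in every document, such as "vote" here, gets weight zero, while category-specific terms keep a positive weight. That is the property that makes the weighting useful for selecting keywords per category.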

  9. Terminology extraction from medical texts in Polish.

    Science.gov (United States)

    Marciniak, Małgorzata; Mykowiecka, Agnieszka

    2014-01-01

    Hospital documents contain free text describing the most important facts relating to patients and their illnesses. These documents are written in specific language containing medical terminology related to hospital treatment. Their automatic processing can help in verifying the consistency of hospital documentation and obtaining statistical data. To perform this task we need information on the phrases we are looking for. At the moment, clinical Polish resources are sparse. The existing terminologies, such as Polish Medical Subject Headings (MeSH), do not provide sufficient coverage for clinical tasks. It would be helpful therefore if it were possible to automatically prepare, on the basis of a data sample, an initial set of terms which, after manual verification, could be used for the purpose of information extraction. Using a combination of linguistic and statistical methods for processing over 1200 children's hospital discharge records, we obtained a list of single and multiword terms used in hospital discharge documents written in Polish. The phrases are ordered according to their presumed importance in domain texts measured by the frequency of use of a phrase and the variety of its contexts. The evaluation showed that the automatically identified phrases cover about 84% of terms in domain texts. At the top of the ranked list, only 4% out of 400 terms were incorrect while out of the final 200, 20% of expressions were either not domain related or syntactically incorrect. We also observed that 70% of the obtained terms are not included in the Polish MeSH. Automatic terminology extraction can give results which are of a quality high enough to be taken as a starting point for building domain related terminological dictionaries or ontologies. This approach can be useful for preparing terminological resources for very specific subdomains for which no relevant terminologies already exist. The evaluation performed showed that none of the tested ranking procedures were

  10. Effect of the Interaction of Text Structure, Background Knowledge and Purpose on Attention to Text.

    Science.gov (United States)

    1982-04-01

    in the sense proposed by Craik and Lockhart (1972). All levels of representation would entail such preliminary processing operations as perceptual...109. Craik, F. I., & Lockhart, R. S. Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 1972, 11... processes this information to a deeper level than those text elements that are less important or irrelevant. The terminology "deeper" level is used here

  11. Connected text reading and differences in text reading fluency in adult readers.

    Directory of Open Access Journals (Sweden)

    Sebastian Wallot

    The process of connected text reading has received very little attention in contemporary cognitive psychology. This lack of attention is in part due to a research tradition that emphasizes the role of basic lexical constituents, which can be studied in isolated words or sentences. However, it is also in part due to the lack of statistical analysis techniques that accommodate interdependent time series. In this study, we investigate text reading performance with traditional and nonlinear analysis techniques and show how outcomes from multiple analyses can be used to create a more detailed picture of the process of text reading. Specifically, we investigate reading performance of groups of literate adult readers that differ in reading fluency during a self-paced text reading task. Our results indicate that classical metrics of reading (such as word frequency) do not capture text reading very well, and that classical measures of reading fluency (such as average reading time) distinguish relatively poorly between participant groups. Nonlinear analyses of distribution tails and reading time fluctuations provide more fine-grained information about the reading process and reading fluency.

  12. Doing Mathematics with Purpose: Mathematical Text Types

    Science.gov (United States)

    Dostal, Hannah M.; Robinson, Richard

    2018-01-01

    Mathematical literacy includes learning to read and write different types of mathematical texts as part of purposeful mathematical meaning making. Thus in this article, we describe how learning to read and write mathematical texts (proof text, algorithmic text, algebraic/symbolic text, and visual text) supports the development of students'…

  13. The socio-demographics of texting

    DEFF Research Database (Denmark)

    Ling, Richard; Bertel, Troels Fibæk; Sundsøy, Pål

    2012-01-01

    Who texts, and with whom do they text? This article examines the use of texting using metered traffic data from a large dataset (nearly 400 million anonymous text messages). We ask 1) How much do different age groups use mobile phone based texting (SMS)? 2) How wide is the circle of texting...

  14. Robust keyword retrieval method for OCRed text

    Science.gov (United States)

    Fujii, Yusaku; Takebe, Hiroaki; Tanaka, Hiroshi; Hotta, Yoshinobu

    2011-01-01

    Document management systems have become important because of the growing popularity of electronic filing of documents and scanning of books, magazines, manuals, etc., through a scanner or a digital camera, for storage or reading on a PC or an electronic book. Text information acquired by optical character recognition (OCR) is usually added to the electronic documents for document retrieval. Since texts generated by OCR generally include character recognition errors, robust retrieval methods have been introduced to overcome this problem. In this paper, we propose a retrieval method that is robust against both character segmentation and recognition errors. In the proposed method, the insertion of noise characters and dropping of characters in the keyword retrieval enables robustness against character segmentation errors, and character substitution in the keyword of the recognition candidate for each character in OCR or any other character enables robustness against character recognition errors. The recall rate of the proposed method was 15% higher than that of the conventional method. However, the precision rate was 64% lower.
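    The idea of tolerating inserted, dropped and substituted characters during keyword retrieval can be illustrated with ordinary approximate string matching. The sketch below is a swapped-in textbook technique (edit distance against any substring of the text), not the authors' method, which works on OCR recognition candidates. The function names and the `max_errors` threshold are assumptions.

```python
def best_match_distance(keyword: str, text: str) -> int:
    """Smallest edit distance between `keyword` and any substring of `text`."""
    # prev[j]: cost of matching the keyword prefix processed so far against
    # some substring of `text` ending at position j. The first row is all
    # zeros because a match may start anywhere in the text.
    prev = [0] * (len(text) + 1)
    for i, key_char in enumerate(keyword, start=1):
        curr = [i] + [0] * len(text)
        for j, text_char in enumerate(text, start=1):
            curr[j] = min(
                prev[j] + 1,                          # keyword character dropped in the OCR text
                curr[j - 1] + 1,                      # noise character inserted in the OCR text
                prev[j - 1] + (key_char != text_char),  # exact match or substitution
            )
        prev = curr
    return min(prev)


def contains_keyword(keyword: str, text: str, max_errors: int = 1) -> bool:
    """True if the keyword occurs in the text with at most `max_errors` edits."""
    return best_match_distance(keyword, text) <= max_errors
```

    For example, `contains_keyword("retrieval", "robust retrleval method")` accepts the misrecognized word. Raising `max_errors` finds more damaged occurrences but also admits more false hits, which mirrors the recall and precision trade-off reported in the abstract.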

  15. Pharmspresso: a text mining tool for extraction of pharmacogenomic concepts and relationships from full text.

    Science.gov (United States)

    Garten, Yael; Altman, Russ B

    2009-02-05

    Pharmacogenomics studies the relationship between genetic variation and the variation in drug response phenotypes. The field is rapidly gaining importance: it promises drugs targeted to particular subpopulations based on genetic background. The pharmacogenomics literature has expanded rapidly, but is dispersed in many journals. It is challenging, therefore, to identify important associations between drugs and molecular entities--particularly genes and gene variants, and thus these critical connections are often lost. Text mining techniques can allow us to convert the free-style text to a computable, searchable format in which pharmacogenomic concepts (such as genes, drugs, polymorphisms, and diseases) are identified, and important links between these concepts are recorded. Availability of full text articles as input into text mining engines is key, as literature abstracts often do not contain sufficient information to identify these pharmacogenomic associations. Thus, building on a tool called Textpresso, we have created the Pharmspresso tool to assist in identifying important pharmacogenomic facts in full text articles. Pharmspresso parses text to find references to human genes, polymorphisms, drugs and diseases and their relationships. It presents these as a series of marked-up text fragments, in which key concepts are visually highlighted. To evaluate Pharmspresso, we used a gold standard of 45 human-curated articles. Pharmspresso identified 78%, 61%, and 74% of target gene, polymorphism, and drug concepts, respectively. Pharmspresso is a text analysis tool that extracts pharmacogenomic concepts from the literature automatically and thus captures our current understanding of gene-drug interactions in a computable form. We have made Pharmspresso available at http://pharmspresso.stanford.edu.

  16. Text Summarization Using FrameNet-Based Semantic Graph Model

    Directory of Open Access Journals (Sweden)

    Xu Han

    2016-01-01

    Text summarization generates a condensed version of the original document. The major issues for text summarization are eliminating redundant information, identifying important differences among documents, and recovering the informative content. This paper proposes a FrameNet-based Semantic Graph Model (FSGM) that exploits the semantic information of sentences. FSGM treats sentences as vertices and the semantic relationships between them as edges. It uses FrameNet and word embeddings to calculate the similarity of sentences, and assigns weights to both sentence nodes and edges. Finally, it proposes an improved method to rank these sentences, considering both internal and external information. The experimental results show that applying the model to text summarization is feasible and effective.

  17. The Texts of the Agency's Relationship Agreements with Specialized Agencies

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1962-04-10

    The text of the relationship agreement which the Agency has concluded with the Inter-Governmental Maritime Consultative Organization, together with the protocol authenticating it, is reproduced in this document for the information of all Members of the Agency.

  18. The Texts of the Agency's Relationship Agreements with Specialized Agencies

    International Nuclear Information System (INIS)

    1962-01-01

    The text of the relationship agreement which the Agency has concluded with the Inter-Governmental Maritime Consultative Organization, together with the protocol authenticating it, is reproduced in this document for the information of all Members of the Agency

  19. SIAM 2007 Text Mining Competition dataset

    Data.gov (United States)

    National Aeronautics and Space Administration — Subject Area: Text Mining Description: This is the dataset used for the SIAM 2007 Text Mining competition. This competition focused on developing text mining...

  20. Measurement of [Formula: see text] polarisation in [Formula: see text] collisions at [Formula: see text] = 7 TeV.

    Science.gov (United States)

    Aaij, R; Adeva, B; Adinolfi, M; Affolder, A; Ajaltouni, Z; Albrecht, J; Alessio, F; Alexander, M; Ali, S; Alkhazov, G; Alvarez Cartelle, P; Alves, A A; Amato, S; Amerio, S; Amhis, Y; An, L; Anderlini, L; Anderson, J; Andreassen, R; Andreotti, M; Andrews, J E; Appleby, R B; Aquines Gutierrez, O; Archilli, F; Artamonov, A; Artuso, M; Aslanides, E; Auriemma, G; Baalouch, M; Bachmann, S; Back, J J; Badalov, A; Balagura, V; Baldini, W; Barlow, R J; Barschel, C; Barsuk, S; Barter, W; Batozskaya, V; Bauer, Th; Bay, A; Beddow, J; Bedeschi, F; Bediaga, I; Belogurov, S; Belous, K; Belyaev, I; Ben-Haim, E; Bencivenni, G; Benson, S; Benton, J; Berezhnoy, A; Bernet, R; Bettler, M-O; van Beuzekom, M; Bien, A; Bifani, S; Bird, T; Bizzeti, A; Bjørnstad, P M; Blake, T; Blanc, F; Blouw, J; Blusk, S; Bocci, V; Bondar, A; Bondar, N; Bonivento, W; Borghi, S; Borgia, A; Borsato, M; Bowcock, T J V; Bowen, E; Bozzi, C; Brambach, T; van den Brand, J; Bressieux, J; Brett, D; Britsch, M; Britton, T; Brook, N H; Brown, H; Bursche, A; Busetto, G; Buytaert, J; Cadeddu, S; Calabrese, R; Callot, O; Calvi, M; Calvo Gomez, M; Camboni, A; Campana, P; Campora Perez, D; Carbone, A; Carboni, G; Cardinale, R; Cardini, A; Carranza-Mejia, H; Carson, L; Carvalho Akiba, K; Casse, G; Cassina, L; Castillo Garcia, L; Cattaneo, M; Cauet, Ch; Cenci, R; Charles, M; Charpentier, Ph; Cheung, S-F; Chiapolini, N; Chrzaszcz, M; Ciba, K; Cid Vidal, X; Ciezarek, G; Clarke, P E L; Clemencic, M; Cliff, H V; Closier, J; Coca, C; Coco, V; Cogan, J; Cogneras, E; Collins, P; Comerma-Montells, A; Contu, A; Cook, A; Coombes, M; Coquereau, S; Corti, G; Corvo, M; Counts, I; Couturier, B; Cowan, G A; Craik, D C; Cruz Torres, M; Cunliffe, S; Currie, R; D'Ambrosio, C; Dalseno, J; David, P; David, P N Y; Davis, A; De Bruyn, K; De Capua, S; De Cian, M; De Miranda, J M; De Paula, L; De Silva, W; De Simone, P; Decamp, D; Deckenhoff, M; Del Buono, L; Déléage, N; Derkach, D; Deschamps, O; Dettori, F; Di Canto, A; Dijkstra, H; 
Donleavy, S; Dordei, F; Dorigo, M; Dosil Suárez, A; Dossett, D; Dovbnya, A; Dupertuis, F; Durante, P; Dzhelyadin, R; Dziurda, A; Dzyuba, A; Easo, S; Egede, U; Egorychev, V; Eidelman, S; Eisenhardt, S; Eitschberger, U; Ekelhof, R; Eklund, L; El Rifai, I; Elsasser, Ch; Esen, S; Evans, T; Falabella, A; Färber, C; Farinelli, C; Farry, S; Ferguson, D; Fernandez Albor, V; Ferreira Rodrigues, F; Ferro-Luzzi, M; Filippov, S; Fiore, M; Fiorini, M; Firlej, M; Fitzpatrick, C; Fiutowski, T; Fontana, M; Fontanelli, F; Forty, R; Francisco, O; Frank, M; Frei, C; Frosini, M; Fu, J; Furfaro, E; Gallas Torreira, A; Galli, D; Gandelman, M; Gandini, P; Gao, Y; Garofoli, J; Garra Tico, J; Garrido, L; Gaspar, C; Gauld, R; Gavardi, L; Gersabeck, E; Gersabeck, M; Gershon, T; Ghez, Ph; Gianelle, A; Giani, S; Gibson, V; Giubega, L; Gligorov, V V; Göbel, C; Golubkov, D; Golutvin, A; Gomes, A; Gordon, H; Gotti, C; Grabalosa Gándara, M; Graciani Diaz, R; Granado Cardoso, L A; Graugés, E; Graziani, G; Grecu, A; Greening, E; Gregson, S; Griffith, P; Grillo, L; Grünberg, O; Gui, B; Gushchin, E; Guz, Yu; Gys, T; Hadjivasiliou, C; Haefeli, G; Haen, C; Haines, S C; Hall, S; Hamilton, B; Hampson, T; Han, X; Hansmann-Menzemer, S; Harnew, N; Harnew, S T; Harrison, J; Hartmann, T; He, J; Head, T; Heijne, V; Hennessy, K; Henrard, P; Henry, L; Hernando Morata, J A; van Herwijnen, E; Heß, M; Hicheur, A; Hill, D; Hoballah, M; Hombach, C; Hulsbergen, W; Hunt, P; Hussain, N; Hutchcroft, D; Hynds, D; Iakovenko, V; Idzik, M; Ilten, P; Jacobsson, R; Jaeger, A; Jalocha, J; Jans, E; Jaton, P; Jawahery, A; Jezabek, M; Jing, F; John, M; Johnson, D; Jones, C R; Joram, C; Jost, B; Jurik, N; Kaballo, M; Kandybei, S; Kanso, W; Karacson, M; Karbach, T M; Kelsey, M; Kenyon, I R; Ketel, T; Khanji, B; Khurewathanakul, C; Klaver, S; Kochebina, O; Kolpin, M; Komarov, I; Koopman, R F; Koppenburg, P; Korolev, M; Kozlinskiy, A; Kravchuk, L; Kreplin, K; Kreps, M; Krocker, G; Krokovny, P; Kruse, F; Kucharczyk, M; Kudryavtsev, V; 
Kurek, K; Kvaratskheliya, T; La Thi, V N; Lacarrere, D; Lafferty, G; Lai, A; Lambert, D; Lambert, R W; Lanciotti, E; Lanfranchi, G; Langenbruch, C; Latham, T; Lazzeroni, C; Le Gac, R; van Leerdam, J; Lees, J-P; Lefèvre, R; Leflat, A; Lefrançois, J; Leo, S; Leroy, O; Lesiak, T; Leverington, B; Li, Y; Liles, M; Lindner, R; Linn, C; Lionetto, F; Liu, B; Liu, G; Lohn, S; Longstaff, I; Longstaff, I; Lopes, J H; Lopez-March, N; Lowdon, P; Lu, H; Lucchesi, D; Luisier, J; Luo, H; Lupato, A; Luppi, E; Lupton, O; Machefert, F; Machikhiliyan, I V; Maciuc, F; Maev, O; Malde, S; Manca, G; Mancinelli, G; Manzali, M; Maratas, J; Marchand, J F; Marconi, U; Marino, P; Märki, R; Marks, J; Martellotti, G; Martens, A; Martín Sánchez, A; Martinelli, M; Martinez Santos, D; Martinez Vidal, F; Martins Tostes, D; Massafferri, A; Matev, R; Mathe, Z; Matteuzzi, C; Mazurov, A; McCann, M; McCarthy, J; McNab, A; McNulty, R; McSkelly, B; Meadows, B; Meier, F; Meissner, M; Merk, M; Milanes, D A; Minard, M-N; Molina Rodriguez, J; Monteil, S; Moran, D; Morandin, M; Morawski, P; Mordà, A; Morello, M J; Moron, J; Mountain, R; Muheim, F; Müller, K; Muresan, R; Muster, B; Naik, P; Nakada, T; Nandakumar, R; Nasteva, I; Needham, M; Neri, N; Neubert, S; Neufeld, N; Neuner, M; Nguyen, A D; Nguyen, T D; Nguyen-Mau, C; Nicol, M; Niess, V; Niet, R; Nikitin, N; Nikodem, T; Novoselov, A; Oblakowska-Mucha, A; Obraztsov, V; Oggero, S; Ogilvy, S; Okhrimenko, O; Oldeman, R; Onderwater, G; Orlandea, M; Otalora Goicochea, J M; Owen, P; Oyanguren, A; Pal, B K; Palano, A; Palombo, F; Palutan, M; Panman, J; Papanestis, A; Pappagallo, M; Parkes, C; Parkinson, C J; Passaleva, G; Patel, G D; Patel, M; Patrignani, C; Pazos Alvarez, A; Pearce, A; Pellegrino, A; Penso, G; Pepe Altarelli, M; Perazzini, S; Perez Trigo, E; Perret, P; Perrin-Terrin, M; Pescatore, L; Pesen, E; Petridis, K; Petrolini, A; Picatoste Olloqui, E; Pietrzyk, B; Pilař, T; Pinci, D; Pistone, A; Playfer, S; Plo Casasus, M; Polci, F; Polok, G; Poluektov, A; 
Polycarpo, E; Popov, A; Popov, D; Popovici, B; Potterat, C; Powell, A; Prisciandaro, J; Pritchard, A; Prouve, C; Pugatch, V; Puig Navarro, A; Punzi, G; Qian, W; Rachwal, B; Rademacker, J H; Rakotomiaramanana, B; Rama, M; Rangel, M S; Raniuk, I; Rauschmayr, N; Raven, G; Redford, S; Reichert, S; Reid, M M; Dos Reis, A C; Ricciardi, S; Richards, A; Rinnert, K; Rives Molina, V; Roa Romero, D A; Robbe, P; Rodrigues, A B; Rodrigues, E; Rodriguez Perez, P; Roiser, S; Romanovsky, V; Romero Vidal, A; Rotondo, M; Rouvinet, J; Ruf, T; Ruffini, F; Ruiz, H; Ruiz Valls, P; Sabatino, G; Saborido Silva, J J; Sagidova, N; Sail, P; Saitta, B; Salustino Guimaraes, V; Sanchez Mayordomo, C; Sanmartin Sedes, B; Santacesaria, R; Santamarina Rios, C; Santovetti, E; Sapunov, M; Sarti, A; Satriano, C; Satta, A; Savrie, M; Savrina, D; Schiller, M; Schindler, H; Schlupp, M; Schmelling, M; Schmidt, B; Schneider, O; Schopper, A; Schune, M-H; Schwemmer, R; Sciascia, B; Sciubba, A; Seco, M; Semennikov, A; Senderowska, K; Sepp, I; Serra, N; Serrano, J; Sestini, L; Seyfert, P; Shapkin, M; Shapoval, I; Shcheglov, Y; Shears, T; Shekhtman, L; Shevchenko, V; Shires, A; Silva Coutinho, R; Simi, G; Sirendi, M; Skidmore, N; Skwarnicki, T; Smith, N A; Smith, E; Smith, E; Smith, J; Smith, M; Snoek, H; Sokoloff, M D; Soler, F J P; Soomro, F; Souza, D; Souza De Paula, B; Spaan, B; Sparkes, A; Spinella, F; Spradlin, P; Stagni, F; Stahl, S; Steinkamp, O; Stenyakin, O; Stevenson, S; Stoica, S; Stone, S; Storaci, B; Stracka, S; Straticiuc, M; Straumann, U; Stroili, R; Subbiah, V K; Sun, L; Sutcliffe, W; Swientek, K; Swientek, S; Syropoulos, V; Szczekowski, M; Szczypka, P; Szilard, D; Szumlak, T; T'Jampens, S; Teklishyn, M; Tellarini, G; Teodorescu, E; Teubert, F; Thomas, C; Thomas, E; van Tilburg, J; Tisserand, V; Tobin, M; Tolk, S; Tomassetti, L; Tonelli, D; Topp-Joergensen, S; Torr, N; Tournefier, E; Tourneur, S; Tran, M T; Tresch, M; Tsaregorodtsev, A; Tsopelas, P; Tuning, N; Ubeda Garcia, M; Ukleja, A; 
Ustyuzhanin, A; Uwer, U; Vagnoni, V; Valenti, G; Vallier, A; Vazquez Gomez, R; Vazquez Regueiro, P; Vázquez Sierra, C; Vecchi, S; Velthuis, J J; Veltri, M; Veneziano, G; Vesterinen, M; Viaud, B; Vieira, D; Vieites Diaz, M; Vilasis-Cardona, X; Vollhardt, A; Volyanskyy, D; Voong, D; Vorobyev, A; Vorobyev, V; Voß, C; Voss, H; de Vries, J A; Waldi, R; Wallace, C; Wallace, R; Walsh, J; Wandernoth, S; Wang, J; Ward, D R; Watson, N K; Webber, A D; Websdale, D; Whitehead, M; Wicht, J; Wiedner, D; Wiggers, L; Wilkinson, G; Williams, M P; Williams, M; Wilson, F F; Wimberley, J; Wishahi, J; Wislicki, W; Witek, M; Wormser, G; Wotton, S A; Wright, S; Wu, S; Wyllie, K; Xie, Y; Xing, Z; Xu, Z; Yang, Z; Yuan, X; Yushchenko, O; Zangoli, M; Zavertyaev, M; Zhang, F; Zhang, L; Zhang, W C; Zhang, Y; Zhelezov, A; Zhokhov, A; Zhong, L; Zvyagin, A

    The polarisation of prompt [Formula: see text] mesons is measured by performing an angular analysis of [Formula: see text] decays using proton-proton collision data, corresponding to an integrated luminosity of 1.0[Formula: see text], collected by the LHCb detector at a centre-of-mass energy of 7 TeV. The polarisation is measured in bins of transverse momentum [Formula: see text] and rapidity [Formula: see text] in the kinematic region [Formula: see text] and [Formula: see text], and is compared to theoretical models. No significant polarisation is observed.

  1. Counting OCR errors in typeset text

    Science.gov (United States)

    Sandberg, Jonathan S.

    1995-03-01

    Frequently, object recognition accuracy is a key component in the performance analysis of pattern matching systems. In the past three years, the results of numerous excellent and rigorous studies of OCR system typeset-character accuracy (henceforth OCR accuracy) have been published, encouraging performance comparisons between a variety of OCR products and technologies. These published figures are important; OCR vendor advertisements in the popular trade magazines lead readers to believe that published OCR accuracy figures affect market share in the lucrative OCR market. Curiously, a detailed review of many of these OCR error occurrence counting results reveals that they are not reproducible as published and they are not strictly comparable due to larger variances in the counts than would be expected by the sampling variance. Naturally, since OCR accuracy is based on a ratio of the number of OCR errors over the size of the text searched for errors, imprecise OCR error accounting leads to similar imprecision in OCR accuracy. Some published papers use informal, non-automatic, or intuitively correct OCR error accounting. Still other published results present OCR error accounting methods based on string matching algorithms such as dynamic programming using Levenshtein (edit) distance but omit critical implementation details (such as the existence of suspect markers in the OCR generated output or the weights used in the dynamic programming minimization procedure). The problem with not specifically revealing the accounting method is that the numbers of errors found by different methods differ significantly. This paper identifies the basic accounting methods used to measure OCR errors in typeset text and offers an evaluation and comparison of the various accounting methods.
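    The abstract's point about undisclosed weights can be illustrated with a minimal sketch of error counting by weighted edit distance. This is not the paper's procedure: the function names, the default unit weights, and the definition of accuracy as one minus errors over ground-truth length are assumptions made here for illustration only.

```python
def weighted_edit_distance(truth, ocr, sub_cost=1, ins_cost=1, del_cost=1):
    """Minimum total cost of the edits that turn `truth` into `ocr`.

    Standard dynamic programming over two rows; the three weights are
    illustrative parameters, not values taken from any published study.
    """
    # Row 0: build the OCR prefix purely by insertions.
    prev = [j * ins_cost for j in range(len(ocr) + 1)]
    for i, t_ch in enumerate(truth, start=1):
        curr = [i * del_cost]  # delete the first i ground-truth characters
        for j, o_ch in enumerate(ocr, start=1):
            match_or_sub = prev[j - 1] + (0 if t_ch == o_ch else sub_cost)
            delete = prev[j] + del_cost
            insert = curr[j - 1] + ins_cost
            curr.append(min(match_or_sub, delete, insert))
        prev = curr
    return prev[-1]


def ocr_accuracy(truth, ocr, **weights):
    """Assumed definition: 1 - (error cost / ground-truth length)."""
    return 1 - weighted_edit_distance(truth, ocr, **weights) / len(truth)


# "corn" misread as "com": the count depends on the weights chosen.
unit_weights = weighted_edit_distance("corn", "com")
costly_substitution = weighted_edit_distance("corn", "com", sub_cost=2)
```

    With unit weights the misreading counts as two errors (one substitution and one deletion); raising the substitution weight to 2 yields a cost of three for the same pair of strings, which is the kind of discrepancy that makes unreported weights a reproducibility problem.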

  2. The effects of visual crowding, text size, and positional uncertainty on text legibility at a glance.

    Science.gov (United States)

    Dobres, Jonathan; Wolfe, Benjamin; Chahine, Nadine; Reimer, Bryan

    2018-07-01

    Reading at a glance, once a relatively infrequent mode of reading, is becoming common. Mobile interaction paradigms increasingly dominate the way in which users obtain information about the world, which often requires reading at a glance, whether from a smartphone, wearable device, or in-vehicle interface. Recent research in these areas has shown that a number of factors can affect text legibility when words are briefly presented in isolation. Here we expand upon this work by examining how legibility is affected by more crowded presentations. Word arrays were combined with a lexical decision task, in which the size of the text elements and the inter-line spacing (leading) between individual items were manipulated to gauge their relative impacts on text legibility. In addition, a single-word presentation condition that randomized the location of presentation was compared with previous work that held position constant. Results show that larger text was more legible than smaller text. Wider leading significantly enhanced legibility as well, but contrary to expectations, wider leading did not fully counteract decrements in legibility at smaller text sizes. Single-word stimuli presented with random positioning were more difficult to read than stationary counterparts from earlier studies. Finally, crowded displays required much greater processing time compared to single-word displays. These results have implications for modern interface design, which often presents interactions in the form of scrollable and/or selectable lists. The present findings are of practical interest to the wide community of graphic designers and interface engineers responsible for developing our interfaces of daily use. Copyright © 2018 Elsevier Ltd. All rights reserved.

  3. Emotiogenic Cognitive Function of Modern School Teaching Texts

    Directory of Open Access Journals (Sweden)

    Любовь Васильевна Ерохина

    2015-12-01

    The article is devoted to the analysis of the emotional attractiveness of modern school educational texts and their ecological/non-ecological influence upon pupils' cognition in teaching communication. The reasoning is based on the thesis that the emotional attractiveness of modern school educational texts opposes their cognitive function. The emotional profile of an educational text and its components are considered. The article is concerned with ecological and cognitive and emotional asymmetry content. The material under focus is printed texts from some modern school textbooks, teaching methodical aids, academic competitions, and mass media information, viewed from the cognitive ecology point of view.

  4. How the Relationship between Text and Headings Influences Readers' Memory

    Science.gov (United States)

    Ritchey, Kristin; Schuster, Jonathan; Allen, Jaryn

    2008-01-01

    Two questions regarding signals' influence on memory were examined. First, the relationship between headings and text was manipulated to determine whether headings serve as visual cues, directing readers to recall all subsequent information, or content-specific cues, directing readers to recall only certain information. Second, distance between…

  5. Text mining of web-based medical content

    CERN Document Server

    Neustein, Amy

    2014-01-01

    Text Mining of Web-Based Medical Content examines web mining for extracting useful information that can be used for treating and monitoring the healthcare of patients. This work provides methodological approaches to designing mapping tools that exploit data found in social media postings. Specific linguistic features of medical postings are analyzed vis-a-vis available data extraction tools for culling useful information.

  6. The role of interest and text structure in professional reading

    NARCIS (Netherlands)

    Spooren, W.; Mulder, M.N.; Hoeken, H.

    1998-01-01

    Students can be regarded as professional readers: they have to attend to, comprehend and remember the most important information in instructional texts, often about topics they are not readily interested in. Optimising such instructional texts has been the subject of much reading research. This

  7. Issues in Text Design and Layout for Computer Based Communications.

    Science.gov (United States)

    Andresen, Lee W.

    1991-01-01

    Discussion of computer-based communications (CBC) focuses on issues involved with screen design and layout for electronic text, based on experiences with electronic messaging, conferencing, and publishing within the Australian Open Learning Information Network (AOLIN). Recommendations for research on design and layout for printed text are also…

  8. Profiles and Context for Structured Text Retrieval

    DEFF Research Database (Denmark)

    Koolen, Marijn; Bogers, Toine

    2017-01-01

    The combination of structured information retrieval with user profile information represents the scenario where systems search with an explicit statement of the information need—a search query—as well as a profile of a user, which can contain information about previous interactions, search histor...

  9. Information Impact: Journal of Information and Knowledge ...

    African Journals Online (AJOL)

    Information Impact: Journal of Information and Knowledge Management: Advanced Search.

  10. Information Impact: Journal of Information and Knowledge ...

    African Journals Online (AJOL)

    Information Impact: Journal of Information and Knowledge Management: Site Map.

  11. Examining Text Complexity in the Early Grades

    Science.gov (United States)

    Fitzgerald, Jill; Elmore, Jeff; Hiebert, Elfrieda H.; Koons, Heather H.; Bowen, Kimberly; Sanford-Moore, Eleanor E.; Stenner, A. Jackson

    2016-01-01

    The Common Core raises the stature of texts to new heights, creating a hubbub. The fuss is especially messy at the early grades, where children are expected to read more complex texts than in the past. But early-grades teachers have been given little actionable guidance about text complexity. The authors recently examined early-grades texts to…

  12. Information Impact: Journal of Information and Knowledge ...

    African Journals Online (AJOL)

    Meeting the information needs of remote library users: the case of University of Maiduguri Distance Learning Programme. Adam Gambo Saleh, 1-16 ...

  13. Empirical Studies On Machine Learning Based Text Classification Algorithms

    OpenAIRE

    Shweta C. Dharmadhikari; Maya Ingle; Parag Kulkarni

    2011-01-01

    Automatic classification of text documents has become an important research issue nowadays. Proper classification of text documents requires information retrieval, machine learning and natural language processing (NLP) techniques. Our aim is to focus on important approaches to automatic text classification based on machine learning techniques, viz. supervised, unsupervised and semi-supervised. In this paper we present a review of various text classification approaches under machine learning paradig...

  14. A Chinese text classification system based on Naive Bayes algorithm

    Directory of Open Access Journals (Sweden)

    Cui Wei

    2016-01-01

    In this paper, aiming at the characteristics of Chinese text classification, the ICTCLAS (Chinese lexical analysis system of the Chinese Academy of Sciences) is used for document segmentation, stop words are cleaned and filtered from the data, and the information gain and document frequency feature selection algorithms are used for document feature selection. On this basis, a text classifier based on the Naive Bayes algorithm is implemented, and experiments and analysis of the system are carried out using the Chinese corpus of Fudan University.
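    The pipeline described above (segmentation, stop-word filtering, feature selection, Naive Bayes) can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' system: documents are assumed to be already segmented into tokens (standing in for ICTCLAS output), only the document frequency criterion is implemented (information gain is omitted), and all function names, the toy corpus, and the `min_df` threshold are hypothetical.

```python
import math
from collections import Counter, defaultdict


def select_by_document_frequency(docs, stop_words, min_df=2):
    """Keep non-stop-word terms that occur in at least `min_df` documents."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens) - stop_words)
    return {term for term, count in df.items() if count >= min_df}


def train_naive_bayes(docs, labels, vocab):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        term_counts[label].update(t for t in tokens if t in vocab)
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(term_counts[label].values())
        log_prior = math.log(n_docs / len(docs))
        log_like = {
            t: math.log((term_counts[label][t] + 1) / (total + len(vocab)))
            for t in vocab
        }
        model[label] = (log_prior, log_like)
    return model


def classify(model, tokens):
    """Return the label with the highest log posterior; unknown terms are skipped."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(model, key=score)


# Toy, pre-segmented corpus (hypothetical data, not the Fudan corpus).
docs = [
    ["股票", "市场", "上涨"],
    ["股票", "基金", "的"],
    ["球队", "比赛", "胜利"],
    ["球队", "球员", "的"],
]
labels = ["finance", "finance", "sports", "sports"]
vocab = select_by_document_frequency(docs, stop_words={"的"}, min_df=2)
model = train_naive_bayes(docs, labels, vocab)
predicted = classify(model, ["股票", "下跌"])
```

    On this toy corpus only the two terms that occur in two documents survive selection, and the query containing the finance term is assigned to the "finance" class. A real system would use a proper segmenter and a far larger vocabulary.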

  15. THE IMPACT OF TEXT DRIVING ON DRIVING SAFETY

    OpenAIRE

    Sanaz Motamedi; Jyh-Hone Wang

    2016-01-01

    In an increasingly mobile era, the wide availability of technology for texting and the prevalence of hands-free form have introduced a new safety concern for drivers. To assess this concern, a questionnaire was first deployed online to gain an understanding of drivers’ text driving experiences as well as their demographic information. The results from 232 people revealed that the majority of drivers are aware of the associated risks with texting while driving. However, more than one-fourth of...

  16. Functions of Case Statements in the Kazakh Text

    Directory of Open Access Journals (Sweden)

    Almagul S. Adilova

    2013-01-01

    The article deals with the functioning of universally decisional or foreign statements. Foreign precedent statements in Kazakh texts are used in canonic and modified forms and fulfill connotative, text-forming, and informative functions. These quotations, having lost connection with their context, do not always preserve their perception invariant, due to the diversity of linguistic competence and cognitive basis of an author or a reader.

  17. Text Mining in Biomedical Domain with Emphasis on Document Clustering.

    Science.gov (United States)

    Renganathan, Vinaitheerthan

    2017-07-01

    With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.

  18. The text plan concept: contributions to the writing planning process

    Directory of Open Access Journals (Sweden)

    Ana Lúcia Tinoco Cabral

    2013-12-01

    Students at different levels, ranging from early grades up to PhD, face problems with both comprehension and text production. This paper focuses on the text plan concept according to the DTA (Discourse Text Analysis) approach, i.e., a principle of organization that allows students to put into practice the production intention as well as to arrange text information while producing, and that is responsible for the text compositional structure (Adam, 2008). The study analyzes the relation between text plan and the writing planning process, in which the first one provides the second with theoretical support. In order to develop such research, the study covers some issues related to the reading skill, analyzes an argumentative text as per its textual plan, and presents some reflections on the writing process, focusing on the relation between textual plan and the writing planning process.

  19. SAW Classification Algorithm for Chinese Text Classification

    OpenAIRE

    Xiaoli Guo; Huiyu Sun; Tiehua Zhou; Ling Wang; Zhaoyang Qu; Jiannan Zang

    2015-01-01

    With the explosive growth of data, the increasing amount of text data places higher demands on the performance of text categorization, demands that existing classification methods cannot satisfy. Based on the study of existing text classification technology and semantics, this paper puts forward a Chinese-text-classification-oriented SAW (Structural Auxiliary Word) algorithm. The algorithm uses the special space effect of Chinese text where words...

  20. A multicase study of the impact of perceived gender roles on the career decisions of women in science-related careers

    Science.gov (United States)

    Hren, Stephen Frank

    The purpose of this study was to determine how perceived gender roles developed throughout childhood and early adulthood impacted the career decisions of women in science-related career fields. An additional purpose was to determine if my experiences as I analyzed the data and the propositions discovered in the study would become a transformative agent for me. A multicase framework was utilized so that within and between case analyses could be achieved. Four women who showed early promise in science were chosen as the case study participants. The relationship of gender roles to the career decisions made by the four cases were arbitrated through three areas: (a) supports, which came from parents, immediate family members, spouses, teachers, mentors, and collaborators; (b) opportunities, which were separated into family experiences and opportunities, school and community opportunities, and postsecondary/current opportunities; and (c) postmodern feminism, which was the lens that grounded this study and fit well with the lives of the cases. As seen through a postmodern feminist lens, the cases' social class, their lived experiences tied to their opportunities and supports, and the culture of growing up in a small rural community helped them develop personas for the professions they chose even where those professions did not necessarily follow from the early promise shown for a science-related career. In addition, as related to my transformation as a male researcher, being a male conducting research in a realm most often shared by women, I was able to gain greater empathy and understanding of what it takes for women to be successful in a career and at the same time maintain a fruitful family life.

  1. Discovering gene annotations in biomedical text databases

    Directory of Open Access Journals (Sweden)

    Ozsoyoglu Gultekin

    2008-03-01

    Background: Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results: In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns" and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision level of 78% at a recall level of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion: GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme with respect to "exact matching" with the advantage of locating approximate

  2. Extractive text summarization system to aid data extraction from full text in systematic review development.

    Science.gov (United States)

    Bui, Duy Duc An; Del Fiol, Guilherme; Hurdle, John F; Jonnalagadda, Siddhartha

    2016-12-01

    Extracting data from publication reports is a standard process in systematic review (SR) development. However, the data extraction process still relies too much on manual effort which is slow, costly, and subject to human error. In this study, we developed a text summarization system aimed at enhancing productivity and reducing errors in the traditional data extraction process. We developed a computer system that used machine learning and natural language processing approaches to automatically generate summaries of full-text scientific publications. The summaries at the sentence and fragment levels were evaluated in finding common clinical SR data elements such as sample size, group size, and PICO values. We compared the computer-generated summaries with human written summaries (title and abstract) in terms of the presence of necessary information for the data extraction as presented in the Cochrane review's study characteristics tables. At the sentence level, the computer-generated summaries covered more information than humans do for systematic reviews (recall 91.2% vs. 83.8%, p<0.001). They also had a better density of relevant sentences (precision 59% vs. 39%, p<0.001). At the fragment level, the ensemble approach combining rule-based, concept mapping, and dictionary-based methods performed better than individual methods alone, achieving an 84.7% F-measure. Computer-generated summaries are potential alternative information sources for data extraction in systematic review development. Machine learning and natural language processing are promising approaches to the development of such an extractive summarization system. Copyright © 2016 Elsevier Inc. All rights reserved.
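    For readers comparing the sentence-level figures quoted above, the precision and recall pairs can be combined into an F-measure with the standard harmonic-mean formula. The sketch below is only an arithmetic aid: the abstract does not report a sentence-level F-measure, so the values computed here are derived solely from the quoted precision and recall and are not results from the study. The function name and the `beta` parameter are illustrative.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Sentence-level figures quoted in the abstract (precision, recall).
computer_f1 = f_measure(precision=0.59, recall=0.912)
human_f1 = f_measure(precision=0.39, recall=0.838)
```

    Under these inputs the derived F1 is roughly 0.72 for the computer-generated summaries and roughly 0.53 for the human-written ones. The 84.7% F-measure reported in the abstract refers to the fragment-level ensemble and is a separate quantity.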

  3. A text-mining system for extracting metabolic reactions from full-text articles.

    Science.gov (United States)

    Czarnecki, Jan; Nobeli, Irene; Smith, Adrian M; Shepherd, Adrian J

    2012-07-23

    Increasingly, biological text mining research is focusing on the extraction of complex relationships relevant to the construction and curation of biological networks and pathways. However, one important category of pathway - metabolic pathways - has been largely neglected. Here we present a relatively simple method for extracting metabolic reaction information from free text that scores different permutations of assigned entities (enzymes and metabolites) within a given sentence based on the presence and location of stemmed keywords. This method extends an approach that has proved effective in the context of the extraction of protein-protein interactions. When evaluated on a set of manually-curated metabolic pathways using standard performance criteria, our method performs surprisingly well. Precision and recall rates are comparable to those previously achieved for the well-known protein-protein interaction extraction task. We conclude that automated metabolic pathway construction is more tractable than has often been assumed, and that (as in the case of protein-protein interaction extraction) relatively simple text-mining approaches can prove surprisingly effective. It is hoped that these results will provide an impetus to further research and act as a useful benchmark for judging the performance of more sophisticated methods that are yet to be developed.

  4. A Proposed Arabic Handwritten Text Normalization Method

    Directory of Open Access Journals (Sweden)

    Tarik Abu-Ain

    2014-11-01

    Text normalization is an important technique in document image analysis and recognition. It consists of many preprocessing stages, which include slope correction, text padding, skew correction, and straightening the writing line. In this respect, text normalization has an important role in many procedures such as text segmentation, feature extraction and character recognition. In the present article, a new method for text baseline detection, straightening, and slant correction for Arabic handwritten texts is proposed. The method comprises a set of sequential steps: first, component segmentation is done, followed by thinning of the text components; then, the direction features of the skeletons are extracted, and the candidate baseline regions are determined. After that, selection of the correct baseline region is done, and finally, the baselines of all components are aligned with the writing line. The experiments are conducted on the IFN/ENIT benchmark Arabic dataset. The results show that the proposed method has a promising and encouraging performance.

  5. Partition of Ni between olivine and sulfide: the effect of temperature, $f_{\mathrm{O}_2}$ and $f_{\mathrm{S}_2}$

    Science.gov (United States)

    Fleet, M. E.; Macrae, N. D.

    1987-03-01

    The experimental distribution coefficient for Ni/Fe exchange between olivine and monosulfide ($K_{D3}$) is 35.6±1.1 at 1385 °C, $f_{\mathrm{O}_2} = 10^{-8.87}$, $f_{\mathrm{S}_2} = 10^{-1.02}$, and olivine of composition Fo96 to Fo92. These are the physicochemical conditions appropriate to hypothesized sulfur-saturated komatiite magma. The present experiments equilibrated natural olivine grains with sulfide-oxide liquid in the presence of a (Mg, Fe)-alumino-silicate melt. By a variety of different experimental procedures, $K_{D3}$ is shown to be essentially constant at about 30 to 35 in the temperature range 900 to 1400 °C, for olivine of composition Fo97 to Fo0, monosulfide composition with up to 70 mol. % NiS, and a wide range of $f_{\mathrm{O}_2}$ and $f_{\mathrm{S}_2}$.

  6. Arabic text classification using Polynomial Networks

    Directory of Open Access Journals (Sweden)

    Mayy M. Al-Tahrawi

    2015-10-01

    In this paper, an Arabic statistical learning-based text classification system has been developed using Polynomial Neural Networks. Polynomial Networks have been recently applied to English text classification, but they were never used for Arabic text classification. In this research, we investigate the performance of Polynomial Networks in classifying Arabic texts. Experiments are conducted on a widely used Arabic dataset in text classification: Al-Jazeera News dataset. We chose this dataset to enable direct comparisons of the performance of Polynomial Networks classifier versus other well-known classifiers on this dataset in the literature of Arabic text classification. Results of experiments show that Polynomial Networks classifier is a competitive algorithm to the state-of-the-art ones in the field of Arabic text classification.

  7. Text analysis with R for students of literature

    CERN Document Server

    Jockers, Matthew L

    2014-01-01

    Text Analysis with R for Students of Literature is written with students and scholars of literature in mind but will be applicable to other humanists and social scientists wishing to extend their methodological tool kit to include quantitative and computational approaches to the study of text. Computation provides access to information in text that we simply cannot gather using traditional qualitative methods of close reading and human synthesis. Text Analysis with R for Students of Literature provides a practical introduction to computational text analysis using the open source programming language R. R is extremely popular throughout the sciences and because of its accessibility, R is now used increasingly in other research areas. Readers begin working with text right away and each chapter works through a new technique or process such that readers gain a broad exposure to core R procedures and a basic understanding of the possibilities of computational text analysis at both the micro and macro scale. Each c...

  8. Multimodal Diversity of Postmodernist Fiction Text

    Directory of Open Access Journals (Sweden)

    U. I. Tykha

    2016-12-01

    The article is devoted to the analysis of structural and functional manifestations of multimodal diversity in postmodernist fiction texts. Multimodality is defined as the coexistence of more than one semiotic mode within a certain context. Multimodal texts feature a diversity of semiotic modes in the communication and development of their narrative. Such experimental texts subvert conventional patterns by introducing various semiotic resources – verbal or non-verbal.

  9. Social information

    Directory of Open Access Journals (Sweden)

    Luiz Fernando de Barros Campos

    Based on Erving Goffman's work, the article aims to discuss a definition of information centered on the type conveyed by individuals in a multimodal way, encompassing language and body in situations of co-presence, where face-to-face interaction occurs, and influencing inter-subjective formation of the self. Six types of information are highlighted: material information, expressive information, ritualized information, meta-information, strategic information, and information displays. It is argued that the construction of this empirical object tends to dissolve the tension among material, cognitive and pragmatic aspects, constituting an example of the necessary integration among them. Some vulnerable characteristics of the theory are critically mentioned and it is suggested that the concept of information displays could provide a platform to approach the question of the interaction order in its relations with the institutional and social orders, and consequently, to reassess the scope of the notion of social information analyzed.

  10. Youth Texting: Help or Hindrance to Literacy?

    Science.gov (United States)

    Zebroff, Dmitri

    2018-01-01

    An extensive amount of research has been performed in recent years into the widespread practice of text messaging in youth. As part of this broad area of research, the associations between youth texting and literacy have been investigated in a variety of contexts. A comprehensive, semi-systematic review of the literature into texting and literacy…

  11. Choices of texts for literary education

    DEFF Research Database (Denmark)

    Skyggebjerg, Anna Karlskov

    This paper charts the general implications of the choice of texts for literature teaching in the Danish school system, especially in Grades 8 and 9. It will analyze and discuss the premises of the choice of texts, and the possibilities of a certain choice of text in a concrete classroom situation...

  12. Effects of Text Messaging on Academic Performance

    OpenAIRE

    Barks Amanda; Searight H. Russell; Ratwik Susan

    2011-01-01

    University students frequently send and receive cellular phone text messages during classroom instruction. Cognitive psychology research indicates that multi-tasking is frequently associated with performance cost. However, university students often have considerable experience with electronic multi-tasking and may believe that they can devote necessary attention to a classroom lecture while sending and receiving text messages. In the current study, university students who used text messaging were ...

  13. Academic Journal Embargoes and Full Text Databases.

    Science.gov (United States)

    Brooks, Sam

    2003-01-01

    Documents the reasons for embargoes of academic journals in full text databases (i.e., publisher-imposed delays on the availability of full text content) and provides insight regarding common misconceptions. Tables present data on selected journals covering a cross-section of subjects and publishers and comparing two full text business databases.…

  14. A quick survey of text categorization algorithms

    Directory of Open Access Journals (Sweden)

    Dan MUNTEANU

    2007-12-01

    This paper contains an overview of basic formulations and approaches to text classification. It surveys the algorithms used in text categorization: handcrafted rules, decision trees, decision rules, on-line learning, linear classifiers, Rocchio's algorithm, k Nearest Neighbor (kNN), and Support Vector Machines (SVM).
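    As a concrete instance of one algorithm from the list above, here is a minimal sketch of k Nearest Neighbor text categorization using raw term counts and cosine similarity. It is not drawn from the survey itself: the whitespace tokenizer, the toy training set, and the function names are assumptions chosen to keep the example short.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def knn_classify(train, text, k=3):
    """Label `text` by majority vote among its k most similar training documents."""
    query = Counter(text.lower().split())
    ranked = sorted(
        train,
        key=lambda item: cosine(query, Counter(item[0].lower().split())),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled documents, for illustration only.
train = [
    ("stock market rises", "finance"),
    ("bank raises interest rates", "finance"),
    ("team wins the match", "sports"),
    ("player scores in final match", "sports"),
    ("market shares fall", "finance"),
]
label = knn_classify(train, "match ends as team wins", k=3)
```

    Here the two sports documents share terms with the query and outvote the single remaining neighbor, so the query is labeled "sports". Practical systems typically replace raw counts with TF-IDF weights and tune k on held-out data.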

  15. Inclusion in the Workplace - Text Version | NREL

    Science.gov (United States)

    Careers » Inclusion in the Workplace - Text Version Inclusion in the Workplace - Text Version This is the text version for the Inclusion: Leading by Example video. I'm Martin Keller. I'm the NREL of the laboratory. Another very important element in inclusion is diversity. Because if we have a

  16. Effects of Text Messaging on Academic Performance

    Directory of Open Access Journals (Sweden)

    Barks Amanda

    2011-12-01

    Full Text Available University students frequently send and receive cellular phone text messages during classroom instruction. Cognitive psychology research indicates that multi-tasking is frequently associated with performance cost. However, university students often have considerable experience with electronic multi-tasking and may believe that they can devote necessary attention to a classroom lecture while sending and receiving text messages. In the current study, university students who used text messaging were randomly assigned to one of two conditions: 1. a group that sent and received text messages during a lecture or, 2. a group that did not engage in text messaging during the lecture. Participants who engaged in text messaging demonstrated significantly poorer performance on a test covering lecture content compared with the group that did not send and receive text messages. Participants exhibiting higher levels of text messaging skill had significantly lower test scores than participants who were less proficient at text messaging. It is hypothesized that in terms of retention of lecture material, more frequent task shifting by those with greater text messaging proficiency contributed to poorer performance. Overall, the findings do not support the view, held by many university students, that this form of multitasking has little effect on the acquisition of lecture content. Results provide empirical support for teachers and professors who ban text messaging in the classroom.

  17. The artists' text as work of art

    NARCIS (Netherlands)

    van Rijn, I.A.M.J.

    2017-01-01

    Artists’ texts are texts written and produced by visual artists. Their number increasing since the 2000s, it becomes important to clarify their obscure relationship to art institutions. Analysing and comparing four different artists’ texts on a textual level, this research proposes an alternative to

  18. Information Myopia

    Directory of Open Access Journals (Sweden)

    Nadi Helena Presser

    2016-04-01

    Full Text Available This article reflects on the ways of appropriation in organizations. The notion of Information Myopia is characterized by the lack of knowledge about the available informational capabilities in organizations, revealing a narrow view of the information environment. This analysis has focused on the process for renewing the software license contracts of a large multinational group, in order to manage its organizational assets in information technology. The collected, explained and justified information made it possible to elaborate an action proposal, which enabled the creation of new organizational knowledge. In its theoretical dimension, the value of information was materialized by its use, in a collective process of organizational learning.

  19. VideoSET: Video Summary Evaluation through Text

    OpenAIRE

    Yeung, Serena; Fathi, Alireza; Fei-Fei, Li

    2014-01-01

    In this paper we present VideoSET, a method for Video Summary Evaluation through Text that can evaluate how well a video summary is able to retain the semantic information contained in its original video. We observe that semantics is most easily expressed in words, and develop a text-based approach for the evaluation. Given a video summary, a text representation of the video summary is first generated, and an NLP-based metric is then used to measure its semantic distance to ground-truth text ...

  20. The Instructional Text like a Textual Genre

    Directory of Open Access Journals (Sweden)

    Adiane Fogali Marinello

    2011-07-01

    Full Text Available This article analyses the instructional text as a textual genre and is part of the research called Reading and text production from the textual genre perspective, done at Universidade de Caxias do Sul, Campus Universitário da Região dos Vinhedos. Firstly, some theoretical assumptions about textual genre are presented, then, the instructional text is characterized. After that an instructional text is analyzed and, finally, some activities related to reading and writing of the mentioned genre directed to High School and University students are suggested.

  1. Recognition of pornographic web pages by classifying texts and images.

    Science.gov (United States)

    Hu, Weiming; Wu, Ou; Chen, Zhouyao; Fu, Zhouyu; Maybank, Steve

    2007-06-01

    With the rapid development of the World Wide Web, people benefit more and more from the sharing of information. However, Web pages with obscene, harmful, or illegal content can be easily accessed. It is important to recognize such unsuitable, offensive, or pornographic Web pages. In this paper, a novel framework for recognizing pornographic Web pages is described. A C4.5 decision tree is used to divide Web pages, according to content representations, into continuous text pages, discrete text pages, and image pages. These three categories of Web pages are handled, respectively, by a continuous text classifier, a discrete text classifier, and an algorithm that fuses the results from the image classifier and the discrete text classifier. In the continuous text classifier, statistical and semantic features are used to recognize pornographic texts. In the discrete text classifier, the naive Bayes rule is used to calculate the probability that a discrete text is pornographic. In the image classifier, the object's contour-based features are extracted to recognize pornographic images. In the text and image fusion algorithm, the Bayes theory is used to combine the recognition results from images and texts. Experimental results demonstrate that the continuous text classifier outperforms the traditional keyword-statistics-based classifier, the contour-based image classifier outperforms the traditional skin-region-based image classifier, the results obtained by our fusion algorithm outperform those by either of the individual classifiers, and our framework can be adapted to different categories of Web pages.
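
    The fusion step described in this abstract can be illustrated with a minimal Python sketch. It assumes that each classifier outputs a posterior probability that a page is pornographic and that text and image evidence are conditionally independent given the class; the function name `fuse_posteriors`, the prior, and the example probabilities are hypothetical and are not taken from the paper.

```python
def fuse_posteriors(p_text: float, p_image: float, prior: float = 0.5) -> float:
    """Combine two class posteriors with Bayes' rule (illustrative sketch).

    Assumes text and image evidence are conditionally independent given the
    class c, so that P(c | text, image) is proportional to
    P(c | text) * P(c | image) / P(c).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    for p in (p_text, p_image):
        if not 0.0 <= p <= 1.0:
            raise ValueError("posteriors must lie in [0, 1]")

    # Unnormalized scores for the positive class and its complement.
    positive = p_text * p_image / prior
    negative = (1.0 - p_text) * (1.0 - p_image) / (1.0 - prior)

    total = positive + negative
    if total == 0.0:
        raise ValueError("the two posteriors are fully contradictory")
    return positive / total


if __name__ == "__main__":
    # Hypothetical outputs of a text classifier and an image classifier.
    print(fuse_posteriors(p_text=0.8, p_image=0.7))
```

    When the two classifiers agree, the fused probability is more confident than either input; when one of them is uninformative (0.5 under a uniform prior), the result equals the other classifier's output.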

  2. Qualitative Features of Written Summary Texts Produced by Teachers

    Directory of Open Access Journals (Sweden)

    Hülya YAZICI OKUYAN

    2011-12-01

    Full Text Available This research aimed to find an answer to the question: “Do summary texts produced by teachers have the characteristics that a summary text is supposed to have?” A descriptive method was used in the research. The study group consisted of 55 teachers who work as Turkish Language and Literature teachers at central primary and secondary schools in Burdur. During the research, the essay “Kitap Az Yaşamayı Önler” by Çetin Altan was used as the source text and the summary texts produced by teachers were evaluated using a criteria-based and gradual analysis instrument. At the end of the study, it was determined that the teachers only managed to reach the sufficient level in terms of reconstructing the summary texts through authentic sentences and reflecting the main idea of the source text in the summary texts. However, according to the research results regarding the teachers’ competence in creating a new title for the summary texts, including all of the source text’s supporting ideas and important information in the summary texts and providing the summary texts with the capacity of reflecting the source text, it has been observed that the teachers lack the required knowledge and skill

  3. Comprehension challenges in the fourth grade: The roles of text cohesion, text genre, and readers’ prior knowledge

    Directory of Open Access Journals (Sweden)

    Danielle S. McNamara

    2011-07-01

    Full Text Available We examined young readers’ comprehension as a function of text genre (narrative, science), text cohesion (high, low), and readers’ abilities (reading decoding skills and world knowledge). The overarching purpose of this study was to contribute to our understanding of the fourth grade slump. Children in grade 4 read four texts, including one high and one low cohesion text from each genre. Comprehension of each text was assessed with 12 multiple-choice questions and free and cued recall. Comprehension was enhanced by increased knowledge: high knowledge readers showed better comprehension than low knowledge readers and narratives were comprehended better than science texts. Interactions between readers’ knowledge levels and text characteristics indicated that the children showed larger effects of knowledge for science than for narrative texts, and those with more knowledge better understood the low cohesion, narrative texts, showing a reverse cohesion effect. Decoding skill benefited comprehension, but effects of text genre and cohesion depended less on decoding skill than prior knowledge. Overall, the study indicates that the fourth grade slump is at least partially attributable to the emergence of complex dependencies between the nature of the text and the reader’s prior knowledge. The results also suggested that simply adding cohesion cues, and not explanatory information, is not likely to be sufficient for young readers as an approach to improving comprehension of challenging texts.

  4. Comprehension challenges in the fourth grade: The roles of text cohesion, text genre, and readers’ prior knowledge

    Directory of Open Access Journals (Sweden)

    Danielle S. McNAMARA

    2011-11-01

    Full Text Available We examined young readers’ comprehension as a function of text genre (narrative, science), text cohesion (high, low), and readers’ abilities (reading decoding skills and world knowledge). The overarching purpose of this study was to contribute to our understanding of the fourth grade slump. Children in grade 4 read four texts, including one high and one low cohesion text from each genre. Comprehension of each text was assessed with 12 multiple-choice questions and free and cued recall. Comprehension was enhanced by increased knowledge: high knowledge readers showed better comprehension than low knowledge readers and narratives were comprehended better than science texts. Interactions between readers’ knowledge levels and text characteristics indicated that the children showed larger effects of knowledge for science than for narrative texts, and those with more knowledge better understood the low cohesion, narrative texts, showing a reverse cohesion effect. Decoding skill benefited comprehension, but effects of text genre and cohesion depended less on decoding skill than prior knowledge. Overall, the study indicates that the fourth grade slump is at least partially attributable to the emergence of complex dependencies between the nature of the text and the reader’s prior knowledge. The results also suggested that simply adding cohesion cues, and not explanatory information, is not likely to be sufficient for young readers as an approach to improving comprehension of challenging texts.

  5. Experiments on Supervised Learning Algorithms for Text Categorization

    Science.gov (United States)

    Namburu, Setu Madhavi; Tu, Haiying; Luo, Jianhui; Pattipati, Krishna R.

    2005-01-01

    Modern information society is facing the challenge of handling massive volumes of online documents, news, intelligence reports, and so on. How to use the information accurately and in a timely manner becomes a major concern in many areas. While the general information may also include images and voice, we focus on the categorization of text data in this paper. We provide a brief overview of the information processing flow for text categorization, and discuss two supervised learning algorithms, viz., support vector machines (SVM) and partial least squares (PLS), which have been successfully applied in other domains, e.g., fault diagnosis [9]. While SVM has been well explored for binary classification and was reported as an efficient algorithm for text categorization, PLS has not yet been applied to text categorization. Our experiments are conducted on three data sets: the Reuters-21578 dataset about corporate mergers and data acquisitions (ACQ), WebKB and the 20-Newsgroups. Results show that the performance of PLS is comparable to SVM in text categorization. A major drawback of SVM for multi-class categorization is that it requires a voting scheme based on the results of pair-wise classification. PLS does not have this drawback and could be a better candidate for multi-class text categorization.
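
    The pair-wise voting scheme mentioned at the end of this abstract can be sketched in Python as follows. This is only an illustration of one-vs-one voting in general, not the authors' setup: the binary classifiers are stand-in callables (in practice each would be a trained binary SVM), and the names `one_vs_one_predict` and `pairwise`, the class labels, and the thresholds are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Callable, Dict, Sequence, Tuple

# A binary classifier for the pair (a, b) takes a feature vector and
# returns either a or b. These are placeholders for trained binary SVMs.
BinaryClassifier = Callable[[Sequence[float]], str]


def one_vs_one_predict(
    x: Sequence[float],
    classes: Sequence[str],
    pairwise: Dict[Tuple[str, str], BinaryClassifier],
) -> str:
    """Predict a class by majority vote over all k*(k-1)/2 class pairs."""
    votes = Counter({c: 0 for c in classes})
    for a, b in combinations(classes, 2):
        winner = pairwise[(a, b)](x)
        if winner not in (a, b):
            raise ValueError(f"classifier for {(a, b)} returned {winner!r}")
        votes[winner] += 1
    # Ties are broken by the order of `classes` (max keeps the first maximum).
    return max(classes, key=lambda c: votes[c])


if __name__ == "__main__":
    classes = ["acq", "earn", "grain"]
    # Toy threshold rules on a single feature, purely for demonstration.
    pairwise = {
        ("acq", "earn"): lambda x: "acq" if x[0] < 1 else "earn",
        ("acq", "grain"): lambda x: "acq" if x[0] < 2 else "grain",
        ("earn", "grain"): lambda x: "earn" if x[0] < 3 else "grain",
    }
    for value in (0.5, 1.5, 3.5):
        print(value, one_vs_one_predict([value], classes, pairwise))
```

    The number of binary classifiers grows quadratically with the number of classes, which is the cost the abstract refers to.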

  6. Text Mining with R: A Tidy Approach

    CERN Document Server

    Silge, Julia

    2017-01-01

    Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you'll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You'll learn how tidytext and other tidy tools in R can make text analysis easier and more effective. The authors demonstrate how treating text as data frames enables you to manipulate, summarize, and visualize characteristics of text. You'll also learn how to integrate natural language processing (NLP) into effective workflows. Practical code examples and data explorations will help you generate real insights from literature, news, and social media. Learn how to apply the tidy text format to NLP Use sentiment analysis to mine the emotional content of text Identify a document's most important terms with frequency measurements E...

  7. The nuclear modification of charged particles in Pb-Pb at $\sqrt{s_\text{NN}} = 5.02\,\text{TeV}$ measured with ALICE

    CERN Document Server

    Gronefeld, Julius

    2016-09-21

    The study of inclusive charged-particle production in heavy-ion collisions provides insights into the density of the medium and the energy-loss mechanisms. The observed suppression of high-$p_\text{T}$ yield is generally attributed to energy loss of partons as they propagate through a deconfined state of quarks and gluons - Quark-Gluon Plasma (QGP) - predicted by QCD. Such measurements allow the characterization of the QGP by comparison with models. In these proceedings, results on high-$p_\text{T}$ particle production measured by ALICE in Pb-Pb collisions at $\sqrt{s_\text{NN}} = 5.02\,\text{TeV}$ as well as in pp at $\sqrt{s} = 5.02\,\text{TeV}$ are presented for the first time. The nuclear modification factors ($R_\text{AA}$) in Pb-Pb collisions are presented and compared with model calculations.

  8. Adaptive Text Entry for Mobile Devices

    DEFF Research Database (Denmark)

    Proschowsky, Morten Smidt

    The reduced size of many mobile devices makes it difficult to enter text with them. The text entry methods are often slow or complicated to use. This affects the performance and user experience of all applications and services on the device. This work introduces new easy-to-use text entry methods for mobile devices and a framework for adaptive context-aware language models. Based on analysis of current text entry methods, the requirements for the new text entry methods are established. Transparent User guided Prediction (TUP) is a text entry method for devices with one dimensional touch input. It can be touch sensitive wheels, sliders or similar input devices. The interaction design of TUP is done with a combination of high level task models and low level models of human motor behaviour. Three prototypes of TUP are designed and evaluated by more than 30 users. Observations from the evaluations are used...

  9. CERCLIS (Superfund) ASCII Text Format - CPAD Database

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Comprehensive Environmental Response, Compensation and Liability Information System (CERCLIS) (Superfund) Public Access Database (CPAD) contains a selected set...

  10. Planning Multisentential English Text Using Communicative Acts

    Science.gov (United States)

    1990-12-01

    Composition, Vol. XI in series Advances in Discourse Processing, Alex Publishing Corporation. de Joia, A. and Stenton, A. 1980. Terms in Linguistics: A Guide to...investigate how attentional constraints relate to text planning and linguistic realization. Subject terms: Natural Language Generation...surface form? 4. What is the relation of communicative intentions to text structure and surface form? 5. What effects can texts be designed to have

  11. TXT@WORK: pediatric hospitalists and text messaging.

    Science.gov (United States)

    Kuhlmann, Stephanie; Ahlers-Schmidt, Carolyn R; Steinberger, Erik

    2014-07-01

    Many studies assess provider-patient communication through text messaging; however, minimal research has addressed communication among physicians. The purpose of this study was to evaluate the use of text messaging by pediatric hospitalists. A brief, anonymous, electronic survey was distributed through the American Academy of Pediatrics Section on Hospital Medicine Listserv in February 2012. Survey questions assessed work-related text messaging. Of the 106 pediatric hospitalist respondents, 97 met inclusion criteria. Most were female (73%) and had been in practice text messages, some (12%) more than 10 times per shift. More than half (53%) received work-related text messages when not at work. When asked to identify all potential work recipients, most often sent work-related text messages to other pediatric hospitalists (64%), fellows or resident physicians (37%), and subspecialists/consulting physicians (28%). When asked their preferred mode for brief communication, respondents' preferences varied. Many (46%) respondents worried privacy laws can be violated by sending/receiving text messages, and some (30%) reported having received protected health information (PHI) through text messages. However, only 11% reported their institution offered encryption software for text messaging. Physicians were using text messaging as a means of brief, work-related communication. Concerns arose regarding transfer of PHI using unsecure systems and work-life balance. Future research should examine accuracy and effectiveness of text message communication in the hospital, as well as patient privacy issues.

  12. Text Analytics: the convergence of Big Data and Artificial Intelligence

    Directory of Open Access Journals (Sweden)

    Antonio Moreno

    2016-03-01

    Full Text Available The analysis of the text content in emails, blogs, tweets, forums and other forms of textual communication constitutes what we call text analytics. Text analytics is applicable to most industries: it can help analyze millions of emails; you can analyze customers’ comments and questions in forums; you can perform sentiment analysis using text analytics by measuring positive or negative perceptions of a company, brand, or product. Text Analytics has also been called text mining, and is a subcategory of the Natural Language Processing (NLP) field, which is one of the founding branches of Artificial Intelligence, back in the 1950s, when an interest in understanding text originally developed. Currently Text Analytics is often considered as the next step in Big Data analysis. Text Analytics has a number of subdivisions: Information Extraction, Named Entity Recognition, Semantic Web annotated domain’s representation, and many more. Several techniques are currently used and some of them have gained a lot of attention, such as Machine Learning, to show a semisupervised enhancement of systems, but they also present a number of limitations which make them not always the only or the best choice. We conclude with current and near future applications of Text Analytics.

  13. Science and Technology Text Mining Basic Concepts

    National Research Council Canada - National Science Library

    Losiewicz, Paul

    2003-01-01

    ...). It then presents some of the most widely used data and text mining techniques, including clustering and classification methods, such as nearest neighbor, relational learning models, and genetic...

  14. Using Unlabeled Data to Improve Text Classification

    National Research Council Canada - National Science Library

    Nigam, Kamal P

    2001-01-01

    .... This dissertation demonstrates that supervised learning algorithms that use a small number of labeled examples and many inexpensive unlabeled examples can create high-accuracy text classifiers...

  15. Arabic Text Categorization Using Improved k-Nearest neighbour Algorithm

    Directory of Open Access Journals (Sweden)

    Wail Hamood KHALED

    2014-10-01

    Full Text Available The quantity of text information published in the Arabic language on the net requires the implementation of effective techniques for the extraction and classification of relevant information contained in large corpora of texts. In this paper we present an implementation of an enhanced k-NN Arabic text classifier. We apply the traditional k-NN and Naive Bayes from the Weka Toolkit for comparison purposes. Our proposed modified k-NN algorithm features an improved decision rule to skip the classes that are less similar and identify the right class from the k nearest neighbours, which increases the accuracy. The study evaluates the improved decision rule technique using the standard of recall, precision and f-measure as the basis of comparison. We conclude that the effectiveness of the proposed classifier is promising and that it outperforms the classical k-NN classifier.
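
    As a point of reference for this abstract, the following Python sketch shows the classical k-NN baseline with a plain majority vote, together with the precision, recall and F-measure used for comparison. It does not implement the paper's improved decision rule, which the abstract does not specify. The cosine similarity over bag-of-words counts, the function names, and the toy English tokens standing in for Arabic terms are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(
    doc: Sequence[str],
    train: List[Tuple[Sequence[str], str]],
    k: int = 3,
) -> str:
    """Classical k-NN: majority vote among the k most similar training docs."""
    query = Counter(doc)
    ranked = sorted(
        train, key=lambda item: cosine(query, Counter(item[0])), reverse=True
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def precision_recall_f1(
    gold: Sequence[str], pred: Sequence[str], label: str
) -> Tuple[float, float, float]:
    """Per-class precision, recall and F-measure."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy tokenized documents; real input would be preprocessed Arabic text.
    train = [
        (["goal", "match", "team"], "sport"),
        (["team", "league", "goal"], "sport"),
        (["market", "stock", "price"], "economy"),
        (["bank", "price", "market"], "economy"),
    ]
    print(knn_predict(["goal", "team", "win"], train, k=3))
    print(knn_predict(["stock", "market"], train, k=3))
```

    An improved decision rule of the kind the abstract describes would replace the plain majority vote inside `knn_predict`; the evaluation function would stay the same.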

  16. Mining biological networks from full-text articles.

    Science.gov (United States)

    Czarnecki, Jan; Shepherd, Adrian J

    2014-01-01

    The study of biological networks is playing an increasingly important role in the life sciences. Many different kinds of biological system can be modelled as networks; perhaps the most important examples are protein-protein interaction (PPI) networks, metabolic pathways, gene regulatory networks, and signalling networks. Although much useful information is easily accessible in public databases, a lot of extra relevant data lies scattered in numerous published papers. Hence there is a pressing need for automated text-mining methods capable of extracting such information from full-text articles. Here we present practical guidelines for constructing a text-mining pipeline from existing code and software components capable of extracting PPI networks from full-text articles. This approach can be adapted to tackle other types of biological network.

  17. Relating interesting quantitative time series patterns with text events and text features

    Science.gov (United States)

    Wanner, Franz; Schreck, Tobias; Jentner, Wolfgang; Sharalieva, Lyubka; Keim, Daniel A.

    2013-12-01

    In many application areas, the key to successful data analysis is the integrated analysis of heterogeneous data. One example is the financial domain, where time-dependent and highly frequent quantitative data (e.g., trading volume and price information) and textual data (e.g., economic and political news reports) need to be considered jointly. Data analysis tools need to support an integrated analysis, which allows studying the relationships between textual news documents and quantitative properties of the stock market price series. In this paper, we describe a workflow and tool that allows a flexible formation of hypotheses about text features and their combinations, which reflect quantitative phenomena observed in stock data. To support such an analysis, we combine the analysis steps of frequent quantitative and text-oriented data using an existing a-priori method. First, based on heuristics we extract interesting intervals and patterns in large time series data. The visual analysis supports the analyst in exploring parameter combinations and their results. The identified time series patterns are then input for the second analysis step, in which all identified intervals of interest are analyzed for frequent patterns co-occurring with financial news. An a-priori method supports the discovery of such sequential temporal patterns. Then, various text features like the degree of sentence nesting, noun phrase complexity, the vocabulary richness, etc. are extracted from the news to obtain meta patterns. Meta patterns are defined by a specific combination of text features which significantly differ from the text features of the remaining news data. Our approach combines a portfolio of visualization and analysis techniques, including time-, cluster- and sequence visualization and analysis functionality. We provide two case studies, showing the effectiveness of our combined quantitative and textual analysis work flow. The workflow can also be generalized to other

  18. Arabic text preprocessing for the natural language processing applications

    International Nuclear Information System (INIS)

    Awajan, A.

    2007-01-01

    A new approach for processing vowelized and unvowelized Arabic texts in order to prepare them for Natural Language Processing (NLP) purposes is described. The developed approach is rule-based and made up of four phases: text tokenization, word light stemming, word's morphological analysis and text annotation. The first phase preprocesses the input text in order to isolate the words and represent them in a formal way. The second phase applies a light stemmer in order to extract the stem of each word by eliminating the prefixes and suffixes. The third phase is a rule-based morphological analyzer that determines the root and the morphological pattern for each extracted stem. The last phase produces an annotated text where each word is tagged with its morphological attributes. The preprocessor presented in this paper is capable of dealing with vowelized and unvowelized words, and provides the input words along with relevant linguistic information needed by different applications. It is designed to be used with different NLP applications such as machine translation, text summarization, text correction, information retrieval and automatic vowelization of Arabic text. (author)

  19. IDENTITY CLAIMS, TEXTS, ROME AND GALATIANS

    African Journals Online (AJOL)

    inform the identity claimed and negotiated by people and groups. When ..... 24 To some extent, going against the grain of Bourdieu's notion that “what exist in the social world are .... based on words and information that create reality” (Lampe 1995:940, emphasis ..... Jesus, the Early Church and the Roman superpower.

  20. Information Need and Information Seeking Behaviour

    Directory of Open Access Journals (Sweden)

    Nazan Özenç Uçak

    1997-12-01

    Full Text Available Information need is one of the cognitive needs of humankind. Information need causes information seeking behaviour and these concepts complement each other. Information need and information seeking behaviour are affected by many factors. It is necessary to know these factors in establishing effective information centers and services.

  1. Classifying Written Texts Through Rhythmic Features

    NARCIS (Netherlands)

    Balint, Mihaela; Dascalu, Mihai; Trausan-Matu, Stefan

    2016-01-01

    Rhythm analysis of written texts focuses on literary analysis and it mainly considers poetry. In this paper we investigate the relevance of rhythmic features for categorizing texts in prosaic form pertaining to different genres. Our contribution is threefold. First, we define a set of rhythmic

  2. Text comprehension strategy instruction with poor readers

    NARCIS (Netherlands)

    Van den Bos, K.P.; Aarnoudse, C.C.; Brand-Gruwel, S.

    1998-01-01

    The goal of this study was to investigate the effects of teaching text comprehension strategies to children with decoding and reading comprehension problems and with a poor or normal listening ability. Two experiments are reported. Four text comprehension strategies, viz., question generation,

  3. Text Manipulation Techniques and Foreign Language Composition.

    Science.gov (United States)

    Walker, Ronald W.

    1982-01-01

    Discusses an approach to teaching second language composition which emphasizes (1) careful analysis of model texts from a limited, but well-defined perspective and (2) the application of text manipulation techniques developed by the word processing industry to student compositions. (EKN)

  4. Teachers' Texts in Culturally Responsive Teaching

    Science.gov (United States)

    Kesler, Ted

    2011-01-01

    In this paper, the author shares three teaching stories that demonstrate the social, cultural, political, and historical factors of all texts in specific interpretive communities. The author shows how the texts that comprised his curriculum constructed particular subject positions that inevitably included some students but marginalized and…

  5. Readability Revisited? The Implications of Text Complexity

    Science.gov (United States)

    Wray, David; Janan, Dahlia

    2013-01-01

    The concept of readability has had a variable history, moving from a position where it was considered as a very important topic for those responsible for producing texts and matching those texts to the abilities and needs of learners, to its current declining visibility in the education literature. Some important work has been coming from the USA…

  6. Tipster Text Phase 2 Architecture Design

    Science.gov (United States)

    1996-06-19

    TIPSTER Text Phase II Architecture Design Version 2.1p 19 June 1996 Ralph Grishman New York University grishman@cs.nyu.edu and the TIPSTER...

  7. Using Digital Texts to Promote Fluent Reading

    Science.gov (United States)

    Thoermer, Andrea; Williams, Lunetta

    2012-01-01

    Fluency is a critical skill of adept readers. As listening to read alouds and performing Readers Theatre scripts are two prevalent strategies that can increase students' fluency skills, this article provides suggestions in using these strategies with digital texts through free, online resources. Digital texts can be accessed using a desktop,…

  8. Interest, Inferences, and Learning from Texts

    Science.gov (United States)

    Clinton, Virginia; van den Broek, Paul

    2012-01-01

    Topic interest and learning from texts have been found to be positively associated with each other. However, the reason for this positive association is not well understood. The purpose of this study is to examine a cognitive process, inference generation, that could explain the positive association between interest and learning from texts. In…

  9. Text Fabric: What, How, and Why

    NARCIS (Netherlands)

    Erwich, C.M.; Kingham, Cody

    Text-Fabric (TF) is a promising new framework for the Eep Talstra Center for Bible and Computer corpus plus (linguistic) annotations. TF is a Python 3.x software package that provides scientific, accessible and reproducible ways of processing Biblical Hebrew text data. It also allows sharing the

  10. An Intelligent System For Arabic Text Categorization

    NARCIS (Netherlands)

    Syiam, M.M.; Tolba, Mohamed F.; Fayed, Z.T.; Abdel-Wahab, Mohamed S.; Ghoniemy, Said A.; Habib, Mena Badieh

    Text Categorization (classification) is the process of classifying documents into a predefined set of categories based on their content. In this paper, an intelligent Arabic text categorization system is presented. Machine learning algorithms are used in this system. Many algorithms for stemming and

  11. Flexible frontiers for text division into rows

    Directory of Open Access Journals (Sweden)

    Dan L. Lacrămă

    2009-01-01

    This paper presents an original solution for flexibly dividing handwritten text into rows. Unlike the standard procedure, the proposed method avoids cutting off the extensions of isolated characters and reduces the recognition error rate in the final stage.

  12. Undergraduates' Text Messaging Language and Literacy Skills

    Science.gov (United States)

    Grace, Abbie; Kemp, Nenagh; Martin, Frances Heritage; Parrila, Rauno

    2014-01-01

    Research investigating whether people's literacy skill is being affected by the use of text messaging language has produced largely positive results for children, but mixed results for adults. We asked 150 undergraduate university students in Western Canada and 86 in South Eastern Australia to supply naturalistic text messages and to complete…

  13. Language Skills in Classical Chinese Text Comprehension

    Science.gov (United States)

    Lau, Kit-ling

    2018-01-01

    This study used both quantitative and qualitative methods to explore the role of lower- and higher-level language skills in classical Chinese (CC) text comprehension. A CC word and sentence translation test, text comprehension test, and questionnaire were administered to 393 Secondary Four students; and 12 of these were randomly selected to…

  14. Text Structure and Retention of Prose.

    Science.gov (United States)

    Zimmer, John W.

    1985-01-01

    The effects of text structure were studied using two kinds of reading materials: a standard text with headings and illustrations, as well as a nonstructured manuscript. The manuscript readers scored higher on delayed tests, generated more relevant ideas, and wrote better essays both immediately and after a delay. (Author/GDC)

  15. Application of LSP Texts in Translator Training

    Science.gov (United States)

    Ilynska, Larisa; Smirnova, Tatjana; Platonova, Marina

    2017-01-01

    The paper presents discussion of the results of extensive empirical research into efficient methods of educating and training translators of LSP (language for special purposes) texts. The methodology is based on using popular LSP texts in the respective fields as one of the main media for translator training. The aim of the paper is to investigate…

  16. Modeling text with generalizable Gaussian mixtures

    DEFF Research Database (Denmark)

    Hansen, Lars Kai; Sigurdsson, Sigurdur; Kolenda, Thomas

    2000-01-01

    We apply and discuss generalizable Gaussian mixture (GGM) models for text mining. The model automatically adapts model complexity for a given text representation. We show that the generalizability of these models depends on the dimensionality of the representation and the sample size. We discuss...

  17. Text mining for the biocuration workflow.

    Science.gov (United States)

    Hirschman, Lynette; Burns, Gully A P C; Krallinger, Martin; Arighi, Cecilia; Cohen, K Bretonnel; Valencia, Alfonso; Wu, Cathy H; Chatr-Aryamontri, Andrew; Dowell, Karen G; Huala, Eva; Lourenço, Anália; Nash, Robert; Veuthey, Anne-Lise; Wiegers, Thomas; Winter, Andrew G

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on 'Text Mining for the BioCuration Workflow' at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed many variabilities. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provides a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community.

  18. Texts, Transmissions, Receptions. Modern Approaches to Narratives

    NARCIS (Netherlands)

    Lardinois, A.P.M.H.; Levie, S.A.; Hoeken, H.; Lüthy, C.H.

    2015-01-01

    The papers collected in this volume study the function and meaning of narrative texts from a variety of perspectives. The word 'text' is used here in the broadest sense of the term: it denotes literary books, but also oral tales, speeches, newspaper articles and comics. One of the purposes of this

  19. A text in Romani from 1622

    DEFF Research Database (Denmark)

    Bakker, Peter

    2015-01-01

    This is a reprint of a 2012 article: A new old text in Romani: Lord's Prayer, 1622. International Journal of Romani Language and Culture 2 (2011): 193-212.

  20. Where Full-Text Is Viable.

    Science.gov (United States)

    Cotton, P. L.

    1987-01-01

    Defines two types of online databases: source, referring to those intended to be complete in themselves, whether full-text or abstracts; and bibliographic, meaning those that are not complete. Predictions are made about the future growth rate of these two types of databases, as well as full-text versus abstract databases. (EM)

  1. Ontology Assisted Formal Specification Extraction from Text

    Directory of Open Access Journals (Sweden)

    Andreea Mihis

    2010-12-01

    In the field of knowledge processing, ontologies are the most important means. They make it possible for the computer to better understand natural language and to make judgments. This paper proposes a method that uses ontologies in the semi-automatic extraction of formal specifications from natural language text.

  2. Learning from text benefits from enactment.

    Science.gov (United States)

    Cutica, Ilaria; Ianì, Francesco; Bucciarelli, Monica

    2014-10-01

    Classical studies on enactment have highlighted the beneficial effects of gestures performed in the encoding phase on memory for words and sentences, for both adults and children. In the present investigation, we focused on the role of enactment for learning from scientific texts among primary-school children. We assumed that enactment would favor the construction of a mental model of the text, and we verified the derived predictions that gestures at the time of encoding would result in greater numbers of correct recollections and discourse-based inferences at recall, as compared to no gestures (Exp. 1), and in a bias to confound paraphrases of the original text with the verbatim text in a recognition test (Exp. 2). The predictions were confirmed; hence, we argue in favor of a theoretical framework that accounts for the beneficial effects of enactment on memory for texts.

  3. Text2Floss: the feasibility and acceptability of a text messaging intervention to improve oral health behavior and knowledge.

    Science.gov (United States)

    Hashemian, Tony S; Kritz-Silverstein, Donna; Baker, Ryan

    2015-01-01

    Text messaging is useful for promoting numerous health-related behaviors. The Text2Floss Study examines the feasibility and utility of a 7-day text messaging intervention to improve oral health knowledge and behavior in mothers of young children. Mothers were recruited from a private practice and a community clinic. Of 156 mothers enrolled, 129 randomized into text (n = 60) and control groups (n = 69) completed the trial. Participants in the text group received text messages for 7 days, asking about flossing and presenting oral health information. Oral health behaviors and knowledge were surveyed pre- and post-intervention. At baseline, there were no differences between text and control group mothers in knowledge and behaviors (P > 0.10). Post-intervention, text group mothers flossed more (P = 0.01), had higher total (P = 0.0006) and specific (P... Text messages were accepted and perceived as useful. Mothers receiving text messages improved their own oral health behaviors and knowledge as well as their behaviors regarding their children's oral health. Text messaging represents a viable method to improve oral health behaviors and knowledge. Its high acceptance may make it useful for preventing oral disease. © 2014 American Association of Public Health Dentistry.

  4. Gaze patterns reveal how texts are remembered: A mental model of what was described is favored over the text itself

    DEFF Research Database (Denmark)

    Traub, Franziska; Johansson, Roger; Holmqvist, Kenneth

    or incongruent with the spatial layout of the text itself. 28 participants read and recalled three texts: (1) a scene description congruent with the spatial layout of the text; (2) a scene description incongruent with the spatial layout of the text; and (3) a control text without any spatial scene content....... Recollection was performed orally while gazing at a blank screen. Results demonstrate that participants' gaze patterns during recall more closely reflect the spatial layout of the scene than the physical locations of the text. We conclude that participants formed a mental model that represents the content...... of what was described, i.e., visuospatial information of the scene, which then guided the retrieval process. During their retellings, they moved the eyes across the blank screen as if they saw the scene in front of them. Whereas previous studies on the involvement of eye movements in mental imagery tasks...

  5. LITURGICAL TEXT IN RUSSIAN LITERATURE. PROBLEM STATEMENT

    Directory of Open Access Journals (Sweden)

    Avetis Serezhaevich Seropyan

    2012-11-01

    The article analyses artistic expressions of liturgical language in the literary text and its interaction with the Holy Tradition. Many Russian authors knew the liturgical text well. Studying it reveals the crucial meaning of the Gospel and liturgical texts (as part of the Holy Tradition) for Russian literature. Authors saw the essence of every phenomenon in the word for it, and the nature of God in His name. Some ideas and sayings of the authors and their characters find their sources in liturgical texts. The article focuses on liturgical sources of some characters' commemorations and invocations, as well as poetical topics of the symbolists, Dostoevsky's famous dictum on beauty which will save the world (The Idiot), etc. Deciphering this liturgical code will help us learn and comprehend the hidden endless meaning of a literary text. The specific feature of Russian literature is its pursuit of the spiritual liturgical exploration of the world, an exploration in which truth takes shape and thus becomes real in both literary text and history.

  6. Application of LSP texts in translator training

    Directory of Open Access Journals (Sweden)

    Larisa Ilynska

    2017-06-01

    The paper presents discussion of the results of extensive empirical research into efficient methods of educating and training translators of LSP (language for special purposes) texts. The methodology is based on using popular LSP texts in the respective fields as one of the main media for translator training. The aim of the paper is to investigate the efficiency of this methodology in developing thematic, linguistic and cultural competences of the students, following Bloom’s revised taxonomy and European Master in Translation Network (EMT) translator training competences. The methodology has been tested on the students of a professional Master study programme called Technical Translation implemented by the Institute of Applied Linguistics, Riga Technical University, Latvia. The group of students included representatives of different nationalities, translating from English into Latvian, Russian and French. Analysis of popular LSP texts provides an opportunity to structure student background knowledge and expand it to account for linguistic innovation. Application of popular LSP texts instead of purely technical or scientific texts characterised by neutral style and rigid genre conventions provides an opportunity for student translators to develop advanced text processing and decoding skills, to develop awareness of expressive resources of the source and target languages and to develop understanding of socio-pragmatic language use.

  7. Informational Urbanism

    Directory of Open Access Journals (Sweden)

    Wolfgang G. Stock

    2015-10-01

    Contemporary and future cities are often labeled as "smart cities," "ubiquitous cities," "knowledge cities" and "creative cities." Informational urbanism includes all aspects of information and knowledge with regard to urban regions. "Informational city" is an umbrella term uniting the divergent trends of information-related city research. Informational urbanism is an interdisciplinary endeavor incorporating on the one side computer science and information science and on the other side urbanism, architecture, (city) economics, and (city) sociology. In our research project on informational cities, we visited more than 40 metropolises and smaller towns all over the world. In this paper, we sketch the theoretical background on a journey from Max Weber to the Internet of Things, introduce our research methods, and describe main results on characteristics of informational cities as prototypical cities of the emerging knowledge society.

  8. Populating the Semantic Web by Macro-reading Internet Text

    Science.gov (United States)

    Mitchell, Tom M.; Betteridge, Justin; Carlson, Andrew; Hruschka, Estevam; Wang, Richard

    A key question regarding the future of the semantic web is "how will we acquire structured information to populate the semantic web on a vast scale?" One approach is to enter this information manually. A second approach is to take advantage of pre-existing databases, and to develop common ontologies, publishing standards, and reward systems to make this data widely accessible. We consider here a third approach: developing software that automatically extracts structured information from unstructured text present on the web. We also describe preliminary results demonstrating that machine learning algorithms can learn to extract tens of thousands of facts to populate a diverse ontology, with imperfect but reasonably good accuracy.

  9. UNDERSTANDING TENOR IN SPOKEN TEXTS IN YEAR XII ENGLISH TEXTBOOK TO IMPROVE THE APPROPRIACY OF THE TEXTS

    Directory of Open Access Journals (Sweden)

    Noeris Meristiani

    2011-07-01

    The goal of English Language Teaching is communicative competence. To reach this goal, students should be supplied with good model texts. These texts should consider the appropriacy of language use. By analyzing the context of situation with a focus on tenor, the meanings constructed to build the relationships among the interactants in spoken texts can be unfolded. This study aims at investigating the interpersonal relations (tenor) of the interactants in the conversation texts as well as the appropriacy of their realization in the given contexts. The study was conducted under discourse analysis by applying a descriptive qualitative method. There were eight conversation texts which function as examples in five chapters of a textbook. The data were analyzed by using lexicogrammatical analysis, described, and interpreted contextually. Then, the realization of the tenor of the texts was further analyzed in terms of appropriacy to suggest improvement. The results of the study show that the tenor indicates relationships between friend-friend, student-student, questioners-respondents, mother-son, and teacher-student; the power is equal and unequal; the social distances show frequent contact, relatively frequent contact, relatively low contact, high and low affective involvement, using informal, relatively informal, relatively formal, and formal language. There are also some indications of inappropriacy of tenor realization in all texts. It should be improved in the use of degree of formality, the realization of societal roles, status, and affective involvement. Keywords: context of situation, tenor, appropriacy.

  10. Text-Filled Stacked Area Graphs

    DEFF Research Database (Denmark)

    Kraus, Martin

    2011-01-01

    -filled stacked area graphs; i.e., graphs that feature stacked areas that are filled with small-typed text. Since these graphs allow for computing the text layout automatically, it is possible to include large amounts of textual detail with very little effort. We discuss the most important challenges and some...... solutions for the design of text-filled stacked area graphs with the help of an exemplary visualization of the genres, publication years, and titles of a database of several thousand PC games....

  11. NOTICING AND TEXT-BASED CHAT

    Directory of Open Access Journals (Sweden)

    Chun Lai

    2006-09-01

    This study examined the capacity of text-based online chat to promote learners’ noticing of their problematic language productions and of the interactional feedback from their interlocutors. In this study, twelve ESL learners formed six mixed-proficiency dyads. The same dyads worked on two spot-the-difference tasks, one via online chat and the other through face-to-face conversation. Stimulated recall sessions were held subsequently to identify instances of noticing. It was found that text-based online chat promotes noticing more than face-to-face conversations, especially in terms of learners’ noticing of their own linguistic mistakes.

  12. Text as occasion, dialogue as data, context as disturber

    DEFF Research Database (Denmark)

    Olesen, Birgitte Ravn; Pedersen, Chistina Hee

    context informs focus of analysis and explore the analytic possibilities entrenched in this move! The text in centre of the experiment was produced by a feminist activist in a memory-work session realised in Peru in 2009. The researcher, who facilitated the memory-work in Peru, brought the text to Denmark...... a deconstruction of constraining meaning-making processes and impulse critical dialogue both about meanings of the thematic of the text in ‘the context of the reading’ and about how to understand what is at stake when “researchers” and “practitioners” produce knowledge through dialogical and collaborative research....

  13. Researcher’s Perspective of Substitution Method on Text Steganography

    Science.gov (United States)

    Zamir Mansor, Fawwaz; Mustapha, Aida; Azah Samsudin, Noor

    2017-08-01

    Linguistic steganography studies are still at the stage of development and empowerment practices. This paper presents several substitution methods for text steganography from the researcher’s perspective; all the scholarly papers are analysed and compared. The objective of this paper is to give basic information on the substitution methods of text domain steganography that have been applied by previous researchers. The typical forms of this method are also identified in order to reveal the most effective method in text domain steganography. Finally, the general advantages and drawbacks of these techniques are also presented.

  14. Attitudes toward science: measurement and psychometric properties of the Test of Science-Related Attitudes for its use in Spanish-speaking classrooms

    Science.gov (United States)

    Navarro, Marianela; Förster, Carla; González, Caterina; González-Pose, Paulina

    2016-06-01

    Understanding attitudes toward science and measuring them remain two major challenges for science teaching. This article reviews the concept of attitudes toward science and their measurement. It subsequently analyzes the psychometric properties of the Test of Science-Related Attitudes (TOSRA), such as its construct validity, its discriminant and concurrent validity, and its reliability. The evidence presented suggests that TOSRA, in its Spanish-adapted version, has adequate construct validity regarding its theoretical referents, as well as good indexes of reliability. In addition, it determines the attitudes toward science of secondary school students in Santiago de Chile (n = 664) and analyzes the sex variable as a differentiating factor in such attitudes. The analysis by sex revealed low-relevance gender difference. The results are contrasted with those obtained in English-speaking countries. This TOSRA sample showed good psychometric parameters for measuring and evaluating attitudes toward science, which can be used in classrooms of Spanish-speaking countries or with immigrant populations with limited English proficiency.

  15. Text mining for traditional Chinese medical knowledge discovery: a survey.

    Science.gov (United States)

    Zhou, Xuezhong; Peng, Yonghong; Liu, Baoyan

    2010-08-01

    Extracting meaningful information and knowledge from free text is the subject of considerable research interest in the machine learning and data mining fields. Text data mining (or text mining) has become one of the most active research sub-fields in data mining. Significant developments in the area of biomedical text mining during the past years have demonstrated its great promise for supporting scientists in developing novel hypotheses and new knowledge from the biomedical literature. Traditional Chinese medicine (TCM) provides a distinct methodology with which to view human life. It is one of the most complete and distinguished traditional medicines with a history of several thousand years of studying and practicing the diagnosis and treatment of human disease. It has been shown that the TCM knowledge obtained from clinical practice has become a significant complementary source of information for modern biomedical sciences. TCM literature obtained from the historical period and from modern clinical studies has recently been transformed into digital data in the form of relational databases or text documents, which provide an effective platform for information sharing and retrieval. This motivates and facilitates research and development into knowledge discovery approaches and to modernize TCM. In order to contribute to this still growing field, this paper presents (1) a comparative introduction to TCM and modern biomedicine, (2) a survey of the related information sources of TCM, (3) a review and discussion of the state of the art and the development of text mining techniques with applications to TCM, (4) a discussion of the research issues around TCM text mining and its future directions. Copyright 2010 Elsevier Inc. All rights reserved.

  16. Building Fluency through the Phrased Text Lesson

    Science.gov (United States)

    Rasinski, Timothy; Yildirim, Kasim; Nageldinger, James

    2012-01-01

    This Teaching Tip article explores the importance of phrasing while reading. It also presents an instructional intervention strategy for helping students develop greater proficiency in reading with phrases that reflect the meaning of the text.

  17. Punctuation effects in English and Esperanto texts

    Science.gov (United States)

    Ausloos, M.

    2010-07-01

    A statistical physics study of punctuation effects on sentence lengths is presented for written texts: Alice in Wonderland and Through the Looking Glass. The translation of the first text into Esperanto is also considered as a test for the role of punctuation in defining a style, and for contrasting natural and artificial, but written, languages. Several log-log plots of the sentence-length-rank relationship are presented for the major punctuation marks. Different power laws are observed with characteristic exponents. The exponent can take a value much less than unity (ca. 0.50 or 0.30) depending on how a sentence is defined. The texts are also mapped into time series based on the word frequencies. The quantitative differences between the original and translated texts are very minute, at the exponent level. It is argued that sentences seem to be more reliable than word distributions in discussing an author's style.
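
    The sentence-length-rank analysis described in this record can be illustrated with a minimal sketch. It is not the author's procedure: the function names, the whitespace word count, and the ordinary least-squares fit on the log-log values are all assumptions made for illustration. Changing the `marks` argument changes what counts as a sentence, which is the effect the abstract attributes to the choice of punctuation mark.

```python
import math
import re


def sentence_lengths(text, marks=".!?"):
    """Split text at the given punctuation marks; return words per piece.

    Hypothetical helper: words are counted by whitespace splitting only.
    """
    pattern = "[" + re.escape(marks) + "]+"
    pieces = re.split(pattern, text)
    return [len(p.split()) for p in pieces if p.split()]


def rank_exponent(lengths):
    """Fit length ~ C * rank**(-alpha) on log-log axes and return alpha.

    Lengths are sorted in decreasing order so that rank 1 is the longest
    sentence; the slope comes from a plain least-squares regression.
    """
    if len(lengths) < 2:
        raise ValueError("need at least two sentence lengths")
    ordered = sorted(lengths, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(length) for length in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx
```

    For example, `sentence_lengths(text, marks=",;")` treats commas and semicolons as sentence boundaries, and `rank_exponent` applied to the result gives the exponent for that definition of a sentence.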

  18. MORPHOLOGICAL STRATEGIES IN TEXT MESSAGING AMONG ...

    African Journals Online (AJOL)

    Text messaging is the application of abridged morphological forms in order ... the emergence of the Global System for Mobile Communication (GSM) in the world. ... Our thesis statement is that these morphological patterns as used in SMS are ...

  19. The Relationship between Paraphrasing and Text Analysis

    Directory of Open Access Journals (Sweden)

    María Luisa Cepeda Islas

    2013-04-01

    Given the importance of paraphrasing in the process of comprehension for college students, this study assessed the level of text analysis and paraphrasing in the responses of a sample of senior students of the psychology degree. We selected a group of freshmen in the Psychology course, who were asked to answer a questionnaire and to summarize an empirical article. The results showed that participants have a low level of text analysis and, at the same time, low levels of paraphrasing. Verbatim copying of the text predominated. Some possibilities are outlined for structuring a training workshop not only on paraphrasing but also on text analysis.

  20. Illustrations in Text: A Retentional Role.

    Science.gov (United States)

    Duchastel, Philippe C.

    1981-01-01

    Describes the results of a study of the retentional role of illustrations in a text and their effect on enhancing long-term memory with 15-year-old secondary school students. Seven references are listed. (CHC)

  1. Figures of thought mathematics and mathematical texts

    CERN Document Server

    Reed, David

    2003-01-01

    Examines the ways in which mathematical works can be read as texts, examines their textual strategies, and demonstrates that such readings provide a rich source of philosophical debate regarding mathematics.

  2. Strategies to Increase Accuracy in Text Classification

    NARCIS (Netherlands)

    D. Blommesteijn (Dennis)

    2014-01-01

    Text classification via supervised learning involves various steps, from processing raw data and feature extraction to training and validating classifiers. Within these steps implementation decisions are critical to the resulting classifier accuracy. This paper contains a report of the
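
    The steps named in this record (processing raw data, feature extraction, training, validation) can be sketched end to end with a small multinomial naive Bayes classifier. This is a generic illustration and not the method evaluated in the paper: the tokenizer, the add-one smoothing, and every function name below are assumptions.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Raw-data processing step: lowercase and split on whitespace."""
    return text.lower().split()


def train(docs):
    """Feature extraction and training on (text, label) pairs.

    Returns per-class document counts, per-class word counts, and the
    vocabulary, which together form a bag-of-words model.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest add-one-smoothed log probability."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for token in tokenize(text):
            if token in vocab:  # words unseen in training are ignored
                count = word_counts[label][token] + 1
                score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs):
    """Validation step: fraction of held-out documents labeled correctly."""
    return sum(predict(model, text) == label for text, label in docs) / len(docs)
```

    Each function corresponds to one step, so an implementation decision (for instance a different tokenizer) can be swapped in and its effect on `accuracy` measured on held-out documents.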

  3. Handwriting segmentation of unconstrained Oriya text

    Indian Academy of Sciences (India)

    Segmentation of handwritten text into lines, words and characters .... We now discuss here some terms relating to water reservoirs that will be used in feature ..... is found. Next, based on the touching position, reservoir base-area points, ...

  4. Statistical text classifier to detect specific type of medical incidents.

    Science.gov (United States)

    Wong, Zoie Shui-Yee; Akiyama, Masanori

    2013-01-01

    WHO Patient Safety has focused on increasing the coherence and expressiveness of patient safety classification with the foundation of the International Classification for Patient Safety (ICPS). Text classification and statistical approaches have been shown to be successful in identifying safety problems in the aviation industry using incident text information. It has been challenging to comprehend the taxonomy of medical incidents in a structured manner. Independent reporting mechanisms for patient safety incidents have been established in the UK, Canada, Australia, Japan, Hong Kong etc. This research demonstrates the potential to construct statistical text classifiers to detect specific types of medical incidents using incident text data. An illustrative example for classifying look-alike sound-alike (LASA) medication incidents using structured text from 227 advisories related to medication errors from Global Patient Safety Alerts (GPSA) is shown in this poster presentation. The classifier was built using a logistic regression model. The ROC curve and the AUC value indicated that this is a satisfactorily good model.
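
    The AUC value mentioned in this record summarizes how well classifier scores separate positive from negative cases. A minimal sketch of one standard way to compute it follows, using the pairwise-ranking definition (the probability that a randomly chosen positive scores above a randomly chosen negative, with ties counted as one half). The function name and the 0/1 label encoding are assumptions; the record does not say how its AUC was calculated.

```python
def auc(scores, labels):
    """Area under the ROC curve from scores and 0/1 labels.

    Uses the pairwise-ranking definition: the share of positive-negative
    pairs in which the positive has the higher score, ties counting 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

    A value of 1.0 means every positive case outranks every negative one, while 0.5 is what uninformative scores would give.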

  5. Text mining meets workflow: linking U-Compare with Taverna

    Science.gov (United States)

    Kano, Yoshinobu; Dobson, Paul; Nakanishi, Mio; Tsujii, Jun'ichi; Ananiadou, Sophia

    2010-01-01

    Summary: Text mining from the biomedical literature is of increasing importance, yet it is not easy for the bioinformatics community to create and run text mining workflows due to the lack of accessibility and interoperability of the text mining resources. The U-Compare system provides a wide range of bio text mining resources in a highly interoperable workflow environment where workflows can very easily be created, executed, evaluated and visualized without coding. We have linked U-Compare to Taverna, a generic workflow system, to expose text mining functionality to the bioinformatics community. Availability: http://u-compare.org/taverna.html, http://u-compare.org Contact: kano@is.s.u-tokyo.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20709690

  6. Text feature extraction based on deep learning: a review.

    Science.gov (United States)

    Liang, Hong; Sun, Xiao; Sun, Yunlei; Gao, Yuan

    2017-01-01

    Selecting text features is a basic and important matter for text mining and information retrieval. Traditional methods of feature extraction require handcrafted features. Hand-designing an effective feature is a lengthy process, but for new applications deep learning makes it possible to acquire new effective feature representations from training data. As a new feature extraction method, deep learning has made achievements in text mining. The major difference between deep learning and conventional methods is that deep learning automatically learns features from big data, instead of adopting handcrafted features, which depend mainly on the prior knowledge of designers and are highly unlikely to take advantage of big data. Deep learning can automatically learn feature representation from big data, including millions of parameters. This thesis first outlines the common methods used in text feature extraction, then expands on frequently used deep learning methods in text feature extraction and their applications, and forecasts the application of deep learning in feature extraction.

  7. Machine printed text and handwriting identification in noisy document images.

    Science.gov (United States)

    Zheng, Yefeng; Li, Huiping; Doermann, David

    2004-03-01

    In this paper, we address the problem of the identification of text in noisy document images. We are especially focused on segmenting and distinguishing between handwriting and machine printed text because: 1) handwriting in a document often indicates corrections, additions, or other supplemental information that should be treated differently from the main content and 2) the segmentation and recognition techniques required for machine printed and handwritten text are significantly different. A novel aspect of our approach is that we treat noise as a separate class and model noise based on selected features. Trained Fisher classifiers are used to identify machine printed text and handwriting from noise and we further exploit context to refine the classification. A Markov Random Field-based (MRF) approach is used to model the geometrical structure of the printed text, handwriting, and noise to rectify misclassifications. Experimental results show that our approach is robust and can significantly improve page segmentation in noisy document collections.

  8. Nigel: A Systemic Grammar for Text Generation.

    Science.gov (United States)

    1983-02-01

    presumed. Basic references on the systemic framework include [Berry 75, Berry 77, Halliday 76a, Halliday 76b, Hudson 76, Halliday 81, de Joia 80...Edinburgh, 1979. [de Joia 80] de Joia, A., and A. Stanton, Terms in Systemic Linguistics, Batsford Academic and Educational, Ltd., London, 1980. ...1 A Grammar for Text Generation--The Challenge ... 1 1.2 A Grammar for Text Generation--The Design

  9. Text document classification based on mixture models

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana; Malík, Antonín

    2004-01-01

    Vol. 40, No. 3 (2004), pp. 293-304 ISSN 0023-5954 R&D Projects: GA AV ČR IAA2075302; GA ČR GA102/03/0049; GA AV ČR KSK1019101 Institutional research plan: CEZ:AV0Z1075907 Keywords: text classification * text categorization * multinomial mixture model Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.224, year: 2004

  10. Preserved Network Metrics across Translated Texts

    Science.gov (United States)

    Cabatbat, Josephine Jill T.; Monsanto, Jica P.; Tapang, Giovanni A.

    2014-09-01

    Co-occurrence language networks based on Bible translations and the Universal Declaration of Human Rights (UDHR) translations in different languages were constructed and compared with random text networks. Among the considered network metrics, the network size, N, the normalized betweenness centrality (BC), and the average k-nearest neighbors, knn, were found to be the most preserved across translations. Moreover, similar frequency distributions of co-occurring network motifs were observed for translated texts networks.
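    The kind of network described above can be sketched as follows. This is a minimal illustration and not the authors' procedure: it assumes that two words co-occur when they are adjacent in the text, that tokens are whitespace-separated, and that knn is the mean over nodes of the average degree of each node's neighbors. The function names are hypothetical.

```python
from collections import defaultdict


def cooccurrence_network(text):
    """Build an undirected graph linking words that appear next to each other.

    Assumption: co-occurrence means adjacency; tokens are split on whitespace.
    Words that never get a neighbor (e.g. a one-word text) are not added as nodes.
    """
    words = text.lower().split()
    graph = defaultdict(set)
    for left, right in zip(words, words[1:]):
        if left != right:  # skip self-loops from repeated words
            graph[left].add(right)
            graph[right].add(left)
    return dict(graph)


def network_size(graph):
    """Network size N: the number of distinct nodes (words)."""
    return len(graph)


def average_neighbor_degree(graph):
    """knn: mean over nodes of the average degree of that node's neighbors."""
    per_node = [
        sum(len(graph[n]) for n in neighbors) / len(neighbors)
        for neighbors in graph.values()
        if neighbors
    ]
    return sum(per_node) / len(per_node) if per_node else 0.0
```

    Comparing these values across translations of the same source text would be one way to look for the preserved metrics the abstract reports; betweenness centrality is left out of the sketch for brevity.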

  11. Text mining for the biocuration workflow

    Science.gov (United States)

    Hirschman, Lynette; Burns, Gully A. P. C; Krallinger, Martin; Arighi, Cecilia; Cohen, K. Bretonnel; Valencia, Alfonso; Wu, Cathy H.; Chatr-Aryamontri, Andrew; Dowell, Karen G.; Huala, Eva; Lourenço, Anália; Nash, Robert; Veuthey, Anne-Lise; Wiegers, Thomas; Winter, Andrew G.

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on ‘Text Mining for the BioCuration Workflow’ at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed many variabilities. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provides a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community. PMID:22513129

  12. Text Entry by Gazing and Smiling

    Directory of Open Access Journals (Sweden)

    Outi Tuisku

    2013-01-01

    Full Text Available Face Interface is a wearable prototype that combines the use of voluntary gaze direction and facial activations for pointing at and selecting objects on a computer screen, respectively. The aim was to investigate the functionality of the prototype for entering text. First, three on-screen keyboard layout designs were developed and tested (n=10) to find a layout that would be more suitable for text entry with the prototype than the traditional QWERTY layout. The task was to enter one word ten times with each of the layouts by pointing at letters with gaze and selecting them by smiling. Subjective ratings showed that a layout with large keys on the edge and small keys near the center of the keyboard was rated as the most enjoyable, clearest, and most functional. Second, using this layout, the aim of the second experiment (n=12) was to compare entering text with Face Interface to entering text with a mouse. The results showed that the text entry rate was 20 characters per minute (cpm) for Face Interface and 27 cpm for the mouse. For Face Interface, the keystrokes per character (KSPC) value was 1.1 and the minimum string distance (MSD) error rate was 0.12. These values compare especially well with other similar techniques.
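    For readers unfamiliar with the three measures reported above, the following is a minimal sketch of how they are commonly computed. It assumes the widespread conventions that MSD is the Levenshtein distance between presented and transcribed text, that the MSD error rate divides this distance by the longer of the two strings, and that cpm counts every transcribed character; the abstract does not state which conventions the authors used, and the function names are hypothetical.

```python
def levenshtein(a, b):
    """Minimum string distance: fewest insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def msd_error_rate(presented, transcribed):
    """MSD divided by the length of the longer string (assumed convention)."""
    longest = max(len(presented), len(transcribed))
    return levenshtein(presented, transcribed) / longest if longest else 0.0


def kspc(keystrokes, transcribed):
    """Keystrokes per character: selections made per character produced."""
    return keystrokes / len(transcribed)


def chars_per_minute(transcribed, seconds):
    """Text entry rate in characters per minute."""
    return len(transcribed) * 60.0 / seconds
```

    Under these conventions, a KSPC of 1.0 would mean every selection produced exactly one correct character, so values above 1.0 reflect corrections or extra selections.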

  13. Inspiration and the Texts of the Bible

    Directory of Open Access Journals (Sweden)

    Dirk Buchner

    1997-12-01

    Full Text Available This article seeks to explore what the inspired text of the Old Testament was as it existed for the New Testament authors, particularly for the author of the book of Hebrews. A quick look at the facts makes it clear that there was, at the time, more than one 'inspired' text; among these were the Septuagint and the Masoretic Text, to name but two. The latter eventually gained ascendancy, which is why it forms the basis of our translated Old Testament today. Yet we have to ask: what do we make of that other text that was the inspired Bible to the early Church, especially to the writer of the book of Hebrews, who ignored the Masoretic text? This article will take a brief look at some suggestions for a doctrine of inspiration that keeps up with the facts of Scripture. Allied to this, the article is something of a bibliographical study of recent developments in textual research following the discovery of the Dead Sea scrolls.

  14. Ancient medical texts, modern reading problems

    Directory of Open Access Journals (Sweden)

    Maria Carlota Rosa

    2006-12-01

    Full Text Available The word tradition has a very specific meaning in linguistics: the passing down of a text, which may have been completed or corrected by different copyists at different times, when the concept of authorship was not the same as it is today. When reading an ancient text the word tradition must be in the reader's mind. To discuss one of the problems an ancient text poses to its modern readers, this work deals with one of the first printed medical texts in Portuguese, the Regimento proueytoso contra ha pestenença, and draws a parallel between it and two related texts, A moche profitable treatise against the pestilence, and the Recopilaçam das cousas que conuem guardar se no modo de preseruar à Cidade de Lixboa E os sãos, & curar os que esteuerem enfermos de Peste. The problems which arise out of the textual structure of those books show how difficult it is to establish a tradition of another type, the medical tradition. The linguistic study of the innumerable medieval plague treatises may throw light on the continuities and on the disruptions of the so-called hippocratic-galenical medical tradition.

  15. Chapter 16: text mining for translational bioinformatics.

    Science.gov (United States)

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  16. The Texts of the Agency's Relationship Agreements with Specialized Agencies

    International Nuclear Information System (INIS)

    1960-01-01

    The texts of the relationship agreements which the Agency has concluded with the specialized agencies listed below, together with the respective protocols authenticating them, are reproduced in this document in the order in which the agreements entered into force, for the information of all Members of the Agency [es

  17. Semantic Linking and Contextualization for Social Forensic Text Analysis

    NARCIS (Netherlands)

    Ren, Z.; van Dijk, D.; Graus, D.; van der Knaap, N.; Henseler, H.; de Rijke, M.; Brynielsson, J.; Johansson, F.

    2013-01-01

    With the development of social media, forensic text analysis is becoming more and more challenging as forensic analysts have begun to include this information source in their practice. In this paper, we report on our recent work related to semantic search in e-discovery and propose the use of entity

  18. Sourcing in Professional Education: Do Text Factors Make Any Difference?

    Science.gov (United States)

    Bråten, Ivar; Strømsø, Helge I.; Andreassen, Rune

    2016-01-01

    The present study investigated the extent to which the text factors of source salience and emphasis on risk might influence readers' attention to and use of source information when reading single documents to make behavioral decisions on controversial health-related issues. Participants (n = 259), who were attending different bachelor-level…

  19. The Texts of the Agency's Agreements with the United Nations

    International Nuclear Information System (INIS)

    1963-01-01

    The text of the Special Agreement extending the jurisdiction of the Administrative Tribunal of the United Nations to the International Atomic Energy Agency, with respect to applications by officials of the Agency alleging non-observance of the Regulations of the UN Staff Pension Fund, which came into force on 18 October 1963, is reproduced in this document for the information of all Members of the Agency [fr

  20. The Texts of the Agency's Agreements with the United Nations

    International Nuclear Information System (INIS)

    1963-01-01

    The text of the Special Agreement extending the jurisdiction of the Administrative Tribunal of the United Nations to the International Atomic Energy Agency, with respect to applications by officials of the Agency alleging non-observance of the Regulations of the UN Staff Pension Fund, which came into force on 18 October 1963, is reproduced in this document for the information of all Members of the Agency

  1. Texting to increase adolescent physical activity: Feasibility assessment

    Science.gov (United States)

    Feasibility trials assess whether a behavior change program warrants a definitive trial evaluation. This paper reports the feasibility of an intervention consisting of Self Determination Theory-informed text messages, pedometers, and goal prompts to increase adolescent physical activity. A 4-group ran...

  2. The Texts of the Agency's Relationship Agreements with Specialized Agencies

    International Nuclear Information System (INIS)

    1988-03-01

    The text of the relationship agreement which the Agency has concluded with the United Nations Industrial Development Organization, together with the protocol regarding its entry into force, is reproduced in this document for the information of all Members of the Agency. The agreement entered into force on 9 October 1987 pursuant to Article 10

  3. Enhancing L2 Reading Comprehension with Hypermedia Texts: Student Perceptions

    Science.gov (United States)

    Garrett-Rucks, Paula; Howles, Les; Lake, William M.

    2015-01-01

    This study extends current research about L2 hypermedia texts by investigating the combined use of audiovisual features including: (a) Contextualized images, (b) rollover translations, (c) cultural information, (d) audio explanations and (e) comprehension check exercises. Specifically, student perceptions of hypermedia readings compared to…

  4. The Texts of the Agency's Agreements with the United Nations

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1963-12-02

    The text of the Special Agreement extending the jurisdiction of the Administrative Tribunal of the United Nations to the International Atomic Energy Agency, with respect to applications by officials of the Agency alleging non-observance of the Regulations of the UN Staff Pension Fund, which came into force on 18 October 1963, is reproduced in this document for the information of all Members of the Agency.

  5. Cognition-Based Approaches for High-Precision Text Mining

    Science.gov (United States)

    Shannon, George John

    2017-01-01

    This research improves the precision of information extraction from free-form text via the use of cognitive-based approaches to natural language processing (NLP). Cognitive-based approaches are an important, and relatively new, area of research in NLP and search, as well as linguistics. Cognitive approaches enable significant improvements in both…

  6. Temporal analysis of text data using latent variable models

    DEFF Research Database (Denmark)

    Mølgaard, Lasse Lohilahti; Larsen, Jan; Goutte, Cyril

    2009-01-01

    Detecting and tracking of temporal data is an important task in multiple applications. In this paper we study temporal text mining methods for Music Information Retrieval. We compare two ways of detecting the temporal latent semantics of a corpus extracted from Wikipedia, using a stepwise...

  7. The Texts of the Agency's Relationship Agreements with Specialized Agencies

    International Nuclear Information System (INIS)

    1960-01-01

    The texts of the relationship agreements which the Agency has concluded with the specialized agencies listed below, together with the respective protocols authenticating them, are reproduced in this document in the order in which the agreements entered into force, for the information of all Members of the Agency [fr

  8. The Texts of the Agency's Relationship Agreements with Specialized Agencies

    International Nuclear Information System (INIS)

    1960-01-01

    The texts of the relationship agreements which the Agency has concluded with the specialized agencies listed below, together with the respective protocols authenticating them, are reproduced in this document in the order in which the agreements entered into force, for the information of all Members of the Agency

  9. The "Literature" of Literature Anthologies: An Examination of Text Types

    Science.gov (United States)

    Watkins, Naomi M.; Liang, Lauren Aimonette

    2014-01-01

    While the contents of K-6 basal readers have been recently examined (Dewitz, Leahy, Jones, Sullivan, 2010; Moss, 2008; Moss & Newton, 2002), the contents of secondary school literature anthologies have been vastly ignored in the last 2 decades. Given the Common Core State Standards' division of literary and informational text across content…

  10. Assessing semantic similarity of texts - Methods and algorithms

    Science.gov (United States)

    Rozeva, Anna; Zerkova, Silvia

    2017-12-01

    Assessing the semantic similarity of texts is an important part of different text-related applications such as educational systems, information retrieval, and text summarization. This task is performed by sophisticated analysis that implements text-mining techniques. Text mining involves several pre-processing steps that yield a structured representative model of the documents in a corpus by extracting and selecting the features that characterize their content. Generally the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at the syntactic and semantic levels. An important text-mining method and similarity measure is latent semantic analysis (LSA). It reduces the dimensionality of the document vector space and captures the text semantics better. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurrence is examined. The algorithm for obtaining the vector representation of words and their corresponding latent concepts in a reduced multidimensional space, as well as the similarity calculation, are presented.
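    As a reminder of the standard textbook formulation of LSA (a sketch only; the symbols below are assumptions chosen for illustration and are not taken from the paper), let X be the m-by-n term-document matrix. LSA truncates its singular value decomposition to the k largest singular values and compares documents by the cosine of their reduced vectors:

```latex
% X: m x n term-document matrix (m terms, n documents); k: number of latent concepts
X = U \Sigma V^{\top}, \qquad
X_k = U_k \Sigma_k V_k^{\top} \quad (k \ll \min(m, n))

% Reduced representation of document j: column j of \Sigma_k V_k^{\top}
\hat{d}_j = \Sigma_k V_k^{\top} e_j

% Semantic similarity of documents i and j in the latent space
\operatorname{sim}(d_i, d_j) =
  \frac{\hat{d}_i \cdot \hat{d}_j}{\lVert \hat{d}_i \rVert \, \lVert \hat{d}_j \rVert}
```

    Here e_j denotes the j-th standard basis vector, so the similarity is computed in the k-dimensional concept space rather than over raw term counts.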

  11. Data-Model Relationship in Text-Independent Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Stapert Robert

    2005-01-01

    Full Text Available Text-independent speaker recognition systems such as those based on Gaussian mixture models (GMMs) do not include time sequence information (TSI) within the model itself. The level of importance of TSI in speaker recognition is an interesting question and one addressed in this paper. Recent work has shown that the utilisation of higher-level information such as idiolect, pronunciation, and prosodics can be useful in reducing speaker recognition error rates. In accordance with these developments, the aim of this paper is to show that as more data becomes available, the basic GMM can be enhanced by utilising TSI, even in a text-independent mode. This paper presents experimental work incorporating TSI into the conventional GMM. The resulting system, known as the segmental mixture model (SMM), embeds dynamic time warping (DTW) into a GMM framework. Results are presented on the 2000-speaker SpeechDat Welsh database which show improved speaker recognition performance with the SMM.
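    To make the DTW component concrete, here is a minimal sketch of the generic dynamic time warping recurrence. It is not the paper's SMM: it assumes scalar features with an absolute-difference local cost, whereas a speaker recognition system would typically compare feature vectors, and the function name is hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences.

    Assumption: local cost is |x - y| on scalar features. cost[i][j] holds the
    cheapest alignment cost of the first i items of a with the first j of b.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b repeats
                                     cost[i][j - 1],      # b advances, a repeats
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

    Because the warping path may repeat elements, two sequences that differ only in timing align at zero cost, which is the property that lets time sequence information be compared across utterances of different lengths.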

  12. Self-knowledge in the process of interpreting philosophical texts

    Directory of Open Access Journals (Sweden)

    A. V. Kulik

    2016-08-01

    Full Text Available Modern philosophical research pays attention to various aspects of self-knowledge. For instance, there are works on historical concepts of self-knowledge (e.g. those by C. Moore), articles about the place of self-knowledge in the phenomenon of rationality (e.g. those by J. Roessler), and articles about the epistemological shortcomings of self-knowledge (e.g. those by J. Fernández). However, our paper concerns an aspect that is not at the centre of researchers’ attention. Our study shows that the practice of reading books can be a source of information not only about the content of the books but also about their readers. We investigate the phenomenon of incompatible interpretations, Ludwig Wittgenstein’s ideas about understanding his texts, Merab Mamardashvili’s concept of the ‘novel as a machine’, Ludwig Feuerbach’s theory of the specifics of human cognition, and the concepts of thinkers in philosophical hermeneutics about problems of understanding. The purpose of our paper is to describe the possibilities of gaining self-knowledge by analyzing information about the results of reading philosophical texts. The research holds that these results give information about the content of the text as well as about the reader’s own ideas. A reader can use word forms from the texts to express his own implicit thoughts. Philosophical texts are the most effective tool for doing so because their content and ways of organizing text stimulate such activity. We illustrate these statements with examples from the history of philosophy. For instance, we investigate the creation of the theory of the genealogy of morality by Friedrich Nietzsche. By analyzing the phrases in a text that were important to him, a person can assess his own ideas. If one uses this theoretical model to gain self-knowledge, he obtains a new source of information about his own implicit ideas. The interpretation of this information will be effective. As a result of analyzing of Friedrich Schleiermacher

  13. ERRORS AND DIFFICULTIES IN TRANSLATING LEGAL TEXTS

    Directory of Open Access Journals (Sweden)

    Camelia, CHIRILA

    2014-11-01

    Full Text Available Nowadays the accurate translation of legal texts has become highly important as the mistranslation of a passage in a contract, for example, could lead to lawsuits and loss of money. Consequently, the translation of legal texts to other languages faces many difficulties and only professional translators specialised in legal translation should deal with the translation of legal documents and scholarly writings. The purpose of this paper is to analyze translation from three perspectives: translation quality, errors and difficulties encountered in translating legal texts and consequences of such errors in professional translation. First of all, the paper points out the importance of performing a good and correct translation, which is one of the most important elements to be considered when discussing translation. Furthermore, the paper presents an overview of the errors and difficulties in translating texts and of the consequences of errors in professional translation, with applications to the field of law. The paper is also an approach to the differences between languages (English and Romanian that can hinder comprehension for those who have embarked upon the difficult task of translation. The research method that I have used to achieve the objectives of the paper was the content analysis of various Romanian and foreign authors' works.

  14. n-Gram-Based Text Compression

    Directory of Open Access Journals (Sweden)

    Vu H. Nguyen

    2016-01-01

    Full Text Available We propose an efficient method for compressing Vietnamese text using n-gram dictionaries. It has a significant compression ratio in comparison with those of state-of-the-art methods on the same dataset. Given a text, the proposed method first splits it into n-grams and then encodes them based on n-gram dictionaries. In the encoding phase, we use a sliding window with a size that ranges from bigram to five grams to obtain the best encoding stream. Each n-gram is encoded by two to four bytes according to its corresponding n-gram dictionary. We collected a 2.5 GB text corpus from some Vietnamese news agencies to build n-gram dictionaries from unigram to five grams and obtained dictionaries with a total size of 12 GB. In order to evaluate our method, we collected a testing set of 10 different text files with different sizes. The experimental results indicate that our method achieves a compression ratio of around 90% and outperforms state-of-the-art methods.
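    The dictionary-based encoding idea can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a greedy longest-match strategy over word n-grams, emits (n, id) pairs instead of the two-to-four-byte codes described above, and stores unknown words as literals. All function names are hypothetical.

```python
def build_dictionaries(corpus_words, max_n=5):
    """Assign an integer id to every word n-gram (n = 1..max_n) in the corpus."""
    dictionaries = {n: {} for n in range(1, max_n + 1)}
    for n in range(1, max_n + 1):
        for i in range(len(corpus_words) - n + 1):
            gram = tuple(corpus_words[i:i + n])
            dictionaries[n].setdefault(gram, len(dictionaries[n]))
    return dictionaries


def encode(words, dictionaries, max_n=5):
    """Greedy longest-match: at each position try the longest known n-gram first."""
    codes, i = [], 0
    while i < len(words):
        for n in range(min(max_n, len(words) - i), 0, -1):
            gram = tuple(words[i:i + n])
            if gram in dictionaries[n]:
                codes.append((n, dictionaries[n][gram]))
                i += n
                break
        else:
            codes.append((0, words[i]))  # n = 0 marks a literal, unknown word
            i += 1
    return codes


def decode(codes, dictionaries):
    """Invert encode(): look each id up in the dictionary of its n-gram order."""
    inverse = {n: {idx: gram for gram, idx in d.items()}
               for n, d in dictionaries.items()}
    words = []
    for n, value in codes:
        if n == 0:
            words.append(value)
        else:
            words.extend(inverse[n][value])
    return words
```

    Longer matches replace more words with a single code, which is why large dictionaries built from a big corpus can pay off in compression ratio at the cost of dictionary size.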

  15. Gender differences in psychosocial predictors of texting while driving.

    Science.gov (United States)

    Struckman-Johnson, Cindy; Gaster, Samuel; Struckman-Johnson, Dave; Johnson, Melissa; May-Shinagle, Gabby

    2015-01-01

    A sample of 158 male and 357 female college students at a midwestern university participated in an on-line study of psychosocial motives for texting while driving. Men and women did not differ in self-reported ratings of how often they texted while driving. However, more women sent texts of less than a sentence while more men sent texts of 1-5 sentences. More women than men said they would quit texting while driving due to police warnings, receiving information about texting dangers, being shown graphic pictures of texting accidents, and being in a car accident. A hierarchical regression for men's data revealed that lower levels of feeling distracted by texting while driving (20% of the variance), higher levels of cell phone dependence (11.5% of the variance), risky behavioral tendencies (6.5% of the variance) and impulsivity (2.3% of the variance) were significantly associated with more texting while driving (total model variance=42%). A separate regression for women revealed that higher levels of cell phone dependence (10.4% of the variance), risky behavioral tendencies (9.9% of the variance), texting distractibility (6.2%), crash risk estimates (2.2% of the variance) and driving confidence (1.3% of the variance) were significantly associated with more texting while driving (total model variance=31%). Friendship potential and need for intimacy were not related to men's or women's texting while driving. Implications of the results for gender-specific prevention strategies are discussed. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. Multilingual access to full text databases

    International Nuclear Information System (INIS)

    Fluhr, C.; Radwan, K.

    1990-05-01

    Many full text databases are available in only one language, or they may contain documents in several languages. Even if the user is able to understand the language of the documents in the database, it may be easier for him to express his need in his own language. For databases containing documents in different languages, it is simpler to formulate the query in one language only and to retrieve documents in different languages. This paper presents the development and first experiments of multilingual search, applied to the French-English pair, for text data in the nuclear field, based on the SPIRIT system. After recalling the general problems of searching full text databases with queries formulated in natural language, we present the methods used to reformulate the queries and show how they can be expanded for multilingual search. The first results on data in the nuclear field are presented (AFCEN norms and INIS abstracts). 4 refs

  17. Runaway electrons in TEXT-U

    International Nuclear Information System (INIS)

    Freeman, M.R.

    1994-01-01

    Runaway electrons have long been studied in tokamak plasmas. The previous results regarding runaway electrons and the detection of hard x-rays are reviewed. The hard x-ray energy on TEXT-U is measured and the scaling of energy with electron density, n_e, is noted. This scaling suggests a runaway source term that scales roughly as n e / 1 . The results indicate that runaways are created throughout the discharges. An upper bound for χ_e due to magnetic fluctuations was found to be 0.0343 m^2/s. This is an order of magnitude too low to explain the thermal transport in TEXT, implying that electrostatic fluctuations are important in thermal transport in TEXT

  18. No More Provincialism: Art and Text

    Directory of Open Access Journals (Sweden)

    Heather Barker

    2010-11-01

    Full Text Available This essay discusses the writing and personalities surrounding the 1981 establishment of the Australian art magazine, Art & Text, and traces its progression under Paul Taylor’s editorship up to his relocation to New York. During this period, Art & Text published Taylor’s own essays and, more importantly, those of other writers and artists — Meaghan Morris, Paul Foss, Philip Brophy, Imants Tillers, Rex Butler, Edward Colless — all articulating a consistent and complex postmodern position. The magazine’s founder and editor, Paul Taylor, personified the shattering impact of postmodernism upon the Australian art world as well as postmodernism’s limitations. Taylor facilitated a new theoretical framework for the discussion of Australian art, one that continues to dominate the internationalist aspirations of Australian art writers. He produced temporarily convincing solutions to problems that earlier critics had wrestled with unsuccessfully, in particular the twin problems of provincialism, and the relationship of Australian to international art.

  19. Text mining patents for biomedical knowledge.

    Science.gov (United States)

    Rodriguez-Esteban, Raul; Bundschus, Markus

    2016-06-01

    Biomedical text mining of scientific knowledge bases, such as Medline, has received much attention in recent years. Given that text mining is able to automatically extract biomedical facts that revolve around entities such as genes, proteins, and drugs, from unstructured text sources, it is seen as a major enabler to foster biomedical research and drug discovery. In contrast to the biomedical literature, research into the mining of biomedical patents has not reached the same level of maturity. Here, we review existing work and highlight the associated technical challenges that emerge from automatically extracting facts from patents. We conclude by outlining potential future directions in this domain that could help drive biomedical research and drug discovery. Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. Counselling young cannabis users by text message

    DEFF Research Database (Denmark)

    Laursen, Ditte

    2010-01-01

    This article presents the results of a study of two SMS services aimed at providing young people with information on cannabis and helping them to reduce their consumption of the drug. The attitude of the 12 participants in the study towards the SMS services is generally positive, but they prefer...... factual information to advice and counselling. The messages prompt reflection and awareness among the recipients, and their repetitive, serial nature plays a significant part in the process of change. This is especially true of the young people whose use of cannabis is recreational. For them, the SMS...

  1. Enhancing biomedical text summarization using semantic relation extraction.

    Directory of Open Access Journals (Sweden)

    Yue Shang

    Full Text Available Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from a large amount of biomedical literature efficiently. In this paper, we present a method for generating a text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: (1) we extract semantic relations in each sentence using the semantic knowledge representation tool SemRep; (2) we develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation; (3) for relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate a text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and that our results are better than those of the MEAD system, a well-known tool for text summarization.

  2. Quantum mechanics a comprehensive text for chemistry

    CERN Document Server

    Arora, Kishor

    2010-01-01

    This book contains 14 chapters. The text covers the inadequacy of classical mechanics and the basic and fundamental concepts of quantum mechanics, including translational, vibrational, rotational and electronic energies, an introduction to angular momenta, approximate methods and their application, concepts related to electron spin, and symmetry concepts in quantum mechanics. Finally, the book features the theories of chemical bonding and the use of software in quantum mechanics. The text of the book is presented in a lucid manner with ample examples and illustrations wherever

  3. Chinese legal texts – Quantitative Description

    Directory of Open Access Journals (Sweden)

    Ľuboš GAJDOŠ

    2017-06-01

    The aim of the paper is to provide a quantitative description of legal Chinese. This study adopts a corpus-based approach and presents basic statistical parameters of legal texts in Chinese, namely sentence length, the proportion of parts of speech, etc. The research is conducted on the Chinese monolingual corpus Hanku. The paper also discusses issues of statistical data processing across various corpora, e.g. tokenisation and part-of-speech tagging, and their relevance to the study of register variation.

  4. Ancient Indian Astronomy in Introductory Texts

    Science.gov (United States)

    Narahari Achar, B. N.

    1997-10-01

    It is customary in introductory survey courses in astronomy to devote some time to the history of astronomy. In the available textbooks only the Greek contribution receives any attention. Apart from pictures of Stonehenge and Chichen Itza, contributions from Babylon and China are sometimes mentioned. Hardly any account is given of ancient Indian astronomy. Even when something is mentioned, it is incomplete or incorrect or both. Examples are given from several textbooks currently available. An attempt is made to correct this situation by sketching the contributions from the earliest astronomy of India, namely Vedaanga Jyotisha.

  5. Radioprotection and radiotherapy: new regulatory texts

    International Nuclear Information System (INIS)

    Cosset, J.M.

    1998-01-01

    This article reviews the radiation protection of workers in radiotherapy centers and explains the relevant texts. These international and European texts aim to reinforce the protection of personnel working in radiotherapy services, to reduce as far as possible the deterministic and stochastic effects on organs outside the irradiated volumes, and to avoid severe accidents. Radiotherapists must keep in mind that treatments must be clearly justified and optimized as far as reasonably achievable. (N.C.)

  6. Precision and Disclosure in Text and Voice Interviews on Smartphones.

    Directory of Open Access Journals (Sweden)

    Michael F Schober

    As people increasingly communicate via asynchronous non-spoken modes on mobile devices, particularly text messaging (e.g., SMS), longstanding assumptions and practices of social measurement via telephone survey interviewing are being challenged. In the study reported here, 634 people who had agreed to participate in an interview on their iPhone were randomly assigned to answer 32 questions from US social surveys via text messaging or speech, administered either by a human interviewer or by an automated interviewing system. Ten interviewers from the University of Michigan Survey Research Center administered voice and text interviews; automated systems launched parallel text and voice interviews at the same time as the human interviews were launched. The key question was how the interview mode affected the quality of the response data, in particular the precision of numerical answers (how many were not rounded), variation in answers to multiple questions with the same response scale (differentiation), and disclosure of socially undesirable information. Texting led to higher quality data (fewer rounded numerical answers, more differentiated answers to a battery of questions, and more disclosure of sensitive information) than voice interviews, both with human and automated interviewers. Text respondents also reported a strong preference for future interviews by text. The findings suggest that people interviewed on mobile devices at a time and place that is convenient for them, even when they are multitasking, can give more trustworthy and accurate answers than those in more traditional spoken interviews. The findings also suggest that answers from text interviews, when aggregated across a sample, can tell a different story about a population than answers from voice interviews, potentially altering the policy implications from a survey.

  7. $E_{\text{T}}^{\text{miss}}$ performance in the ATLAS detector using 2015-2016 LHC p-p collisions

    CERN Document Server

    The ATLAS collaboration

    2018-01-01

    The reconstruction and calibration algorithms used to measure missing transverse momentum ($E_{\text{T}}^{\text{miss}}$) with the ATLAS detector utilise energy deposits within the calorimeter and tracks reconstructed in the inner detector and the muon spectrometer. The performance of the $E_{\text{T}}^{\text{miss}}$ reconstruction algorithms is evaluated using data collected in proton--proton collisions in 2015 and 2016 at a centre-of-mass energy of $13$ TeV. Results are shown for a data sample corresponding to an integrated luminosity of $36~$fb$^{-1}$. The performance of $E_{\text{T}}^{\text{miss}}$ built with jets reconstructed using a particle flow algorithm is presented and compared to that built with calorimeter jets. Various strategies are used to suppress effects arising from additional proton--proton interactions, called pileup. The tracking and vertexing information is used to distinguish contributions from pileup entering the $E_{\text{T}}^{\text{miss}}$ calculation. The modelling of $E_{\text{T}}^...

  8. Text accessibility by people with reduced contrast sensitivity.

    Science.gov (United States)

    Crossland, Michael D; Rubin, Gary S

    2012-09-01

    Contrast sensitivity is reduced in people with eye disease, and also in older adults without eye disease. In this article, we compare contrast of text presented in print and digital formats with contrast sensitivity values for a large cohort of subjects in a population-based study of older adults (the Salisbury Eye Evaluation). Contrast sensitivity values were recorded for 2520 adults aged 65 to 84 years living in Salisbury, Maryland. The proportion of the sample likely to be unable to read text of different formats (electronic books, newsprint, paperback books, laser print, and LED computer monitors) was calculated using published contrast reserve levels required to perform spot reading, to read with fluency, high fluency, and under optimal conditions. One percent of this sample had contrast sensitivity less than that required to read newsprint fluently. Text presented on an LED computer monitor had the highest contrast. Ninety-eight percent of the sample had contrast sensitivity sufficient for high fluent reading of text (at least 160 words/min) on a monitor. However, 29.6% were still unlikely to be able to read this text with optimal fluency. Reduced contrast of print limits text accessibility for many people in the developed world. Presenting text in a high-contrast format, such as black laser print on a white page, would increase the number of people able to access such information. Additionally, making text available in a format that can be presented on an LED computer monitor will increase access to written documents.

  9. Guided Text Search Using Adaptive Visual Analytics

    Energy Technology Data Exchange (ETDEWEB)

    Steed, Chad A [ORNL; Symons, Christopher T [ORNL; Senter, James K [ORNL; DeNap, Frank A [ORNL

    2012-10-01

    This research demonstrates the promise of augmenting interactive visualizations with semi-supervised machine learning techniques to improve the discovery of significant associations and insights in the search and analysis of textual information. More specifically, we have developed a system called Gryffin that hosts a unique collection of techniques that facilitate individualized investigative search pertaining to an ever-changing set of analytical questions over an indexed collection of open-source documents related to critical national infrastructure. The Gryffin client hosts dynamic displays of the search results via focus+context record listings, temporal timelines, term-frequency views, and multiple coordinate views. Furthermore, as the analyst interacts with the display, the interactions are recorded and used to label the search records. These labeled records are then used to drive semi-supervised machine learning algorithms that re-rank the unlabeled search records such that potentially relevant records are moved to the top of the record listing. Gryffin is described in the context of the daily tasks encountered at the US Department of Homeland Security's Fusion Center, with whom we are collaborating in its development. The resulting system is capable of addressing the analysts' information overload, which can be directly attributed to the deluge of information that must be addressed in the search and investigative analysis of textual information.

  10. Mizan 3.1 (b) Text, corrected

    African Journals Online (AJOL)

    eliasn

    the principle of confidentiality in international commercial arbitration coupled with their sensitivity to the ...... Ethiopian courts have been addressing the question from diverse perspectives. ... 5.2- Ethio Marketing Ltd. v Ministry of Information 51 .... questions: Is an arbitration clause inserted in a public works contract valid? If.

  11. A Science for Citizenship Model: Assessing the Effects of Benefits, Risks, and Trust for Predicting Students' Interest in and Understanding of Science-Related Content

    Science.gov (United States)

    Jack, Brady Michael; Lee, Ling; Yang, Kuay-Keng; Lin, Huann-shyang

    2017-10-01

    This study showcases the Science for Citizenship Model (SCM) as a new instructional methodology for presenting, to secondary students, science-related technology content related to the use of science in society not taught in the science curriculum, and a new approach for assessing the intercorrelations among three independent variables (benefits, risks, and trust) to predict the dependent variable of triggered interest in learning science. Utilizing a 50-minute instructional presentation on nanotechnology for citizenship, data were collected from 301 Taiwanese high school students. Structural equation modeling (SEM) and paired-samples t-tests were used to analyze the fitness of data to SCM and the extent to which a 50-minute class presentation of nanotechnology for citizenship affected students' awareness of benefits, risks, trust, and triggered interest in learning science. Results of SCM on pre-tests and post-tests revealed acceptable model fit to data and demonstrated that the strongest predictor of students' triggered interest in nanotechnology was their trust in science. Paired-samples t-test results on students' understanding of nanotechnology and their self-evaluated awareness of the benefits and risks of nanotechnology, trust in scientists, and interest in learning science revealed low significant differences between pre-test and post-test. These results provide evidence that a short 50-minute presentation on an emerging science not normally addressed within the traditional science curriculum had a significant yet limited impact on students' learning of nanotechnology in the classroom. Finally, we suggest why the results of this study may be important to science education instruction and research for understanding how the integration into classroom science education of short presentations of cutting-edge science and emerging technologies in support of the science for citizenship enterprise might be accomplished through future investigations.

  12. Knowledge Revision Processes in Refutation Texts

    Science.gov (United States)

    Kendeou, Panayiota; Walsh, Erinn K.; Smith, Emily R.; O'Brien, Edward J.

    2014-01-01

    In the present set of experiments, we systematically examined the processes that occur while reading texts designed to refute and explain commonsense beliefs that reside in readers' long-term memory. In Experiment 1 (n = 36), providing readers with a refutation-plus-explanation of a commonsense belief was sufficient to significantly reduce…

  13. Sleep Habits and Nighttime Texting among Adolescents

    Science.gov (United States)

    Garmy, Pernilla; Ward, Teresa M.

    2018-01-01

    The aim of this study was to examine sleep habits (i.e., bedtimes and rising times) and their association with nighttime text messaging in 15- to 17-year-old adolescents. This cross-sectional study analyzed data from a web-based survey of adolescent students attending secondary schools in southern Sweden (N = 278, 50% female). Less than 8 hr of…

  14. Handwriting segmentation of unconstrained Oriya text

    Indian Academy of Sciences (India)

    Based on vertical projection profiles and structural features of Oriya characters, text lines are segmented into words. For character segmentation, at first, the isolated and connected (touching) characters in a word are detected. Using structural, topological and water reservoir concept-based features, characters of the word ...

  15. The Readability of an Unreadable Text.

    Science.gov (United States)

    Gordon, Robert M.

    1980-01-01

    The Dale-Chall Readability Formula and the Fry Readability Graph were used to analyze passages of Plato's "Parmenides," a notoriously difficult literary piece. The readability levels of the text ranged from fourth to eighth grade (Dale-Chall) and from sixth to tenth grade (Fry), indicating the limitations of the readability tests. (DF)

  16. Validation Study of Waray Text Readability Instrument

    Science.gov (United States)

    Oyzon, Voltaire Q.; Corrales, Juven B.; Estardo, Wilfredo M., Jr.

    2015-01-01

    In 2012 the Leyte Normal University developed computer software, modelled after the Spache Readability Formula (1953) made for English, to help rank texts; it can be used by teachers or research groups in selecting appropriate reading materials to support the DepEd's MTB-MLE program in Region VIII, in the Philippines. However,…

  17. The Pelindaba text and its previous

    International Nuclear Information System (INIS)

    Adeniji, O.

    1996-01-01

    The main body of the Treaty, the preamble, articles 1-22, and the map are reproduced in this issue in the section "Documentation Relating to Disarmament and International Security". The complete text, including annexes and protocols, is contained in document A/50/426.

  18. n-Gram-Based Text Compression

    Science.gov (United States)

    Duong, Hieu N.; Snasel, Vaclav

    2016-01-01

    We propose an efficient method for compressing Vietnamese text using n-gram dictionaries. It has a significant compression ratio in comparison with those of state-of-the-art methods on the same dataset. Given a text, the proposed method first splits it into n-grams and then encodes them based on n-gram dictionaries. In the encoding phase, we use a sliding window with a size that ranges from bigram to five-gram to obtain the best encoding stream. Each n-gram is encoded by two to four bytes, depending on its corresponding n-gram dictionary. We collected a 2.5 GB text corpus from some Vietnamese news agencies to build n-gram dictionaries from unigram to five-gram, obtaining dictionaries with a total size of 12 GB. In order to evaluate our method, we collected a testing set of 10 different text files with different sizes. The experimental results indicate that our method achieves a compression ratio of around 90% and outperforms state-of-the-art methods. PMID:27965708
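
    The dictionary-based encoding described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the greedy longest-match strategy, the function names (`build_dictionaries`, `encode`, `decode`), the `(n, index)` code layout and the literal fallback for unknown words are all assumptions made for illustration, and the two-to-four-byte packing mentioned in the abstract is not reproduced.

```python
from typing import Dict, List, Tuple, Union

NGramDict = Dict[Tuple[str, ...], int]
# A code is (n, dictionary index), or (0, literal word) for unknown words.
Code = Tuple[int, Union[int, str]]


def build_dictionaries(corpus: List[str], max_n: int = 5) -> Dict[int, NGramDict]:
    """Build one dictionary per n-gram order (hypothetical helper)."""
    dicts: Dict[int, NGramDict] = {n: {} for n in range(1, max_n + 1)}
    for sentence in corpus:
        words = sentence.split()
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                gram = tuple(words[i:i + n])
                # Assign the next free index the first time a gram is seen.
                dicts[n].setdefault(gram, len(dicts[n]))
    return dicts


def encode(text: str, dicts: Dict[int, NGramDict], max_n: int = 5) -> List[Code]:
    """Greedy encoding: at each position try the longest window first."""
    words = text.split()
    codes: List[Code] = []
    i = 0
    while i < len(words):
        for n in range(min(max_n, len(words) - i), 0, -1):
            idx = dicts[n].get(tuple(words[i:i + n]))
            if idx is not None:
                codes.append((n, idx))
                i += n
                break
        else:
            # No dictionary covers this word: store it literally.
            codes.append((0, words[i]))
            i += 1
    return codes


def decode(codes: List[Code], dicts: Dict[int, NGramDict]) -> str:
    """Invert the dictionaries and expand each code back into words."""
    inverse = {n: {idx: gram for gram, idx in d.items()} for n, d in dicts.items()}
    words: List[str] = []
    for n, payload in codes:
        if n == 0:
            words.append(str(payload))
        else:
            words.extend(inverse[n][payload])
    return " ".join(words)


if __name__ == "__main__":
    dictionaries = build_dictionaries(["a b c d e f"])
    stream = encode("a b c d e f x", dictionaries)
    print(stream)
    print(decode(stream, dictionaries))
```

    In this sketch a single code can replace up to five words, which is where the compression would come from; a real encoder would additionally pack each code into a fixed number of bytes per dictionary.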

  19. Historical Text Comprehension Reflective Tutorial Dialogue System

    Science.gov (United States)

    Grigoriadou, Maria; Tsaganou, Grammatiki; Cavoura, Theodora

    2005-01-01

    The Reflective Tutorial Dialogue System (ReTuDiS) is a system for learner modelling historical text comprehension through reflective dialogue. The system infers learners' cognitive profiles and constructs their learner models. Based on the learner model the system plans the appropriate--personalized for learners--reflective tutorial dialogue in…

  20. There is a Text in 'The Balloon'

    DEFF Research Database (Denmark)

    Elias, Camelia

    2009-01-01

    From the Introduction: Camelia Elias' "There is a Text in 'The Balloon': Donald Barthelme's Allegorical Flights" provides its reader with a much-needed and useful distinction between fantasy and the fantastic: "whereas fantasy in critical discourse can be aligned with allegory, in which a supernatu...

  1. Assessing Assessment Texts: Where Is Planning?

    Science.gov (United States)

    Fives, Helenrose; Barnes, Nicole; Dacey, Charity; Gillis, Anna

    2016-01-01

    We conducted a content analysis of 27 assessment textbooks to determine how assessment planning was framed in texts for preservice teachers. We identified eight assessment planning themes: alignment, assessment purpose and types, reliability and validity, writing goals and objectives, planning specific assessments, unpacking, overall assessment…

  2. "The Politics of Location": Text as Opposition.

    Science.gov (United States)

    Moreno, Renee

    Eduardo Galeano's "Memory of Fire: Genesis" raises a number of questions concerning the "politics of location," a term that may be defined as the intersections, tensions, and complications that people of color bring to space and what space means in terms of hierarchies and power, racial and gender stratifications. Text can also…

  3. Writing Treatment for Aphasia: A Texting Approach

    Science.gov (United States)

    Beeson, Pelagie M.; Higginson, Kristina; Rising, Kindle

    2013-01-01

    Purpose: Treatment studies have documented the therapeutic and functional value of lexical writing treatment for individuals with severe aphasia. The purpose of this study was to determine whether such retraining could be accomplished using the typing feature of a cellular telephone, with the ultimate goal of using text messaging for…

  4. INNER DIALOGICITY OF MEDICAL SCIENTIFIC TEXTS

    Directory of Open Access Journals (Sweden)

    Efremova Nataliya Vladimirovna

    2015-06-01

    The author studies inner dialogicity as an integral property of a scientist's thinking activity, a way of developing a scientific idea, one of the cognitive and discursive mechanisms by which new knowledge is formed, crystallized and dementalised in a text, and a way of searching for truth. Such an approach to dialogicity in the study of a scientific text makes it possible to analyze the cogitative processes at work in human consciousness and cognitive activity, to fully understand the stated scientific concept, to define the pragmatic strategies of the author, and to enter his reflexive world. Based on the medical scientific texts of N.M. Amosov and F.G. Uglov, famous scientists in the field of cardiac surgery, it is established that traces of inner dialogicity in the scientists' textual space actualize the origin of new knowledge, the change of the author's semantic positions, and his ability to reflect on, compare and analyze his own thoughts and actions and to evaluate himself and the features of his thinking process. These are realized in the logic of the statement of the scientific concept, in the explanation of concepts and terms, in the assessment of the points of view of contemporaries and predecessors, of the scientist's adherents and opponents, and in the orientation to the addressee's presupposition and the activation of his cogitative activity. Linguistic, discursive and verbal analysis singles out the impact on the addressee and his mental activity.

  5. AUTHENTIC TEXTS FOR CRITICAL READING ACTIVITIES

    Directory of Open Access Journals (Sweden)

    Ila Amalia

    2016-03-01

    This action research aimed at promoting critical reading (“thinking” while reading) skills among students using authentic materials. It also aimed to reveal the students' perception of using critical reading skills in reading activities. Nineteen English Education Department students who took the Reading IV class participated in this project. There were three cycles, in which three different critical reading strategies were applied. The authentic materials were taken from newspaper and internet articles. The results revealed that the use of critical reading strategies along with authentic materials improved students’ critical reading skills, as seen from the improvement in each cycle: the students' critical reading skill was 54% (fair) in cycle 1, improving to 68% (average) in cycle 2 and 82% (good) in cycle 3. In addition, based on the critical reading skill criteria, the students’ critical reading skill improved from 40% (nearly meet) to 80% (exceed). Meanwhile, the students’ perception questionnaire showed that 63% of students agreed that the critical reading activity using authentic text could improve critical thinking and 58% agreed that doing critical reading activities could improve reading comprehension. The results imply that the use of authentic texts can improve students’ critical reading skills if it is taught by performing rather than lecturing. Selectively choosing various strategies and materials can trigger students’ activeness in responding to a text, which eventually shapes their critical reading skills.

  6. Database citation in full text biomedical articles.

    Science.gov (United States)

    Kafkas, Şenay; Kim, Jee-Hyub; McEntyre, Johanna R

    2013-01-01

    Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires that there are structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high throughput of this text-mining pipeline make this activity possible both accurately and at low cost, which will allow the development of new integrated data services.

  7. Selecting Full-Text Undergraduate Periodicals Databases.

    Science.gov (United States)

    Still, Julie M.; Kassabian, Vibiana

    1999-01-01

    Examines how libraries and librarians can compare full-text general periodical indices, using ProQuest Direct, Periodical Abstracts (via Ovid), and EBSCOhost as examples. Explores breadth and depth of coverage; manipulation of results (email/download/print); ease of use (searching); and indexing quirks. (AEF)

  8. The Impact of Texting on Comprehension

    Directory of Open Access Journals (Sweden)

    Jamal K. M. Ali

    2015-07-01

    This paper presents a study of the effects of texting on English language comprehension. The authors believe that English used in texting causes a lack of comprehension for English speakers, learners, and texters. Wei, Xian-hai and Jiang (2008:3) declare, “In Netspeak, there are some newly-created vocabularies, which people cannot comprehend them either from their partial pronunciation or from their figures.” Crystal (2007:23) claims, “variation causes problems of comprehension and acceptability. If you speak or write differently from the way I do, we may fail to understand each other.” In this paper, the authors administered a questionnaire at Aligarh Muslim University to ninety respondents from five different Faculties and four different levels. To measure respondents’ comprehension of English texting, the authors gave the respondents abbreviations used by texters and asked them to write the full forms of the abbreviations. The authors found that many abbreviations were not understood, which suggested that most of the respondents did not understand and did not use these abbreviations.

  9. Prayer in Qumran texts. A brief introduction

    Directory of Open Access Journals (Sweden)

    Zdzisław J. Kapera

    2011-03-01

    Of some three hundred literary texts found in the caves of the Judaean Desert and those close to Khirbet Qumran, 56 are various pieces of poetry and liturgy. Seven specific groups have been distinguished among them: 1. Liturgy on sunshine and sunset and on specific days; 2. Liturgy on specific ceremonies of the community; 3. Eschatological prayers; 4. Magic texts; 5. Collections of psalms (including pseudepigrapha); 6. Thanksgiving hymns; 7. Prose prayers. The issue of how the Qumranians prayed is briefly touched upon here. Then there is a description of morning and evening prayers, Sabbath prayers, the specific liturgy of the annual ceremony of entering the New Covenant, the Hodayot (Thanksgiving Hymns), pseudepigraphic Psalms (like Ps 151), and the eschatological prayers. The introduction ends with a summary evaluation of the role of the texts in reconstructing the historical development of Jewish prayer in the late Second Temple period. The need to study the relationship of the Qumran prayers with the early Christian prayers is also briefly discussed.

  10. Rubrics and Exemplars in Text-Conferencing

    Science.gov (United States)

    Zahara, Allan

    2005-01-01

    The author draws on his K-12 teaching experiences in analyzing the strengths and weaknesses of asynchronous, text-based conferencing in online education. Issues relating to Web-based versus client-driven systems in computer-mediated conferencing (CMC) are examined. The paper also discusses pedagogical and administrative implications of choosing a…

  11. The Challenges of Qualitatively Coding Ancient Texts

    Science.gov (United States)

    Slingerland, Edward; Chudek, Maciej

    2012-01-01

    We respond to several important and valid concerns about our study ("The Prevalence of Folk Dualism in Early China," "Cognitive Science" 35: 997-1007) by Klein and Klein, defending our interpretation of our data. We also argue that, despite the undeniable challenges involved in qualitatively coding texts from ancient cultures,…

  12. Teaching life writing texts in Europe : Introduction

    NARCIS (Netherlands)

    Mreijen, Anne-Marie

    2015-01-01

    Although courses on auto/biography and life writing are taught at different universities in Europe, and elements of contemporary life writing issues are addressed in different disciplines like sociology and history, life writing courses, as described in Teaching Life Writing Texts, are certainly not

  13. Measurement of the [Formula: see text] meson lifetime using [Formula: see text] decays.

    Science.gov (United States)

    Aaij, R; Adeva, B; Adinolfi, M; Affolder, A; Ajaltouni, Z; Albrecht, J; Alessio, F; Alexander, M; Ali, S; Alkhazov, G; Cartelle, P Alvarez; Alves, A A; Amato, S; Amerio, S; Amhis, Y; Anderlini, L; Anderson, J; Andreassen, R; Andreotti, M; Andrews, J E; Appleby, R B; Gutierrez, O Aquines; Archilli, F; Artamonov, A; Artuso, M; Aslanides, E; Auriemma, G; Baalouch, M; Bachmann, S; Back, J J; Badalov, A; Balagura, V; Baldini, W; Barlow, R J; Barschel, C; Barsuk, S; Barter, W; Batozskaya, V; Bauer, Th; Bay, A; Beddow, J; Bedeschi, F; Bediaga, I; Belogurov, S; Belous, K; Belyaev, I; Ben-Haim, E; Bencivenni, G; Benson, S; Benton, J; Berezhnoy, A; Bernet, R; Bettler, M-O; van Beuzekom, M; Bien, A; Bifani, S; Bird, T; Bizzeti, A; Bjørnstad, P M; Blake, T; Blanc, F; Blouw, J; Blusk, S; Bocci, V; Bondar, A; Bondar, N; Bonivento, W; Borghi, S; Borgia, A; Borsato, M; Bowcock, T J V; Bowen, E; Bozzi, C; Brambach, T; van den Brand, J; Bressieux, J; Brett, D; Britsch, M; Britton, T; Brook, N H; Brown, H; Bursche, A; Busetto, G; Buytaert, J; Cadeddu, S; Calabrese, R; Callot, O; Calvi, M; Calvo Gomez, M; Camboni, A; Campana, P; Campora Perez, D; Carbone, A; Carboni, G; Cardinale, R; Cardini, A; Carranza-Mejia, H; Carson, L; Carvalho Akiba, K; Casse, G; Castillo Garcia, L; Cattaneo, M; Cauet, Ch; Cenci, R; Charles, M; Charpentier, Ph; Cheung, S-F; Chiapolini, N; Chrzaszcz, M; Ciba, K; Cid Vidal, X; Ciezarek, G; Clarke, P E L; Clemencic, M; Cliff, H V; Closier, J; Coca, C; Coco, V; Cogan, J; Cogneras, E; Collins, P; Comerma-Montells, A; Contu, A; Cook, A; Coombes, M; Coquereau, S; Corti, G; Counts, I; Couturier, B; Cowan, G A; Craik, D C; Cruz Torres, M; Cunliffe, S; Currie, R; D'Ambrosio, C; Dalseno, J; David, P; David, P N Y; Davis, A; De Bonis, I; De Bruyn, K; De Capua, S; De Cian, M; De Miranda, J M; De Paula, L; De Silva, W; De Simone, P; Decamp, D; Deckenhoff, M; Del Buono, L; Déléage, N; Derkach, D; Deschamps, O; Dettori, F; Di Canto, A; Dijkstra, H; Donleavy, S; Dordei, F; 
Dorigo, M; Dorosz, P; Dosil Suárez, A; Dossett, D; Dovbnya, A; Dupertuis, F; Durante, P; Dzhelyadin, R; Dziurda, A; Dzyuba, A; Easo, S; Egede, U; Egorychev, V; Eidelman, S; Eisenhardt, S; Eitschberger, U; Ekelhof, R; Eklund, L; El Rifai, I; Elsasser, Ch; Falabella, A; Färber, C; Farinelli, C; Farry, S; Ferguson, D; Fernandez Albor, V; Ferreira Rodrigues, F; Ferro-Luzzi, M; Filippov, S; Fiore, M; Fiorini, M; Fitzpatrick, C; Fontana, M; Fontanelli, F; Forty, R; Francisco, O; Frank, M; Frei, C; Frosini, M; Furfaro, E; Gallas Torreira, A; Galli, D; Gandelman, M; Gandini, P; Gao, Y; Garofoli, J; Garra Tico, J; Garrido, L; Gaspar, C; Gauld, R; Gersabeck, E; Gersabeck, M; Gershon, T; Ghez, Ph; Gianelle, A; Gibson, V; Giubega, L; Gligorov, V V; Göbel, C; Golubkov, D; Golutvin, A; Gomes, A; Gordon, H; Grabalosa Gándara, M; Graciani Diaz, R; Granado Cardoso, L A; Graugés, E; Graziani, G; Grecu, A; Greening, E; Gregson, S; Griffith, P; Grillo, L; Grünberg, O; Gui, B; Gushchin, E; Guz, Yu; Gys, T; Hadjivasiliou, C; Haefeli, G; Haen, C; Hafkenscheid, T W; Haines, S C; Hall, S; Hamilton, B; Hampson, T; Hansmann-Menzemer, S; Harnew, N; Harnew, S T; Harrison, J; Hartmann, T; He, J; Head, T; Heijne, V; Hennessy, K; Henrard, P; Hernando Morata, J A; van Herwijnen, E; Heß, M; Hicheur, A; Hill, D; Hoballah, M; Hombach, C; Hulsbergen, W; Hunt, P; Huse, T; Hussain, N; Hutchcroft, D; Hynds, D; Iakovenko, V; Idzik, M; Ilten, P; Jacobsson, R; Jaeger, A; Jans, E; Jaton, P; Jawahery, A; Jing, F; John, M; Johnson, D; Jones, C R; Joram, C; Jost, B; Jurik, N; Kaballo, M; Kandybei, S; Kanso, W; Karacson, M; Karbach, T M; Kenyon, I R; Ketel, T; Khanji, B; Khurewathanakul, C; Klaver, S; Kochebina, O; Komarov, I; Koopman, R F; Koppenburg, P; Korolev, M; Kozlinskiy, A; Kravchuk, L; Kreplin, K; Kreps, M; Krocker, G; Krokovny, P; Kruse, F; Kucharczyk, M; Kudryavtsev, V; Kurek, K; Kvaratskheliya, T; La Thi, V N; Lacarrere, D; Lafferty, G; Lai, A; Lambert, D; Lambert, R W; Lanciotti, E; Lanfranchi, G; 
Langenbruch, C; Latham, T; Lazzeroni, C; Le Gac, R; van Leerdam, J; Lees, J-P; Lefèvre, R; Leflat, A; Lefrançois, J; Leo, S; Leroy, O; Lesiak, T; Leverington, B; Li, Y; Liles, M; Lindner, R; Linn, C; Lionetto, F; Liu, B; Liu, G; Lohn, S; Longstaff, I; Lopes, J H; Lopez-March, N; Lowdon, P; Lu, H; Lucchesi, D; Luisier, J; Luo, H; Luppi, E; Lupton, O; Machefert, F; Machikhiliyan, I V; Maciuc, F; Maev, O; Malde, S; Manca, G; Mancinelli, G; Manzali, M; Maratas, J; Marconi, U; Marino, P; Märki, R; Marks, J; Martellotti, G; Martens, A; Martín Sánchez, A; Martinelli, M; Martinez Santos, D; Martins Tostes, D; Massafferri, A; Matev, R; Mathe, Z; Matteuzzi, C; Mazurov, A; McCann, M; McCarthy, J; McNab, A; McNulty, R; McSkelly, B; Meadows, B; Meier, F; Meissner, M; Merk, M; Milanes, D A; Minard, M-N; Molina Rodriguez, J; Monteil, S; Moran, D; Morandin, M; Morawski, P; Mordà, A; Morello, M J; Mountain, R; Mous, I; Muheim, F; Müller, K; Muresan, R; Muryn, B; Muster, B; Naik, P; Nakada, T; Nandakumar, R; Nasteva, I; Needham, M; Neubert, S; Neufeld, N; Nguyen, A D; Nguyen, T D; Nguyen-Mau, C; Nicol, M; Niess, V; Niet, R; Nikitin, N; Nikodem, T; Novoselov, A; Oblakowska-Mucha, A; Obraztsov, V; Oggero, S; Ogilvy, S; Okhrimenko, O; Oldeman, R; Onderwater, G; Orlandea, M; Otalora Goicochea, J M; Owen, P; Oyanguren, A; Pal, B K; Palano, A; Palutan, M; Panman, J; Papanestis, A; Pappagallo, M; Pappalardo, L; Parkes, C; Parkinson, C J; Passaleva, G; Patel, G D; Patel, M; Patrignani, C; Pavel-Nicorescu, C; Pazos Alvarez, A; Pearce, A; Pellegrino, A; Penso, G; Pepe Altarelli, M; Perazzini, S; Perez Trigo, E; Perret, P; Perrin-Terrin, M; Pescatore, L; Pesen, E; Pessina, G; Petridis, K; Petrolini, A; Picatoste Olloqui, E; Pietrzyk, B; Pilař, T; Pinci, D; Pistone, A; Playfer, S; Plo Casasus, M; Polci, F; Polok, G; Poluektov, A; Polycarpo, E; Popov, A; Popov, D; Popovici, B; Potterat, C; Powell, A; Prisciandaro, J; Pritchard, A; Prouve, C; Pugatch, V; Puig Navarro, A; Punzi, G; Qian, W; 
Rachwal, B; Rademacker, J H; Rakotomiaramanana, B; Rama, M; Rangel, M S; Raniuk, I; Rauschmayr, N; Raven, G; Redford, S; Reichert, S; Reid, M M; Dos Reis, A C; Ricciardi, S; Richards, A; Rinnert, K; Rives Molina, V; Roa Romero, D A; Robbe, P; Roberts, D A; Rodrigues, A B; Rodrigues, E; Rodriguez Perez, P; Roiser, S; Romanovsky, V; Romero Vidal, A; Rotondo, M; Rouvinet, J; Ruf, T; Ruffini, F; Ruiz, H; Ruiz Valls, P; Sabatino, G; Saborido Silva, J J; Sagidova, N; Sail, P; Saitta, B; Salustino Guimaraes, V; Sanmartin Sedes, B; Santacesaria, R; Santamarina Rios, C; Santovetti, E; Sapunov, M; Sarti, A; Satriano, C; Satta, A; Savrie, M; Savrina, D; Schiller, M; Schindler, H; Schlupp, M; Schmelling, M; Schmidt, B; Schneider, O; Schopper, A; Schune, M-H; Schwemmer, R; Sciascia, B; Sciubba, A; Seco, M; Semennikov, A; Senderowska, K; Sepp, I; Serra, N; Serrano, J; Seyfert, P; Shapkin, M; Shapoval, I; Shcheglov, Y; Shears, T; Shekhtman, L; Shevchenko, O; Shevchenko, V; Shires, A; Silva Coutinho, R; Simi, G; Sirendi, M; Skidmore, N; Skwarnicki, T; Smith, N A; Smith, E; Smith, E; Smith, J; Smith, M; Snoek, H; Sokoloff, M D; Soler, F J P; Soomro, F; Souza, D; Souza De Paula, B; Spaan, B; Sparkes, A; Spinella, F; Spradlin, P; Stagni, F; Stahl, S; Steinkamp, O; Stevenson, S; Stoica, S; Stone, S; Storaci, B; Stracka, S; Straticiuc, M; Straumann, U; Stroili, R; Subbiah, V K; Sun, L; Sutcliffe, W; Swientek, S; Syropoulos, V; Szczekowski, M; Szczypka, P; Szilard, D; Szumlak, T; T'Jampens, S; Teklishyn, M; Tellarini, G; Teodorescu, E; Teubert, F; Thomas, C; Thomas, E; van Tilburg, J; Tisserand, V; Tobin, M; Tolk, S; Tomassetti, L; Tonelli, D; Topp-Joergensen, S; Torr, N; Tournefier, E; Tourneur, S; Tran, M T; Tresch, M; Tsaregorodtsev, A; Tsopelas, P; Tuning, N; Ubeda Garcia, M; Ukleja, A; Ustyuzhanin, A; Uwer, U; Vagnoni, V; Valenti, G; Vallier, A; Vazquez Gomez, R; Vazquez Regueiro, P; Vázquez Sierra, C; Vecchi, S; Velthuis, J J; Veltri, M; Veneziano, G; Vesterinen, M; Viaud, B; 
Vieira, D; Vilasis-Cardona, X; Vollhardt, A; Volyanskyy, D; Voong, D; Vorobyev, A; Vorobyev, V; Voß, C; Voss, H; de Vries, J A; Waldi, R; Wallace, C; Wallace, R; Wandernoth, S; Wang, J; Ward, D R; Watson, N K; Webber, A D; Websdale, D; Whitehead, M; Wicht, J; Wiechczynski, J; Wiedner, D; Wiggers, L; Wilkinson, G; Williams, M P; Williams, M; Wilson, F F; Wimberley, J; Wishahi, J; Wislicki, W; Witek, M; Wormser, G; Wotton, S A; Wright, S; Wu, S; Wyllie, K; Xie, Y; Xing, Z; Yang, Z; Yuan, X; Yushchenko, O; Zangoli, M; Zavertyaev, M; Zhang, F; Zhang, L; Zhang, W C; Zhang, Y; Zhelezov, A; Zhokhov, A; Zhong, L; Zvyagin, A

    The lifetime of the [Formula: see text] meson is measured using semileptonic decays having a [Formula: see text] meson and a muon in the final state. The data, corresponding to an integrated luminosity of [Formula: see text], are collected by the LHCb detector in [Formula: see text] collisions at a centre-of-mass energy of 8 TeV. The measured lifetime is [Formula: see text], where the first uncertainty is statistical and the second is systematic.

  14. Making School Development Credible. Text, Context, Irony

    Directory of Open Access Journals (Sweden)

    Mats Börjesson

    2012-01-01

    The article argues for the importance of an open, reflexive-methodological approach when switching between studying text, context and researcher activity. Close linguistic analysis can benefit from being linked with the researcher’s contextualisation of his empirical material as well as with more distanced readings. The more specific starting point for this article is that school development, like similar terms such as school improvement, makes use of linguistic building blocks with which whole narratives about today’s and tomorrow’s schools can be constructed. The subject of the study is a short text issued by the Swedish Schools Inspectorate (Skolinspektionen). Government language changes according to the authorities’ role in society and their own definitions of their functions, and an important aspect here is the legitimacy of the authorities’ texts. By means of various kinds of close linguistic analysis, the above-mentioned text is studied with regard to choice of categories, hierarchies of modalisation and the rhetorical effects of different types of formulations in a broader political-social landscape. The article concludes with a reflective discussion on the relationship between government language and irony as a stylistic device, a device that is based on the results of the close empirical analysis.[i]



    [i] The article is part of the project “School Development as Narrative”, funded by the Swedish Research Council. The author would like to thank the two reviewers for very valuable comments.

  15. DEXTER: Disease-Expression Relation Extraction from Text.

    Science.gov (United States)

    Gupta, Samir; Dingerdissen, Hayley; Ross, Karen E; Hu, Yu; Wu, Cathy H; Mazumder, Raja; Vijay-Shanker, K

    2018-01-01

    Gene expression levels affect biological processes and play a key role in many diseases. Characterizing expression profiles is useful for clinical research, and diagnostics and prognostics of diseases. There are currently several high-quality databases that capture gene expression information, obtained mostly from large-scale studies, such as microarray and next-generation sequencing technologies, in the context of disease. The scientific literature is another rich source of information on gene expression-disease relationships that not only have been captured from large-scale studies but have also been observed in thousands of small-scale studies. Expression information obtained from literature through manual curation can extend expression databases. While many of the existing databases include information from literature, they are limited by the time-consuming nature of manual curation and have difficulty keeping up with the explosion of publications in the biomedical field. In this work, we describe an automated text-mining tool, Disease-Expression Relation Extraction from Text (DEXTER), to extract information from literature on gene and microRNA expression in the context of disease. One of the motivations in developing DEXTER was to extend the BioXpress database, a cancer-focused gene expression database that includes data derived from large-scale experiments and manual curation of publications. The literature-based portion of BioXpress lags significantly behind the expression information obtained from large-scale studies and can benefit from our text-mined results. We have conducted two different evaluations to measure the accuracy of our text-mining tool and achieved average F-scores of 88.51% and 81.81% for the two evaluations, respectively. Also, to demonstrate the ability to extract rich expression information in different disease-related scenarios, we used DEXTER to extract differential expression information for 2024 genes in lung
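    As a reading aid for the F-scores quoted in this record, the following is a minimal sketch of how a balanced F-score (F1, the harmonic mean of precision and recall) is conventionally computed. It assumes the abstract refers to F1; the function name and the counts are hypothetical and are not taken from the DEXTER evaluations.

```python
def f_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall.

    The counts are illustrative inputs, not DEXTER results.
    """
    if true_pos == 0:
        # No correct extractions: precision and recall are both zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: 8 correct relations, 2 spurious, 2 missed.
example_f1 = f_score(8, 2, 2)
```

    With equal precision and recall the score equals both; when they differ, the harmonic mean stays closer to the lower of the two.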

  16. Text collections for evaluation of Russian morphological taggers

    Directory of Open Access Journals (Sweden)

    Lyashevskaya Olga

    2017-12-01

    The paper describes the preparation and development of the text collections within the framework of the MorphoRuEval-2017 shared task, an evaluation campaign designed to stimulate the development of automatic morphological processing technologies for Russian. The main challenge for the organizers was to standardize all available Russian corpora with manually verified high-quality tagging into a single format (Universal Dependencies CoNLL-U). The sources of the data were the disambiguated subcorpus of the Russian National Corpus, SynTagRus, OpenCorpora.org data and the GICR corpus with resolved homonymy, all exhibiting different tagsets, rules for lemmatization, pipeline architecture, technical solutions and error systematicity. The collections include both normative texts (news and modern literature) and more informal discourse (social media and spoken data); the texts are available under the CC BY-NC-SA 3.0 license.
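    To illustrate the target format named in this record, the following is a minimal sketch of reading one CoNLL-U token line, which carries ten tab-separated fields. The helper names and the sample token are hypothetical and are not drawn from the MorphoRuEval-2017 collections; comment lines, multiword tokens and empty nodes are not handled.

```python
# The ten columns of a CoNLL-U token line, in order.
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]


def parse_feats(feats: str) -> dict:
    """Turn 'Case=Nom|Number=Sing' into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def parse_token_line(line: str) -> dict:
    """Parse a single token line into a field-name -> value mapping."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(CONLLU_FIELDS):
        raise ValueError(f"expected 10 fields, got {len(values)}")
    token = dict(zip(CONLLU_FIELDS, values))
    token["feats"] = parse_feats(token["feats"])
    return token


# Hypothetical sample line (a Russian noun, 'кошка' = 'cat').
sample = "1\tкошка\tкошка\tNOUN\t_\tAnimacy=Anim|Case=Nom|Gender=Fem|Number=Sing\t0\troot\t_\t_"
token = parse_token_line(sample)
```

    Keeping the morphological features as a dictionary makes it straightforward to compare taggers feature by feature, which is the kind of comparison a shared tagset is meant to enable.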

  17. A Sample Typology of Texts in Corporate Discourse

    Directory of Open Access Journals (Sweden)

    Jacek Kołata

    2009-11-01

    This article presents a working typology of the different texts found in corporate discourse. The data for the analysis are drawn from various groups of documents existing in Nestle Corporation. The division into categories was possible after highlighting the most discriminative features of the texts under investigation. Moreover, it makes it possible to reveal how texts are shaped by the contexts in which they exist. Bearing the above in mind, we must not forget that written utterances are always influenced by different but closely related parameters, such as a sender, a recipient, a particular incident and an aim of the conversation; more precisely, they cannot exist independently. This paper attempts to point out the weaknesses and merits of the corporate discourse communication system in the described company and, by doing so, to facilitate the flow of information among all departments, employees and factories.

  18. Enhancing a Teen Pregnancy Prevention Program with text messaging: engaging minority youth to develop TOP® Plus Text.

    Science.gov (United States)

    Devine, Sharon; Bull, Sheana; Dreisbach, Susan; Shlay, Judith

    2014-03-01

    This study aimed to develop and pilot a theory-based, mobile phone texting component attractive to minority youth as a supplement to the Teen Outreach Program® (TOP®), a youth development program for reducing teen pregnancy and school dropout. We conducted iterative formative research with minority youth in multiple focus groups to explore interest in texting and reaction to text messages. We piloted a month-long version of TOP® Plus Text with 96 teens at four sites and conducted a computer-based survey immediately after enrollment and at the end of the pilot that collected information about teens' values, social support, self-efficacy, and behaviors relating to school performance, trouble with the law, and sexual activity. After each of the first three weekly sessions we collected satisfaction measures. Upon completion of the pilot we conducted exit interviews with twelve purposively selected pilot participants. We successfully recruited and enrolled minority youth into the pilot. Teens were enthusiastic about text messages complementing TOP®. Results also revealed barriers: access to text-capable mobile phones, retention as measured by completion of the post-pilot survey, and a need to be attentive to teen literacy. Piloting helped identify improvements for implementation, including offering text messages through multiple platforms so youth without access to a mobile phone could receive messages; rewording texts to allow youth to express opinions without feeling judged; and collecting multiple types of contact information to improve follow-up. Thoughtful attention to social and behavioral theory and investment in iterative formative research with extensive consultation with teens can lead to an engaging texting curriculum that enhances and complements TOP®.

  19. Deep Belief Networks Based Toponym Recognition for Chinese Text

    Directory of Open Access Journals (Sweden)

    Shu Wang

    2018-06-01

    In Geographical Information Systems, geo-coding is used for the task of mapping from implicitly geo-referenced data to explicitly geo-referenced coordinates. At present, an enormous amount of implicitly geo-referenced information is hidden in unstructured text, e.g., Wikipedia, social data and news. Toponym recognition is the foundation of mining this useful geo-referenced information by identifying words as toponyms in text. In this paper, we propose an adapted toponym recognition approach based on a deep belief network (DBN) by exploring two key issues: word representation and model interpretation. A Skip-Gram model is used in the word representation process to represent words with contextual information that is ignored by current word representation models. We then determine the core hyper-parameters of the DBN model by illustrating the relationship between the performance and the hyper-parameters, e.g., vector dimensionality, DBN structures and probability thresholds. The experiments evaluate the performance of the Skip-Gram model implemented by the Word2Vec open-source tool, determine stable hyper-parameters and compare our approach with a conditional random field (CRF)-based approach. The experimental results show that the DBN model outperforms the CRF model with a smaller corpus. When the corpus size is large enough, their statistical metrics converge. However, their recognition results differ and are complementary across different kinds of toponyms. More importantly, combining their results can directly improve the performance of toponym recognition relative to their individual performances. It seems that the scale of the corpus has an obvious effect on the performance of toponym recognition. Generally, there is no adequate tagged corpus on specific toponym recognition tasks, especially in the era of Big Data. In conclusion, we believe that the DBN-based approach is a promising and powerful method to extract geo
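    To make the Skip-Gram idea mentioned in this record concrete, the following is a minimal sketch of how (center, context) training pairs are formed from a token sequence with a fixed window. It is not the authors' pipeline or the Word2Vec implementation: the function name, the window size and the pre-segmented sample sentence are hypothetical, and refinements such as subsampling or negative sampling are omitted.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical pre-segmented Chinese sentence: "I live in Beijing Haidian-District".
sentence = ["我", "住", "在", "北京", "海淀区"]
pairs = skipgram_pairs(sentence, window=1)
```

    A Skip-Gram model is trained to predict the context word from the center word across such pairs, so words that occur in similar surroundings, such as place names following "在", end up with similar vectors.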

  20. Advanced text and video analytics for proactive decision making

    Science.gov (United States)

    Bowman, Elizabeth K.; Turek, Matt; Tunison, Paul; Porter, Reed; Thomas, Steve; Gintautas, Vadas; Shargo, Peter; Lin, Jessica; Li, Qingzhe; Gao, Yifeng; Li, Xiaosheng; Mittu, Ranjeev; Rosé, Carolyn Penstein; Maki, Keith; Bogart, Chris; Choudhari, Samrihdi Shree

    2017-05-01

    Today's warfighters operate in a highly dynamic and uncertain world, and face many competing demands. Asymmetric warfare and the new focus on small, agile forces have altered the framework by which time-critical information is digested and acted upon by decision makers. Finding and integrating decision-relevant information is increasingly difficult in data-dense environments. In this new information environment, agile data algorithms, machine learning software, and threat alert mechanisms must be developed to automatically create alerts and drive quick response. Yet these advanced technologies must be balanced with awareness of the underlying context to accurately interpret machine-processed indicators and warnings and recommendations. One promising approach to this challenge brings together information retrieval strategies from text, video, and imagery. In this paper, we describe a technology demonstration that represents two years of tri-service research seeking to meld text and video for enhanced content awareness. The demonstration used multisource data to find an intelligence solution to a problem using a common dataset. Three technology highlights from this effort are: 1) incorporation of external sources of context into imagery normalcy modeling and anomaly detection capabilities, 2) automated discovery and monitoring of targeted users from social media text, regardless of language, and 3) the concurrent use of text and imagery to characterize behaviour using the concept of kinematic and text motifs to detect novel and anomalous patterns. Our demonstration provided a technology baseline for exploiting heterogeneous data sources to deliver timely and accurate synopses of data that contribute to a dynamic and comprehensive worldview.