WorldWideScience

Sample records for text region extraction

  1. Key Frame Extraction for Text Based Video Retrieval Using Maximally Stable Extremal Regions

    Directory of Open Access Journals (Sweden)

    Werachard Wattanarachothai

    2015-04-01

    This paper presents a new approach for a text-based video content retrieval system. The proposed scheme consists of three main processes: key frame extraction, text localization, and keyword matching. For key frame extraction, we propose a Maximally Stable Extremal Region (MSER)-based feature that segments the video into shots with different text contents. In the text localization process, the MSERs in each key frame are clustered into text lines based on their similarity in position, size, color, and stroke width. The Tesseract OCR engine is then used to recognize the text regions. To improve the recognition results, we feed four images obtained from different pre-processing methods to the Tesseract engine. Finally, the target query keyword is matched against the OCR results using an approximate string search scheme. The experiments show that, with the MSER feature, the videos can be segmented into an efficient number of shots, with better precision and recall than sum-of-absolute-difference and edge-based methods.
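
The keyword-matching step can be illustrated with a small sketch. The abstract does not say which approximate string search scheme is used, so the code below assumes a plain Levenshtein edit distance with a tolerance of one error; the function names `edit_distance` and `keyword_hits` and the `max_errors` parameter are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character of a
                            curr[j - 1] + 1,      # insert a character of b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def keyword_hits(keyword: str, ocr_text: str, max_errors: int = 1) -> list[str]:
    """Return OCR tokens within max_errors edits of the keyword (case-insensitive).

    Assumption: whitespace tokenization and a fixed error budget; the paper's
    actual matching scheme is not described in the abstract.
    """
    target = keyword.lower()
    return [tok for tok in ocr_text.split()
            if edit_distance(target, tok.lower()) <= max_errors]
```

With this tolerance, a query for "weather" would still match an OCR token such as "wcather", where one character was misrecognized.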

  2. Metadata extraction using text mining.

    Science.gov (United States)

    Seth, Shivani; Rüping, Stefan; Wrobel, Stefan

    2009-01-01

    Grid technologies have proven to be very successful in the area of eScience, and healthcare in particular, because they make it easy to combine proven solutions for data querying, integration, and analysis into a secure, scalable framework. In order to integrate the services that implement these solutions into a given Grid architecture, some metadata is required, for example information about the low-level access to these services, security information, and some documentation for the user. In this paper, we investigate how relevant metadata can be extracted by text mining methods from semi-structured textual documentation of the algorithm underlying the service. In particular, we investigate the semi-automatic conversion of functions of the statistical environment R into Grid services, as implemented by the GridR tool, through the generation of appropriate metadata.

  3. Figure text extraction in biomedical literature.

    Directory of Open Access Journals (Sweden)

    Daehyun Kim

    BACKGROUND: Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine (http://figuresearch.askHERMES.org) to allow bioscientists to access figures efficiently. Since text frequently appears in figures, automatically extracting such text may assist the task of mining information from figures. Little research, however, has been conducted exploring text extraction from biomedical figures. METHODOLOGY: We first evaluated an off-the-shelf Optical Character Recognition (OCR) tool on its ability to extract text from figures appearing in biomedical full-text articles. We then developed a Figure Text Extraction Tool (FigTExT) to improve the performance of the OCR tool for figure text extraction through the use of three innovative components: image preprocessing, character recognition, and text correction. We first developed image preprocessing to enhance image quality and to improve text localization. Then we applied the off-the-shelf OCR tool to the improved text localization for character recognition. Finally, we developed and evaluated a novel text correction framework by taking advantage of figure-specific lexicons. RESULTS/CONCLUSIONS: The evaluation on 382 figures (9,643 figure texts in total) randomly selected from PubMed Central full-text articles shows that FigTExT performed with 84% precision, 98% recall, and 90% F1-score for text localization and with 62.5% precision, 51.0% recall and 56.2% F1-score for figure text extraction. When limiting figure texts to those judged by domain experts to be important content, FigTExT performed with 87.3% precision, 68.8% recall, and 77% F1-score. FigTExT significantly improved the performance of the off-the-shelf OCR tool we used, which on its own performed with 36
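
As a quick consistency check on the figures quoted in this abstract, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision and recall pairs; the function name `f1_score` and the task labels are only illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale, e.g. percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (task label, precision %, recall %) as quoted in the abstract above
reported = [
    ("text localization", 84.0, 98.0),
    ("figure text extraction", 62.5, 51.0),
    ("important figure text", 87.3, 68.8),
]

for task, p, r in reported:
    print(f"{task}: F1 = {f1_score(p, r):.1f}%")
```

The recomputed values agree with the reported 90%, 56.2% and 77% once rounded to the precision used in the abstract.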

  4. Ontology Assisted Formal Specification Extraction from Text

    Directory of Open Access Journals (Sweden)

    Andreea Mihis

    2010-12-01

    In the field of knowledge processing, ontologies are the most important means. They make it possible for the computer to better understand natural language and to make judgments. This paper proposes a method that uses ontologies in the semi-automatic extraction of formal specifications from natural language text.

  5. Unsupervised information extraction by text segmentation

    CERN Document Server

    Cortez, Eli

    2013-01-01

    A new unsupervised approach to the problem of Information Extraction by Text Segmentation (IETS) is proposed, implemented and evaluated herein. The authors' approach relies on information available on pre-existing data to learn how to associate segments in the input string with attributes of a given domain relying on a very effective set of content-based features. The effectiveness of the content-based features is also exploited to directly learn from test data structure-based features, with no previous human-driven training, a feature unique to the presented approach. Based on the approach, a

  6. Extraction of information from unstructured text

    Energy Technology Data Exchange (ETDEWEB)

    Irwin, N.H.; DeLand, S.M.; Crowder, S.V.

    1995-11-01

    Extracting information from unstructured text has become an emphasis in recent years due to the large amount of text now electronically available. This status report describes the findings and work done by the end of the first year of a two-year LDRD. Requirements of the approach included that it model the information in a domain independent way. This means that it would differ from current systems by not relying on previously built domain knowledge and that it would do more than keyword identification. Three areas that are discussed and expected to contribute to a solution include (1) identifying key entities through document level profiling and preprocessing, (2) identifying relationships between entities through sentence level syntax, and (3) combining the first two with semantic knowledge about the terms.

  7. Terminology extraction from medical texts in Polish.

    Science.gov (United States)

    Marciniak, Małgorzata; Mykowiecka, Agnieszka

    2014-01-01

    Hospital documents contain free text describing the most important facts relating to patients and their illnesses. These documents are written in specific language containing medical terminology related to hospital treatment. Their automatic processing can help in verifying the consistency of hospital documentation and obtaining statistical data. To perform this task we need information on the phrases we are looking for. At the moment, clinical Polish resources are sparse. The existing terminologies, such as Polish Medical Subject Headings (MeSH), do not provide sufficient coverage for clinical tasks. It would be helpful therefore if it were possible to automatically prepare, on the basis of a data sample, an initial set of terms which, after manual verification, could be used for the purpose of information extraction. Using a combination of linguistic and statistical methods for processing over 1200 children hospital discharge records, we obtained a list of single and multiword terms used in hospital discharge documents written in Polish. The phrases are ordered according to their presumed importance in domain texts measured by the frequency of use of a phrase and the variety of its contexts. The evaluation showed that the automatically identified phrases cover about 84% of terms in domain texts. At the top of the ranked list, only 4% out of 400 terms were incorrect while out of the final 200, 20% of expressions were either not domain related or syntactically incorrect. We also observed that 70% of the obtained terms are not included in the Polish MeSH. Automatic terminology extraction can give results which are of a quality high enough to be taken as a starting point for building domain related terminological dictionaries or ontologies. This approach can be useful for preparing terminological resources for very specific subdomains for which no relevant terminologies already exist. The evaluation performed showed that none of the tested ranking procedures were

  8. Extracting Conceptual Feature Structures from Text

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik; Lassen, Tine

    2011-01-01

    This paper describes an approach to indexing texts by their conceptual content using ontologies along with lexico-syntactic information and semantic role assignment provided by lexical resources. The conceptual content of meaningful chunks of text is transformed into conceptual feature structures...... and mapped into concepts in a generative ontology. Synonymous but linguistically quite distinct expressions are mapped to the same concept in the ontology. This allows us to perform a content-based search which will retrieve relevant documents independently of the linguistic form of the query as well...

  9. Automatic extraction of angiogenesis bioprocess from text

    Science.gov (United States)

    Wang, Xinglong; McKendrick, Iain; Barrett, Ian; Dix, Ian; French, Tim; Tsujii, Jun'ichi; Ananiadou, Sophia

    2011-01-01

    Motivation: Understanding key biological processes (bioprocesses) and their relationships with constituent biological entities and pharmaceutical agents is crucial for drug design and discovery. One way to harvest such information is searching the literature. However, bioprocesses are difficult to capture because they may occur in text in a variety of textual expressions. Moreover, a bioprocess is often composed of a series of bioevents, where a bioevent denotes changes to one or a group of cells involved in the bioprocess. Such bioevents are often used to refer to bioprocesses in text, which current techniques, relying solely on specialized lexicons, struggle to find. Results: This article presents a range of methods for finding bioprocess terms and events. To facilitate the study, we built a gold standard corpus in which terms and events related to angiogenesis, a key biological process of the growth of new blood vessels, were annotated. Statistics of the annotated corpus revealed that over 36% of the text expressions that referred to angiogenesis appeared as events. The proposed methods respectively employed domain-specific vocabularies, a manually annotated corpus and unstructured domain-specific documents. Evaluation results showed that, while a supervised machine-learning model yielded the best precision, recall and F1 scores, the other methods achieved reasonable performance and less cost to develop. Availability: The angiogenesis vocabularies, gold standard corpus, annotation guidelines and software described in this article are available at http://text0.mib.man.ac.uk/~mbassxw2/angiogenesis/ Contact: xinglong.wang@gmail.com PMID:21821664

  10. Text Character Extraction Implementation from Captured Handwritten Image to Text Conversion using Template Matching Technique

    Directory of Open Access Journals (Sweden)

    Barate Seema

    2016-01-01

    Images contain various types of useful information that should be extracted whenever required. Various algorithms and methods have been proposed to extract text from a given image, so that a user can access the text in any image. Text varies in size, style, orientation, and alignment, and low image contrast and composite backgrounds make text extraction difficult. An application that extracts and recognizes such text accurately in real time can be applied to many important tasks such as document analysis, vehicle license plate extraction, and text-based image indexing, and many such applications have become realities in recent years. To overcome the above problems, we develop an application that converts the image into text using algorithms such as bounding box, HSV model, blob analysis, template matching, and template generation.

  11. Methods for Evaluating Text Extraction Toolkits: An Exploratory Investigation

    Science.gov (United States)

    2015-01-22

    SNAPSHOT extracted “a b c d f”. Borrowing technical terms from the field of corpus linguistics, we would say that the text extracted by Tika 1.5 had 8 ... Although this effort focuses on the popular open source Apache Tika toolkit and the govdocs1 corpus, the method generally applies to other text ... a text extraction toolkit.

  12. Text feature extraction based on deep learning: a review.

    Science.gov (United States)

    Liang, Hong; Sun, Xiao; Sun, Yunlei; Gao, Yuan

    2017-01-01

    Selecting text features is a basic and important task in text mining and information retrieval. Traditional methods of feature extraction require handcrafted features. Hand-designing an effective feature is a lengthy process, whereas deep learning can acquire effective feature representations for new applications from training data. As a new feature extraction method, deep learning has made achievements in text mining. The major difference between deep learning and conventional methods is that deep learning automatically learns features from big data instead of adopting handcrafted features, which depend mainly on the designers' prior knowledge and can hardly take advantage of big data. Deep learning can automatically learn feature representations from big data, using models with millions of parameters. This thesis first outlines the common methods used in text feature extraction, then expands on frequently used deep learning methods in text feature extraction and their applications, and forecasts the application of deep learning in feature extraction.

  13. Layout-aware text extraction from full-text PDF of scientific articles.

    Science.gov (United States)

    Ramakrishnan, Cartic; Patnia, Abhishek; Hovy, Eduard; Burns, Gully Apc

    2012-05-28

    The Portable Document Format (PDF) is the most commonly used file format for online scientific publications. The absence of effective means to extract text from these PDF files in a layout-aware manner presents a significant challenge for developers of biomedical text mining or biocuration informatics systems that use published literature as an information source. In this paper we introduce the 'Layout-Aware PDF Text Extraction' (LA-PDFText) system to facilitate accurate extraction of text from PDF files of research articles for use in text mining applications. Our paper describes the construction and performance of an open source system that extracts text blocks from PDF-formatted full-text research articles and classifies them into logical units based on rules that characterize specific sections. The LA-PDFText system focuses only on the textual content of the research articles and is meant as a baseline for further experiments into more advanced extraction methods that handle multi-modal content, such as images and graphs. The system works in a three-stage process: (1) Detecting contiguous text blocks using spatial layout processing to locate and identify blocks of contiguous text, (2) Classifying text blocks into rhetorical categories using a rule-based method and (3) Stitching classified text blocks together in the correct order resulting in the extraction of text from section-wise grouped blocks. We show that our system can identify text blocks and classify them into rhetorical categories with Precision = 0.96%, Recall = 0.89% and F1 = 0.91%. We also present an evaluation of the accuracy of the block detection algorithm used in step 2. Additionally, we have compared the accuracy of the text extracted by LA-PDFText to the text from the Open Access subset of PubMed Central. We then compared this accuracy with that of the text extracted by the PDF2Text system, commonly used to extract text from PDF. Finally, we discuss preliminary error analysis for

  14. Layout-aware text extraction from full-text PDF of scientific articles

    Directory of Open Access Journals (Sweden)

    Ramakrishnan Cartic

    2012-05-01

    Background: The Portable Document Format (PDF) is the most commonly used file format for online scientific publications. The absence of effective means to extract text from these PDF files in a layout-aware manner presents a significant challenge for developers of biomedical text mining or biocuration informatics systems that use published literature as an information source. In this paper we introduce the ‘Layout-Aware PDF Text Extraction’ (LA-PDFText) system to facilitate accurate extraction of text from PDF files of research articles for use in text mining applications. Results: Our paper describes the construction and performance of an open source system that extracts text blocks from PDF-formatted full-text research articles and classifies them into logical units based on rules that characterize specific sections. The LA-PDFText system focuses only on the textual content of the research articles and is meant as a baseline for further experiments into more advanced extraction methods that handle multi-modal content, such as images and graphs. The system works in a three-stage process: (1) Detecting contiguous text blocks using spatial layout processing to locate and identify blocks of contiguous text, (2) Classifying text blocks into rhetorical categories using a rule-based method and (3) Stitching classified text blocks together in the correct order resulting in the extraction of text from section-wise grouped blocks. We show that our system can identify text blocks and classify them into rhetorical categories with Precision = 0.96%, Recall = 0.89% and F1 = 0.91%. We also present an evaluation of the accuracy of the block detection algorithm used in step 2. Additionally, we have compared the accuracy of the text extracted by LA-PDFText to the text from the Open Access subset of PubMed Central. We then compared this accuracy with that of the text extracted by the PDF2Text system, commonly used to extract text from PDF

  15. Automatic Definition Extraction and Crossword Generation From Spanish News Text

    Directory of Open Access Journals (Sweden)

    Jennifer Esteche

    2017-08-01

    This paper describes the design and implementation of a system that takes Spanish texts and generates crosswords (board and definitions) in a fully automatic way using definitions extracted from those texts. Our solution divides the problem into two parts: a definition extraction module that applies pattern matching implemented in Python, and a crossword generation module that uses a greedy strategy implemented in Prolog. The system achieves 73% precision and builds crosswords similar to those built by humans.

  16. A COMPREHENSIVE STUDY ON TEXT INFORMATION EXTRACTION FROM NATURAL SCENE IMAGES

    Directory of Open Access Journals (Sweden)

    Anit V. Manjaly

    2016-08-01

    In the Text Information Extraction (TIE) process, text regions are localized and extracted from images. It is an active research problem in computer vision applications. Diversity in text arises from differences in size, style, orientation, and alignment of text, low image contrast, and complex backgrounds. The semantic information provided by an image can be used in different applications such as content-based image retrieval and sign board identification. Text information extraction comprises text image classification, text detection, localization, segmentation, enhancement, and recognition. This paper contains a quick review of various text localization methods for localizing text in natural scene images.

  17. PDF text classification to leverage information extraction from publication reports.

    Science.gov (United States)

    Bui, Duy Duc An; Del Fiol, Guilherme; Jonnalagadda, Siddhartha

    2016-06-01

    Data extraction from original study reports is a time-consuming, error-prone process in systematic review development. Information extraction (IE) systems have the potential to assist humans in the extraction task; however, the majority of IE systems were not designed to work on Portable Document Format (PDF) documents, an important and common extraction source for systematic reviews. In a PDF document, narrative content is often mixed with publication metadata or semi-structured text, which adds challenges for the underlying natural language processing algorithm. Our goal is to categorize PDF texts for strategic use by IE systems. We used an open-source tool to extract raw texts from a PDF document and developed a text classification algorithm that follows a multi-pass sieve framework to automatically classify PDF text snippets (for brevity, texts) into TITLE, ABSTRACT, BODYTEXT, SEMISTRUCTURE, and METADATA categories. To validate the algorithm, we developed a gold standard of PDF reports that were included in the development of previous systematic reviews by the Cochrane Collaboration. In a two-step procedure, we evaluated (1) classification performance, and compared it with a machine learning classifier, and (2) the effects of the algorithm on an IE system that extracts clinical outcome mentions. The multi-pass sieve algorithm achieved an accuracy of 92.6%, which was 9.7% (pPDF documents. Text classification is an important prerequisite step to leverage information extraction from PDF documents. Copyright © 2016 Elsevier Inc. All rights reserved.

  18. Rational kernels for Arabic Root Extraction and Text Classification

    Directory of Open Access Journals (Sweden)

    Attia Nehar

    2016-04-01

    In this paper, we address the problems of Arabic Text Classification and root extraction using transducers and rational kernels. We introduce a new root extraction approach based on the use of Arabic patterns (Pattern Based Stemmer). Transducers are used to model these patterns, and root extraction is done without relying on any dictionary. Using transducers for extracting roots, documents are transformed into finite state transducers. This document representation allows us to use and explore rational kernels as a framework for Arabic Text Classification. Root extraction experiments are conducted on three word collections and yield 75.6% accuracy. Classification experiments are done on the Saudi Press Agency dataset, and N-gram kernels are tested with different values of N. Accuracy and F1 are 90.79% and 62.93% respectively. These results show that our approach, when compared with other approaches, is promising, especially in terms of accuracy and F1.
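
To give a feel for what an N-gram kernel measures, here is a minimal sketch that works directly on plain strings rather than on finite state transducers, which is a simplifying assumption and not the representation used in the paper. The kernel value is the sum, over all n-grams shared by the two inputs, of the product of their occurrence counts. The names `ngram_counts` and `ngram_kernel` are hypothetical.

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count every contiguous character n-gram in text (empty if text is shorter than n)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_kernel(x: str, y: str, n: int) -> int:
    """Sum over shared n-grams of count_in_x * count_in_y.

    Assumption: character n-grams over raw strings; a transducer-based
    rational kernel generalizes this to weighted automata.
    """
    cx, cy = ngram_counts(x, n), ngram_counts(y, n)
    return sum(count * cy[gram] for gram, count in cx.items())
```

Two documents that share many n-grams get a large kernel value and two documents with none in common get zero, so the value can serve as a similarity measure inside a kernel-based classifier.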

  19. Text extraction method for historical Tibetan document images based on block projections

    Science.gov (United States)

    Duan, Li-juan; Zhang, Xi-qun; Ma, Long-long; Wu, Jian

    2017-11-01

    Text extraction is an important initial step in digitizing the historical documents. In this paper, we present a text extraction method for historical Tibetan document images based on block projections. The task of text extraction is considered as text area detection and location problem. The images are divided equally into blocks and the blocks are filtered by the information of the categories of connected components and corner point density. By analyzing the filtered blocks' projections, the approximate text areas can be located, and the text regions are extracted. Experiments on the dataset of historical Tibetan documents demonstrate the effectiveness of the proposed method.
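
The block-projection idea can be sketched as follows, assuming a binarized image stored as a list of rows with 1 for ink and 0 for background. The sketch covers only the projection analysis; the filtering by connected-component categories and corner point density mentioned in the abstract is omitted, and all function names and the `min_ink` threshold are hypothetical.

```python
def horizontal_projection(image, x0, x1):
    """Count foreground pixels in each row, restricted to columns [x0, x1)."""
    return [sum(row[x0:x1]) for row in image]


def text_bands(profile, min_ink=1):
    """Return (start_row, end_row_exclusive) runs where the profile reaches min_ink."""
    bands, start = [], None
    for y, value in enumerate(profile):
        if value >= min_ink and start is None:
            start = y                      # a band of inked rows begins
        elif value < min_ink and start is not None:
            bands.append((start, y))       # the band ended on the previous row
            start = None
    if start is not None:
        bands.append((start, len(profile)))
    return bands


def bands_per_block(image, n_blocks):
    """Split the image into n_blocks equal-width vertical strips and find bands in each.

    Columns left over when the width is not divisible by n_blocks are ignored.
    """
    step = len(image[0]) // n_blocks
    return [text_bands(horizontal_projection(image, i * step, (i + 1) * step))
            for i in range(n_blocks)]
```

Working per block rather than on the full image width keeps a skewed or partially filled line from being smeared across the whole projection profile.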

  20. AViTExt: Automatic Video Text Extraction, A new Approach for video content indexing Application

    OpenAIRE

    Bouaziz, Baseem; Zlitni, Tarek; Walid MAHDI

    2013-01-01

    In this paper, we propose a spatial temporal video-text detection technique which proceeds in two principal steps: potential text region detection and a filtering process. In the first step we dynamically divide each pair of consecutive video frames into sub-blocks in order to detect change. A significant difference between homologous blocks implies the appearance of an important object which may be a text region. The temporal redundancy is then used to filter these regions and forms an effectiv...
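
The first step, comparing homologous blocks of two consecutive frames, might look like the sketch below. It assumes grayscale frames stored as lists of rows, a fixed block size instead of the dynamic division described in the abstract, and mean intensity as the block descriptor; the names `block_means` and `changed_blocks` and the threshold value are hypothetical.

```python
def block_means(frame, block):
    """Mean intensity of each block x block tile, keyed by the tile's top-left (row, col)."""
    height, width = len(frame), len(frame[0])
    means = {}
    for by in range(0, height - block + 1, block):
        for bx in range(0, width - block + 1, block):
            total = sum(frame[y][x]
                        for y in range(by, by + block)
                        for x in range(bx, bx + block))
            means[(by, bx)] = total / (block * block)
    return means


def changed_blocks(prev_frame, next_frame, block=2, threshold=30.0):
    """Return positions of tiles whose mean intensity changed by more than threshold."""
    before = block_means(prev_frame, block)
    after = block_means(next_frame, block)
    return [pos for pos in before if abs(before[pos] - after[pos]) > threshold]
```

Tiles flagged here are only candidates; the abstract describes a second filtering step based on temporal redundancy, which this sketch does not attempt.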

  1. Information Extraction from Unstructured Text for the Biodefense Knowledge Center

    Energy Technology Data Exchange (ETDEWEB)

    Samatova, N F; Park, B; Krishnamurthy, R; Munavalli, R; Symons, C; Buttler, D J; Cottom, T; Critchlow, T J; Slezak, T

    2005-04-29

    The Bio-Encyclopedia at the Biodefense Knowledge Center (BKC) is being constructed to allow an early detection of emerging biological threats to homeland security. It requires highly structured information extracted from a variety of data sources. However, the quantity of new and vital information available from everyday sources cannot be assimilated by hand, and therefore reliable high-throughput information extraction techniques are much anticipated. In support of the BKC, Lawrence Livermore National Laboratory and Oak Ridge National Laboratory, together with the University of Utah, are developing an information extraction system built around the bioterrorism domain. This paper reports two important pieces of our effort integrated in the system: key phrase extraction and semantic tagging. Whereas two key phrase extraction technologies developed during the course of the project help identify relevant texts, our state-of-the-art semantic tagging system can pinpoint phrases related to emerging biological threats. We are also enhancing and tailoring the Bio-Encyclopedia by augmenting semantic dictionaries and extracting details of important events, such as suspected disease outbreaks. Some of these technologies have already been applied to large corpora of free text sources vital to the BKC mission, including ProMED-mail, PubMed abstracts, and the DHS's Information Analysis and Infrastructure Protection (IAIP) news clippings. In order to address the challenges involved in incorporating such large amounts of unstructured text, the overall system is focused on precise extraction of the most relevant information for inclusion in the BKC.

  2. Enhancing biomedical text summarization using semantic relation extraction.

    Directory of Open Access Journals (Sweden)

    Yue Shang

    Automatic text summarization for a biomedical concept can help researchers efficiently get the key points of a certain topic from a large amount of biomedical literature. In this paper, we present a method for generating a text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: (1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. (2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. (3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate a text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and that our results are better than those of the MEAD system, a well-known tool for text summarization.

  3. Document Exploration and Automatic Knowledge Extraction for Unstructured Biomedical Text

    Science.gov (United States)

    Chu, S.; Totaro, G.; Doshi, N.; Thapar, S.; Mattmann, C. A.; Ramirez, P.

    2015-12-01

    We describe our work on building a web-browser based document reader with built-in exploration tool and automatic concept extraction of medical entities for biomedical text. Vast amounts of biomedical information are offered in unstructured text form through scientific publications and R&D reports. Utilizing text mining can help us to mine information and extract relevant knowledge from a plethora of biomedical text. The ability to employ such technologies to aid researchers in coping with information overload is greatly desirable. In recent years, there has been an increased interest in automatic biomedical concept extraction [1, 2] and intelligent PDF reader tools with the ability to search on content and find related articles [3]. Such reader tools are typically desktop applications and are limited to specific platforms. Our goal is to provide researchers with a simple tool to aid them in finding, reading, and exploring documents. Thus, we propose a web-based document explorer, which we called Shangri-Docs, which combines a document reader with automatic concept extraction and highlighting of relevant terms. Shangri-Docs also provides the ability to evaluate a wide variety of document formats (e.g. PDF, Words, PPT, text, etc.) and to exploit the linked nature of the Web and personal content by performing searches on content from public sites (e.g. Wikipedia, PubMed) and private cataloged databases simultaneously. Shangri-Docs utilizes Apache cTAKES (clinical Text Analysis and Knowledge Extraction System) [4] and Unified Medical Language System (UMLS) to automatically identify and highlight terms and concepts, such as specific symptoms, diseases, drugs, and anatomical sites, mentioned in the text. cTAKES was originally designed specially to extract information from clinical medical records. Our investigation leads us to extend the automatic knowledge extraction process of cTAKES for biomedical research domain by improving the ontology guided information extraction

  4. Extracting biomedical events from pairs of text entities.

    Science.gov (United States)

    Liu, Xiao; Bordes, Antoine; Grandvalet, Yves

    2015-01-01

    Huge amounts of electronic biomedical documents, such as molecular biology reports or genomic papers are generated daily. Nowadays, these documents are mainly available in the form of unstructured free texts, which require heavy processing for their registration into organized databases. This organization is instrumental for information retrieval, enabling to answer the advanced queries of researchers and practitioners in biology, medicine, and related fields. Hence, the massive data flow calls for efficient automatic methods of text-mining that extract high-level information, such as biomedical events, from biomedical text. The usual computational tools of Natural Language Processing cannot be readily applied to extract these biomedical events, due to the peculiarities of the domain. Indeed, biomedical documents contain highly domain-specific jargon and syntax. These documents also describe distinctive dependencies, making text-mining in molecular biology a specific discipline. We address biomedical event extraction as the classification of pairs of text entities into the classes corresponding to event types. The candidate pairs of text entities are recursively provided to a multiclass classifier relying on Support Vector Machines. This recursive process extracts events involving other events as arguments. Compared to joint models based on Markov Random Fields, our model simplifies inference and hence requires shorter training and prediction times along with lower memory capacity. Compared to usual pipeline approaches, our model passes over a complex intermediate problem, while making a more extensive usage of sophisticated joint features between text entities. Our method focuses on the core event extraction of the Genia task of BioNLP challenges yielding the best result reported so far on the 2013 edition.

  5. Mining knowledge from text repositories using information extraction ...

    Indian Academy of Sciences (India)

    Computational Linguistics, Stroudsburg, PA, USA, pp 66–73. Rose S, Engel D, Cramer N and Cowley W 2010 Automatic keyword extraction from individual document,. Text mining: Application and theory, M W Berry and J Kogan (eds) John Willey & Sons Ltd 2010, pp 3–20. Sánchez D, Martín-Bautista M J and Blanco I 2008 ...

  6. Enhanced root extraction and document classification algorithm for Arabic text

    OpenAIRE

    Alsaad, Amal

    2016-01-01

    This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Many text extraction and classification systems have been developed for English and other international languages; most of these languages are based on Roman letters. However, Arabic is a difficult language with special rules and morphology, and few systems have been developed for Arabic text categorization. Arabic language is one of the Semitic languages with...

  7. METHOD OF RARE TERM CONTRASTIVE EXTRACTION FROM NATURAL LANGUAGE TEXTS

    Directory of Open Access Journals (Sweden)

    I. A. Bessmertny

    2017-01-01

    Full Text Available The paper considers the problem of automatic domain term extraction from a document corpus by means of a contrast collection. Existing contrastive methods successfully extract frequently used terms but mishandle rare terms, which can impoverish the resulting thesaurus. Point-wise mutual information is a known statistical method of term extraction that finds rare terms successfully, but it also extracts many false terms. The proposed approach applies point-wise mutual information to extract rare terms and then filters the candidates by the criterion of joint occurrence with other candidates. We build a “documents-by-terms” matrix that is subjected to singular value decomposition to eliminate noise and reveal strong interconnections. We then pass to the resulting “terms-by-terms” matrix, which reproduces the strength of interconnections between words. This approach was tested on a document collection from the “Geology” domain, using contrast documents on such topics as “Politics”, “Culture”, “Economics” and “Accidents” from some Internet resources. The experimental results demonstrate the viability of this method for rare term extraction.
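    As an illustration of the point-wise mutual information step only, the following is a minimal sketch that scores word pairs by document-level co-occurrence. It is not the authors' pipeline: the contrast collection, the candidate filtering, and the singular value decomposition are omitted, and the function name `pmi_scores` and the toy documents are assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(documents):
    """Return PMI (log base 2) for every word pair that co-occurs in a document.

    Probabilities are estimated as document frequencies divided by the
    number of documents, which is one common convention among several.
    """
    n_docs = len(documents)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for doc in documents:
        words = sorted(set(doc))  # sorted so each pair has one canonical order
        word_df.update(words)
        pair_df.update(combinations(words, 2))

    scores = {}
    for (a, b), n_ab in pair_df.items():
        p_ab = n_ab / n_docs
        p_a = word_df[a] / n_docs
        p_b = word_df[b] / n_docs
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["basalt", "magma", "rock"],
    ["basalt", "magma"],
    ["rock", "economy"],
    ["economy", "market"],
]
scores = pmi_scores(docs)
for pair, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(value, 3))
```

    Pairs that always appear together score above zero, while pairs that co-occur about as often as chance predicts score near zero. A word seen only once can still receive a high score, which is consistent with the abstract's remark that the measure finds rare terms but also produces false candidates.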

  8. Web text corpus extraction system for linguistic tasks

    Directory of Open Access Journals (Sweden)

    Héctor Fabio Cadavid Rengifo

    2010-05-01

    Full Text Available Internet content, used as a text corpus for natural language learning, offers important characteristics for such a task: its huge volume, being permanently up to date with linguistic variants, and low time and resource costs compared with the traditional way in which text is built for natural language machine learning tasks. This paper describes a system for the automatic extraction of large bodies of text from the Internet as a valuable tool for such learning tasks. A concurrent programming-based, hardware-use optimisation strategy that significantly improves extraction performance is also presented. The strategies incorporated into the system for maximising hardware resource exploitation, thereby reducing extraction time, are presented, as are extendibility (supporting digital-content formats) and adaptability (regarding how the system cleanses content for obtaining pure natural language samples). The experimental results obtained after processing one of the biggest Spanish domains on the Internet (i.e. es.wikipedia.org) are presented. These results are used to present initial conclusions about the validity and applicability of corpora directly extracted from the Internet as morphological or syntactical learning input.

  9. NAMED ENTITY RECOGNITION FROM BIOMEDICAL TEXT -AN INFORMATION EXTRACTION TASK

    Directory of Open Access Journals (Sweden)

    N. Kanya

    2016-07-01

    Full Text Available Biomedical Text Mining (Bio TM) targets the extraction of significant information from biomedical archives. Bio TM encompasses Information Retrieval (IR) and Information Extraction (IE). IR retrieves the relevant biomedical literature documents from repositories such as PubMed and MedLine, based on a search query. The IR process ends with the generation of a corpus of the relevant documents retrieved from the publication databases based on the query. The IE task includes preprocessing of the documents, Named Entity Recognition (NER) from the documents, and relationship extraction. This process includes Natural Language Processing, data mining techniques and machine learning algorithms. The preprocessing task includes tokenization, stop word removal, shallow parsing, and part-of-speech tagging. The NER phase involves recognition of well-defined objects such as genes, proteins or cell lines. This process leads to the next phase, the extraction of relationships. The work was based on the machine learning algorithm Conditional Random Field (CRF).

  10. Regional Geography Texts on Latin America: A Review.

    Science.gov (United States)

    Williams, Lynden S.

    1980-01-01

    Provides an evaluation of regional texts on Latin America by means of analyzing opinions of 45 Latin Americanist geographers. Findings are presented in three categories--highly rated texts, intermediately rated texts, and others. (DB)

  11. Basic Test Framework for the Evaluation of Text Line Segmentation and Text Parameter Extraction

    Directory of Open Access Journals (Sweden)

    Darko Brodić

    2010-05-01

    Full Text Available Text line segmentation is an essential stage in off-line optical character recognition (OCR) systems. It is a key because inaccurately segmented text lines will lead to OCR failure. Text line segmentation of handwritten documents is a complex and diverse problem, complicated by the nature of handwriting. Hence, text line segmentation is a leading challenge in handwritten document image processing. Due to inconsistencies in measurement and evaluation of text segmentation algorithm quality, some basic set of measurement methods is required. Currently, there is no commonly accepted one and all algorithm evaluation is custom oriented. In this paper, a basic test framework for the evaluation of text feature extraction algorithms is proposed. This test framework consists of a few experiments primarily linked to text line segmentation, skew rate and reference text line evaluation. Although they are mutually independent, the results obtained are strongly cross linked. In the end, its suitability for different types of letters and languages as well as its adaptability are its main advantages. Thus, the paper presents an efficient evaluation method for text analysis algorithms.

  12. Using Text Mining for Unsupervised Knowledge Extraction and Organization

    Directory of Open Access Journals (Sweden)

    REZENDE, S. O.

    2011-06-01

    Full Text Available Progress in the acquisition and storage of digitally generated data has allowed for a huge growth in the information generated in organizations. Around 80% of those data are created in non-structured format, and a significant part of them are texts. Intelligent organization of those textual collections is a matter of interest for most organizations, because it speeds up information search and retrieval. In this context, Text Mining can transform this great amount of non-structured text data into useful knowledge that can even be innovative for those organizations. Using unsupervised methods for knowledge extraction and organization has received great attention in the literature, because it does not require previous knowledge of the textual collections that are going to be explored. In this article we describe the main techniques and algorithms used for unsupervised knowledge extraction and organization from textual data. The most relevant works in the literature are presented and discussed for each phase of the Text Mining process, and some existing computational tools are suggested for each task at hand. Finally, some examples and applications are presented to show the use of Text Mining on real problems.

  13. Extracting BI-RADS Features from Portuguese Clinical Texts

    Science.gov (United States)

    Nassif, Houssam; Cunha, Filipe; Moreira, Inês C.; Cruz-Correia, Ricardo; Sousa, Eliana; Page, David; Burnside, Elizabeth; Dutra, Inês

    2013-01-01

    In this work we build the first BI-RADS parser for Portuguese free texts, modeled after existing approaches to extract BI-RADS features from English medical records. Our concept finder uses a semantic grammar based on the BI-RADS lexicon and on iteratively transferred expert knowledge. We compare the performance of our algorithm to manual annotation by a specialist in mammography. Our results show that our parser’s performance is comparable to the manual method. PMID:23797461

  14. Basic test framework for the evaluation of text line segmentation and text parameter extraction.

    Science.gov (United States)

    Brodić, Darko; Milivojević, Dragan R; Milivojević, Zoran

    2010-01-01

    Text line segmentation is an essential stage in off-line optical character recognition (OCR) systems. It is a key because inaccurately segmented text lines will lead to OCR failure. Text line segmentation of handwritten documents is a complex and diverse problem, complicated by the nature of handwriting. Hence, text line segmentation is a leading challenge in handwritten document image processing. Due to inconsistencies in measurement and evaluation of text segmentation algorithm quality, some basic set of measurement methods is required. Currently, there is no commonly accepted one and all algorithm evaluation is custom oriented. In this paper, a basic test framework for the evaluation of text feature extraction algorithms is proposed. This test framework consists of a few experiments primarily linked to text line segmentation, skew rate and reference text line evaluation. Although they are mutually independent, the results obtained are strongly cross linked. In the end, its suitability for different types of letters and languages as well as its adaptability are its main advantages. Thus, the paper presents an efficient evaluation method for text analysis algorithms.

  15. Automatic extraction of ontological relations from Arabic text

    Directory of Open Access Journals (Sweden)

    Mohammed G.H. Al Zamil

    2014-12-01

    The proposed methodology has been designed to analyze Arabic text using lexical semantic patterns of the Arabic language according to a set of features. Next, the features have been abstracted and enriched with formal descriptions for the purpose of generalizing the resulting rules. The rules then form a classifier that accepts Arabic text, analyzes it, and displays related concepts labeled with their designated relationships. Moreover, to resolve the ambiguity of homonyms, a set of machine translation, text mining, and part-of-speech tagging algorithms have been reused. We performed extensive experiments to measure the effectiveness of our proposed tools. The results indicate that our proposed methodology is promising for automating the process of extracting ontological relations.

  16. Automatic extraction of relations between medical concepts in clinical texts.

    Science.gov (United States)

    Rink, Bryan; Harabagiu, Sanda; Roberts, Kirk

    2011-01-01

    A supervised machine learning approach to discover relations between medical problems, treatments, and tests mentioned in electronic medical records. A single support vector machine classifier was used to identify relations between concepts and to assign their semantic type. Several resources such as Wikipedia, WordNet, General Inquirer, and a relation similarity metric inform the classifier. The techniques reported in this paper were evaluated in the 2010 i2b2 Challenge and obtained the highest F1 score for the relation extraction task. When gold standard data for concepts and assertions were available, F1 was 73.7, precision was 72.0, and recall was 75.3. F1 is defined as 2*Precision*Recall/(Precision+Recall). Alternatively, when concepts and assertions were discovered automatically, F1 was 48.4, precision was 57.6, and recall was 41.7. Although a rich set of features was developed for the classifiers presented in this paper, little knowledge mining was performed from medical ontologies such as those found in UMLS. Future studies should incorporate features extracted from such knowledge sources, which we expect to further improve the results. Moreover, each relation discovery was treated independently. Joint classification of relations may further improve the quality of results. Also, joint learning of the discovery of concepts, assertions, and relations may also improve the results of automatic relation extraction. Lexical and contextual features proved to be very important in relation extraction from medical texts. When they are not available to the classifier, the F1 score decreases by 3.7%. In addition, features based on similarity contribute to a decrease of 1.1% when they are not available.
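    The F1 definition quoted in the abstract can be checked against the reported figures with a short sketch. The function name `f1_score` is an assumption for this example; the precision and recall values are the rounded ones given above, so a recomputed F1 can differ from the reported value in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall: 2*P*R / (P + R)."""
    if precision + recall == 0:
        return 0.0  # convention for the degenerate case with no correct predictions
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract (percentages, already rounded to one decimal).
gold = f1_score(72.0, 75.3)  # gold-standard concepts and assertions
auto = f1_score(57.6, 41.7)  # automatically discovered concepts and assertions
print(f"gold: {gold:.2f}  auto: {auto:.2f}")
```

    Because F1 is a harmonic mean, it is pulled toward the lower of the two inputs, which is why the automatic setting scores closer to its recall of 41.7 than to its precision of 57.6.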

  17. Domain-independent information extraction in unstructured text

    Energy Technology Data Exchange (ETDEWEB)

    Irwin, N.H. [Sandia National Labs., Albuquerque, NM (United States). Software Surety Dept.

    1996-09-01

    Extracting information from unstructured text has become an important research area in recent years due to the large amount of text now electronically available. This status report describes the findings and work done during the second year of a two-year Laboratory Directed Research and Development Project. Building on the first year's work of identifying important entities, this report details techniques used to group words into semantic categories and to output templates containing selective document content. Using word profiles and category clustering derived during a training run, the time-consuming knowledge-building task can be avoided. Though the output still lacks in completeness when compared to systems with domain-specific knowledge bases, the results do look promising. The two approaches are compatible and could complement each other within the same system. Domain-independent approaches retain appeal as a system that adapts and learns will soon outpace a system with any amount of a priori knowledge.

  18. An automatic system to detect and extract texts in medical images for de-identification

    Science.gov (United States)

    Zhu, Yingxuan; Singh, P. D.; Siddiqui, Khan; Gillam, Michael

    2010-03-01

    Recently, there has been an increasing need to share medical images for research purposes. In order to respect and preserve patient privacy, most medical images are de-identified by removing protected health information (PHI) before they are shared for research. Since manual de-identification is time-consuming and tedious, an automatic de-identification system is necessary and helpful for doctors to remove text from medical images. Many papers have been written about algorithms for text detection and extraction; however, little has been applied to de-identification of medical images. Since the de-identification system is designed for end users, it should be effective, accurate and fast. This paper proposes an automatic system to detect and extract text from medical images for de-identification purposes, while keeping the anatomic structures intact. First, considering that the text has a remarkable contrast with the background, a region variance based algorithm is used to detect the text regions. In post-processing, geometric constraints are applied to the detected text regions to eliminate over-segmentation, e.g., lines and anatomic structures. After that, a region based level set method is used to extract text from the detected text regions. A GUI for the prototype application of the text detection and extraction system is implemented, which shows that our method can detect most of the text in the images. Experimental results validate that our method can detect and extract text in medical images with a 99% recall rate. Future research on this system includes algorithm improvement, performance evaluation, and computation optimization.
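    The idea behind variance-based detection, that burned-in text contrasts sharply with its background, can be sketched as a local-variance mask. This is an illustration only, not the paper's algorithm: the window radius, the threshold, and the function names are assumptions, and the geometric post-processing and level set extraction are not shown.

```python
def local_variance(image, row, col, radius=1):
    """Population variance of pixel values in a square window clipped to the image."""
    values = [
        image[r][c]
        for r in range(max(0, row - radius), min(len(image), row + radius + 1))
        for c in range(max(0, col - radius), min(len(image[0]), col + radius + 1))
    ]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def high_variance_mask(image, threshold, radius=1):
    """Mark pixels whose neighbourhood variance exceeds the threshold."""
    return [
        [local_variance(image, r, c, radius) > threshold for c in range(len(image[0]))]
        for r in range(len(image))
    ]


# Hypothetical grayscale image: a flat background with one bright stroke pixel.
image = [
    [10, 10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10, 10],
    [10, 10, 10, 255, 10, 10],
    [10, 10, 10, 10, 10, 10],
]
mask = high_variance_mask(image, threshold=1000.0)
for row in mask:
    print("".join("#" if flagged else "." for flagged in row))
```

    Flat regions have zero variance and are rejected, while every window that overlaps the bright pixel is flagged. Fine anatomical edges would also be flagged by a mask like this, which is presumably why the abstract describes a geometric filtering step after detection.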

  19. Automated Extraction of Substance Use Information from Clinical Texts.

    Science.gov (United States)

    Wang, Yan; Chen, Elizabeth S; Pakhomov, Serguei; Arsoniadis, Elliot; Carter, Elizabeth W; Lindemann, Elizabeth; Sarkar, Indra Neil; Melton, Genevieve B

    2015-01-01

    Within clinical discourse, social history (SH) includes important information about substance use (alcohol, drug, and nicotine use) as key risk factors for disease, disability, and mortality. In this study, we developed and evaluated a natural language processing (NLP) system for automated detection of substance use statements and extraction of substance use attributes (e.g., temporal and status) based on Stanford Typed Dependencies. The developed NLP system leveraged linguistic resources and domain knowledge from a multi-site social history study, Propbank and the MiPACQ corpus. The system attained F-scores of 89.8, 84.6 and 89.4 respectively for alcohol, drug, and nicotine use statement detection, as well as average F-scores of 82.1, 90.3, 80.8, 88.7, 96.6, and 74.5 respectively for extraction of attributes. Our results suggest that NLP systems can achieve good performance when augmented with linguistic resources and domain knowledge when applied to a wide breadth of substance use free text clinical notes.

  20. Text Mining approaches for automated literature knowledge extraction and representation.

    Science.gov (United States)

    Nuzzo, Angelo; Mulas, Francesca; Gabetta, Matteo; Arbustini, Eloisa; Zupan, Blaz; Larizza, Cristiana; Bellazzi, Riccardo

    2010-01-01

    Due to the overwhelming volume of published scientific papers, information tools for automated literature analysis are essential to support current biomedical research. We have developed a knowledge extraction tool to help researchers discover useful information that can support their reasoning process. The tool is composed of a search engine based on Text Mining and Natural Language Processing techniques, and an analysis module which processes the search results in order to build annotation similarity networks. We tested our approach on the available knowledge about the genetic mechanism of cardiac diseases, where the target is to find both known and possible hypothetical relations between specific candidate genes and the trait of interest. We show that the system i) is able to effectively retrieve medical concepts and genes and ii) plays a relevant role in assisting researchers in the formulation and evaluation of novel literature-based hypotheses.

  1. Using Semantic Linking to Understand Persons’ Networks Extracted from Text

    Directory of Open Access Journals (Sweden)

    Alessio Palmero Aprosio

    2017-11-01

    Full Text Available In this work, we describe a methodology to interpret large persons’ networks extracted from text by classifying cliques using the DBpedia ontology. The approach relies on a combination of NLP, Semantic web technologies, and network analysis. The classification methodology that first starts from single nodes and then generalizes to cliques is effective in terms of performance and is able to deal also with nodes that are not linked to Wikipedia. The gold standard manually developed for evaluation shows that groups of co-occurring entities share in most of the cases a category that can be automatically assigned. This holds for both languages considered in this study. The outcome of this work may be of interest to enhance the readability of large networks and to provide an additional semantic layer on top of cliques. This would greatly help humanities scholars when dealing with large amounts of textual data that need to be interpreted or categorized. Furthermore, it represents an unsupervised approach to automatically extend DBpedia starting from a corpus.

  2. Unsupervised Extraction of Diagnosis Codes from EMRs Using Knowledge-Based and Extractive Text Summarization Techniques.

    Science.gov (United States)

    Kavuluru, Ramakanth; Han, Sifei; Harris, Daniel

    2013-05-01

    Diagnosis codes are extracted from medical records for billing and reimbursement and for secondary uses such as quality control and cohort identification. In the US, these codes come from the standard terminology ICD-9-CM derived from the international classification of diseases (ICD). ICD-9 codes are generally extracted by trained human coders by reading all artifacts available in a patient's medical record following specific coding guidelines. To assist coders in this manual process, this paper proposes an unsupervised ensemble approach to automatically extract ICD-9 diagnosis codes from textual narratives included in electronic medical records (EMRs). Earlier attempts on automatic extraction focused on individual documents such as radiology reports and discharge summaries. Here we use a more realistic dataset and extract ICD-9 codes from EMRs of 1000 inpatient visits at the University of Kentucky Medical Center. Using named entity recognition (NER), graph-based concept-mapping of medical concepts, and extractive text summarization techniques, we achieve an example based average recall of 0.42 with average precision 0.47; compared with a baseline of using only NER, we notice a 12% improvement in recall with the graph-based approach and a 7% improvement in precision using the extractive text summarization approach. Although diagnosis codes are complex concepts often expressed in text with significant long range non-local dependencies, our present work shows the potential of unsupervised methods in extracting a portion of codes. As such, our findings are especially relevant for code extraction tasks where obtaining large amounts of training data is difficult.

  3. Extracting Useful Semantic Information from Large Scale Corpora of Text

    Science.gov (United States)

    Mendoza, Ray Padilla, Jr.

    2012-01-01

    Extracting and representing semantic information from large scale corpora is at the crux of computer-assisted knowledge generation. Semantic information depends on collocation extraction methods, mathematical models used to represent distributional information, and weighting functions which transform the space. This dissertation provides a…

  4. Basic Test Framework for the Evaluation of Text Line Segmentation and Text Parameter Extraction

    OpenAIRE

    Darko Brodić; Milivojević, Dragan R.; Zoran Milivojević

    2010-01-01

    Text line segmentation is an essential stage in off-line optical character recognition (OCR) systems. It is a key because inaccurately segmented text lines will lead to OCR failure. Text line segmentation of handwritten documents is a complex and diverse problem, complicated by the nature of handwriting. Hence, text line segmentation is a leading challenge in handwritten document image processing. Due to inconsistencies in measurement and evaluation of text segmentation algorithm quality, som...

  5. Extracting Temporal Information from Open Domain Text: A Comparative Exploration

    NARCIS (Netherlands)

    Ahn, D.D.; Fissaha Adafre, S.; de Rijke, M.

    2005-01-01

    The utility of data-driven techniques in the end-to-end problem of temporal information extraction is unclear. Recognition of temporal expressions yields readily to machine learning, but normalization seems to call for a rule-based approach. We explore two aspects of the (potential) utility of

  6. Extraction of Relations between Entities from Texts by Learning Methods

    Science.gov (United States)

    2006-12-01

    Tasks, in Proceedings of the Eleventh National Conference on Artificial Intelligence, 811-816. AAAI Press / The MIT Press . Rohmer J. (2002...pour une extraction d’informations sur le web dédiées à la veille . Réalisation du système informatique JavaVeille. PhD Thesis, Université Paris 4

  7. Texting

    Science.gov (United States)

    Tilley, Carol L.

    2009-01-01

    With the increasing ranks of cell phone ownership is an increase in text messaging, or texting. During 2008, more than 2.5 trillion text messages were sent worldwide--that's an average of more than 400 messages for every person on the planet. Although many of the messages teenagers text each day are perhaps nothing more than "how r u?" or "c u…

  8. Research of Anti-Noise Image Salient Region Extraction Method

    Directory of Open Access Journals (Sweden)

    Bing XU

    2014-01-01

    Full Text Available The existing image salient region extraction technology is mostly suitable for processing noise-free images, and there is a lack of studies on the impact of noise on images. In this study the adaptive kernel function was employed in image salient region detection. The salient property of a region was determined by the dissimilarities between the pixels of the image region and its surroundings. The dissimilarity was measured as a decreasing function associated with adaptive kernel regression. The proposed algorithm used multi-scale fusion method to obtain the salient region of the whole image. As adaptive kernel function has strong anti-noise characteristics, the proposed algorithm was characterized with the same robustness. A numerical simulation experiment was conducted on salient region extraction of images with noise and without noise. A comparison between this study’s results and two existing salient region extraction methods revealed that the proposed method in this study was superior in its extraction accuracy of image salient regions and could reduce interference of image noise.

  9. Scene text detection via extremal region based double threshold convolutional network classification.

    Directory of Open Access Journals (Sweden)

    Wei Zhu

    Full Text Available In this paper, we present a robust text detection approach in natural images which is based on region proposal mechanism. A powerful low-level detector named saliency enhanced-MSER extended from the widely-used MSER is proposed by incorporating saliency detection methods, which ensures a high recall rate. Given a natural image, character candidates are extracted from three channels in a perception-based illumination invariant color space by saliency-enhanced MSER algorithm. A discriminative convolutional neural network (CNN) is jointly trained with multi-level information including pixel-level and character-level information as character candidate classifier. Each image patch is classified as strong text, weak text and non-text by double threshold filtering instead of conventional one-step classification, leveraging confident scores obtained via CNN. To further prune non-text regions, we develop a recursive neighborhood search algorithm to track credible texts from weak text set. Finally, characters are grouped into text lines using heuristic features such as spatial location, size, color, and stroke width. We compare our approach with several state-of-the-art methods, and experiments show that our method achieves competitive performance on public datasets ICDAR 2011 and ICDAR 2013.

  10. Tagline: Information Extraction for Semi-Structured Text Elements in Medical Progress Notes

    Science.gov (United States)

    Finch, Dezon Kile

    2012-01-01

    Text analysis has become an important research activity in the Department of Veterans Affairs (VA). Statistical text mining and natural language processing have been shown to be very effective for extracting useful information from medical documents. However, neither of these techniques is effective at extracting the information stored in…

  11. Scene text detection via extremal region based double threshold convolutional network classification.

    Science.gov (United States)

    Zhu, Wei; Lou, Jing; Chen, Longtao; Xia, Qingyuan; Ren, Mingwu

    2017-01-01

    In this paper, we present a robust text detection approach in natural images which is based on region proposal mechanism. A powerful low-level detector named saliency enhanced-MSER extended from the widely-used MSER is proposed by incorporating saliency detection methods, which ensures a high recall rate. Given a natural image, character candidates are extracted from three channels in a perception-based illumination invariant color space by saliency-enhanced MSER algorithm. A discriminative convolutional neural network (CNN) is jointly trained with multi-level information including pixel-level and character-level information as character candidate classifier. Each image patch is classified as strong text, weak text and non-text by double threshold filtering instead of conventional one-step classification, leveraging confident scores obtained via CNN. To further prune non-text regions, we develop a recursive neighborhood search algorithm to track credible texts from weak text set. Finally, characters are grouped into text lines using heuristic features such as spatial location, size, color, and stroke width. We compare our approach with several state-of-the-art methods, and experiments show that our method achieves competitive performance on public datasets ICDAR 2011 and ICDAR 2013.
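    The double threshold step described above resembles hysteresis thresholding: strong candidates are accepted outright, and weak candidates are kept only when they can be reached from a strong one through neighbouring candidates. The sketch below illustrates that idea under stated assumptions. The thresholds, the neighbour map, and the function name are hypothetical, the scores stand in for CNN outputs, and the search is written iteratively rather than recursively.

```python
def double_threshold_filter(scores, neighbors, high=0.8, low=0.4):
    """Return indices of candidates kept as text.

    scores:    classifier confidence per candidate (stand-in for CNN output)
    neighbors: dict mapping a candidate index to spatially adjacent candidate indices
    """
    strong = {i for i, s in enumerate(scores) if s >= high}
    weak = {i for i, s in enumerate(scores) if low <= s < high}

    accepted = set(strong)
    frontier = list(strong)
    while frontier:
        current = frontier.pop()
        for n in neighbors.get(current, ()):
            # Promote a weak candidate only if it touches an accepted one.
            if n in weak and n not in accepted:
                accepted.add(n)
                frontier.append(n)
    return accepted


# Hypothetical candidates: 0 is strong, 1 to 3 are weak, 4 is non-text.
scores = [0.95, 0.60, 0.55, 0.50, 0.10]
neighbors = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
kept = double_threshold_filter(scores, neighbors)
print(sorted(kept))
```

    Candidate 3 has a weak score but no path to a strong candidate, so it is dropped, whereas candidates 1 and 2 are retained through their chain back to candidate 0.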

  12. A Relation Extraction Framework for Biomedical Text Using Hybrid Feature Set

    Directory of Open Access Journals (Sweden)

    Abdul Wahab Muzaffar

    2015-01-01

    Full Text Available Information extraction from unstructured text segments is a complex task. Although manual information extraction often produces the best results, it is harder to manage biomedical data extraction manually because of the exponential increase in data size. Thus, there is a need for automatic tools and techniques for information extraction in biomedical text mining. Relation extraction is a significant area under biomedical information extraction that has gained much importance in the last two decades. A lot of work has been done on biomedical relation extraction focusing on rule-based and machine learning techniques. In the last decade, the focus has changed to hybrid approaches showing better results. This research presents a hybrid feature set for classification of relations between biomedical entities. The main contribution of this research is in the semantic feature set, where verb phrases are ranked using the Unified Medical Language System (UMLS) and a ranking algorithm. Support Vector Machine and Naïve Bayes, two effective machine learning techniques, are used to classify these relations. Our approach has been validated on the standard biomedical text corpus obtained from MEDLINE 2001. In conclusion, our framework outperforms all state-of-the-art approaches used for relation extraction on the same corpus.

  13. A survey of event extraction methods from text for decision support systems

    NARCIS (Netherlands)

    Hoogenboom, F.P.; Frasincar, Flavius; Kaymak, Uzay; de Jong, Franciska; Caron, E.A.M.

    2016-01-01

    Event extraction, a specialized stream of information extraction rooted back into the 1980s, has greatly gained in popularity due to the advent of big data and the developments in the related fields of text mining and natural language processing. However, up to this date, an overview of this

  14. ViTexOCR; a script to extract text overlays from digital video

    Data.gov (United States)

    U.S. Geological Survey, Department of the Interior — The ViTexOCR script presents a new method for extracting navigation data from videos with text overlays using optical character recognition (OCR) software. Over the...

  15. INTER-LINE DISTANCE ESTIMATION AND TEXT LINE EXTRACTION FOR UNCONSTRAINED ONLINE HANDWRITING

    NARCIS (Netherlands)

    Ratzlaff, E.

    2004-01-01

    Methods for detecting and extracting whole text lines from unconstrained online handwritten text are described. The general approach is a ``bottom-up'' clustering of discrete strokes into small groups that are then merged into isolated lines of text. Initial clustering of strokes into groups is

  16. Extracting Concepts Related to Homelessness from the Free Text of VA Electronic Medical Records.

    Science.gov (United States)

    Gundlapalli, Adi V; Carter, Marjorie E; Divita, Guy; Shen, Shuying; Palmer, Miland; South, Brett; Durgahee, B S Begum; Redd, Andrew; Samore, Matthew

    2014-01-01

    Mining the free text of electronic medical records (EMR) using natural language processing (NLP) is an effective method of extracting information not always captured in administrative data. We sought to determine if concepts related to homelessness, a non-medical condition, were amenable to extraction from the EMR of Veterans Affairs (VA) medical records. As there were no off-the-shelf products, a lexicon of terms related to homelessness was created. A corpus of free text documents from outpatient encounters was reviewed to create the reference standard for NLP training and testing. V3NLP Framework was used to detect instances of lexical terms and was compared to the reference standard. With a positive predictive value of 77% for extracting relevant concepts, this study demonstrates the feasibility of extracting positively asserted concepts related to homelessness from the free text of medical records.

  17. Automatic Extraction of Drug Adverse Effects from Product Characteristics (SPCs): A Text Versus Table Comparison.

    Science.gov (United States)

    Lamy, Jean-Baptiste; Ugon, Adrien; Berthelot, Hélène

    2016-01-01

    Potential adverse effects (AEs) of drugs are described in their summary of product characteristics (SPCs), a textual document. Automatic extraction of AEs from SPCs is useful for detecting AEs and for building drug databases. However, this task is difficult because each AE is associated with a frequency that must be extracted and the presentation of AEs in SPCs is heterogeneous, consisting of plain text and tables in many different formats. We propose a taxonomy for the presentation of AEs in SPCs. We set up natural language processing (NLP) and table parsing methods for extracting AEs from texts and tables of any format, and evaluate them on 10 SPCs. Automatic extraction performed better on tables than on texts. Tables should be recommended for the presentation of the AEs section of the SPCs.

  18. Extracting of implicit information in English advertising texts with phonetic and lexical-morphological means

    Directory of Open Access Journals (Sweden)

    Traikovskaya Natalya Petrovna

    2015-12-01

    Full Text Available The article deals with phonetic and lexical-morphological language means participating in the process of extracting implicit information in English-language advertising texts for men and women. The functioning of phonetic means of the English language is not the basis for implication of information in advertising texts. Lexical and morphological means serve as markers of relevant information and act as activators of implicit information in advertising texts.

  19. Knowledge Extraction and Semantic Annotation of Text from the Encyclopedia of Life

    OpenAIRE

    Anne E Thessen; Cynthia Sims Parr

    2014-01-01

    Numerous digitization and ontological initiatives have focused on translating biological knowledge from narrative text to machine-readable formats. In this paper, we describe two workflows for knowledge extraction and semantic annotation of text data objects featured in an online biodiversity aggregator, the Encyclopedia of Life. One workflow tags text with DBpedia URIs based on keywords. Another workflow finds taxon names in text using GNRD for the purpose of building a species association n...

  20. REGION-BASED BUILDING ROOFTOP EXTRACTION AND CHANGE DETECTION

    Directory of Open Access Journals (Sweden)

    J. Tian

    2017-09-01

    Full Text Available Automatic extraction of building changes is important for many applications like disaster monitoring and city planning. Although a lot of research work is available based on 2D as well as 3D data, an improvement in accuracy and efficiency is still needed. The introduction of digital surface models (DSMs) to building change detection has strongly improved the resulting accuracy. In this paper, a post-classification approach is proposed for building change detection using satellite stereo imagery. Firstly, DSMs are generated from satellite stereo imagery and further refined by using a segmentation result obtained from the Sobel gradients of the panchromatic image. Besides the refined DSMs, the panchromatic image and the pansharpened multispectral image are used as input features for mean-shift segmentation. The DSM is used to calculate the nDSM, out of which the initial building candidate regions are extracted. The candidate mask is further refined by morphological filtering and by excluding shadow regions. Following this, all segments that overlap with a building candidate region are determined. A building-oriented segment merging procedure is introduced to generate a final building rooftop mask. As the last step, object-based change detection is performed by directly comparing the building rooftops extracted from the pre- and post-event imagery and by fusing the change indicators with the rooftop region map. A quantitative and qualitative assessment of the proposed approach is provided by using WorldView-2 satellite data from Istanbul, Turkey.

  1. A Survey of Neural Network Techniques for Feature Extraction from Text

    OpenAIRE

    John, Vineet

    2017-01-01

    This paper aims to catalyze the discussions about text feature extraction techniques using neural network architectures. The research questions discussed in the paper focus on the state-of-the-art neural network techniques that have proven to be useful tools for language processing, language generation, text classification and other computational linguistics tasks.

  2. A concept-driven biomedical knowledge extraction and visualization framework for conceptualization of text corpora.

    Science.gov (United States)

    Jahiruddin; Abulaish, Muhammad; Dey, Lipika

    2010-12-01

    A number of techniques such as information extraction, document classification, document clustering and information visualization have been developed to ease extraction and understanding of information embedded within text documents. However, knowledge that is embedded in natural language texts is difficult to extract using simple pattern matching techniques and most of these methods do not help users directly understand key concepts and their semantic relationships in document corpora, which are critical for capturing their conceptual structures. The problem arises due to the fact that most of the information is embedded within unstructured or semi-structured texts that computers can not interpret very easily. In this paper, we have presented a novel Biomedical Knowledge Extraction and Visualization framework, BioKEVis to identify key information components from biomedical text documents. The information components are centered on key concepts. BioKEVis applies linguistic analysis and Latent Semantic Analysis (LSA) to identify key concepts. The information component extraction principle is based on natural language processing techniques and semantic-based analysis. The system is also integrated with a biomedical named entity recognizer, ABNER, to tag genes, proteins and other entity names in the text. We have also presented a method for collating information extracted from multiple sources to generate semantic network. The network provides distinct user perspectives and allows navigation over documents with similar information components and is also used to provide a comprehensive view of the collection. The system stores the extracted information components in a structured repository which is integrated with a query-processing module to handle biomedical queries over text documents. We have also proposed a document ranking mechanism to present retrieved documents in order of their relevance to the user query. Copyright © 2010 Elsevier Inc. All rights reserved.

  3. A neural joint model for entity and relation extraction from biomedical text.

    Science.gov (United States)

    Li, Fei; Zhang, Meishan; Fu, Guohong; Ji, Donghong

    2017-03-31

    Extracting biomedical entities and their relations from text has important applications in biomedical research. Previous work primarily utilized feature-based pipeline models to process this task. Feature-based models require considerable effort on feature engineering. Moreover, pipeline models may suffer from error propagation and are not able to utilize the interactions between subtasks. Therefore, we propose a neural joint model to extract biomedical entities as well as their relations simultaneously, and it can alleviate the problems above. Our model was evaluated on two tasks, i.e., the task of extracting adverse drug events between drug and disease entities, and the task of extracting resident relations between bacteria and location entities. Compared with the state-of-the-art systems in these tasks, our model improved the F1 scores of the first task by 5.1% in entity recognition and 8.0% in relation extraction, and that of the second task by 9.2% in relation extraction. The proposed model achieves competitive performance with less work on feature engineering. We demonstrate that the model based on neural networks is effective for biomedical entity and relation extraction. In addition, parameter sharing is an alternative method for neural models to jointly process this task. Our work can facilitate the research on biomedical text mining.

  4. An unsupervised text mining method for relation extraction from biomedical literature.

    Directory of Open Access Journals (Sweden)

    Changqin Quan

    Full Text Available The wealth of interaction information provided in biomedical articles motivated the implementation of text mining approaches to automatically extract biomedical relations. This paper presents an unsupervised method based on pattern clustering and sentence parsing to deal with biomedical relation extraction. The pattern clustering algorithm is based on the polynomial kernel method, which identifies interaction words from unlabeled data; these interaction words are then used in relation extraction between entity pairs. Dependency parsing and phrase structure parsing are combined for relation extraction. Based on the semi-supervised KNN algorithm, we extend the proposed unsupervised approach to a semi-supervised approach by combining pattern clustering, dependency parsing and phrase structure parsing rules. We evaluated the approaches on two different tasks: (1) protein-protein interaction extraction, and (2) gene-suicide association extraction. The evaluation of task (1) on the benchmark dataset (AImed corpus) showed that our proposed unsupervised approach outperformed three supervised methods. The three supervised methods are rule-based, SVM-based, and kernel-based, respectively. The proposed semi-supervised approach is superior to the existing semi-supervised methods. The evaluation on gene-suicide association extraction on a smaller dataset from the Genetic Association Database and a larger dataset from publicly available PubMed showed that the proposed unsupervised and semi-supervised methods achieved much higher F-scores than the co-occurrence based method.

  5. Using Gazetteers to Extract Sets of Keywords from Free-Flowing Texts

    Directory of Open Access Journals (Sweden)

    Adam Crymble

    2015-12-01

    Full Text Available If you have a copy of a text in electronic format stored on your computer, it is relatively easy to keyword search for a single term. Often you can do this by using the built-in search features in your favourite text editor. However, scholars are increasingly needing to find instances of many terms within a text or texts. For example, a scholar may want to use a gazetteer to extract all mentions of English placenames within a collection of texts so that those places can later be plotted on a map. Alternatively, they may want to extract all male given names, all pronouns, stop words, or any other set of words. Using those same built-in search features to achieve this more complex goal is time consuming and clunky. This lesson will teach you how to use Python to extract a set of keywords very quickly and systematically from a set of texts. It is expected that once you have completed this lesson, you will be able to generalise the skills to extract custom sets of keywords from any set of locally saved files.
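
    The gazetteer approach described in this record can be sketched in a few lines of Python. This is a minimal illustration and not the lesson's own code: the function name `extract_keywords`, the three-entry gazetteer, and the sample sentence are all assumptions made up for the example. It matches whole words case-insensitively and counts how often each gazetteer term occurs.

```python
import re
from collections import Counter


def extract_keywords(text, gazetteer):
    """Count whole-word, case-insensitive occurrences of each gazetteer term.

    Hypothetical helper for illustration; terms that never occur are omitted.
    """
    counts = Counter()
    lowered = text.lower()
    for term in gazetteer:
        # \b anchors keep "York" from matching inside "Yorkshire".
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        hits = re.findall(pattern, lowered)
        if hits:
            counts[term] = len(hits)
    return counts


# Hypothetical gazetteer of English placenames and a made-up sample text.
gazetteer = ["Oxford", "Cambridge", "York"]
sample = "He left Oxford for York, then returned to Oxford. Yorkshire was never mentioned."
found = extract_keywords(sample, gazetteer)
print(found)
```

    For a collection of locally saved files, the same function could be called once per file and the resulting counters summed; scanning the text once per term is simple but may become slow for very large gazetteers.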

  6. Text mining facilitates database curation - extraction of mutation-disease associations from Bio-medical literature.

    Science.gov (United States)

    Ravikumar, Komandur Elayavilli; Wagholikar, Kavishwar B; Li, Dingcheng; Kocher, Jean-Pierre; Liu, Hongfang

    2015-06-06

    Advances in next-generation sequencing technology have accelerated the pace of individualized medicine (IM), which aims to incorporate genetic/genomic information into medicine. One immediate need in interpreting sequencing data is the assembly of information about genetic variants and their corresponding associations with other entities (e.g., diseases or medications). Even with dedicated effort to capture such information in biological databases, much of this information remains 'locked' in the unstructured text of biomedical publications. There is a substantial lag between the publication and the subsequent abstraction of such information into databases. Multiple text mining systems have been developed, but most of them focus on the sentence level association extraction with performance evaluation based on gold standard text annotations specifically prepared for text mining systems. We developed and evaluated a text mining system, MutD, which extracts protein mutation-disease associations from MEDLINE abstracts by incorporating discourse level analysis, using a benchmark data set extracted from curated database records. MutD achieves an F-measure of 64.3% for reconstructing protein mutation disease associations in curated database records. The discourse level analysis component of MutD contributed to a gain of more than 10% in F-measure when compared against the sentence level association extraction. Our error analysis indicates that 23 of the 64 precision errors are true associations that were not captured by database curators and 68 of the 113 recall errors are caused by the absence of associated disease entities in the abstract. After adjusting for the defects in the curated database, the revised F-measure of MutD in association detection reaches 81.5%. Our quantitative analysis reveals that MutD can effectively extract protein mutation disease associations when benchmarking based on curated database records. The analysis also demonstrates that incorporating

  7. Structured learning for spatial information extraction from biomedical text: bacteria biotopes.

    Science.gov (United States)

    Kordjamshidi, Parisa; Roth, Dan; Moens, Marie-Francine

    2015-04-25

    We aim to automatically extract species names of bacteria and their locations from webpages. This task is important for exploiting the vast amount of biological knowledge which is expressed in diverse natural language texts and putting this knowledge in databases for easy access by biologists. The task is challenging and the previous results are far below an acceptable level of performance, particularly for extraction of localization relationships. Therefore, we aim to design a new system for such extractions, using the framework of structured machine learning techniques. We design a new model for joint extraction of biomedical entities and the localization relationship. Our model is based on a spatial role labeling (SpRL) model designed for spatial understanding of unrestricted text. We extend SpRL to extract discourse level spatial relations in the biomedical domain and apply it on the BioNLP-ST 2013, BB-shared task. We highlight the main differences between general spatial language understanding and spatial information extraction from the scientific text which is the focus of this work. We exploit the text's structure and discourse level global features. Our model and the designed features substantially improve on the previous systems, achieving an absolute improvement of approximately 57 percent over F1 measure of the best previous system for this task. Our experimental results indicate that a joint learning model over all entities and relationships in a document outperforms a model which extracts entities and relationships independently. Our global learning model significantly improves the state-of-the-art results on this task and has a high potential to be adopted in other natural language processing (NLP) tasks in the biomedical domain.

  8. Using text mining techniques to extract phenotypic information from the PhenoCHF corpus

    Science.gov (United States)

    2015-01-01

    Background: Phenotypic information locked away in unstructured narrative text presents significant barriers to information accessibility, both for clinical practitioners and for computerised applications used for clinical research purposes. Text mining (TM) techniques have previously been applied successfully to extract different types of information from text in the biomedical domain. They have the potential to be extended to allow the extraction of information relating to phenotypes from free text. Methods: To stimulate the development of TM systems that are able to extract phenotypic information from text, we have created a new corpus (PhenoCHF) that is annotated by domain experts with several types of phenotypic information relating to congestive heart failure. To ensure that systems developed using the corpus are robust to multiple text types, it integrates text from heterogeneous sources, i.e., electronic health records (EHRs) and scientific articles from the literature. We have developed several different phenotype extraction methods to demonstrate the utility of the corpus, and tested these methods on a further corpus, i.e., ShARe/CLEF 2013. Results: Evaluation of our automated methods showed that PhenoCHF can facilitate the training of reliable phenotype extraction systems, which are robust to variations in text type. These results have been reinforced by evaluating our trained systems on the ShARe/CLEF corpus, which contains clinical records of various types. Like other studies within the biomedical domain, we found that solutions based on conditional random fields produced the best results, when coupled with a rich feature set. Conclusions: PhenoCHF is the first annotated corpus aimed at encoding detailed phenotypic information. The unique heterogeneous composition of the corpus has been shown to be advantageous in the training of systems that can accurately extract phenotypic information from a range of different text types. Although the scope of our

  9. Using text mining techniques to extract phenotypic information from the PhenoCHF corpus.

    Science.gov (United States)

    Alnazzawi, Noha; Thompson, Paul; Batista-Navarro, Riza; Ananiadou, Sophia

    2015-01-01

    Phenotypic information locked away in unstructured narrative text presents significant barriers to information accessibility, both for clinical practitioners and for computerised applications used for clinical research purposes. Text mining (TM) techniques have previously been applied successfully to extract different types of information from text in the biomedical domain. They have the potential to be extended to allow the extraction of information relating to phenotypes from free text. To stimulate the development of TM systems that are able to extract phenotypic information from text, we have created a new corpus (PhenoCHF) that is annotated by domain experts with several types of phenotypic information relating to congestive heart failure. To ensure that systems developed using the corpus are robust to multiple text types, it integrates text from heterogeneous sources, i.e., electronic health records (EHRs) and scientific articles from the literature. We have developed several different phenotype extraction methods to demonstrate the utility of the corpus, and tested these methods on a further corpus, i.e., ShARe/CLEF 2013. Evaluation of our automated methods showed that PhenoCHF can facilitate the training of reliable phenotype extraction systems, which are robust to variations in text type. These results have been reinforced by evaluating our trained systems on the ShARe/CLEF corpus, which contains clinical records of various types. Like other studies within the biomedical domain, we found that solutions based on conditional random fields produced the best results, when coupled with a rich feature set. PhenoCHF is the first annotated corpus aimed at encoding detailed phenotypic information. The unique heterogeneous composition of the corpus has been shown to be advantageous in the training of systems that can accurately extract phenotypic information from a range of different text types. Although the scope of our annotation is currently limited to a single

  10. The Giles Ecosystem – Storage, Text Extraction, and OCR of Documents

    Directory of Open Access Journals (Sweden)

    Julia Damerow

    2017-09-01

    Full Text Available In the digital humanities, there is a constant need to turn images and PDF files into plain text to apply analyses such as topic modelling, named entity recognition, and other techniques. However, although there exist different solutions to extract text embedded in PDF files or run OCR on images, they typically require additional training (for example, scholars have to learn how to use the command line) or are difficult to automate without programming skills. The Giles Ecosystem is a distributed system based on Apache Kafka that allows users to upload documents for text and image extraction. The system components are implemented using Java and the Spring Framework and are available under an Open Source license on GitHub (https://github.com/diging/). Funding statement: Funding was provided by grants from NSF SES 1656284, ASU Presidential Strategic Initiative Fund and the Smart Family Foundation.

  11. Network and Ensemble Enabled Entity Extraction in Informal Text (NEEEEIT) final report

    Energy Technology Data Exchange (ETDEWEB)

    Kegelmeyer, Philip W. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Shead, Timothy M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Dunlavy, Daniel M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2013-09-01

    This SAND report summarizes the activities and outcomes of the Network and Ensemble Enabled Entity Extraction in Informal Text (NEEEEIT) LDRD project, which addressed improving the accuracy of conditional random fields for named entity recognition through the use of ensemble methods.

  12. A Comparison of Multiple Approaches for the Extractive Summarization of Portuguese Texts

    OpenAIRE

    Miguel Ângelo Abrantes Costa; Bruno Martins

    2015-01-01

    Automatic document summarization is the task of automatically generating condensed versions of source texts, presenting itself as one of the fundamental problems in the areas of Information Retrieval and Natural Language Processing. In this paper, different extractive approaches are compared in the task of summarizing individual documents corresponding to journalistic texts written in Portuguese. Through the use of the ROUGE package for measuring the quality of the produced summaries, we repo...

  13. Discovery of Predicate-Oriented Relations among Named Entities Extracted from Thai Texts

    Science.gov (United States)

    Tongtep, Nattapong; Theeramunkong, Thanaruk

    Extracting named entities (NEs) and their relations is more difficult in Thai than in other languages due to several Thai specific characteristics, including no explicit boundaries for words, phrases and sentences; few case markers and modifier clues; high ambiguity in compound words and serial verbs; and flexible word orders. Unlike most previous works which focused on NE relations of specific actions, such as work_for, live_in, located_in, and kill, this paper proposes more general types of NE relations, called predicate-oriented relation (PoR), where an extracted action part (verb) is used as a core component to associate related named entities extracted from Thai Texts. Lacking a practical parser for the Thai language, we present three types of surface features, i.e. punctuation marks (such as token spaces), entity types and the number of entities and then apply five alternative commonly used learning schemes to investigate their performance on predicate-oriented relation extraction. The experimental results show that our approach achieves the F-measure of 97.76%, 99.19%, 95.00% and 93.50% on four different types of predicate-oriented relation (action-location, location-action, action-person and person-action) in crime-related news documents using a data set of 1,736 entity pairs. The effects of NE extraction techniques, feature sets and class unbalance on the performance of relation extraction are explored.

  14. Extracting Various Classes of Data From Biological Text Using the Concept of Existence Dependency.

    Science.gov (United States)

    Taha, Kamal

    2015-11-01

    One of the key goals of biological natural language processing (NLP) is the automatic information extraction from biomedical publications. Most current constituency and dependency parsers overlook the semantic relationships between the constituents comprising a sentence and may not be well suited for capturing complex long-distance dependences. We propose in this paper a hybrid constituency-dependency parser for biological NLP information extraction called EDCC. EDCC aims at enhancing the state of the art of biological text mining by applying novel linguistic computational techniques that overcome the limitations of current constituency and dependency parsers outlined earlier, as follows: 1) it determines the semantic relationship between each pair of constituents in a sentence using novel semantic rules; and 2) it applies a semantic relationship extraction model that extracts information from different structural forms of constituents in sentences. EDCC can be used to extract different types of data from biological texts for purposes such as protein function prediction, genetic network construction, and protein-protein interaction detection. We evaluated the quality of EDCC by comparing it experimentally with six systems. Results showed marked improvement.

  15. Vaccine adverse event text mining system for extracting features from vaccine safety reports.

    Science.gov (United States)

    Botsis, Taxiarchis; Buttolph, Thomas; Nguyen, Michael D; Winiecki, Scott; Woo, Emily Jane; Ball, Robert

    2012-01-01

    To develop and evaluate a text mining system for extracting key clinical features from vaccine adverse event reporting system (VAERS) narratives to aid in the automated review of adverse event reports. Based upon clinical significance to VAERS reviewing physicians, we defined the primary (diagnosis and cause of death) and secondary features (eg, symptoms) for extraction. We built a novel vaccine adverse event text mining (VaeTM) system based on a semantic text mining strategy. The performance of VaeTM was evaluated using a total of 300 VAERS reports in three sequential evaluations of 100 reports each. Moreover, we evaluated the VaeTM contribution to case classification; an information retrieval-based approach was used for the identification of anaphylaxis cases in a set of reports and was compared with two other methods: a dedicated text classifier and an online tool. The performance metrics of VaeTM were text mining metrics: recall, precision and F-measure. We also conducted a qualitative difference analysis and calculated sensitivity and specificity for classification of anaphylaxis cases based on the above three approaches. VaeTM performed best in extracting diagnosis, second level diagnosis, drug, vaccine, and lot number features (lenient F-measure in the third evaluation: 0.897, 0.817, 0.858, 0.874, and 0.914, respectively). In terms of case classification, high sensitivity was achieved (83.1%); this was equal and better compared to the text classifier (83.1%) and the online tool (40.7%), respectively. Our VaeTM implementation of a semantic text mining strategy shows promise in providing accurate and efficient extraction of key features from VAERS narratives.
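
    The evaluation metrics named in this record (recall, precision, F-measure, sensitivity and specificity) follow from a confusion matrix. The sketch below shows the standard definitions only; the function names and the counts in the usage lines are hypothetical and are not taken from the VaeTM study.

```python
def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, balanced F-measure) from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def sensitivity_specificity(tp, fp, tn, fn):
    """Return (sensitivity, specificity) for a binary case classifier."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # same quantity as recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
p, r, f = precision_recall_f(tp=8, fp=2, fn=2)
sens, spec = sensitivity_specificity(tp=8, fp=2, tn=18, fn=2)
print(p, r, f, sens, spec)
```

    Sensitivity is the same quantity as recall; specificity additionally needs the true negatives, which is why it is reported for case classification rather than for feature extraction.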

  16. A language independent acronym extraction from biomedical texts with hidden Markov models.

    Science.gov (United States)

    Osiek, Bruno Adam; Xexeo, Gexéo; Vidal de Carvalho, Luis Alfredo

    2010-11-01

    This paper proposes to model the extraction of acronyms and their meaning from unstructured text as a stochastic process using Hidden Markov Models (HMM). The underlying, or hidden, chain is derived from the acronym, where the states in the chain are formed by the acronym's characters. The transition between two states happens when the origin state emits a signal. Signals recognizable by the HMM are tokens extracted from text. Observations are sequences of tokens also extracted from text. Given a set of observations, the acronym definition will be the observation with the highest probability to emerge from the HMM. Modelling this extraction probabilistically allows us to deal with two difficult aspects of this process: ambiguity and noise. We characterize ambiguity as the lack of a unique alignment between a character in the acronym and a token in the expansion, while noise is characterized by the absence of such an alignment. Our experiments have shown that this approach has high precision (93.50%) and recall (85.50%) rates in an environment where acronym coinage is ambiguous and noisy, such as the biomedical domain. Comparing the HMM approach with other approaches showed ours to reach the highest F1 score (89.40%) on the same corpus.

  17. Automatic extraction of gene/protein biological functions from biomedical text.

    Science.gov (United States)

    Koike, Asako; Niwa, Yoshiki; Takagi, Toshihisa

    2005-04-01

    With the rapid advancement of biomedical science and the development of high-throughput analysis methods, the extraction of various types of information from biomedical text has become critical. Since automatic functional annotations of genes are quite useful for interpreting large amounts of high-throughput data efficiently, the demand for automatic extraction of information related to gene functions from text has been increasing. We have developed a method for automatically extracting the biological process functions of genes/protein/families based on Gene Ontology (GO) from text using a shallow parser and sentence structure analysis techniques. When the gene/protein/family names and their functions are described in ACTOR (doer of action) and OBJECT (receiver of action) relationships, the corresponding GO-IDs are assigned to the genes/proteins/families. The gene/protein/family names are recognized using the gene/protein/family name dictionaries developed by our group. To achieve wide recognition of the gene/protein/family functions, we semi-automatically gather functional terms based on GO using co-occurrence, collocation similarities and rule-based techniques. A preliminary experiment demonstrated that our method has an estimated recall of 54-64% with a precision of 91-94% for actually described functions in abstracts. When applied to the PUBMED, it extracted over 190 000 gene-GO relationships and 150 000 family-GO relationships for major eukaryotes.

  18. Information extraction from full text scientific articles: Where are the keywords?

    Directory of Open Access Journals (Sweden)

    Perez-Iratxeta Carolina

    2003-05-01

    Full Text Available Abstract. Background: To date, many of the methods for extracting biological information from scientific articles are restricted to the abstract of the article. However, full text articles in electronic version, which offer larger sources of data, are currently available. Several questions arise as to whether the effort of scanning full text articles is worthwhile, or whether the information that can be extracted from the different sections of an article can be relevant. Results: In this work we addressed those questions, showing that the keyword content of the different sections of a standard scientific article (abstract, introduction, methods, results, and discussion) is very heterogeneous. Conclusions: Although the abstract contains the best ratio of keywords per total of words, other sections of the article may be a better source of biologically relevant data.

  19. The Fractal Patterns of Words in a Text: A Method for Automatic Keyword Extraction.

    Science.gov (United States)

    Najafi, Elham; Darooneh, Amir H

    2015-01-01

    A text can be considered as a one dimensional array of words. The locations of each word type in this array form a fractal pattern with certain fractal dimension. We observe that important words responsible for conveying the meaning of a text have dimensions considerably different from one, while the fractal dimensions of unimportant words are close to one. We introduce an index quantifying the importance of the words in a given text using their fractal dimensions and then ranking them according to their importance. This index measures the difference between the fractal pattern of a word in the original text relative to a shuffled version. Because the shuffled text is meaningless (i.e., words have no importance), the difference between the original and shuffled text can be used to ascertain degree of fractality. The degree of fractality may be used for automatic keyword detection. Words with the degree of fractality higher than a threshold value are assumed to be the retrieved keywords of the text. We measure the efficiency of our method for keywords extraction, making a comparison between our proposed method and two other well-known methods of automatic keyword extraction.
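
    The idea of assigning a fractal dimension to the positions of a word can be illustrated with a box-counting estimate. The sketch below is only an illustration under assumptions: the abstract does not say which estimator the authors use, and the function name `box_counting_dimension` and the choice of box sizes are hypothetical. It counts how many boxes of each size contain at least one occurrence of the word and fits the slope of log(count) against log(box size).

```python
import math


def box_counting_dimension(positions, box_sizes):
    """Estimate a box-counting dimension for word positions in a 1-D text.

    positions: indices at which the word occurs in the word array.
    box_sizes: box lengths (in words) used for the fit; an assumed choice.
    """
    log_sizes, log_counts = [], []
    for size in box_sizes:
        # A box is occupied if at least one occurrence falls inside it.
        occupied = {p // size for p in positions}
        log_sizes.append(math.log(size))
        log_counts.append(math.log(len(occupied)))

    # Ordinary least-squares slope of log(count) versus log(size).
    n = len(log_sizes)
    mean_x = sum(log_sizes) / n
    mean_y = sum(log_counts) / n
    sxx = sum((x - mean_x) ** 2 for x in log_sizes)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log_sizes, log_counts))
    return -sxy / sxx  # the dimension is the negated slope


sizes = [1, 2, 4, 8, 16, 32, 64]
evenly_spread = list(range(1024))   # toy word that occurs at every position
clustered = list(range(16))         # toy word confined to one short burst
print(box_counting_dimension(evenly_spread, sizes))
print(box_counting_dimension(clustered, sizes))
```

    In this toy setting the evenly spread word yields a dimension of one, while the clustered word yields a smaller value. The index described in the abstract additionally compares the original text with a shuffled version, which this sketch does not do.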

  20. Extracting salient sublexical units from written texts: "Emophon," a corpus-based approach to phonological iconicity.

    Science.gov (United States)

    Aryani, Arash; Jacobs, Arthur M; Conrad, Markus

    2013-01-01

    A growing body of literature in psychology, linguistics, and the neurosciences has paid increasing attention to the understanding of the relationships between phonological representations of words and their meaning: a phenomenon also known as phonological iconicity. In this article, we investigate how a text's intended emotional meaning, particularly in literature and poetry, may be reflected at the level of sublexical phonological salience and the use of foregrounded elements. To extract such elements from a given text, we developed a probabilistic model to predict the exceeding of a confidence interval for specific sublexical units concerning their frequency of occurrence within a given text contrasted with a reference linguistic corpus for the German language. Implementing this model in a computational application, we provide a text analysis tool which automatically delivers information about sublexical phonological salience allowing researchers, inter alia, to investigate effects of the sublexical emotional tone of texts based on current findings on phonological iconicity.

  1. Knowledge extraction and semantic annotation of text from the encyclopedia of life.

    Science.gov (United States)

    Thessen, Anne E; Parr, Cynthia Sims

    2014-01-01

    Numerous digitization and ontological initiatives have focused on translating biological knowledge from narrative text to machine-readable formats. In this paper, we describe two workflows for knowledge extraction and semantic annotation of text data objects featured in an online biodiversity aggregator, the Encyclopedia of Life. One workflow tags text with DBpedia URIs based on keywords. Another workflow finds taxon names in text using GNRD for the purpose of building a species association network. Both workflows work well: the annotation workflow has an F1 Score of 0.941 and the association algorithm has an F1 Score of 0.885. Existing text annotators such as Terminizer and DBpedia Spotlight performed well, but require some optimization to be useful in the ecology and evolution domain. Important future work includes scaling up and improving accuracy through the use of distributional semantics.

  2. DiMeX: A Text Mining System for Mutation-Disease Association Extraction.

    Science.gov (United States)

    Mahmood, A S M Ashique; Wu, Tsung-Jung; Mazumder, Raja; Vijay-Shanker, K

    2016-01-01

    The number of published articles describing associations between mutations and diseases is increasing at a fast pace. There is a pressing need to gather such mutation-disease associations into public knowledge bases, but manual curation slows down the growth of such databases. We have addressed this problem by developing a text-mining system (DiMeX) to extract mutation to disease associations from publication abstracts. DiMeX consists of a series of natural language processing modules that preprocess input text and apply syntactic and semantic patterns to extract mutation-disease associations. DiMeX achieves high precision and recall with F-scores of 0.88, 0.91 and 0.89 when evaluated on three different datasets for mutation-disease associations. DiMeX includes a separate component that extracts mutation mentions in text and associates them with genes. This component has been also evaluated on different datasets and shown to achieve state-of-the-art performance. The results indicate that our system outperforms the existing mutation-disease association tools, addressing the low precision problems suffered by most approaches. DiMeX was applied on a large set of abstracts from Medline to extract mutation-disease associations, as well as other relevant information including patient/cohort size and population data. The results are stored in a database that can be queried and downloaded at http://biotm.cis.udel.edu/dimex/. We conclude that this high-throughput text-mining approach has the potential to significantly assist researchers and curators to enrich mutation databases.

  3. Improving the extraction of complex regulatory events from scientific text by using ontology-based inference

    Directory of Open Access Journals (Sweden)

    Kim Jung-jae

    2011-10-01

    Full Text Available Abstract. Background: The extraction of complex events from biomedical text is a challenging task and requires in-depth semantic analysis. Previous approaches associate lexical and syntactic resources with ontologies for the semantic analysis, but fall short in testing the benefits from the use of domain knowledge. Results: We developed a system that deduces implicit events from explicitly expressed events by using inference rules that encode domain knowledge. We evaluated the system with the inference module on three tasks. First, when tested against a corpus with manually annotated events, the inference module of our system contributes 53.2% of correct extractions, but does not cause any incorrect results. Second, the system overall reproduces 33.1% of the transcription regulatory events contained in RegulonDB (up to 85.0% precision), and the inference module is required for 93.8% of the reproduced events. Third, we applied the system with minimum adaptations to the identification of cell activity regulation events, confirming that the inference improves the performance of the system also on this task. Conclusions: Our research shows that the inference based on domain knowledge plays a significant role in extracting complex events from text. This approach has great potential in recognizing the complex concepts of such biomedical ontologies as Gene Ontology in the literature.

  4. Extracting and connecting chemical structures from text sources using chemicalize.org.

    Science.gov (United States)

    Southan, Christopher; Stracz, Andras

    2013-04-23

    Exploring bioactive chemistry requires navigating between structures and data from a variety of text-based sources. While PubChem currently includes approximately 16 million document-extracted structures (15 million from patents) the extent of public inter-document and document-to-database links is still well below any estimated total, especially for journal articles. A major expansion in access to text-entombed chemistry is enabled by chemicalize.org. This on-line resource can process IUPAC names, SMILES, InChI strings, CAS numbers and drug names from pasted text, PDFs or URLs to generate structures, calculate properties and launch searches. Here, we explore its utility for answering questions related to chemical structures in documents and where these overlap with database records. These aspects are illustrated using a common theme of Dipeptidyl Peptidase 4 (DPPIV) inhibitors. Full-text open URL sources facilitated the download of over 1400 structures from a DPPIV patent and the alignment of specific examples with IC50 data. Uploading the SMILES to PubChem revealed extensive linking to patents and papers, including prior submissions from chemicalize.org as submitting source. A DPPIV medicinal chemistry paper was completely extracted and structures were aligned to the activity results table, as well as linked to other documents via PubChem. In both cases, key structures with data were partitioned from common chemistry by dividing them into individual new PDFs for conversion. Over 500 structures were also extracted from a batch of PubMed abstracts related to DPPIV inhibition. The drug structures could be stepped through each text occurrence and included some converted MeSH-only IUPAC names not linked in PubChem. Performing set intersections proved effective for detecting compounds-in-common between documents and merged extractions. This work demonstrates the utility of chemicalize.org for the exploration of chemical structure connectivity between documents and

  5. Knowledge-based extraction of adverse drug events from biomedical text.

    Science.gov (United States)

    Kang, Ning; Singh, Bharat; Bui, Chinh; Afzal, Zubair; van Mulligen, Erik M; Kors, Jan A

    2014-03-04

    Many biomedical relation extraction systems are machine-learning based and have to be trained on large annotated corpora that are expensive and cumbersome to construct. We developed a knowledge-based relation extraction system that requires minimal training data, and applied the system for the extraction of adverse drug events from biomedical text. The system consists of a concept recognition module that identifies drugs and adverse effects in sentences, and a knowledge-base module that establishes whether a relation exists between the recognized concepts. The knowledge base was filled with information from the Unified Medical Language System. The performance of the system was evaluated on the ADE corpus, consisting of 1644 abstracts with manually annotated adverse drug events. Fifty abstracts were used for training; the remaining abstracts were used for testing. The knowledge-based system obtained an F-score of 50.5%, which was 34.4 percentage points better than the co-occurrence baseline. Increasing the training set to 400 abstracts improved the F-score to 54.3%. When the system was compared with a machine-learning system, jSRE, on a subset of the sentences in the ADE corpus, our knowledge-based system achieved an F-score that is 7 percentage points higher than the F-score of jSRE trained on 50 abstracts, and still 2 percentage points higher than jSRE trained on 90% of the corpus. A knowledge-based approach can be successfully used to extract adverse drug events from biomedical text without need for a large training set. Whether use of a knowledge base is equally advantageous for other biomedical relation-extraction tasks remains to be investigated.

  6. Extraction of V-N-Collocations from Text Corpora A Feasibility Study for German

    CERN Document Server

    Breidt, E

    1996-01-01

    The usefulness of a statistical approach suggested by Church et al. (1991) is evaluated for the extraction of verb-noun (V-N) collocations from German text corpora. Some problematic issues of that method arising from properties of the German language are discussed and various modifications of the method are considered that might improve extraction results for German. The precision and recall of all variant methods is evaluated for V-N collocations containing support verbs, and the consequences for further work on the extraction of collocations from German corpora are discussed. With a sufficiently large corpus (>= 6 mio. word-tokens), the average error rate of wrong extractions can be reduced to 2.2% (97.8% precision) with the most restrictive method, however with a loss in data of almost 50% compared to a less restrictive method with still 87.6% precision. Depending on the goal to be achieved, emphasis can be put on a high recall for lexicographic purposes or on high precision for automatic lexical acquisiti...

  7. Linking genes to literature: text mining, information extraction, and retrieval applications for biology.

    Science.gov (United States)

    Krallinger, Martin; Valencia, Alfonso; Hirschman, Lynette

    2008-01-01

    Efficient access to information contained in online scientific literature collections is essential for life science research, playing a crucial role from the initial stage of experiment planning to the final interpretation and communication of the results. The biological literature also constitutes the main information source for manual literature curation used by expert-curated databases. Following the increasing popularity of web-based applications for analyzing biological data, new text-mining and information extraction strategies are being implemented. These systems exploit existing regularities in natural language to extract biologically relevant information from electronic texts automatically. The aim of the BioCreative challenge is to promote the development of such tools and to provide insight into their performance. This review presents a general introduction to the main characteristics and applications of currently available text-mining systems for life sciences in terms of the following: the type of biological information demands being addressed; the level of information granularity of both user queries and results; and the features and methods commonly exploited by these applications. The current trend in biomedical text mining points toward an increasing diversification in terms of application types and techniques, together with integration of domain-specific resources such as ontologies. Additional descriptions of some of the systems discussed here are available on the internet http://zope.bioinfo.cnio.es/bionlp_tools/.

  8. Complex Biological Event Extraction from Full Text using Signatures of Linguistic and Semantic Features

    Energy Technology Data Exchange (ETDEWEB)

    McGrath, Liam R.; Domico, Kelly O.; Corley, Courtney D.; Webb-Robertson, Bobbie-Jo M.

    2011-06-24

    Building on technical advances from the BioNLP 2009 Shared Task Challenge, the 2011 challenge sets out to generalize techniques to other complex biological event extraction tasks. In this paper, we present the implementation and evaluation of a signature-based machine-learning technique to predict events from full texts of infectious disease documents. Specifically, our approach uses novel signatures composed of traditional linguistic features and semantic knowledge to predict event triggers and their candidate arguments. Using a leave-one-out analysis, we report the contribution of linguistic and shallow semantic features in the trigger prediction and candidate argument extraction. Lastly, we examine evaluations and posit causes for errors in the infectious disease track subtasks.

  9. tmVar: a text mining approach for extracting sequence variants in biomedical literature.

    Science.gov (United States)

    Wei, Chih-Hsuan; Harris, Bethany R; Kao, Hung-Yu; Lu, Zhiyong

    2013-06-01

    Text-mining mutation information from the literature has become a critical part of the bioinformatics approach for the analysis and interpretation of sequence variations in complex diseases in the post-genomic era. It has also been used for assisting the creation of disease-related mutation databases. Most existing approaches are rule-based and focus on limited types of sequence variations, such as protein point mutations. Thus, extending their extraction scope requires significant manual effort in examining new instances and developing corresponding rules. As such, new automatic approaches are greatly needed for extracting different kinds of mutations with high accuracy. Here, we report tmVar, a text-mining approach based on conditional random field (CRF) for extracting a wide range of sequence variants described at protein, DNA and RNA levels according to a standard nomenclature developed by the Human Genome Variation Society. By doing so, we cover several important types of mutations that were not considered in past studies. Using a novel CRF label model and feature set, our method achieves higher performance than a state-of-the-art method on both our corpus (91.4 versus 78.1% in F-measure) and their own gold standard (93.9 versus 89.4% in F-measure). These results suggest that tmVar is a high-performance method for mutation extraction from biomedical literature. tmVar software and its corpus of 500 manually curated abstracts are available for download at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/pub/tmVar

  10. Interdisciplinary Approach to the Mental Lexicon: Neural Network and Text Extraction From Long-term Memory

    Directory of Open Access Journals (Sweden)

    Vardan G. Arutyunyan

    2013-01-01

    Full Text Available The paper touches upon the principles of mental lexicon organization in the light of recent research in psycho- and neurolinguistics. As a focal point of discussion, two main approaches to mental lexicon functioning are considered: the modular or dual-system approach, developed within generativism, and the opposite single-system approach, whose representatives are the connectionists and supporters of network models. The paper advocates the viewpoint that the mental lexicon is a complex psychological organization based upon a specific composition of neural networks. In this regard, the paper further elaborates on the matter of storing text in human mental space and introduces a model of text extraction from long-term memory. Based upon the data available, the author develops a methodology for modeling structures of knowledge representation in artificial intelligence systems.

  11. A Comparison of Multiple Approaches for the Extractive Summarization of Portuguese Texts

    Directory of Open Access Journals (Sweden)

    Miguel Ângelo Abrantes Costa

    2015-07-01

    Full Text Available Automatic document summarization is the task of automatically generating condensed versions of source texts, presenting itself as one of the fundamental problems in the areas of Information Retrieval and Natural Language Processing. In this paper, different extractive approaches are compared in the task of summarizing individual documents corresponding to journalistic texts written in Portuguese. Through the use of the ROUGE package for measuring the quality of the produced summaries, we report on results for two different experimental domains, involving (i) the generation of headlines for news articles written in European Portuguese, and (ii) the generation of summaries for news articles written in Brazilian Portuguese. The results demonstrate that methods based on the selection of the first sentences have the best results when building extractive news headlines in terms of several ROUGE metrics. Regarding the generation of summaries with more than one sentence, the method that achieved the best results was the LSA Squared algorithm, for the various ROUGE metrics.

  12. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications.

    Science.gov (United States)

    Savova, Guergana K; Masanz, James J; Ogren, Philip V; Zheng, Jiaping; Sohn, Sunghwan; Kipper-Schuler, Karin C; Chute, Christopher G

    2010-01-01

    We aim to build and evaluate an open-source natural language processing system for information extraction from electronic medical record clinical free-text. We describe and evaluate our system, the clinical Text Analysis and Knowledge Extraction System (cTAKES), released open-source at http://www.ohnlp.org. The cTAKES builds on existing open-source technologies: the Unstructured Information Management Architecture framework and OpenNLP natural language processing toolkit. Its components, specifically trained for the clinical domain, create rich linguistic and semantic annotations. Performance of individual components: sentence boundary detector accuracy=0.949; tokenizer accuracy=0.949; part-of-speech tagger accuracy=0.936; shallow parser F-score=0.924; named entity recognizer and system-level evaluation F-score=0.715 for exact and 0.824 for overlapping spans, and accuracy for concept mapping, negation, and status attributes for exact and overlapping spans of 0.957, 0.943, 0.859, and 0.580, 0.939, and 0.839, respectively. Overall performance is discussed against five applications. The cTAKES annotations are the foundation for methods and modules for higher-level semantic processing of clinical free-text.

  13. EnvMine: A text-mining system for the automatic extraction of contextual information

    Directory of Open Access Journals (Sweden)

    de Lorenzo Victor

    2010-06-01

    Full Text Available Abstract. Background: For ecological studies, it is crucial to count on adequate descriptions of the environments and samples being studied. Such a description must be done in terms of their physicochemical characteristics, allowing a direct comparison between different environments that would be difficult to do otherwise. Also the characterization must include the precise geographical location, to make possible the study of geographical distributions and biogeographical patterns. Currently, there is no schema for annotating these environmental features, and these data have to be extracted from textual sources (published articles). So far, this had to be performed by manual inspection of the corresponding documents. To facilitate this task, we have developed EnvMine, a set of text-mining tools devoted to retrieve contextual information (physicochemical variables and geographical locations) from textual sources of any kind. Results: EnvMine is capable of retrieving the physicochemical variables cited in the text, by means of the accurate identification of their associated units of measurement. In this task, the system achieves a recall (percentage of items retrieved) of 92% with less than 1% error. Also a Bayesian classifier was tested for distinguishing parts of the text describing environmental characteristics from others dealing with, for instance, experimental settings. Regarding the identification of geographical locations, the system takes advantage of existing databases such as GeoNames to achieve 86% recall with 92% precision. The identification of a location includes also the determination of its exact coordinates (latitude and longitude), thus allowing the calculation of distance between the individual locations. Conclusion: EnvMine is a very efficient method for extracting contextual information from different text sources, like published articles or web pages. This tool can help in determining the precise location and physicochemical
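
    Retrieving physicochemical variables through their units of measurement can be pictured as matching a number followed by a known unit. The following is a minimal sketch of that idea only, not EnvMine's actual implementation: the unit list `UNITS`, the pattern, and the function name `find_measurements` are all hypothetical and far smaller than what a real system would need.

```python
import re

# Hypothetical, deliberately tiny unit vocabulary (an assumption for this sketch).
UNITS = ["°C", "mM", "mg/L", "m", "%"]

# Longer units come first so that "mM" is tried before "m".
_unit_alternatives = "|".join(
    re.escape(u) for u in sorted(UNITS, key=len, reverse=True)
)
# A number, optional whitespace, a unit, and no letter right after the unit.
MEASUREMENT = re.compile(
    r"(-?\d+(?:\.\d+)?)\s*(" + _unit_alternatives + r")(?![A-Za-z])"
)


def find_measurements(text):
    """Return (value, unit) pairs for every number followed by a known unit."""
    return [(float(value), unit) for value, unit in MEASUREMENT.findall(text)]


sentence = "Samples were taken at 12.5 °C and 35 mM NaCl at 20 m depth."
print(find_measurements(sentence))
```

    The negative lookahead keeps a short unit such as `m` from matching the first letter of an ordinary word; a production system would also need unit normalization and context checks.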

  14. v3NLP Framework: Tools to Build Applications for Extracting Concepts from Clinical Text.

    Science.gov (United States)

    Divita, Guy; Carter, Marjorie E; Tran, Le-Thuy; Redd, Doug; Zeng, Qing T; Duvall, Scott; Samore, Matthew H; Gundlapalli, Adi V

    2016-01-01

    Substantial amounts of clinically significant information are contained only within the narrative of the clinical notes in electronic medical records. The v3NLP Framework is a set of "best-of-breed" functionalities developed to transform this information into structured data for use in quality improvement, research, population health surveillance, and decision support. MetaMap, cTAKES and similar well-known natural language processing (NLP) tools do not have sufficient scalability out of the box. The v3NLP Framework evolved out of the necessity to scale these tools up and provide a framework to customize and tune techniques that fit a variety of tasks, including document classification, tuned concept extraction for specific conditions, patient classification, and information retrieval. Beyond scalability, several v3NLP Framework-developed projects have been efficacy tested and benchmarked. While v3NLP Framework includes annotators, pipelines and applications, its functionalities enable developers to create novel annotators and to place annotators into pipelines and scaled applications. The v3NLP Framework has been successfully utilized in many projects including general concept extraction, risk factors for homelessness among veterans, and identification of mentions of the presence of an indwelling urinary catheter. Projects as diverse as predicting colonization with methicillin-resistant Staphylococcus aureus and extracting references to military sexual trauma are being built using v3NLP Framework components. The v3NLP Framework is a set of functionalities and components that provide Java developers with the ability to create novel annotators and to place those annotators into pipelines and applications to extract concepts from clinical text. There are scale-up and scale-out functionalities to process large numbers of records.

  15. EnvMine: a text-mining system for the automatic extraction of contextual information.

    Science.gov (United States)

    Tamames, Javier; de Lorenzo, Victor

    2010-06-01

    For ecological studies, it is crucial to count on adequate descriptions of the environments and samples being studied. Such a description must be done in terms of their physicochemical characteristics, allowing a direct comparison between different environments that would be difficult to do otherwise. Also the characterization must include the precise geographical location, to make possible the study of geographical distributions and biogeographical patterns. Currently, there is no schema for annotating these environmental features, and these data have to be extracted from textual sources (published articles). So far, this had to be performed by manual inspection of the corresponding documents. To facilitate this task, we have developed EnvMine, a set of text-mining tools devoted to retrieve contextual information (physicochemical variables and geographical locations) from textual sources of any kind. EnvMine is capable of retrieving the physicochemical variables cited in the text, by means of the accurate identification of their associated units of measurement. In this task, the system achieves a recall (percentage of items retrieved) of 92% with less than 1% error. Also a Bayesian classifier was tested for distinguishing parts of the text describing environmental characteristics from others dealing with, for instance, experimental settings. Regarding the identification of geographical locations, the system takes advantage of existing databases such as GeoNames to achieve 86% recall with 92% precision. The identification of a location includes also the determination of its exact coordinates (latitude and longitude), thus allowing the calculation of distance between the individual locations. EnvMine is a very efficient method for extracting contextual information from different text sources, like published articles or web pages. This tool can help in determining the precise location and physicochemical variables of sampling sites, thus facilitating the performance
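
    Once two locations have latitude and longitude, a distance between them can be computed. The abstract does not state which formula EnvMine uses; the sketch below assumes the haversine great-circle formula on a spherical Earth, and the function name `haversine_km` and the radius constant are choices made for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(lon2 - lon1)

    a = (math.sin(delta_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the valid range.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Two points on the equator, half a turn apart: half the circumference.
print(haversine_km(0.0, 0.0, 0.0, 180.0))
```

    A spherical model is usually adequate for comparing sampling sites; an ellipsoidal model would be the alternative where higher accuracy is needed.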

  16. Fuzzy-Based Segmentation for Variable Font-Sized Text Extraction from Images/Videos

    Directory of Open Access Journals (Sweden)

    Samabia Tehsin

    2014-01-01

    Full Text Available Textual information embedded in multimedia can provide a vital tool for indexing and retrieval. Much work has been done in the field of text localization and detection because of its fundamental importance. One of the biggest challenges of text detection is dealing with variation in font sizes and image resolution. This problem is aggravated by undersegmentation or oversegmentation of the regions in an image. The paper addresses this problem by proposing a solution based on a novel fuzzy-based method. It advocates a postprocessing segmentation method that can solve the problem of variation in text sizes and image resolution. The methodology is tested on the ICDAR 2011 Robust Reading Challenge dataset, which demonstrates the strength of the recommended method.

  17. A crowdsourcing workflow for extracting chemical-induced disease relations from free text.

    Science.gov (United States)

    Li, Tong Shu; Bravo, Àlex; Furlong, Laura I; Good, Benjamin M; Su, Andrew I

    2016-01-01

    Relations between chemicals and diseases are one of the most queried biomedical interactions. Although expert manual curation is the standard method for extracting these relations from the literature, it is expensive and impractical to apply to large numbers of documents, and therefore alternative methods are required. We describe here a crowdsourcing workflow for extracting chemical-induced disease relations from free text as part of the BioCreative V Chemical Disease Relation challenge. Five non-expert workers on the CrowdFlower platform were shown each potential chemical-induced disease relation highlighted in the original source text and asked to make binary judgments about whether the text supported the relation. Worker responses were aggregated through voting, and relations receiving four or more votes were predicted as true. On the official evaluation dataset of 500 PubMed abstracts, the crowd attained a 0.505 F-score (0.475 precision, 0.540 recall), with a maximum theoretical recall of 0.751 due to errors with named entity recognition. The total crowdsourcing cost was $1290.67 ($2.58 per abstract) and took a total of 7 h. A qualitative error analysis revealed that 46.66% of sampled errors were due to task limitations and gold standard errors, indicating that performance can still be improved. All code and results are publicly available at https://github.com/SuLab/crowd_cid_relex. Database URL: https://github.com/SuLab/crowd_cid_relex. © The Author(s) 2016. Published by Oxford University Press.
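
    The voting rule and the reported figures can be checked with a few lines of arithmetic. This is a sketch under assumptions, not the authors' code (which is at the URL above): the function names and the toy relation identifiers are hypothetical, while the threshold of four votes, the precision, the recall, and the cost figures come from the abstract.

```python
def aggregate_votes(votes_per_relation, threshold=4):
    """Predict a relation as true when at least `threshold` workers voted yes.

    votes_per_relation maps a relation id to a list of binary judgments (1 = yes).
    """
    return {
        relation: sum(votes) >= threshold
        for relation, votes in votes_per_relation.items()
    }


def f_score(precision, recall):
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy judgments from five workers for two hypothetical candidate relations.
toy_votes = {
    "chemical_A-disease_X": [1, 1, 1, 1, 0],  # four yes votes
    "chemical_B-disease_Y": [1, 1, 0, 0, 1],  # three yes votes
}
print(aggregate_votes(toy_votes))

# Figures reported in the abstract.
print(round(f_score(0.475, 0.540), 3))  # 0.505
print(round(1290.67 / 500, 2))          # 2.58 dollars per abstract
```

    Requiring four of five votes trades recall for precision; a lower threshold would accept more relations at the cost of more false positives.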

  18. The enhancement of TextRank algorithm by using word2vec and its application on topic extraction

    Science.gov (United States)

    Zuo, Xiaolei; Zhang, Silan; Xia, Jingbo

    2017-08-01

    TextRank is a traditional method for keyword matching and topic extraction, but it has the drawback of ignoring the semantic similarity among texts. Using a word embedding technique, Word2Vec was incorporated into traditional TextRank, and four simulation tests were carried out for model comparison. The results showed that the hybrid combination of the Word2Vec and TextRank algorithms achieved better keyword/topic extraction on our test dataset.
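
    One way to picture such a hybrid is to weight the edges of the TextRank co-occurrence graph by the cosine similarity of word vectors. The abstract does not say how the authors combine the two, so the sketch below is an assumption throughout: the function names, the window size, the damping factor, and the toy vectors (which stand in for real Word2Vec embeddings) are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two dense vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def weighted_textrank(tokens, vectors, window=2, damping=0.85, iterations=50):
    """Rank words with TextRank, weighting co-occurrence edges by similarity."""
    words = sorted(set(tokens))
    weights = {w: {} for w in words}
    for i, word in enumerate(tokens):
        # Link each token to the tokens that follow it inside the window.
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other == word:
                continue
            similarity = max(cosine(vectors[word], vectors[other]), 0.0)
            weights[word][other] = similarity
            weights[other][word] = similarity

    scores = {w: 1.0 for w in words}
    for _ in range(iterations):
        updated = {}
        for w in words:
            rank = 0.0
            for neighbour, weight in weights[w].items():
                total = sum(weights[neighbour].values())
                if total > 0:
                    rank += weight / total * scores[neighbour]
            updated[w] = (1 - damping) + damping * rank
        scores = updated
    return scores


# Toy input: three words in a row with identical (made-up) vectors.
toy_vectors = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [1.0, 0.0]}
print(weighted_textrank(["a", "b", "c"], toy_vectors, window=1))
```

    In the toy run the middle word is linked to both neighbours and therefore receives the highest score; with real embeddings, semantically close neighbours would contribute more than distant ones.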

  19. Protein function prediction using text-based features extracted from the biomedical literature: the CAFA challenge.

    Science.gov (United States)

    Wong, Andrew; Shatkay, Hagit

    2013-01-01

    Advances in sequencing technology over the past decade have resulted in an abundance of sequenced proteins whose function is yet unknown. As such, computational systems that can automatically predict and annotate protein function are in demand. Most computational systems use features derived from protein sequence or protein structure to predict function. In an earlier work, we demonstrated the utility of biomedical literature as a source of text features for predicting protein subcellular location. We have also shown that the combination of text-based and sequence-based prediction improves the performance of location predictors. Following up on this work, for the Critical Assessment of Function Annotations (CAFA) Challenge, we developed a text-based system that aims to predict molecular function and biological process (using Gene Ontology terms) for unannotated proteins. In this paper, we present the preliminary work and evaluation that we performed for our system, as part of the CAFA challenge. We have developed a preliminary system that represents proteins using text-based features and predicts protein function using a k-nearest neighbour classifier (Text-KNN). We selected text features for our classifier by extracting key terms from biomedical abstracts based on their statistical properties. The system was trained and tested using 5-fold cross-validation over a dataset of 36,536 proteins. System performance was measured using the standard measures of precision, recall, F-measure and overall accuracy. The performance of our system was compared to two baseline classifiers: one that assigns function based solely on the prior distribution of protein function (Base-Prior) and one that assigns function based on sequence similarity (Base-Seq). The overall prediction accuracy of Text-KNN, Base-Prior, and Base-Seq for molecular function classes are 62%, 43%, and 58% while the overall accuracy for biological process classes are 17%, 11%, and 28% respectively. Results

  20. Protein Function Prediction using Text-based Features extracted from the Biomedical Literature: The CAFA Challenge

    Science.gov (United States)

    2013-01-01

    Background Advances in sequencing technology over the past decade have resulted in an abundance of sequenced proteins whose function is yet unknown. As such, computational systems that can automatically predict and annotate protein function are in demand. Most computational systems use features derived from protein sequence or protein structure to predict function. In an earlier work, we demonstrated the utility of biomedical literature as a source of text features for predicting protein subcellular location. We have also shown that the combination of text-based and sequence-based prediction improves the performance of location predictors. Following up on this work, for the Critical Assessment of Function Annotations (CAFA) Challenge, we developed a text-based system that aims to predict molecular function and biological process (using Gene Ontology terms) for unannotated proteins. In this paper, we present the preliminary work and evaluation that we performed for our system, as part of the CAFA challenge. Results We have developed a preliminary system that represents proteins using text-based features and predicts protein function using a k-nearest neighbour classifier (Text-KNN). We selected text features for our classifier by extracting key terms from biomedical abstracts based on their statistical properties. The system was trained and tested using 5-fold cross-validation over a dataset of 36,536 proteins. System performance was measured using the standard measures of precision, recall, F-measure and overall accuracy. The performance of our system was compared to two baseline classifiers: one that assigns function based solely on the prior distribution of protein function (Base-Prior) and one that assigns function based on sequence similarity (Base-Seq). The overall prediction accuracy of Text-KNN, Base-Prior, and Base-Seq for molecular function classes are 62%, 43%, and 58% while the overall accuracy for biological process classes are 17%, 11%, and 28

  1. Automatic extraction of reference gene from literature in plants based on texting mining.

    Science.gov (United States)

    He, Lin; Shen, Gengyu; Li, Fei; Huang, Shuiqing

    2015-01-01

    Real-Time Quantitative Polymerase Chain Reaction (qRT-PCR) is widely used in biological research. Selecting a stable reference gene is key to the validity of a qRT-PCR experiment. However, selecting an appropriate reference gene usually requires strict and costly experimental verification. The scientific literature has accumulated many findings on the selection of reference genes. Therefore, mining reference genes under specific experimental environments from the literature can provide quite reliable reference genes for similar qRT-PCR experiments, with the advantages of reliability, economy and efficiency. This paper proposes an auxiliary method for discovering reference genes from the literature, which integrates machine learning, natural language processing and text mining approaches. The validity tests showed that this new method has better precision and recall on the extraction of reference genes and their environments.

  2. Unsupervised Learning of mDTD Extraction Patterns for Web Text Mining.

    Science.gov (United States)

    Kim, Dongseok; Jung, Hanmin; Lee, Gary Geunbae

    2003-01-01

    Presents a new extraction pattern, modified Document Type Definition (mDTD), which relies on analytical interpretation to identify extraction target from the contents of Web documents. Experiments with 330 Korean and 220 English Web documents on audio and video shopping sites yielded an average extraction precision of 91.3% for Korean and 81.9%…

  3. Extraction of events and rules of land use/cover change from the policy text

    Science.gov (United States)

    Lin, Guangfa; Xia, Beicheng; Huang, Wangli; Jiang, Huixian; Chen, Youfei

    2007-06-01

    A database recording snapshots of land parcel history is the foundation for most models that simulate the land use/cover change (LUCC) process. However, sequences of temporal snapshots are not sufficient to deduce and describe the mechanism of the LUCC process. The temporal relationships between the recorded LUCC scenarios cannot be categorically transferred into causal relationships, which are regarded as a key factor in spatial-temporal reasoning. The proprietors of land parcels adapted themselves to government policies and to changes in the production market, and then made decisions accordingly. The occurrence of each change of a land parcel in an urban area was often related to one or more decision texts when it was investigated on the local scale with a high-resolution background scene. These decision texts may come from different sections of a hierarchical government system on different levels, such as villages or communities, towns or counties, cities, provinces or even the paramount. All these texts were the result of balancing the advantages and disadvantages of different interest groups. They are the essential forces of LUCC in the human dimension. Up to now, a methodology is still wanted for expressing these forces in a simulation system using GIS as a language. The present paper is part of our initial research on this topic. The term "Event" is a very important concept in the frame of "Object-Oriented" theory in computer science, while in the domain of temporal GIS the concept of event was developed in another category. The definitions of events and their transformation relationships are discussed in this paper on three modeling levels: the real-world level, the conceptual level and the programming level. In this context, with a case study of LUCC over the recent 30 years in Xiamen city of Fujian province, P. R. China, the paper focused on how to extract information of events and rules from the policy files collected and integrate

  4. Clinical records anonymisation and text extraction (CRATE): an open-source software system.

    Science.gov (United States)

    Cardinal, Rudolf N

    2017-04-26

    Electronic medical records contain information of value for research, but contain identifiable and often highly sensitive confidential information. Patient-identifiable information cannot in general be shared outside clinical care teams without explicit consent, but anonymisation/de-identification allows research uses of clinical data without explicit consent. This article presents CRATE (Clinical Records Anonymisation and Text Extraction), an open-source software system with separable functions: (1) it anonymises or de-identifies arbitrary relational databases, with sensitivity and precision similar to previous comparable systems; (2) it uses public secure cryptographic methods to map patient identifiers to research identifiers (pseudonyms); (3) it connects relational databases to external tools for natural language processing; (4) it provides a web front end for research and administrative functions; and (5) it supports a specific model through which patients may consent to be contacted about research. Creation and management of a research database from sensitive clinical records with secure pseudonym generation, full-text indexing, and a consent-to-contact process is possible and practical using entirely free and open-source software.

  5. Efficient extraction of protein-protein interactions from full-text articles.

    Science.gov (United States)

    Hakenberg, Jörg; Leaman, Robert; Vo, Nguyen Ha; Jonnalagadda, Siddhartha; Sullivan, Ryan; Miller, Christopher; Tari, Luis; Baral, Chitta; Gonzalez, Graciela

    2010-01-01

    Proteins and their interactions govern virtually all cellular processes, such as regulation, signaling, metabolism, and structure. Most experimental findings pertaining to such interactions are discussed in research papers, which, in turn, get curated by protein interaction databases. Authors, editors, and publishers benefit from efforts to alleviate the tasks of searching for relevant papers, evidence for physical interactions, and proper identifiers for each protein involved. The BioCreative II.5 community challenge addressed these tasks in a competition-style assessment to evaluate and compare different methodologies, to make aware of the increasing accuracy of automated methods, and to guide future implementations. In this paper, we present our approaches for protein-named entity recognition, including normalization, and for extraction of protein-protein interactions from full text. Our overall goal is to identify efficient individual components, and we compare various compositions to handle a single full-text article in between 10 seconds and 2 minutes. We propose strategies to transfer document-level annotations to the sentence-level, which allows for the creation of a more fine-grained training corpus; we use this corpus to automatically derive around 5,000 patterns. We rank sentences by relevance to the task of finding novel interactions with physical evidence, using a sentence classifier built from this training corpus. Heuristics for paraphrasing sentences help to further remove unnecessary information that might interfere with patterns, such as additional adjectives, clauses, or bracketed expressions. In BioCreative II.5, we achieved an f-score of 22 percent for finding protein interactions, and 43 percent for mapping proteins to UniProt IDs; disregarding species, f-scores are 30 percent and 55 percent, respectively. On average, our best-performing setup required around 2 minutes per full text. All data and pattern sets as well as Java classes that

  6. Text data extraction for a prospective, research-focused data mart: implementation and validation

    Directory of Open Access Journals (Sweden)

    Hinchcliff Monique

    2012-09-01

    Full Text Available Abstract Background Translational research typically requires data abstracted from medical records as well as data collected specifically for research. Unfortunately, many data within electronic health records are represented as text that is not amenable to aggregation for analyses. We present a scalable open source SQL Server Integration Services package, called Regextractor, for including regular expression parsers into a classic extract, transform, and load workflow. We have used Regextractor to abstract discrete data from textual reports from a number of ‘machine generated’ sources. To validate this package, we created a pulmonary function test data mart and analyzed the quality of the data mart versus manual chart review. Methods Eleven variables from pulmonary function tests performed closest to the initial clinical evaluation date were studied for 100 randomly selected subjects with scleroderma. One research assistant manually reviewed, abstracted, and entered relevant data into a database. Correlation with data obtained from the automated pulmonary function test data mart within the Northwestern Medical Enterprise Data Warehouse was determined. Results There was a near perfect (99.5%) agreement between results generated from the Regextractor package and those obtained via manual chart abstraction. The pulmonary function test data mart has been used subsequently to monitor disease progression of patients in the Northwestern Scleroderma Registry. In addition to the pulmonary function test example presented in this manuscript, the Regextractor package has been used to create cardiac catheterization and echocardiography data marts. The Regextractor package was released as open source software in October 2009 and has been downloaded 552 times as of 6/1/2012. Conclusions Collaboration between clinical researchers and biomedical informatics experts enabled the development and validation of a tool (Regextractor) to parse, abstract and assemble

  7. Text data extraction for a prospective, research-focused data mart: implementation and validation.

    Science.gov (United States)

    Hinchcliff, Monique; Just, Eric; Podlusky, Sofia; Varga, John; Chang, Rowland W; Kibbe, Warren A

    2012-09-13

    Translational research typically requires data abstracted from medical records as well as data collected specifically for research. Unfortunately, many data within electronic health records are represented as text that is not amenable to aggregation for analyses. We present a scalable open source SQL Server Integration Services package, called Regextractor, for including regular expression parsers into a classic extract, transform, and load workflow. We have used Regextractor to abstract discrete data from textual reports from a number of 'machine generated' sources. To validate this package, we created a pulmonary function test data mart and analyzed the quality of the data mart versus manual chart review. Eleven variables from pulmonary function tests performed closest to the initial clinical evaluation date were studied for 100 randomly selected subjects with scleroderma. One research assistant manually reviewed, abstracted, and entered relevant data into a database. Correlation with data obtained from the automated pulmonary function test data mart within the Northwestern Medical Enterprise Data Warehouse was determined. There was a near perfect (99.5%) agreement between results generated from the Regextractor package and those obtained via manual chart abstraction. The pulmonary function test data mart has been used subsequently to monitor disease progression of patients in the Northwestern Scleroderma Registry. In addition to the pulmonary function test example presented in this manuscript, the Regextractor package has been used to create cardiac catheterization and echocardiography data marts. The Regextractor package was released as open source software in October 2009 and has been downloaded 552 times as of 6/1/2012. Collaboration between clinical researchers and biomedical informatics experts enabled the development and validation of a tool (Regextractor) to parse, abstract and assemble structured data from text data contained in the electronic health
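
    The idea of placing regular expression parsers in the transform step of an extract, transform, and load workflow can be sketched as follows. This is not Regextractor itself, which is an SQL Server Integration Services package; it is a Python stand-in, and the report layout, field names and patterns are hypothetical.

```python
import re

# Hypothetical patterns for a machine-generated pulmonary function report.
PATTERNS = {
    "fvc_l": re.compile(r"FVC\s*[:=]?\s*(\d+(?:\.\d+)?)\s*L\b", re.IGNORECASE),
    "fev1_l": re.compile(r"FEV1\s*[:=]?\s*(\d+(?:\.\d+)?)\s*L\b", re.IGNORECASE),
    "dlco_pct": re.compile(r"DLCO\s*[:=]?\s*(\d+(?:\.\d+)?)\s*%", re.IGNORECASE),
}


def extract_fields(report):
    """Transform step: turn one free-text report into a row of discrete values.

    Fields that do not match are stored as None so that gaps stay visible
    in the resulting data mart instead of being silently dropped.
    """
    row = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(report)
        row[name] = float(match.group(1)) if match else None
    return row


# Hypothetical report text (extract step output).
report = "PULMONARY FUNCTION TEST\nFVC: 3.42 L\nFEV1 = 2.71 L\nDLCO 68 % predicted"
row = extract_fields(report)
```

    Each resulting row could then be loaded into a table and compared against manually abstracted values, which is the kind of validation the abstract describes.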

  8. Knowledge-based extraction of adverse drug events from biomedical text

    NARCIS (Netherlands)

    N. Kang (Ning); B. Singh (Bharat); C. Bui (Chinh); Z. Afzal (Zubair); E.M. van Mulligen (Erik); J.A. Kors (Jan)

    2014-01-01

    Background: Many biomedical relation extraction systems are machine-learning based and have to be trained on large annotated corpora that are expensive and cumbersome to construct. We developed a knowledge-based relation extraction system that requires minimal training data, and applied

  9. Chemical Composition and Biological Activity of Extracts Obtained by Supercritical Extraction and Ethanolic Extraction of Brown, Green and Red Propolis Derived from Different Geographic Regions in Brazil.

    Directory of Open Access Journals (Sweden)

    Bruna Aparecida Souza Machado

    Full Text Available The variations in the chemical composition, and consequently in the biological activity, of propolis are associated with its type and geographic origin. Considering this fact, this study evaluated propolis extracts obtained by supercritical extraction (SCO2) and ethanolic extraction (EtOH), in eight samples of different types of propolis (red, green and brown), collected from different regions in Brazil. The content of phenolic compounds, flavonoids, in vitro antioxidant activity (DPPH and ABTS), Artepillin C, p-coumaric acid and antimicrobial activity against two bacteria were determined for all extracts. For the EtOH extracts, the anti-proliferative activity against the B16F10 cell line was also evaluated. Amongst the samples evaluated, the red propolis from the Brazilian Northeast (states of Sergipe and Alagoas) showed the highest biological potential, as well as the largest content of antioxidant compounds. The best results were shown for the extracts obtained through the conventional extraction method (EtOH). However, the highest concentrations of Artepillin C and p-coumaric acid were identified in the extracts from SCO2, indicating a higher selectivity for the extraction of these compounds. It was verified that the composition and biological activity of the Brazilian propolis vary significantly, depending on the type of sample and geographical area of collection.

  10. Region-Based Building Rooftop Extraction and Change Detection

    Science.gov (United States)

    Tian, J.; Metzlaff, L.; d'Angelo, P.; Reinartz, P.

    2017-09-01

    Automatic extraction of building changes is important for many applications like disaster monitoring and city planning. Although a lot of research work is available based on 2D as well as 3D data, an improvement in accuracy and efficiency is still needed. The introduction of digital surface models (DSMs) to building change detection has strongly improved the resulting accuracy. In this paper, a post-classification approach is proposed for building change detection using satellite stereo imagery. Firstly, DSMs are generated from satellite stereo imagery and further refined by using a segmentation result obtained from the Sobel gradients of the panchromatic image. Besides the refined DSMs, the panchromatic image and the pansharpened multispectral image are used as input features for mean-shift segmentation. The DSM is used to calculate the nDSM, out of which the initial building candidate regions are extracted. The candidate mask is further refined by morphological filtering and by excluding shadow regions. Following this, all segments that overlap with a building candidate region are determined. A building-oriented segment merging procedure is introduced to generate a final building rooftop mask. As the last step, object-based change detection is performed by directly comparing the building rooftops extracted from the pre- and post-event imagery and by fusing the change indicators with the rooftop region map. A quantitative and qualitative assessment of the proposed approach is provided by using WorldView-2 satellite data from Istanbul, Turkey.
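
    The step from DSM to nDSM to initial building candidates can be sketched in a few lines. The abstract does not say how the nDSM is derived; the sketch assumes the common definition of a normalized DSM as surface height minus terrain height, and both the terrain grid and the 3 m height threshold are hypothetical.

```python
def ndsm(dsm, dtm):
    """Normalized DSM: per-cell height above the terrain (DSM minus DTM)."""
    return [
        [surface - terrain for surface, terrain in zip(dsm_row, dtm_row)]
        for dsm_row, dtm_row in zip(dsm, dtm)
    ]


def candidate_mask(heights, min_height=3.0):
    """Mark cells at least min_height above ground as building candidates.

    The 3.0 m default is an illustrative assumption, not a value from the paper.
    """
    return [[h >= min_height for h in row] for row in heights]


# Toy 2 x 3 elevation grids in metres (hypothetical values).
dsm = [[12.0, 12.5, 20.0], [12.1, 19.8, 20.2]]
dtm = [[12.0, 12.0, 12.0], [12.0, 12.0, 12.0]]
mask = candidate_mask(ndsm(dsm, dtm))
```

    In the workflow described above, such a mask would only be a starting point; morphological filtering, shadow exclusion and segment merging follow.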

  11. Carniola oživljena: Changing Practice in Citing Slovenian Regions in English Texts

    Directory of Open Access Journals (Sweden)

    Donald F. Reindl

    2010-05-01

    Full Text Available The past century has witnessed a striking change in the representation of Slovenia’s traditional regions in English texts. After the Second World War, Slovenians progressively replaced the traditional English exonyms for these regions with endonyms in English texts. This trend was accompanied by published works and teaching practice that increasingly insisted on the exclusive use of endonyms in English texts. However, following the dissolution of Yugoslavia and Slovenian independence, there has been a return to the traditional English exonyms. This article maps this changing practice through selected English texts from the past three centuries. It also addresses a number of pitfalls connected with the use of endonyms as well as persistent questions regarding the use of endonyms. Because English is a global language, the choices made by those writing in English directly affect how Slovenia and Slovenian identity are represented at the global level. As such, the conclusions of this paper apply directly to Slovenian-English translation practice and indirectly to Slovenian literature and culture conveyed through English translation.

  12. Validation of the Total Visual Acuity Extraction Algorithm (TOVA) for Automated Extraction of Visual Acuity Data From Free Text, Unstructured Clinical Records.

    Science.gov (United States)

    Baughman, Douglas M; Su, Grace L; Tsui, Irena; Lee, Cecilia S; Lee, Aaron Y

    2017-03-01

    With increasing volumes of electronic health record data, algorithm-driven extraction may aid manual extraction. Visual acuity often is extracted manually in vision research. The total visual acuity extraction algorithm (TOVA) is presented and validated for automated extraction of visual acuity from free text, unstructured clinical notes. Consecutive inpatient ophthalmology notes over an 8-year period from the University of Washington healthcare system in Seattle, WA were used for validation of TOVA. The total visual acuity extraction algorithm applied natural language processing to recognize Snellen visual acuity in free text notes and assign laterality. The best corrected measurement was determined for each eye and converted to logMAR. The algorithm was validated against manual extraction of a subset of notes. A total of 6266 clinical records were obtained giving 12,452 data points. In a subset of 644 validated notes, comparison of manually extracted data versus TOVA output showed 95% concordance. Interrater reliability testing gave κ statistics of 0.94 (95% confidence interval [CI], 0.89-0.99), 0.96 (95% CI, 0.94-0.98), 0.95 (95% CI, 0.92-0.98), and 0.94 (95% CI, 0.90-0.98) for acuity numerators, denominators, adjustments, and signs, respectively. Pearson correlation coefficient was 0.983. Linear regression showed an R2 of 0.966 (P unstructured clinical notes and provides an open source method of data extraction. Automated visual acuity extraction through natural language processing can be a valuable tool for data extraction from free text ophthalmology notes.
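
    The conversion from Snellen notation to logMAR mentioned in the abstract follows the standard relation logMAR = log10(denominator / numerator). The sketch below is not the TOVA algorithm; the regular expression, the use of OD/OS as laterality markers, the sample note, and the rule of keeping the lowest logMAR per eye are all simplifying assumptions for illustration.

```python
import math
import re

# Hypothetical pattern: an eye marker (OD = right, OS = left) followed
# shortly by a Snellen fraction such as 20/40.
SNELLEN = re.compile(
    r"\b(OD|OS)\b[^0-9]{0,20}?(\d{2,3})\s*/\s*(\d{2,3})", re.IGNORECASE
)


def snellen_to_logmar(numerator, denominator):
    """Convert a Snellen fraction to logMAR (20/20 -> 0.0, 20/200 -> 1.0)."""
    return math.log10(denominator / numerator)


def extract_acuities(note):
    """Return the best (lowest) logMAR found for each eye in a free-text note."""
    best = {}
    for eye, num, den in SNELLEN.findall(note):
        value = snellen_to_logmar(int(num), int(den))
        eye = eye.upper()
        if eye not in best or value < best[eye]:
            best[eye] = value
    return best


# Hypothetical clinical note text.
note = "VA: OD 20/40, OS 20/200. Repeat with glasses: OS 20/100."
acuities = extract_acuities(note)
```

    A production system would also need to handle adjustments such as "+2" or "-1" and non-numeric acuities, which the abstract mentions but this sketch ignores.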

  13. Apache Clinical Text and Knowledge Extraction System (cTAKES) | Informatics Technology for Cancer Research (ITCR)

    Science.gov (United States)

    The tool extracts deep phenotypic information from the clinical narrative at the document-, episode-, and patient-level. The final output is FHIR compliant patient-level phenotypic summary which can be consumed by research warehouses or the DeepPhe native visualization tool.

  14. MedXN: an open source medication extraction and normalization tool for clinical text

    Science.gov (United States)

    Sohn, Sunghwan; Clark, Cheryl; Halgrim, Scott R; Murphy, Sean P; Chute, Christopher G; Liu, Hongfang

    2014-01-01

    Objective We developed the Medication Extraction and Normalization (MedXN) system to extract comprehensive medication information and normalize it to the most appropriate RxNorm concept unique identifier (RxCUI) as specifically as possible. Methods Medication descriptions in clinical notes were decomposed into medication name and attributes, which were separately extracted using RxNorm dictionary lookup and regular expression. Then, each medication name and its attributes were combined together according to RxNorm convention to find the most appropriate RxNorm representation. To do this, we employed serialized hierarchical steps implemented in Apache's Unstructured Information Management Architecture. We also performed synonym expansion, removed false medications, and employed inference rules to improve the medication extraction and normalization performance. Results An evaluation on test data of 397 medication mentions showed F-measures of 0.975 for medication name and over 0.90 for most attributes. The RxCUI assignment produced F-measures of 0.932 for medication name and 0.864 for full medication information. Most false negative RxCUI assignments in full medication information are due to human assumption of missing attributes and medication names in the gold standard. Conclusions The MedXN system (http://sourceforge.net/projects/ohnlp/files/MedXN/) was able to extract comprehensive medication information with high accuracy and demonstrated good normalization capability to RxCUI as long as explicit evidence existed. More sophisticated inference rules might result in further improvements to specific RxCUI assignments for incomplete medication descriptions. PMID:24637954

  15. Automatic segmentation of brain images: selection of region extraction methods

    Science.gov (United States)

    Gong, Leiguang; Kulikowski, Casimir A.; Mezrich, Reuben S.

    1991-07-01

    In automatically analyzing brain structures from a MR image, the choice of low level region extraction methods depends on the characteristics of both the target object and the surrounding anatomical structures in the image. The authors have experimented with local thresholding, global thresholding, and other techniques, using various types of MR images for extracting the major brain landmarks and different types of lesions. This paper describes specifically a local-binary thresholding method and a new global-multiple thresholding technique developed for MR image segmentation and analysis. The initial testing results on their segmentation performance are presented, followed by a comparative analysis of the two methods and their ability to extract different types of normal and abnormal brain structures -- the brain matter itself, tumors, regions of edema surrounding lesions, multiple sclerosis lesions, and the ventricles of the brain. The analysis and experimental results show that the global multiple thresholding techniques are more than adequate for extracting regions that correspond to the major brain structures, while local binary thresholding is helpful for more accurate delineation of small lesions such as those produced by MS, and for the precise refinement of lesion boundaries. The detection of other landmarks, such as the interhemispheric fissure, may require other techniques, such as line-fitting. These experiments have led to the formulation of a set of generic computer-based rules for selecting the appropriate segmentation packages for particular types of problems, based on which further development of an innovative knowledge-based, goal directed biomedical image analysis framework is being made. The system will carry out the selection automatically for a given specific analysis task.

  16. Impact of Machine-Translated Text on Entity and Relationship Extraction

    Science.gov (United States)

    2014-12-01

    onto an existing ontology of frames at the sentence level, using FrameNet, a structured language model, and through Semantic Role Labeling (SRL...Arabic language news articles collected from the web using Contour, a social network analysis tool acquired via a Small Business Innovation Research... semantic modeling software to automatically build detailed network models from unstructured text. Contour imports unstructured text and then maps the text

  17. Comparison of Grouping Methods for Template Extraction from VA Medical Record Text.

    Science.gov (United States)

    Redd, Andrew M; Gundlapalli, Adi V; Divita, Guy; Tran, Le-Thuy; Pettey, Warren B P; Samore, Matthew H

    2017-01-01

    We investigate options for grouping templates for the purpose of template identification and extraction from electronic medical records. We sampled a corpus of 1000 documents originating from the Veterans Health Administration (VA) electronic medical record. We grouped documents through hashing and binning tokens (Hashed) as well as by the top 5% of tokens identified as important through the term frequency inverse document frequency metric (TF-IDF). We then compared the approaches on the number of groups with 3 or more and the resulting longest common subsequences (LCSs) common to all documents in the group. We found that the Hashed method had a higher success rate for finding LCSs, and longer LCSs, than the TF-IDF method. However, the TF-IDF approach found more groups than the Hashed method and subsequently more long sequences, although the average length of its LCSs was lower. In conclusion, each algorithm appears to have areas where it is superior.
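
    The longest common subsequence shared by every document in a group can be approximated by folding a pairwise LCS over the group, as in the sketch below. This is a heuristic chosen for brevity, not necessarily the authors' procedure: the fold returns a subsequence common to all documents but is not guaranteed to be the longest one. The sample documents and whitespace tokenization are hypothetical.

```python
from functools import reduce


def lcs(a, b):
    """Longest common subsequence of two token lists (dynamic programming)."""
    rows, cols = len(a), len(b)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    # Walk back through the table to recover the subsequence itself.
    out = []
    i, j = rows, cols
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def group_template(documents):
    """Fold pairwise LCS over a group to get tokens shared, in order, by all."""
    token_lists = [doc.split() for doc in documents]
    return reduce(lcs, token_lists)


# Hypothetical group of templated notes.
group = [
    "Chief complaint : cough . Pain score : 3 of 10",
    "Chief complaint : fever . Pain score : 7 of 10",
    "Chief complaint : rash . Pain score : 0 of 10",
]
template = group_template(group)
```

    The surviving tokens are the fixed template text, while the positions that dropped out correspond to the fields filled in per patient.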

  18. Conversation Thread Extraction and Topic Detection in Text-Based Chat

    National Research Council Canada - National Science Library

    Adams, Paige H

    2008-01-01

    Text-based chat systems are widely used within the Department of Defense, but the standard systems available do not provide robust capabilities for search, information retrieval, or information assurance...

  19. Text mining tools for extracting information about microbial biodiversity in food

    OpenAIRE

    Deleger, Louise; Bossy, Robert; Nédellec, Claire

    2017-01-01

    Introduction Information on food microbial biodiversity is scattered across millions of scientific papers (2 million references in the PubMed bibliographic database in 2017). It is impossible to manually achieve an exhaustive analysis of these documents. Text-mining and knowledge engineering methods can assist the researcher in finding relevant information. Material & Methods We propose to study bacterial biodiversity using text-mining tools from the Alvis platform. First, w...

  20. In vitro antiplasmodial activity and toxicity assessment of plant extracts used in traditional malaria therapy in the Lake Victoria Region

    Directory of Open Access Journals (Sweden)

    Teresa Akeng'a Ayuko

    2009-08-01

    Full Text Available As part of our program screening the flora of the Lake Victoria Region, a total of 54 organic extracts from seven plant families (8 species were individually tested for antiplasmodial activity against chloroquine-sensitive [Sierra Leone (D-6] and chloroquine-resistant [Vietnam (W-2] strains. Only 22% of these extracts exhibited very high in vitro antiplasmodial activity. Six methanol (MeOH extracts and one chloroform extract showed in vitro antiplasmodial activity against the D-6 Plasmodium falciparum strain, while only three MeOH extracts were active against the W-2 strain. All of the ethyl acetate extracts proved to be inactive against both strains of P. falciparum. A brine shrimp cytotoxicity assay was used to predict the potential toxicity of the extracts. The cytotoxicity to antiplasmodial ratios for the MeOH extracts were found to be greater than 100, which could indicate that the extracts are of low toxicity.

  1. Parts-of-Speech Tagger Errors Do Not Necessarily Degrade Accuracy in Extracting Information from Biomedical Text

    Directory of Open Access Journals (Sweden)

    2008-06-01

    Full Text Available Background: An ongoing assessment of the literature is difficult with the rapidly increasing volume of research publications and limited effective information extraction tools which identify entity relationships from text. A recent study reported development of Muscorian, a generic text processing tool for extracting protein-protein interactions from text that achieved comparable performance to biomedical-specific text processing tools. This result was unexpected since potential errors from a series of text analysis processes are likely to adversely affect the outcome of the entire process. Most biomedical entity relationship extraction tools have used a biomedical-specific parts-of-speech (POS) tagger, as errors in POS tagging are likely to affect subsequent semantic analysis of the text, such as shallow parsing. This study aims to evaluate the parts-of-speech (POS) tagging accuracy and attempts to explore whether a comparable performance is obtained when a generic POS tagger, MontyTagger, is used in place of MedPost, a tagger trained in biomedical text. Results: Our results demonstrated that MontyTagger, Muscorian's POS tagger, has a POS tagging accuracy of 83.1% when tested on biomedical text. Replacing MontyTagger with MedPost did not result in a significant improvement in entity relationship extraction from text; precision of 55.6% from MontyTagger versus 56.8% from MedPost on directional relationships and 86.1% from MontyTagger compared to 81.8% from MedPost on nondirectional relationships. This is unexpected as the potential for poor POS tagging by MontyTagger is likely to affect the outcome of the information extraction. An analysis of POS tagging errors demonstrated that 78.5% of tagging errors are being compensated by shallow parsing. Thus, despite 83.1% tagging accuracy, MontyTagger has a functional tagging accuracy of 94.6%. Conclusions: The POS tagging error does not adversely affect the information extraction task if the

  2. Computerized extraction of information on the quality of diabetes care from free text in electronic patient records of general practitioners

    NARCIS (Netherlands)

    Voorham, Jaco; Denig, Petra

    2007-01-01

    Objective: This study evaluated a computerized method for extracting numeric clinical measurements related to diabetes care from free text in electronic patient records (EPR) of general practitioners. Design and Measurements: Accuracy of this number-oriented approach was compared to manual chart

  3. An e-Learning System for Extracting Text Comprehension and Learning Style Characteristics

    Science.gov (United States)

    Samarakou, Maria; Tsaganou, Grammatiki; Papadakis, Andreas

    2018-01-01

    Technology-mediated learning is very actively and widely researched, with numerous e-learning environments designed for different educational purposes developed during the past few decades. Still, their organization and texts are not structured according to any theory of educational comprehension. Modern education is even more flexible and, thus,…

  4. Adding a Capability to Extract Sentiment from Text Using HanDles

    Science.gov (United States)

    2012-05-01

    transition HanDles to an operational environment, DRDC and CF stakeholders must decide which documents are the most...chosen a text category, we will be able to proceed with training and testing the system in a more realistic environment than those

  5. A Text Mining Approach for Extracting Lessons Learned from Project Documentation: An Illustrative Case Study

    Directory of Open Access Journals (Sweden)

    Benjamin Matthies

    2017-12-01

    Full Text Available Lessons learned are important building blocks for continuous learning in project-based organisations. Nonetheless, the practical reality is that lessons learned are often not consistently reused for organisational learning. Two problems are commonly described in this context: the information overload and the lack of procedures and methods for the assessment and implementation of lessons learned. This paper addresses these problems, and appropriate solutions are combined in a systematic lesson learned process. Latent Dirichlet Allocation is presented to solve the first problem. Regarding the second problem, established risk management methods are adapted. The entire lessons learned process will be demonstrated in a practical case study

  6. How to Design and Present Texts to Cultivate Balanced Regional Images in Geography Education

    Science.gov (United States)

    Lee, Dong-Min; Ryu, Jaemyong

    2013-01-01

    This article examines possibilities associated with the cultivation of balanced regional images via the use of simple methods. Two experiments based on the primacy effect and the painting picture rule, or visual depiction of regions, were conducted. The results show significant differences in the formation of regional images. More specifically,…

  7. C-C1-02: Data Extraction From Text, Step 1: Preparing Text for Machine Processing

    Science.gov (United States)

    Carrell, David

    2010-01-01

    Background: Natural language processing (NLP) uses software to assist in the extraction of information from clinical text, a process usually performed entirely by chart abstractors. Before NLP can be applied the text in question must be prepared for machine processing. In research settings this pre-processing work often involves several successive and related tasks, requiring substantial amounts of time and attention from people representing various types of clinical, scientific and technical expertise. Appreciating the tasks and participants involved in pre-processing clinical text can make the work more manageable, efficient, and effective. Methods: The information presented here comes from case study analyses of three small-scale projects involving preparation of clinical text (pathology reports, radiology reports, and progress notes) for processing by the Cancer Text Information Extraction System. Supplementing these experiences is information from anecdotal conversations with natural language processing experts. Results: Ten separate pre-processing tasks were identified: obtaining source feeds, assessing completeness, de-duplication, universe description, cleaning and formatting, de-identification, database loading, sampling, preparation of the NLP system input feed, and quality assurance. Nine types of expertise or task participants required for preprocessing were identified: IRB representative, source-system manager, network/dbase administrator, programmer, statistician, investigator, informaticist, clinical domain expert, and manual chart abstractor. Conclusions: Pre-processing clinical text is an important phase and potentially challenging aspect of extracting information from clinical text using NLP. Because researchers require accurate information about the larger universe of documents or patients represented by the sampled and processed text, pre-processing can present numerous challenges, the solutions to which draw on many areas of expertise in a

  8. Extracting salient sublexical units from written texts: “Emophon,” a corpus-based approach to phonological iconicity

    Science.gov (United States)

    Aryani, Arash; Jacobs, Arthur M.; Conrad, Markus

    2013-01-01

    A growing body of literature in psychology, linguistics, and the neurosciences has paid increasing attention to the understanding of the relationships between phonological representations of words and their meaning: a phenomenon also known as phonological iconicity. In this article, we investigate how a text's intended emotional meaning, particularly in literature and poetry, may be reflected at the level of sublexical phonological salience and the use of foregrounded elements. To extract such elements from a given text, we developed a probabilistic model to predict the exceeding of a confidence interval for specific sublexical units concerning their frequency of occurrence within a given text contrasted with a reference linguistic corpus for the German language. Implementing this model in a computational application, we provide a text analysis tool which automatically delivers information about sublexical phonological salience allowing researchers, inter alia, to investigate effects of the sublexical emotional tone of texts based on current findings on phonological iconicity. PMID:24101907
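
    The confidence-interval test mentioned in this abstract can be illustrated with a minimal sketch. It assumes a binomial model with a normal approximation, which the abstract does not specify: a sublexical unit is flagged as salient when its count in a text exceeds the upper bound of an interval derived from its relative frequency in a reference corpus. The function name, the z value and the example numbers are hypothetical and are not taken from Emophon.

```python
import math


def exceeds_confidence_interval(count, text_length, ref_prob, z=1.96):
    """Return True if `count` lies above the upper bound of the
    normal-approximation interval for Binomial(text_length, ref_prob).

    ref_prob is the unit's relative frequency in the reference corpus
    (an assumed input; the real tool derives it from a German corpus).
    """
    expected = text_length * ref_prob
    std_dev = math.sqrt(text_length * ref_prob * (1.0 - ref_prob))
    upper_bound = expected + z * std_dev
    return count > upper_bound


# Hypothetical numbers: a unit with a 2% reference frequency
# observed 40 times, then 25 times, in a text of 1000 units.
print(exceeds_confidence_interval(40, 1000, 0.02))
print(exceeds_confidence_interval(25, 1000, 0.02))
```

    With these made-up inputs the expected count is 20, so only the first observation falls outside the interval.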

  9. Extracting salient sublexical units from written texts: ‘Emophon’, a corpus-based approach to phonological iconicity

    Directory of Open Access Journals (Sweden)

    Arash Aryani

    2013-10-01

    Full Text Available A growing body of literature in psychology, linguistics, and the neurosciences has paid increasing attention to the understanding of the relationships between phonological representations of words and their meaning: a phenomenon also known as phonological iconicity. In this article, we investigate how a text’s intended emotional meaning, particularly in literature and poetry, may be reflected at the level of sublexical phonological salience and the use of foregrounded elements. To extract such elements from a given text, we developed a probabilistic model to predict the exceeding of a confidence interval for specific sublexical units concerning their frequency of occurrence within a given text contrasted with a reference linguistic corpus for the German language. Implementing this model in a computational application, we provide a text analysis tool which automatically delivers information about sublexical phonological salience allowing researchers, inter alia, to investigate effects of the sublexical emotional tone of texts based on current findings on phonological iconicity.

  10. Large-scale automatic extraction of side effects associated with targeted anticancer drugs from full-text oncological articles.

    Science.gov (United States)

    Xu, Rong; Wang, QuanQiu

    2015-06-01

    Targeted anticancer drugs such as imatinib, trastuzumab and erlotinib dramatically improved treatment outcomes in cancer patients; however, these innovative agents are often associated with unexpected side effects. The pathophysiological mechanisms underlying these side effects are not well understood. The availability of a comprehensive knowledge base of side effects associated with targeted anticancer drugs has the potential to illuminate complex pathways underlying toxicities induced by these innovative drugs. While side effect association knowledge for targeted drugs exists in multiple heterogeneous data sources, published full-text oncological articles represent an important source of pivotal, investigational, and even failed trials in a variety of patient populations. In this study, we present an automatic process to extract targeted anticancer drug-associated side effects (drug-SE pairs) from a large number of high profile full-text oncological articles. We downloaded 13,855 full-text articles from the Journal of Clinical Oncology (JCO) published between 1983 and 2013. We developed text classification, relationship extraction, signaling filtering, and signal prioritization algorithms to extract drug-SE pairs from downloaded articles. We extracted a total of 26,264 drug-SE pairs with an average precision of 0.405, a recall of 0.899, and an F1 score of 0.465. We show that side effect knowledge from JCO articles is largely complementary to that from the US Food and Drug Administration (FDA) drug labels. Through integrative correlation analysis, we show that targeted drug-associated side effects positively correlate with their gene targets and disease indications. In conclusion, this unique database that we built from a large number of high-profile oncological articles could facilitate the development of computational models to understand toxic effects associated with targeted anticancer drugs. Copyright © 2015 Elsevier Inc. All rights reserved.
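
    The reported averages do not satisfy the usual identity F1 = 2PR/(P + R), which would give about 0.558 for P = 0.405 and R = 0.899. One possible explanation, assumed here and not stated in the abstract, is that each metric was averaged separately across drugs. The sketch below uses made-up per-drug values to show that the mean of per-item F1 scores can fall well below the F1 computed from mean precision and mean recall.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical per-drug (precision, recall) pairs; not from the paper.
per_drug = [(0.9, 0.1), (0.1, 0.9), (0.5, 0.5)]

avg_precision = mean(p for p, _ in per_drug)
avg_recall = mean(r for _, r in per_drug)

# Average of the per-drug F1 scores ("macro" F1).
avg_f1 = mean(f1_score(p, r) for p, r in per_drug)

# F1 computed from the averaged precision and recall.
f1_of_averages = f1_score(avg_precision, avg_recall)

print(avg_f1, f1_of_averages)
```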

  11. Feature Extraction for Facial Expression Recognition based on Hybrid Face Regions

    Directory of Open Access Journals (Sweden)

    LAJEVARDI, S.M.

    2009-10-01

    Full Text Available Facial expression recognition has numerous applications, including psychological research, improved human computer interaction, and sign language translation. A novel facial expression recognition system based on hybrid face regions (HFR) is investigated. The expression recognition system is fully automatic, and consists of the following modules: face detection, facial detection, feature extraction, optimal features selection, and classification. The features are extracted from both whole face image and face regions (eyes and mouth) using log Gabor filters. Then, the most discriminative features are selected based on mutual information criteria. The system can automatically recognize six expressions: anger, disgust, fear, happiness, sadness and surprise. The selected features are classified using the Naive Bayesian (NB) classifier. The proposed method has been extensively assessed using Cohn-Kanade database and JAFFE database. The experiments have highlighted the efficiency of the proposed HFR method in enhancing the classification rate.

  12. Extraction of Accurate Stomach Contour Using Approximated Stomach Region

    OpenAIRE

    小林, 富士男; 尾崎, 誠; コバヤシ, フジオ; オザキ, マコト; Fujio, KOBAYASHI; Makoto, OZAKI

    1999-01-01

    In this paper, a method for stomach contour extraction is proposed. The stomach contour is automatically and accurately extracted using the characteristics of the X-ray image. The approximate stomach region is obtained from a combined image constructed from a binarized version of the original image and its differential image. The stomach contour is then extracted using the brightness of the differential image and the shape of the approximated stomach region.

  13. Using Medical Text Extraction, Reasoning and Mapping System (MTERMS) to process medication information in outpatient clinical notes.

    Science.gov (United States)

    Zhou, Li; Plasek, Joseph M; Mahoney, Lisa M; Karipineni, Neelima; Chang, Frank; Yan, Xuemin; Chang, Fenny; Dimaggio, Dana; Goldman, Debora S; Rocha, Roberto A

    2011-01-01

    Clinical information is often coded using different terminologies, and therefore is not interoperable. Our goal is to develop a general natural language processing (NLP) system, called Medical Text Extraction, Reasoning and Mapping System (MTERMS), which encodes clinical text using different terminologies and simultaneously establishes dynamic mappings between them. MTERMS applies a modular, pipeline approach flowing from a preprocessor, semantic tagger, terminology mapper, context analyzer, and parser to structure inputted clinical notes. Evaluators manually reviewed 30 free-text and 10 structured outpatient clinical notes compared to MTERMS output. MTERMS achieved an overall F-measure of 90.6 and 94.0 for free-text and structured notes respectively for medication and temporal information. The local medication terminology had 83.0% coverage compared to RxNorm's 98.0% coverage for free-text notes. 61.6% of mappings between the terminologies are exact matches. Capture of duration was significantly improved (91.7% vs. 52.5%) from systems in the third i2b2 challenge.

  14. Regional Consumer Magazines and the Ideal White Reader: Constructing and Retaining Geography as Text.

    Science.gov (United States)

    Fry, Katherine

    1994-01-01

    Examines representations of nature and culture in the consumer magazines "Midwest Living," "Southern Living," and "Sunset." Finds distinct patterns in each that construct the Midwest, South, and West as cultural/geographic texts. Finds that the magazines' emphasis on advertising and tourism both obscures and…

  15. Applying a text mining framework to the extraction of numerical parameters from scientific literature in the biotechnology domain

    Directory of Open Access Journals (Sweden)

    André SANTOS

    2012-07-01

    Full Text Available Scientific publications are the main vehicle to disseminate information in the field of biotechnology for wastewater treatment. Indeed, the new research paradigms and the application of high-throughput technologies have increased the rate of publication considerably. The problem is that manual curation becomes harder, prone-to-errors and time-consuming, leading to a probable loss of information and inefficient knowledge acquisition. As a result, research outputs are hardly reaching engineers, hampering the calibration of mathematical models used to optimize the stability and performance of biotechnological systems. In this context, we have developed a data curation workflow, based on text mining techniques, to extract numerical parameters from scientific literature, and applied it to the biotechnology domain. A workflow was built to process wastewater-related articles with the main goal of identifying physico-chemical parameters mentioned in the text. This work describes the implementation of the workflow, identifies achievements and current limitations in the overall process, and presents the results obtained for a corpus of 50 full-text documents.
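
    As a rough illustration of extracting numerical parameters from text, the sketch below pairs a number with a unit using a regular expression. It is a minimal stand-in, not the workflow described in the abstract: the unit whitelist, the function name and the example sentence are all hypothetical.

```python
import re

# Hypothetical whitelist of units; a real workflow would need a far
# larger vocabulary and would also link each value to a parameter name.
UNITS = r"(?:mg/L|g/L|°C|h|d)"

# A number (integer or decimal), optional whitespace, then a unit that
# is not immediately followed by another letter.
PATTERN = re.compile(
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>" + UNITS + r")(?![A-Za-z])"
)


def extract_parameters(sentence):
    """Return (value, unit) pairs found in the sentence, in order."""
    return [
        (float(m.group("value")), m.group("unit"))
        for m in PATTERN.finditer(sentence)
    ]


# Made-up sentence in the style of a wastewater treatment article.
text = "The reactor was operated at 35 °C with a COD of 2.5 g/L and an HRT of 12 h."
print(extract_parameters(text))
```

    A pattern like this finds values and units only; associating each value with the parameter it describes is the harder part and is left out here.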

  16. Applying a text mining framework to the extraction of numerical parameters from scientific literature in the biotechnology domain

    Directory of Open Access Journals (Sweden)

    Anália LOURENÇO

    2013-07-01

    Full Text Available Scientific publications are the main vehicle to disseminate information in the field of biotechnology for wastewater treatment. Indeed, the new research paradigms and the application of high-throughput technologies have increased the rate of publication considerably. The problem is that manual curation becomes harder, prone-to-errors and time-consuming, leading to a probable loss of information and inefficient knowledge acquisition. As a result, research outputs are hardly reaching engineers, hampering the calibration of mathematical models used to optimize the stability and performance of biotechnological systems. In this context, we have developed a data curation workflow, based on text mining techniques, to extract numerical parameters from scientific literature, and applied it to the biotechnology domain. A workflow was built to process wastewater-related articles with the main goal of identifying physico-chemical parameters mentioned in the text. This work describes the implementation of the workflow, identifies achievements and current limitations in the overall process, and presents the results obtained for a corpus of 50 full-text documents.

  17. Regional Urban Extent Extraction Using Multi-Sensor Data and One-Class Classification

    Directory of Open Access Journals (Sweden)

    Xiya Zhang

    2015-06-01

    Full Text Available Stable night-time light data from the Defense Meteorological Satellite Program (DMSP) Operational Line-scan System (OLS) provide a unique proxy for anthropogenic development. This paper presents a regional urban extent extraction method using a one-class classifier and combinations of DMSP/OLS stable night-time light (NTL) data, MODIS normalized difference vegetation index (NDVI) data, and land surface temperature (LST) data. We first analyzed how well MODIS NDVI and LST data quantify the properties of urban areas. Considering that urban area is the only class of interest, we applied the one-class support vector machine (OCSVM) to classify different combinations of the three datasets. We evaluated the effectiveness of the proposed method and compared it with the locally optimized threshold method in regional urban extent mapping in China. The experimental results demonstrate that DMSP/OLS NTL data, MODIS NDVI and LST data provide different but complementary information sources to quantify the urban extent at a regional scale. The results also indicate that the OCSVM classification of the combination of all three datasets generally outperformed the locally optimized threshold method. The proposed method effectively and efficiently extracted the urban extent at a regional scale, and is applicable to other study areas.

  18. PESCADOR, a web-based tool to assist text-mining of biointeractions extracted from PubMed queries

    Directory of Open Access Journals (Sweden)

    Barbosa-Silva Adriano

    2011-11-01

    Full Text Available Background: Biological function is greatly dependent on the interactions of proteins with other proteins and genes. Abstracts from the biomedical literature stored in the NCBI's PubMed database can be used for the derivation of interactions between genes and proteins by identifying the co-occurrences of their terms. Often, the number of interactions obtained through such an approach is large and may mix processes occurring in different contexts. Current tools do not allow studying these data with a focus on concepts of relevance to a user, for example, interactions related to a disease or to a biological mechanism such as protein aggregation. Results: To help the concept-oriented exploration of such data we developed PESCADOR, a web tool that extracts a network of interactions from a set of PubMed abstracts given by a user, and allows filtering the interaction network according to user-defined concepts. We illustrate its use in exploring protein aggregation in neurodegenerative disease and in the expansion of pathways associated with colon cancer. Conclusions: PESCADOR is a platform-independent web resource available at: http://cbdm.mdc-berlin.de/tools/pescador/
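
    The co-occurrence idea in this abstract can be sketched in a few lines. The example below is an illustration under simple assumptions (whole-word, case-insensitive matching within one abstract, and a plain substring test for the user-defined concept); it is not PESCADOR's implementation, and the abstracts, term list and function names are invented.

```python
import re
from collections import Counter
from itertools import combinations


def terms_in(abstract, terms):
    """Return the sorted terms that occur as whole words in the abstract."""
    return sorted(
        t for t in terms
        if re.search(r"\b" + re.escape(t) + r"\b", abstract, re.IGNORECASE)
    )


def cooccurrence_network(abstracts, terms):
    """Count, for each pair of terms, the abstracts mentioning both."""
    edges = Counter()
    for abstract in abstracts:
        for pair in combinations(terms_in(abstract, terms), 2):
            edges[pair] += 1
    return edges


def filter_by_concept(abstracts, concept):
    """Keep only abstracts that mention the user-defined concept."""
    return [a for a in abstracts if concept.lower() in a.lower()]


# Invented example data, for illustration only.
abstracts = [
    "APP interacts with PSEN1 during protein aggregation.",
    "PSEN1 and APP are linked in neurodegeneration.",
    "TP53 regulates MDM2 in colon cancer.",
]
terms = ["APP", "PSEN1", "TP53", "MDM2"]

network = cooccurrence_network(abstracts, terms)
focused = cooccurrence_network(filter_by_concept(abstracts, "aggregation"), terms)
print(network)
print(focused)
```

    Filtering the abstracts before counting keeps only the edges supported by text that mentions the concept, which is one simple way to separate interactions from different contexts.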

  19. Application of text information extraction system for real-time cancer case identification in an integrated healthcare organization

    Directory of Open Access Journals (Sweden)

    Fagen Xie

    2017-01-01

    Full Text Available Background: Surgical pathology reports (SPR) contain rich clinical diagnosis information. The text information extraction system (TIES) is an end-to-end application leveraging natural language processing technologies and focused on the processing of pathology and/or radiology reports. Methods: We deployed the TIES system and integrated SPRs into the TIES system on a daily basis at Kaiser Permanente Southern California. The breast cancer cases diagnosed in December 2013 from the Cancer Registry (CANREG) were used to validate the performance of the TIES system. The National Cancer Institute Metathesaurus (NCIM) concept terms and codes to describe breast cancer were identified through the Unified Medical Language System Terminology Service (UTS) application. The identified NCIM codes were used to search for the coded SPRs in the back-end datastore directly. The identified cases were then compared with the breast cancer patients pulled from CANREG. Results: A total of 437 breast cancer concept terms and 14 combinations of “breast” and “cancer” terms were identified from the UTS application. A total of 249 breast cancer cases diagnosed in December 2013 were pulled from CANREG. Out of these 249 cases, 241 were successfully identified by the TIES system from a total of 457 reports. The TIES system also identified an additional 277 cases that were not part of the validation sample. Out of the 277 cases, 11% were determined as highly likely to be cases after manual examinations, and 86% were in CANREG but were diagnosed in months other than December of 2013. Conclusions: The study demonstrated that the TIES system can effectively identify potential breast cancer cases in our care setting. Identified potential cases can be easily confirmed by reviewing the corresponding annotated reports through the front-end visualization interface. The TIES system is a great tool for identifying various potential cancer cases in a timely manner and on a regular basis
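
    The validation figures in this abstract can be turned into rates with simple arithmetic. The sketch below uses only the counts quoted above; the variable names are illustrative, and the case counts derived from the 11% and 86% figures are rounded approximations, not numbers reported by the authors.

```python
registry_cases = 249      # December 2013 breast cancer cases in CANREG
identified_by_ties = 241  # of those, cases that TIES also identified
additional_cases = 277    # TIES cases outside the validation sample

# Share of registry cases recovered by TIES.
recovered_share = identified_by_ties / registry_cases

# Approximate counts behind the reported percentages (rounded).
likely_true_cases = round(0.11 * additional_cases)
other_month_cases = round(0.86 * additional_cases)

print(f"recovered: {recovered_share:.1%}")
print(f"highly likely cases: about {likely_true_cases}")
print(f"diagnosed in other months: about {other_month_cases}")
```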

  20. Shadow Detection Based on Regions of Light Sources for Object Extraction in Nighttime Video

    Directory of Open Access Journals (Sweden)

    Gil-beom Lee

    2017-03-01

    Full Text Available Intelligent video surveillance systems detect pre-configured surveillance events through background modeling, foreground and object extraction, object tracking, and event detection. Shadow regions inside video frames sometimes appear as foreground objects, interfere with ensuing processes, and finally degrade the event detection performance of the systems. Conventional studies have mostly used intensity, color, texture, and geometric information to perform shadow detection in daytime video, but these methods lack the capability of removing shadows in nighttime video. In this paper, a novel shadow detection algorithm for nighttime video is proposed; this algorithm partitions each foreground object based on the object’s vertical histogram and screens out shadow objects by validating their orientations heading toward regions of light sources. From the experimental results, it can be seen that the proposed algorithm shows more than 93.8% shadow removal and 89.9% object extraction rates for nighttime video sequences, and the algorithm outperforms conventional shadow removal algorithms designed for daytime videos.
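
    The partitioning step can be pictured with a small sketch. It assumes that "vertical histogram" means the number of foreground pixels in each column of an object's binary mask, and that an object is cut at the interior column with the fewest foreground pixels; both are assumptions, since the abstract gives no details, and the later check of orientation toward light sources is omitted. The function names and the example mask are hypothetical.

```python
def vertical_histogram(mask):
    """Count foreground pixels (1s) in each column of a binary mask."""
    return [sum(column) for column in zip(*mask)]


def split_at_valley(mask):
    """Split the mask at the interior column with the lowest count.

    Returns the cut index and the left and right sub-masks.
    """
    hist = vertical_histogram(mask)
    interior = range(1, len(hist) - 1)
    cut = min(interior, key=lambda x: hist[x])
    left = [row[:cut] for row in mask]
    right = [row[cut:] for row in mask]
    return cut, left, right


# Made-up mask: a tall region on the left joined to a lower region
# on the right by a thin strip along the bottom row.
mask = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 1, 1, 1, 1],
]

cut, left, right = split_at_valley(mask)
print(vertical_histogram(mask), cut)
```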

  1. Chemical Composition and Biological Activity of Extracts Obtained by Supercritical Extraction and Ethanolic Extraction of Brown, Green and Red Propolis Derived from Different Geographic Regions in Brazil.

    Science.gov (United States)

    Machado, Bruna Aparecida Souza; Silva, Rejane Pina Dantas; Barreto, Gabriele de Abreu; Costa, Samantha Serra; Silva, Danielle Figuerêdo da; Brandão, Hugo Neves; Rocha, José Luiz Carneiro da; Dellagostin, Odir Antônio; Henriques, João Antônio Pegas; Umsza-Guez, Marcelo Andres; Padilha, Francine Ferreira

    2016-01-01

    The variations in the chemical composition, and consequently, on the biological activity of the propolis, are associated with its type and geographic origin. Considering this fact, this study evaluated propolis extracts obtained by supercritical extraction (SCO2) and ethanolic extraction (EtOH), in eight samples of different types of propolis (red, green and brown), collected from different regions in Brazil. The content of phenolic compounds, flavonoids, in vitro antioxidant activity (DPPH and ABTS), Artepillin C, p-coumaric acid and antimicrobial activity against two bacteria were determined for all extracts. For the EtOH extracts, the anti-proliferative activity regarding the cell lines of B16F10, were also evaluated. Amongst the samples evaluated, the red propolis from the Brazilian Northeast (states of Sergipe and Alagoas) showed the higher biological potential, as well as the larger content of antioxidant compounds. The best results were shown for the extracts obtained through the conventional extraction method (EtOH). However, the highest concentrations of Artepillin C and p-coumaric acid were identified in the extracts from SCO2, indicating a higher selectivity for the extraction of these compounds. It was verified that the composition and biological activity of the Brazilian propolis vary significantly, depending on the type of sample and geographical area of collection.

  2. Chemical Composition and Biological Activity of Extracts Obtained by Supercritical Extraction and Ethanolic Extraction of Brown, Green and Red Propolis Derived from Different Geographic Regions in Brazil

    Science.gov (United States)

    Machado, Bruna Aparecida Souza; Silva, Rejane Pina Dantas; Barreto, Gabriele de Abreu; Costa, Samantha Serra; da Silva, Danielle Figuerêdo; Brandão, Hugo Neves; da Rocha, José Luiz Carneiro; Dellagostin, Odir Antônio; Henriques, João Antônio Pegas; Umsza-Guez, Marcelo Andres; Padilha, Francine Ferreira

    2016-01-01

    The variations in the chemical composition, and consequently, on the biological activity of the propolis, are associated with its type and geographic origin. Considering this fact, this study evaluated propolis extracts obtained by supercritical extraction (SCO2) and ethanolic extraction (EtOH), in eight samples of different types of propolis (red, green and brown), collected from different regions in Brazil. The content of phenolic compounds, flavonoids, in vitro antioxidant activity (DPPH and ABTS), Artepillin C, p-coumaric acid and antimicrobial activity against two bacteria were determined for all extracts. For the EtOH extracts, the anti-proliferative activity regarding the cell lines of B16F10, were also evaluated. Amongst the samples evaluated, the red propolis from the Brazilian Northeast (states of Sergipe and Alagoas) showed the higher biological potential, as well as the larger content of antioxidant compounds. The best results were shown for the extracts obtained through the conventional extraction method (EtOH). However, the highest concentrations of Artepillin C and p-coumaric acid were identified in the extracts from SCO2, indicating a higher selectivity for the extraction of these compounds. It was verified that the composition and biological activity of the Brazilian propolis vary significantly, depending on the type of sample and geographical area of collection. PMID:26745799

  3. GMDH-GA Hybrid Model Extracting Exon Region from DNA Sequences

    OpenAIRE

    Ohta, Kouji; Yoshihara, Ikuo; Yamamori, Kunihito; Yasunaga, Moritoshi

    2004-01-01

    Abstract: A model building method based on Group Method of Data Handling (GMDH) optimized by GA is developed for extracting exon regions. GMDH, that is originally a method to construct higher order polynomial model, is extended to constructing higher order logical model. The model built by proposed method is compared with Genetic Programming (GP)-based model as to the extraction rate of best, worst and average. The proposed method is superior to GP as to extraction rate of al...

  4. Selection/extraction of spectral regions for autofluorescence spectra measured in the oral cavity

    NARCIS (Netherlands)

    Skurichina, M; Paclik, P; Duin, RPW; de Veld, D; Sterenborg, HJCM; Witjes, MJH; Roodenburg, JLN; Fred, A; Caelli, T; Duin, RPW; Campilho, A; DeRidder, D

    2004-01-01

    Recently a number of successful algorithms to select/extract discriminative spectral regions was introduced. These methods may be more beneficial than the standard feature selection/extraction methods for spectral classification. In this paper, on the example of autofluorescence spectra measured in

  5. In situ genomic DNA extraction for PCR analysis of regions of interest in four plant species and one filamentous fungi

    Directory of Open Access Journals (Sweden)

    Luis E. Rojas

    2014-07-01

    Full Text Available The extraction methods of genomic DNA are usually laborious and hazardous to human health and the environment because of the use of organic solvents (chloroform and phenol). In this work a protocol for in situ extraction of genomic DNA by alkaline lysis is validated. It was used in order to amplify regions of DNA in four species of plants and fungi by polymerase chain reaction (PCR). From plant material of Saccharum officinarum L., Carica papaya L. and Digitalis purpurea L. it was possible to amplify different regions of the genome through PCR. Furthermore, it was possible to amplify a fragment of the avr-4 gene from DNA purified from lyophilized mycelium of Mycosphaerella fijiensis. Additionally, it was possible to amplify a region of the ap24 transgene inserted into the genome of banana cv. 'Grande naine' (Musa AAA). Key words: alkaline lysis, Carica papaya L., Digitalis purpurea L., Musa, Saccharum officinarum L.

  6. EXTRACT

    DEFF Research Database (Denmark)

    Pafilis, Evangelos; Buttigieg, Pier Luigi; Ferrell, Barbra

    2016-01-01

    The microbial and molecular ecology research communities have made substantial progress on developing standards for annotating samples with environment metadata. However, manual sample annotation is a highly labor-intensive process and requires familiarity with the terminologies used. We have the... and text-mining-assisted curation revealed that EXTRACT speeds up annotation by 15-25% and helps curators to detect terms that would otherwise have been missed. Database URL: https://extract.hcmr.gr/ ..., organism, tissue and disease terms. The evaluators in the BioCreative V Interactive Annotation Task found the system to be intuitive, useful, well documented and sufficiently accurate to be helpful in spotting relevant text passages and extracting organism and environment terms. Comparison of fully manual...

  7. Combining position weight matrices and document-term matrix for efficient extraction of associations of methylated genes and diseases from free text.

    Directory of Open Access Journals (Sweden)

    Arwa Bin Raies

    Full Text Available BACKGROUND: In a number of diseases, certain genes are reported to be strongly methylated and thus can serve as diagnostic markers in many cases. Scientific literature in digital form is an important source of information about methylated genes implicated in particular diseases. The large volume of the electronic text makes it difficult and impractical to search for this information manually. METHODOLOGY: We developed a novel text mining methodology based on a new concept of position weight matrices (PWMs) for text representation and feature generation. We applied PWMs in conjunction with the document-term matrix to extract with high accuracy associations between methylated genes and diseases from free text. The performance results are based on large manually-classified data. Additionally, we developed a web-tool, DEMGD, which automates extraction of these associations from free text. DEMGD presents the extracted associations in summary tables and full reports in addition to evidence tagging of text with respect to genes, diseases and methylation words. The methodology we developed in this study can be applied to similar association extraction problems from free text. CONCLUSION: The new methodology developed in this study allows for efficient identification of associations between concepts. Our method applied to methylated genes in different diseases is implemented as a Web-tool, DEMGD, which is freely available at http://www.cbrc.kaust.edu.sa/demgd/. The data is available for online browsing and download.
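
    The document-term matrix mentioned in the abstract can be illustrated with a minimal sketch. It covers only the generic bag-of-words counting step; the PWM-based text representation is specific to the paper and is not reproduced here. The tokenizer, the function name and the two example sentences are assumptions made for illustration.

```python
import re
from collections import Counter


def document_term_matrix(documents):
    """Return (vocabulary, rows), where rows[i][j] counts term j in document i."""
    # Assumed tokenizer: lowercase alphanumeric runs only.
    tokenized = [re.findall(r"[a-z0-9]+", doc.lower()) for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[term] for term in vocabulary])
    return vocabulary, rows


# Hypothetical sentences, not taken from the paper's corpus.
docs = [
    "geneA is hypermethylated in tumours",
    "geneB is not methylated",
]
vocabulary, rows = document_term_matrix(docs)
for doc, row in zip(docs, rows):
    print(doc, "->", dict(zip(vocabulary, row)))
```

    Each row is a fixed-length count vector, which is the form a downstream classifier would typically consume.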

  8. Groundwater methane in a potential coal seam gas extraction region

    Directory of Open Access Journals (Sweden)

    Marnie L. Atkins

    2015-09-01

    New hydrological insights for the region: Methane was found in all geological units ranging between 0.26 and 4427 μg L−1 (median 10.68 μg L−1). Median methane concentrations were highest in chloride-type groundwater (13.26 μg L−1, n = 58) while bicarbonate-type groundwater had lower concentrations (3.71 μg L−1). Groundwater from alluvial sediments had significantly higher median methane concentrations (91.46 μg L−1) than groundwater from both the basalt aquifers (0.7 μg L−1) and bedrock aquifers (4.63 μg L−1), indicating geology was a major driver of methane distribution. Methane carbon stable isotope ratios ranged from –90.9‰ to –29.5‰, suggesting a biogenic origin with some methane oxidation. No significant correlations were observed between methane concentrations and redox indicators (nitrate, manganese, iron and sulphate) except between iron and methane in the Lismore Basalt (r² = 0.66, p < 0.001), implying redox conditions were not the main predictor of methane distribution.

  9. Emotion Discrimination using spatially Compact Regions of Interest extracted from Imaging EEG Activity

    Directory of Open Access Journals (Sweden)

    Jorge Ivan Padilla-Buritica

    2016-07-01

    Full Text Available Lately, research on computational models of emotion has been getting much attention due to their potential for understanding the mechanisms of emotions and their promising broad range of applications that potentially bridge the gap between human and machine interactions. We propose a new method for emotion classification that relies on features extracted from those active brain areas that are most likely related to emotions. To this end, we carry out the selection of spatially compact regions of interest that are computed using the brain neural activity reconstructed from electroencephalography data. Throughout this study, we consider three representative feature extraction methods widely applied to emotion detection tasks: power spectral density, wavelet, and Hjorth parameters. Further feature selection is carried out using principal component analysis. For validation purposes, these features are fed to a support vector machine classifier that is trained under the leave-one-out cross-validation strategy. Results obtained on real affective data show that the proposed training method, combined with the enhanced spatial resolution provided by the source estimation, improves the discrimination accuracy for most of the considered emotions, namely dominance, valence, and liking.
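
    Of the three feature families named above, the Hjorth parameters have a compact standard definition: activity is the signal variance, mobility is the square root of the ratio of the variance of the first difference to the signal variance, and complexity is the mobility of the first difference divided by the mobility of the signal. The sketch below computes them for a single channel using only the standard library. The sampling rate and the synthetic 10 Hz sinusoid are assumptions for illustration and are not taken from the study's data.

```python
import math


def variance(values):
    """Population variance of a sequence of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def first_difference(values):
    """Discrete approximation of the time derivative."""
    return [b - a for a, b in zip(values, values[1:])]


def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) for a 1-D signal."""
    d1 = first_difference(signal)
    d2 = first_difference(d1)
    activity = variance(signal)
    mobility = math.sqrt(variance(d1) / activity)
    # Complexity compares the mobility of the derivative with that of the signal.
    complexity = math.sqrt(variance(d2) / variance(d1)) / mobility
    return activity, mobility, complexity


# Hypothetical test signal: one second of a 10 Hz sinusoid sampled at 128 Hz.
fs = 128
signal = [math.sin(2 * math.pi * 10 * n / fs) for n in range(fs)]
activity, mobility, complexity = hjorth_parameters(signal)
print(f"activity={activity:.3f} mobility={mobility:.3f} complexity={complexity:.3f}")
```

    For a pure sinusoid the complexity is close to 1, which gives a quick sanity check before applying the function to real EEG channels.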

  10. Combining position weight matrices and document-term matrix for efficient extraction of associations of methylated genes and diseases from free text.

    Science.gov (United States)

    Bin Raies, Arwa; Mansour, Hicham; Incitti, Roberto; Bajic, Vladimir B

    2013-01-01

    In a number of diseases, certain genes are reported to be strongly methylated and thus can serve as diagnostic markers in many cases. Scientific literature in digital form is an important source of information about methylated genes implicated in particular diseases. The large volume of the electronic text makes it difficult and impractical to search for this information manually. We developed a novel text mining methodology based on a new concept of position weight matrices (PWMs) for text representation and feature generation. We applied PWMs in conjunction with the document-term matrix to extract with high accuracy associations between methylated genes and diseases from free text. The performance results are based on large manually-classified data. Additionally, we developed a web-tool, DEMGD, which automates extraction of these associations from free text. DEMGD presents the extracted associations in summary tables and full reports in addition to evidence tagging of text with respect to genes, diseases and methylation words. The methodology we developed in this study can be applied to similar association extraction problems from free text. The new methodology developed in this study allows for efficient identification of associations between concepts. Our method applied to methylated genes in different diseases is implemented as a Web-tool, DEMGD, which is freely available at http://www.cbrc.kaust.edu.sa/demgd/. The data is available for online browsing and download.

  11. Combining Position Weight Matrices and Document-Term Matrix for Efficient Extraction of Associations of Methylated Genes and Diseases from Free Text

    KAUST Repository

    Bin Raies, Arwa

    2013-10-16

    Background: In a number of diseases, certain genes are reported to be strongly methylated and thus can serve as diagnostic markers in many cases. Scientific literature in digital form is an important source of information about methylated genes implicated in particular diseases. The large volume of the electronic text makes it difficult and impractical to search for this information manually. Methodology: We developed a novel text mining methodology based on a new concept of position weight matrices (PWMs) for text representation and feature generation. We applied PWMs in conjunction with the document-term matrix to extract with high accuracy associations between methylated genes and diseases from free text. The performance results are based on large manually-classified data. Additionally, we developed a web-tool, DEMGD, which automates extraction of these associations from free text. DEMGD presents the extracted associations in summary tables and full reports in addition to evidence tagging of text with respect to genes, diseases and methylation words. The methodology we developed in this study can be applied to similar association extraction problems from free text. Conclusion: The new methodology developed in this study allows for efficient identification of associations between concepts. Our method applied to methylated genes in different diseases is implemented as a Web-tool, DEMGD, which is freely available at http://www.cbrc.kaust.edu.sa/demgd/. The data is available for online browsing and download. © 2013 Bin Raies et al.

  12. Larvicidal, antimicrobial and brine shrimp activities of extracts from Cissampelos mucronata and Tephrosia villosa from coast region, Tanzania

    Directory of Open Access Journals (Sweden)

    Erasto Paul

    2011-04-01

    Full Text Available Abstract Background: The leaves and roots of Cissampelos mucronata A. Rich (Menispermaceae) are widely used in the tropics and subtropics to manage various ailments such as gastro-intestinal complaints, menstrual problems, venereal diseases and malaria. In the Coast region, Tanzania, roots are used to treat wounds due to extraction of jigger. Leaves of Tephrosia villosa (L) Pers (Leguminosae) are reported to be used in the treatment of diabetes mellitus in India. In this study, extracts from the roots and aerial parts of C. mucronata and extracts from leaves, fruits, twigs and roots of T. villosa were evaluated for larvicidal activity, brine shrimp toxicity and antimicrobial activity. Methods: Powdered materials from C. mucronata were extracted sequentially by dichloromethane followed by ethanol while materials from T. villosa were extracted by ethanol only. The extracts obtained were evaluated for larvicidal activity using Culex quinquefasciatus Say larvae, cytotoxicity using brine shrimp larvae and antimicrobial activity using bacteria and fungi. Results: Extracts from aerial parts of C. mucronata exhibited antibacterial activity against Staphylococcus aureus, Escherichia coli, Pseudomonas aeruginosa, Salmonella typhi, Vibrio cholerae, Bacillus anthracis and Streptococcus faecalis, and antifungal activity against Candida albicans and Cryptococcus neoformans. They exhibited very low toxicity to brine shrimps and had no larvicidal activity. The root extracts exhibited good larvicidal activity but weak antimicrobial activity. The root dichloromethane extracts from C. mucronata were found to be more toxic, with an LC50 value of 59.608 μg/mL, while ethanolic extracts from root were not toxic, with LC50>100 μg/mL. Ethanol extracts from fruits and roots of T. villosa were found to be very toxic, with LC50 values of 9.690 μg/mL and 4.511 μg/mL, respectively, while ethanol extracts from leaves and twigs of T. villosa were found to be non-toxic (LC50>100

  13. Salient Region Detection by Fusing Foreground and Background Cues Extracted from Single Image

    Directory of Open Access Journals (Sweden)

    Qiangqiang Zhou

    2016-01-01

    Full Text Available Saliency detection is an important preprocessing step in many application fields such as computer vision, robotics, and graphics to reduce computational cost by focusing on significant positions and neglecting the nonsignificant in the scene. Different from most previous methods which mainly utilize the contrast of low-level features, various feature maps are fused in a simple linear weighting form. In this paper, we propose a novel salient object detection algorithm which takes both background and foreground cues into consideration and integrates a bottom-up coarse salient region extraction and a top-down background measure via boundary label propagation into a unified optimization framework to acquire a refined saliency detection result. The coarse saliency map is itself fused from three components: the first is a local contrast map, which is in closer accordance with the psychological law; the second is a global frequency prior map; and the third is a global color distribution map. To form the background map, we first construct an affinity matrix and select some nodes which lie on the border as labels to represent the background, and then carry out a propagation to generate the regional background map. The proposed model has been evaluated on four datasets. As demonstrated in the experiments, our proposed method outperforms most existing saliency detection models with a robust performance.

  14. The 2 mrad crossing-angle ILC interaction region and extraction line

    CERN Document Server

    Appleby, Robert; Bambade, Philip; Dadoun, Olivier; Parker, Brett; Keller, Lewis; Moffeit, Kenneth C; Nosochkov, Yuri; Seryi, Andrei; Spencer, Cherrill M; Carter, John; Napoly, Olivier

    2006-01-01

    A complete optics design for the 2mrad crossing angle interaction region and extraction line was presented at Snowmass 2005. Since this time, the design task force has been working on developing and improving the performance of the extraction line. The work has focused on optimising the final doublet parameters and on reducing the power losses resulting from the disrupted beam transport. In this paper, the most recent status of the 2mrad layout and the corresponding performance are presented.

  15. An Enhanced Text-Mining Framework for Extracting Disaster Relevant Data through Social Media and Remote Sensing Data Fusion

    Science.gov (United States)

    Scheele, C. J.; Huang, Q.

    2016-12-01

    In the past decade, the rise in social media has led to the development of a vast number of social media services and applications. Disaster management represents one of such applications leveraging massive data generated for event detection, response, and recovery. In order to find disaster relevant social media data, current approaches utilize natural language processing (NLP) methods based on keywords, or machine learning algorithms relying on text only. However, these approaches cannot be perfectly accurate due to the variability and uncertainty in language used on social media. To improve current methods, the enhanced text-mining framework is proposed to incorporate location information from social media and authoritative remote sensing datasets for detecting disaster relevant social media posts, which are determined by assessing the textual content using common text mining methods and how the post relates spatiotemporally to the disaster event. To assess the framework, geo-tagged Tweets were collected for three different spatial and temporal disaster events: hurricane, flood, and tornado. Remote sensing data and products for each event were then collected using RealEarth™. Both Naive Bayes and Logistic Regression classifiers were used to compare the accuracy within the enhanced text-mining framework. Finally, the accuracies from the enhanced text-mining framework were compared to the current text-only methods for each of the case study disaster events. The results from this study address the need for more authoritative data when using social media in disaster management applications.

  16. Extraction of Pluvial Flood Relevant Volunteered Geographic Information (VGI) by Deep Learning from User Generated Texts and Photos

    Directory of Open Access Journals (Sweden)

    Yu Feng

    2018-01-01

    Full Text Available In recent years, pluvial floods caused by extreme rainfall events have occurred frequently. Especially in urban areas, they lead to serious damages and endanger the citizens’ safety. Therefore, real-time information about such events is desirable. With the increasing popularity of social media platforms, such as Twitter or Instagram, information provided by voluntary users becomes a valuable source for emergency response. Many applications have been built for disaster detection and flood mapping using crowdsourcing. Most of the applications so far have merely used keyword filtering or classical language processing methods to identify disaster relevant documents based on user generated texts. As the reliability of social media information is often under criticism, the precision of information retrieval plays a significant role for further analyses. Thus, in this paper, high quality eyewitnesses of rainfall and flooding events are retrieved from social media by applying deep learning approaches on user generated texts and photos. Subsequently, events are detected through spatiotemporal clustering and visualized together with these high quality eyewitnesses in a web map application. Analyses and case studies are conducted during flooding events in Paris, London and Berlin.

  17. ADEPt, a semantically-enriched pipeline for extracting adverse drug events from free-text electronic health records.

    Directory of Open Access Journals (Sweden)

    Ehtesham Iqbal

    Full Text Available Adverse drug events (ADEs) are unintended responses to medical treatment. They can greatly affect a patient's quality of life and present a substantial burden on healthcare. Although Electronic health records (EHRs) document a wealth of information relating to ADEs, they are frequently stored in the unstructured or semi-structured free-text narrative requiring Natural Language Processing (NLP) techniques to mine the relevant information. Here we present a rule-based ADE detection and classification pipeline built and tested on a large Psychiatric corpus comprising 264k patients using the de-identified EHRs of four UK-based psychiatric hospitals. The pipeline uses characteristics specific to Psychiatric EHRs to guide the annotation process, and distinguishes: (a) the temporal value associated with the ADE mention (whether it is historical or present), (b) the categorical value of the ADE (whether it is assertive, hypothetical, retrospective or a general discussion) and (c) the implicit contextual value where the status of the ADE is deduced from surrounding indicators, rather than explicitly stated. We manually created the rulebase in collaboration with clinicians and pharmacists by studying ADE mentions in various types of clinical notes. We evaluated the open-source Adverse Drug Event annotation Pipeline (ADEPt) using 19 ADEs specific to antipsychotics and antidepressants medication. The ADEs chosen vary in severity, regularity and persistence. The average F-measure and accuracy achieved by our tool across all tested ADEs were 0.83 and 0.83 respectively. In addition to annotation power, the ADEPt pipeline presents an improvement to the state of the art context-discerning algorithm, ConText.

  18. Application of Text Mining to Extract Hotel Attributes and Construct Perceptual Map of Five Star Hotels from Online Review: Study of Jakarta and Singapore Five-Star Hotels

    Directory of Open Access Journals (Sweden)

    Arga Hananto

    2015-12-01

    Full Text Available The use of post-purchase online consumer reviews in hotel attribute studies is still scarce in the literature. Arguably, post-purchase online review data would yield more accurate attributes that consumers actually consider in their purchase decision. This study aims to extract attributes from two samples of five-star hotel reviews (Jakarta and Singapore) with text mining methodology. In addition, this study also aims to describe the positioning of five-star hotels in Jakarta and Singapore based on the extracted attributes using Correspondence Analysis. This study finds that reviewers of five-star hotels in both cities mentioned similar attributes such as service, staff, club, location, pool and food. Attributes derived from text mining seem to be viable input to build a fairly accurate positioning map of hotels. This study has demonstrated the viability of online reviews as a source of data for hotel attribute and positioning studies.

  19. A Novel Frequency Domain Iterative Image Registration Algorithm Based on Local Region Extraction

    Directory of Open Access Journals (Sweden)

    Xiujie Qu

    2015-01-01

    Full Text Available Because of differences in imaging time and in the relative position of sensor and target, the imaging system introduces scaling, rotation, translation, and other transformations between the images in a series. The conventional phase correlation algorithm has been widely applied to compute these transformation parameters because of its high speed, its precision, and its low sensitivity to geometric distortion. However, when the scaling factor and the rotation angle are too large, it is difficult to use the conventional phase correlation method for high precision registration. To solve this problem, this paper presents a novel method, which combines the speeded up robust features algorithm and the phase correlation method in log-polar coordinates. Through local region extraction and reuse of a two-step iterative phase correlation algorithm, this method avoids excessive computation, places fewer demands on the characteristics of the image, and effectively improves the accuracy of registration. Simulations on multiple visible-light images verify that this is a fast, accurate, and robust algorithm, even when the image has large angle rotation and large multiple scaling.
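
    The conventional phase correlation step that the abstract builds on can be sketched in one dimension. This is a simplified illustration, not the paper's method: it recovers only a circular translation, uses a naive O(N²) DFT to stay dependency-free, and omits the log-polar resampling, SURF features and iteration described above. The function names and the toy signal are assumptions.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform (O(N^2)); adequate for a small demo."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(
            v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
            for t, v in enumerate(values)
        )
        out.append(acc / n if inverse else acc)
    return out


def estimate_shift(reference, shifted, eps=1e-12):
    """Estimate s such that shifted[n] == reference[(n - s) % N]."""
    a = dft(reference)
    b = dft(shifted)
    cross = []
    for ak, bk in zip(a, b):
        c = ak.conjugate() * bk
        magnitude = abs(c)
        # Keep only the phase; bins with no energy carry no information.
        cross.append(c / magnitude if magnitude > eps else 0)
    correlation = dft(cross, inverse=True)
    # The normalized cross-power spectrum inverts to a peak at the shift.
    return max(range(len(correlation)), key=lambda i: correlation[i].real)


# Hypothetical 1-D "image row" and a copy circularly shifted by 3 samples.
reference = [0, 1, 3, 2, 0, 0, 5, 1]
shifted = reference[-3:] + reference[:-3]
print(estimate_shift(reference, shifted))
```

    In two dimensions the same normalization is applied to 2-D spectra, and rotation and scale become translations once the magnitude spectra are resampled to log-polar coordinates, which is the property the abstract relies on.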

  20. Extracting Vegetation Coverage in Dry-hot Valley Regions Based on Alternating Angle Minimum Algorithm

    Science.gov (United States)

    Y Yang, M.; Wang, J.; Zhang, Q.

    2017-07-01

    Vegetation coverage is one of the most important indicators for ecological environment change, and is also an effective index for the assessment of land degradation and desertification. The dry-hot valley regions have sparse surface vegetation, and the spectral information about the vegetation in such regions usually has a weak representation in remote sensing, so there are considerable limitations in applying the commonly used vegetation index method to calculate the vegetation coverage in the dry-hot valley regions. Therefore, in this paper, the Alternating Angle Minimum (AAM) algorithm, a deterministic model, is adopted to select endmembers for pixel unmixing of MODIS images in order to extract the vegetation coverage, and the accuracy is tested using Landsat TM images from the same period. The results show that, in the dry-hot valley regions with sparse vegetation, the AAM model has a high unmixing accuracy and the extracted vegetation coverage is close to the actual situation, so it is promising to apply the AAM model to the extraction of vegetation coverage in the dry-hot valley regions.
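
    Pixel unmixing, as used above, treats each pixel spectrum as a mixture of endmember spectra. The sketch below shows the simplest case, a two-endmember linear mixture solved by least squares with the fractions constrained to sum to one. It does not implement the AAM endmember selection described in the abstract, and the endmember reflectances are hypothetical values chosen for illustration.

```python
def vegetation_fraction(pixel, vegetation, soil):
    """Least-squares vegetation fraction f for pixel ~ f*vegetation + (1-f)*soil."""
    direction = [v - s for v, s in zip(vegetation, soil)]
    offset = [p - s for p, s in zip(pixel, soil)]
    denominator = sum(d * d for d in direction)
    if denominator == 0:
        raise ValueError("endmembers must differ in at least one band")
    fraction = sum(o * d for o, d in zip(offset, direction)) / denominator
    # Clamp to the physically meaningful range [0, 1].
    return min(1.0, max(0.0, fraction))


# Hypothetical (red, near-infrared) reflectances for the two endmembers.
vegetation = [0.05, 0.45]
soil = [0.25, 0.30]

# Synthetic pixel that is 30% vegetation and 70% soil.
pixel = [0.3 * v + 0.7 * s for v, s in zip(vegetation, soil)]
fraction = vegetation_fraction(pixel, vegetation, soil)
print(f"estimated vegetation fraction: {fraction:.2f}")
```

    The quality of the result depends on how well the endmember spectra are chosen, which is the step the AAM algorithm addresses in the paper.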

  1. Region of interest extraction based on multiscale visual saliency analysis for remote sensing images

    Science.gov (United States)

    Zhang, Yinggang; Zhang, Libao; Yu, Xianchuan

    2015-01-01

    Region of interest (ROI) extraction is an important component of remote sensing image processing. However, traditional ROI extraction methods are usually prior knowledge-based and depend on classification, segmentation, and a global searching solution, which are time-consuming and computationally complex. We propose a more efficient ROI extraction model for remote sensing images based on multiscale visual saliency analysis (MVS), implemented in the CIE L*a*b* color space, which is similar to visual perception of the human eye. We first extract the intensity, orientation, and color feature of the image using different methods: the visual attention mechanism is used to eliminate the intensity feature using a difference of Gaussian template; the integer wavelet transform is used to extract the orientation feature; and color information content analysis is used to obtain the color feature. Then, a new feature-competition method is proposed that addresses the different contributions of each feature map to calculate the weight of each feature image for combining them into the final saliency map. Qualitative and quantitative experimental results of the MVS model as compared with those of other models show that it is more effective and provides more accurate ROI extraction results with fewer holes inside the ROI.

  2. The freetext matching algorithm: a computer program to extract diagnoses and causes of death from unstructured text in electronic health records.

    Science.gov (United States)

    Shah, Anoop D; Martinez, Carlos; Hemingway, Harry

    2012-08-07

    Electronic health records are invaluable for medical research, but much information is stored as free text rather than in a coded form. For example, in the UK General Practice Research Database (GPRD), causes of death and test results are sometimes recorded only in free text. Free text can be difficult to use for research if it requires time-consuming manual review. Our aim was to develop an automated method for extracting coded information from free text in electronic patient records. We reviewed the electronic patient records in GPRD of a random sample of 3310 patients who died in 2001, to identify the cause of death. We developed a computer program called the Freetext Matching Algorithm (FMA) to map diagnoses in text to the Read Clinical Terminology. The program uses lookup tables of synonyms and phrase patterns to identify diagnoses, dates and selected test results. We tested it on two random samples of free text from GPRD (1000 texts associated with death in 2001, and 1000 general texts from cases and controls in a coronary artery disease study), comparing the output to the U.S. National Library of Medicine's MetaMap program and the gold standard of manual review. Among 3310 patients registered in the GPRD who died in 2001, the cause of death was recorded in coded form in 38.1% of patients, and in the free text alone in 19.4%. On the 1000 texts associated with death, FMA coded 683 of the 735 positive diagnoses, with precision (positive predictive value) 98.4% (95% confidence interval (CI) 97.2, 99.2) and recall (sensitivity) 92.9% (95% CI 90.8, 94.7). On the general sample, FMA detected 346 of the 447 positive diagnoses, with precision 91.5% (95% CI 88.3, 94.1) and recall 77.4% (95% CI 73.2, 81.2), which was similar to MetaMap. We have developed an algorithm to extract coded information from free text in GP records with good precision. It may facilitate research using free text in electronic patient records, particularly for extracting the cause of death.
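
    The lookup-table idea described above can be illustrated with a minimal sketch. It matches normalized phrases against a synonym table, longest phrase first, and returns the associated codes. The table entries and the code labels (DX_MI, DX_STROKE) are hypothetical placeholders rather than Read terms, and the sketch does not handle negation, dates or test results as the published program does.

```python
import re

# Hypothetical synonym table: phrase -> placeholder diagnosis code.
SYNONYMS = {
    "heart attack": "DX_MI",
    "myocardial infarction": "DX_MI",
    "stroke": "DX_STROKE",
    "cerebrovascular accident": "DX_STROKE",
}


def match_diagnoses(text):
    """Return the codes whose synonyms occur as whole phrases in the text."""
    # Normalize: lowercase, replace punctuation with spaces, collapse whitespace.
    normalized = " ".join(re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split())
    padded = f" {normalized} "
    found = []
    # Try longer phrases first so more specific wording is matched first.
    for phrase in sorted(SYNONYMS, key=len, reverse=True):
        if f" {phrase} " in padded:
            code = SYNONYMS[phrase]
            if code not in found:
                found.append(code)
    return found


print(match_diagnoses("Died of a heart attack; previous stroke."))
```

    The recall reported for the death-related sample follows directly from the counts in the abstract: 683 of 735 positive diagnoses is 683/735 ≈ 92.9%.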

  3. ZK DrugResist 2.0: A TextMiner to extract semantic relations of drug resistance from PubMed.

    Science.gov (United States)

    Khalid, Zoya; Sezerman, Osman Ugur

    2017-05-01

    Extracting useful knowledge from unstructured textual data is a challenging task for biologists, since biomedical literature is growing exponentially on a daily basis. Building an automated method for such tasks is gaining much attention from researchers. ZK DrugResist is an online tool that automatically extracts mutations and expression changes associated with drug resistance from PubMed. In this study we have extended our tool to include semantic relations extracted from biomedical text covering drug resistance and established a server including both of these features. Our system was tested for three relations, Resistance (R), Intermediate (I) and Susceptible (S), by applying a hybrid feature set. Over the last few decades the focus has shifted to hybrid approaches, as they provide better results. In our case this approach combines rule-based methods with machine learning techniques. The results showed 97.67% accuracy with 96% precision, recall and F-measure. The results outperformed previously existing relation extraction systems and can thus facilitate computational analysis of drug resistance against complex diseases; the approach can further be applied to other areas of biomedicine. Copyright © 2017 Elsevier Inc. All rights reserved.

  4. Modelling and extraction of variability in free-text medication prescriptions from an anonymised primary care electronic medical record research database.

    Science.gov (United States)

    Karystianis, George; Sheppard, Therese; Dixon, William G; Nenadic, Goran

    2016-02-09

    Free-text medication prescriptions contain detailed instruction information that is key when preparing drug data for analysis. The objective of this study was to develop a novel model and automated text-mining method to extract detailed structured medication information from free-text prescriptions and explore their variability (e.g. optional dosages) in primary care research databases. We introduce a prescription model that provides minimum and maximum values for dose number, frequency and interval, allowing modelling variability and flexibility within a drug prescription. We developed a text mining system that relies on rules to extract such structured information from prescription free-text dosage instructions. The system was applied to medication prescriptions from an anonymised primary care electronic record database (Clinical Practice Research Datalink, CPRD). We have evaluated our approach on a test set of 220 CPRD prescription free-text directions. The system achieved an overall accuracy of 91 % at the prescription level, with 97 % accuracy across the attribute levels. We then further analysed over 56,000 most common free text prescriptions from CPRD records and found that 1 in 4 has inherent variability, i.e. a choice in taking medication specified by different minimum and maximum doses, duration or frequency. Our approach provides an accurate, automated way of coding prescription free text information, including information about flexibility and variability within a prescription. The method allows the researcher to decide how best to prepare the prescription data for drug efficacy and safety analyses in any given setting, and test various scenarios and their impact.
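
    The prescription model with minimum and maximum values can be illustrated with a small rule-based sketch. It covers only dose number and daily frequency for one sentence pattern; the interval attribute from the abstract is left out. The regular expression, the Dosage class and the example instructions are assumptions made for illustration and are not the rules used by the published system.

```python
import re
from dataclasses import dataclass


@dataclass
class Dosage:
    """Minimum/maximum dose number and daily frequency (hypothetical model)."""
    dose_min: float
    dose_max: float
    freq_min: float
    freq_max: float


# A number, optionally followed by "-", "to" or "or" and a second number.
RANGE = r"(\d+(?:\.\d+)?)(?:\s*(?:-|to|or)\s*(\d+(?:\.\d+)?))?"
PATTERN = re.compile(
    rf"take\s+{RANGE}\s+(?:tablets?|capsules?)\s+{RANGE}\s+times?\s+(?:a|per)\s+day",
    re.IGNORECASE,
)


def parse_dosage(text):
    """Return a Dosage for instructions matching the single rule, else None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    dose_lo, dose_hi, freq_lo, freq_hi = match.groups()
    # When no upper bound is written, the maximum equals the minimum.
    return Dosage(
        float(dose_lo),
        float(dose_hi or dose_lo),
        float(freq_lo),
        float(freq_hi or freq_lo),
    )


print(parse_dosage("Take 1-2 tablets 3 to 4 times a day"))
print(parse_dosage("take 2 capsules 3 times per day"))
```

    A prescription has inherent variability in the abstract's sense whenever a minimum differs from its maximum, as in the first example.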

  5. Shadow Detection Based on Regions of Light Sources for Object Extraction in Nighttime Video.

    Science.gov (United States)

    Lee, Gil-Beom; Lee, Myeong-Jin; Lee, Woo-Kyung; Park, Joo-Heon; Kim, Tae-Hwan

    2017-03-22

    Intelligent video surveillance systems detect pre-configured surveillance events through background modeling, foreground and object extraction, object tracking, and event detection. Shadow regions inside video frames sometimes appear as foreground objects, interfere with ensuing processes, and finally degrade the event detection performance of the systems. Conventional studies have mostly used intensity, color, texture, and geometric information to perform shadow detection in daytime video, but these methods lack the capability of removing shadows in nighttime video. In this paper, a novel shadow detection algorithm for nighttime video is proposed; this algorithm partitions each foreground object based on the object's vertical histogram and screens out shadow objects by validating their orientations heading toward regions of light sources. From the experimental results, it can be seen that the proposed algorithm shows more than 93.8% shadow removal and 89.9% object extraction rates for nighttime video sequences, and the algorithm outperforms conventional shadow removal algorithms designed for daytime videos.

  6. Adverse Event extraction from Structured Product Labels using the Event-based Text-mining of Health Electronic Records (ETHER) system.

    Science.gov (United States)

    Pandey, Abhishek; Kreimeyer, Kory; Foster, Matthew; Botsis, Taxiarchis; Dang, Oanh; Ly, Thomas; Wang, Wei; Forshee, Richard

    2018-01-01

    Structured Product Labels follow an XML-based document markup standard approved by the Health Level Seven organization and adopted by the US Food and Drug Administration as a mechanism for exchanging medical products information. Their current organization makes their secondary use rather challenging. We used the Side Effect Resource database and DailyMed to generate a comparison dataset of 1159 Structured Product Labels. We processed the Adverse Reaction section of these Structured Product Labels with the Event-based Text-mining of Health Electronic Records system and evaluated its ability to extract and encode Adverse Event terms to Medical Dictionary for Regulatory Activities Preferred Terms. A small sample of 100 labels was then selected for further analysis. Of the 100 labels, Event-based Text-mining of Health Electronic Records achieved a precision and recall of 81 percent and 92 percent, respectively. This study demonstrated Event-based Text-mining of Health Electronic Record's ability to extract and encode Adverse Event terms from Structured Product Labels which may potentially support multiple pharmacoepidemiological tasks.

  7. Selling 'Fracking': Legitimation of High Speed Oil and Gas Extraction in the Marcellus Shale Region

    Science.gov (United States)

    Matz, Jacob R.

    The advent of horizontal hydraulic fracture drilling, or 'fracking,' a technology used to access oil and natural gas deposits, has allowed for the extraction of deep, unconventional shale gas and oil deposits in various shale seams throughout the United States and world. One such shale seam, the Marcellus shale, extends from New York State, across Pennsylvania, and throughout West Virginia, where shale gas development has significantly increased within the last decade. This boom has created a massive amount of economic activity surrounding the energy industry, creating jobs for workers, income from leases and royalties for landowners, and profits for energy conglomerates. However, this bounty comes with risks to environmental and public health, and has led to divisive community polarization over the issue in the Marcellus shale region. In the face of potential environmental and social disruption, and a great deal of controversy surrounding 'fracking,' the oil and gas industry has had to undertake a myriad of public relations campaigns and initiatives to legitimize their extraction efforts in the Marcellus shale region, and to project the oil and gas industry in a positive light to residents, policy makers, and landowners. This thesis describes one such public relations initiative, the Energy in Depth Northeast Marcellus Initiative. Through qualitative content analysis of Energy in Depth's online web material, this thesis examines the ways in which the oil and gas industry narrates the shale gas boom in the Marcellus shale region, and the ways in which the industry frames the discourse surrounding natural gas development. Through the use of environmental imagery, appeals to scientific reason, and appeals to patriotism, the oil and gas industry uses Energy in Depth to frame the shale gas extraction process in a positive way, all the while framing those who question or oppose the processes of shale gas extraction as irrational obstructionists.

  8. Quantitative methodology to extract regional magnetotelluric impedances and determine the dimension of the conductivity structure

    Energy Technology Data Exchange (ETDEWEB)

    Groom, R. [PetRos EiKon Incorporated, Ontario (Canada)]; Kurtz, R.; Jones, A.; Boerner, D. [Geological Survey of Canada, Ontario (Canada)]

    1996-05-01

    This paper describes a systematic method for determining the appropriate dimensionality of magnetotelluric (MT) data from a site, and illustrates the application of this method to analyze both synthetic data and real data. Additionally, it describes the extraction of regional impedance responses from multiple sites. This method was examined extensively with synthetic data, and proven to be successful. It was demonstrated for two neighboring sites that the analysis methodology can be extremely useful in unraveling the bulk regional response when hidden by strong three-dimensional effects. Although there may still be some uncertainties remaining in the true levels for the regional responses for stations LIT000 and LITW02, the analysis has provided models which not only fit the data but are consistent for neighboring sites. It was suggested from these data that the stations are seeing significantly different structures. 12 refs.

  9. Semiosis of Humor in Oral Texts. Instances from a Coastal Area in Regions 8 and 9 of Chile

    Directory of Open Access Journals (Sweden)

    Contreras Oyarzún, Constantino

    2008-12-01

    Full Text Available The author analyzes a set of short texts from the oral tradition collected in a peripheral area of Chile inhabited by Spanish-speaking people who live in contact with the Mapuche-speaking community. In general, it is a study of the internal relations of signs (co-text) and of their connection with the socio-cultural environment (context). In particular, the author focuses the analysis on the linguistic resources employed for the expression of humor.


  10. Antioxidant and antitopoisomerase activities in plant extracts of some Colombian flora from La Marcada Natural Regional Park

    Directory of Open Access Journals (Sweden)

    Jaime Niño

    2011-09-01

    Full Text Available Many plants have been used to treat diseases and infections since time immemorial, and this potential has been exploited by the pharmaceutical industry in the search for new analgesic, anticarcinogenic and antimicrobial agents, among other active agents. In order to contribute to bioprospecting studies on the Colombian flora, 35 extracts from 13 plant species belonging to seven families (Apocynaceae, Cactaceae, Costaceae, Eremolepidaceae, Passifloraceae, Solanaceae and Urticaceae) were collected from La Marcada Natural Regional Park (LMNRP), Colombia. Dichloromethane, n-hexane and aqueous-methanol crude extracts were prepared and evaluated for their activity against Saccharomyces cerevisiae RS322N, R52Y and RS321 strains in the yeast mutant assay and for their antioxidant capacity through the DPPH test. The dichloromethane extract from Myriocarpa stipitata (Urticaceae) showed moderate inhibitory activity against the three S. cerevisiae strains tested. The capacity of the dichloromethane extract from M. stipitata to inhibit the enzyme topoisomerase I and to cause DNA damage was inferred from these results. In the DPPH assay, the n-hexane crude extract from Costus sp. (Costaceae) showed good antioxidant activity (48%); in addition, the crude dichloromethane and aqueous-methanol extracts from Rhipsalis micrantha (Cactaceae) showed moderate antioxidant activity, with percentages of 29 and 21%, respectively. Rev. Biol. Trop. 59 (3): 1089-1097. Epub 2011 September 01.

  11. Text-Attentional Convolutional Neural Network for Scene Text Detection.

    Science.gov (United States)

    He, Tong; Huang, Weilin; Qiao, Yu; Yao, Jian

    2016-06-01

    Recent deep learning models have demonstrated strong capabilities for classifying text and non-text components in natural images. They extract a high-level feature globally computed from a whole image component (patch), where the cluttered background information may dominate true text features in the deep representation. This leads to less discriminative power and poorer robustness. In this paper, we present a new system for scene text detection by proposing a novel text-attentional convolutional neural network (Text-CNN) that particularly focuses on extracting text-related regions and features from the image components. We develop a new learning mechanism to train the Text-CNN with multi-level and rich supervised information, including text region mask, character label, and binary text/non-text information. The rich supervision information gives the Text-CNN a strong capability for discriminating ambiguous texts, and also increases its robustness against complicated background components. The training process is formulated as a multi-task learning problem, where low-level supervised information greatly facilitates the main task of text/non-text classification. In addition, a powerful low-level detector called contrast-enhancement maximally stable extremal regions (CE-MSERs) is developed, which extends the widely used MSERs by enhancing intensity contrast between text patterns and background. This allows it to detect highly challenging text patterns, resulting in a higher recall. Our approach achieved promising results on the ICDAR 2013 data set, with an F-measure of 0.82, substantially improving the state-of-the-art results.

  12. Text-Attentional Convolutional Neural Networks for Scene Text Detection.

    Science.gov (United States)

    He, Tong; Huang, Weilin; Qiao, Yu; Yao, Jian

    2016-03-28

    Recent deep learning models have demonstrated strong capabilities for classifying text and non-text components in natural images. They extract a high-level feature computed globally from a whole image component (patch), where the cluttered background information may dominate true text features in the deep representation. This leads to less discriminative power and poorer robustness. In this work, we present a new system for scene text detection by proposing a novel Text-Attentional Convolutional Neural Network (Text-CNN) that particularly focuses on extracting text-related regions and features from the image components. We develop a new learning mechanism to train the Text-CNN with multi-level and rich supervised information, including text region mask, character label, and binary text/non-text information. The rich supervision information gives the Text-CNN a strong capability for discriminating ambiguous texts, and also increases its robustness against complicated background components. The training process is formulated as a multi-task learning problem, where low-level supervised information greatly facilitates the main task of text/non-text classification. In addition, a powerful low-level detector called Contrast-Enhancement Maximally Stable Extremal Regions (CE-MSERs) is developed, which extends the widely used MSERs by enhancing intensity contrast between text patterns and background. This allows it to detect highly challenging text patterns, resulting in a higher recall. Our approach achieved promising results on the ICDAR 2013 dataset, with an F-measure of 0.82, improving the state-of-the-art results substantially.

  13. High-resolution photogrammetric surface extraction over glaciated regions from WorldView stereo pairs

    Science.gov (United States)

    Noh, M.; Howat, I. M.; Morin, P. J.; Porter, C. C.

    2013-12-01

    The monitoring of surface change in glaciated regions such as Alaska, Greenland and Antarctica is an important pursuit in climate-related Earth Science. Repeat Digital Elevation Models (DEMs) created by photogrammetric surface extraction from a time series of stereo pairs provide an efficient and low-cost means for analyzing surface change over large, remote areas. Stereo-photogrammetric DEM extraction over glaciated regions is challenging due to typically low-contrast surfaces such as ice, snow, mountain shadows and steep slopes, resulting in large feature search areas and matching failures. A method for reducing the feature search area is critical for successful and efficient DEM extraction in this terrain. The SETSM (Surface Extraction with TIN-based Search-space Minimization) algorithm was developed to overcome these problems and performs surface extraction automatically, without any user-defined or a-priori information, such as seed DEMs, using only the sensor Rational Polynomial Coefficients (RPCs) for geometric constraints. Rotation-invariant, multi-patch Normalized Cross Correlation (NCC) is used as its basic similarity measurement. SETSM constructs a TIN (Triangular Irregular Network) in the object-space domain in order to minimize the necessary search space. It employs a pyramiding strategy that uses iteratively finer-resolution TINs to minimize the search space and uses a vertical line locus to provide precise geometric constraints for reducing the search area. As a major benefit, SETSM relatively adjusts the Rational Function Model (RFM) between stereo pairs to reduce the offset between corresponding points projected by the vertical line locus caused by RPC errors, dramatically reducing the number of matching failures. In SETSM, this offset is iteratively removed with a parabolic adjustment of the NCC solution. As a demonstration, WorldView stereo pairs for a variety of test areas in Alaska, Greenland and Antarctica are selected for creating 2 m grid
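    The similarity measure named in this record, Normalized Cross Correlation, can be sketched as follows. This is a plain single-patch, zero-mean NCC written for illustration; it is not the rotation-invariant, multi-patch variant used by SETSM, and the function name and list-of-rows patch layout are assumptions.

```python
import math


def ncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equally sized patches.

    Each patch is a list of rows of numbers. The result lies in [-1, 1];
    0.0 is returned when either patch has no intensity variation.
    """
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    if not a or len(a) != len(b):
        raise ValueError("patches must be non-empty and of equal size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    numerator = sum(x * y for x, y in zip(da, db))
    denominator = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return numerator / denominator if denominator > 0 else 0.0


# Illustrative patches: the second is a brightness/contrast-shifted copy.
left = [[10, 20, 30], [40, 50, 60], [70, 80, 95]]
right = [[2 * v + 5 for v in row] for row in left]
score = ncc(left, right)
```

    Because the means are subtracted and the result is normalized, the score is unaffected by linear changes in brightness and contrast, which is one reason NCC is a common choice for matching low-contrast imagery.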

  14. Chemical composition of hydroethanolic extracts from Siparuna guianensis, medicinal plant used as anxiolytics in Amazon region

    Directory of Open Access Journals (Sweden)

    Giuseppina Negri

    2012-10-01

    Full Text Available Siparuna guianensis Aubl., Siparunaceae, is used as an anxiolytic plant in folk medicine by South American Indians, "caboclos" and river-dwellers. This work focused on evaluating the phenolic composition of a hydroethanolic extract of S. guianensis by HPLC-DAD-ESI/MS/MS. The constituents exhibited protonated, deprotonated and sodiated molecules, and MS/MS fragmentation of these molecules provided product ions with rich structural information. Vicenin-2 (apigenin-6,8-di-C-glucoside) was the main constituent found in S. guianensis, together with quercetin-3,7-di-O-rhamnoside and kaempferol-3,7-di-O-rhamnoside. A commercial extract of Passiflora incarnata (Phytomedicine) was used as a surrogate standard and was also analyzed by HPLC-DAD-ESI/MS/MS, showing flavone C-glycosides as constituents, among them vicenin-2 and vitexin. The main constituent of this commercial extract was vitexin. Flavonol triglycosides were also found in low content in S. guianensis and were tentatively characterized as quercetin-3-O-rutinoside-7-O-rhamnoside, quercetin-3-O-pentosyl-pentoside-7-O-rhamnoside and kaempferol-3-O-pentosyl-pentoside-7-O-rhamnoside. Apigenin and kaempferol derivatives have been reported as anxiolytic agents. Flavonoids present in this extract were correlated with flavonoids reported as anxiolytics.

  15. Chemical composition of hydroethanolic extracts from Siparuna guianensis, medicinal plant used as anxiolytics in Amazon region

    Directory of Open Access Journals (Sweden)

    Giuseppina Negri

    2012-03-01

    Full Text Available Siparuna guianensis Aubl., Siparunaceae, is used as an anxiolytic plant in folk medicine by South American Indians, "caboclos" and river-dwellers. This work focused on evaluating the phenolic composition of a hydroethanolic extract of S. guianensis by HPLC-DAD-ESI/MS/MS. The constituents exhibited protonated, deprotonated and sodiated molecules, and MS/MS fragmentation of these molecules provided product ions with rich structural information. Vicenin-2 (apigenin-6,8-di-C-glucoside) was the main constituent found in S. guianensis, together with quercetin-3,7-di-O-rhamnoside and kaempferol-3,7-di-O-rhamnoside. A commercial extract of Passiflora incarnata (Phytomedicine) was used as a surrogate standard and was also analyzed by HPLC-DAD-ESI/MS/MS, showing flavone C-glycosides as constituents, among them vicenin-2 and vitexin. The main constituent of this commercial extract was vitexin. Flavonol triglycosides were also found in low content in S. guianensis and were tentatively characterized as quercetin-3-O-rutinoside-7-O-rhamnoside, quercetin-3-O-pentosyl-pentoside-7-O-rhamnoside and kaempferol-3-O-pentosyl-pentoside-7-O-rhamnoside. Apigenin and kaempferol derivatives have been reported as anxiolytic agents. Flavonoids present in this extract were correlated with flavonoids reported as anxiolytics.

  16. Pedestrian detection in thermal images: An automated scale based region extraction with curvelet space validation

    Science.gov (United States)

    Lakshmi, A.; Faheema, A. G. J.; Deodhare, Dipti

    2016-05-01

    Pedestrian detection is a key problem in night vision processing, with dozens of applications that will positively impact the performance of autonomous systems. Despite significant progress, our study shows that the performance of state-of-the-art thermal image pedestrian detectors still has much room for improvement. The purpose of this paper is to overcome the challenge faced by thermal image pedestrian detectors that employ intensity-based Region Of Interest (ROI) extraction followed by feature-based validation. The most striking disadvantage of the first module, ROI extraction, is the failed detection of cloth-insulated parts. To overcome this setback, this paper employs an algorithm and a principle of region growing pursuit tuned to the scale of the pedestrian. The statistics subtended by the pedestrian vary drastically with scale, and a deviation-from-normality approach facilitates scale detection. Further, the paper offers an adaptive mathematical threshold to resolve the problem of subtracting the background while also extracting cloth-insulated parts. The inherent false positives of the ROI extraction module are limited by the choice of good features in the pedestrian validation step. One such feature is the curvelet feature, which has been used extensively in optical images but has as yet no reported results in thermal images. This has been used to arrive at a pedestrian detector with a reduced false positive rate. This work is the first attempt to scrutinize the utility of curvelets for characterizing pedestrians in thermal images. An attempt has also been made to improve the speed of curvelet transform computation. The classification task is realized through the well-known methodology of Support Vector Machines (SVMs). The proposed method is substantiated with qualified evaluation methodologies that permit us to carry out probing and informative comparisons across state-of-the-art features, including deep learning methods, with six

  17. Estimation of regional air-quality damages from Marcellus Shale natural gas extraction in Pennsylvania

    Science.gov (United States)

    Litovitz, Aviva; Curtright, Aimee; Abramzon, Shmuel; Burger, Nicholas; Samaras, Constantine

    2013-03-01

    This letter provides a first-order estimate of conventional air pollutant emissions, and the monetary value of the associated environmental and health damages, from the extraction of unconventional shale gas in Pennsylvania. Region-wide estimated damages ranged from 7.2 to 32 million dollars for 2011. The emissions from Pennsylvania shale gas extraction represented only a few per cent of total statewide emissions, and the resulting statewide damages were less than those estimated for each of the state’s largest coal-based power plants. On the other hand, in counties where activities are concentrated, NOx emissions from all shale gas activities were 20-40 times higher than allowable for a single minor source, despite the fact that individual new gas industry facilities generally fall below the major source threshold for NOx. Most emissions are related to ongoing activities, i.e., gas production and compression, which can be expected to persist beyond initial development and which are largely unrelated to the unconventional nature of the resource. Regulatory agencies and the shale gas industry, in developing regulations and best practices, should consider air emissions from these long-term activities, especially if development occurs in more populated areas of the state where per-ton emissions damages are significantly higher.

  18. Emotion Discrimination Using Spatially Compact Regions of Interest Extracted from Imaging EEG Activity.

    Science.gov (United States)

    Padilla-Buritica, Jorge I; Martinez-Vargas, Juan D; Castellanos-Dominguez, German

    2016-01-01

    Lately, research on computational models of emotion has been receiving much attention because of their potential for understanding the mechanisms of emotions and their promising broad range of applications that could bridge the gap between human and machine interactions. We propose a new method for emotion classification that relies on features extracted from those active brain areas that are most likely related to emotions. To this end, we select spatially compact regions of interest that are computed using the brain neural activity reconstructed from electroencephalography data. Throughout this study, we consider three representative feature extraction methods widely applied to emotion detection tasks: power spectral density, wavelet, and Hjorth parameters. Further feature selection is carried out using principal component analysis. For validation purposes, these features are fed to a support vector machine classifier that is trained under the leave-one-out cross-validation strategy. Results obtained on real affective data show that the proposed training method, in combination with the enhanced spatial resolution provided by the source estimation, improves the discrimination accuracy for most of the considered emotions, namely dominance, valence, and liking.
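    Of the three feature families listed in this record, the Hjorth parameters have the most compact definition. The sketch below computes activity, mobility and complexity for one signal using first differences as the derivative; the function name, the difference-based derivative and the test signal are assumptions for illustration and do not reproduce the authors' pipeline.

```python
import math
from statistics import pvariance


def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) of a 1-D signal.

    activity   = variance of the signal
    mobility   = sqrt(var(first difference) / var(signal))
    complexity = mobility(first difference) / mobility(signal)
    """
    d1 = [b - a for a, b in zip(signal, signal[1:])]   # first difference
    d2 = [b - a for a, b in zip(d1, d1[1:])]           # second difference
    activity = pvariance(signal)
    var_d1 = pvariance(d1)
    var_d2 = pvariance(d2)
    if activity == 0 or var_d1 == 0:
        raise ValueError("signal must not be constant or purely linear")
    mobility = math.sqrt(var_d1 / activity)
    complexity = math.sqrt(var_d2 / var_d1) / mobility
    return activity, mobility, complexity


# Illustrative input: 100 periods of a unit-amplitude sine wave,
# 50 samples per period.
wave = [math.sin(2 * math.pi * n / 50) for n in range(5000)]
activity, mobility, complexity = hjorth_parameters(wave)
```

    For a pure sinusoid the complexity is close to 1, so larger values indicate a waveform that departs from a single sine component.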

  19. A novel airport extraction model based on saliency region detection for high spatial resolution remote sensing images

    Science.gov (United States)

    Lv, Wen; Zhang, Libao; Zhu, Yongchun

    2017-06-01

    The airport is one of the most crucial traffic facilities in military and civil fields. Automatic airport extraction in high spatial resolution remote sensing images has many applications, such as regional planning and military reconnaissance. Traditional airport extraction strategies are usually based on prior knowledge and locate the airport target by template matching and classification, which leads to high computational complexity and heavy use of computing resources for high spatial resolution remote sensing images. In this paper, we propose a novel automatic airport extraction model based on saliency region detection, airport runway extraction and adaptive threshold segmentation. In saliency region detection, we choose the frequency-tuned (FT) model for computing airport saliency using low-level features of color and luminance; it is easy and fast to implement and can provide full-resolution saliency maps. In airport runway extraction, the Hough transform is adopted to count the number of parallel line segments. In adaptive threshold segmentation, the Otsu threshold segmentation algorithm is used to obtain more accurate airport regions. The experimental results demonstrate that the proposed model outperforms existing saliency analysis models and shows good performance in the extraction of the airport.
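    The adaptive threshold step in this record relies on the Otsu method, which picks the gray level that maximizes the between-class variance of the histogram. The sketch below is a generic implementation on a flat list of integer gray levels; the function name and the toy saliency values are assumptions, and nothing here reproduces the authors' full model.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form the background class and pixels with
    value > t form the foreground class. `pixels` is a flat iterable of
    integers in the range [0, levels).
    """
    hist = [0] * levels
    total = 0
    for p in pixels:
        hist[p] += 1
        total += 1
    sum_all = sum(i * h for i, h in enumerate(hist))

    weight_bg = 0          # number of pixels at or below the candidate level
    sum_bg = 0             # intensity sum of those pixels
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue       # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break          # all pixels are background from here on
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Toy saliency values: a dark background and a bright salient region.
saliency = [10] * 50 + [200] * 50
threshold = otsu_threshold(saliency)
region = [1 if v > threshold else 0 for v in saliency]
```

    Because the threshold is derived from each image's own histogram, no fixed cut-off has to be tuned per scene, which is the sense in which the segmentation is adaptive.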

  20. Extracting Regional Ionospheric TEC Measurements from Dense GPS (GNSS) Networks in Areas of High Seismic Risk

    Science.gov (United States)

    Reuveni, Y.; Bock, Y.; Geng, J.; Tong, X.; Moore, A. W.

    2013-12-01

    The ionosphere structure and peak electron density vary strongly with time, geographic location, and certain solar and geomagnetic disturbances, causing it to be dynamically variable and hence one of the main sources of GPS errors. Since ionospheric delays are a key limitation to successful GPS integer-cycle phase ambiguity resolution and point positioning accuracy, it is useful to estimate these delays on regional scales when using dense GPS networks. When estimating the Total Electron Content (TEC), one has to take into account the inner delay differences between the two frequencies, which are also known as the Differential Code Biases (DCBs) and can cause errors of several meters if they are ignored. Although DCB estimates for GNSS satellites and IGS ground receivers are provided on a regular basis by the International GNSS Service (IGS) analysis centers (such as CODE, JPL, and ESA), the DCBs for regional and local network receivers are not provided, and some of the IGS ground receiver estimates are not available from all analysis centers. Additionally, the DCB estimates vary between different GNSS satellites and ground receivers, where the majority of the DCB values are based on the assumption that they are constant over 1 day or 1 month for any given GPS satellite or receiver. However, this assumption is far from being valid, since in fact the DCB values often vary diurnally or semi-diurnally. Regional ionospheric TEC models can be developed and used in real time to reduce errors in precise point positioning for dense real-time GPS networks. In addition, regional TEC maps extracted from GPS ionospheric path delays can be used, along with tropospheric delays, for mitigating errors in Interferometric Synthetic Aperture Radar (InSAR) images, especially for the L-band signals. The regional ionospheric TEC maps can also be used for the detection and characterization of ionospheric perturbations, which is valuable for both telluric natural hazards
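    The relation between dual-frequency code measurements, the differential code biases and slant TEC can be sketched with the standard first-order ionospheric model. The function name, the single combined bias term expressed in meters, and its sign convention are assumptions made for this illustration; the record does not give the authors' estimation procedure.

```python
F_L1 = 1575.42e6   # GPS L1 carrier frequency in Hz
F_L2 = 1227.60e6   # GPS L2 carrier frequency in Hz
K_IONO = 40.3      # first-order ionospheric constant in m^3/s^2
TECU = 1.0e16      # one TEC unit in electrons per square meter


def slant_tec_tecu(p1_m, p2_m, dcb_m=0.0, f1=F_L1, f2=F_L2):
    """Slant TEC in TEC units from code pseudoranges on two frequencies.

    p1_m, p2_m : pseudoranges in meters on f1 and f2
    dcb_m      : combined satellite-plus-receiver differential code bias,
                 expressed in meters on (P2 - P1); the sign convention is
                 an assumption of this sketch
    """
    factor = (f1 ** 2 * f2 ** 2) / (K_IONO * (f1 ** 2 - f2 ** 2))
    return factor * (p2_m - p1_m - dcb_m) / TECU


# One meter of differential delay between L2 and L1, with no bias.
tec_per_meter = slant_tec_tecu(0.0, 1.0)

# The same observation when the whole meter is actually instrumental bias.
tec_bias_only = slant_tec_tecu(0.0, 1.0, dcb_m=1.0)
```

    Each meter of uncorrected bias on the P2 minus P1 difference maps to several TEC units, which shows why the biases have to be estimated before regional TEC maps are built.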

  1. A Proposed Arabic Handwritten Text Normalization Method

    Directory of Open Access Journals (Sweden)

    Tarik Abu-Ain

    2014-11-01

    Full Text Available Text normalization is an important technique in document image analysis and recognition. It consists of many preprocessing stages, which include slope correction, text padding, skew correction, and straightening of the writing line. In this respect, text normalization plays an important role in many procedures, such as text segmentation, feature extraction and character recognition. In the present article, a new method for text baseline detection, straightening, and slant correction for Arabic handwritten texts is proposed. The method comprises a set of sequential steps: first, the components are segmented and then thinned; next, the direction features of the skeletons are extracted and the candidate baseline regions are determined. After that, the correct baseline region is selected, and finally the baselines of all components are aligned with the writing line. The experiments are conducted on the IFN/ENIT benchmark Arabic dataset. The results show that the proposed method has promising and encouraging performance.

  2. Characterization of new cardioprotective principle isolated from methanolic extract of Allium humile leaves from Himalayan region

    Directory of Open Access Journals (Sweden)

    Yogita Dobhal

    2016-06-01

    Full Text Available In the modern era, scientists have been trying to validate many properties of Allium species, especially in terms of the identity of the active components and their mechanism of action, and to explore their potential benefits as food supplements. The present study was therefore designed to characterize the cardioprotective compound isolated from Allium humile leaves. Chromatographic purification of the methanolic extract of A. humile leaves yielded ajoene (enol form) (AH-1), a new potent cardioprotective principle, along with three known compounds: allicin (AH-2), alliin (AH-3) and the flavonoid quercetin (AH-4). The structures of all the isolates (AH-1 to AH-4) were characterized using modern spectroscopic analysis: UV, IR, 1H and 13C NMR and mass spectrometry. Furthermore, the newly isolated compound was pharmacologically confirmed to have a cardioprotective effect. The data for the known compounds (AH-2 to AH-4) were further compared with the reported data for these compounds.

  3. Text Mining.

    Science.gov (United States)

    Trybula, Walter J.

    1999-01-01

    Reviews the state of research in text mining, focusing on newer developments. The intent is to describe the disparate investigations currently included under the term text mining and provide a cohesive structure for these efforts. A summary of research identifies key organizations responsible for pushing the development of text mining. A section…

  4. Full text

    African Journals Online (AJOL)

    IndexCopernicus Portal System

    Conclusions: We conclude that jellyfish causes many stings among fishermen in the Basra region. Their stings lead to immediate and delayed skin reactions. Self- treatment by topical remedies is common. INTRODUCTION. Venomous marine creatures are a well- recognized hazard for those working and swimming in the ...

  5. Full text

    African Journals Online (AJOL)

    IndexCopernicus Portal System

    35 cm abdomino-pelvic echogene mass, extending from the pubis symphysis to the epigastric region, displacing the liver, spleen and the urinary bladder. The computed tomography showed similar finding (figure 1). A giant ovarian cyst was diagnosed. Median laparotomy was performed and a 30 cm retroperitoneal cyst ...

  6. Effect of socket preservation therapies following tooth extraction in non-molar regions in humans: a systematic review

    NARCIS (Netherlands)

    ten Heggeler, J.M.A.G.; Slot, D.E.; van der Weijden, G.A.

    2011-01-01

    Objective: To assess, based on the existing literature, the benefit of socket preservation therapies in patients with a tooth extraction in the anterior or premolar region as compared with no additional treatment with respect to bone level. Material and methods: MEDLINE-PubMed and the Cochrane

  7. Larvicidal, antimicrobial and brine shrimp activities of extracts from Cissampelos mucronata and Tephrosia villosa from coast region, Tanzania

    Science.gov (United States)

    2011-01-01

    Background: The leaves and roots of Cissampelos mucronata A. Rich (Menispermaceae) are widely used in the tropics and subtropics to manage various ailments such as gastro-intestinal complaints, menstrual problems, venereal diseases and malaria. In the Coast region, Tanzania, roots are used to treat wounds due to extraction of jigger. Leaves of Tephrosia villosa (L) Pers (Leguminosae) are reported to be used in the treatment of diabetes mellitus in India. In this study, extracts from the roots and aerial parts of C. mucronata and extracts from leaves, fruits, twigs and roots of T. villosa were evaluated for larvicidal activity, brine shrimp toxicity and antimicrobial activity. Methods: Powdered materials from C. mucronata were extracted sequentially with dichloromethane followed by ethanol, while materials from T. villosa were extracted with ethanol only. The extracts obtained were evaluated for larvicidal activity using Culex quinquefasciatus Say larvae, cytotoxicity using brine shrimp larvae and antimicrobial activity using bacteria and fungi. Results: Extracts from aerial parts of C. mucronata exhibited antibacterial activity against Staphylococcus aureus, Escherichia coli, Pseudomonas aeruginosa, Salmonella typhi, Vibrio cholerae, Bacillus anthracis and Streptococcus faecalis, and antifungal activity against Candida albicans and Cryptococcus neoformans. They exhibited very low toxicity to brine shrimps and had no larvicidal activity. The root extracts exhibited good larvicidal activity but weak antimicrobial activity. The root dichloromethane extract from C. mucronata was found to be more toxic, with an LC50 value of 59.608 μg/mL, while ethanolic extracts from the root were not toxic (LC50 > 100 μg/mL). Ethanol extracts from fruits and roots of T. villosa were found to be very toxic, with LC50 values of 9.690 μg/mL and 4.511 μg/mL, respectively, while ethanol extracts from leaves and twigs of T. villosa were found to be non-toxic (LC50 > 100 μg/mL). Conclusion: These results

  8. Regional urban area extraction using DMSP-OLS data and MODIS data

    Science.gov (United States)

    Zhang, X. Y.; Cai, C.; Li, P. J.

    2014-03-01

    Stable night lights data from the Defense Meteorological Satellite Program (DMSP) Operational Line-scan System (OLS) provide a unique proxy for anthropogenic development. This paper proposes two new methods of extracting regional urban extents using DMSP-OLS data, MODIS NDVI data and Land Surface Temperature (LST) data. MODIS NDVI data were used to reduce the over-glow effect, since urban areas generally have lower vegetation index values than the surrounding areas (e.g. agricultural and forest areas). On the other hand, urban areas generally show higher surface temperatures than the surrounding areas. Since urban area is the only class of interest, a one-class classifier, the One-Class Support Vector Machine (OCSVM), was selected as the classifier. The first method is classification of different data combinations for mapping: (1) OLS data and NDVI data, (2) OLS data and LST data, and (3) OLS data, NDVI data and LST data combined. The second is a morphological reconstruction based method which combines classification results from OLS plus NDVI data and from OLS plus LST data. In the morphological reconstruction based method, the classification result using OLS and NDVI data was used as a mask image, while the classification result using OLS and LST data was used as a marker image. The north China area covering 14 provinces was selected as the study area. Classification results from Landsat TM/ETM+ data from selected areas with different development levels were used as reference data to validate the proposed methods. The results show that the proposed methods effectively reduced the over-glow effect in the DMSP-OLS data and achieved better results than the traditional thresholding technique. The combination of all three datasets produces more accurate results than using any two datasets. The proposed morphological reconstruction based method achieves the best result in urban extent mapping.
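    The marker-and-mask combination described in this record corresponds to binary morphological reconstruction by dilation: the marker is grown inside the mask until it stops changing, so only the mask regions touched by the marker survive. The sketch below is a generic version on small binary grids; the function name, the 8-connected neighborhood and the toy grids are assumptions, and the two inputs merely stand in for the OLS plus NDVI (mask) and OLS plus LST (marker) classification results.

```python
def reconstruct_by_dilation(marker, mask):
    """Binary morphological reconstruction of `mask` from `marker`.

    Both arguments are equally sized lists of rows holding 0/1 values.
    The result keeps exactly those 8-connected regions of the mask that
    contain at least one marker pixel.
    """
    rows, cols = len(mask), len(mask[0])
    # Start from the marker restricted to the mask.
    result = [[1 if marker[r][c] and mask[r][c] else 0 for c in range(cols)]
              for r in range(rows)]
    stack = [(r, c) for r in range(rows) for c in range(cols) if result[r][c]]
    # Repeated dilation limited by the mask, done as a flood fill.
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and mask[nr][nc] and not result[nr][nc]):
                    result[nr][nc] = 1
                    stack.append((nr, nc))
    return result


# Toy grids: the mask holds two candidate urban blobs, the marker
# confirms only the left one.
mask = [[1, 1, 0, 0, 0],
        [1, 1, 0, 0, 1],
        [0, 0, 0, 0, 1]]
marker = [[1, 0, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0]]
urban = reconstruct_by_dilation(marker, mask)
```

    The unconfirmed blob on the right is dropped, while the confirmed blob is recovered at its full mask extent rather than at the smaller marker extent.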

  9. A new breast cancer risk analysis approach using features extracted from multiple sub-regions on bilateral mammograms

    Science.gov (United States)

    Sun, Wenqing; Tseng, Tzu-Liang B.; Zheng, Bin; Zhang, Jianying; Qian, Wei

    2015-03-01

    A novel breast cancer risk analysis approach is proposed for enhancing performance of computerized breast cancer risk analysis using bilateral mammograms. Based on the intensity of breast area, five different sub-regions were acquired from one mammogram, and bilateral features were extracted from every sub-region. Our dataset includes 180 bilateral mammograms from 180 women who underwent routine screening examinations, all interpreted as negative and not recalled by the radiologists during the original screening procedures. A computerized breast cancer risk analysis scheme using four image processing modules, including sub-region segmentation, bilateral feature extraction, feature selection, and classification was designed to detect and compute image feature asymmetry between the left and right breasts imaged on the mammograms. The highest computed area under the curve (AUC) is 0.763 ± 0.021 when applying the multiple sub-region features to our testing dataset. The positive predictive value and the negative predictive value were 0.60 and 0.73, respectively. The study demonstrates that (1) features extracted from multiple sub-regions can improve the performance of our scheme compared to using features from whole breast area only; (2) a classifier using asymmetry bilateral features can effectively predict breast cancer risk; (3) incorporating texture and morphological features with density features can boost the classification accuracy.
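    For readers unfamiliar with the two predictive values quoted above, the following sketch shows how they are defined in terms of confusion-matrix counts. The counts used here are hypothetical and are not taken from the study, which reports only the resulting values; the function names are likewise illustrative.

```python
def positive_predictive_value(tp, fp):
    """Fraction of positive predictions that are truly positive."""
    return tp / (tp + fp)


def negative_predictive_value(tn, fn):
    """Fraction of negative predictions that are truly negative."""
    return tn / (tn + fn)


# Hypothetical counts, chosen only to show the arithmetic.
ppv = positive_predictive_value(tp=40, fp=10)
npv = negative_predictive_value(tn=45, fn=5)
```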

  10. Methanol extract of Nigella sativa seed induces changes in the levels of neurotransmitter amino acids in male rat brain regions.

    Science.gov (United States)

    El-Naggar, Tarek; Carretero, María Emilia; Arce, Carmen; Gómez-Serranillos, María Pilar

    2017-12-01

    Nigella sativa L. (Ranunculaceae) (NS) has been used for medicinal and culinary purposes. Different parts of the plant are used to treat many disorders. This study investigates the effects of NS methanol extract on brain neurotransmitter amino acid levels. We measured the changes in aspartate, glutamate, glycine and γ-aminobutyric acid in five brain regions of male Wistar rats after methanol extract treatment. Animals were injected intraperitoneally with saline solution (controls) or NS methanol extract (equivalent of 2.5 g/kg body weight) and sacrificed 1 h later or after administering 1 daily dose for 8 days. The neurotransmitters were measured in the hypothalamus, cortex, striatum, hippocampus and thalamus by HPLC. Results showed significant changes in amino acids compared to basal values. Glutamate increased significantly (16-36%) in the regions analyzed except the striatum. Aspartate in the hypothalamus (50 and 76%) and glycine in the hippocampus (32 and 25%), thalamus (66 and 29%) and striatum (75 and 48%) also increased with the two treatment intervals. γ-Aminobutyric acid significantly increased in the hippocampus (38 and 32%) and thalamus (22 and 40%) but decreased in the cortex and hypothalamus, and in the striatum only after eight days of treatment (24%). Our results suggest that injected methanol extract modifies amino acid levels in the rat brain regions. These results could be of interest since some neurodegenerative diseases are related to amino acid level imbalances in the central nervous system, suggesting the prospect for therapeutic use of NS against these disorders.

  11. A la Recherche du Temps Perdu: extracting temporal relations from medical text in the 2012 i2b2 NLP challenge.

    Science.gov (United States)

    Cherry, Colin; Zhu, Xiaodan; Martin, Joel; de Bruijn, Berry

    2013-01-01

    An analysis of the timing of events is critical for a deeper understanding of the course of events within a patient record. The 2012 i2b2 NLP challenge focused on the extraction of temporal relationships between concepts within textual hospital discharge summaries. The team from the National Research Council Canada (NRC) submitted three system runs to the second track of the challenge: typifying the time-relationship between pre-annotated entities. The NRC system was designed around four specialist modules containing statistical machine learning classifiers. Each specialist targeted distinct sets of relationships: local relationships, 'sectime'-type relationships, non-local overlap-type relationships, and non-local causal relationships. The best NRC submission achieved a precision of 0.7499, a recall of 0.6431, and an F1 score of 0.6924, resulting in a statistical tie for first place. Post hoc improvements led to a precision of 0.7537, a recall of 0.6455, and an F1 score of 0.6954, giving the highest scores reported on this task to date. Methods for general relation extraction extended well to temporal relations, and gave top-ranked state-of-the-art results. Careful ordering of predictions within result sets proved critical to this success.
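    The F1 scores quoted above follow from the reported precision and recall through the standard harmonic-mean formula. The short sketch below recomputes them as a consistency check; the function name `f1_score` is an illustrative choice, not something taken from the NRC system.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract.
best_submission = round(f1_score(0.7499, 0.6431), 4)
post_hoc = round(f1_score(0.7537, 0.6455), 4)
print(best_submission, post_hoc)
```

    Both recomputed values agree with the F1 scores stated in the abstract to four decimal places.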

  12. Antimicrobial potential of ethanol extracts of plants against gramnegative bacilli isolated from cervicovaginal mucosa of sheep bred in the region of Petrolina-PE

    Directory of Open Access Journals (Sweden)

    Valdenice Félix da Silva

    2014-02-01

    Reproductive tract infections are the main cause of losses from low reproductive efficiency in sheep. Gram-negative bacilli belonging to the normal flora of the genital region can trigger disease. The pathogenicity of these agents is expressed when females have a weakened immune system, whether from food or management stress. Flaws in and concern about antibiotic residues in animal production have prompted research into alternatives for the treatment of diseases. Herbal medicine has been considered in this context and is the subject of numerous studies. This study aimed to evaluate the antibacterial potential of ethanol extracts of plants belonging to the flora of the Northeast against gram-negative bacilli isolated from the cervicovaginal mucosa of sheep. Six plants were selected from the Caatinga biome: Encholirium spectabile, Bromelia laciniosa, Neoglaziovia variegata, Amburana cearensis, Hymenaea martiana and Selaginella convoluta. The plant material was processed to obtain the crude extract. The extract was tested by plate microdilution to determine the minimum bactericidal concentration, according to the Clinical and Laboratory Standards Institute (CLSI) document, with the extracts diluted in water and in alcohol. We used 43 gram-negative isolates, as follows: 14 E. coli, 10 Enterobacter spp., 10 Acinetobacter spp. and 9 Klebsiella spp. In the aqueous dilution, Klebsiella spp. responded only to the species B. laciniosa, S. convoluta and H. martiana. All tested extracts showed antibacterial activity against Acinetobacter spp. and no activity against E. coli and Enterobacter spp. Among the extracts diluted in water, H. martiana showed the highest antibacterial activity. In alcoholic dilution, all extracts showed inhibitory activity against all bacterial genera, with no statistical difference between them.

  13. Extracting Neural Oscillation Signatures of Laser-Induced Nociception in Pain-Related Regions in Rats

    Directory of Open Access Journals (Sweden)

    Xuezhu Li

    2017-10-01

    Previous studies have shown that multiple brain regions are involved in pain perception and pain-related neural processes by forming a functionally connected pain network. It is still unclear how these pain-related brain areas actively work together to generate the experience of pain. To get a better insight into the pain network, we implanted electrodes in four pain-related areas of rats including the anterior cingulate cortex (ACC), orbitofrontal cortex (OFC), primary somatosensory cortex (S1) and periaqueductal gray (PAG). We analyzed the pattern of local field potential (LFP) oscillations under noxious laser stimulations and innoxious laser stimulations. A high-dimensional feature matrix was built based on the LFP characteristics for both experimental conditions. Generalized linear models (GLMs) were trained to classify recorded LFPs under noxious vs. innoxious conditions. We found a general power decrease in the α and β bands and a power increase in the γ band in the recorded areas under the noxious condition. After noxious laser stimulation, there was a consistent change in LFP power and correlation in all four brain areas among all 13 rats. With GLM classifiers, noxious laser trials were distinguished from innoxious laser trials with high accuracy (86%) using high-dimensional LFP features. This work provides a basis for further research to examine which aspects (e.g., sensory, motor or affective processes) of noxious stimulation should drive distinct neural activity across the pain network.

  14. Measurements of long-range near-side angular correlations in $\sqrt{s_{\text{NN}}}=5$ TeV proton-lead collisions in the forward region

    CERN Document Server

    Aaij, Roel; Adeva, Bernardo; Adinolfi, Marco; Affolder, Anthony; Ajaltouni, Ziad; Akar, Simon; Albrecht, Johannes; Alessio, Federico; Alexander, Michael; Ali, Suvayu; Alkhazov, Georgy; Alvarez Cartelle, Paula; Alves Jr, Antonio Augusto; Amato, Sandra; Amerio, Silvia; Amhis, Yasmine; An, Liupan; Anderlini, Lucio; Anderson, Jonathan; Andreassi, Guido; Andreotti, Mirco; Andrews, Jason; Appleby, Robert; Aquines Gutierrez, Osvaldo; Archilli, Flavio; d'Argent, Philippe; Artamonov, Alexander; Artuso, Marina; Aslanides, Elie; Auriemma, Giulio; Baalouch, Marouen; Bachmann, Sebastian; Back, John; Badalov, Alexey; Baesso, Clarissa; Baldini, Wander; Barlow, Roger; Barschel, Colin; Barsuk, Sergey; Barter, William; Batozskaya, Varvara; Battista, Vincenzo; Bay, Aurelio; Beaucourt, Leo; Beddow, John; Bedeschi, Franco; Bediaga, Ignacio; Bel, Lennaert; Bellee, Violaine; Belloli, Nicoletta; Belyaev, Ivan; Ben-Haim, Eli; Bencivenni, Giovanni; Benson, Sean; Benton, Jack; Berezhnoy, Alexander; Bernet, Roland; Bertolin, Alessandro; Bettler, Marc-Olivier; van Beuzekom, Martinus; Bien, Alexander; Bifani, Simone; Billoir, Pierre; Bird, Thomas; Birnkraut, Alex; Bizzeti, Andrea; Blake, Thomas; Blanc, Frédéric; Blouw, Johan; Blusk, Steven; Bocci, Valerio; Bondar, Alexander; Bondar, Nikolay; Bonivento, Walter; Borghi, Silvia; Borsato, Martino; Bowcock, Themistocles; Bowen, Espen Eie; Bozzi, Concezio; Braun, Svende; Britsch, Markward; Britton, Thomas; Brodzicka, Jolanta; Brook, Nicholas; Buchanan, Emma; Burr, Christopher; Bursche, Albert; Buytaert, Jan; Cadeddu, Sandro; Calabrese, Roberto; Calvi, Marta; Calvo Gomez, Miriam; Campana, Pierluigi; Campora Perez, Daniel; Capriotti, Lorenzo; Carbone, Angelo; Carboni, Giovanni; Cardinale, Roberta; Cardini, Alessandro; Carniti, Paolo; Carson, Laurence; Carvalho Akiba, Kazuyoshi; Casse, Gianluigi; Cassina, Lorenzo; Castillo Garcia, Lucia; Cattaneo, Marco; Cauet, Christophe; Cavallero, Giovanni; Cenci, Riccardo; Charles, Matthew; Charpentier, 
Philippe; Chefdeville, Maximilien; Chen, Shanzhen; Cheung, Shu-Faye; Chiapolini, Nicola; Chrzaszcz, Marcin; Cid Vidal, Xabier; Ciezarek, Gregory; Clarke, Peter; Clemencic, Marco; Cliff, Harry; Closier, Joel; Coco, Victor; Cogan, Julien; Cogneras, Eric; Cogoni, Violetta; Cojocariu, Lucian; Collazuol, Gianmaria; Collins, Paula; Comerma-Montells, Albert; Contu, Andrea; Cook, Andrew; Coombes, Matthew; Coquereau, Samuel; Corti, Gloria; Corvo, Marco; Couturier, Benjamin; Cowan, Greig; Craik, Daniel Charles; Crocombe, Andrew; Cruz Torres, Melissa Maria; Cunliffe, Samuel; Currie, Robert; D'Ambrosio, Carmelo; Dall'Occo, Elena; Dalseno, Jeremy; David, Pieter; Davis, Adam; De Aguiar Francisco, Oscar; De Bruyn, Kristof; De Capua, Stefano; De Cian, Michel; De Miranda, Jussara; De Paula, Leandro; De Simone, Patrizia; Dean, Cameron Thomas; Decamp, Daniel; Deckenhoff, Mirko; Del Buono, Luigi; Déléage, Nicolas; Demmer, Moritz; Derkach, Denis; Deschamps, Olivier; Dettori, Francesco; Dey, Biplab; Di Canto, Angelo; Di Ruscio, Francesco; Dijkstra, Hans; Donleavy, Stephanie; Dordei, Francesca; Dorigo, Mirco; Dosil Suárez, Alvaro; Dossett, David; Dovbnya, Anatoliy; Dreimanis, Karlis; Dufour, Laurent; Dujany, Giulio; Dupertuis, Frederic; Durante, Paolo; Dzhelyadin, Rustem; Dziurda, Agnieszka; Dzyuba, Alexey; Easo, Sajan; Egede, Ulrik; Egorychev, Victor; Eidelman, Semen; Eisenhardt, Stephan; Eitschberger, Ulrich; Ekelhof, Robert; Eklund, Lars; El Rifai, Ibrahim; Elsasser, Christian; Ely, Scott; Esen, Sevda; Evans, Hannah Mary; Evans, Timothy; Falabella, Antonio; Färber, Christian; Farley, Nathanael; Farry, Stephen; Fay, Robert; Ferguson, Dianne; Fernandez Albor, Victor; Ferrari, Fabio; Ferreira Rodrigues, Fernando; Ferro-Luzzi, Massimiliano; Filippov, Sergey; Fiore, Marco; Fiorini, Massimiliano; Firlej, Miroslaw; Fitzpatrick, Conor; Fiutowski, Tomasz; Fohl, Klaus; Fol, Philip; Fontana, Marianna; Fontanelli, Flavio; Forshaw, Dean Charles; Forty, Roger; Frank, Markus; Frei, Christoph; 
Frosini, Maddalena; Fu, Jinlin; Furfaro, Emiliano; Gallas Torreira, Abraham; Galli, Domenico; Gallorini, Stefano; Gambetta, Silvia; Gandelman, Miriam; Gandini, Paolo; Gao, Yuanning; García Pardiñas, Julián; Garra Tico, Jordi; Garrido, Lluis; Gascon, David; Gaspar, Clara; Gauld, Rhorry; Gavardi, Laura; Gazzoni, Giulio; Gerick, David; Gersabeck, Evelina; Gersabeck, Marco; Gershon, Timothy; Ghez, Philippe; Gianì, Sebastiana; Gibson, Valerie; Girard, Olivier Göran; Giubega, Lavinia-Helena; Gligorov, V.V.; Göbel, Carla; Golubkov, Dmitry; Golutvin, Andrey; Gomes, Alvaro; Gotti, Claudio; Grabalosa Gándara, Marc; Graciani Diaz, Ricardo; Granado Cardoso, Luis Alberto; Graugés, Eugeni; Graverini, Elena; Graziani, Giacomo; Grecu, Alexandru; Greening, Edward; Gregson, Sam; Griffith, Peter; Grillo, Lucia; Grünberg, Oliver; Gui, Bin; Gushchin, Evgeny; Guz, Yury; Gys, Thierry; Hadavizadeh, Thomas; Hadjivasiliou, Christos; Haefeli, Guido; Haen, Christophe; Haines, Susan; Hall, Samuel; Hamilton, Brian; Han, Xiaoxue; Hansmann-Menzemer, Stephanie; Harnew, Neville; Harnew, Samuel; Harrison, Jonathan; He, Jibo; Head, Timothy; Heijne, Veerle; Heister, Arno; Hennessy, Karol; Henrard, Pierre; Henry, Louis; Hernando Morata, Jose Angel; van Herwijnen, Eric; Heß, Miriam; Hicheur, Adlène; Hill, Donal; Hoballah, Mostafa; Hombach, Christoph; Hulsbergen, Wouter; Humair, Thibaud; Hussain, Nazim; Hutchcroft, David; Hynds, Daniel; Idzik, Marek; Ilten, Philip; Jacobsson, Richard; Jaeger, Andreas; Jalocha, Pawel; Jans, Eddy; Jawahery, Abolhassan; Jing, Fanfan; John, Malcolm; Johnson, Daniel; Jones, Christopher; Joram, Christian; Jost, Beat; Jurik, Nathan; Kandybei, Sergii; Kanso, Walaa; Karacson, Matthias; Karbach, Moritz; Karodia, Sarah; Kecke, Matthieu; Kelsey, Matthew; Kenyon, Ian; Kenzie, Matthew; Ketel, Tjeerd; Khairullin, Egor; Khanji, Basem; Khurewathanakul, Chitsanu; Kirn, Thomas; Klaver, Suzanne; Klimaszewski, Konrad; Kochebina, Olga; Kolpin, Michael; Komarov, Ilya; Koopman, Rose; 
Koppenburg, Patrick; Kozeiha, Mohamad; Kravchuk, Leonid; Kreplin, Katharina; Kreps, Michal; Krocker, Georg; Krokovny, Pavel; Kruse, Florian; Krzemien, Wojciech; Kucewicz, Wojciech; Kucharczyk, Marcin; Kudryavtsev, Vasily; Kuonen, Axel Kevin; Kurek, Krzysztof; Kvaratskheliya, Tengiz; Lacarrere, Daniel; Lafferty, George; Lai, Adriano; Lambert, Dean; Lanfranchi, Gaia; Langenbruch, Christoph; Langhans, Benedikt; Latham, Thomas; Lazzeroni, Cristina; Le Gac, Renaud; van Leerdam, Jeroen; Lees, Jean-Pierre; Lefèvre, Regis; Leflat, Alexander; Lefrançois, Jacques; Lemos Cid, Edgar; Leroy, Olivier; Lesiak, Tadeusz; Leverington, Blake; Li, Yiming; Likhomanenko, Tatiana; Liles, Myfanwy; Lindner, Rolf; Linn, Christian; Lionetto, Federica; Liu, Bo; Liu, Xuesong; Loh, David; Longstaff, Iain; Lopes, Jose; Lucchesi, Donatella; Lucio Martinez, Miriam; Luo, Haofei; Lupato, Anna; Luppi, Eleonora; Lupton, Oliver; Lusiani, Alberto; Machefert, Frederic; Maciuc, Florin; Maev, Oleg; Maguire, Kevin; Malde, Sneha; Malinin, Alexander; Manca, Giulia; Mancinelli, Giampiero; Manning, Peter Michael; Mapelli, Alessandro; Maratas, Jan; Marchand, Jean François; Marconi, Umberto; Marin Benito, Carla; Marino, Pietro; Marks, Jörg; Martellotti, Giuseppe; Martin, Morgan; Martinelli, Maurizio; Martinez Santos, Diego; Martinez Vidal, Fernando; Martins Tostes, Danielle; Massafferri, André; Matev, Rosen; Mathad, Abhijit; Mathe, Zoltan; Matteuzzi, Clara; Mauri, Andrea; Maurin, Brice; Mazurov, Alexander; McCann, Michael; McCarthy, James; McNab, Andrew; McNulty, Ronan; Meadows, Brian; Meier, Frank; Meissner, Marco; Melnychuk, Dmytro; Merk, Marcel; Michielin, Emanuele; Milanes, Diego Alejandro; Minard, Marie-Noelle; Mitzel, Dominik Stefan; Molina Rodriguez, Josue; Monroy, Ignacio Alberto; Monteil, Stephane; Morandin, Mauro; Morawski, Piotr; Mordà, Alessandro; Morello, Michael Joseph; Moron, Jakub; Morris, Adam Benjamin; Mountain, Raymond; Muheim, Franz; Müller, Dominik; Müller, Janine; Müller, Katharina; Müller, 
Vanessa; Mussini, Manuel; Muster, Bastien; Naik, Paras; Nakada, Tatsuya; Nandakumar, Raja; Nandi, Anita; Nasteva, Irina; Needham, Matthew; Neri, Nicola; Neubert, Sebastian; Neufeld, Niko; Neuner, Max; Nguyen, Anh Duc; Nguyen, Thi-Dung; Nguyen-Mau, Chung; Niess, Valentin; Niet, Ramon; Nikitin, Nikolay; Nikodem, Thomas; Novoselov, Alexey; O'Hanlon, Daniel Patrick; Oblakowska-Mucha, Agnieszka; Obraztsov, Vladimir; Ogilvy, Stephen; Okhrimenko, Oleksandr; Oldeman, Rudolf; Onderwater, Gerco; Osorio Rodrigues, Bruno; Otalora Goicochea, Juan Martin; Otto, Adam; Owen, Patrick; Oyanguren, Maria Aranzazu; Palano, Antimo; Palombo, Fernando; Palutan, Matteo; Panman, Jacob; Papanestis, Antonios; Pappagallo, Marco; Pappalardo, Luciano; Pappenheimer, Cheryl; Parker, William; Parkes, Christopher; Passaleva, Giovanni; Patel, Girish; Patel, Mitesh; Patrignani, Claudia; Pearce, Alex; Pellegrino, Antonio; Penso, Gianni; Pepe Altarelli, Monica; Perazzini, Stefano; Perret, Pascal; Pescatore, Luca; Petridis, Konstantinos; Petrolini, Alessandro; Petruzzo, Marco; Picatoste Olloqui, Eduardo; Pietrzyk, Boleslaw; Pilař, Tomas; Pinci, Davide; Pistone, Alessandro; Piucci, Alessio; Playfer, Stephen; Plo Casasus, Maximo; Poikela, Tuomas; Polci, Francesco; Poluektov, Anton; Polyakov, Ivan; Polycarpo, Erica; Popov, Alexander; Popov, Dmitry; Popovici, Bogdan; Potterat, Cédric; Price, Eugenia; Price, Joseph David; Prisciandaro, Jessica; Pritchard, Adrian; Prouve, Claire; Pugatch, Valery; Puig Navarro, Albert; Punzi, Giovanni; Qian, Wenbin; Quagliani, Renato; Rachwal, Bartolomiej; Rademacker, Jonas; Rama, Matteo; Rangel, Murilo; Raniuk, Iurii; Rauschmayr, Nathalie; Raven, Gerhard; Redi, Federico; Reichert, Stefanie; Reid, Matthew; dos Reis, Alberto; Ricciardi, Stefania; Richards, Sophie; Rihl, Mariana; Rinnert, Kurt; Rives Molina, Vincente; Robbe, Patrick; Rodrigues, Ana Barbara; Rodrigues, Eduardo; Rodriguez Lopez, Jairo Alexis; Rodriguez Perez, Pablo; Roiser, Stefan; Romanovsky, Vladimir; Romero 
Vidal, Antonio; Ronayne, John William; Rotondo, Marcello; Rouvinet, Julien; Ruf, Thomas; Ruiz Valls, Pablo; Saborido Silva, Juan Jose; Sagidova, Naylya; Sail, Paul; Saitta, Biagio; Salustino Guimaraes, Valdir; Sanchez Mayordomo, Carlos; Sanmartin Sedes, Brais; Santacesaria, Roberta; Santamarina Rios, Cibran; Santimaria, Marco; Santovetti, Emanuele; Sarti, Alessio; Satriano, Celestina; Satta, Alessia; Saunders, Daniel Martin; Savrina, Darya; Schael, Stefan; Schiller, Manuel; Schindler, Heinrich; Schlupp, Maximilian; Schmelling, Michael; Schmelzer, Timon; Schmidt, Burkhard; Schneider, Olivier; Schopper, Andreas; Schubiger, Maxime; Schune, Marie Helene; Schwemmer, Rainer; Sciascia, Barbara; Sciubba, Adalberto; Semennikov, Alexander; Sergi, Antonino; Serra, Nicola; Serrano, Justine; Sestini, Lorenzo; Seyfert, Paul; Shapkin, Mikhail; Shapoval, Illya; Shcheglov, Yury; Shears, Tara; Shekhtman, Lev; Shevchenko, Vladimir; Shires, Alexander; Siddi, Benedetto Gianluca; Silva Coutinho, Rafael; Silva de Oliveira, Luiz Gustavo; Simi, Gabriele; Sirendi, Marek; Skidmore, Nicola; Skwarnicki, Tomasz; Smith, Edmund; Smith, Eluned; Smith, Iwan Thomas; Smith, Jackson; Smith, Mark; Snoek, Hella; Sokoloff, Michael; Soler, Paul; Soomro, Fatima; Souza, Daniel; Souza De Paula, Bruno; Spaan, Bernhard; Spradlin, Patrick; Sridharan, Srikanth; Stagni, Federico; Stahl, Marian; Stahl, Sascha; Stefkova, Slavomira; Steinkamp, Olaf; Stenyakin, Oleg; Stevenson, Scott; Stoica, Sabin; Stone, Sheldon; Storaci, Barbara; Stracka, Simone; Straticiuc, Mihai; Straumann, Ulrich; Sun, Liang; Sutcliffe, William; Swientek, Krzysztof; Swientek, Stefan; Syropoulos, Vasileios; Szczekowski, Marek; Szumlak, Tomasz; T'Jampens, Stephane; Tayduganov, Andrey; Tekampe, Tobias; Teklishyn, Maksym; Tellarini, Giulia; Teubert, Frederic; Thomas, Christopher; Thomas, Eric; van Tilburg, Jeroen; Tisserand, Vincent; Tobin, Mark; Todd, Jacob; Tolk, Siim; Tomassetti, Luca; Tonelli, Diego; Topp-Joergensen, Stig; Torr, Nicholas; 
Tournefier, Edwige; Tourneur, Stephane; Trabelsi, Karim; Tran, Minh Tâm; Tresch, Marco; Trisovic, Ana; Tsaregorodtsev, Andrei; Tsopelas, Panagiotis; Tuning, Niels; Ukleja, Artur; Ustyuzhanin, Andrey; Uwer, Ulrich; Vacca, Claudia; Vagnoni, Vincenzo; Valenti, Giovanni; Vallier, Alexis; Vazquez Gomez, Ricardo; Vazquez Regueiro, Pablo; Vázquez Sierra, Carlos; Vecchi, Stefania; van Veghel, Maarten; Velthuis, Jaap; Veltri, Michele; Veneziano, Giovanni; Vesterinen, Mika; Viaud, Benoit; Vieira, Daniel; Vieites Diaz, Maria; Vilasis-Cardona, Xavier; Volkov, Vladimir; Vollhardt, Achim; Volyanskyy, Dmytro; Voong, David; Vorobyev, Alexey; Vorobyev, Vitaly; Voß, Christian; de Vries, Jacco; Waldi, Roland; Wallace, Charlotte; Wallace, Ronan; Walsh, John; Wandernoth, Sebastian; Wang, Jianchun; Ward, David; Watson, Nigel; Websdale, David; Weiden, Andreas; Whitehead, Mark; Wilkinson, Guy; Wilkinson, Michael; Williams, Mark Richard James; Williams, Matthew; Williams, Mike; Williams, Timothy; Wilson, Fergus; Wimberley, Jack; Wishahi, Julian; Wislicki, Wojciech; Witek, Mariusz; Wormser, Guy; Wotton, Stephen; Wright, Simon; Wyllie, Kenneth; Xie, Yuehong; Xu, Zhirui; Yang, Zhenwei; Yu, Jiesheng; Yuan, Xuhao; Yushchenko, Oleg; Zangoli, Maria; Zavertyaev, Mikhail; Zhang, Liming; Zhang, Yanxi; Zhelezov, Alexey; Zhokhov, Anatoly; Zhong, Liang; Zhukov, Valery; Zucchelli, Stefano

    2016-01-01

    Two-particle angular correlations are studied in proton-lead collisions at a nucleon-nucleon centre-of-mass energy of $\sqrt{s_{\text{NN}}}=5$ TeV, collected with the LHCb detector at the LHC. The analysis is based on data recorded in two beam configurations, in which either the direction of the proton or that of the lead ion is analysed. The correlations are measured as a function of relative pseudorapidity, $\Delta\eta$, and relative azimuthal angle, $\Delta\phi$, for events in different classes of event activity and for different bins of particle transverse momentum. In high-activity events a long-range correlation on the near side, $\Delta\phi \approx 0$, is observed in the pseudorapidity range $2.0<\eta<4.9$. This measurement of long-range correlations on the near side in proton-lead collisions extends previous observations into the forward region up to $\eta=4.9$. The correlation increases with growing event activity and is found to be more pronounced in the direction of the lead beam. However, the...
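    As a rough illustration of the two pair variables named in this abstract, the sketch below computes the relative pseudorapidity and the relative azimuthal angle for one pair of particles. The sign convention (associated minus trigger) and the choice to wrap the azimuthal difference into the interval from -π/2 to 3π/2 are assumptions made here for illustration; the abstract does not specify either, and the function names are hypothetical.

```python
import math


def delta_eta(eta_trigger, eta_assoc):
    """Relative pseudorapidity of a particle pair (assumed sign convention)."""
    return eta_assoc - eta_trigger


def delta_phi(phi_trigger, phi_assoc):
    """Relative azimuthal angle wrapped into [-pi/2, 3*pi/2) (assumed range)."""
    dphi = (phi_assoc - phi_trigger) % (2 * math.pi)
    if dphi >= 1.5 * math.pi:
        dphi -= 2 * math.pi
    return dphi


# A pair with nearly equal azimuth falls on the near side (delta_phi close to 0),
# even when the raw angles sit on opposite sides of the +/-pi boundary.
near_side = delta_phi(3.0, -3.0)
away_side = delta_phi(0.0, math.pi)
```

    With this wrapping, near-side pairs cluster around 0 and away-side pairs around π, so both peaks sit away from the edges of the range.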

  15. Regional Studies Program. Extraction of North Dakota lignite: environmental and reclamation issues

    Energy Technology Data Exchange (ETDEWEB)

    LaFevers, J.R.; Johnson, D.O.; Dvorak, A.J.

    1976-12-01

    This study, sponsored by the U.S. Energy Research and Development Administration, addresses the environmental implications of extraction of coal in North Dakota. These implications are supported by details of the geologic and historical background of the area of focus, the lignite resources in the Fort Union coalfield portion. The particular concentration is on the four-county area of Mercer, Dunn, McLean, and Oliver where substantial coal reserves exist and a potential gasification plant site has been identified. The purposes of this extensive study are to identify the land use and environmental problems and issues associated with extraction; to provide a base of information for assessing the impacts of various levels of extraction; to examine the economics and feasibility of reclamation; and to identify research that needs to be undertaken to evaluate and to improve reclamation practices. The study also includes a description of the physical and chemical soil characteristics and hydrological and climatic factors entailed in extraction, revegetation, and reclamation procedures.

  16. Antioxidant Activities of Ribes diacanthum Pall Extracts in the Northern Region of Mongolia

    Science.gov (United States)

    Birasuren, Bayarmaa; Oh, Hye Lim; Kim, Cho Rong; Kim, Na Yeon; Jeon, Hye Lyun; Kim, Mee Ree

    2012-01-01

    Ribes diacanthum Pall (RDP) is a member of the Saxifragaceae family. The plant is traditionally used in Mongolia for the treatment of various ailments associated with kidney and bladder diseases, cystitis, kidney stones, and edema. This study aimed to investigate the antioxidant activities of different solvent extracts of whole Pall plants, based on ferric-reducing antioxidant potential (FRAP), 2,2′-azinobis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS· +) radical scavenging activity, and 1,1-diphenyl-2-picrylhydrazyl (DPPH·) and hydroxyl (·OH) radical scavenging activities. Additionally, total flavonoid and phenolic contents (TPC) were determined. The ethyl acetate extract of RDP (EARDP) had a remarkable radical scavenging capacity with an IC50 value of 0.1482 mg/mL. In addition, EARDP was shown to be higher in total phenolic and flavonoid contents than the methanol extract of RDP (MRDP). Moreover, the EARDP had the predominant antioxidant capacity, DPPH, hydroxyl, and ABTS radical scavenging activities and ferric reducing power. These results suggest a potential for R. diacanthum Pall extract as a functional medicinal material against free-radical-associated oxidative damage. PMID:24471094

  17. Multilingual Text Detection with Nonlinear Neural Network

    Directory of Open Access Journals (Sweden)

    Lin Li

    2015-01-01

    Multilingual text detection in natural scenes is still a challenging task in computer vision. In this paper, we apply an unsupervised learning algorithm to learn language-independent stroke features, and combine unsupervised stroke feature learning with automatic multilayer feature extraction to improve the representational power of text features. We also develop a novel nonlinear network, based on the traditional Convolutional Neural Network, that is able to detect multilingual text regions in images. The proposed method is evaluated on standard benchmarks and a multilingual dataset and demonstrates improvement over previous work.

  18. Regional Oil Extraction and Consumption: A simple production model for the next 35 years Part I

    CERN Document Server

    Dittmar, Michael

    2016-01-01

    The growing conflicts in and about oil exporting regions and speculations about volatile oil prices during the last decade have renewed the public interest in predictions for near-future oil production and consumption. Unfortunately, studies from only 10 years ago, which tried to forecast oil production during the next 20-30 years, failed to make accurate predictions for today's global oil production and consumption. Forecasts using economic growth scenarios overestimated the actual oil production, while models which tried to estimate the maximum future oil production per year, using the official country oil reserve data, predicted production that was too low. In this paper, a new approach to model the maximal future regional and thus global oil production (part I) and consumption (part II) during the next decades is proposed. Our analysis of the regional oil production data during past decades shows that, in contrast to periods when production was growing and growth rates varied greatly from one country to ano...

  19. Microproteomics by liquid extraction surface analysis: application to FFPE tissue to study the fimbria region of tubo-ovarian cancer.

    Science.gov (United States)

    Wisztorski, Maxence; Fatou, Benoit; Franck, Julien; Desmons, Annie; Farré, Isabelle; Leblanc, Eric; Fournier, Isabelle; Salzet, Michel

    2013-04-01

    We have developed a new method for rapid analysis of a specific region on formalin-fixed and paraffin-embedded (FFPE) tissue sections. This method combines the advantages of direct tissue MS analysis, which keeps histological information, and conventional proteomics approaches for confident identification of proteins in complex samples. After histological annotation, heat-induced antigen retrieval is performed on FFPE tissue. Using a chemical inkjet printer, trypsin is deposited on discrete regions of less than 1 mm². After protein digestion, a liquid extraction is performed to retrieve all the peptides. Data coming from identification of proteins in cancer and benign regions are compared. In total, 3649 unique peptides were identified (with a strict peptide false discovery rate of less than 1%), corresponding to 983 and 792 nonredundant protein groups identified in the benign and cancer regions, respectively. A total of 123 protein groups are found only in the cancer region and 315 are specific to the benign part. From these data, it has been possible to obtain different important signaling pathways involved in cancer processes and some proteins already known as biomarkers. This new approach using a combination of localized on-tissue protein digestion and liquid microextraction followed by LC-MS/MS analysis is useful for advancing our understanding of cancer biology. It is a rapid and innovative technique that will contribute positively to clinical proteomics. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. Composition and sources of winter haze in the Bakken oil and gas extraction region

    Science.gov (United States)

    Evanoski-Cole, A. R.; Gebhart, K. A.; Sive, B. C.; Zhou, Y.; Capps, S. L.; Day, D. E.; Prenni, A. J.; Schurman, M. I.; Sullivan, A. P.; Li, Y.; Hand, J. L.; Schichtel, B. A.; Collett, J. L.

    2017-05-01

    In the past decade increased use of hydraulic fracturing and horizontal drilling has dramatically expanded oil and gas production in the Bakken formation region. Long term monitoring sites have indicated an increase in wintertime aerosol nitrate and sulfate in this region from particulate matter (PM2.5) measurements collected between 2000 and 2010. No previous intensive air quality field campaign has been conducted in this region to assess impacts from oil and gas development on regional fine particle concentrations. The research presented here investigates wintertime PM2.5 concentrations and composition as part of the Bakken Air Quality Study (BAQS). Measurements from BAQS took place over two wintertime sampling periods at multiple sites in the United States portion of the Bakken formation and show regionally elevated episodes of PM2.5 during both study periods. Ammonium nitrate was a major contributor to haze episodes. Periods of air stagnation or recirculation were associated with rapid increases in PM2.5 concentrations. Volatile organic compound (VOC) signatures suggest that air masses during these episodes were dominated by emissions from the Bakken region itself. Formation rates of alkyl nitrates from alkanes revealed an air mass aging timescale of typically less than a day for periods with elevated PM2.5. A thermodynamic inorganic aerosol model (ISORROPIA) was used to investigate gas-particle partitioning and to examine the sensitivity of PM2.5 concentrations to aerosol precursor concentrations. Formation of ammonium nitrate, the dominant component, was most sensitive to ammonia concentrations during winter and to nitric acid concentrations during early spring when ammonia availability increases. The availability of excess ammonia suggests capacity for further ammonium nitrate formation if nitrogen oxide emissions increase in the future and lead to additional secondary formation of nitric acid.

  1. Antioxidant activities and total phenol content of Inula viscosa extracts selected from three regions of Morocco

    Directory of Open Access Journals (Sweden)

    Naima Chahmi

    2015-03-01

    Conclusions: The results of our antioxidant assays partially support the popular usage of the tested plants. The high antioxidant activity found in the plant from Sefrou, together with its great biomass in this region, suggests that Inula viscosa is a good source of natural antioxidant compounds that might have health benefits.

  2. Progressive region-based colon extraction for computer-aided detection and quantitative imaging in cathartic and non-cathartic CT colonography

    Science.gov (United States)

    Näppi, Janne J.; Ryu, Yasuji; Yoshida, Hiroyuki

    2014-03-01

    Automated colon extraction is an important first step for applications of computer-aided detection (CADe) and quantitative imaging in computed tomographic colonography (CTC). However, previously developed colon extraction algorithms have various limitations. We developed a new fully automated progressive region-based (PRB) method to extract the complete region of the colon from CTC images while minimizing the presence of extra-colonic components. The method can be used to provide the target region for CADe as well as to provide quantitative imaging information about the colon. In the method, extra-colonic components are excluded from the abdominal region by use of anatomy-based extraction methods, visible lumen regions of the colon and small bowel are decomposed into material-based subregions, a colon pathway is tracked from the anus to the cecum by use of algorithms of progressively increasing complexity using anatomy-based landmarks, segmental features, and region-based tracking algorithms, and the extracted lumen region is perfected into a complete lumen by use of a discrete level-set algorithm. The method was tested with 15 challenging cathartic and non-cathartic fecal-tagging CTC cases. Preliminary results indicate that the PRB method can outperform our previously developed centerline-based tracking method in colon extraction.

  3. Remote ischaemic conditioning decreases blood flow and improves oxygen extraction in patients with early complex regional pain syndrome.

    Science.gov (United States)

    Hegelmaier, T; Kumowski, N; Mainka, T; Vollert, J; Goertz, O; Lehnhardt, M; Zahn, P K; Maier, C; Kolbenschlag, J

    2017-09-01

    Remote ischaemic conditioning (RIC) is the cyclic application of non-damaging ischaemia leading to an increased tissue perfusion, triggered among other factors by nitric oxide (NO). Complex regional pain syndrome (CRPS) is known to have vascular alterations such as increased blood shunting and decreased NO blood-levels, which in turn lead to decreased tissue perfusion. We therefore hypothesized that RIC could improve tissue perfusion in CRPS. In this proof-of-concept study, RIC was applied in the following groups: in 21 patients with early CRPS with a clinical history less than a year, in 20 age/sex-matched controls and in 12 patients with unilateral nerve lesions via a tourniquet on the unaffected/non-dominant upper limb. Blood flow and tissue oxygen saturation (StO2) were assessed before, during and after RIC via laser Doppler and tissue spectroscopy on the affected extremity. The oxygen extraction fraction was calculated. After RIC, blood flow declined in CRPS (p CRPS and healthy controls (p CRPS, the oxygen extraction fraction correlated negatively with the decreasing blood flow (p CRPS, which led to a revised hypothesis: the decrease of blood flow might be due to an anti-inflammatory effect that attenuates vascular disturbances and reduces blood shunting, thus improving oxygen extraction. Further studies could determine whether a repeated application of RIC leads to a reduced hypoxia in chronic CRPS. Remote ischaemic conditioning leads to a decrease of blood flow. This decrease inversely correlates with the oxygen extraction in patients with CRPS. © 2017 European Pain Federation - EFIC®.

  4. Unsafe Practice of Extracting Potable Water From Aquifers in the Southwestern Coastal Region of Bangladesh

    Science.gov (United States)

    Chowdhury, S. H.; Ahmed, A. U.; Iqbal, M. Z.

    2009-05-01

    The groundwater resource is of paramount importance to the lives and livelihoods of millions of people in Bangladesh. Unfortunately, high levels of arsenic have been found in groundwater in many parts of Bangladesh. In addition, the salinity of water systems in the coastal areas has increased as a consequence of flow diversion from the upper reaches of the Ganges River by neighboring India. Since hand-pumped groundwater (tube) wells are the only viable sources of drinking water, maintaining drinking water security for over 6 million people in the south-west (SW) region has been a major challenge for the Bangladesh Government. Due to rapid exploitation of groundwater resources in excess of recharge capacity, non-saline water sources in the SW region have already been depleted and the hand tube wells have mostly been abandoned. Meanwhile, shrimp farming has resulted in saline water infiltration into the perched aquifer system in many areas. A recent survey covering 123 wells out of 184, extending to a depth of 330 m, showed high salinity in water. The combined factors of rapid exploitation of shallow groundwater, depletion of the deep aquifers, and the subsequent saline water intrusion into these aquifers have put the long-term sustainability of the remaining fresh groundwater resource in jeopardy. Very high concentrations of nitrite were found in this study in many tube wells in the area, where samples were drawn from aquifer systems up to 244 m deep. Nitrite concentrations in 35 wells randomly sampled in this study range from 16.98 to 43.11 mg/L, averaging 27.55 mg/L. This is much higher than the Maximum Contaminant Level (MCL) of 1 mg/L set by the U.S. EPA for human consumption. Simultaneously, dissolved oxygen (DO) is found to be very low (0.1 to 2 mg/L). There are numerous reports and anecdotal evidence of "Blue Baby Syndrome" (methemoglobinemia) in the region, which is generally due to gradual suffocation caused by poor transport of oxygen from the

  5. Microbiological quality of honey from the Pampas Region (Argentina) throughout the extraction process.

    Science.gov (United States)

    Fernández, Leticia A; Ghilardi, Carolina; Hoffmann, Betiana; Busso, Carlos; Gallez, Liliana M

    The microbiological quality of honey obtained from different processing points and the environmental quality within honey houses were assessed in the Pampas Region (Argentina). Mold and yeast (MY), culturable heterotrophic mesophilic bacteria (CHMB), the number of spore-forming bacteria as well as the presence of Shigella spp., Salmonella spp. and fecal coliforms were evaluated in 163 samples. These samples were taken from eight honey houses. Results showed that 89 samples had ≤10 CFU of MY/g honey, 69 ranged from 10 to 50 CFU/g and two reached 65.5 CFU/g. Eighty-one percent of the samples showed ≤30 CFU of CHMB/g honey and only seven samples had between 50 and 54.25 CFU/g. Thirty-six honey samples were obtained from drums: in 25 samples (69.4%) CHMB counts were ≤30 CFU/g of honey; in 20 samples (55.5%) the values of MY were between 10 and 50 CFU/g honey and total coliforms were only detected in 20 samples. Fecal coliforms, spores of clostridia as well as Salmonella spp. and Shigella spp. were not detected and less than 50 spores of Bacillus spp. per g were observed in the honey from drums. Therefore, the microbiological honey quality within the honey houses did not show any sanitary risks. Our results were reported to honey house owners to help them understand the need to reinforce proper honey handling and sanitation practices. Copyright © 2016 Asociación Argentina de Microbiología. Publicado por Elsevier España, S.L.U. All rights reserved.

  6. Text location in color documents

    Science.gov (United States)

    Jain, Anil K.; Namboodiri, Anoop M.; Jung, Keechul

    2003-01-01

    Many document images contain both text and non-text (images, line drawings, etc.) regions. An automatic segmentation of such an image into text and non-text regions is extremely useful in a variety of applications. Identification of text regions helps in text recognition applications, while the classification of an image into text and non-text regions helps in processing the individual regions differently in applications like page reproduction and printing. One of the main approaches to text detection is based on modeling the text as a texture. We present a method based on a combination of neural networks (texture-based) and connected component analysis to detect text in color documents with busy foreground and background. The proposed method achieves an accuracy of 96% (by area) on a test set of 40 documents.
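
    The connected component analysis mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' method: it is a generic 4-connected labeling pass over a binary grid that returns one bounding box per foreground region, which is the kind of candidate region a text detector might then classify. The function name `connected_components` and the toy `page` grid are hypothetical.

```python
from collections import deque


def connected_components(grid):
    """Return bounding boxes (top, left, bottom, right) of the 4-connected
    foreground regions in a binary grid given as a list of lists of 0/1."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy binarized "page": a 2x2 blob and a 2-pixel vertical stroke.
page = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
boxes = connected_components(page)
print(boxes)
```

    In a real pipeline the boxes would be filtered by size or aspect ratio, or combined with a texture classifier as the abstract describes; those steps are left out here.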

  7. POTASH EXTRACTION AND HISTORICAL ENVIRONMENTAL CONFLICT IN THE BAGES REGION (SPAIN)

    Directory of Open Access Journals (Sweden)

    Santiago Gorostiza Langa

    2014-01-01

    Full Text Available Potash extraction in the Bages region (Spain) has caused major environmental impacts throughout recent history, as shown by the progressive salinization of the Cardener and Llobregat rivers. Recently, several projects that will increase brine production and saline waste heaps have been announced. Following Martínez-Alier, in this article I characterize the conflicts around potash extraction and its socio-environmental impacts as ecological distribution conflicts, and I propose a historical approach that takes into account the flows of water, potassium, and chlorine. Despite the importance of potassium as an essential nutrient for plant growth, along with phosphorus and nitrogen, conflicts related to the extraction of potassium salts have received relatively little attention. For the present case study, statistical data and archival sources are used to show potash extraction in relation to the rise in water salinity in Barcelona over the course of the twentieth century. Special attention is given to the technological infrastructures developed to provide a technical solution to the problem of water salinization, such as the brine collector or reverse osmosis filters, while highlighting the power relations behind the choice of these technologies. The historical approach to this case study shows that the definition of externalities as cost-shifting successes, defended by Martínez-Alier, is appropriate for the economic costs related to the environmental remediation of the Bages mines, which are basically covered by public funds.

  8. Weaving with text

    DEFF Research Database (Denmark)

    Hagedorn-Rasmussen, Peter

    This paper explores how a school principal by means of practical authorship creates reservoirs of language that provide a possible context for collective sensemaking. The paper draws upon a field study in which a school principal and his managerial team were shadowed during a period of intensive changes. The paper explores how the manager weaves with text, extracted from stakeholders, administration, politicians, employees, public discourse etc., as a means of creating a new fabric, a texture, of diverse perspectives that aims for collective sensemaking.

  9. Comparing a knowledge-driven approach to a supervised machine learning approach in large-scale extraction of drug-side effect relationships from free-text biomedical literature.

    Science.gov (United States)

    Xu, Rong; Wang, QuanQiu

    2015-01-01

    Systems approaches to studying drug-side-effect (drug-SE) associations are emerging as an active research area for both drug target discovery and drug repositioning. However, a comprehensive drug-SE association knowledge base does not exist. In this study, we present a novel knowledge-driven (KD) approach to effectively extract a large number of drug-SE pairs from published biomedical literature. For the text corpus, we used 21,354,075 MEDLINE records (119,085,682 sentences). First, we used known drug-SE associations derived from FDA drug labels as prior knowledge to automatically find SE-related sentences and abstracts. We then extracted a total of 49,575 drug-SE pairs from MEDLINE sentences and 180,454 pairs from abstracts. On average, the KD approach achieved a precision of 0.335, a recall of 0.509, and an F1 of 0.392, which is significantly better than an SVM-based machine learning approach (precision: 0.135, recall: 0.900, F1: 0.233), with a 73.0% increase in F1 score. Through integrative analysis, we demonstrate that the higher-level phenotypic drug-SE relationships reflect lower-level genetic, genomic, and chemical drug mechanisms. In addition, we show that the extracted drug-SE pairs can be directly used in drug repositioning. In summary, we automatically constructed a large-scale body of higher-level drug phenotype relationship knowledge, which can have great potential in computational drug discovery.
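
    As an illustration of the metrics quoted in the abstract above, the F1 score is the harmonic mean of precision and recall. The sketch below is not taken from the paper, and the function name `f1_score` is hypothetical. Because the abstract reports averaged figures, recomputing F1 from the averaged precision and recall need not reproduce the reported F1 values exactly.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Averaged precision/recall pairs quoted in the abstract.
kd_f1 = f1_score(0.335, 0.509)   # knowledge-driven approach
svm_f1 = f1_score(0.135, 0.900)  # SVM-based approach
print(round(kd_f1, 3), round(svm_f1, 3))
```

    The harmonic mean penalizes an imbalance between the two inputs, which is why a recall of 0.900 paired with a precision of 0.135 still yields a low F1.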

  10. In Vitro Pharmacological Activities and GC-MS Analysis of Different Solvent Extracts of Lantana camara Leaves Collected from Tropical Region of Malaysia

    Directory of Open Access Journals (Sweden)

    Mallappa Kumara Swamy

    2015-01-01

    Full Text Available We investigated the effect of different solvents (ethyl acetate, methanol, acetone, and chloroform) on the extraction of phytoconstituents from Lantana camara leaves and their antioxidant and antibacterial activities. Further, GC-MS analysis was carried out to identify the bioactive chemical constituents occurring in the active extract. The results revealed the presence of various phytocompounds in the extracts. The methanol solvent recovered higher extractable compounds (14.4% of yield) and contained the highest phenolic (92.8 mg GAE/g) and flavonoid (26.5 mg RE/g) content. DPPH radical scavenging assay showed the IC50 value of 165, 200, 245, and 440 μg/mL for methanol, ethyl acetate, acetone, and chloroform extracts, respectively. The hydroxyl scavenging activity test showed the IC50 value of 110, 240, 300, and 510 μg/mL for methanol, ethyl acetate, acetone, and chloroform extracts, respectively. Gram negative bacterial pathogens (E. coli and K. pneumoniae) were more susceptible to all extracts compared to Gram positive bacteria (M. luteus, B. subtilis, and S. aureus). Methanol extract had the highest inhibition activity against all the tested microbes. Moreover, methanolic extract of L. camara contained 32 bioactive components as revealed by GC-MS study. The identified major compounds included hexadecanoic acid (5.197%), phytol (4.528%), caryophyllene oxide (4.605%), and 9,12,15-octadecatrienoic acid, methyl ester, (Z,Z,Z)- (3.751%).

  11. Atlantic Sturgeon Spatial and Temporal Distribution in Minas Passage, Nova Scotia, Canada, a Region of Future Tidal Energy Extraction.

    Science.gov (United States)

    Stokesbury, Michael J W; Logan-Chesney, Laura M; McLean, Montana F; Buhariwalla, Colin F; Redden, Anna M; Beardsall, Jeffrey W; Broome, Jeremy E; Dadswell, Michael J

    2016-01-01

    In the Bay of Fundy, Atlantic sturgeon from endangered and threatened populations in the USA and Canada migrate through Minas Passage to enter and leave Minas Basin. A total of 132 sub-adult and adult Atlantic sturgeon were tagged in Minas Basin during the summers of 2010-2014 using pressure measuring, uniquely coded, acoustic transmitters with a four or eight year life span. The aim of this study was to examine spatial and seasonal distribution of sturgeon in Minas Passage during 2010-2014 and test the hypothesis that, when present, Atlantic sturgeon were evenly distributed from north to south across Minas Passage. This information is important as tidal energy extraction using in-stream, hydrokinetic turbines is planned for only the northern portion of Minas Passage. Electronic tracking data from a total of 740 sturgeon days over four years demonstrated that Atlantic sturgeon used the southern portion of Minas Passage significantly more than the northern portion. Sturgeon moved through Minas Passage at depths mostly between 15 and 45 m (n = 10,116; mean = 31.47 m; SD = 14.88). Sturgeon mean swimming depth was not significantly related to bottom depth and in deeper regions they swam pelagically. Sturgeon predominately migrated inward through Minas Passage during spring, and outward during late summer-autumn. Sturgeon were not observed in Minas Passage during winter 2012-2013 when monitoring receivers were present. This information will enable the estimation of encounters of Atlantic sturgeon with in-stream hydrokinetic turbines.

  12. Atlantic Sturgeon Spatial and Temporal Distribution in Minas Passage, Nova Scotia, Canada, a Region of Future Tidal Energy Extraction

    Science.gov (United States)

    Stokesbury, Michael J. W.; Logan-Chesney, Laura M.; McLean, Montana F.; Buhariwalla, Colin F.; Redden, Anna M.; Beardsall, Jeffrey W.; Broome, Jeremy E.; Dadswell, Michael J.

    2016-01-01

    In the Bay of Fundy, Atlantic sturgeon from endangered and threatened populations in the USA and Canada migrate through Minas Passage to enter and leave Minas Basin. A total of 132 sub-adult and adult Atlantic sturgeon were tagged in Minas Basin during the summers of 2010–2014 using pressure measuring, uniquely coded, acoustic transmitters with a four or eight year life span. The aim of this study was to examine spatial and seasonal distribution of sturgeon in Minas Passage during 2010–2014 and test the hypothesis that, when present, Atlantic sturgeon were evenly distributed from north to south across Minas Passage. This information is important as tidal energy extraction using in-stream, hydrokinetic turbines is planned for only the northern portion of Minas Passage. Electronic tracking data from a total of 740 sturgeon days over four years demonstrated that Atlantic sturgeon used the southern portion of Minas Passage significantly more than the northern portion. Sturgeon moved through Minas Passage at depths mostly between 15 and 45 m (n = 10,116; mean = 31.47 m; SD = 14.88). Sturgeon mean swimming depth was not significantly related to bottom depth and in deeper regions they swam pelagically. Sturgeon predominately migrated inward through Minas Passage during spring, and outward during late summer-autumn. Sturgeon were not observed in Minas Passage during winter 2012–2013 when monitoring receivers were present. This information will enable the estimation of encounters of Atlantic sturgeon with in-stream hydrokinetic turbines. PMID:27383274

  13. International Conference on Harmonisation; guidance on Q4B Evaluation and Recommendation of Pharmacopoeial texts for use in the International Conference on Harmonisation Regions; Annex 12 on Analytical Sieving General Chapter; availability. Notice.

    Science.gov (United States)

    2010-09-02

    The Food and Drug Administration (FDA) is announcing the availability of a guidance entitled "Q4B Evaluation and Recommendation of Pharmacopoeial Texts for Use in the ICH Regions; Annex 12: Analytical Sieving General Chapter." The guidance was prepared under the auspices of the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH). The guidance provides the results of the ICH Q4B evaluation of the Analytical Sieving General Chapter harmonized text from each of the three pharmacopoeias (United States, European, and Japanese) represented by the Pharmacopoeial Discussion Group (PDG). The guidance conveys recognition of the three pharmacopoeial methods by the three ICH regulatory regions and provides specific information regarding the recognition. The guidance is intended to recognize the interchangeability between the local regional pharmacopoeias, thus avoiding redundant testing in favor of a common testing strategy in each regulatory region. This guidance is in the form of an annex to the core guidance on the Q4B process entitled "Q4B Evaluation and Recommendation of Pharmacopoeial Texts for Use in the ICH Regions" (the core ICH Q4B guidance).

  14. International Conference on Harmonisation; Guidance on Q4B Evaluation and Recommendation of Pharmacopoeial Texts for Use in the International Conference on Harmonisation Regions; Annex 14 on Bacterial Endotoxins Test General Chapter; availability. Notice.

    Science.gov (United States)

    2013-10-23

    The Food and Drug Administration (FDA) is announcing the availability of a guidance entitled "Q4B Evaluation and Recommendation of Pharmacopoeial Texts for Use in the International Conference on Harmonisation Regions; Annex 14: Bacterial Endotoxins Test General Chapter." The guidance was prepared under the auspices of the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH). The guidance provides the results of the ICH Q4B evaluation of the Bacterial Endotoxins Test General Chapter harmonized text from each of the three pharmacopoeias (United States, European, and Japanese) represented by the Pharmacopoeial Discussion Group (PDG). The guidance conveys recognition of the three pharmacopoeial methods by the three ICH regulatory regions and provides specific information regarding the recognition. The guidance is intended to recognize the interchangeability between the local regional pharmacopoeias, thus avoiding redundant testing in favor of a common testing strategy in each regulatory region. The guidance is in the form of an annex to the core guidance on the Q4B process entitled "Q4B Evaluation and Recommendation of Pharmacopoeial Texts for Use in the ICH Regions" (core ICH Q4B guidance).

  15. Plasma bioavailability and regional brain distribution of polyphenols from apple/grape seed and bilberry extracts in a young swine model.

    Science.gov (United States)

    Chen, Tzu-Ying; Kritchevsky, Janice; Hargett, Katherine; Feller, Kathryn; Klobusnik, Ryan; Song, Brian J; Cooper, Bruce; Jouni, Zeina; Ferruzzi, Mario G; Janle, Elsa M

    2015-12-01

    The pharmacokinetics, bioavailability, and regional brain distribution of polyphenols from apple-grape seed extract (AGSE) mixture and bilberry extract were studied after 3 weeks of dosing in weanling pigs. Weanling piglets were treated for 3 weeks with extracts of (AGSE) or bilberry extracts, using a physiological (27.5 mg/kg) or supplement (82.5 mg/kg) dose. A 24-h pharmacokinetic study was conducted and brain tissue was harvested. Major flavan-3-ol and flavonol metabolites including catechin-O-β-glucuronide, epicatechin-O-β-glucuronide, 3'O-methyl-catechin-O-β-glucuronide, 3'O-methyl-epicatechin-O-β-glucuronide, quercetin-O-β-glucuronide, and O-methyl-quercetin-O-β-glucuronide were analyzed in plasma, urine, and regional brain extracts from AGSE groups. Anthocyanidin-O-galactosides and O-glucosides of delphinidin (Del), cyanidin (Cyn), petunidin (Pet), peonidin (Peo), and malvidin (Mal) were analyzed in plasma, urine, and brain extracts from bilberry groups. Significant plasma dose-dependence was observed in flavan-3-ol metabolites of the AGSE group and in Mal, Del and Cyn galactosides and Pet, Peo, and Cyn glucosides of the bilberry groups. In the brain, a significant dose dependence was found in the cerebellum and frontal cortex in all major flavan-3-ol metabolites. All anthocyanidin glycosides, except for delphinidin, showed a dose-dependent increase in the cerebellum. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. A methodology for semiautomatic taxonomy of concepts extraction from nuclear scientific documents using text mining techniques; Metodologia para extracao semiautomatica de uma taxonomia de conceitos a partir da producao cientifica da area nuclear utilizando tecnicas de mineracao de textos

    Energy Technology Data Exchange (ETDEWEB)

    Braga, Fabiane dos Reis

    2013-07-01

    This thesis presents a text mining method for the semi-automatic extraction of a taxonomy of concepts from a textual corpus composed of scientific papers related to the nuclear area. Text classification is a natural human practice and a crucial task when working with large repositories. Document clustering provides a logical and understandable framework that facilitates organization, browsing, and searching. Most clustering algorithms use the bag-of-words model to represent the content of a document. This model produces high-dimensional data, ignores the fact that different words can have the same meaning, and does not consider the relationships between words, assuming that they are independent of each other. The methodology combines a concept-based model for document representation with a hierarchical document clustering method that uses the co-occurrence frequency of concepts and a technique for labeling clusters with their most representative concepts, with the objective of producing a taxonomy of concepts that may reflect the structure of the knowledge domain. It is hoped that this work will contribute to the conceptual mapping of scientific production in the nuclear area and thus support the management of research activities in this area. (author)
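
    The bag-of-words limitations described in the abstract above can be seen in a minimal sketch. It is an illustration only, not the thesis's implementation: the function name `bag_of_words` and the three toy documents are hypothetical, and tokenization is reduced to lowercasing and whitespace splitting.

```python
from collections import Counter


def bag_of_words(documents):
    """Map each document to word counts over a shared, sorted vocabulary."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # One dimension per vocabulary word; absent words count as zero.
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


docs = ["reactor core cooling", "nuclear reactor safety", "atomic pile cooling"]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)
print(vectors)
```

    Every distinct word becomes its own dimension, so the vector length grows with the vocabulary, and words that may denote the same concept (here "reactor" and "pile") share no dimension. A concept-based representation of the kind the abstract proposes would map such words to a common feature before clustering.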

  17. Text Association Analysis and Ambiguity in Text Mining

    Science.gov (United States)

    Bhonde, S. B.; Paikrao, R. L.; Rahane, K. U.

    2010-11-01

    Text Mining is the process of analyzing a semantically rich document or set of documents to understand the content and meaning of the information they contain. Research in text mining will enhance humans' ability to process massive quantities of information, and it has high commercial value. The paper first introduces text mining (TM) and its definition, then gives an overview of the text mining process and its applications. Up to now, little research in text mining, especially in concept/entity extraction, has focused on the ambiguity problem. This paper addresses ambiguity issues in natural language texts and presents a new technique for resolving the ambiguity problem when extracting concepts/entities from texts. Finally, it shows the importance of TM in knowledge discovery and highlights the upcoming challenges of document mining and the opportunities it offers.

  18. The image enhancement and region of interest extraction of lobster-eye X-ray dangerous material inspection system

    Science.gov (United States)

    Zhan, Qi; Wang, Xin; Mu, Baozhong; Xu, Jie; Xie, Qing; Li, Yaran; Chen, Yifan; He, Yanan

    2016-10-01

    Dangerous materials inspection is an important technique for confirming dangerous materials crimes. It has a significant impact on the prohibition of dangerous materials-related crimes and the spread of dangerous materials. The Lobster-Eye Optical Imaging System is a dangerous materials detection device that mainly takes advantage of backscatter X-rays. The strength of the system is that it needs access to only one side of an object and can detect dangerous materials without disturbing the surroundings of the target material. The device uses Compton scattered X-rays to create computerized outlines of suspected objects during the security detection process. Because of the grid structure of the bionic objective lens, which imitates the eye of a lobster, the grids are the main source of image noise during the imaging process. At the same time, when the system is used to inspect structured or dense materials, the image is plagued by superposition artifacts and limited by attenuation and noise. With the goal of achieving high quality images that can be used for dangerous materials detection and further analysis, we developed effective image processing methods for the system. The first aspect of the image processing is denoising and edge contrast enhancement, in which we apply a deconvolution algorithm to remove the grids and other noise. After image processing, we achieve images with a high signal-to-noise ratio. The second part is the reconstruction of images acquired under low dose X-ray exposure conditions, for which we developed an interpolation method. The last aspect is the region of interest (ROI) extraction process, which can help identify dangerous materials mixed with complex backgrounds. The methods demonstrated in the paper have the potential to improve the sensitivity and quality of X-ray backscatter system imaging.

  19. An analytical framework for extracting hydrological information from time series of small reservoirs in a semi-arid region

    Science.gov (United States)

    Annor, Frank; van de Giesen, Nick; Bogaard, Thom; Eilander, Dirk

    2013-04-01

    small reservoirs in the Upper East Region of Ghana. Reservoirs without obvious large seepage losses (field survey) were selected. To verify this, stable water isotopic samples are collected from groundwater upstream and downstream from the reservoir. By looking at possible enrichment of downstream groundwater, a good estimate of seepage can be made in addition to estimates on evaporation. We estimated the evaporative losses and compared those with field measurements using eddy correlation measurements. Lastly, we determined the cumulative surface runoff curves for the small reservoirs. We will present this analytical framework for extracting hydrological information from time series of small reservoirs and show the first results for our study region of northern Ghana.

  20. Essential Oils Extracted Using Microwave-Assisted Hydrodistillation from Aerial Parts of Eleven Artemisia Species: Chemical Compositions and Diversities in Different Geographical Regions of Iran

    Directory of Open Access Journals (Sweden)

    Majid Mohammadhosseini

    2017-03-01

    Full Text Available This study aimed to assess the chemical compositions of essential oils (EOs) extracted through microwave-assisted hydrodistillation from aerial parts of 11 Artemisia species growing wild in different regions in Northern, Eastern, Western, and Central parts of Iran. The EOs were subsequently analyzed via GC and GC-MS. The percentage yields of the EOs varied over the range of 0.21-0.50 (w/w%). On the basis of these characterizations and spectral assignments, natural compounds including camphor, 1,8-cineole, camphene, α-pinene, β-pinene, β-thujone, and sabinene were the most abundant and frequent constituents among all studied chemical profiles. Accordingly, oxygenated monoterpenes, monoterpene hydrocarbons, and non-terpene hydrocarbons were the dominant groups of natural compounds in the chemical profiles of 13, 4, and 2 samples, respectively. Moreover, five chemotypes were identified using statistical analyses: camphene, α-pinene and β-pinene; 1,8-cineole; camphor and 1,8-cineole; camphor; and camphor and β-thujone.

  1. Ozone photochemistry in an oil and natural gas extraction region during winter: simulations of a snow-free season in the Uintah Basin, Utah

    Directory of Open Access Journals (Sweden)

    P. M. Edwards

    2013-09-01

    Full Text Available The Uintah Basin in northeastern Utah, a region of intense oil and gas extraction, experienced ozone (O3) concentrations above levels harmful to human health for multiple days during the winters of 2009–2010 and 2010–2011. These wintertime O3 pollution episodes occur during cold, stable periods when the ground is snow-covered, and have been linked to emissions from the oil and gas extraction process. The Uintah Basin Winter Ozone Study (UBWOS) was a field intensive in early 2012, whose goal was to address current uncertainties in the chemical and physical processes that drive wintertime O3 production in regions of oil and gas development. Although elevated O3 concentrations were not observed during the winter of 2011–2012, the comprehensive set of observations tests our understanding of O3 photochemistry in this unusual emissions environment. A box model, constrained to the observations and using the near-explicit Master Chemical Mechanism (MCM) v3.2 chemistry scheme, has been used to investigate the sensitivities of O3 production during UBWOS 2012. Simulations identify the O3 production photochemistry to be highly radical limited (with a radical production rate significantly smaller than the NOx emission rate). Production of OH from O3 photolysis (through reaction of O(1D) with water vapor) contributed only 170 pptv day−1, 8% of the total primary radical source on average (primary radicals being those produced from non-radical precursors). Other radical sources, including the photolysis of formaldehyde (HCHO, 52%), nitrous acid (HONO, 26%), and nitryl chloride (ClNO2, 13%), were larger. O3 production was also found to be highly sensitive to aromatic volatile organic compound (VOC) concentrations, due to radical amplification reactions in the oxidation scheme of these species. Radical production was shown to be small in comparison to the emissions of nitrogen oxides (NOx), such that NOx acted as the primary radical sink. Consequently, the system was

  2. A Customizable Text Classifier for Text Mining

    Directory of Open Access Journals (Sweden)

    Yun-liang Zhang

    2007-12-01

    Full Text Available Text mining deals with complex and unstructured texts. Usually a particular collection of texts that is specific to one or more domains is necessary. We have developed a customizable text classifier for users to mine the collection automatically. It derives from the sentence category of the HNC theory and corresponding techniques. It can start with a few texts, and it can adjust automatically or be adjusted by the user. The user can also control the number of domains chosen and decide the standard with which to choose the texts based on demand and abundance of materials. The performance of the classifier varies with the user's choice.

  3. DC parameter extraction of equivalent circuit model in InGaAsSb heterojunction bipolar transistors including non-ideal effects in the base region

    Science.gov (United States)

    Chang, Yang-Hua; Cheng, Zong-Tai

    2011-07-01

    This paper presents the DC parameter extraction of the equivalent circuit model in an InP-InGaAsSb double heterojunction bipolar transistor (HBT). The non-ideal collector current is modeled by a non-ideal doping distribution in the base region. Then several consequent non-ideal effects, which have always been neglected in typical HBTs, are studied using the Medici device simulator. Moreover, the associated DC parameters of the VBIC model are extracted accordingly. The equivalent circuit model is in good agreement with the measured data in I_C-V_CE characteristics.

  4. Gut Microbiota Analysis Results Are Highly Dependent on the 16S rRNA Gene Target Region, Whereas the Impact of DNA Extraction Is Minor.

    Science.gov (United States)

    Rintala, Anniina; Pietilä, Sami; Munukka, Eveliina; Eerola, Erkki; Pursiheimo, Juha-Pekka; Laiho, Asta; Pekkala, Satu; Huovinen, Pentti

    2017-04-01

    Next-generation sequencing (NGS) is currently the method of choice for analyzing gut microbiota composition. As gut microbiota composition is a potential future target for clinical diagnostics, it is of utmost importance to enhance and optimize the NGS analysis procedures. Here, we have analyzed the impact of DNA extraction and selected 16S rDNA primers on the gut microbiota NGS results. Bacterial DNA from frozen stool specimens was extracted with 5 commercially available DNA extraction kits. Special attention was paid to the semiautomated DNA extraction methods that could expedite the analysis procedure, thus being especially suitable for clinical settings. The microbial composition was analyzed with 2 distinct protocols: 1 targeting the V3-V4 and the other targeting the V4-V5 area of the bacterial 16S rRNA gene. The overall effect of DNA extraction on the gut microbiota 16S rDNA profile was relatively small, whereas the 16S rRNA gene target region had an immense impact on the results. Furthermore, semiautomated DNA extraction methods clearly appeared suitable for NGS procedures, proposing that application of these methods could importantly reduce hands-on time and human errors without compromising the validity of results.

  5. Text mining for systems biology.

    Science.gov (United States)

    Fluck, Juliane; Hofmann-Apitius, Martin

    2014-02-01

    Scientific communication in biomedicine is, by and large, still text based. Text mining technologies for the automated extraction of useful biomedical information from unstructured text that can be directly used for systems biology modelling have been substantially improved over the past few years. In this review, we underline the importance of named entity recognition and relationship extraction as fundamental approaches that are relevant to systems biology. Furthermore, we emphasize the role of publicly organized scientific benchmarking challenges that reflect the current status of text-mining technology and are important in moving the entire field forward. Given further interdisciplinary development of systems biology-orientated ontologies and training corpora, we expect a steadily increasing impact of text-mining technology on systems biology in the future. Copyright © 2013 Elsevier Ltd. All rights reserved.

  6. Uniform distributions of glucose oxidation and oxygen extraction in gray matter of normal human brain: No evidence of regional differences of aerobic glycolysis

    Science.gov (United States)

    Herman, Peter; Bailey, Christopher J; Møller, Arne; Globinsky, Ronen; Fulbright, Robert K; Rothman, Douglas L; Gjedde, Albert

    2016-01-01

    Regionally variable rates of aerobic glycolysis in brain networks identified by resting-state functional magnetic resonance imaging (R-fMRI) imply regionally variable adenosine triphosphate (ATP) regeneration. When regional glucose utilization is not matched to oxygen delivery, affected regions have correspondingly variable rates of ATP and lactate production. We tested the extent to which aerobic glycolysis and oxidative phosphorylation power R-fMRI networks by measuring quantitative differences between the oxygen to glucose index (OGI) and the oxygen extraction fraction (OEF) as measured by positron emission tomography (PET) in normal human brain (resting awake, eyes closed). Regionally uniform and correlated OEF and OGI estimates prevailed, with network values that matched the gray matter means, regardless of size, location, and origin. The spatial agreement between oxygen delivery (OEF≈0.4) and glucose oxidation (OGI ≈ 5.3) suggests that no specific regions have preferentially high aerobic glycolysis and low oxidative phosphorylation rates, with globally optimal maximum ATP turnover rates (VATP ≈ 9.4 µmol/g/min), in good agreement with 31P and 13C magnetic resonance spectroscopy measurements. These results imply that the intrinsic network activity in healthy human brain powers the entire gray matter with ubiquitously high rates of glucose oxidation. Reports of departures from normal brain-wide homogeny of oxygen extraction fraction and oxygen to glucose index may be due to normalization artefacts from relative PET measurements. PMID:26755443

  7. Uniform distributions of glucose oxidation and oxygen extraction in gray matter of normal human brain: No evidence of regional differences of aerobic glycolysis.

    Science.gov (United States)

    Hyder, Fahmeed; Herman, Peter; Bailey, Christopher J; Møller, Arne; Globinsky, Ronen; Fulbright, Robert K; Rothman, Douglas L; Gjedde, Albert

    2016-05-01

    Regionally variable rates of aerobic glycolysis in brain networks identified by resting-state functional magnetic resonance imaging (R-fMRI) imply regionally variable adenosine triphosphate (ATP) regeneration. When regional glucose utilization is not matched to oxygen delivery, affected regions have correspondingly variable rates of ATP and lactate production. We tested the extent to which aerobic glycolysis and oxidative phosphorylation power R-fMRI networks by measuring quantitative differences between the oxygen to glucose index (OGI) and the oxygen extraction fraction (OEF) as measured by positron emission tomography (PET) in normal human brain (resting awake, eyes closed). Regionally uniform and correlated OEF and OGI estimates prevailed, with network values that matched the gray matter means, regardless of size, location, and origin. The spatial agreement between oxygen delivery (OEF≈0.4) and glucose oxidation (OGI ≈ 5.3) suggests that no specific regions have preferentially high aerobic glycolysis and low oxidative phosphorylation rates, with globally optimal maximum ATP turnover rates (VATP ≈ 9.4 µmol/g/min), in good agreement with (31)P and (13)C magnetic resonance spectroscopy measurements. These results imply that the intrinsic network activity in healthy human brain powers the entire gray matter with ubiquitously high rates of glucose oxidation. Reports of departures from normal brain-wide homogeny of oxygen extraction fraction and oxygen to glucose index may be due to normalization artefacts from relative PET measurements. © The Author(s) 2016.

  8. Geological lineament mapping in arid area by semi-automatic extraction from satellite images: example at the El Kseïbat region (Algerian Sahara)

    Energy Technology Data Exchange (ETDEWEB)

    Hammad, N.; Djidel, M.; Maabedi, N.

    2016-07-01

    Geologists in charge of detailed lineament mapping in arid and desert areas face the extent of the land and the abundance of eolian deposits. This study presents a semi-automatic approach to lineament extraction which differs from other methods, such as automatic extraction and manual extraction, by being both fast and objective. It consists of a series of digital processing steps (textural and spatial filtering, binarization by thresholding, mathematical morphology, etc.) applied to a Landsat 7 ETM+ scene. This semi-automatic approach has produced a detailed map of lineaments, while taking account of the tectonic directions recognized in the region. It helps mitigate the effect of dune deposits and meets the specifications of the arid environment. The visual validation of these linear structures, by geoscientists and field data, allowed the identification of the majority of structural lineaments, or at least those that are geologically confirmed. (Author)

  9. Text mining by Tsallis entropy

    Science.gov (United States)

    Jamaati, Maryam; Mehri, Ali

    2018-01-01

    Long-range correlations between the elements of natural languages enable them to convey very complex information. The complex structure of human language, as a manifestation of natural languages, motivates us to apply nonextensive statistical mechanics in text mining. Tsallis entropy appropriately ranks the terms' relevance to the document subject, taking advantage of their spatial correlation length. We apply this statistical concept as a new powerful word ranking metric in order to extract keywords of a single document. We carry out an experimental evaluation, which shows the capability of the presented method in keyword extraction. We find that Tsallis entropy has reliable word ranking performance, at the same level as the best previous ranking methods.
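    As a rough illustration of the quantity this abstract relies on, the sketch below computes the standard Tsallis entropy S_q = (1 - sum(p_i^q)) / (q - 1) and applies it to how one word's occurrences are spread over equal-sized windows of a text. The windowing scheme, the function names, and the choice q = 2 are assumptions made for illustration only; the abstract does not spell out the authors' ranking procedure, and this is not a reconstruction of it.

```python
import math


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q = (1 - sum(p_i**q)) / (q - 1).

    For q close to 1 the Shannon entropy (natural log) is returned,
    which is the limit of S_q as q -> 1.
    """
    probs = [p for p in probs if p > 0]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def positional_entropy(words, target, n_windows=4, q=2.0):
    """Hypothetical helper: entropy of a word's spread across text windows.

    The text is cut into n_windows equal-sized windows and the share of
    the target word's occurrences falling in each window is treated as a
    probability distribution. Evenly spread words score higher than
    words clustered in a few windows.
    """
    counts = [0] * n_windows
    for i, w in enumerate(words):
        if w == target:
            counts[i * n_windows // len(words)] += 1
    total = sum(counts)
    if total == 0:
        return 0.0  # word absent: nothing to measure
    return tsallis_entropy([c / total for c in counts], q)


if __name__ == "__main__":
    spread = ["a", "b"] * 4          # "a" occurs once in every window
    clustered = ["a"] * 4 + ["b"] * 4  # "a" occurs only in the first half
    print(positional_entropy(spread, "a"))
    print(positional_entropy(clustered, "a"))
```

    In this toy setup a word that clusters in part of the text gets a lower value than one spread evenly, which is the kind of contrast a keyword ranking could exploit.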

  10. Monitoring interaction and collective text production through text mining

    Directory of Open Access Journals (Sweden)

    Macedo, Alexandra Lorandi

    2014-04-01

    Full Text Available This article presents the Concepts Network tool, developed using text mining technology. The main objective of this tool is to extract and relate terms of greatest incidence from a text and exhibit the results in the form of a graph. The Network was implemented in the Collective Text Editor (CTE), which is an online tool that allows the production of texts in synchronized or non-synchronized forms. This article describes the application of the Network both in texts produced collectively and texts produced in a forum. The purpose of the tool is to offer support to the teacher in managing the high volume of data generated in the process of interaction amongst students and in the construction of the text. Specifically, the aim is to facilitate the teacher’s job by allowing him/her to process data in a shorter time than is currently demanded. The results suggest that the Concepts Network can aid the teacher, as it provides indicators of the quality of the text produced. Moreover, messages posted in forums can be analyzed without their content necessarily having to be pre-read.

  11. Assessing Mobile Phone Access and Perceptions for Texting-Based mHealth Interventions Among Expectant Mothers and Child Caregivers in Remote Regions of Northern Kenya: A Survey-Based Descriptive Study.

    Science.gov (United States)

    Kazi, Abdul Momin; Carmichael, Jason-Louis; Hapanna, Galgallo Waqo; Wangoo, Patrick Gikaria; Karanja, Sarah; Wanyama, Denis; Muhula, Samuel Opondo; Kyomuhangi, Lennie Bazira; Loolpapit, Mores; Wangalwa, Gilbert Bwire; Kinagwi, Koki; Lester, Richard Todd

    2017-01-30

    With a dramatic increase in mobile phone use in low- and middle-income countries, mobile health (mHealth) has great potential to connect health care services directly to participants enrolled and improve engagement of care. Rural and remote global settings may pose both significant challenges and opportunities. The objective of our study was to understand the demographics, phone usage and ownership characteristics, and feasibility among patients in rural and remote areas of Kenya of having text messaging (short messaging service, SMS)-based mHealth intervention for improvements in antenatal care attendance and routine immunization among children in Northern Kenya. A survey-based descriptive study was conducted between October 2014 and February 2015 at 8 health facilities in Northern Kenya as part of a program to scale up an mHealth service in rural and remote regions. The study was conducted at 6 government health facilities in Isiolo, Marsabit, and Samburu counties in remote and northern arid lands (NAL). Two less remote health facilities in Laikipia and Meru counties in more populated central highlands were included as comparison sites. A total of 284 participants were surveyed; 63.4% (180/284) were from NAL clinics, whereas 36.6% (104/284) were from adjacent central highland clinics. In the NAL, almost half (48.8%, 88/180) reported no formal education and 24.4% (44/180) self-identified as nomads. The majority of participants from both regions had access to mobile phone: 99.0% (103/104) of participants from central highlands and 82.1% (147/180) of participants from NAL. Among those who had access to a phone, there were significant differences in network challenges and technology literacy between the 2 regions. However, there was no significant difference in the proportion of participants from NAL and central highlands who indicated that they would like to receive a weekly SMS text message from their health care provider (90.0% vs 95.0%; P=.52). Overall, 92

  12. Extraction of Active Regions and Coronal Holes from EUV Images Using the Unsupervised Segmentation Method in the Bayesian Framework

    Science.gov (United States)

    Arish, S.; Javaherian, M.; Safari, H.; Amiri, A.

    2016-04-01

    The solar corona is the origin of very dynamic events that are mostly produced in active regions (AR) and coronal holes (CH). The exact location of these large-scale features can be determined by applying image-processing approaches to extreme-ultraviolet (EUV) data.

  13. download full text

    African Journals Online (AJOL)

    With these examples, the author depicts the horror associated with abortion, an illegal act. Extract 5 expresses Dr. Abe's excitement over the arrest of Cosmos, the wanted leader of an armed robbery gang that has succeeded in robbing the bank in Gom, carting away millions of naira. Soon after the robbery incident, Cosmos ...

  14. XML and Free Text.

    Science.gov (United States)

    Riggs, Ken Roger

    2002-01-01

    Discusses problems with marking free text, text that is either natural language or semigrammatical but unstructured, that prevent well-formed XML from marking text for readily available meaning. Proposes a solution to mark meaning in free text that is consistent with the intended simplicity of XML versus SGML. (Author/LRW)

  15. Contextual Text Mining

    Science.gov (United States)

    Mei, Qiaozhu

    2009-01-01

    With the dramatic growth of text information, there is an increasing need for powerful text mining systems that can automatically discover useful knowledge from text. Text is generally associated with all kinds of contextual information. Those contexts can be explicit, such as the time and the location where a blog article is written, and the…

  16. SparkText: Biomedical Text Mining on Big Data Framework.

    Directory of Open Access Journals (Sweden)

    Zhan Ye

    Full Text Available Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.

  17. Bone Ceramic® at Implants Installed Immediately into Extraction Sockets in the Molar Region: An Experimental Study in Dogs.

    Science.gov (United States)

    Pereira, Flávia Priscila; Hochuli-Vieira, Eduardo; Maté Sánchez de Val, José E; De Santis, Enzo; Salata, Luiz Antonio; Botticelli, Daniele

    2016-04-01

    The aim of this paper was to study the healing of 1-1.4 mm wide buccal defects at implants placed immediately into extraction sockets (IPIES) filled with a mixture of synthetic hydroxyapatite (HA) 60% and beta-tricalciumphosphate (TCP) 40% or left with the clot alone and both covered with collagen membranes. Eight Labrador dogs were used and implants were placed immediately into the extraction sockets of the first molar bilaterally. A mixture of synthetic HA 60% and beta-TCP 40% at the test or the clot alone at the control sites were used to fill the defects. All surgical sites were subsequently covered by a resorbable collagen membrane and a non-submerged healing was allowed. After 4 months, the animals were euthanized, biopsies harvested and processed for histomorphometric analysis. At the time of installation, residual buccal defects occurred that were 1.1 mm and 1.4 mm wide and 3 mm and 4 mm deep at the control and test sites, respectively. After 4 months of healing, the top of the bony crest and the coronal level of osseointegration were located respectively at 0.1 ± 1.8 mm and 1.5 ± 1.8 mm at the test, and 0.6 ± 1.6 mm and 1.2 ± 0.7 mm at the control sites apically to the implant shoulder. Bone-to-implant contact at the buccal aspect was 34.9 ± 25.9% and 36.4 ± 17.3% at the test and control sites, respectively. No statistically significant differences were found between test and control sites for any of the variables analyzed at the buccal aspects. The use of a mixture of synthetic HA 60% and beta-TCP 40% to fill residual buccal defects 1-1.4 mm wide at IPIES did not improve significantly the results of healing. © 2015 Wiley Periodicals, Inc.

  18. E-text

    DEFF Research Database (Denmark)

    Finnemann, Niels Ole

    2018-01-01

    the print medium, rather than written text or speech. In late 20th century, the notion of text was subject to increasing criticism as in the question raised within literary text theory: is there a text in this class? At the same time, the notion was expanded by including extra linguistic sign modalities...... (images, videos). Thus, a basic question is this: should electronic text be included in the expanded notion of text as a new digital sign modality added to the repertoire of modalities, or should it be included as a sign modality, which is both an independent modality and a container in which other...

  19. SparkText: Biomedical Text Mining on Big Data Framework.

    Science.gov (United States)

    Ye, Zhan; Tafti, Ahmad P; He, Karen Y; Wang, Kai; He, Max M

    Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.

  20. SparkText: Biomedical Text Mining on Big Data Framework

    Science.gov (United States)

    He, Karen Y.; Wang, Kai

    2016-01-01

    Background Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. Results In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. Conclusions This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research. PMID:27685652

  1. Searching for text documents

    NARCIS (Netherlands)

    Hiemstra, Djoerd; Blanken, Henk; de Vries, A.P.; Blok, H.E.; Feng, L.

    2007-01-01

    Many documents contain, besides text, also images, tables, and so on. This chapter concentrates on the text part only. Traditionally, systems handling text documents are called information storage and retrieval systems. Before the World-Wide Web emerged, such systems were almost exclusively used by

  2. A new method for extracting the ENSO-independent Indian Ocean Dipole: application to Australian region tropical cyclone counts

    Energy Technology Data Exchange (ETDEWEB)

    Werner, Angelika; Maharaj, Angela M. [Macquarie University, Department of Environment and Geography, Sydney, NSW (Australia); Holbrook, Neil J. [University of Tasmania, School of Geography and Environmental Studies, Hobart, TAS (Australia)

    2012-06-15

    We introduce a simple but effective means of removing ENSO-related variations from the Indian Ocean Dipole (IOD) in order to better evaluate the ENSO-independent IOD contribution to Australian climate - specifically here interannual variations in Australian region tropical cyclogenesis (TCG) counts. The ENSO time contribution is removed from the Indian Ocean Dipole Mode index (DMI) by first calculating the lagged regression of the DMI on the sea surface temperature anomaly (SSTA) index NINO3.4 to maximum lags of 8 months, and then removing this ENSO portion. The new ENSO-independent time series, DMI_NOENSO, correlates strongly with the original DMI at r = 0.87 (significant at >99% level). Despite the strength of the correlation between these series, the IOD events classified based on DMI_NOENSO provide important differences from previously identified IOD events, which are more closely aligned with ENSO phases. IOD event composite maps of SSTAs regressed on DMI_NOENSO reveal a much greater ENSO-independence than the original DMI-related SSTA pattern. This approach is used to explore relationships between Australian region TCG and IOD from 1968 to 2007. While we show that both the DMI and DMI_NOENSO have significant hindcast skill (at the 95% level) when used as predictors in a multiple linear regression model for Australian region annual TCG counts, the IOD does not add any significant hindcast skill over an ENSO-only predictor model, based on NINO4. Correlations between the time series of annual TCG count observations and ENSO + IOD model cross-validated hindcasts achieve r = 0.68 (significant at the 99% level). (orig.)
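    The regression-and-removal step described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it assumes monthly series of equal length, lets NINO3.4 lead the DMI by 0 to 8 months, picks the single lag with the largest absolute correlation, and subtracts the fitted linear portion. The abstract does not say how the lags were combined, and all function names here are hypothetical.

```python
import math


def _mean(xs):
    return sum(xs) / len(xs)


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    mx, my = _mean(x), _mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def correlation(x, y):
    """Pearson correlation coefficient (series must not be constant)."""
    mx, my = _mean(x), _mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def remove_enso(dmi, nino34, max_lag=8):
    """Return (residual, best_lag).

    residual[k] is dmi[t] minus the part linearly explained by
    nino34[t - best_lag], for t = max_lag .. len(dmi) - 1.
    The first max_lag months are dropped so every lag uses the same
    number of samples.
    """
    n = len(dmi)
    y = dmi[max_lag:]
    best_lag, best_r = 0, -1.0
    for lag in range(max_lag + 1):
        x = nino34[max_lag - lag:n - lag]  # nino34 shifted back by `lag`
        r = abs(correlation(x, y))
        if r > best_r:
            best_lag, best_r = lag, r
    x = nino34[max_lag - best_lag:n - best_lag]
    slope, intercept = linear_fit(x, y)
    residual = [b - (intercept + slope * a) for a, b in zip(x, y)]
    return residual, best_lag


if __name__ == "__main__":
    # Synthetic data: the "DMI" is fully determined by NINO3.4 three months earlier.
    nino = [math.sin(0.7 * t) + 0.3 * math.cos(2.1 * t) for t in range(60)]
    dmi = [0.0] * 3 + [2.0 * nino[t - 3] + 1.0 for t in range(3, 60)]
    residual, lag = remove_enso(dmi, nino)
    print(lag, max(abs(e) for e in residual))
```

    On real indices the residual would not vanish as it does for this synthetic series; what remains after the subtraction is the candidate ENSO-independent signal.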

  3. Using object-based image analysis to conduct high-resolution conifer extraction at regional spatial scales

    Science.gov (United States)

    Coates, Peter S.; Gustafson, K. Benjamin; Roth, Cali L.; Chenaille, Michael P.; Ricca, Mark A.; Mauch, Kimberly; Sanchez-Chopitea, Erika; Kroger, Travis J.; Perry, William M.; Casazza, Michael L.

    2017-08-10

    The distribution and abundance of pinyon (Pinus monophylla) and juniper (Juniperus osteosperma, J. occidentalis) trees (hereinafter, "pinyon-juniper") in sagebrush (Artemisia spp.) ecosystems of the Great Basin in the Western United States has increased substantially since the late 1800s. Distributional expansion and infill of pinyon-juniper into sagebrush ecosystems threatens the ecological function and economic viability of these ecosystems within the Great Basin, and is now a major contemporary challenge facing land and wildlife managers. Particularly, pinyon-juniper encroachment into intact sagebrush ecosystems has been identified as a primary threat facing populations of greater sage-grouse (Centrocercus urophasianus; hereinafter, "sage-grouse"), which is a sagebrush obligate species. Even seemingly innocuous scatterings of isolated pinyon-juniper in an otherwise intact sagebrush landscape can negatively affect survival and reproduction of sage-grouse. Therefore, accurate and high-resolution maps of pinyon-juniper distribution and abundance (indexed by canopy cover) across broad geographic extents would help guide land management decisions that better target areas for pinyon-juniper removal projects (for example, fuel reduction, habitat improvement for sage-grouse, and other sagebrush species) and facilitate science that further quantifies ecological effects of pinyon-juniper encroachment on sage-grouse populations and sagebrush ecosystem processes. Hence, we mapped pinyon-juniper (referred to as conifers for actual mapping) at a 1 × 1-meter (m) high resolution across the entire range of previously mapped sage-grouse habitat in Nevada and northeastern California. We used digital orthophoto quad tiles from the National Agriculture Imagery Program (2010, 2013) as base imagery, and then classified conifers using automated feature extraction methodology with the program Feature Analyst™. This method relies on machine learning algorithms that extract features from

  4. Vocabulary Constraint on Texts

    Directory of Open Access Journals (Sweden)

    C. Sutarsyah

    2008-01-01

    Full Text Available This case study was carried out in the English Education Department of the State University of Malang. The aim of the study was to identify and describe the vocabulary in the reading texts and to determine whether the texts are useful for reading skill development. A descriptive qualitative design was applied to obtain the data. For this purpose, some available computer programs were used to describe the vocabulary in the texts. It was found that the 20 texts, containing 7,945 words, are dominated by low frequency words, which account for 16.97% of the words in the texts. The high frequency words occurring in the texts were dominated by function words. In the case of word levels, it was found that the texts have a very limited number of words from the GSL (General Service List of English Words) (West, 1953). The proportion of the first 1,000 words of the GSL only accounts for 44.6%. The data also show that the texts contain too large a proportion of words which are not in the three levels (the first 2,000 and UWL). These words account for 26.44% of the running words in the texts. It is believed that the constraints are due to the selection of the texts, which are made up of a series of short, unrelated texts. This kind of text is subject to the accumulation of low frequency words, especially content words, and to a limited number of words from the GSL. It could also defeat the development of students' reading skills and vocabulary enrichment.
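    The coverage figures quoted in this abstract (for example, the share of running words that fall within a given word list) come from a simple ratio, which might be computed along these lines. This is a minimal sketch: the tokenizer, the function name, and the toy word list are assumptions for illustration, and the list below is not the GSL. The abstract does not name the programs the study actually used.

```python
import re


def coverage(text, word_list):
    """Percentage of running words (tokens) in `text` found in `word_list`.

    Tokens are lower-cased runs of letters and apostrophes; every
    occurrence counts, so frequent function words weigh heavily.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = {w.lower() for w in word_list}
    hits = sum(1 for t in tokens if t in known)
    return 100.0 * hits / len(tokens)


if __name__ == "__main__":
    toy_list = ["the", "on"]  # stand-in for a real frequency list
    print(coverage("The cat sat on the mat", toy_list))  # 3 of 6 tokens
```

    Running the same function with successive frequency bands of a real word list would give per-level proportions of the kind reported in the abstract.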

  5. Text Mining for Protein Docking.

    Directory of Open Access Journals (Sweden)

    Varsha D Badal

    2015-12-01

    Full Text Available The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound

  6. The Vicissitudes of Text

    Directory of Open Access Journals (Sweden)

    Jonathan CULLER

    2003-06-01

    Full Text Available The concept of text, which has been central to literary studies, has undergone many mutations, as it has traveled from the work of classical philologists, for whom it was and is the object of a powerful disciplinary formation, to postmodern theorists of the text, for whom, the concept might be summed up by the title of a fine book by John Mowatt: Text: the Genealogy of an Antidisciplinary Object. Of course, the interesting thing about a travelling concept is not that it travels — travelers, t...

  7. Instant Sublime Text starter

    CERN Document Server

    Haughee, Eric

    2013-01-01

    A starter which teaches the basic tasks to be performed with Sublime Text with the necessary practical examples and screenshots. This book requires only basic knowledge of the Internet and basic familiarity with any one of the three major operating systems, Windows, Linux, or Mac OS X. However, as Sublime Text 2 is primarily a text editor for writing software, many of the topics discussed will be specifically relevant to software development. That being said, the Sublime Text 2 Starter is also suitable for someone without a programming background who may be looking to learn one of the tools of

  8. E-text

    DEFF Research Database (Denmark)

    Finnemann, Niels Ole

    2018-01-01

    the print medium, rather than written text or speech. In late 20th century, the notion of text was subject to increasing criticism as in the question raised within literary text theory: is there a text in this class? At the same time, the notion was expanded by including extra linguistic sign modalities....... This wider notion would include, for instance, all sorts of scanning results, whether of the outer cosmos or the inner geographies of our bodies, and of digital traces of other processes in between these (machine readings included). Since alphabets, like the genetic alphabet, and all sorts of images may...

  9. Systematic text condensation

    DEFF Research Database (Denmark)

    Malterud, Kirsti

    2012-01-01

    To present background, principles, and procedures for a strategy for qualitative analysis called systematic text condensation and discuss this approach compared with related strategies.

  10. Linguistics in Text Interpretation

    DEFF Research Database (Denmark)

    Togeby, Ole

    2011-01-01

    A model for how text interpretation proceeds from what is pronounced, through what is said, to what is communicated, and a definition of the concepts 'presupposition' and 'implicature'.

  11. YORUBA, INTERMEDIATE TEXTS.

    Science.gov (United States)

    MCCLURE, H. DAVID; OYEWALE, JOHN O.

    THIS COURSE IS BASED ON A SERIES OF BRIEF MONOLOGUES RECORDED BY A WESTERN-EDUCATED NATIVE SPEAKER OF YORUBA FROM THE OYO AREA. THE TAPES CONSTITUTE THE CENTRAL PART OF THE COURSE, WITH THE TEXT INTENDED AS SUPPLEMENTARY AND AUXILIARY MATERIAL. THE TEXT TOPICS WERE CHOSEN FOR THEIR SPECIAL RELEVANCE TO PEACE CORPS VOLUNTEERS WHO EXPECT TO USE…

  12. Making Sense of Texts

    Science.gov (United States)

    Harper, Rebecca G.

    2014-01-01

    This article addresses the triadic nature regarding meaning construction of texts. Grounded in Rosenblatt's (1995; 1998; 2004) Transactional Theory, research conducted in an undergraduate Language Arts curriculum course revealed that when presented with unfamiliar texts, students used prior experiences, social interactions, and literary strategies…

  13. download full text

    African Journals Online (AJOL)

    Oita Etyang

    The paper uses historical trajectory to demonstrate how patronage, ethnicity, electoral authoritarianism and extension of presidential term limit erodes democratic gains in Africa. The paper ..... International, regional and national efforts need to go beyond rhetoric to safe guard the sanctity of constitutional provisions. African ...

  14. Knowledge discovery data and text mining

    CERN Document Server

    Olmer, Petr

    2008-01-01

    Data mining and text mining refer to techniques, models, algorithms, and processes for knowledge discovery and extraction. Basic definitions are given together with the description of a standard data mining process. Common models and algorithms are presented. Attention is given to text clustering, how to convert unstructured text to structured data (vectors), and how to compute their importance and position within clusters.
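
    The conversion of unstructured text to vectors mentioned in this abstract can be illustrated with a minimal sketch. It assumes a TF-IDF weighting and cosine similarity, which are common choices but are not confirmed by the record itself; the function names `tokenize`, `tfidf_vectors`, and `cosine` and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on non-letters (a deliberately crude tokenizer)."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]


def tfidf_vectors(docs):
    """Return (vocabulary, one TF-IDF vector per document).

    Assumption: TF is the relative term frequency and IDF is log(N / df).
    """
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n_docs / df[t]) for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity, a common way to compare a document with a cluster centre."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    vocab, vectors = tfidf_vectors(["red apple", "red car", "green apple"])
    print(vocab)
    print(cosine(vectors[0], vectors[1]))
```

    Once documents are vectors of this kind, any standard clustering method can operate on them; the weighting scheme is the part most likely to differ in a real system.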

  15. The Vicissitudes of Text

    OpenAIRE

    Culler, Jonathan

    2011-01-01

    The concept of text, which has been central to literary studies, has undergone many mutations, as it has traveled from the work of classical philologists, for whom it was and is the object of a powerful disciplinary formation, to postmodern theorists of the text, for whom, the concept might be summed up by the title of a fine book by John Mowatt: Text: the Genealogy of an Antidisciplinary Object. Of course, the interesting thing about a travelling concept is not that it travels — travelers, t...

  16. The Vicissitudes of Text

    OpenAIRE

    Culler, Jonathan

    2003-01-01

    The concept of text, which has been central to literary studies, has undergone many mutations, as it has traveled from the work of classical philologists, for whom it was and is the object of a powerful disciplinary formation, to postmodern theorists of the text, for whom, the concept might be summed up by the title of a fine book by John Mowatt: Text: the Genealogy of an Antidisciplinary Object. Of course, the interesting thing about a travelling concept is not that it travels — travelers, t...

  17. Text, Hypertext, and Hyperfiction

    Directory of Open Access Journals (Sweden)

    Ladan Modir

    2014-03-01

    Full Text Available This article briefly surveys the changing theoretical perspectives on text from structuralism to poststructuralism and how they are subsequently accounted for by hypertext theorists to comprehend the emerging genre called hypertext fiction. Some theoretical issues concerning the reading of this genre also will be discussed. The purpose of this study is to illustrate that the radical promises and challenges of digital novels to readers would prove reading and interpretation of conventional texts are far more participatory. This will be accomplished by tracing the evolution of poststructuralists’ concepts of intertextuality, multivocality, decentering, multilinearity, disorientation, and interactivity to find a way out of constant notions of conventional principles of reading.

  18. Texting on the Move

    Science.gov (United States)

    ... about when and where we text. What's the Big Deal? The problem is multitasking. No matter how ... person again. Reviewed by: Mary L. Gavin, MD Date reviewed: October 2013 More on this topic for: ...

  19. Text Mining for Protein Docking.

    Science.gov (United States)

    Badal, Varsha D; Kundrotas, Petras J; Vakser, Ilya A

    2015-12-01

    The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied the text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~ 25% complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound benchmark set

  20. Machine Translation from Text

    Science.gov (United States)

    Habash, Nizar; Olive, Joseph; Christianson, Caitlin; McCary, John

    Machine translation (MT) from text, the topic of this chapter, is perhaps the heart of the GALE project. Beyond being a well defined application that stands on its own, MT from text is the link between the automatic speech recognition component and the distillation component. The focus of MT in GALE is on translating from Arabic or Chinese to English. The three languages represent a wide range of linguistic diversity and make the GALE MT task rather challenging and exciting.

  1. Plagiarism in Academic Texts

    Directory of Open Access Journals (Sweden)

    Marta Eugenia Rojas-Porras

    2012-08-01

    Full Text Available The ethical and social responsibility of citing the sources in a scientific or artistic work is undeniable. This paper explores, in a preliminary way, academic plagiarism in its various forms. It includes findings based on a forensic analysis. The purpose of this paper is to raise awareness on the importance of considering these details when writing and publishing a text. Hopefully, this analysis may put the issue under discussion.

  2. Psychologically Motivated Text Mining

    OpenAIRE

    Shutova, Ekaterina; Lichtenstein, Patricia

    2016-01-01

    Natural language processing techniques are increasingly applied to identify social trends and predict behavior based on large text collections. Existing methods typically rely on surface lexical and syntactic information. Yet, research in psychology shows that patterns of human conceptualisation, such as metaphorical framing, are reliable predictors of human expectations and decisions. In this paper, we present a method to learn patterns of metaphorical framing from large text collections, us...

  3. Translation of Quantum Texts

    OpenAIRE

    Espinoza, Randall; Imbo, Tom; Lopata, Paul

    2004-01-01

    In the companion to this paper, we described a generalization of the deterministic quantum cloning process, called enscription, which utilizes entanglement in order to achieve the "copying" of (certain) sets of distinct quantum states which are not orthogonal, called texts. Here we provide a further generalization, called translation, which allows us to completely determine all translatable texts, and which displays an intimate relationship to the mathematical theory of graphs.

  4. Locative inferences in medical texts.

    Science.gov (United States)

    Mayer, P S; Bailey, G H; Mayer, R J; Hillis, A; Dvoracek, J E

    1987-06-01

    Medical research relies on epidemiological studies conducted on a large set of clinical records that have been collected from physicians recording individual patient observations. These clinical records are recorded for the purpose of individual care of the patient with little consideration for their use by a biostatistician interested in studying a disease over a large population. Natural language processing of clinical records for epidemiological studies must deal with temporal, locative, and conceptual issues. This makes text understanding and data extraction of clinical records an excellent area for applied research. While much has been done in making temporal or conceptual inferences in medical texts, parallel work in locative inferences has not been done. This paper examines the locative inferences as well as the integration of temporal, locative, and conceptual issues in the clinical record understanding domain by presenting an application that utilizes two key concepts in its parsing strategy--a knowledge-based parsing strategy and a minimal lexicon.

  5. Automatic inpainting scheme for video text detection and removal.

    Science.gov (United States)

    Mosleh, Ali; Bouguila, Nizar; Ben Hamza, Abdessamad

    2013-11-01

    We present a two stage framework for automatic video text removal to detect and remove embedded video texts and fill-in their remaining regions by appropriate data. In the video text detection stage, text locations in each frame are found via an unsupervised clustering performed on the connected components produced by the stroke width transform (SWT). Since SWT needs an accurate edge map, we develop a novel edge detector which benefits from the geometric features revealed by the bandlet transform. Next, the motion patterns of the text objects of each frame are analyzed to localize video texts. The detected video text regions are removed, then the video is restored by an inpainting scheme. The proposed video inpainting approach applies spatio-temporal geometric flows extracted by bandlets to reconstruct the missing data. A 3D volume regularization algorithm, which takes advantage of bandlet bases in exploiting the anisotropic regularities, is introduced to carry out the inpainting task. The method does not need extra processes to satisfy visual consistency. The experimental results demonstrate the effectiveness of both our proposed video text detection approach and the video completion technique, and consequently the entire automatic video text removal and restoration process.

  6. The earliest medical texts.

    Science.gov (United States)

    Frey, E F

    The first civilization known to have had an extensive study of medicine and to leave written records of its practices and procedures was that of ancient Egypt. The oldest extant Egyptian medical texts are six papyri from the period between 2000 B.C. and 1500 B.C.: the Kahun Medical Papyrus, the Ramesseum IV and Ramesseum V Papyri, the Edwin Smith Surgical Papyrus, The Ebers Medical Papyrus and the Hearst Medical Papyrus. These texts, most of them based on older texts dating possibly from 3000 B.C., are comparatively free of the magician's approach to treating illness. Egyptian medicine influenced the medicine of neighboring cultures, including the culture of ancient Greece. From Greece, its influence spread onward, thereby affecting Western civilization significantly.

  7. Stemming Malay Text and Its Application in Automatic Text Categorization

    Science.gov (United States)

    Yasukawa, Michiko; Lim, Hui Tian; Yokoo, Hidetoshi

    In the Malay language, there are no conjugations or declensions, and affixes have important grammatical functions. In Malay, the same word may function as a noun, an adjective, an adverb, or a verb, depending on its position in the sentence. Although simple root words are used extensively in informal conversation, it is essential to use the precise words in formal speech or written texts. In Malay, to make sentences clear, derivative words are used. Derivation is achieved mainly by the use of affixes. There are approximately a hundred possible derivative forms of a root word in the written language of the educated Malay. Therefore, the composition of Malay words may be complicated. Although there are several types of stemming algorithms available for text processing in English and some other languages, they cannot be used to overcome the difficulties in Malay word stemming. Stemming is the process of reducing various words to their root forms in order to improve the effectiveness of text processing in information systems. It is essential to avoid both over-stemming and under-stemming errors. We have developed a new Malay stemmer (stemming algorithm) for removing inflectional and derivational affixes. Our stemmer uses a set of affix rules and two types of dictionaries: a root-word dictionary and a derivative-word dictionary. The set of rules is aimed at reducing the occurrence of under-stemming errors, while the dictionaries are believed to reduce the occurrence of over-stemming errors. We performed an experiment to evaluate the application of our stemmer in text mining software. For the experiment, the text data used were actual web pages collected from the World Wide Web to demonstrate the effectiveness of our Malay stemming algorithm. The experimental results showed that our stemmer can effectively increase the precision of the extracted Boolean expressions for text categorization.
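
    The combination of affix rules with a root-word dictionary and a derivative-word dictionary can be sketched roughly as follows. This is not the authors' stemmer: the tiny dictionaries, the affix lists, and the function name `stem` are all hypothetical and chosen only to illustrate how rules and dictionaries can interact.

```python
# Hypothetical mini-dictionaries; a real stemmer would use far larger ones.
ROOT_WORDS = {"makan", "baca", "ajar"}
# Derivative forms that the simple affix rules below cannot handle.
DERIVATIVE_WORDS = {"belajar": "ajar"}

# Illustrative affix rules only, not a complete description of Malay morphology.
PREFIXES = ["meng", "mem", "men", "me", "ber", "di", "ter", "pe"]
SUFFIXES = ["kan", "an", "i"]


def stem(word):
    """Return the root of `word`, or the word unchanged if no root is found."""
    word = word.lower()
    # Dictionary checks come first: a word that is already a root
    # (e.g. "makan") is never stripped of its final "-an".
    if word in ROOT_WORDS:
        return word
    if word in DERIVATIVE_WORDS:
        return DERIVATIVE_WORDS[word]

    # Rule step: generate candidates by removing one prefix and/or one suffix.
    candidates = [word]
    for prefix in PREFIXES:
        if word.startswith(prefix):
            candidates.append(word[len(prefix):])
    for candidate in list(candidates):
        for suffix in SUFFIXES:
            if candidate.endswith(suffix):
                candidates.append(candidate[: -len(suffix)])

    # Accept a candidate only if the root dictionary confirms it.
    for candidate in candidates:
        if candidate in ROOT_WORDS:
            return candidate
    return word


if __name__ == "__main__":
    for w in ["membaca", "bacaan", "dimakan", "belajar", "komputer"]:
        print(w, "->", stem(w))
```

    In this sketch the rules supply candidate roots so that affixed forms are not left unstemmed, while the dictionary lookup rejects candidates that are not real roots, which mirrors the division of labour the abstract describes.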

  8. New mathematical cuneiform texts

    CERN Document Server

    Friberg, Jöran

    2016-01-01

    This monograph presents in great detail a large number of both unpublished and previously published Babylonian mathematical texts in the cuneiform script. It is a continuation of the work A Remarkable Collection of Babylonian Mathematical Texts (Springer 2007) written by Jöran Friberg, the leading expert on Babylonian mathematics. Focussing on the big picture, Friberg explores in this book several Late Babylonian arithmetical and metro-mathematical table texts from the sites of Babylon, Uruk and Sippar, collections of mathematical exercises from four Old Babylonian sites, as well as a new text from Early Dynastic/Early Sargonic Umma, which is the oldest known collection of mathematical exercises. A table of reciprocals from the end of the third millennium BC, differing radically from well-documented but younger tables of reciprocals from the Neo-Sumerian and Old-Babylonian periods, as well as a fragment of a Neo-Sumerian clay tablet showing a new type of a labyrinth are also discussed. The material is presen...

  9. Text analysis in R

    NARCIS (Netherlands)

    Welbers, K.; van Atteveldt, W.H.; Benoit, K.

    2017-01-01

    Computational text analysis has become an exciting research field with many applications in communication research. It can be a difficult method to apply, however, because it requires knowledge of various techniques, and the software required to perform most of these techniques is not readily

  10. Text Induced Spelling Correction

    NARCIS (Netherlands)

    Reynaert, M.W.C.

    2004-01-01

    We present TISC, a language-independent and context-sensitive spelling checking and correction system designed to facilitate the automatic removal of non-word spelling errors in large corpora. Its lexicon is derived from a very large corpus of raw text, without supervision, and contains word

  11. Texts On-Line.

    Science.gov (United States)

    Thomas, Jean-Jacques

    1993-01-01

    Maintains that the study of signs is divided between those scholars who use the Saussurian binary sign (semiology) and those who prefer the Peirce tripartite sign (semiotics). Concludes that neither the Saussurian nor Peircian analysis methods can produce a semiotic interpretation based on a hierarchy of the text's various components. (CFR)

  12. Dictionaries for text production

    DEFF Research Database (Denmark)

    Fuertes-Olivera, Pedro; Bergenholtz, Henning

    2018-01-01

    and free online dictionaries. The Diccionario español para la producción de textos is an example of a general text production dictionary that makes use of internet technologies, is based on a lexicographic theory, contains all the lexicographic data that users need in a production situation, and aims...

  13. Content Based Text Handling.

    Science.gov (United States)

    Schwarz, Christoph

    1990-01-01

    Gives an overview of various linguistic software tools in the field of intelligent text handling that are being developed in Germany utilizing artificial intelligence techniques in the field of natural language processing. Syntactical analysis of documents is described and application areas are discussed. (10 references) (LRW)

  14. Text, Hypertext, and Hyperfiction

    OpenAIRE

    Ladan Modir; Ling C Guan; Sohaimi Bin Abdul Aziz

    2014-01-01

    This article briefly surveys the changing theoretical perspectives on text from structuralism to poststructuralism and how they are subsequently accounted for by hypertext theorists to comprehend the emerging genre called hypertext fiction. Some theoretical issues concerning the reading of this genre also will be discussed. The purpose of this study is to illustrate that the radical promises and challenges of digital n...

  15. Recognition of pornographic web pages by classifying texts and images.

    Science.gov (United States)

    Hu, Weiming; Wu, Ou; Chen, Zhouyao; Fu, Zhouyu; Maybank, Steve

    2007-06-01

    With the rapid development of the World Wide Web, people benefit more and more from the sharing of information. However, Web pages with obscene, harmful, or illegal content can be easily accessed. It is important to recognize such unsuitable, offensive, or pornographic Web pages. In this paper, a novel framework for recognizing pornographic Web pages is described. A C4.5 decision tree is used to divide Web pages, according to content representations, into continuous text pages, discrete text pages, and image pages. These three categories of Web pages are handled, respectively, by a continuous text classifier, a discrete text classifier, and an algorithm that fuses the results from the image classifier and the discrete text classifier. In the continuous text classifier, statistical and semantic features are used to recognize pornographic texts. In the discrete text classifier, the naive Bayes rule is used to calculate the probability that a discrete text is pornographic. In the image classifier, the object's contour-based features are extracted to recognize pornographic images. In the text and image fusion algorithm, the Bayes theory is used to combine the recognition results from images and texts. Experimental results demonstrate that the continuous text classifier outperforms the traditional keyword-statistics-based classifier, the contour-based image classifier outperforms the traditional skin-region-based image classifier, the results obtained by our fusion algorithm outperform those by either of the individual classifiers, and our framework can be adapted to different categories of Web pages.

  16. E-text

    DEFF Research Database (Denmark)

    Finnemann, Niels Ole

    2017-01-01

    of “text” or “printed text” as the point of departure. On the other hand, electronic text can be defined by taking as point of departure the digital format in which everything is represented in the binary alphabet. While the notion of text, in most cases, lends itself to be independent of medium......) processing rules as binary sequences manifested in the binary alphabet. This wider notion would include, for instance, all sorts of scanning results, whether of the outer cosmos or the inner geographies of our bodies, and of digital traces of other processes in between these (machine readings included......). Since alphabets, like the genetic alphabet, and all sorts of images may be represented in the binary alphabet, such materials will also belong to the textual universe within this definition. A more intriguing implication is that digital born materials may also include scripts and interactive features...

  17. Strategy as Texts

    DEFF Research Database (Denmark)

    Obed Madsen, Søren

    This article shows empirically how managers translate a strategy plan at an individual level. By analysing how managers in three organizations translate strategies, it identifies that the translation happens in two steps: First, the managers decipher the strategy by coding the different parts of the strategy into four categories. Second, the managers produce new texts based on the original strategy document by using four different ways of translation models. The study’s findings contribute to three areas. Firstly, it shows that translation is more than a sociological process. It is also a craftsmanship that requires knowledge and skills, which unfortunately seems to be overlooked in both the literature and in practice. Secondly, it shows that even though a strategy text is in singular, the translation makes strategy plural. Thirdly, the article proposes a way to open up the black box of what...

  18. Wisdom Texts and Philosophy

    Directory of Open Access Journals (Sweden)

    Anthony Preus

    2013-11-01

    Full Text Available The last essay of this issue concerns a more "technical" subject: in many ancient cultures, literary monuments are mainly "wisdom literature". In these early works, philosophy and literature are more closely related than in many contemporary approaches. The author here tries to sketch the relationships between the ancient wisdom literatures of Egypt, Greece and Israel, and to show how this literary genre precedes "philosophy".

  19. Knowledge Based Text Generation

    Science.gov (United States)

    1989-08-01

    from data bases, so Kukich [1984] developed a system, ANA , which generates stock reports from a knowledge base of daily trading on the Dow Jones stock...MACHIAVELLI (topic organization and phraseology), CICERO (realization), FREUD (monitoring the origins of rhetorical plans), and LEIBNITZ (a "concept...68 Bossie and Mani 8 Alla Fiera dell’est 37 brain 2 frame 29 Alshawi 49 Brown and Yule 51 amplification 38 Cambridge University 40 ANA 15 canned text 7

  20. Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application

    Directory of Open Access Journals (Sweden)

    Leon French

    2015-05-01

    Full Text Available We describe the WhiteText project, and its progress towards automatically extracting statements of neuroanatomical connectivity from text. We review progress to date on the three main steps of the project: recognition of brain region mentions, standardization of brain region mentions to neuroanatomical nomenclature, and connectivity statement extraction. We further describe a new version of our manually curated corpus that adds 2,111 connectivity statements from 1,828 additional abstracts. Cross-validation classification within the new corpus replicates results on our original corpus, recalling 51% of connectivity statements at 67% precision. The resulting merged corpus provides 5,208 connectivity statements that can be used to seed species-specific connectivity matrices and to better train automated techniques. Finally, we present a new web application that allows fast interactive browsing of the over 70,000 sentences indexed by the system, as a tool for accessing the data and assisting in further curation. Software and data are freely available at http://www.chibi.ubc.ca/WhiteText/.
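
    To put the reported recall and precision on a single scale, the two figures can be combined into an F1 score (their harmonic mean). The abstract does not report F1, so the value below is derived here from the 51% recall and 67% precision it states; the helper name `f1_score` is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Figures taken from the abstract: 67% precision, 51% recall.
    print(f"F1 = {f1_score(0.67, 0.51):.3f}")  # prints "F1 = 0.579"
```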

  1. Ozone photochemistry in an oil and natural gas extraction region during winter: simulations of a snow-free season in the Uintah Basin, Utah

    Science.gov (United States)

    Edwards, P. M.; Young, C. J.; Aikin, K.; deGouw, J.; Dubé, W. P.; Geiger, F.; Gilman, J.; Helmig, D.; Holloway, J. S.; Kercher, J.; Lerner, B.; Martin, R.; McLaren, R.; Parrish, D. D.; Peischl, J.; Roberts, J. M.; Ryerson, T. B.; Thornton, J.; Warneke, C.; Williams, E. J.; Brown, S. S.

    2013-09-01

    The Uintah Basin in northeastern Utah, a region of intense oil and gas extraction, experienced ozone (O3) concentrations above levels harmful to human health for multiple days during the winters of 2009-2010 and 2010-2011. These wintertime O3 pollution episodes occur during cold, stable periods when the ground is snow-covered, and have been linked to emissions from the oil and gas extraction process. The Uintah Basin Winter Ozone Study (UBWOS) was a field intensive in early 2012, whose goal was to address current uncertainties in the chemical and physical processes that drive wintertime O3 production in regions of oil and gas development. Although elevated O3 concentrations were not observed during the winter of 2011-2012, the comprehensive set of observations tests our understanding of O3 photochemistry in this unusual emissions environment. A box model, constrained to the observations and using the near-explicit Master Chemical Mechanism (MCM) v3.2 chemistry scheme, has been used to investigate the sensitivities of O3 production during UBWOS 2012. Simulations identify the O3 production photochemistry to be highly radical limited (with a radical production rate significantly smaller than the NOx emission rate). Production of OH from O3 photolysis (through reaction of O(1D) with water vapor) contributed only 170 pptv day-1, 8% of the total primary radical source on average (primary radicals being those produced from non-radical precursors). Other radical sources, including the photolysis of formaldehyde (HCHO, 52%), nitrous acid (HONO, 26%), and nitryl chloride (ClNO2, 13%) were larger. O3 production was also found to be highly sensitive to aromatic volatile organic compound (VOC) concentrations, due to radical amplification reactions in the oxidation scheme of these species. Radical production was shown to be small in comparison to the emissions of nitrogen oxides (NOx), such that NOx acted as the primary radical sink. Consequently, the system was highly VOC

  2. Ways students read texts

    Science.gov (United States)

    Wandersee, James H.

    College students responding to the Preferred Method of Study (PMOS) questionnaire explained how they approach reading a new textbook chapter for comprehension. Results indicated that a significant positive correlation exists between the number of passes a student makes at new textbook material and his/her college grade-point average. Women showed a significant preference for adopting a single method of study. Less than half of the students queried construct organizational tools such as outlines or diagrams as they study a textbook. Students said they would alter their textbook strategies in response to the type of test they expected significantly more often than they would for the type of subject matter being studied. Only 6% of the students said they make a conscious effort to link the new concepts in the text to prior knowledge. There was no discernable relationship between the study strategies undergraduate college students employ and their college grade level (freshman through senior).

  3. Interconnectedness and Digital Texts

    Directory of Open Access Journals (Sweden)

    Detlev Doherr

    2013-04-01

    Full Text Available Abstract Multimedia information services on the Internet are becoming ever more extensive and comprehensive, and libraries are also digitizing documents that exist only in print and putting them online. These documents can be found through online document management systems or search engines and then provided in common formats such as PDF. This article examines how the Humboldt Digital Library (HDL) works, which for more than ten years has made documents by Alexander von Humboldt freely available on the Web in English translation. Unlike a digital library, however, it does not merely provide digitized documents as scans or PDFs; it makes the text itself available in interlinked form. The system therefore resembles an information system more than a digital library, which is also reflected in the functions available for locating texts in different versions and translations, comparing paragraphs from different documents, or displaying images in their context. The development of dynamic hyperlinks based on the individual text paragraphs of Humboldt's works, in the form of media assets, makes it possible to use the Google Maps programming interface for navigation by geography as well as by text content. Going beyond the service of a digital library, the HDL offers the prototype of a multidimensional information system that works with dynamic structures and enables extensive thematic analyses and comparisons. Summary The multimedia information services on Internet are becoming more and more comprehensive, even the printed documents are digitized and republished as digital Web documents by the libraries. Those digital files can be found by search engines or management tools and provided as files in usual formats as

  4. Scene Text Detection and Segmentation based on Cascaded Convolution Neural Networks.

    Science.gov (United States)

    Tang, Youbao; Wu, Xiangqian

    2017-01-20

    Scene text detection and segmentation are two important and challenging research problems in the field of computer vision. This paper proposes a novel method for scene text detection and segmentation based on cascaded convolution neural networks (CNNs). In this method, a CNN based text-aware candidate text region (CTR) extraction model (named detection network, DNet) is designed and trained using both the edges and the whole regions of text, with which coarse CTRs are detected. A CNN based CTR refinement model (named segmentation network, SNet) is then constructed to precisely segment the coarse CTRs into text to get the refined CTRs. With DNet and SNet, much fewer CTRs are extracted than with traditional approaches while more true text regions are kept. The refined CTRs are finally classified using a CNN based CTR classification model (named classification network, CNet) to get the final text regions. All of these CNN based models are modified from VGGNet-16. Extensive experiments on three benchmark datasets demonstrate that the proposed method achieves state-of-the-art performance and greatly outperforms other scene text detection and segmentation approaches.

  5. Texts of presentation

    Energy Technology Data Exchange (ETDEWEB)

    Magnin, G.; Vidolov, K.; Dufour-Fallot, B.; Dewarrat, Th.; Rose, T.; Favatier, A.; Gazeley, D.; Pujol, T.; Worner, D.; Van de Wel, E.; Revaz, J.M.; Clerfayt, G.; Creedy, A.; Moisan, F.; Geissler, M.; Isbell, P.; Macaluso, M.; Litzka, V.; Gillis, W.; Jarvis, I.; Gorg, M.; Bebie, B.

    2004-07-01

    Implementing a sustainable local energy policy involves a long term reflection on the general interest, energy efficiency, distributed generation and environmental protection. Providing services on a market involves looking for activities that are profitable, if possible in the 'short-term'. The aim of this conference is to analyse the possibility of reconciling these apparently contradictory requirements and how this can be achieved. This conference brings together the best specialists from European municipalities as well as important partners for local authorities (energy agencies, service companies, institutions, etc.) in order to discuss the public-private partnerships concerning the various functions that municipalities may perform in the energy field as consumers and customers, planners and organizers of urban space, and motivators of the inhabitants and economic players of their areas. This document contains the summaries of the following presentations: 1 - Performance contracting: Bulgarian municipalities use private capital for energy efficiency improvement (K. VIDOLOV, Varna (BG)), Contracting experiences in Swiss municipalities: consistent energy policy thanks to the Energy-city label (B. DUFOUR-FALLOT and T. DEWARRAT (CH)), Experience of contracting in the domestic sector (T. ROSE (GB)); 2 - Public procurement: Multicolor electricity (A. FAVATIER (CH)), Tendering for new green electricity capacity (D. GAZELEY (GB)), The Barcelona solar thermal ordinance (T. PUJOL (ES)); 3 - Urban planning and schemes: Influencing energy issues through urban planning (D. WOERNER (DE)), Tendering for the supply of energy infrastructure (E. VAN DE WEL (NL)), Concessions and public utility warranty (J.M. REVAZ (CH)); 4 - Certificate schemes: the market of green certificates in Wallonia region in a liberalized power market (G. CLERFAYT (BE)), The Carbon Neutral® project: a voluntary certification scheme with opportunity for implementation in other European

  6. Text mining for metabolic reaction extraction from scientific literature

    NARCIS (Netherlands)

    Risse, J.E.

    2014-01-01

    Science relies on data in all its different forms. In molecular biology and bioinformatics in particular, large-scale data generation has taken centre stage in the form of high-throughput experiments. In line with this exponential increase of experimental data has been the near exponential growth of

  7. Mining knowledge from text repositories using information extraction ...

    Indian Academy of Sciences (India)

    Department of Computer Science, Shri Shivaji Science and Arts College, Chikhli 443 201, India; Department of Computer Science, S K Porwal College, Kamptee, Nagpur 441 002, India; P G Department of Computer Science and Technology, Degree College of Physical Education, Hanuman Vyayam Prasarak Mandal, ...

  8. Addressing Information Proliferation: Applications of Information Extraction and Text Mining

    Science.gov (United States)

    Li, Jingjing

    2013-01-01

    The advent of the Internet and the ever-increasing capacity of storage media have made it easy to store, deliver, and share enormous volumes of data, leading to a proliferation of information on the Web, in online libraries, on news wires, and almost everywhere in our daily lives. Since our ability to process and absorb this information remains…

  9. Identifying and extracting quantitative data in annotated text

    NARCIS (Netherlands)

    Willems, D.J.M.; Rijgersberg, H.; Top, J.

    2012-01-01

    In science it is difficult to reuse quantitative scientific data. For example, it is not possible to search for quantitative data in papers in a directed way, such as using the query "Select the storage modulus of dairy product A after the temperature has decreased from 90 to 4 °C". This is caused by

  10. Extracting Features of Acacia Plantation and Natural Forest in the Mountainous Region of Sarawak, Malaysia by ALOS/AVNIR2 Image

    Science.gov (United States)

    Fadaei, H.; Ishii, R.; Suzuki, R.; Kendawang, J.

    2013-12-01

    The remote sensing technique has provided useful information to detect spatio-temporal changes in the land cover of tropical forests. Land cover characteristics derived from satellite images can be applied to the estimation of ecosystem services and biodiversity over an extensive area, and such land cover information would provide valuable information to global and local people to understand the significance of the tropical ecosystem. This study was conducted in the Acacia plantations and natural forest situated in the mountainous region of Sarawak, Malaysia, which has different ecological characteristics from those of flat, lowland areas. The main objective of this study is to extract and compare their characteristics by analyzing ALOS/AVNIR2 images and ground truth data obtained from a forest survey. We carried out a ground-based forest survey at Acacia plantations and natural forest in the mountainous region in Sarawak, Malaysia in June, 2013 and acquired forest structure data (tree height, diameter at breast height (DBH), crown diameter, tree spacing) and spectral reflectance data at three 10 x 10 m sample plots of Acacia plantation. As for the spectral reflectance data, we measured the spectral reflectance of the end members of the forest, such as leaves, stems, road surface, and forest floor, with a spectro-radiometer. These forest structure and spectral data were incorporated into the image analysis by support vector machine (SVM) and object-based/texture analysis. Consequently, land covers on the AVNIR2 image were classified into three forest types (natural forest, oil palm plantation and Acacia mangium plantation), then the characteristics of each category were examined. We additionally used the tree age data of the Acacia plantation for the classification. A unique feature was found in the vegetation spectral reflectance of Acacia plantations. The curve of the spectral reflectance shows two peaks around 0.3μm and 0.6 - 0.8μm that can be assumed to

  11. Text line Segmentation of Curved Document Images

    Directory of Open Access Journals (Sweden)

    Anusree.M

    2014-05-01

    Full Text Available Document image analysis has been widely used in historical and heritage studies, education and digital libraries. Document image analysis techniques are mainly used for improving the human readability and the OCR quality of the document. During digitization, camera-captured images contain warped documents due to perspective and geometric distortions. The main difficulty is text line detection in the document. Many algorithms have been proposed to address the problem of printed document text line detection, but they fail to extract text lines in curved documents. This paper describes a segmentation technique that detects curled text lines in camera-captured document images.

  12. In Vitro Pharmacological Activities and GC-MS Analysis of Different Solvent Extracts of Lantana camara Leaves Collected from Tropical Region of Malaysia.

    Science.gov (United States)

    Swamy, Mallappa Kumara; Sinniah, Uma Rani; Akhtar, Mohd Sayeed

    2015-01-01

    We investigated the effect of different solvents (ethyl acetate, methanol, acetone, and chloroform) on the extraction of phytoconstituents from Lantana camara leaves and their antioxidant and antibacterial activities. Further, GC-MS analysis was carried out to identify the bioactive chemical constituents occurring in the active extract. The results revealed the presence of various phytocompounds in the extracts. The methanol solvent recovered higher extractable compounds (14.4% of yield) and contained the highest phenolic (92.8 mg GAE/g) and flavonoid (26.5 mg RE/g) content. DPPH radical scavenging assay showed the IC50 value of 165, 200, 245, and 440 μg/mL for methanol, ethyl acetate, acetone, and chloroform extracts, respectively. The hydroxyl scavenging activity test showed the IC50 value of 110, 240, 300, and 510 μg/mL for methanol, ethyl acetate, acetone, and chloroform extracts, respectively. Gram negative bacterial pathogens (E. coli and K. pneumoniae) were more susceptible to all extracts compared to Gram positive bacteria (M. luteus, B. subtilis, and S. aureus). Methanol extract had the highest inhibition activity against all the tested microbes. Moreover, methanolic extract of L. camara contained 32 bioactive components as revealed by GC-MS study. The identified major compounds included hexadecanoic acid (5.197%), phytol (4.528%), caryophyllene oxide (4.605%), and 9,12,15-octadecatrienoic acid, methyl ester, (Z,Z,Z)- (3.751%).

  13. Extraction of Facial Features from Color Images

    Directory of Open Access Journals (Sweden)

    J. Pavlovicova

    2008-09-01

    Full Text Available In this paper, a method for localization and extraction of faces and characteristic facial features such as eyes, mouth and face boundaries from color image data is proposed. This approach exploits color properties of human skin to localize image regions that are face candidates. Facial feature extraction is performed only on the preselected face-candidate regions. Likewise, color information and the local contrast around the eyes are used for eye and mouth localization. The ellipse of the face boundary is determined using the gradient image and the Hough transform. The algorithm was tested on the FERET image database.

  14. CONAN : Text Mining in the Biomedical Domain

    NARCIS (Netherlands)

    Malik, R.

    2006-01-01

    This thesis is about Text Mining: extracting important information from literature. In recent years, the number of biomedical articles and journals has been growing exponentially. Scientists might not find the information they want because of the large number of publications. Therefore a system was

  15. SAW Classification Algorithm for Chinese Text Classification

    Directory of Open Access Journals (Sweden)

    Xiaoli Guo

    2015-02-01

    Full Text Available With the explosive growth of data, the increasing volume of text places higher demands on the performance of text categorization, demands that existing classification methods cannot satisfy. Based on a study of existing text classification technology and semantics, this paper proposes SAW (Structural Auxiliary Word), an algorithm oriented toward Chinese text classification. The algorithm uses the special spatial effect of Chinese text, in which words carry an implied correlation, to perform high-correlation matching between text information mining and text categorization. Experiments show that the SAW classification algorithm, on the premise of ensuring precision in classification, significantly improves classification precision and recall, markedly improving the performance of information retrieval and providing an effective means of using data for information extraction in the era of big data.

  16. Information extraction system

    Science.gov (United States)

    Lemmond, Tracy D; Hanley, William G; Guensche, Joseph Wendell; Perry, Nathan C; Nitao, John J; Kidwell, Paul Brandon; Boakye, Kofi Agyeman; Glaser, Ron E; Prenger, Ryan James

    2014-05-13

    An information extraction system and methods of operating the system are provided. In particular, an information extraction system for performing meta-extraction of named entities of people, organizations, and locations, as well as relationships and events, from text documents is described herein.

  17. CRIE: An automated analyzer for Chinese texts.

    Science.gov (United States)

    Sung, Yao-Ting; Chang, Tao-Hsing; Lin, Wei-Chun; Hsieh, Kuan-Sheng; Chang, Kuo-En

    2016-12-01

    Textual analysis has been applied to various fields, such as discourse analysis, corpus studies, text leveling, and automated essay evaluation. Several tools have been developed for analyzing texts written in alphabetic languages such as English and Spanish. However, currently there is no tool available for analyzing Chinese-language texts. This article introduces a tool for the automated analysis of simplified and traditional Chinese texts, called the Chinese Readability Index Explorer (CRIE). Composed of four subsystems and incorporating 82 multilevel linguistic features, CRIE is able to conduct the major tasks of segmentation, syntactic parsing, and feature extraction. Furthermore, the integration of linguistic features with machine learning models enables CRIE to provide leveling and diagnostic information for texts in language arts, texts for learning Chinese as a foreign language, and texts with domain knowledge. The usage and validation of the functions provided by CRIE are also introduced.

  18. Review network for scene text recognition

    Science.gov (United States)

    Li, Shuohao; Han, Anqi; Chen, Xu; Yin, Xiaoqing; Zhang, Jun

    2017-09-01

    Recognizing text in images captured in the wild is a fundamental preprocessing task for many computer vision and machine learning applications and has gained significant attention in recent years. This paper proposes an end-to-end trainable deep review neural network for scene text recognition, which is a combination of feature extraction, feature reviewing, feature attention, and sequence recognition. Our model can generate the predicted text without any segmentation or grouping algorithm. Because the attention model in the feature attention stage lacks global modeling ability, a review network is applied to extract the global context of sequence data in the feature reviewing stage. We perform rigorous experiments across a number of standard benchmarks, including IIIT5K, SVT, ICDAR03, and ICDAR13 datasets. Experimental results show that our model is comparable to or outperforms state-of-the-art techniques.

  19. Text mining patents for biomedical knowledge.

    Science.gov (United States)

    Rodriguez-Esteban, Raul; Bundschus, Markus

    2016-06-01

    Biomedical text mining of scientific knowledge bases, such as Medline, has received much attention in recent years. Given that text mining is able to automatically extract biomedical facts that revolve around entities such as genes, proteins, and drugs, from unstructured text sources, it is seen as a major enabler to foster biomedical research and drug discovery. In contrast to the biomedical literature, research into the mining of biomedical patents has not reached the same level of maturity. Here, we review existing work and highlight the associated technical challenges that emerge from automatically extracting facts from patents. We conclude by outlining potential future directions in this domain that could help drive biomedical research and drug discovery. Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. Benchmarking infrastructure for mutation text mining

    Science.gov (United States)

    2014-01-01

    Background Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. Results We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. Conclusion We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption. PMID:24568600
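
    The performance metrics mentioned in this record can be illustrated with a minimal sketch. The abstract states that the infrastructure computes its metrics with SPARQL over RDF annotations; the Python below is not that implementation, only an assumed set-based formulation of precision, recall and F1. The function name `precision_recall_f1` and the example annotation pairs are hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Compare two collections of annotations; return (precision, recall, F1).

    Assumption: an annotation counts as correct only if it matches a gold
    annotation exactly. Real benchmarks may use looser matching criteria.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical (document id, mutation mention) pairs, for illustration only.
gold_annotations = {
    ("doc1", "V600E"), ("doc1", "G12D"), ("doc2", "R175H"), ("doc3", "T790M"),
}
system_annotations = {
    ("doc1", "V600E"), ("doc2", "R175H"), ("doc2", "A123T"),
}

precision, recall, f1 = precision_recall_f1(gold_annotations, system_annotations)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

    Exact matching on sets keeps the sketch short; partial-span or grounded matching would need a different comparison step.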

  1. AUTOMATIC LUNG NODULE SEGMENTATION USING AUTOSEED REGION GROWING WITH MORPHOLOGICAL MASKING (ARGMM) AND FEATURE EXTRACTION THROUGH COMPLETE LOCAL BINARY PATTERN AND MICROSCOPIC INFORMATION PATTERN

    Directory of Open Access Journals (Sweden)

    Senthil Kumar

    2015-04-01

    Full Text Available An efficient Autoseed Region Growing with Morphological Masking (ARGMM) is implemented in this paper on the Lung CT Slice to segment the 'Lung Nodules', which may be a potential indicator of Lung Cancer. The segmentation of lung nodules is carried out in this paper through Multi-Thresholding, ARGMM and Level Set Evolution. ARGMM takes twice the time compared to Level Set, but the number of suspected segmented nodules is doubled, which ensures that no potentially cancerous nodules go unnoticed at the earlier stages of diagnosis. It is very important not to panic the patient by finding the presence of nodules from a Lung CT scan. Only 40 percent of nodules can be cancerous. Hence, in this paper an efficient Shape and Texture analysis is computed to quantitatively describe the segmented lung nodules. The Frequency spectrum of the lung nodules is developed and its frequency domain features are computed. The Complete Local Binary Pattern of lung nodules is computed in this paper by constructing the combined histogram of Sign and Magnitude Local Binary Patterns. Local Configuration Pattern is also determined in this work for lung nodules to numerically model the microscopic information of the nodule pattern.
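
    The combined sign and magnitude histogram mentioned in this record can be sketched as follows. This is a minimal illustration of one commonly described formulation of sign and magnitude local binary patterns on a 3x3 neighbourhood, not the paper's implementation. The function names, the neighbour ordering and the example pixel values are assumptions; the magnitude threshold is passed in as a parameter (it is often taken as the mean absolute difference over the image).

```python
from collections import Counter


def clbp_codes(patch, magnitude_threshold):
    """Return (sign_code, magnitude_code) for the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Eight neighbours, clockwise from the top-left corner (assumed ordering).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    sign_code = 0
    magnitude_code = 0
    for bit, (row, col) in enumerate(offsets):
        diff = patch[row][col] - center
        if diff >= 0:  # sign component: neighbour is at least as bright
            sign_code |= 1 << bit
        if abs(diff) >= magnitude_threshold:  # magnitude component
            magnitude_code |= 1 << bit
    return sign_code, magnitude_code


def joint_histogram(patches, magnitude_threshold):
    """Count how often each (sign_code, magnitude_code) pair occurs."""
    return Counter(clbp_codes(p, magnitude_threshold) for p in patches)


# Hypothetical grey-level patch, for illustration only.
example_patch = [
    [52, 60, 49],
    [70, 55, 40],
    [58, 55, 90],
]
print(clbp_codes(example_patch, magnitude_threshold=10))
```

    A joint histogram over code pairs is one way to combine the two components; concatenating two separate histograms is another, and the abstract does not say which was used.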

  2. Automatic Text Summarization for Indonesian Language Using TextTeaser

    Science.gov (United States)

    Gunawan, D.; Pasaribu, A.; Rahmat, R. F.; Budiarto, R.

    2017-04-01

    Text summarization is one solution to information overload. Reducing text without losing its meaning not only saves reading time but also maintains the reader’s understanding. One of many algorithms for summarizing text is TextTeaser. Originally, this algorithm was intended for text in English. However, because the TextTeaser algorithm does not consider the meaning of the text, we implement it for text in the Indonesian language. The algorithm calculates four elements: title feature, sentence length, sentence position and keyword frequency. We utilize TextRank, an unsupervised and language-independent text summarization algorithm, to evaluate the summarized text yielded by TextTeaser. The result shows that the TextTeaser algorithm needs more improvement to obtain better accuracy.
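
    The four elements named in this record can be illustrated with a minimal extractive-scoring sketch. It is not the TextTeaser implementation: the equal weighting, the `ideal_length` parameter, the simple tokenizer and the function names are all assumptions made for illustration, and no stop-word filtering is applied.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and keep runs of ASCII letters (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def score_sentences(title, sentences, ideal_length=20):
    """Score each sentence on title overlap, length, position and keyword frequency."""
    title_words = set(tokenize(title))
    all_words = [w for s in sentences for w in tokenize(s)]
    frequency = Counter(all_words)
    total_words = len(all_words) or 1
    scores = []
    for index, sentence in enumerate(sentences):
        words = tokenize(sentence)
        if not words:
            scores.append(0.0)
            continue
        # Title feature: share of title words that appear in the sentence.
        title_score = len(title_words & set(words)) / (len(title_words) or 1)
        # Sentence length: closeness to an assumed ideal length.
        length_score = max(0.0, 1 - abs(ideal_length - len(words)) / ideal_length)
        # Sentence position: earlier sentences score higher.
        position_score = 1 - index / len(sentences)
        # Keyword frequency: mean relative frequency of the sentence's words.
        keyword_score = sum(frequency[w] for w in words) / (len(words) * total_words)
        # Equal weights are an assumption, not TextTeaser's actual weighting.
        scores.append((title_score + length_score + position_score + keyword_score) / 4)
    return scores


def summarize(title, sentences, k=2):
    """Return the k best-scoring sentences in their original order."""
    scores = score_sentences(title, sentences)
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]


example_title = "Text summarization for Indonesian"
example_sentences = [
    "Text summarization reduces information overload.",
    "The weather was nice.",
    "Summarization for Indonesian text keeps the meaning.",
]
print(summarize(example_title, example_sentences, k=2))
```

    None of the four features looks at word meaning, which is consistent with the abstract's reason for applying the algorithm to another language.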

  3. Important Text Characteristics for Early-Grades Text Complexity

    Science.gov (United States)

    Fitzgerald, Jill; Elmore, Jeff; Koons, Heather; Hiebert, Elfrieda H.; Bowen, Kimberly; Sanford-Moore, Eleanor E.; Stenner, A. Jackson

    2015-01-01

    The Common Core set a standard for all children to read increasingly complex texts throughout schooling. The purpose of the present study was to explore text characteristics specifically in relation to early-grades text complexity. Three hundred fifty primary-grades texts were selected and digitized. Twenty-two text characteristics were identified…

  4. Improving text recognition by distinguishing scene and overlay text

    Science.gov (United States)

    Quehl, Bernhard; Yang, Haojin; Sack, Harald

    2015-02-01

    Video texts are closely related to the content of a video. They provide a valuable source for indexing and interpretation of video data. Text detection and recognition tasks in images or videos typically distinguish between overlay and scene text. Overlay text is artificially superimposed on the image at the time of editing and scene text is text captured by the recording system. Typically, OCR systems are specialized for one type of text. However, in video images both types of text can be found. In this paper, we propose a method to automatically distinguish between overlay and scene text to dynamically control and optimize post processing steps following text detection. Based on a feature combination a Support Vector Machine (SVM) is trained to classify scene and overlay text. We show how this distinction between overlay and scene text improves the word recognition rate. Accuracy of the proposed methods has been evaluated by using publicly available test data sets.

  5. Using Genetic Algorithms for Texts Classification Problems

    Directory of Open Access Journals (Sweden)

    A. A. Shumeyko

    2009-01-01

    Full Text Available The avalanche of information produced by mankind has led to the concept of automated knowledge extraction, Data Mining ([1]). This field is connected with a wide spectrum of problems, from recognition of fuzzy sets to the creation of search engines. An important component of Data Mining is the processing of text information. Such problems rest on the concepts of classification and clustering ([2]). Classification consists in determining the membership of an element (text) in one of a set of classes created in advance. Clustering means partitioning a set of elements (texts) into clusters, whose number is determined by the localization of the elements of the given set in the vicinities of certain natural centers of these clusters. Solving a classification problem must initially rest on given postulates, the basic ones being the a priori information on the primary set of texts and a measure of affinity between elements and classes.

  6. Text summarization as a decision support aid

    OpenAIRE

    Workman, T Elizabeth; Fiszman, Marcelo; Hurdle, John F

    2012-01-01

    Abstract Background PubMed data potentially can provide decision support information, but PubMed was not exclusively designed to be a point-of-care tool. Natural language processing applications that summarize PubMed citations hold promise for extracting decision support information. The objective of this study was to evaluate the efficiency of a text summarization application called Semantic MEDLINE, enhanced with a novel dynamic summarization method, in identifying decision support data. Me...

  7. Text analysis methods, text analysis apparatuses, and articles of manufacture

    Science.gov (United States)

    Whitney, Paul D; Willse, Alan R; Lopresti, Charles A; White, Amanda M

    2014-10-28

    Text analysis methods, text analysis apparatuses, and articles of manufacture are described according to some aspects. In one aspect, a text analysis method includes accessing information indicative of data content of a collection of text comprising a plurality of different topics, using a computing device, analyzing the information indicative of the data content, and using results of the analysis, identifying a presence of a new topic in the collection of text.

  8. Classroom Texting in College Students

    Science.gov (United States)

    Pettijohn, Terry F.; Frazier, Erik; Rieser, Elizabeth; Vaughn, Nicholas; Hupp-Wilds, Bobbi

    2015-01-01

    A 21-item survey on texting in the classroom was given to 235 college students. Overall, 99.6% of students owned a cellphone and 98% texted daily. Of the 138 students who texted in the classroom, most texted friends or significant others, and indicated that the reason for classroom texting was boredom or work. Students who texted sent a mean of 12.21…

  9. Mining the Text: 34 Text Features that Can Ease or Obstruct Text Comprehension and Use

    Science.gov (United States)

    White, Sheida

    2012-01-01

    This article presents 34 characteristics of texts and tasks ("text features") that can make continuous (prose), noncontinuous (document), and quantitative texts easier or more difficult for adolescents and adults to comprehend and use. The text features were identified by examining the assessment tasks and associated texts in the national…

  10. Text as Statistical Mechanics Object

    OpenAIRE

    Koroutchev, K.; Korutcheva, E.

    2008-01-01

    In this article we present a model of human written text based on a statistical mechanics approach, deriving the potential energy for different parts of the text using a large text corpus. We have checked the results numerically and found that the specific heat parameter effectively separates the closed class words from the specific terms used in the text.

  11. Text analysis devices, articles of manufacture, and text analysis methods

    Science.gov (United States)

    Turner, Alan E; Hetzler, Elizabeth G; Nakamura, Grant C

    2013-05-28

    Text analysis devices, articles of manufacture, and text analysis methods are described according to some aspects. In one aspect, a text analysis device includes processing circuitry configured to analyze initial text to generate a measurement basis usable in analysis of subsequent text, wherein the measurement basis comprises a plurality of measurement features from the initial text, a plurality of dimension anchors from the initial text and a plurality of associations of the measurement features with the dimension anchors, and wherein the processing circuitry is configured to access a viewpoint indicative of a perspective of interest of a user with respect to the analysis of the subsequent text, and wherein the processing circuitry is configured to use the viewpoint to generate the measurement basis.

  12. Supported eText: Assistive Technology through Text Transformations

    Science.gov (United States)

    Anderson-Inman, Lynne; Horney, Mark A.

    2007-01-01

    To gain meaningful access to the curriculum, students with reading difficulties must overcome substantial barriers imposed by the printed materials they are asked to read. Technology can assist students to overcome these challenges by enabling a shift from printed text to electronic text. Electronic text here means textual material read using a…

  13. Text mining from ontology learning to automated text processing applications

    CERN Document Server

    Biemann, Chris

    2014-01-01

    This book comprises a set of articles that specify the methodology of text mining, describe the creation of lexical resources in the framework of text mining and use text mining for various tasks in natural language processing (NLP). The analysis of large amounts of textual data is a prerequisite to build lexical resources such as dictionaries and ontologies and also has direct applications in automated text processing in fields such as history, healthcare and mobile applications, just to name a few. This volume gives an update in terms of the recent gains in text mining methods and reflects

  14. A Scene Text-Based Image Retrieval System

    Science.gov (United States)

    2012-12-01

    images. The majority of OCR engines is designed for scanned text and so depends on segmentation which correctly separates text from background...size is 8×8, cell size is 2×2 and 9 bins for histogram. For each candidate word, HOG feature is extracted and used by the SVM classifier to verify...images. One approach is to extract text appearing in images which often gives an indication of a scene’s semantic content. However, it can be

  15. Working with text tools, techniques and approaches for text mining

    CERN Document Server

    Tourte, Gregory J L

    2016-01-01

    Text mining tools and technologies have long been a part of the repository world, where they have been applied to a variety of purposes, from pragmatic aims to support tools. Research areas as diverse as biology, chemistry, sociology and criminology have seen effective use made of text mining technologies. Working With Text collects a subset of the best contributions from the 'Working with text: Tools, techniques and approaches for text mining' workshop, alongside contributions from experts in the area. Text mining tools and technologies in support of academic research include supporting research on the basis of a large body of documents, facilitating access to and reuse of extant work, and bridging between the formal academic world and areas such as traditional and social media. Jisc have funded a number of projects, including NaCTem (the National Centre for Text Mining) and the ResDis programme. Contents are developed from workshop submissions and invited contributions, including: Legal considerations in te...

  16. The Only Safe SMS Texting Is No SMS Texting.

    Science.gov (United States)

    Toth, Cheryl; Sacopulos, Michael J

    2015-01-01

    Many physicians and practice staff use short messaging service (SMS) text messaging to communicate with patients. But SMS text messaging is unencrypted, insecure, and does not meet HIPAA requirements. In addition, the short and abbreviated nature of text messages creates opportunities for misinterpretation, and can negatively impact patient safety and care. Until recently, asking patients to sign a statement that they understand and accept these risks--as well as having policies, device encryption, and cyber insurance in place--would have been enough to mitigate the risk of using SMS text in a medical practice. But new trends and policies have made SMS text messaging unsafe under any circumstance. This article explains these trends and policies, as well as why only secure texting or secure messaging should be used for physician-patient communication.

  17. Multilingual Text Analysis for Text-to-Speech Synthesis

    CERN Document Server

    Sproat, R

    1996-01-01

    We present a model of text analysis for text-to-speech (TTS) synthesis based on (weighted) finite-state transducers, which serves as the text-analysis module of the multilingual Bell Labs TTS system. The transducers are constructed using a lexical toolkit that allows declarative descriptions of lexicons, morphological rules, numeral-expansion rules, and phonological rules, inter alia. To date, the model has been applied to eight languages: Spanish, Italian, Romanian, French, German, Russian, Mandarin and Japanese.

  18. Multilingual Text Analysis for Text-to-Speech Synthesis

    OpenAIRE

    Sproat, Richard

    1996-01-01

    We present a model of text analysis for text-to-speech (TTS) synthesis based on (weighted) finite-state transducers, which serves as the text-analysis module of the multilingual Bell Labs TTS system. The transducers are constructed using a lexical toolkit that allows declarative descriptions of lexicons, morphological rules, numeral-expansion rules, and phonological rules, inter alia. To date, the model has been applied to eight languages: Spanish, Italian, Romanian, French, German, Russian, ...

  19. Incremental semantics for propositional texts

    NARCIS (Netherlands)

    Vermeulen, C.F.M.

    In this paper we are concerned with the special requirements that a semantics of texts should meet. It is argued that a semantics of texts should be incremental and should satisfy the break in principle. We develop a semantics for propositional texts that satisfies these constraints. We will see

  20. Knowledge Representation in Travelling Texts

    DEFF Research Database (Denmark)

    Mousten, Birthe; Locmele, Gunta

    2014-01-01

    and the purpose of the text in a new context as well as on predefined parameters for text travel. For texts used in marketing and in technology, the question is whether culture-bound knowledge representation should be domesticated or kept as foreign elements, or should be mirrored or moulded—or should not travel...

  1. Predicting Prosody from Text for Text-to-Speech Synthesis

    CERN Document Server

    Rao, K Sreenivasa

    2012-01-01

    Predicting Prosody from Text for Text-to-Speech Synthesis covers the specific aspects of prosody, mainly focusing on how to predict the prosodic information from linguistic text, and then how to exploit the predicted prosodic knowledge for various speech applications. Author K. Sreenivasa Rao discusses proposed methods along with state-of-the-art techniques for the acquisition and incorporation of prosodic knowledge for developing speech systems. Positional, contextual and phonological features are proposed for representing the linguistic and production constraints of the sound units present in the text. This book is intended for graduate students and researchers working in the area of speech processing.

  2. ParaText : scalable text modeling and analysis.

    Energy Technology Data Exchange (ETDEWEB)

    Dunlavy, Daniel M.; Stanton, Eric T.; Shead, Timothy M.

    2010-06-01

    Automated processing, modeling, and analysis of unstructured text (news documents, web content, journal articles, etc.) is a key task in many data analysis and decision making applications. As data sizes grow, scalability is essential for deep analysis. In many cases, documents are modeled as term or feature vectors and latent semantic analysis (LSA) is used to model latent, or hidden, relationships between documents and terms appearing in those documents. LSA supplies conceptual organization and analysis of document collections by modeling high-dimension feature vectors in many fewer dimensions. While past work on the scalability of LSA modeling has focused on the SVD, the goal of our work is to investigate the use of distributed memory architectures for the entire text analysis process, from data ingestion to semantic modeling and analysis. ParaText is a set of software components for distributed processing, modeling, and analysis of unstructured text. The ParaText source code is available under a BSD license, as an integral part of the Titan toolkit. ParaText components are chained-together into data-parallel pipelines that are replicated across processes on distributed-memory architectures. Individual components can be replaced or rewired to explore different computational strategies and implement new functionality. ParaText functionality can be embedded in applications on any platform using the native C++ API, Python, or Java. The ParaText MPI Process provides a 'generic' text analysis pipeline in a command-line executable that can be used for many serial and parallel analysis tasks. ParaText can also be deployed as a web service accessible via a RESTful (HTTP) API. In the web service configuration, any client can access the functionality provided by ParaText using commodity protocols ... from standard web browsers to custom clients written in any language.

  3. AHP 45: REVIEW: TIBETAN LITERARY GENRES, TEXTS, AND TEXT TYPES

    Directory of Open Access Journals (Sweden)

    Tricia Kehoe

    2017-03-01

    Full Text Available Intended as a follow-up to Cabezón and Jackson's groundbreaking Tibetan Literature: Studies in Genre (1996), Tibetan Literary Genres, Texts, and Text Types: From Genre Classification to Transformation aims to deepen our understanding of Tibetan literature by approaching Tibetan text types from systematic and historical perspectives. Growing out of a conference panel at the twelfth Tibetan Studies seminar, the book explores both pre-modern and contemporary genres, as well as issues of classification and methodologies. In doing so, this collection of essays edited by Jim Rheingans covers a great deal of new ground in terms of discussions of terminology, definitions, and the theoretical landscape pertaining to literature, genre, text boundaries, and typologies in the field of Tibetan literature. ...

  4. Investigation into the behavior of metal-argon polyatomic ions (MAr⁺) in the extraction region of inductively coupled plasma-mass spectrometry

    Energy Technology Data Exchange (ETDEWEB)

    Ebert, Chris H.; Witte, Travis M.; Houk, R.S., E-mail: rshouk@iastate.edu

    2012-10-15

    The abundances of metal-argon polyatomic ions (MAr⁺) are determined in inductively coupled plasma-mass spectrometry (ICP-MS). The ratios of MAr⁺ abundance to that for M⁺ ions are measured experimentally. These ratios are compared to expected values, calculated for typical plasma conditions using spectroscopic data. For all metals studied (Ti, V, Cr, Mn, Fe, Co, Ni, Cu, and Zn), the measured ratios are significantly lower than the calculated ratios. Increasing the plasma potential (and thereby increasing the ion kinetic energy) by means of a homemade guard electrode with a wide gap further reduces the MAr⁺/M⁺ ratio. Implementing a skimmer cone designed for high transmission of light ions increases the MAr⁺ abundance. Considering this evidence, the scarcity of MAr⁺ ions is attributed to collision induced dissociation (CID), likely due to a shock wave at the tip of or in the throat of the skimmer cone. Highlights: MAr⁺ ions are less abundant in the mass spectrum than expected from the ICP; increasing the plasma potential reduces their abundance further; the extraction lens voltage does not greatly affect the MAr⁺ abundances; the weakly bound MAr⁺ ions are probably dissociated by collisions during extraction.

  5. Text mining of web-based medical content

    CERN Document Server

    Neustein, Amy

    2014-01-01

    Text Mining of Web-Based Medical Content examines web mining for extracting useful information that can be used for treating and monitoring the healthcare of patients. This work provides methodological approaches to designing mapping tools that exploit data found in social media postings. Specific linguistic features of medical postings are analyzed vis-a-vis available data extraction tools for culling useful information.

  6. Text Mining Applications and Theory

    CERN Document Server

    Berry, Michael W

    2010-01-01

    Text Mining: Applications and Theory presents the state-of-the-art algorithms for text mining from both the academic and industrial perspectives.  The contributors span several countries and scientific domains: universities, industrial corporations, and government laboratories, and demonstrate the use of techniques from machine learning, knowledge discovery, natural language processing and information retrieval to design computational models for automated text analysis and mining. This volume demonstrates how advancements in the fields of applied mathematics, computer science, machine learning

  7. Hermeneutic reading of classic texts.

    Science.gov (United States)

    Koskinen, Camilla A-L; Lindström, Unni Å

    2013-09-01

    The purpose of this article is to broaden the understanding of the hermeneutic reading of classic texts. The aim is to show how the choice of a specific scientific tradition in conjunction with a methodological approach creates the foundation that clarifies the actual realization of the reading. This hermeneutic reading of classic texts is inspired by Gadamer's notion that it is the researcher's own research tradition and a clearly formulated theoretical fundamental order that shape the researcher's attitude towards texts and create the starting point that guides all reading, uncovering and interpretation. The researcher's ethical position originates in a will to openness towards what is different in the text and which constantly sets the researcher's preunderstanding and research tradition in movement. It is the researcher's attitude towards the text that allows the text to address, touch and arouse wonder. Through a flexible, lingering and repeated reading of classic texts, what is different emerges with a timeless value. The reading of classic texts is an act that may rediscover and create understanding for essential dimensions and of human beings' reality on a deeper level. The hermeneutic reading of classic texts thus brings to light constantly new possibilities of uncovering for a new envisioning and interpretation for a new understanding of the essential concepts and phenomena within caring science. © 2012 The Authors Scandinavian Journal of Caring Sciences © 2012 Nordic College of Caring Science.

  8. Text messaging reduces analgesic requirements during surgery.

    Science.gov (United States)

    Guillory, Jamie E; Hancock, Jeffrey T; Woodruff, Christopher; Keilman, Jeffrey

    2015-04-01

    This study aims to determine whether communicating via short message service text message during surgery procedures leads to decreased intake of fentanyl for patients receiving regional anesthesia below the waist compared with a distraction condition and no intervention. Ninety-eight patients receiving regional anesthesia for minor surgeries were recruited from a hospital in Montreal, QC, between January and March 2012. Patients were randomly assigned to text message with a companion, text message with a stranger, play a distracting mobile phone game, or receive standard perioperative management. Participants who were asked to text message or play a game did so before receiving the anesthetic and continued until the end of the procedure. The odds of receiving supplemental analgesia during surgery for patients receiving standard perioperative management were 6.77 (P=0.009; N=13/25) times the odds for patients in the text a stranger condition (N=22/25 of patients), 4.39 times the odds for those in the text a companion condition (P=0.03; N=19/23), and 1.96 times the odds for those in the distraction condition (P=0.25; N=17/25). Text messaging during surgery provides analgesic-sparing benefits that surpass distraction techniques, suggesting that mobile phones provide new opportunities for social support to improve patient comfort and reduce analgesic requirements during minor surgeries and in other clinical settings. Wiley Periodicals, Inc.

  9. Word and text processing in acquired prosopagnosia.

    Science.gov (United States)

    Hills, Charlotte S; Pancaroglu, Raika; Duchaine, Brad; Barton, Jason J S

    2015-08-01

    A novel hypothesis of object recognition asserts that multiple regions are engaged in processing an object type, and that cerebral regions participate in processing multiple types of objects. In particular, for high-level expert processing, it proposes shared rather than dedicated resources for word and face perception, and predicts that prosopagnosic subjects would have minor deficits in visual word processing, and alexic subjects would have subtle impairments in face perception. In this study, we evaluated whether prosopagnosic subjects had deficits in processing either the word content or the style of visual text. Eleven prosopagnosic subjects, 6 with unilateral right lesions and 5 with bilateral lesions, participated. In the first study, we evaluated their word length effect in reading single words. In the second study, we assessed their time and accuracy for sorting text by word content independent of style, and for sorting text by handwriting or font style independent of word content. Only subjects with bilateral lesions showed mildly elevated word length effects. Subjects were not slowed in sorting text by word content, but were nearly uniformly impaired in accuracy for sorting text by style. Our results show that prosopagnosic subjects are impaired not only in face recognition but also in perceiving stylistic aspects of text. This supports a modified version of the many-to-many hypothesis that incorporates hemispheric specialization for processing different aspects of visual text. © 2015 American Neurological Association.

  10. Texte et contre-texte en situation de diglossie (Text and Counter-Text in a Situation of Diglossia)

    OpenAIRE

    Carpanin Marimoutou, Jean-Claude

    2015-01-01

    A text in a diglossic situation is part of an unresolved, conflictual dialogic relationship that produces the counter-text and that the counter-text reproduces in turn, displacing not the conflict itself but the poles of the conflict. An overview of Réunion literature is enough to bring out this mirror play. A study of the prefaces shows the producers' awareness of what is at stake behind the battle of texts, and how the one posited as Other seems to produce only a ...

  11. Zum Uebersetzen fachlicher Texte (On the Translation of Technical Texts)

    Science.gov (United States)

    Friederich, Wolf

    1975-01-01

    Reviews a 1974 East German publication on translation of scientific literature from Russian to German. Considers terminology, different standard levels of translation in East Germany, and other matters related to translation. (Text is in German.) (DH)

  12. English Metafunction Analysis in Chemistry Text: Characterization of Scientific Text

    Directory of Open Access Journals (Sweden)

    Ahmad Amin Dalimunte, M.Hum

    2013-09-01

    Full Text Available The objectives of this research are to identify which Metafunctions are applied in a chemistry text and how they characterize a scientific text. The research was conducted by applying content analysis. The data were a twelve-paragraph chemistry text, collected by applying a documentary technique. The document was read and analyzed to find the Metafunctions. The data were analyzed in several steps: identifying the types of process, counting the number of processes, categorizing and counting the cohesion devices, classifying the types of modulation and determining modality value, and finally counting the number of sentences and clauses and scoring the grammatical intricacy index. The findings show that the Material process (71 of 100) is the most used, and that circumstance of spatial location (26 of 56) is more dominant than the others. Modality (5) is used sparingly in order to avoid subjectivity. Impersonality is implied through the sparing use of reference, either pronouns (7) or demonstratives (7); conjunctions (60) are applied to develop ideas; and the total number of clauses (109) is much larger than the total number of sentences (40), which results in a high grammatical intricacy index. The Metafunctions found indicate that the chemistry text fulfils the characteristics of a scientific or academic text, which truly reflects it as a natural science.
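
The abstract reports the clause and sentence counts but not the formula behind the grammatical intricacy index. A minimal sketch, assuming the common definition of the index as the number of clauses divided by the number of sentences (the function name is hypothetical, and the authors may have scored it differently):

```python
def grammatical_intricacy(clauses: int, sentences: int) -> float:
    """Clauses per sentence; the definition is an assumption, not taken from the paper."""
    if sentences <= 0:
        raise ValueError("sentences must be a positive count")
    return clauses / sentences


# Counts reported in the abstract: 109 clauses in 40 sentences.
index = grammatical_intricacy(109, 40)
print(f"grammatical intricacy index: {index:.3f}")
```

Under that assumed definition, the reported counts give roughly 2.7 clauses per sentence.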

  13. Comparação de soluções extratoras de ferro e manganês em solos da Amazônia (Comparison of extracting solutions for iron and manganese in soils of the Amazon Region)

    Directory of Open Access Journals (Sweden)

    Maria do Rosário Lobato Rodrigues

    2001-01-01

    Full Text Available The aim of this work was to compare extracting solutions (Mehlich 1, Mehlich 3, DTPA-TEA) for iron and manganese in representative soils of the Amazon Region. The correlations between these micronutrients in the soils and their concentrations and contents in the shoot dry matter of rice plants from three successive crops were determined. The technique of diagnosis by subtraction was applied, in a randomized block design with split plots. The soils used were Podzol, Yellow Podzolic, Red-Yellow Podzolic, Yellow Latosol, Humic Latosol, and Alluvial, under eight treatments: control, complete, and omission of one of the micronutrients B, Cu, Fe, Mn, Mo, and Zn. The micronutrients and liming were applied only before the first crop. The first harvest was carried out at 58 days, the second at 68 days, and the third at 70 days after seedling emergence. The Mehlich 3 extracting solution showed the highest correlation with the micronutrient concentration in the plant. The best coefficient of determination was observed between the Mehlich 1 and Mehlich 3 solutions for the Fe extracted from the Podzol, Alluvial, and Red-Yellow Podzolic soils. For exchangeable Mn, the three extractants proved efficient in determining the element in the different soils, with significant coefficients of determination among them.

  14. Extraction of polyphenols

    Directory of Open Access Journals (Sweden)

    Loucif Seiad L.

    2013-07-01

    Full Text Available The aim of the study is to investigate the influence of certain parameters on the efficiency of the extraction of polyphenols from an Algerian tree (Pinus halepensis Mill.). Extraction was conducted in a stirred closed extractor. The study was conducted to optimize the extraction conditions for total phenolic content (TPC) using the Folin-Ciocalteu method. A response surface methodology (RSM) was used to investigate the influence of process variables on extraction, followed by a composite design (CD) approach. The statistical analysis revealed that the optimized conditions were a temperature of 45°C and the smallest particles.

  15. Free-Text Disease Classification

    Science.gov (United States)

    2011-09-01

    1-59593-597-7. http://doi.acm.org/10.1145/1277741.1277889. [9] Ingo Feinerer. tm: Text Mining Package, 2011. http://tm.r-forge.r-project.org/. R...package version 0.5-6. [10] Duncan Temple Lang. r-cran-xml, 2011. [11] Ingo Feinerer, Kurt Hornik, and David Meyer. Text mining infrastructure in r

  16. Strategies for Translating Vocative Texts

    Directory of Open Access Journals (Sweden)

    Olga COJOCARU

    2014-12-01

    Full Text Available The paper deals with the linguistic and cultural elements of vocative texts and the techniques used in translating them by giving some examples of texts that are typically vocative (i.e. advertisements and instructions for use. Semantic and communicative strategies are popular in translation studies and each of them has its own advantages and disadvantages in translating vocative texts. The advantage of semantic translation is that it takes more account of the aesthetic value of the SL text, while communicative translation attempts to render the exact contextual meaning of the original text in such a way that both content and language are readily acceptable and comprehensible to the readership. Focus is laid on the strategies used in translating vocative texts, strategies that highlight and introduce a cultural context to the target audience, in order to achieve their overall purpose, that is to sell or persuade the reader to behave in a certain way. Thus, in order to do that, a number of advertisements from the field of cosmetics industry and electronic gadgets were selected for analysis. The aim is to gather insights into vocative text translation and to create new perspectives on this field of research, now considered a process of innovation and diversion, especially in areas as important as economy and marketing.

  17. Text Genres in Information Organization

    Science.gov (United States)

    Nahotko, Marek

    2016-01-01

    Introduction: Text genres used by so-called information organizers in the processes of information organization in information systems were explored in this research. Method: The research employed text genre socio-functional analysis. Five genre groups in information organization were distinguished. Every genre group used in information…

  18. Intercultural Rhetoric Research: Beyond Texts

    Science.gov (United States)

    Connor, Ulla

    2004-01-01

    This paper proposes a set of new methods for intercultural rhetoric research that is context-sensitive and, in many instances, goes beyond mere text analysis. It considers changes in the field as intercultural rhetoric has moved from the EAP study of student essays to the study of writing in many disciplines and genres. New developments in text,…

  19. The Case for Multiple Texts

    Science.gov (United States)

    Cummins, Sunday

    2017-01-01

    Reading just one text on any topic, Cummins argues, isn't enough if we expect students to learn at deep levels about the topic, synthesize various sources of information, and gain the knowledge they need to write and speak seriously about the topic. Reading a second or third text expands a reader's knowledge on any topic or story--and the why…

  20. Understanding and Teaching Complex Texts

    Science.gov (United States)

    Fisher, Douglas; Frey, Nancy

    2014-01-01

    Teachers in today's classrooms struggle every day to design instructional interventions that would build students' reading skills and strategies in order to ensure their comprehension of complex texts. Text complexity can be determined in both qualitative and quantitative ways. In this article, the authors describe various innovative…

  1. LOTUS: Linked open text unleashed

    NARCIS (Netherlands)

    Ilievski, F.; Beek, Wouter; Van Erp, Marieke; Rietveld, Laurens; Schlobach, Stefan

    2015-01-01

    It is difficult to find resources on the Semantic Web today, in particular if one wants to search for resources based on natural language keywords and across multiple datasets. In this paper, we present LOTUS: Linked Open Text UnleaShed, a full-text lookup index over a huge Linked Open Data

  2. Enriching text with images and colored light

    Science.gov (United States)

    Sekulovski, Dragan; Geleijnse, Gijs; Kater, Bram; Korst, Jan; Pauws, Steffen; Clout, Ramon

    2008-01-01

    We present an unsupervised method to enrich textual applications with relevant images and colors. The images are collected by querying large image repositories and subsequently the colors are computed using image processing. A prototype system based on this method is presented where the method is applied to song lyrics. In combination with a lyrics synchronization algorithm the system produces a rich multimedia experience. In order to identify terms within the text that may be associated with images and colors, we select noun phrases using a part-of-speech tagger. Large image repositories are queried with these terms. For each term, representative colors are extracted from the collected images, using either a histogram-based or a mean shift-based algorithm. The representative color extraction uses the non-uniform distribution of the colors found in the large repositories. The images that are ranked best by the search engine are displayed on a screen, while the extracted representative colors are rendered on controllable lighting devices in the living room. We evaluate our method by comparing the computed colors to standard color representations of a set of English color terms. A second evaluation focuses on the distance in color between a queried term in English and its translation in a foreign language. Based on results from three sets of terms, a measure of suitability of a term for color extraction based on KL divergence is proposed. Finally, we compare the performance of the algorithm using either the automatically indexed repository of Google Images or the manually annotated Flickr.com. Based on the results of these experiments, we conclude that the presented method can compute the relevant color for a term using a large image repository and image processing.
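
The abstract names a histogram-based option for representative color extraction without giving details. A minimal sketch of one plausible reading, assuming pixels are pooled from the images retrieved for a term, quantized into a coarse RGB histogram, and represented by the centre of the fullest bin (the function name, bin count, and sample pixels are all hypothetical and not taken from the paper):

```python
from collections import Counter


def representative_color(pixels, bins_per_channel=4):
    """Return the centre of the most populated bin of a coarse RGB histogram.

    pixels: iterable of (r, g, b) tuples with channel values in 0..255.
    This is an illustrative stand-in, not the authors' algorithm.
    """
    if 256 % bins_per_channel != 0:
        raise ValueError("bins_per_channel must divide 256")
    width = 256 // bins_per_channel
    # Quantize every pixel to its bin index triple and count bin occupancy.
    counts = Counter((r // width, g // width, b // width) for r, g, b in pixels)
    if not counts:
        raise ValueError("no pixels given")
    fullest_bin, _ = counts.most_common(1)[0]
    # Map the winning bin back to an RGB value at the bin centre.
    return tuple(index * width + width // 2 for index in fullest_bin)


# Hypothetical pixels pooled from images returned for one query term.
sample = [(250, 10, 10), (240, 20, 5), (255, 0, 0), (10, 10, 250)]
print(representative_color(sample))
```

A coarse histogram is robust to small shading differences between images, at the cost of color resolution; the mean shift alternative mentioned in the abstract would avoid the fixed bin grid.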

  3. Linguistic Dating of Biblical Texts

    DEFF Research Database (Denmark)

    Ehrensvärd, Martin Gustaf

    2003-01-01

    For two centuries, scholars have pointed to consistent differences in the Hebrew of certain biblical texts and interpreted these differences as reflecting the date of composition of the texts. Until the 1980s, this was quite uncontroversial as the linguistic findings largely confirmed the chronology of the texts established by other means: the Hebrew of Genesis-2 Kings was judged to be early and that of Esther, Daniel, Ezra, Nehemiah, and Chronicles to be late. In the current debate where revisionists have questioned the traditional dating, linguistic arguments in the dating of texts have come more into focus. The study critically examines some linguistic arguments adduced to support the traditional position, and reviewing the arguments it points to weaknesses in the linguistic dating of EBH texts to pre-exilic times. When viewing the linguistic evidence in isolation it will be clear...

  4. Text structures in medical text processing: empirical evidence and a text understanding prototype.

    Science.gov (United States)

    Hahn, U.; Romacker, M.

    1997-01-01

    We consider the role of textual structures in medical texts. In particular, we examine the impact that failing to recognize text phenomena has on the validity of medical knowledge bases fed by a natural language understanding front-end. First, we review the results from an empirical study on a sample of medical texts, considering various forms of local coherence phenomena (anaphora and textual ellipses). We then discuss the representation bias that is likely to emerge in the text knowledge base when these phenomena are not dealt with, mainly the emergence of referentially incoherent and invalid representations. We then turn to a medical text understanding system designed to account for local text coherence. PMID:9357739

  5. Toward text understanding: classification of text documents by word map

    Science.gov (United States)

    Visa, Ari J. E.; Toivanen, Jarmo; Back, Barbro; Vanharanta, Hannu

    2000-04-01

    In many fields, for example business, engineering, and law, there is interest in searching and classifying text documents in large databases. Existing information retrieval methods are mainly based on keywords, so retrieval is problematic in cases where keywords are lacking. One approach is to use the whole text document as a search key. Neural networks offer an adaptive tool for this purpose. This paper suggests a new adaptive approach to the problem of clustering and search in large text document databases. The approach is a multilevel one based on word, sentence, and paragraph level maps. Here only the word map level is reported. The reported approach is based on smart encoding, on Self-Organizing Maps, and on document histograms. The results are very promising.

  6. Biomarker Identification Using Text Mining

    Directory of Open Access Journals (Sweden)

    Hui Li

    2012-01-01

    Full Text Available Identifying molecular biomarkers has become one of the important tasks for scientists assessing the different phenotypic states of cells or organisms correlated with the genotypes of diseases from large-scale biological data. In this paper, we propose a text-mining-based method to discover biomarkers from PubMed. First, we construct a database based on a dictionary, and then we use a finite state machine to identify the biomarkers. Our text-mining method provides a highly reliable approach to discovering biomarkers in the PubMed database.
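
The abstract describes a dictionary-backed database scanned with a finite state machine but gives no implementation details. A minimal sketch of that general idea, assuming the dictionary is compiled into a token-level trie whose nodes act as states and the scanner keeps the longest match at each position (the dictionary entries, function names, and sample sentence are hypothetical, not the authors' data):

```python
import re

END = "$end"  # marker key for accepting states; cannot collide with a token


def build_trie(terms):
    """Compile dictionary terms into a token-level trie (the machine's states)."""
    root = {}
    for term in terms:
        node = root
        for token in term.lower().split():
            node = node.setdefault(token, {})
        node[END] = term  # accepting state remembers the original term
    return root


def find_terms(text, trie):
    """Scan tokens left to right, emitting the longest dictionary match at each position."""
    tokens = re.findall(r"[a-z0-9-]+", text.lower())
    hits, i = [], 0
    while i < len(tokens):
        node, j, longest = trie, i, None
        while j < len(tokens) and tokens[j] in node:
            node = node[tokens[j]]  # follow the transition for this token
            j += 1
            if END in node:
                longest = (node[END], j)
        if longest:
            hits.append(longest[0])
            i = longest[1]  # resume after the matched term
        else:
            i += 1
    return hits


# Hypothetical dictionary and sentence, for illustration only.
trie = build_trie(["PSA", "prostate specific antigen", "HER2"])
print(find_terms("Elevated prostate specific antigen (PSA) and HER2 were reported.", trie))
```

Matching on whole tokens rather than characters avoids false hits inside longer words, though a production system would also need to handle synonyms and spelling variants.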

  7. A STUDY OF TEXT MINING METHODS, APPLICATIONS, AND TECHNIQUES

    OpenAIRE

    R. Rajamani; S. Saranya

    2017-01-01

    Data mining is used to extract useful information from large amounts of data. It is used to implement and solve different types of research problems. Research areas related to data mining include text mining, web mining, image mining, sequential pattern mining, spatial mining, medical mining, multimedia mining, structure mining and graph mining. Text mining, also referred to as text data mining, is also called knowledge discovery in text (KDT) or intelligent text analysis. T...

  8. An Experimental Text-Commentary

    Science.gov (United States)

    O'Brien, Joan

    1976-01-01

    An experimental text-commentary of selected passages from Sophocles' "Antigone" is described. The commentary is intended for students seeking more than a conventional translation who do not know enough Greek to use a standard commentary. (RM)

  9. Anomaly Detection with Text Mining

    Data.gov (United States)

    National Aeronautics and Space Administration — Many existing complex space systems have a significant amount of historical maintenance and problem data bases that are stored in unstructured text forms. The...

  10. Text Detection and Translation from Natural Scenes

    National Research Council Canada - National Science Library

    Gao, Jiang; Yang, Jie; Zhang, Ying; Waibel, Alex

    2001-01-01

    .... The paper addresses challenges in automatic sign extraction and translation, describes methods for automatic sign extraction, and extends example-based machine translation technology for sign translation...

  11. Individual Profiling Using Text Analysis

    Science.gov (United States)

    2016-04-15

    likelihood that it belongs to the input text, although early experiments showed that this added no benefit. Parts-of-speech In early experiments all...tweets were POS tagged as part of the pre-processing step using a Twitter-specific part-of-speech tagger [8]. Various studies have identified POS tags as...AFRL-AFOSR-UK-TR-2016-0011 Individual Profiling using Text Analysis 140333 Mark Stevenson UNIVERSITY OF SHEFFIELD, DEPARTMENT OF PSYCHOLOGY Final

  12. Phytochemical profile and analgesic evaluation of Vitex cymosa leaf extracts

    Directory of Open Access Journals (Sweden)

    Suzana Guimarães Leitão

    2011-09-01

    Full Text Available Vitex cymosa Bertero ex Spreng., Lamiaceae, is found in the Central and Amazon regions of Brazil, where it is popularly used as an antirheumatic. Extracts from the leaves of V. cymosa were tested in analgesia models: abdominal contortions induced by acetic acid and the formalin test, to assess peripheral analgesia, as well as the tail flick and hot plate models, to assess spinal and supraspinal analgesia. A significant reduction in the number of contortions was observed with all extracts and at all doses. In the formalin model, a reduction in the second (inflammatory) phase was observed with all extracts, whereas only the n-butanol extract was able to act in the first, neurogenic, phase. In the tail flick model, all extracts increased latency time. Naloxone treatment reversed the analgesic effect of all extracts with the exception of the dichloromethane one. All extracts developed peripheral and central analgesic activity. In the hot plate model no antinociceptive effect was observed for any of the tested extracts. Taken together, these results suggest that V. cymosa leaf extracts were able to promote peripheral and central antinociceptive activity mediated by the opioid system. Twenty-three substances were isolated and identified in the extracts, including flavonoids (C-glucosyl flavones, flavones and flavonols), triterpene acids of the ursane and oleanane types, iridoids (free and glucosides), as well as simple phenols.

  13. Thinking About Religious Texts Anthropologically

    Directory of Open Access Journals (Sweden)

    Joel S. Kahn

    2016-01-01

    Full Text Available This paper addresses the conference themes by asking what contribution anthropology can make to the study of religious literature and heritage. In particular I will discuss ways in which anthropologists engage with religious texts. The paper begins with an assessment of what is probably the dominant approach to religious texts in mainstream anthropology and sociology, namely avoiding them and focussing instead on the religious ‘practices’ of ‘ordinary believers’. Arguing that this tendency to neglect the study of texts is ill-advised, the paper looks at the reasons why anthropologists need to engage with contemporary religious texts, particularly in their studies of/in the modern Muslim world. Drawing on the insights of anthropologist of religion Joel Robbins into what he called the “awkward relationship” between anthropology and theology, the paper proposes three possible ways in which anthropology might engage with religious literature. Based on a reading of three rather different modern texts on or about Islam, the strengths and weaknesses of each of the three modes of anthropological engagement is assessed and a case is made for Robbins’s third approach on the grounds that it offers a way out of the impasse in which mainstream anthropology of religion finds itself, caught as it is between the ‘emic’ and the ‘etic’, i.e. between ontologically different worlds.

  14. A comprehensive and quantitative comparison of text-mining in 15 million full-text articles versus their corresponding abstracts

    DEFF Research Database (Denmark)

    Westergaard, David; Stærfeldt, Hans Henrik; Tønsberg, Christian

    2018-01-01

    million English scientific full-text articles published during the period 1823-2016. We describe the development in article length and publication sub-topics during these nearly 250 years. We showcase the potential of text mining by extracting published protein-protein, disease-gene, and protein......-text articles consistently outperforms using abstracts only....

  15. A method for extracting burned areas from Landsat TM/ETM+ images by soft aggregation of multiple Spectral Indices and a region growing algorithm

    Science.gov (United States)

    Stroppiana, D.; Bordogna, G.; Carrara, P.; Boschetti, M.; Boschetti, L.; Brivio, P. A.

    2012-04-01

    Since fire is a major threat to forests and wooded areas in the Mediterranean environment of Southern Europe, systematic regional fire monitoring is a necessity. Satellite data constitute a unique cost-effective source of information on the occurrence of fire events and on the extent of the area burned. Our objective is to develop a (semi-)automated algorithm for mapping burned areas from medium spatial resolution (30 m) satellite data. In this article we present a multi-criteria approach based on Spectral Indices, soft computing techniques and a region growing algorithm; theoretically this approach relies on the convergence of partial evidence of burning provided by the indices. Our proposal features several innovative aspects: it is flexible in adapting to a variable number of indices and to missing data; it exploits positive and negative evidence (bipolar information) and it offers different criteria for aggregating partial evidence in order to derive the layers of candidate seeds and candidate region growing boundaries. The study was conducted on a set of Landsat TM images, acquired for the year 2003 over Southern Europe and pre-processed with the LEDAPS (Landsat Ecosystem Disturbance Adaptive Processing System) processing chain for deriving surface spectral reflectance ρi in the TM bands. The proposed method was applied to show its flexibility and the sensitivity of the accuracy of the resulting burned area maps to different aggregation criteria and thresholds for seed selection. Validation performed over an entire independent Landsat TM image shows the commission and omission errors to be below 21% and 3%, respectively.
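    The two-stage idea in the abstract above (aggregate partial evidence from several indices, then grow regions from confident seeds) can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the averaging operator, the 4-neighbour connectivity, the threshold values and the function names `soft_aggregate` and `grow_regions` are all assumptions made for the example. Per-pixel index scores are assumed to be already scaled to [0, 1], with `None` marking missing data.

```python
from collections import deque


def soft_aggregate(index_layers):
    """Average per-pixel scores over all index layers, skipping missing (None) values.

    Averaging is an assumed aggregation operator chosen for simplicity.
    """
    rows, cols = len(index_layers[0]), len(index_layers[0][0])
    evidence = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [layer[r][c] for layer in index_layers if layer[r][c] is not None]
            evidence[r][c] = sum(vals) / len(vals) if vals else 0.0
    return evidence


def grow_regions(evidence, seed_thr, grow_thr):
    """Mark seeds (evidence >= seed_thr) and grow them into 4-connected
    neighbours whose evidence is at least grow_thr."""
    rows, cols = len(evidence), len(evidence[0])
    burned = [[False] * cols for _ in range(rows)]
    seeds = [(r, c) for r in range(rows) for c in range(cols)
             if evidence[r][c] >= seed_thr]
    for r, c in seeds:
        burned[r][c] = True
    queue = deque(seeds)
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and not burned[nr][nc] and evidence[nr][nc] >= grow_thr):
                burned[nr][nc] = True
                queue.append((nr, nc))
    return burned


# Two hypothetical index layers on a 3 x 4 grid (values are made up).
layer_a = [[0.9, 0.6, 0.2, 0.6], [0.1, 0.5, 0.1, 0.1], [0.0, 0.1, 0.1, 0.7]]
layer_b = [[0.9, None, 0.2, 0.6], [0.1, 0.7, 0.1, 0.1], [0.0, 0.1, 0.1, 0.5]]
evidence = soft_aggregate([layer_a, layer_b])
burned = grow_regions(evidence, seed_thr=0.8, grow_thr=0.5)
```

    In this toy grid the pixels with moderate evidence that are not connected to a seed stay unmarked, which is the intended effect of requiring a confident seed before growing.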

  16. Analysing ESP Texts, but How?

    Directory of Open Access Journals (Sweden)

    Borza Natalia

    2015-03-01

    Full Text Available English as a second language (ESL) teachers instructing general English and English for specific purposes (ESP) in bilingual secondary schools face various challenges when it comes to choosing the main linguistic foci of language preparatory courses enabling non-native students to study academic subjects in English. ESL teachers intending to analyse English language subject textbooks written for secondary school students, with the aim of gaining information about what bilingual secondary school students need to know in terms of language to process academic textbooks, cannot avoid dealing with a dilemma. It needs to be decided in which way it is most appropriate to analyse the texts in question. Handbooks of English applied linguistics are not immensely helpful with regard to this problem, as they tend not to give recommendations as to which major text analytical approaches are advisable to follow in a pre-college setting. The present theoretical research aims to address this lacuna. Accordingly, the purpose of this pedagogically motivated theoretical paper is to investigate two major approaches of ESP text analysis, the register and the genre analysis, in order to find the more suitable one for exploring the language use of secondary school subject texts from the point of view of an English as a second language teacher. Comparing and contrasting the merits and limitations of the two contrastive approaches allows for a better understanding of the nature of the two different perspectives of text analysis. The study examines the goals, the scope of analysis, and the achievements of the register perspective and those of the genre approach alike. The paper also investigates and reviews in detail the starkly different methods of ESP text analysis applied by the two perspectives. Discovering text analysis from a theoretical and methodological angle supports a practical aspect of English teaching, namely making an informed choice when setting out to analyse

  17. A rapid and reliable method for discriminating rice products from different regions using MCX-based solid-phase extraction and DI-MS/MS-based metabolomics approach.

    Science.gov (United States)

    Lim, Dong Kyu; Mo, Changyeun; Long, Nguyen Phuoc; Lim, Jongguk; Kwon, Sung Won

    2017-09-01

    The expansion of the global rice marketplace ultimately raises concerns about authenticity control. Several analytical methods for differentiating the geographical origin of rice have been developed, yet a high-throughput method is still in demand. In this study, we developed a rapid approach using direct infusion-mass spectrometry (DI-MS) to distinguish rice products from different countries. Specifically, the elimination of the matrix effect by a polytetrafluoroethylene (PTFE) filter, a mixed-mode cation exchange (MCX) solid-phase extraction (SPE) with 20% methanol, and an MCX SPE with 100% methanol were measured. Afterward, partial least squares discriminant analysis and random forests were applied to seek the optimal discrimination method. The results revealed that the combination of MCX SPE with 100% methanol and DI-MS in positive ion mode (accuracy=1.000, R2=0.916, Q2=0.720, B/W-based p-value=0.015) or the combination of MCX SPE with 20% methanol and targeted DI-MS/MS in positive ion mode (accuracy=1.000, R2=0.931, Q2=0.849, B/W-based p-value=0.002) showed excellent discriminatory ability. Furthermore, differentially expressed metabolites including sodiated lysophosphatidylcholine, lysophosphatidylcholine, lysophosphatidylethanolamine and lysophosphatidylglycerol classes were found. In conclusion, our study provides a rapid and reliable platform for geographical discrimination of white rice and will contribute to the authenticity control of rice products. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Comparision of conventional and supercritical CO2-extracted rosehip oil

    Directory of Open Access Journals (Sweden)

    J.M. del Valle

    2000-09-01

    Full Text Available Supercritical CO2 (SCO2) can be utilized to extract oils from a number of plant materials as a nontoxic alternative to hexane, and there is industrial interest in using SCO2 extraction to obtain high-quality oils for cosmetics and other high-value applications. A possible substrate is rosehip (Rosa aff. rubiginosa) seed. The scope of our work was to select SCO2 extraction conditions and to compare cold-pressed, hexane-extracted and SCO2-extracted rosehip oil. We used a fractional factorial experimental design with extraction temperature (T, 40-60 °C), extraction pressure (p, 300-500 bar) and dynamic extraction time (t, 90-270 min) as independent variables and yield and color as response variables. Samples of 100 g flaked rosehip seeds were extracted with 21 g CO2/min, following a static extraction (15 min) adjustment period. Resulting data were analyzed using response surface methodology. Extracted oil (4.7-7.1% in our experimental region) increased slightly with p and more pronouncedly with T and especially t. On the other hand, the photometric color index was independent of t but worsened (increased) as a result of an increase in either p or especially T. We extracted five batches of 250 g seeds with 21 g CO2/min at 40 °C and 300 bar for 270 min and compared the oil with samples obtained by solvent extraction (a batch of 2.5 kg of laminated seeds was treated with 10 L hexane and rotaevaporated until there was virtually no residual hexane) and cold pressing, by determining color, fatty acid composition, iodine index and saponification index. It was concluded that SCO2 allows an almost complete recovery of rosehip oil (6.5% yield), which is of a better quality than the oil extracted with hexane. Yield was higher than it was when using a cold-pressing process (5.0% yield).

  19. GPU-Accelerated Text Mining

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Mueller, Frank [North Carolina State University; Zhang, Yongpeng [ORNL; Potok, Thomas E [ORNL

    2009-01-01

    Accelerating hardware devices represent a novel promise for improving the performance of many problem domains, but it is not clear which accelerators are suitable for which domains. While there is no room in general-purpose processor design to significantly increase the processor frequency, developers are instead resorting to multi-core chips duplicating conventional computing capabilities on a single die. Yet, accelerators offer more radical designs with a much higher level of parallelism and novel programming environments. This present work assesses the viability of text mining on CUDA. Text mining is one of the key concepts that has become prominent as an effective means to index the Internet, but its applications range beyond this scope and extend to providing document similarity metrics, the subject of this work. We have developed and optimized text search algorithms for GPUs to exploit their potential for massive data processing. We discuss the algorithmic challenges of parallelization for text search problems on GPUs and demonstrate the potential of these devices in experiments by reporting significant speedups. Our study may be one of the first to assess more complex text search problems for suitability for GPU devices, and it may also be one of the first to exploit and report on the use of atomic instructions, which have recently become available in NVIDIA devices.

  20. Comprehending text in literature class

    Directory of Open Access Journals (Sweden)

    Purić Daliborka S.

    2016-01-01

    Full Text Available The paper discusses the problem of understanding a text and the contribution of the methodological apparatus in the reader book to comprehension of a text being read in junior classes of elementary school. By using the technique of content analysis on the methodological apparatuses in eight reader books for the fourth grade of elementary school, approved for use in the 2014/2015 academic year, and surveying 350 teachers in 33 elementary schools and 11 administrative districts in the Republic of Serbia, we examined: (a) to what extent the Serbian language textbook contents enable junior students to understand a literary text; (b) to what extent teachers accept the suggestions offered in the textbook for preparing literature teaching. The results show that a large number of suggestions relate to reading comprehension, but some categories of understanding are unevenly distributed in the methodological apparatus. On the other hand, the majority of teachers use the methodological apparatus given in a textbook for preparing classes, not only the textbook he or she selected for teaching but also other textbooks for the same grade.

  1. A Fast and Robust Text Spotter.

    Science.gov (United States)

    Qin, Siyang; Manduchi, Roberto

    2016-03-01

    We introduce an algorithm for text detection and localization ("spotting") that is computationally efficient and produces state-of-the-art results. Our system uses multi-channel MSERs to detect a large number of promising regions, then subsamples these regions using a clustering approach. Representatives of region clusters are binarized and then passed on to a deep network. A final line grouping stage forms word-level segments. On the ICDAR 2011 and 2015 benchmarks, our algorithm obtains an F-score of 82% and 83%, respectively, at a computational cost of 1.2 seconds per frame. We also introduce a version that is three times as fast, with only a slight reduction in performance.

  2. Princess Brambilla - images/text

    Directory of Open Access Journals (Sweden)

    Maria Aparecida Barbosa

    2016-01-01

    Full Text Available To read an illustrated literary text is to think in pictures and words simultaneously. This articulation between the written text and pictures adds potential, expands, and becomes complex. It coincides with present-day discussions of Giorgio Agamben's "contemporary", which adds to adherence to one's own time the displacement and distance needed to understand it, and which shakes linear notions of historical chronology. In some way this coincidence is related to the current interest in the concept of "Nachleben" (survival), which presupposes the recovery of images of the past, postulated by the art historian Aby Warburg in his research on characteristics of motion from ancient art in Botticelli's Renaissance pictures. For the translation into Portuguese of Princesa Brambilla – um capriccio segundo Jakob Callot, de E. T. A. Hoffmann, com 8 gravuras cunhadas a partir de moldes originais de Callot (1820), such discussions were fundamental, as I try to show in this article.

  3. Lidový text a grafika

    OpenAIRE

    Lukš, Jiří

    2015-01-01

    The dissertation "The Folk Text and Graphic Art" studies the song as a topic for graphic and book production. Within the practical part of the dissertation the author works out a graphic design for an original song-book, which presents his former music band's texts. He surveys the clash of today's fashionable music trends with folk traditions in his region and asks a question about the character of the contemporary folk song. The author's song-book is one of the answers. On the basis of this effort he...

  4. Quality Inspection of Printed Texts

    DEFF Research Database (Denmark)

    Pedersen, Jesper Ballisager; Nasrollahi, Kamal; Moeslund, Thomas B.

    2016-01-01

    Inspecting the quality of printed texts is important in many industrial applications. To do so, this paper proposes a grading system which evaluates the performance of the printing task using some quality measures for each character and symbol. The purpose of this grading system is two......-folded: for customers of the printing and verification system, the overall grade is used to verify whether the text is of sufficient quality, while for the printer's manufacturer, the detailed character/symbol grades and quality measurements are used for the improvement and optimization of the printing task. The proposed system...

  5. Text mining in livestock animal science: introducing the potential of text mining to animal sciences.

    Science.gov (United States)

    Sahadevan, S; Hofmann-Apitius, M; Schellander, K; Tesfaye, D; Fluck, J; Friedrich, C M

    2012-10-01

    In biological research, establishing the prior art by searching and collecting information already present in the domain is as important as the experiments themselves. To obtain a complete overview of the relevant knowledge, researchers mainly rely on 2 major information sources: i) various biological databases and ii) scientific publications in the field. The major difference between the 2 information sources is that information from databases is available, typically well structured and condensed. The information content in scientific literature is largely unstructured; that is, dispersed among the many different sections of scientific text. The traditional method of information extraction from scientific literature consists of generating a list of relevant publications in the field of interest and manually scanning these texts for relevant information, which is very time consuming. It is more than likely that in using this "classical" approach the researcher misses some relevant information mentioned in the literature or has to go through biological databases to extract further information. Text mining and named entity recognition methods have already been used in human genomics and related fields as a solution to this problem. These methods can process and extract information from large volumes of scientific text. Text mining is defined as the automatic extraction of previously unknown and potentially useful information from text. Named entity recognition (NER) is defined as the method of identifying named entities (names of real world objects; for example, gene/protein names, drugs, enzymes) in text. In animal sciences, text mining and related methods have been briefly used in murine genomics and associated fields, leaving behind other fields of animal sciences, such as livestock genomics. The aim of this work was to develop an information retrieval platform in the livestock domain focusing on livestock publications and the recognition of relevant data from

  6. Extraction of all propagation constants in a specified region from the transcendental equation of a dispersion relation using the Sakurai-Sugiura projection method.

    Science.gov (United States)

    Sato, Shingo; Shimada, Takao; Hasegawa, Koji

    2015-07-01

    A transcendental equation occurs when we compute the dispersion relations of an electromagnetic waveguide, such as a planar multilayer waveguide. Without an initial guess, the Sakurai-Sugiura projection method (SSM) can obtain solutions to the transcendental equation in a region bounded by a contour integral path in the complex plane. In this paper, a criterion employing the condition number of eigenvalues as a simple index to distinguish physical solutions from spurious ones in the SSM is presented, and a transcendental equation of a multilayer waveguide obtained by the transfer matrix method is solved by the SSM. Numerical results show the usefulness of the index and good agreement with the results of the argument principle method and Newton's method.
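    The abstract above compares the SSM against the argument principle method and Newton's method. The sketch below illustrates only those two comparison techniques, not the SSM itself: it counts the zeros of an analytic function inside a circular contour by numerically integrating f'(z)/f(z), then refines one zero with Newton's method. The test function cos(z), the contour, the starting guess and the function names `count_zeros` and `newton` are assumptions chosen for illustration and are unrelated to the waveguide equation in the paper.

```python
import cmath
import math


def count_zeros(f, df, center, radius, n=512):
    """Approximate (1 / 2*pi*i) * contour integral of f'(z)/f(z) dz over a circle.

    With z = center + radius * exp(i*theta), dz = i * radius * exp(i*theta) d(theta),
    so the trapezoidal rule reduces to an average over n equally spaced points.
    For f analytic and nonzero on the contour, the result is the number of zeros
    inside it (counted with multiplicity).
    """
    total = 0j
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)
        z = center + radius * w
        total += df(z) / f(z) * radius * w
    return total / n


def newton(f, df, z0, steps=20):
    """Plain Newton iteration; requires a starting guess, unlike the SSM."""
    z = z0
    for _ in range(steps):
        z = z - f(z) / df(z)
    return z


f = cmath.cos
df = lambda z: -cmath.sin(z)

n_zeros = count_zeros(f, df, center=0.0, radius=2.0)  # zeros at +pi/2 and -pi/2
root = newton(f, df, 1.5)                             # converges towards pi/2
```

    The contrast with the abstract's claim is worth noting: Newton's method needs the guess `1.5`, whereas contour-integral approaches such as the one above (and the SSM) work from a region alone.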

  7. Seductive Texts with Serious Intentions.

    Science.gov (United States)

    Nielsen, Harriet Bjerrum

    1995-01-01

    Debates whether a text claiming to have scientific value is using seduction irresponsibly at the expense of the truth, and discusses who is the subject and who is the object of such seduction. It argues that, rather than being an assault against scientific ethics, seduction is a necessary premise for a sensible conversation to take place. (GR)

  8. Text linguistics: memory and representation

    Directory of Open Access Journals (Sweden)

    Leonor Lopes Fávero

    2012-12-01

    Full Text Available Text Linguistics originated in Brazil in the 1980s. The first work that we know of is from 1981, authored by Prof. Ignacio Antonio Neiss and entitled Por uma gramática textual, which was followed by two others in 1983: Linguística textual: o que é e como se faz, by Prof. Luiz Antônio Marcuschi, and Linguística textual: introdução, by Leonor Lopes Favero and Ingedore Villaça Koch. Professor Neiss shows how initial attempts at textual linguistics were generally related to structural and generative grammars. The work of Prof. Marcuschi focuses on the analysis of some text definitions and on the study of theoretical aspects in relation to their applicability. Leonor Lopes Favero and Ingedore V. Koch aim to provide the Brazilian reader with an overview of text linguistics in Europe, then a recent branch of language science. This work is part of the History of Linguistic Ideas, itself part of Cultural History, which seeks to identify how, at different times, a social reality is constructed, designed, and enlightened (Chartier, 1990).

  9. Text Comprehension Processes in Bilinguals.

    Science.gov (United States)

    1985-08-01

    work unit 03.05 (utilization of bilingual Navy personnel). The objective of this work unit is to understand and improve the communicative competence of...project aimed at understanding and improving the communicative competence of bilingual personnel. Background Chang (1984) found that the text

  10. Hebrew Text Database ETCBC4

    NARCIS (Netherlands)

    Roorda, D.; Talstra, Eep; van Peursen, Wido Th.; Dyk, Janet; Sikkel, Constantijn; Glanz, Oliver; Oosting, Reinoud; Kalkman, Gino

    2014-01-01

    The ETCBC database of the Hebrew Bible (formerly known as WIVU database), contains the scholarly text of the Hebrew Bible with linguistic markup. A previous version can be found in EASY (see the link below). The present dataset is an improvement in many ways: (A) it contains a new version of the

  11. AHP 45: REVIEW: TIBETAN LITERARY GENRES, TEXTS, AND TEXT TYPES

    Directory of Open Access Journals (Sweden)

    Zoe Tribur

    2017-03-01

    Full Text Available Following the quantitative tradition of sociolinguistic research pioneered by such scholars as William Labov, Walt Wolfram, and Penelope Eckert, Reynolds presents a detailed, coherent analysis of the social parameters behind a specific on-going sound change, the merger of syllable-final bilabial nasal (m) with alveolar coronal nasal (n), in one small farming community in Qinghai Province. His is certainly not the first such study on Tibetan sound change. It is also not the first study to investigate the merger of (m) into (n), which is a prominent feature of so-called "farmer" dialects of Amdo Tibetan (Hua 2005). ...

  12. Text Plagi, detecció de text no citat

    OpenAIRE

    Martínez Vilanova, Albert

    2013-01-01

    This work presents a web application that detects plagiarized text in a file previously selected by the user, using the Summon Service API through a web environment. The idea came from the project director, Jordi Duran Cals, when he proposed a collaboration with the Universitat Oberta de Catalunya to develop this new tool, which would address teachers' need to detect possible plagiarism. With our application we have met the main objective...

  13. Social Media Text Classification by Enhancing Well-Formed Text Trained Model

    Directory of Open Access Journals (Sweden)

    Phat Jotikabukkana

    2016-09-01

    Full Text Available Social media are a powerful communication tool in our era of digital information. The large amount of user-generated data is a useful novel source of data, even though it is not easy to extract the treasures from this vast and noisy trove. Since classification is an important part of text mining, many techniques have been proposed to classify this kind of information. We developed an effective technique of social media text classification by semi-supervised learning utilizing an online news source consisting of well-formed text. The computer first automatically extracts news categories, well-categorized by publishers, as classes for topic classification. A bag of words taken from news articles provides the initial keywords related to their category in the form of word vectors. The principal task is to retrieve a set of new productive keywords. Term Frequency-Inverse Document Frequency weighting (TF-IDF) and Word Article Matrix (WAM) are used as the main methods. A modification of WAM is recomputed until it becomes the most effective model for social media text classification. The key success factor was enhancing our model with effective keywords from social media. A promising result of 99.50% accuracy was achieved, with more than 98.5% Precision, Recall, and F-measure after updating the model three times.
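    As an illustration of the TF-IDF weighting mentioned in the abstract above, the following sketch computes one common variant, tf(t, d) * ln(N / df(t)), over tokenized documents. The exact variant used by the authors is not stated in the abstract, so the formula, the function name `tf_idf` and the toy documents are assumptions; the Word Article Matrix step is not shown.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf is the term count divided by document length; idf is ln(N / df),
    where df is the number of documents containing the term.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy "news articles", already tokenized (contents are made up).
docs = [
    ["goal", "match", "team"],
    ["vote", "election", "team"],
    ["goal", "goal", "striker"],
]
weights = tf_idf(docs)
```

    Terms that occur in every document receive a weight of zero under this variant, which is why category-specific keywords rank above words shared by all categories.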

  14. Text Mining in Biomedical Domain with Emphasis on Document Clustering

    Science.gov (United States)

    2017-01-01

    Objectives: With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. Methods: This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Results: Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Conclusions: Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise. PMID:28875048

  15. Text Mining in Biomedical Domain with Emphasis on Document Clustering.

    Science.gov (United States)

    Renganathan, Vinaitheerthan

    2017-07-01

    With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.

  16. AGRICULTURAL USES OF SEAWEEDS EXTRACTS

    Directory of Open Access Journals (Sweden)

    Monica Popescu

    2013-12-01

    Full Text Available Marine bioactive substances extracted from seaweed are currently used in food, in animal feed and as a raw material in industry, and have therapeutic applications. Most of the products based on marine algae are extracted from the brown alga Ascophyllum nodosum. The use of seaweed extracts in agriculture is beneficial because it reduces the amount of chemical fertilizers used and allows organic yields to be obtained.

  17. Measurement of the jet mass in highly boosted [Formula: see text] events from pp collisions at [Formula: see text][Formula: see text].

    Science.gov (United States)

    Sirunyan, A M; Tumasyan, A; Adam, W; Asilar, E; Bergauer, T; Brandstetter, J; Brondolin, E; Dragicevic, M; Erö, J; Flechl, M; Friedl, M; Frühwirth, R; Ghete, V M; Hartl, C; Hörmann, N; Hrubec, J; Jeitler, M; König, A; Krätschmer, I; Liko, D; Matsushita, T; Mikulec, I; Rabady, D; Rad, N; Rahbaran, B; Rohringer, H; Schieck, J; Strauss, J; Waltenberger, W; Wulz, C-E; Dvornikov, O; Makarenko, V; Mossolov, V; Suarez Gonzalez, J; Zykunov, V; Shumeiko, N; Alderweireldt, S; De Wolf, E A; Janssen, X; Lauwers, J; Van De Klundert, M; Van Haevermaet, H; Van Mechelen, P; Van Remortel, N; Van Spilbeeck, A; Abu Zeid, S; Blekman, F; D'Hondt, J; Daci, N; De Bruyn, I; Deroover, K; Lowette, S; Moortgat, S; Moreels, L; Olbrechts, A; Python, Q; Skovpen, K; Tavernier, S; Van Doninck, W; Van Mulders, P; Van Parijs, I; Brun, H; Clerbaux, B; De Lentdecker, G; Delannoy, H; Fasanella, G; Favart, L; Goldouzian, R; Grebenyuk, A; Karapostoli, G; Lenzi, T; Léonard, A; Luetic, J; Maerschalk, T; Marinov, A; Randle-Conde, A; Seva, T; Vander Velde, C; Vanlaer, P; Vannerom, D; Yonamine, R; Zenoni, F; Zhang, F; Cimmino, A; Cornelis, T; Dobur, D; Fagot, A; Gul, M; Khvastunov, I; Poyraz, D; Salva, S; Schöfbeck, R; Tytgat, M; Van Driessche, W; Yazgan, E; Zaganidis, N; Bakhshiansohi, H; Beluffi, C; Bondu, O; Brochet, S; Bruno, G; Caudron, A; De Visscher, S; Delaere, C; Delcourt, M; Francois, B; Giammanco, A; Jafari, A; Komm, M; Krintiras, G; Lemaitre, V; Magitteri, A; Mertens, A; Musich, M; Piotrzkowski, K; Quertenmont, L; Selvaggi, M; Vidal Marono, M; Wertz, S; Beliy, N; Aldá Júnior, W L; Alves, F L; Alves, G A; Brito, L; Hensel, C; Moraes, A; Pol, M E; Rebello Teles, P; Belchior Batista Das Chagas, E; Carvalho, W; Chinellato, J; Custódio, A; Da Costa, E M; Da Silveira, G G; De Jesus Damiao, D; De Oliveira Martins, C; Fonseca De Souza, S; Huertas Guativa, L M; Malbouisson, H; Matos Figueiredo, D; Mora Herrera, C; Mundim, L; Nogima, H; Prado Da Silva, W L; Santoro, A; Sznajder, A; Tonelli Manganote, E 
J; Torres Da Silva De Araujo, F; Vilela Pereira, A; Ahuja, S; Bernardes, C A; Dogra, S; Fernandez Perez Tomei, T R; Gregores, E M; Mercadante, P G; Moon, C S; Novaes, S F; Padula, Sandra S; Romero Abad, D; Ruiz Vargas, J C; Aleksandrov, A; Hadjiiska, R; Iaydjiev, P; Rodozov, M; Stoykova, S; Sultanov, G; Vutova, M; Dimitrov, A; Glushkov, I; Litov, L; Pavlov, B; Petkov, P; Fang, W; Ahmad, M; Bian, J G; Chen, G M; Chen, H S; Chen, M; Chen, Y; Cheng, T; Jiang, C H; Leggat, D; Liu, Z; Romeo, F; Ruan, M; Shaheen, S M; Spiezia, A; Tao, J; Wang, C; Wang, Z; Zhang, H; Zhao, J; Ban, Y; Chen, G; Li, Q; Liu, S; Mao, Y; Qian, S J; Wang, D; Xu, Z; Avila, C; Cabrera, A; Chaparro Sierra, L F; Florez, C; Gomez, J P; González Hernández, C F; Ruiz Alvarez, J D; Sanabria, J C; Godinovic, N; Lelas, D; Puljak, I; Ribeiro Cipriano, P M; Sculac, T; Antunovic, Z; Kovac, M; Brigljevic, V; Ferencek, D; Kadija, K; Mesic, B; Susa, T; Attikis, A; Mavromanolakis, G; Mousa, J; Nicolaou, C; Ptochos, F; Razis, P A; Rykaczewski, H; Tsiakkouri, D; Finger, M; Finger, M; Carrera Jarrin, E; Abdelalim, A A; Mohammed, Y; Salama, E; Kadastik, M; Perrini, L; Raidal, M; Tiko, A; Veelken, C; Eerola, P; Pekkanen, J; Voutilainen, M; Härkönen, J; Järvinen, T; Karimäki, V; Kinnunen, R; Lampén, T; Lassila-Perini, K; Lehti, S; Lindén, T; Luukka, P; Tuominiemi, J; Tuovinen, E; Wendland, L; Talvitie, J; Tuuva, T; Besancon, M; Couderc, F; Dejardin, M; Denegri, D; Fabbro, B; Faure, J L; Favaro, C; Ferri, F; Ganjour, S; Ghosh, S; Givernaud, A; Gras, P; Hamel de Monchenault, G; Jarry, P; Kucher, I; Locci, E; Machet, M; Malcles, J; Rander, J; Rosowsky, A; Titov, M; Abdulsalam, A; Antropov, I; Baffioni, S; Beaudette, F; Busson, P; Cadamuro, L; Chapon, E; Charlot, C; Davignon, O; Granier de Cassagnac, R; Jo, M; Lisniak, S; Miné, P; Nguyen, M; Ochando, C; Ortona, G; Paganini, P; Pigard, P; Regnard, S; Salerno, R; Sirois, Y; Stahl Leiton, A G; Strebler, T; Yilmaz, Y; Zabi, A; Zghiche, A; Agram, J-L; Andrea, J; Aubin, A; 
Bloch, D; Brom, J-M; Buttignol, M; Chabert, E C; Chanon, N; Collard, C; Conte, E; Coubez, X; Fontaine, J-C; Gelé, D; Goerlach, U; Le Bihan, A-C; Van Hove, P; Gadrat, S; Beauceron, S; Bernet, C; Boudoul, G; Carrillo Montoya, C A; Chierici, R; Contardo, D; Courbon, B; Depasse, P; El Mamouni, H; Fay, J; Gascon, S; Gouzevitch, M; Grenier, G; Ille, B; Lagarde, F; Laktineh, I B; Lethuillier, M; Mirabito, L; Pequegnot, A L; Perries, S; Popov, A; Sabes, D; Sordini, V; Vander Donckt, M; Verdier, P; Viret, S; Khvedelidze, A; Tsamalaidze, Z; Autermann, C; Beranek, S; Feld, L; Kiesel, M K; Klein, K; Lipinski, M; Preuten, M; Schomakers, C; Schulz, J; Verlage, T; Albert, A; Brodski, M; Dietz-Laursonn, E; Duchardt, D; Endres, M; Erdmann, M; Erdweg, S; Esch, T; Fischer, R; Güth, A; Hamer, M; Hebbeker, T; Heidemann, C; Hoepfner, K; Knutzen, S; Merschmeyer, M; Meyer, A; Millet, P; Mukherjee, S; Olschewski, M; Padeken, K; Pook, T; Radziej, M; Reithler, H; Rieger, M; Scheuch, F; Sonnenschein, L; Teyssier, D; Thüer, S; Cherepanov, V; Flügge, G; Kargoll, B; Kress, T; Künsken, A; Lingemann, J; Müller, T; Nehrkorn, A; Nowack, A; Pistone, C; Pooth, O; Stahl, A; Aldaya Martin, M; Arndt, T; Asawatangtrakuldee, C; Beernaert, K; Behnke, O; Behrens, U; Bin Anuar, A A; Borras, K; Campbell, A; Connor, P; Contreras-Campana, C; Costanza, F; Diez Pardos, C; Dolinska, G; Eckerlin, G; Eckstein, D; Eichhorn, T; Eren, E; Gallo, E; Garay Garcia, J; Geiser, A; Gizhko, A; Grados Luyando, J M; Grohsjean, A; Gunnellini, P; Harb, A; Hauk, J; Hempel, M; Jung, H; Kalogeropoulos, A; Karacheban, O; Kasemann, M; Keaveney, J; Kleinwort, C; Korol, I; Krücker, D; Lange, W; Lelek, A; Lenz, T; Leonard, J; Lipka, K; Lobanov, A; Lohmann, W; Mankel, R; Melzer-Pellmann, I-A; Meyer, A B; Mittag, G; Mnich, J; Mussgiller, A; Pitzl, D; Placakyte, R; Raspereza, A; Roland, B; Sahin, M Ö; Saxena, P; Schoerner-Sadenius, T; Spannagel, S; Stefaniuk, N; Van Onsem, G P; Walsh, R; Wissing, C; Blobel, V; Centis Vignali, M; Draeger, A R; 
Dreyer, T; Garutti, E; Gonzalez, D; Haller, J; Hoffmann, M; Junkes, A; Klanner, R; Kogler, R; Kovalchuk, N; Lapsien, T; Marchesini, I; Marconi, D; Meyer, M; Niedziela, M; Nowatschin, D; Pantaleo, F; Peiffer, T; Perieanu, A; Scharf, C; Schleper, P; Schmidt, A; Schumann, S; Schwandt, J; Stadie, H; Steinbrück, G; Stober, F M; Stöver, M; Tholen, H; Troendle, D; Usai, E; Vanelderen, L; Vanhoefer, A; Vormwald, B; Akbiyik, M; Barth, C; Baur, S; Baus, C; Berger, J; Butz, E; Caspart, R; Chwalek, T; Colombo, F; De Boer, W; Dierlamm, A; Fink, S; Freund, B; Friese, R; Giffels, M; Gilbert, A; Goldenzweig, P; Haitz, D; Hartmann, F; Heindl, S M; Husemann, U; Katkov, I; Kudella, S; Mildner, H; Mozer, M U; Müller, Th; Plagge, M; Quast, G; Rabbertz, K; Röcker, S; Roscher, F; Schröder, M; Shvetsov, I; Sieber, G; Simonis, H J; Ulrich, R; Wayand, S; Weber, M; Weiler, T; Williamson, S; Wöhrmann, C; Wolf, R; Anagnostou, G; Daskalakis, G; Geralis, T; Giakoumopoulou, V A; Kyriakis, A; Loukas, D; Topsis-Giotis, I; Kesisoglou, S; Panagiotou, A; Saoulidou, N; Tziaferi, E; Evangelou, I; Flouris, G; Foudas, C; Kokkas, P; Loukas, N; Manthos, N; Papadopoulos, I; Paradas, E; Filipovic, N; Pasztor, G; Bencze, G; Hajdu, C; Horvath, D; Sikler, F; Veszpremi, V; Vesztergombi, G; Zsigmond, A J; Beni, N; Czellar, S; Karancsi, J; Makovec, A; Molnar, J; Szillasi, Z; Bartók, M; Raics, P; Trocsanyi, Z L; Ujvari, B; Komaragiri, J R; Bahinipati, S; Bhowmik, S; Choudhury, S; Mal, P; Mandal, K; Nayak, A; Sahoo, D K; Sahoo, N; Swain, S K; Bansal, S; Beri, S B; Bhatnagar, V; Chawla, R; Bhawandeep, U; Kalsi, A K; Kaur, A; Kaur, M; Kumar, R; Kumari, P; Mehta, A; Mittal, M; Singh, J B; Walia, G; Kumar, Ashok; Bhardwaj, A; Choudhary, B C; Garg, R B; Keshri, S; Malhotra, S; Naimuddin, M; Ranjan, K; Sharma, R; Sharma, V; Bhattacharya, R; Bhattacharya, S; Chatterjee, K; Dey, S; Dutt, S; Dutta, S; Ghosh, S; Majumdar, N; Modak, A; Mondal, K; Mukhopadhyay, S; Nandan, S; Purohit, A; Roy, A; Roy, D; Roy Chowdhury, S; Sarkar, 
S; Sharan, M; Thakur, S; Behera, P K; Chudasama, R; Dutta, D; Jha, V; Kumar, V; Mohanty, A K; Netrakanti, P K; Pant, L M; Shukla, P; Topkar, A; Aziz, T; Dugad, S; Kole, G; Mahakud, B; Mitra, S; Mohanty, G B; Parida, B; Sur, N; Sutar, B; Banerjee, S; Dewanjee, R K; Ganguly, S; Guchait, M; Jain, Sa; Kumar, S; Maity, M; Majumder, G; Mazumdar, K; Sarkar, T; Wickramage, N; Chauhan, S; Dube, S; Hegde, V; Kapoor, A; Kothekar, K; Pandey, S; Rane, A; Sharma, S; Chenarani, S; Eskandari Tadavani, E; Etesami, S M; Khakzad, M; Mohammadi Najafabadi, M; Naseri, M; Paktinat Mehdiabadi, S; Rezaei Hosseinabadi, F; Safarzadeh, B; Zeinali, M; Felcini, M; Grunewald, M; Abbrescia, M; Calabria, C; Caputo, C; Colaleo, A; Creanza, D; Cristella, L; De Filippis, N; De Palma, M; Fiore, L; Iaselli, G; Maggi, G; Maggi, M; Miniello, G; My, S; Nuzzo, S; Pompili, A; Pugliese, G; Radogna, R; Ranieri, A; Selvaggi, G; Sharma, A; Silvestris, L; Venditti, R; Verwilligen, P; Abbiendi, G; Battilana, C; Bonacorsi, D; Braibant-Giacomelli, S; Brigliadori, L; Campanini, R; Capiluppi, P; Castro, A; Cavallo, F R; Chhibra, S S; Codispoti, G; Cuffiani, M; Dallavalle, G M; Fabbri, F; Fanfani, A; Fasanella, D; Giacomelli, P; Grandi, C; Guiducci, L; Marcellini, S; Masetti, G; Montanari, A; Navarria, F L; Perrotta, A; Rossi, A M; Rovelli, T; Siroli, G P; Tosi, N; Albergo, S; Costa, S; Di Mattia, A; Giordano, F; Potenza, R; Tricomi, A; Tuve, C; Barbagli, G; Ciulli, V; Civinini, C; D'Alessandro, R; Focardi, E; Lenzi, P; Meschini, M; Paoletti, S; Russo, L; Sguazzoni, G; Strom, D; Viliani, L; Benussi, L; Bianco, S; Fabbri, F; Piccolo, D; Primavera, F; Calvelli, V; Ferro, F; Monge, M R; Robutti, E; Tosi, S; Brianza, L; Brivio, F; Ciriolo, V; Dinardo, M E; Fiorendi, S; Gennai, S; Ghezzi, A; Govoni, P; Malberti, M; Malvezzi, S; Manzoni, R A; Menasce, D; Moroni, L; Paganoni, M; Pedrini, D; Pigazzini, S; Ragazzi, S; Tabarelli de Fatis, T; Buontempo, S; Cavallo, N; De Nardo, G; Di Guida, S; Esposito, M; Fabozzi, F; Fienga, F; 
Iorio, A O M; Lanza, G; Lista, L; Meola, S; Paolucci, P; Sciacca, C; Thyssen, F; Azzi, P; Bacchetta, N; Benato, L; Bisello, D; Boletti, A; Carlin, R; Carvalho Antunes de Oliveira, A; Checchia, P; Dall'Osso, M; De Castro Manzano, P; Dorigo, T; Dosselli, U; Gasparini, F; Gasparini, U; Gozzelino, A; Lacaprara, S; Margoni, M; Meneguzzo, A T; Pazzini, J; Pozzobon, N; Ronchese, P; Simonetto, F; Torassa, E; Zanetti, M; Zotto, P; Zumerle, G; Braghieri, A; Fallavollita, F; Magnani, A; Montagna, P; Ratti, S P; Re, V; Riccardi, C; Salvini, P; Vai, I; Vitulo, P; Alunni Solestizi, L; Bilei, G M; Ciangottini, D; Fanò, L; Lariccia, P; Leonardi, R; Mantovani, G; Menichelli, M; Saha, A; Santocchia, A; Androsov, K; Azzurri, P; Bagliesi, G; Bernardini, J; Boccali, T; Castaldi, R; Ciocci, M A; Dell'Orso, R; Donato, S; Fedi, G; Giassi, A; Grippo, M T; Ligabue, F; Lomtadze, T; Martini, L; Messineo, A; Palla, F; Rizzi, A; Savoy-Navarro, A; Spagnolo, P; Tenchini, R; Tonelli, G; Venturi, A; Verdini, P G; Barone, L; Cavallari, F; Cipriani, M; Del Re, D; Diemoz, M; Gelli, S; Longo, E; Margaroli, F; Marzocchi, B; Meridiani, P; Organtini, G; Paramatti, R; Preiato, F; Rahatlou, S; Rovelli, C; Santanastasio, F; Amapane, N; Arcidiacono, R; Argiro, S; Arneodo, M; Bartosik, N; Bellan, R; Biino, C; Cartiglia, N; Cenna, F; Costa, M; Covarelli, R; Degano, A; Demaria, N; Finco, L; Kiani, B; Mariotti, C; Maselli, S; Migliore, E; Monaco, V; Monteil, E; Monteno, M; Obertino, M M; Pacher, L; Pastrone, N; Pelliccioni, M; Pinna Angioni, G L; Ravera, F; Romero, A; Ruspa, M; Sacchi, R; Shchelina, K; Sola, V; Solano, A; Staiano, A; Traczyk, P; Belforte, S; Casarsa, M; Cossutti, F; Della Ricca, G; Zanetti, A; Kim, D H; Kim, G N; Kim, M S; Lee, S; Lee, S W; Oh, Y D; Sekmen, S; Son, D C; Yang, Y C; Lee, A; Kim, H; Brochero Cifuentes, J A; Kim, T J; Cho, S; Choi, S; Go, Y; Gyun, D; Ha, S; Hong, B; Jo, Y; Kim, Y; Lee, K; Lee, K S; Lee, S; Lim, J; Park, S K; Roh, Y; Almond, J; Kim, J; Lee, H; Oh, S B; Radburn-Smith, 
B C; Seo, S H; Yang, U K; Yoo, H D; Yu, G B; Choi, M; Kim, H; Kim, J H; Lee, J S H; Park, I C; Ryu, G; Ryu, M S; Choi, Y; Goh, J; Hwang, C; Lee, J; Yu, I; Dudenas, V; Juodagalvis, A; Vaitkus, J; Ahmed, I; Ibrahim, Z A; Md Ali, M A B; Mohamad Idris, F; Wan Abdullah, W A T; Yusli, M N; Zolkapli, Z; Castilla-Valdez, H; De La Cruz-Burelo, E; Heredia-De La Cruz, I; Hernandez-Almada, A; Lopez-Fernandez, R; Magaña Villalba, R; Mejia Guisao, J; Sanchez-Hernandez, A; Carrillo Moreno, S; Oropeza Barrera, C; Vazquez Valencia, F; Carpinteyro, S; Pedraza, I; Salazar Ibarguen, H A; Uribe Estrada, C; Morelos Pineda, A; Krofcheck, D; Butler, P H; Ahmad, A; Ahmad, M; Hassan, Q; Hoorani, H R; Khan, W A; Saddique, A; Shah, M A; Shoaib, M; Waqas, M; Bialkowska, H; Bluj, M; Boimska, B; Frueboes, T; Górski, M; Kazana, M; Nawrocki, K; Romanowska-Rybinska, K; Szleper, M; Zalewski, P; Bunkowski, K; Byszuk, A; Doroba, K; Kalinowski, A; Konecki, M; Krolikowski, J; Misiura, M; Olszewski, M; Walczak, M; Bargassa, P; Beirão Da Cruz E Silva, C; Calpas, B; Di Francesco, A; Faccioli, P; Ferreira Parracho, P G; Gallinaro, M; Hollar, J; Leonardo, N; Lloret Iglesias, L; Nemallapudi, M V; Rodrigues Antunes, J; Seixas, J; Toldaiev, O; Vadruccio, D; Varela, J; Afanasiev, S; Bunin, P; Gavrilenko, M; Golutvin, I; Gorbunov, I; Kamenev, A; Karjavin, V; Lanev, A; Malakhov, A; Matveev, V; Palichik, V; Perelygin, V; Shmatov, S; Shulha, S; Skatchkov, N; Smirnov, V; Voytishin, N; Zarubin, A; Chtchipounov, L; Golovtsov, V; Ivanov, Y; Kim, V; Kuznetsova, E; Murzin, V; Oreshkin, V; Sulimov, V; Vorobyev, A; Andreev, Yu; Dermenev, A; Gninenko, S; Golubev, N; Karneyeu, A; Kirsanov, M; Krasnikov, N; Pashenkov, A; Tlisov, D; Toropin, A; Epshteyn, V; Gavrilov, V; Lychkovskaya, N; Popov, V; Pozdnyakov, I; Safronov, G; Spiridonov, A; Toms, M; Vlasov, E; Zhokin, A; Aushev, T; Bylinkin, A; Chistov, R; Polikarpov, S; Zhemchugov, E; Andreev, V; Azarkin, M; Dremin, I; Kirakosyan, M; Leonidov, A; Terkulov, A; Baskakov, A; 
Belyaev, A; Boos, E; Bunichev, V; Dubinin, M; Dudko, L; Ershov, A; Klyukhin, V; Korneeva, N; Lokhtin, I; Miagkov, I; Obraztsov, S; Perfilov, M; Savrin, V; Volkov, P; Blinov, V; Skovpen, Y; Shtol, D; Azhgirey, I; Bayshev, I; Bitioukov, S; Elumakhov, D; Kachanov, V; Kalinin, A; Konstantinov, D; Krychkine, V; Petrov, V; Ryutin, R; Sobol, A; Troshin, S; Tyurin, N; Uzunian, A; Volkov, A; Adzic, P; Cirkovic, P; Devetak, D; Dordevic, M; Milosevic, J; Rekovic, V; Alcaraz Maestre, J; Barrio Luna, M; Calvo, E; Cerrada, M; Chamizo Llatas, M; Colino, N; De La Cruz, B; Delgado Peris, A; Escalante Del Valle, A; Fernandez, C; Fernández Ramos, J P; Flix, J; Fouz, M C; Garcia-Abia, P; Gonzalez Lopez, O; Goy Lopez, S; Hernandez, J M; Josa, M I; Navarro De Martino, E; Pérez-Calero Yzquierdo, A; Puerta Pelayo, J; Quintario Olmeda, A; Redondo, I; Romero, L; Soares, M S; de Trocóniz, J F; Missiroli, M; Moran, D; Cuevas, J; Fernandez Menendez, J; Gonzalez Caballero, I; González Fernández, J R; Palencia Cortezon, E; Sanchez Cruz, S; Suárez Andrés, I; Vischia, P; Vizan Garcia, J M; Cabrillo, I J; Calderon, A; Curras, E; Fernandez, M; Garcia-Ferrero, J; Gomez, G; Lopez Virto, A; Marco, J; Martinez Rivero, C; Matorras, F; Piedra Gomez, J; Rodrigo, T; Ruiz-Jimeno, A; Scodellaro, L; Trevisani, N; Vila, I; Vilar Cortabitarte, R; Abbaneo, D; Auffray, E; Auzinger, G; Baillon, P; Ball, A H; Barney, D; Bloch, P; Bocci, A; Botta, C; Camporesi, T; Castello, R; Cepeda, M; Cerminara, G; Chen, Y; d'Enterria, D; Dabrowski, A; Daponte, V; David, A; De Gruttola, M; De Roeck, A; Di Marco, E; Dobson, M; Dorney, B; du Pree, T; Duggan, D; Dünser, M; Dupont, N; Elliott-Peisert, A; Everaerts, P; Fartoukh, S; Franzoni, G; Fulcher, J; Funk, W; Gigi, D; Gill, K; Girone, M; Glege, F; Gulhan, D; Gundacker, S; Guthoff, M; Harris, P; Hegeman, J; Innocente, V; Janot, P; Kieseler, J; Kirschenmann, H; Knünz, V; Kornmayer, A; Kortelainen, M J; Kousouris, K; Krammer, M; Lange, C; Lecoq, P; Lourenço, C; Lucchini, M T; 
Malgeri, L; Mannelli, M; Martelli, A; Meijers, F; Merlin, J A; Mersi, S; Meschi, E; Milenovic, P; Moortgat, F; Morovic, S; Mulders, M; Neugebauer, H; Orfanelli, S; Orsini, L; Pape, L; Perez, E; Peruzzi, M; Petrilli, A; Petrucciani, G; Pfeiffer, A; Pierini, M; Racz, A; Reis, T; Rolandi, G; Rovere, M; Sakulin, H; Sauvan, J B; Schäfer, C; Schwick, C; Seidel, M; Sharma, A; Silva, P; Sphicas, P; Steggemann, J; Stoye, M; Takahashi, Y; Tosi, M; Treille, D; Triossi, A; Tsirou, A; Veckalns, V; Veres, G I; Verweij, M; Wardle, N; Wöhri, H K; Zagozdzinska, A; Zeuner, W D; Bertl, W; Deiters, K; Erdmann, W; Horisberger, R; Ingram, Q; Kaestli, H C; Kotlinski, D; Langenegger, U; Rohe, T; Wiederkehr, S A; Bachmair, F; Bäni, L; Bianchini, L; Casal, B; Dissertori, G; Dittmar, M; Donegà, M; Grab, C; Heidegger, C; Hits, D; Hoss, J; Kasieczka, G; Lustermann, W; Mangano, B; Marionneau, M; Martinez Ruiz Del Arbol, P; Masciovecchio, M; Meinhard, M T; Meister, D; Micheli, F; Musella, P; Nessi-Tedaldi, F; Pandolfi, F; Pata, J; Pauss, F; Perrin, G; Perrozzi, L; Quittnat, M; Rossini, M; Schönenberger, M; Starodumov, A; Tavolaro, V R; Theofilatos, K; Wallny, R; Aarrestad, T K; Amsler, C; Caminada, L; Canelli, M F; De Cosa, A; Galloni, C; Hinzmann, A; Hreus, T; Kilminster, B; Ngadiuba, J; Pinna, D; Rauco, G; Robmann, P; Salerno, D; Seitz, C; Yang, Y; Zucchetta, A; Candelise, V; Doan, T H; Jain, Sh; Khurana, R; Konyushikhin, M; Kuo, C M; Lin, W; Pozdnyakov, A; Yu, S S; Kumar, Arun; Chang, P; Chang, Y H; Chao, Y; Chen, K F; Chen, P H; Fiori, F; Hou, W-S; Hsiung, Y; Liu, Y F; Lu, R-S; Miñano Moya, M; Paganis, E; Psallidas, A; Tsai, J F; Asavapibhop, B; Singh, G; Sri Manobhas, N; Suwonjandee, N; Adiguzel, A; Bakirci, M N; Damarseckin, S; Demiroglu, Z S; Dozen, C; Eskut, E; Girgis, S; Gokbulut, G; Guler, Y; Hos, I; Kangal, E E; Kara, O; Kiminsu, U; Oglakci, M; Onengut, G; Ozdemir, K; Ozturk, S; Polatoz, A; Sunar Cerci, D; Turkcapar, S; Zorbakir, I S; Zorbilmez, C; Bilin, B; Bilmis, S; Isildak, B; 
Karapinar, G; Yalvac, M; Zeyrek, M; Gülmez, E; Kaya, M; Kaya, O; Yetkin, E A; Yetkin, T; Cakir, A; Cankocak, K; Sen, S; Grynyov, B; Levchuk, L; Sorokin, P; Aggleton, R; Ball, F; Beck, L; Brooke, J J; Burns, D; Clement, E; Cussans, D; Flacher, H; Goldstein, J; Grimes, M; Heath, G P; Heath, H F; Jacob, J; Kreczko, L; Lucas, C; Newbold, D M; Paramesvaran, S; Poll, A; Sakuma, T; Seif El Nasr-Storey, S; Smith, D; Smith, V J; Bell, K W; Belyaev, A; Brew, C; Brown, R M; Calligaris, L; Cieri, D; Cockerill, D J A; Coughlan, J A; Harder, K; Harper, S; Olaiya, E; Petyt, D; Shepherd-Themistocleous, C H; Thea, A; Tomalin, I R; Williams, T; Baber, M; Bainbridge, R; Buchmuller, O; Bundock, A; Burton, D; Casasso, S; Citron, M; Colling, D; Corpe, L; Dauncey, P; Davies, G; De Wit, A; Della Negra, M; Di Maria, R; Dunne, P; Elwood, A; Futyan, D; Haddad, Y; Hall, G; Iles, G; James, T; Lane, R; Laner, C; Lucas, R; Lyons, L; Magnan, A-M; Malik, S; Mastrolorenzo, L; Nash, J; Nikitenko, A; Pela, J; Penning, B; Pesaresi, M; Raymond, D M; Richards, A; Rose, A; Scott, E; Seez, C; Summers, S; Tapper, A; Uchida, K; Vazquez Acosta, M; Virdee, T; Wright, J; Zenz, S C; Cole, J E; Hobson, P R; Khan, A; Kyberd, P; Reid, I D; Symonds, P; Teodorescu, L; Turner, M; Borzou, A; Call, K; Dittmann, J; Hatakeyama, K; Liu, H; Pastika, N; Bartek, R; Dominguez, A; Buccilli, A; Cooper, S I; Henderson, C; Rumerio, P; West, C; Arcaro, D; Avetisyan, A; Bose, T; Gastler, D; Rankin, D; Richardson, C; Rohlf, J; Sulak, L; Zou, D; Benelli, G; Cutts, D; Garabedian, A; Hakala, J; Heintz, U; Hogan, J M; Jesus, O; Kwok, K H M; Laird, E; Landsberg, G; Mao, Z; Narain, M; Piperov, S; Sagir, S; Spencer, E; Syarif, R; Breedon, R; Burns, D; Calderon De La Barca Sanchez, M; Chauhan, S; Chertok, M; Conway, J; Conway, R; Cox, P T; Erbacher, R; Flores, C; Funk, G; Gardner, M; Ko, W; Lander, R; Mclean, C; Mulhearn, M; Pellett, D; Pilot, J; Shalhout, S; Shi, M; Smith, J; Squires, M; Stolp, D; Tos, K; Tripathi, M; Bachtis, M; Bravo, C; 
Cousins, R; Dasgupta, A; Florent, A; Hauser, J; Ignatenko, M; Mccoll, N; Saltzberg, D; Schnaible, C; Valuev, V; Weber, M; Bouvier, E; Burt, K; Clare, R; Ellison, J; Gary, J W; Ghiasi Shirazi, S M A; Hanson, G; Heilman, J; Jandir, P; Kennedy, E; Lacroix, F; Long, O R; Negrete, M Olmedo; Paneva, M I; Shrinivas, A; Si, W; Wei, H; Wimpenny, S; Yates, B R; Branson, J G; Cerati, G B; Cittolin, S; Derdzinski, M; Gerosa, R; Holzner, A; Klein, D; Krutelyov, V; Letts, J; Macneill, I; Olivito, D; Padhi, S; Pieri, M; Sani, M; Sharma, V; Simon, S; Tadel, M; Vartak, A; Wasserbaech, S; Welke, C; Wood, J; Würthwein, F; Yagil, A; Della Porta, G Zevi; Amin, N; Bhandari, R; Bradmiller-Feld, J; Campagnari, C; Dishaw, A; Dutta, V; Franco Sevilla, M; George, C; Golf, F; Gouskos, L; Gran, J; Heller, R; Incandela, J; Mullin, S D; Ovcharova, A; Qu, H; Richman, J; Stuart, D; Suarez, I; Yoo, J; Anderson, D; Bendavid, J; Bornheim, A; Bunn, J; Duarte, J; Lawhorn, J M; Mott, A; Newman, H B; Pena, C; Spiropulu, M; Vlimant, J R; Xie, S; Zhu, R Y; Andrews, M B; Ferguson, T; Paulini, M; Russ, J; Sun, M; Vogel, H; Vorobiev, I; Weinberg, M; Cumalat, J P; Ford, W T; Jensen, F; Johnson, A; Krohn, M; Leontsinis, S; Mulholland, T; Stenson, K; Wagner, S R; Alexander, J; Chaves, J; Chu, J; Dittmer, S; Mcdermott, K; Mirman, N; Nicolas Kaufman, G; Patterson, J R; Rinkevicius, A; Ryd, A; Skinnari, L; Soffi, L; Tan, S M; Tao, Z; Thom, J; Tucker, J; Wittich, P; Zientek, M; Winn, D; Abdullin, S; Albrow, M; Apollinari, G; Apresyan, A; Banerjee, S; Bauerdick, L A T; Beretvas, A; Berryhill, J; Bhat, P C; Bolla, G; Burkett, K; Butler, J N; Cheung, H W K; Chlebana, F; Cihangir, S; Cremonesi, M; Elvira, V D; Fisk, I; Freeman, J; Gottschalk, E; Gray, L; Green, D; Grünendahl, S; Gutsche, O; Hare, D; Harris, R M; Hasegawa, S; Hirschauer, J; Hu, Z; Jayatilaka, B; Jindariani, S; Johnson, M; Joshi, U; Klima, B; Kreis, B; Lammel, S; Linacre, J; Lincoln, D; Lipton, R; Liu, M; Liu, T; Lopes De Sá, R; Lykken, J; Maeshima, K; 
Magini, N; Marraffino, J M; Maruyama, S; Mason, D; McBride, P; Merkel, P; Mrenna, S; Nahn, S; O'Dell, V; Pedro, K; Prokofyev, O; Rakness, G; Ristori, L; Sexton-Kennedy, E; Soha, A; Spalding, W J; Spiegel, L; Stoynev, S; Strait, J; Strobbe, N; Taylor, L; Tkaczyk, S; Tran, N V; Uplegger, L; Vaandering, E W; Vernieri, C; Verzocchi, M; Vidal, R; Wang, M; Weber, H A; Whitbeck, A; Wu, Y; Acosta, D; Avery, P; Bortignon, P; Bourilkov, D; Brinkerhoff, A; Carnes, A; Carver, M; Curry, D; Das, S; Field, R D; Furic, I K; Konigsberg, J; Korytov, A; Low, J F; Ma, P; Matchev, K; Mei, H; Mitselmakher, G; Rank, D; Shchutska, L; Sperka, D; Thomas, L; Wang, J; Wang, S; Yelton, J; Linn, S; Markowitz, P; Martinez, G; Rodriguez, J L; Ackert, A; Adams, T; Askew, A; Bein, S; Hagopian, S; Hagopian, V; Johnson, K F; Kolberg, T; Prosper, H; Santra, A; Yohay, R; Baarmand, M M; Bhopatkar, V; Colafranceschi, S; Hohlmann, M; Noonan, D; Roy, T; Yumiceva, F; Adams, M R; Apanasevich, L; Berry, D; Betts, R R; Bucinskaite, I; Cavanaugh, R; Evdokimov, O; Gauthier, L; Gerber, C E; Hofman, D J; Jung, K; Sandoval Gonzalez, I D; Varelas, N; Wang, H; Wu, Z; Zakaria, M; Zhang, J; Bilki, B; Clarida, W; Dilsiz, K; Durgut, S; Gandrajula, R P; Haytmyradov, M; Khristenko, V; Merlo, J-P; Mermerkaya, H; Mestvirishvili, A; Moeller, A; Nachtman, J; Ogul, H; Onel, Y; Ozok, F; Penzo, A; Snyder, C; Tiras, E; Wetzel, J; Yi, K; Blumenfeld, B; Cocoros, A; Eminizer, N; Fehling, D; Feng, L; Gritsan, A V; Maksimovic, P; Roskes, J; Sarica, U; Swartz, M; Xiao, M; You, C; Al-Bataineh, A; Baringer, P; Bean, A; Boren, S; Bowen, J; Castle, J; Forthomme, L; Kenny Iii, R P; Khalil, S; Kropivnitskaya, A; Majumder, D; Mcbrayer, W; Murray, M; Sanders, S; Stringer, R; Tapia Takaki, J D; Wang, Q; Ivanov, A; Kaadze, K; Maravin, Y; Mohammadi, A; Saini, L K; Skhirtladze, N; Toda, S; Rebassoo, F; Wright, D; Anelli, C; Baden, A; Baron, O; Belloni, A; Calvert, B; Eno, S C; Ferraioli, C; Gomez, J A; Hadley, N J; Jabeen, S; Jeng, G Y; Kellogg, R 
G; Kunkle, J; Mignerey, A C; Ricci-Tam, F; Shin, Y H; Skuja, A; Tonjes, M B; Tonwar, S C; Abercrombie, D; Allen, B; Apyan, A; Azzolini, V; Barbieri, R; Baty, A; Bi, R; Bierwagen, K; Brandt, S; Busza, W; Cali, I A; D'Alfonso, M; Demiragli, Z; Gomez Ceballos, G; Goncharov, M; Hsu, D; Iiyama, Y; Innocenti, G M; Klute, M; Kovalskyi, D; Krajczar, K; Lai, Y S; Lee, Y-J; Levin, A; Luckey, P D; Maier, B; Marini, A C; Mcginn, C; Mironov, C; Narayanan, S; Niu, X; Paus, C; Roland, C; Roland, G; Salfeld-Nebgen, J; Stephans, G S F; Tatar, K; Velicanu, D; Wang, J; Wang, T W; Wyslouch, B; Benvenuti, A C; Chatterjee, R M; Evans, A; Hansen, P; Kalafut, S; Kao, S C; Kubota, Y; Lesko, Z; Mans, J; Nourbakhsh, S; Ruckstuhl, N; Rusack, R; Tambe, N; Turkewitz, J; Acosta, J G; Oliveros, S; Avdeeva, E; Bloom, K; Claes, D R; Fangmeier, C; Suarez, R Gonzalez; Kamalieddin, R; Kravchenko, I; Rodrigues, A Malta; Monroy, J; Siado, J E; Snow, G R; Stieger, B; Alyari, M; Dolen, J; Godshalk, A; Harrington, C; Iashvili, I; Kaisen, J; Nguyen, D; Parker, A; Rappoccio, S; Roozbahani, B; Alverson, G; Barberis, E; Hortiangtham, A; Massironi, A; Morse, D M; Nash, D; Orimoto, T; Teixeira De Lima, R; Trocino, D; Wang, R-J; Wood, D; Bhattacharya, S; Charaf, O; Hahn, K A; Kumar, A; Mucia, N; Odell, N; Pollack, B; Schmitt, M H; Sung, K; Trovato, M; Velasco, M; Dev, N; Hildreth, M; Hurtado Anampa, K; Jessop, C; Karmgard, D J; Kellams, N; Lannon, K; Marinelli, N; Meng, F; Mueller, C; Musienko, Y; Planer, M; Reinsvold, A; Ruchti, R; Rupprecht, N; Smith, G; Taroni, S; Wayne, M; Wolf, M; Woodard, A; Alimena, J; Antonelli, L; Bylsma, B; Durkin, L S; Flowers, S; Francis, B; Hart, A; Hill, C; Hughes, R; Ji, W; Liu, B; Luo, W; Puigh, D; Winer, B L; Wulsin, H W; Cooperstein, S; Driga, O; Elmer, P; Hardenbrook, J; Hebda, P; Lange, D; Luo, J; Marlow, D; Medvedeva, T; Mei, K; Ojalvo, I; Olsen, J; Palmer, C; Piroué, P; Stickland, D; Svyatkovskiy, A; Tully, C; Malik, S; Barker, A; Barnes, V E; Folgueras, S; Gutay, L; Jha, M 
K; Jones, M; Jung, A W; Khatiwada, A; Miller, D H; Neumeister, N; Schulte, J F; Shi, X; Sun, J; Wang, F; Xie, W; Parashar, N; Stupak, J; Adair, A; Akgun, B; Chen, Z; Ecklund, K M; Geurts, F J M; Guilbaud, M; Li, W; Michlin, B; Northup, M; Padley, B P; Roberts, J; Rorie, J; Tu, Z; Zabel, J; Betchart, B; Bodek, A; de Barbaro, P; Demina, R; Duh, Y T; Ferbel, T; Galanti, M; Garcia-Bellido, A; Han, J; Hindrichs, O; Khukhunaishvili, A; Lo, K H; Tan, P; Verzetti, M; Agapitos, A; Chou, J P; Gershtein, Y; Gómez Espinosa, T A; Halkiadakis, E; Heindl, M; Hughes, E; Kaplan, S; Kunnawalkam Elayavalli, R; Kyriacou, S; Lath, A; Nash, K; Osherson, M; Saka, H; Salur, S; Schnetzer, S; Sheffield, D; Somalwar, S; Stone, R; Thomas, S; Thomassen, P; Walker, M; Delannoy, A G; Foerster, M; Heideman, J; Riley, G; Rose, K; Spanier, S; Thapa, K; Bouhali, O; Celik, A; Dalchenko, M; De Mattia, M; Delgado, A; Dildick, S; Eusebi, R; Gilmore, J; Huang, T; Juska, E; Kamon, T; Mueller, R; Pakhotin, Y; Patel, R; Perloff, A; Perniè, L; Rathjens, D; Safonov, A; Tatarinov, A; Ulmer, K A; Akchurin, N; Cowden, C; Damgov, J; De Guio, F; Dragoiu, C; Dudero, P R; Faulkner, J; Gurpinar, E; Kunori, S; Lamichhane, K; Lee, S W; Libeiro, T; Peltola, T; Undleeb, S; Volobouev, I; Wang, Z; Greene, S; Gurrola, A; Janjam, R; Johns, W; Maguire, C; Melo, A; Ni, H; Sheldon, P; Tuo, S; Velkovska, J; Xu, Q; Arenton, M W; Barria, P; Cox, B; Goodell, J; Hirosky, R; Ledovskoy, A; Li, H; Neu, C; Sinthuprasith, T; Sun, X; Wang, Y; Wolfe, E; Xia, F; Clarke, C; Harr, R; Karchin, P E; Sturdy, J; Belknap, D A; Buchanan, J; Caillol, C; Dasu, S; Dodd, L; Duric, S; Gomber, B; Grothe, M; Herndon, M; Hervé, A; Klabbers, P; Lanaro, A; Levine, A; Long, K; Loveless, R; Perry, T; Pierro, G A; Polese, G; Ruggles, T; Savin, A; Smith, N; Smith, W H; Taylor, D; Woods, N

    2017-01-01

    The first measurement of the jet mass [Formula: see text] of top quark jets produced in [Formula: see text] events from pp collisions at [Formula: see text] [Formula: see text] is reported for the jet with the largest transverse momentum [Formula: see text] in highly boosted hadronic top quark decays. The data sample, collected with the CMS detector, corresponds to an integrated luminosity of 19.7[Formula: see text]. The measurement is performed in the lepton+jets channel in which the products of the semileptonic decay [Formula: see text] with [Formula: see text] where [Formula: see text] is an electron or muon, are used to select [Formula: see text] events with large Lorentz boosts. The products of the fully hadronic decay [Formula: see text] with [Formula: see text] are reconstructed using a single Cambridge-Aachen jet with distance parameter [Formula: see text], and [Formula: see text] [Formula: see text]. The [Formula: see text] cross section as a function of [Formula: see text] is unfolded at the particle level and is used to test the modelling of highly boosted top quark production. The peak position of the [Formula: see text] distribution is sensitive to the top quark mass [Formula: see text], and the data are used to extract a value of [Formula: see text] to assess this sensitivity.

  18. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster-based classification model for suspicious email detection and other text classification tasks. Text classification tasks comprise many training examples, which require a complex classification model. Using clusters for classification makes the model simpler and increases...... the accuracy at the same time. A test example is classified using a simpler and smaller model. The training examples in a particular cluster share a common vocabulary. At the time of clustering, we do not take into account the labels of the training examples. After the clusters have been created......, a classifier is trained on each cluster, which has reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms existing classification models for suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...
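
    The cluster-then-classify idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the clustering algorithm or the per-cluster classifier, so a k-means-style pass over bag-of-words vectors and a small multinomial Naive Bayes are assumed here, and the toy emails, labels, and seed indices are invented for the example.

```python
from collections import Counter, defaultdict
import math

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(vectors, seed_indices, iterations=5):
    """Label-free, k-means-style clustering (assumed stand-in); centroids start at seed documents."""
    centroids = [Counter(vectors[i]) for i in seed_indices]
    assign = [0] * len(vectors)
    for _ in range(iterations):
        assign = [max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        for c in range(len(centroids)):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = sum(members, Counter())  # summed counts; cosine ignores scale
    return centroids, assign

def train_nb(docs, labels):
    """Multinomial Naive Bayes restricted to the cluster's own (smaller) vocabulary."""
    vocab = {t for d in docs for t in d}
    word_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        word_counts[y].update(d)
    return vocab, word_counts, Counter(labels)

def predict_nb(model, doc):
    vocab, word_counts, class_counts = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for y, n in class_counts.items():
        denom = sum(word_counts[y].values()) + len(vocab)  # add-one smoothing
        score = math.log(n / total)
        for t, c in doc.items():
            if t in vocab:
                score += c * math.log((word_counts[y][t] + 1) / denom)
        if score > best_score:
            best, best_score = y, score
    return best

def fit(texts, labels, seed_indices):
    """Cluster without looking at labels, then train one classifier per cluster."""
    vectors = [Counter(t.lower().split()) for t in texts]
    centroids, assign = cluster(vectors, seed_indices)
    models = {}
    for c in set(assign):
        idx = [i for i, a in enumerate(assign) if a == c]
        models[c] = train_nb([vectors[i] for i in idx], [labels[i] for i in idx])
    return centroids, models

def classify(centroids, models, text):
    """Route the test example to its nearest cluster, then use that cluster's model."""
    v = Counter(text.lower().split())
    c = max(models, key=lambda k: cosine(v, centroids[k]))
    return predict_nb(models[c], v)

# Hypothetical toy data, invented for illustration only.
TEXTS = [
    "wire money offshore account secretly",
    "pay invoice money company account",
    "move money offshore account secretly tonight",
    "monthly account statement money summary",
    "meeting airport bring package unmarked",
    "meeting office bring laptop charger",
    "airport meeting package unmarked tonight",
    "office meeting schedule laptop setup",
]
LABELS = ["suspicious", "normal", "suspicious", "normal",
          "suspicious", "normal", "suspicious", "normal"]

centroids, models = fit(TEXTS, LABELS, seed_indices=[0, 4])
print(classify(centroids, models, "send money offshore secretly"))
print(classify(centroids, models, "office meeting laptop"))
```

    Each per-cluster model only sees the vocabulary of its own cluster, which is one way to read the abstract's "reduced dimensionality"; a real system would use a proper clustering initialisation rather than hand-picked seeds.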

  19. Text Segmentation Using Exponential Models

    CERN Document Server

    Beeferman, D; Lafferty, G D; Beeferman, Doug; Berger, Adam; Lafferty, John

    1997-01-01

    This paper introduces a new statistical approach to partitioning text automatically into coherent segments. Our approach enlists both short-range and long-range language models to help it sniff out likely sites of topic changes in text. To aid its search, the system consults a set of simple lexical hints it has learned to associate with the presence of boundaries through inspection of a large corpus of annotated data. We also propose a new probabilistically motivated error metric for use by the natural language processing and information retrieval communities, intended to supersede precision and recall for appraising segmentation algorithms. Qualitative assessment of our algorithm as well as evaluation using this new metric demonstrate the effectiveness of our approach in two very different domains, Wall Street Journal articles and the TDT Corpus, a collection of newswire articles and broadcast news transcripts.
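
    The abstract mentions a probabilistically motivated error metric for segmentation but does not define it. The sketch below shows one window-based way to score a segmentation, assuming the error is the fraction of unit pairs a fixed distance `k` apart on which reference and hypothesis disagree about whether the pair lies in the same segment. The fixed window `k` and the function names are assumptions made for illustration, not details taken from the record.

```python
def segment_ids(n_units, boundaries):
    """Map each unit (e.g. sentence) index to a segment id.

    `boundaries` lists the indices at which a new segment starts.
    """
    starts = set(boundaries)
    ids, seg = [], 0
    for i in range(n_units):
        if i in starts and i != 0:
            seg += 1
        ids.append(seg)
    return ids

def window_error(reference, hypothesis, k):
    """Fraction of pairs (i, i + k) where the two segmentations disagree
    on whether both units belong to the same segment."""
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must cover the same units")
    pairs = len(reference) - k
    if pairs <= 0:
        raise ValueError("k must be smaller than the number of units")
    errors = 0
    for i in range(pairs):
        same_ref = reference[i] == reference[i + k]
        same_hyp = hypothesis[i] == hypothesis[i + k]
        errors += same_ref != same_hyp
    return errors / pairs

# Ten units; the true topic change is at unit 5, the hypothesis puts it at unit 4.
ref = segment_ids(10, [5])
hyp = segment_ids(10, [4])
print(window_error(ref, hyp, k=2))
print(window_error(ref, ref, k=2))
```

    Unlike precision and recall on exact boundary positions, a measure of this kind gives partial credit to a boundary that is placed close to the true one.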

  20. Linguistic dating of biblical texts

    DEFF Research Database (Denmark)

    Young, Ian; Rezetko, Robert; Ehrensvärd, Martin Gustaf

    Since the beginning of critical scholarship, biblical texts have been dated using linguistic evidence. In recent years this has become a controversial topic, especially with the publication of Ian Young (ed.), Biblical Hebrew: Studies in Chronology and Typology (2003). However, until now there has...... been no introduction and comprehensive study of the field. Volume 1 introduces the field of linguistic dating of biblical texts, particularly to intermediate and advanced students of biblical Hebrew who have a reasonable background in the language, having completed at least an introductory course...... in this volume are: What is it that makes Archaic Biblical Hebrew archaic, Early Biblical Hebrew early, and Late Biblical Hebrew late? Does linguistic typology, i.e. different linguistic characteristics, convert easily and neatly into linguistic chronology, i.e. different historical origins? A large amount...

  1. Text Mining the Biomedical Literature

    Science.gov (United States)

    2007-11-05

    LECTURE NOTES IN COMPUTER SCIENCE Gelbukh, A; Sidorov, G; Guzman-Arenas, A. 1999. Use of a weighted topic hierarchy for document classification...matrix decomposition. ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE 26 (3): 415-435. Kongovi, M; Guzman, JC; Dasigi, V. 2002. Text categorization: An...RECOGNITION, SPEECH AND IMAGE ANALYSIS 2905: 596-603. LECTURE NOTES IN COMPUTER SCIENCE Porter, AL; Kongthon, A; Lui, JC. 2002. Research profiling

  2. Active Discriminative Text Representation Learning

    OpenAIRE

    Zhang, Ye; Lease, Matthew; Wallace, Byron C.

    2016-01-01

    We propose a new active learning (AL) method for text classification with convolutional neural networks (CNNs). In AL, one selects the instances to be manually labeled with the aim of maximizing model performance with minimal effort. Neural models capitalize on word embeddings as representations (features), tuning these to the task at hand. We argue that AL strategies for multi-layered neural models should focus on selecting instances that most affect the embedding space (i.e., induce discrim...
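
    The selection step of pool-based active learning described here can be sketched in a few lines. The abstract is truncated before the authors' embedding-focused criterion is spelled out, so this example deliberately uses plain uncertainty (entropy) sampling as a stand-in; it is not the strategy proposed in the paper, and the function names and probabilities are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_labeling(pool_probs, budget):
    """Return indices of the `budget` unlabeled instances the model is least sure about."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:budget]

# Hypothetical model outputs for four unlabeled documents (two classes).
pool = [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4], [0.99, 0.01]]
print(select_for_labeling(pool, budget=2))
```

    In a full loop, the selected instances would be labeled by hand, added to the training set, and the model retrained before the next selection round.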

  3. Princess Brambilla - images/text

    Directory of Open Access Journals (Sweden)

    Maria Aparecida Barbosa

    2015-08-01

    Full Text Available http://dx.doi.org/10.5007/2175-7968.2016v36n1p79 To read an illustrated literary text is to think in pictures and words simultaneously. This articulation between the written text and the pictures adds potential, expands, and becomes complex. It coincides with current discussions of Giorgio Agamben's "contemporary", which add to adherence to one's own time the displacement and distance needed to understand it, and which shake linear notions of historical chronology. The coincidence is in some way related to the current interest in the concept of "Nachleben" (survival), which presupposes the recovery of images from the past and was postulated by the art historian Aby Warburg in his research on characteristics of motion from ancient art in Botticelli's Renaissance pictures. For the translation into Portuguese of Princesa Brambilla – um capriccio segundo Jakob Callot, de E. T. A. Hoffmann, com 8 gravuras cunhadas a partir de moldes originais de Callot (1820), such discussions were fundamental, as I try to show in this article.

  4. Water vapor motion signal extraction from FY-2E longwave infrared window images for cloud-free regions: The temporal difference technique

    Science.gov (United States)

    Yang, Lu; Wang, Zhenhui; Chu, Yanli; Zhao, Hang; Tang, Min

    2014-11-01

    The aim of this study is to calculate the low-level atmospheric motion vectors (AMVs) in clear areas with FY-2E IR2 window (11.59-12.79 μm) channel imagery, where the traditional cloud motion wind technique fails. A new tracer selection procedure, which we call the temporal difference technique, is demonstrated in this paper. This technique makes it possible to infer low-level wind by tracking features in the moisture pattern that appear as brightness temperature (T_B) differences between consecutive sequences of 30-min-interval FY-2E IR2 images over cloud-free regions. The T_B difference corresponding to a 10% change in water vapor density is computed with the Moderate Resolution Atmospheric Transmission (MODTRAN4) radiative transfer model. The total contribution from each of the 10 layers is analyzed under four typical atmospheric conditions: tropical, midlatitude summer, U.S. standard, and midlatitude winter. The peak level of the water vapor weighting function for the four typical atmospheres is assigned as a specific height to the T_B "wind". This technique is valid over cloud-free ocean areas. The proposed algorithm exhibits encouraging statistical results in terms of vector difference (VD), speed bias (BIAS), mean vector difference (MVD), standard deviation (SD), and root-mean-square error (RMSE), when compared with the wind field of NCEP reanalysis data and rawinsonde observations.
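
    The abstract names its verification statistics without defining them. The sketch below computes them under commonly used conventions for comparing derived winds with reference winds, which are assumed here rather than taken from the paper: VD is the magnitude of the vector difference per matched pair, MVD its mean, SD the population standard deviation of VD about MVD, RMSE the root of the mean squared VD, and BIAS the mean difference in wind speed. The sample wind values are invented.

```python
import math

def amv_statistics(derived, reference):
    """Compare derived winds with reference winds.

    Both arguments are equal-length lists of (u, v) components in m/s
    at matched locations. Returns a dict of summary statistics.
    """
    if len(derived) != len(reference) or not derived:
        raise ValueError("need matched, non-empty wind lists")
    n = len(derived)
    # Vector difference for each matched pair.
    vd = [math.hypot(ud - ur, vd_ - vr)
          for (ud, vd_), (ur, vr) in zip(derived, reference)]
    mvd = sum(vd) / n
    sd = math.sqrt(sum((x - mvd) ** 2 for x in vd) / n)
    rmse = math.sqrt(sum(x * x for x in vd) / n)  # equals sqrt(MVD^2 + SD^2)
    # Speed bias: mean of (derived speed - reference speed).
    bias = sum(math.hypot(ud, vd_) - math.hypot(ur, vr)
               for (ud, vd_), (ur, vr) in zip(derived, reference)) / n
    return {"VD": vd, "MVD": mvd, "SD": sd, "RMSE": rmse, "BIAS": bias}

# Hypothetical matched winds, for illustration only.
stats = amv_statistics(derived=[(3.0, 4.0), (6.0, 8.0)],
                       reference=[(0.0, 4.0), (6.0, 3.0)])
print(stats["MVD"], stats["SD"], stats["RMSE"], stats["BIAS"])
```

    With the population form of SD, RMSE squared equals MVD squared plus SD squared, which gives a quick consistency check on reported values.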

  5. The Balinese Unicode Text Processing

    Directory of Open Access Journals (Sweden)

    Imam Habibi

    2009-06-01

    Full Text Available In principle, the computer only recognizes numbers as the representation of a character. Therefore, there are many encoding systems to allocate these numbers, although not all characters are covered. In Europe alone, even a single language can need more than one encoding system. Hence, a new encoding system known as Unicode has been established to overcome this problem. Unicode provides a unique ID for each character, independent of platform, program, and language. The Unicode standard has been adopted by a number of industry vendors, such as Apple, HP, IBM, JustSystem, Microsoft, Oracle, SAP, Sun, Sybase, and Unisys. In addition, language standards and modern information exchanges such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, and WML make use of Unicode as an official tool for implementing ISO/IEC 10646. Four tasks are addressed for the Balinese script: the algorithms for transliteration, searching, sorting, and word boundary analysis (spell checking). To verify the correctness of the algorithms, several applications were built. These applications can run on Linux/Windows OS platforms using J2SDK 1.5 and the J2ME WTK2 library. The input and output of the algorithms and applications are character sequences obtained from keyboard input and external files. This research produces a module or library that is able to process Balinese text based on the Unicode standard. The output of this research is the ability, skill, and mastery of: 1. The Unicode standard (21-bit) as a substitute for ASCII (7-bit) and ISO8859-1 (8-bit) as the former default character sets in many applications. 2. The Balinese Unicode text processing algorithm. 3. An experience of working with and learning from an international team that consists of the foremost experts in the area: Michael Everson (Ireland), Peter Constable (Microsoft US), I Made Suatjana, and Ida Bagus Adi Sudewa.
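
    The point that a character is ultimately a number can be made concrete with a short sketch. The code below is an illustration only and is unrelated to the Java module described in the abstract. It relies on the Balinese block occupying U+1B00 to U+1B7F in the Unicode standard; the helper names `is_balinese` and `describe` are hypothetical.

```python
import unicodedata

# Unicode block "Balinese": code points U+1B00 through U+1B7F.
BALINESE_BLOCK = range(0x1B00, 0x1B80)


def is_balinese(ch):
    """Return True if the single character ch lies in the Balinese block."""
    return ord(ch) in BALINESE_BLOCK


def describe(text):
    """List (code point, character name) pairs for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


if __name__ == "__main__":
    sample = "A" + chr(0x1B05)  # a Latin letter followed by a Balinese letter
    for code_point, name in describe(sample):
        print(code_point, name)
    # A 7-bit ASCII character fits in one UTF-8 byte; a Balinese one needs three.
    print(len("A".encode("utf-8")), len(chr(0x1B05).encode("utf-8")))
```

    A range check of this kind is the simplest building block for the searching and word boundary tasks mentioned above, since it separates Balinese characters from everything else in mixed text.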

  6. Text as an Autopoietic System

    DEFF Research Database (Denmark)

    Nicolaisen, Maria Skou

    2016-01-01

    The aim of the present research article is to discuss the possibilities and limitations in addressing text as an autopoietic system. The theory of autopoiesis originated in the field of biology in order to explain the dynamic processes entailed in sustaining living organisms at cellular level. The theory has been introduced in a slightly altered version to the field of textual scholarship by Jerome McGann. In its original version, the defining traits of autopoietic system functioning are associated with distinguishable boundaries between the system and its environment, a distinction that does... By comparing the biological with the textual account of autopoietic agency, the end conclusion is that a newly derived concept of sociopoiesis might be better suited for discussing the architecture of textual systems.

  7. Text Mining for Drug–Drug Interaction

    Science.gov (United States)

    Wu, Heng-Yi; Chiang, Chien-Wei; Li, Lang

    2015-01-01

    In order to understand the mechanisms of drug–drug interaction (DDI), the study of pharmacokinetics (PK), pharmacodynamics (PD), and pharmacogenetics (PG) data are significant. In recent years, drug PK parameters, drug interaction parameters, and PG data have been unevenly collected in different databases and published extensively in literature. Also the lack of an appropriate PK ontology and a well-annotated PK corpus, which provide the background knowledge and the criteria of determining DDI, respectively, lead to the difficulty of developing DDI text mining tools for PK data collection from the literature and data integration from multiple databases. To conquer the issues, we constructed a comprehensive pharmacokinetics ontology. It includes all aspects of in vitro pharmacokinetics experiments, in vivo pharmacokinetics studies, as well as drug metabolism and transportation enzymes. Using our pharmacokinetics ontology, a PK corpus was constructed to present four classes of pharmacokinetics abstracts: in vivo pharmacokinetics studies, in vivo pharmacogenetic studies, in vivo drug interaction studies, and in vitro drug interaction studies. A novel hierarchical three-level annotation scheme was proposed and implemented to tag key terms, drug interaction sentences, and drug interaction pairs. The utility of the pharmacokinetics ontology was demonstrated by annotating three pharmacokinetics studies; and the utility of the PK corpus was demonstrated by a drug interaction extraction text mining analysis. The pharmacokinetics ontology annotates both in vitro pharmacokinetics experiments and in vivo pharmacokinetics studies. The PK corpus is a highly valuable resource for the text mining of pharmacokinetics parameters and drug interactions. PMID:24788261

  8. Text mining for drug-drug interaction.

    Science.gov (United States)

    Wu, Heng-Yi; Chiang, Chien-Wei; Li, Lang

    2014-01-01

    In order to understand the mechanisms of drug-drug interaction (DDI), the study of pharmacokinetics (PK), pharmacodynamics (PD), and pharmacogenetics (PG) data are significant. In recent years, drug PK parameters, drug interaction parameters, and PG data have been unevenly collected in different databases and published extensively in literature. Also the lack of an appropriate PK ontology and a well-annotated PK corpus, which provide the background knowledge and the criteria of determining DDI, respectively, lead to the difficulty of developing DDI text mining tools for PK data collection from the literature and data integration from multiple databases. To conquer the issues, we constructed a comprehensive pharmacokinetics ontology. It includes all aspects of in vitro pharmacokinetics experiments, in vivo pharmacokinetics studies, as well as drug metabolism and transportation enzymes. Using our pharmacokinetics ontology, a PK corpus was constructed to present four classes of pharmacokinetics abstracts: in vivo pharmacokinetics studies, in vivo pharmacogenetic studies, in vivo drug interaction studies, and in vitro drug interaction studies. A novel hierarchical three-level annotation scheme was proposed and implemented to tag key terms, drug interaction sentences, and drug interaction pairs. The utility of the pharmacokinetics ontology was demonstrated by annotating three pharmacokinetics studies; and the utility of the PK corpus was demonstrated by a drug interaction extraction text mining analysis. The pharmacokinetics ontology annotates both in vitro pharmacokinetics experiments and in vivo pharmacokinetics studies. The PK corpus is a highly valuable resource for the text mining of pharmacokinetics parameters and drug interactions.

  9. Suppression and azimuthal anisotropy of prompt and nonprompt [Formula: see text] production in PbPb collisions at [Formula: see text][Formula: see text].

    Science.gov (United States)

    Khachatryan, V; Sirunyan, A M; Tumasyan, A; Adam, W; Asilar, E; Bergauer, T; Brandstetter, J; Brondolin, E; Dragicevic, M; Erö, J; Flechl, M; Friedl, M; Frühwirth, R; Ghete, V M; Hartl, C; Hörmann, N; Hrubec, J; Jeitler, M; König, A; Krätschmer, I; Liko, D; Matsushita, T; Mikulec, I; Rabady, D; Rad, N; Rahbaran, B; Rohringer, H; Schieck, J; Strauss, J; Waltenberger, W; Wulz, C-E; Dvornikov, O; Makarenko, V; Zykunov, V; Mossolov, V; Shumeiko, N; Suarez Gonzalez, J; Alderweireldt, S; De Wolf, E A; Janssen, X; Lauwers, J; Van De Klundert, M; Van Haevermaet, H; Van Mechelen, P; Van Remortel, N; Van Spilbeeck, A; Abu Zeid, S; Blekman, F; D'Hondt, J; Daci, N; De Bruyn, I; Deroover, K; Lowette, S; Moortgat, S; Moreels, L; Olbrechts, A; Python, Q; Tavernier, S; Van Doninck, W; Van Mulders, P; Van Parijs, I; Brun, H; Clerbaux, B; De Lentdecker, G; Delannoy, H; Fasanella, G; Favart, L; Goldouzian, R; Grebenyuk, A; Karapostoli, G; Lenzi, T; Léonard, A; Luetic, J; Maerschalk, T; Marinov, A; Randle-Conde, A; Seva, T; Vander Velde, C; Vanlaer, P; Vannerom, D; Yonamine, R; Zenoni, F; Zhang, F; Cimmino, A; Cornelis, T; Dobur, D; Fagot, A; Garcia, G; Gul, M; Khvastunov, I; Poyraz, D; Salva, S; Schöfbeck, R; Sharma, A; Tytgat, M; Van Driessche, W; Yazgan, E; Zaganidis, N; Bakhshiansohi, H; Beluffi, C; Bondu, O; Brochet, S; Bruno, G; Caudron, A; De Visscher, S; Delaere, C; Delcourt, M; Francois, B; Giammanco, A; Jafari, A; Jez, P; Komm, M; Krintiras, G; Lemaitre, V; Magitteri, A; Mertens, A; Musich, M; Nuttens, C; Piotrzkowski, K; Quertenmont, L; Selvaggi, M; Vidal Marono, M; Wertz, S; Beliy, N; Aldá Júnior, W L; Alves, F L; Alves, G A; Brito, L; Hensel, C; Moraes, A; Pol, M E; Rebello Teles, P; Chagas, E Belchior Batista Das; Carvalho, W; Chinellato, J; Custódio, A; Da Costa, E M; Da Silveira, G G; De Jesus Damiao, D; De Oliveira Martins, C; De Souza, S Fonseca; Guativa, L M Huertas; Malbouisson, H; Matos Figueiredo, D; Mora Herrera, C; Mundim, L; Nogima, H; Prado Da Silva, W L; 
Santoro, A; Sznajder, A; Tonelli Manganote, E J; Vilela Pereira, A; Ahuja, S; Bernardes, C A; Dogra, S; Fernandez Perez Tomei, T R; Gregores, E M; Mercadante, P G; Moon, C S; Novaes, S F; Padula, Sandra S; Romero Abad, D; Ruiz Vargas, J C; Aleksandrov, A; Hadjiiska, R; Iaydjiev, P; Rodozov, M; Stoykova, S; Sultanov, G; Vutova, M; Dimitrov, A; Glushkov, I; Litov, L; Pavlov, B; Petkov, P; Fang, W; Ahmad, M; Bian, J G; Chen, G M; Chen, H S; Chen, M; Chen, Y; Cheng, T; Jiang, C H; Leggat, D; Liu, Z; Romeo, F; Shaheen, S M; Spiezia, A; Tao, J; Wang, C; Wang, Z; Zhang, H; Zhao, J; Ban, Y; Chen, G; Li, Q; Liu, S; Mao, Y; Qian, S J; Wang, D; Xu, Z; Avila, C; Cabrera, A; Chaparro Sierra, L F; Florez, C; Gomez, J P; González Hernández, C F; Ruiz Alvarez, J D; Sanabria, J C; Godinovic, N; Lelas, D; Puljak, I; Ribeiro Cipriano, P M; Sculac, T; Antunovic, Z; Kovac, M; Brigljevic, V; Ferencek, D; Kadija, K; Micanovic, S; Sudic, L; Susa, T; Attikis, A; Mavromanolakis, G; Mousa, J; Nicolaou, C; Ptochos, F; Razis, P A; Rykaczewski, H; Tsiakkouri, D; Finger, M; Finger, M; Jarrin, E Carrera; Kamel, A Ellithi; Mahmoud, M A; Radi, A; Kadastik, M; Perrini, L; Raidal, M; Tiko, A; Veelken, C; Eerola, P; Pekkanen, J; Voutilainen, M; Härkönen, J; Järvinen, T; Karimäki, V; Kinnunen, R; Lampén, T; Lassila-Perini, K; Lehti, S; Lindén, T; Luukka, P; Tuominiemi, J; Tuovinen, E; Wendland, L; Talvitie, J; Tuuva, T; Besancon, M; Couderc, F; Dejardin, M; Denegri, D; Fabbro, B; Faure, J L; Favaro, C; Ferri, F; Ganjour, S; Ghosh, S; Givernaud, A; Gras, P; Hamel de Monchenault, G; Jarry, P; Kucher, I; Locci, E; Machet, M; Malcles, J; Rander, J; Rosowsky, A; Titov, M; Zghiche, A; Abdulsalam, A; Antropov, I; Baffioni, S; Beaudette, F; Busson, P; Cadamuro, L; Chapon, E; Charlot, C; Davignon, O; Granier de Cassagnac, R; Jo, M; Lisniak, S; Miné, P; Nguyen, M; Ochando, C; Ortona, G; Paganini, P; Pigard, P; Regnard, S; Salerno, R; Sirois, Y; Strebler, T; Yilmaz, Y; Zabi, A; Agram, J-L; Andrea, J; Aubin, A; 
Bloch, D; Brom, J-M; Buttignol, M; Chabert, E C; Chanon, N; Collard, C; Conte, E; Coubez, X; Fontaine, J-C; Gelé, D; Goerlach, U; Le Bihan, A-C; Skovpen, K; Van Hove, P; Gadrat, S; Beauceron, S; Bernet, C; Boudoul, G; Bouvier, E; Carrillo Montoya, C A; Chierici, R; Contardo, D; Courbon, B; Depasse, P; El Mamouni, H; Fan, J; Fay, J; Gascon, S; Gouzevitch, M; Grenier, G; Ille, B; Lagarde, F; Laktineh, I B; Lethuillier, M; Mirabito, L; Pequegnot, A L; Perries, S; Popov, A; Sabes, D; Sordini, V; Vander Donckt, M; Verdier, P; Viret, S; Toriashvili, T; Lomidze, D; Autermann, C; Beranek, S; Feld, L; Heister, A; Kiesel, M K; Klein, K; Lipinski, M; Ostapchuk, A; Preuten, M; Raupach, F; Schael, S; Schomakers, C; Schulz, J; Verlage, T; Weber, H; Zhukov, V; Albert, A; Brodski, M; Dietz-Laursonn, E; Duchardt, D; Endres, M; Erdmann, M; Erdweg, S; Esch, T; Fischer, R; Güth, A; Hamer, M; Hebbeker, T; Heidemann, C; Hoepfner, K; Knutzen, S; Merschmeyer, M; Meyer, A; Millet, P; Mukherjee, S; Olschewski, M; Padeken, K; Pook, T; Radziej, M; Reithler, H; Rieger, M; Scheuch, F; Sonnenschein, L; Teyssier, D; Thüer, S; Cherepanov, V; Flügge, G; Kargoll, B; Kress, T; Künsken, A; Lingemann, J; Müller, T; Nehrkorn, A; Nowack, A; Pistone, C; Pooth, O; Stahl, A; Aldaya Martin, M; Arndt, T; Asawatangtrakuldee, C; Beernaert, K; Behnke, O; Behrens, U; Bin Anuar, A A; Borras, K; Campbell, A; Connor, P; Contreras-Campana, C; Costanza, F; Diez Pardos, C; Dolinska, G; Eckerlin, G; Eckstein, D; Eichhorn, T; Eren, E; Gallo, E; Garay Garcia, J; Geiser, A; Gizhko, A; Grados Luyando, J M; Gunnellini, P; Harb, A; Hauk, J; Hempel, M; Jung, H; Kalogeropoulos, A; Karacheban, O; Kasemann, M; Keaveney, J; Kleinwort, C; Korol, I; Krücker, D; Lange, W; Lelek, A; Leonard, J; Lipka, K; Lobanov, A; Lohmann, W; Mankel, R; Melzer-Pellmann, I-A; Meyer, A B; Mittag, G; Mnich, J; Mussgiller, A; Ntomari, E; Pitzl, D; Placakyte, R; Raspereza, A; Roland, B; Sahin, M Ö; Saxena, P; Schoerner-Sadenius, T; Seitz, C; Spannagel, 
S; Stefaniuk, N; Van Onsem, G P; Walsh, R; Wissing, C; Blobel, V; Centis Vignali, M; Draeger, A R; Dreyer, T; Garutti, E; Gonzalez, D; Haller, J; Hoffmann, M; Junkes, A; Klanner, R; Kogler, R; Kovalchuk, N; Lapsien, T; Lenz, T; Marchesini, I; Marconi, D; Meyer, M; Niedziela, M; Nowatschin, D; Pantaleo, F; Peiffer, T; Perieanu, A; Poehlsen, J; Sander, C; Scharf, C; Schleper, P; Schmidt, A; Schumann, S; Schwandt, J; Stadie, H; Steinbrück, G; Stober, F M; Stöver, M; Tholen, H; Troendle, D; Usai, E; Vanelderen, L; Vanhoefer, A; Vormwald, B; Akbiyik, M; Barth, C; Baur, S; Baus, C; Berger, J; Butz, E; Caspart, R; Chwalek, T; Colombo, F; De Boer, W; Dierlamm, A; Fink, S; Freund, B; Friese, R; Giffels, M; Gilbert, A; Goldenzweig, P; Haitz, D; Hartmann, F; Heindl, S M; Husemann, U; Katkov, I; Kudella, S; Lobelle Pardo, P; Mildner, H; Mozer, M U; Müller, Th; Plagge, M; Quast, G; Rabbertz, K; Röcker, S; Roscher, F; Schröder, M; Shvetsov, I; Sieber, G; Simonis, H J; Ulrich, R; Wagner-Kuhr, J; Wayand, S; Weber, M; Weiler, T; Williamson, S; Wöhrmann, C; Wolf, R; Anagnostou, G; Daskalakis, G; Geralis, T; Giakoumopoulou, V A; Kyriakis, A; Loukas, D; Topsis-Giotis, I; Kesisoglou, S; Panagiotou, A; Saoulidou, N; Tziaferi, E; Evangelou, I; Flouris, G; Foudas, C; Kokkas, P; Loukas, N; Manthos, N; Papadopoulos, I; Paradas, E; Filipovic, N; Bencze, G; Hajdu, C; Horvath, D; Sikler, F; Veszpremi, V; Vesztergombi, G; Zsigmond, A J; Beni, N; Czellar, S; Karancsi, J; Makovec, A; Molnar, J; Szillasi, Z; Bartók, M; Raics, P; Trocsanyi, Z L; Ujvari, B; Bahinipati, S; Choudhury, S; Mal, P; Mandal, K; Nayak, A; Sahoo, D K; Sahoo, N; Swain, S K; Bansal, S; Beri, S B; Bhatnagar, V; Chawla, R; Bhawandeep, U; Kalsi, A K; Kaur, A; Kaur, M; Kumar, R; Kumari, P; Mehta, A; Mittal, M; Singh, J B; Walia, G; Kumar, Ashok; Bhardwaj, A; Choudhary, B C; Garg, R B; Keshri, S; Malhotra, S; Naimuddin, M; Nishu, N; Ranjan, K; Sharma, R; Sharma, V; Bhattacharya, R; Bhattacharya, S; Chatterjee, K; Dey, S; Dutt, S; 
Dutta, S; Ghosh, S; Majumdar, N; Modak, A; Mondal, K; Mukhopadhyay, S; Nandan, S; Purohit, A; Roy, A; Roy, D; Roy Chowdhury, S; Sarkar, S; Sharan, M; Thakur, S; Behera, P K; Chudasama, R; Dutta, D; Jha, V; Kumar, V; Mohanty, A K; Netrakanti, P K; Pant, L M; Shukla, P; Topkar, A; Aziz, T; Dugad, S; Kole, G; Mahakud, B; Mitra, S; Mohanty, G B; Parida, B; Sur, N; Sutar, B; Banerjee, S; Bhowmik, S; Dewanjee, R K; Ganguly, S; Guchait, M; Jain, Sa; Kumar, S; Maity, M; Majumder, G; Mazumdar, K; Sarkar, T; Wickramage, N; Chauhan, S; Dube, S; Hegde, V; Kapoor, A; Kothekar, K; Pandey, S; Rane, A; Sharma, S; Behnamian, H; Chenarani, S; Eskandari Tadavani, E; Etesami, S M; Fahim, A; Khakzad, M; Mohammadi Najafabadi, M; Naseri, M; Paktinat Mehdiabadi, S; Rezaei Hosseinabadi, F; Safarzadeh, B; Zeinali, M; Felcini, M; Grunewald, M; Abbrescia, M; Calabria, C; Caputo, C; Colaleo, A; Creanza, D; Cristella, L; De Filippis, N; De Palma, M; Fiore, L; Iaselli, G; Maggi, G; Maggi, M; Miniello, G; My, S; Nuzzo, S; Pompili, A; Pugliese, G; Radogna, R; Ranieri, A; Selvaggi, G; Silvestris, L; Venditti, R; Verwilligen, P; Abbiendi, G; Battilana, C; Bonacorsi, D; Braibant-Giacomelli, S; Brigliadori, L; Campanini, R; Capiluppi, P; Castro, A; Cavallo, F R; Chhibra, S S; Codispoti, G; Cuffiani, M; Dallavalle, G M; Fabbri, F; Fanfani, A; Fasanella, D; Giacomelli, P; Grandi, C; Guiducci, L; Marcellini, S; Masetti, G; Montanari, A; Navarria, F L; Perrotta, A; Rossi, A M; Rovelli, T; Siroli, G P; Tosi, N; Albergo, S; Costa, S; Di Mattia, A; Giordano, F; Potenza, R; Tricomi, A; Tuve, C; Barbagli, G; Ciulli, V; Civinini, C; D'Alessandro, R; Focardi, E; Lenzi, P; Meschini, M; Paoletti, S; Sguazzoni, G; Viliani, L; Benussi, L; Bianco, S; Fabbri, F; Piccolo, D; Primavera, F; Calvelli, V; Ferro, F; Lo Vetere, M; Monge, M R; Robutti, E; Tosi, S; Brianza, L; Dinardo, M E; Fiorendi, S; Gennai, S; Ghezzi, A; Govoni, P; Malberti, M; Malvezzi, S; Manzoni, R A; Menasce, D; Moroni, L; Paganoni, M; Pedrini, D; 
Pigazzini, S; Ragazzi, S; Tabarelli de Fatis, T; Buontempo, S; Cavallo, N; De Nardo, G; Di Guida, S; Esposito, M; Fabozzi, F; Fienga, F; Iorio, A O M; Lanza, G; Lista, L; Meola, S; Paolucci, P; Sciacca, C; Thyssen, F; Azzi, P; Bacchetta, N; Benato, L; Bisello, D; Boletti, A; Carlin, R; Carvalho Antunes De Oliveira, A; Checchia, P; Dall'Osso, M; De Castro Manzano, P; Dorigo, T; Dosselli, U; Gasparini, F; Gasparini, U; Gozzelino, A; Lacaprara, S; Margoni, M; Meneguzzo, A T; Pazzini, J; Pozzobon, N; Ronchese, P; Simonetto, F; Torassa, E; Zanetti, M; Zotto, P; Zumerle, G; Braghieri, A; Magnani, A; Montagna, P; Ratti, S P; Re, V; Riccardi, C; Salvini, P; Vai, I; Vitulo, P; Alunni Solestizi, L; Bilei, G M; Ciangottini, D; Fanò, L; Lariccia, P; Leonardi, R; Mantovani, G; Menichelli, M; Saha, A; Santocchia, A; Androsov, K; Azzurri, P; Bagliesi, G; Bernardini, J; Boccali, T; Castaldi, R; Ciocci, M A; Dell'Orso, R; Donato, S; Fedi, G; Giassi, A; Grippo, M T; Ligabue, F; Lomtadze, T; Martini, L; Messineo, A; Palla, F; Rizzi, A; Savoy-Navarro, A; Spagnolo, P; Tenchini, R; Tonelli, G; Venturi, A; Verdini, P G; Barone, L; Cavallari, F; Cipriani, M; Del Re, D; Diemoz, M; Gelli, S; Longo, E; Margaroli, F; Marzocchi, B; Meridiani, P; Organtini, G; Paramatti, R; Preiato, F; Rahatlou, S; Rovelli, C; Santanastasio, F; Amapane, N; Arcidiacono, R; Argiro, S; Arneodo, M; Bartosik, N; Bellan, R; Biino, C; Cartiglia, N; Cenna, F; Costa, M; Covarelli, R; Degano, A; Demaria, N; Finco, L; Kiani, B; Mariotti, C; Maselli, S; Migliore, E; Monaco, V; Monteil, E; Monteno, M; Obertino, M M; Pacher, L; Pastrone, N; Pelliccioni, M; Pinna Angioni, G L; Ravera, F; Romero, A; Ruspa, M; Sacchi, R; Shchelina, K; Sola, V; Solano, A; Staiano, A; Traczyk, P; Belforte, S; Casarsa, M; Cossutti, F; Della Ricca, G; Zanetti, A; Kim, D H; Kim, G N; Kim, M S; Lee, S; Lee, S W; Oh, Y D; Sekmen, S; Son, D C; Yang, Y C; Lee, A; Kim, H; Moon, D H; Brochero Cifuentes, J A; Kim, T J; Cho, S; Choi, S; Go, Y; Gyun, D; Ha, 
S; Hong, B; Jo, Y; Kim, Y; Lee, B; Lee, K; Lee, K S; Lee, S; Lim, J; Park, S K; Roh, Y; Almond, J; Kim, J; Lee, H; Oh, S B; Radburn-Smith, B C; Seo, S H; Yang, U K; Yoo, H D; Yu, G B; Choi, M; Kim, H; Kim, J H; Lee, J S H; Park, I C; Ryu, G; Ryu, M S; Choi, Y; Goh, J; Hwang, C; Lee, J; Yu, I; Dudenas, V; Juodagalvis, A; Vaitkus, J; Ahmed, I; Ibrahim, Z A; Komaragiri, J R; Md Ali, M A B; Mohamad Idris, F; Wan Abdullah, W A T; Yusli, M N; Zolkapli, Z; Castilla-Valdez, H; De La Cruz-Burelo, E; Heredia-De La Cruz, I; Hernandez-Almada, A; Lopez-Fernandez, R; Magaña Villalba, R; Mejia Guisao, J; Sanchez-Hernandez, A; Carrillo Moreno, S; Oropeza Barrera, C; Vazquez Valencia, F; Carpinteyro, S; Pedraza, I; Salazar Ibarguen, H A; Uribe Estrada, C; Morelos Pineda, A; Krofcheck, D; Butler, P H; Ahmad, A; Ahmad, M; Hassan, Q; Hoorani, H R; Khan, W A; Saddique, A; Shah, M A; Shoaib, M; Waqas, M; Bialkowska, H; Bluj, M; Boimska, B; Frueboes, T; Górski, M; Kazana, M; Nawrocki, K; Romanowska-Rybinska, K; Szleper, M; Zalewski, P; Bunkowski, K; Byszuk, A; Doroba, K; Kalinowski, A; Konecki, M; Krolikowski, J; Misiura, M; Olszewski, M; Walczak, M; Bargassa, P; Beirão Da Cruz E Silva, C; Calpas, B; Di Francesco, A; Faccioli, P; Ferreira Parracho, P G; Gallinaro, M; Hollar, J; Leonardo, N; Lloret Iglesias, L; Nemallapudi, M V; Rodrigues Antunes, J; Seixas, J; Toldaiev, O; Vadruccio, D; Varela, J; Vischia, P; Afanasiev, S; Bunin, P; Gavrilenko, M; Golutvin, I; Gorbunov, I; Kamenev, A; Karjavin, V; Lanev, A; Malakhov, A; Matveev, V; Palichik, V; Perelygin, V; Shmatov, S; Shulha, S; Skatchkov, N; Smirnov, V; Voytishin, N; Zarubin, A; Chtchipounov, L; Golovtsov, V; Ivanov, Y; Kim, V; Kuznetsova, E; Murzin, V; Oreshkin, V; Sulimov, V; Vorobyev, A; Andreev, Yu; Dermenev, A; Gninenko, S; Golubev, N; Karneyeu, A; Kirsanov, M; Krasnikov, N; Pashenkov, A; Tlisov, D; Toropin, A; Epshteyn, V; Gavrilov, V; Lychkovskaya, N; Popov, V; Pozdnyakov, I; Safronov, G; Spiridonov, A; Toms, M; Vlasov, E; 
Zhokin, A; Bylinkin, A; Chistov, R; Polikarpov, S; Rusinov, V; Andreev, V; Azarkin, M; Dremin, I; Kirakosyan, M; Leonidov, A; Terkulov, A; Baskakov, A; Belyaev, A; Boos, E; Demiyanov, A; Ershov, A; Gribushin, A; Kodolova, O; Korotkikh, V; Lokhtin, I; Miagkov, I; Obraztsov, S; Petrushanko, S; Savrin, V; Snigirev, A; Vardanyan, I; Blinov, V; Skovpen, Y; Shtol, D; Azhgirey, I; Bayshev, I; Bitioukov, S; Elumakhov, D; Kachanov, V; Kalinin, A; Konstantinov, D; Krychkine, V; Petrov, V; Ryutin, R; Sobol, A; Troshin, S; Tyurin, N; Uzunian, A; Volkov, A; Adzic, P; Cirkovic, P; Devetak, D; Dordevic, M; Milosevic, J; Rekovic, V; Alcaraz Maestre, J; Barrio Luna, M; Calvo, E; Cerrada, M; Chamizo Llatas, M; Colino, N; De La Cruz, B; Delgado Peris, A; Escalante Del Valle, A; Fernandez Bedoya, C; Fernández Ramos, J P; Flix, J; Fouz, M C; Garcia-Abia, P; Gonzalez Lopez, O; Goy Lopez, S; Hernandez, J M; Josa, M I; Navarro De Martino, E; Pérez-Calero Yzquierdo, A; Puerta Pelayo, J; Quintario Olmeda, A; Redondo, I; Romero, L; Soares, M S; de Trocóniz, J F; Missiroli, M; Moran, D; Cuevas, J; Fernandez Menendez, J; Gonzalez Caballero, I; González Fernández, J R; Palencia Cortezon, E; Sanchez Cruz, S; Suárez Andrés, I; Vizan Garcia, J M; Cabrillo, I J; Calderon, A; Castiñeiras De Saa, J R; Curras, E; Fernandez, M; Garcia-Ferrero, J; Gomez, G; Lopez Virto, A; Marco, J; Martinez Rivero, C; Matorras, F; Piedra Gomez, J; Rodrigo, T; Ruiz-Jimeno, A; Scodellaro, L; Trevisani, N; Vila, I; Vilar Cortabitarte, R; Abbaneo, D; Auffray, E; Auzinger, G; Bachtis, M; Baillon, P; Ball, A H; Barney, D; Bloch, P; Bocci, A; Bonato, A; Botta, C; Camporesi, T; Castello, R; Cepeda, M; Cerminara, G; D'Alfonso, M; d'Enterria, D; Dabrowski, A; Daponte, V; David, A; De Gruttola, M; De Roeck, A; Di Marco, E; Dobson, M; Dorney, B; du Pree, T; Duggan, D; Dünser, M; Dupont, N; Elliott-Peisert, A; Fartoukh, S; Franzoni, G; Fulcher, J; Funk, W; Gigi, D; Gill, K; Girone, M; Glege, F; Gulhan, D; Gundacker, S; Guthoff, M; 
Hammer, J; Harris, P; Hegeman, J; Innocente, V; Janot, P; Kieseler, J; Kirschenmann, H; Knünz, V; Kornmayer, A; Kortelainen, M J; Kousouris, K; Krammer, M; Lange, C; Lecoq, P; Lourenço, C; Lucchini, M T; Malgeri, L; Mannelli, M; Martelli, A; Meijers, F; Merlin, J A; Mersi, S; Meschi, E; Milenovic, P; Moortgat, F; Morovic, S; Mulders, M; Neugebauer, H; Orfanelli, S; Orsini, L; Pape, L; Perez, E; Peruzzi, M; Petrilli, A; Petrucciani, G; Pfeiffer, A; Pierini, M; Racz, A; Reis, T; Rolandi, G; Rovere, M; Ruan, M; Sakulin, H; Sauvan, J B; Schäfer, C; Schwick, C; Seidel, M; Sharma, A; Silva, P; Sphicas, P; Steggemann, J; Stoye, M; Takahashi, Y; Tosi, M; Treille, D; Triossi, A; Tsirou, A; Veckalns, V; Veres, G I; Verweij, M; Wardle, N; Wöhri, H K; Zagozdzinska, A; Zeuner, W D; Bertl, W; Deiters, K; Erdmann, W; Horisberger, R; Ingram, Q; Kaestli, H C; Kotlinski, D; Langenegger, U; Rohe, T; Bachmair, F; Bäni, L; Bianchini, L; Casal, B; Dissertori, G; Dittmar, M; Donegà, M; Grab, C; Heidegger, C; Hits, D; Hoss, J; Kasieczka, G; Lecomte, P; Lustermann, W; Mangano, B; Marionneau, M; Martinez Ruiz Del Arbol, P; Masciovecchio, M; Meinhard, M T; Meister, D; Micheli, F; Musella, P; Nessi-Tedaldi, F; Pandolfi, F; Pata, J; Pauss, F; Perrin, G; Perrozzi, L; Quittnat, M; Rossini, M; Schönenberger, M; Starodumov, A; Tavolaro, V R; Theofilatos, K; Wallny, R; Aarrestad, T K; Amsler, C; Caminada, L; Canelli, M F; De Cosa, A; Galloni, C; Hinzmann, A; Hreus, T; Kilminster, B; Ngadiuba, J; Pinna, D; Rauco, G; Robmann, P; Salerno, D; Yang, Y; Zucchetta, A; Candelise, V; Doan, T H; Jain, Sh; Khurana, R; Konyushikhin, M; Kuo, C M; Lin, W; Lu, Y J; Pozdnyakov, A; Yu, S S; Kumar, Arun; Chang, P; Chang, Y H; Chang, Y W; Chao, Y; Chen, K F; Chen, P H; Dietz, C; Fiori, F; Hou, W-S; Hsiung, Y; Liu, Y F; Lu, R-S; Miñano Moya, M; Paganis, E; Psallidas, A; Tsai, J F; Tzeng, Y M; Asavapibhop, B; Singh, G; Srimanobhas, N; Suwonjandee, N; Adiguzel, A; Cerci, S; Damarseckin, S; Demiroglu, Z S; Dozen, C; 
Dumanoglu, I; Girgis, S; Gokbulut, G; Guler, Y; Hos, I; Kangal, E E; Kara, O; Kayis Topaksu, A; Kiminsu, U; Oglakci, M; Onengut, G; Ozdemir, K; Sunar Cerci, D; Tali, B; Turkcapar, S; Zorbakir, I S; Zorbilmez, C; Bilin, B; Bilmis, S; Isildak, B; Karapinar, G; Yalvac, M; Zeyrek, M; Gülmez, E; Kaya, M; Kaya, O; Yetkin, E A; Yetkin, T; Cakir, A; Cankocak, K; Sen, S; Grynyov, B; Levchuk, L; Sorokin, P; Aggleton, R; Ball, F; Beck, L; Brooke, J J; Burns, D; Clement, E; Cussans, D; Flacher, H; Goldstein, J; Grimes, M; Heath, G P; Heath, H F; Jacob, J; Kreczko, L; Lucas, C; Newbold, D M; Paramesvaran, S; Poll, A; Sakuma, T; Seif El Nasr-Storey, S; Smith, D; Smith, V J; Belyaev, A; Brew, C; Brown, R M; Calligaris, L; Cieri, D; Cockerill, D J A; Coughlan, J A; Harder, K; Harper, S; Olaiya, E; Petyt, D; Shepherd-Themistocleous, C H; Thea, A; Tomalin, I R; Williams, T; Baber, M; Bainbridge, R; Buchmuller, O; Bundock, A; Burton, D; Casasso, S; Citron, M; Colling, D; Corpe, L; Dauncey, P; Davies, G; De Wit, A; Della Negra, M; Di Maria, R; Dunne, P; Elwood, A; Futyan, D; Haddad, Y; Hall, G; Iles, G; James, T; Lane, R; Laner, C; Lucas, R; Lyons, L; Magnan, A-M; Malik, S; Mastrolorenzo, L; Nash, J; Nikitenko, A; Pela, J; Penning, B; Pesaresi, M; Raymond, D M; Richards, A; Rose, A; Seez, C; Summers, S; Tapper, A; Uchida, K; Vazquez Acosta, M; Virdee, T; Wright, J; Zenz, S C; Cole, J E; Hobson, P R; Khan, A; Kyberd, P; Leslie, D; Reid, I D; Symonds, P; Teodorescu, L; Turner, M; Borzou, A; Call, K; Dittmann, J; Hatakeyama, K; Liu, H; Pastika, N; Cooper, S I; Henderson, C; Rumerio, P; West, C; Arcaro, D; Avetisyan, A; Bose, T; Gastler, D; Rankin, D; Richardson, C; Rohlf, J; Sulak, L; Zou, D; Benelli, G; Berry, E; Cutts, D; Garabedian, A; Hakala, J; Heintz, U; Hogan, J M; Jesus, O; Kwok, K H M; Laird, E; Landsberg, G; Mao, Z; Narain, M; Piperov, S; Sagir, S; Spencer, E; Syarif, R; Breedon, R; Breto, G; Burns, D; Calderon De La Barca Sanchez, M; Chauhan, S; Chertok, M; Conway, J; Conway, 
R; Cox, P T; Erbacher, R; Flores, C; Funk, G; Gardner, M; Ko, W; Lander, R; Mclean, C; Mulhearn, M; Pellett, D; Pilot, J; Shalhout, S; Smith, J; Squires, M; Stolp, D; Tripathi, M; Bravo, C; Cousins, R; Dasgupta, A; Everaerts, P; Florent, A; Hauser, J; Ignatenko, M; Mccoll, N; Saltzberg, D; Schnaible, C; Takasugi, E; Valuev, V; Weber, M; Burt, K; Clare, R; Ellison, J; Gary, J W; Ghiasi Shirazi, S M A; Hanson, G; Heilman, J; Jandir, P; Kennedy, E; Lacroix, F; Long, O R; Olmedo Negrete, M; Paneva, M I; Shrinivas, A; Si, W; Wei, H; Wimpenny, S; Yates, B R; Branson, J G; Cerati, G B; Cittolin, S; Derdzinski, M; Holzner, A; Klein, D; Krutelyov, V; Letts, J; Macneill, I; Olivito, D; Padhi, S; Pieri, M; Sani, M; Sharma, V; Simon, S; Tadel, M; Vartak, A; Wasserbaech, S; Welke, C; Wood, J; Würthwein, F; Yagil, A; Zevi Della Porta, G; Amin, N; Bhandari, R; Bradmiller-Feld, J; Campagnari, C; Dishaw, A; Dutta, V; Franco Sevilla, M; George, C; Golf, F; Gouskos, L; Gran, J; Heller, R; Incandela, J; Mullin, S D; Ovcharova, A; Qu, H; Richman, J; Stuart, D; Suarez, I; Yoo, J; Anderson, D; Apresyan, A; Bendavid, J; Bornheim, A; Bunn, J; Chen, Y; Duarte, J; Lawhorn, J M; Mott, A; Newman, H B; Pena, C; Spiropulu, M; Vlimant, J R; Xie, S; Zhu, R Y; Andrews, M B; Azzolini, V; Ferguson, T; Paulini, M; Russ, J; Sun, M; Vogel, H; Vorobiev, I; Weinberg, M; Cumalat, J P; Ford, W T; Jensen, F; Johnson, A; Krohn, M; Mulholland, T; Stenson, K; Wagner, S R; Alexander, J; Chaves, J; Chu, J; Dittmer, S; Mcdermott, K; Mirman, N; Nicolas Kaufman, G; Patterson, J R; Rinkevicius, A; Ryd, A; Skinnari, L; Soffi, L; Tan, S M; Tao, Z; Thom, J; Tucker, J; Wittich, P; Zientek, M; Winn, D; Abdullin, S; Albrow, M; Apollinari, G; Banerjee, S; Bauerdick, L A T; Beretvas, A; Berryhill, J; Bhat, P C; Bolla, G; Burkett, K; Butler, J N; Cheung, H W K; Chlebana, F; Cihangir, S; Cremonesi, M; Elvira, V D; Fisk, I; Freeman, J; Gottschalk, E; Gray, L; Green, D; Grünendahl, S; Gutsche, O; Hare, D; Harris, R M; Hasegawa, 
S; Hirschauer, J; Hu, Z; Jayatilaka, B; Jindariani, S; Johnson, M; Joshi, U; Klima, B; Kreis, B; Lammel, S; Linacre, J; Lincoln, D; Lipton, R; Liu, T; Lopes De Sá, R; Lykken, J; Maeshima, K; Magini, N; Marraffino, J M; Maruyama, S; Mason, D; McBride, P; Merkel, P; Mrenna, S; Nahn, S; Newman-Holmes, C; O'Dell, V; Pedro, K; Prokofyev, O; Rakness, G; Ristori, L; Sexton-Kennedy, E; Soha, A; Spalding, W J; Spiegel, L; Stoynev, S; Strobbe, N; Taylor, L; Tkaczyk, S; Tran, N V; Uplegger, L; Vaandering, E W; Vernieri, C; Verzocchi, M; Vidal, R; Wang, M; Weber, H A; Whitbeck, A; Wu, Y; Acosta, D; Avery, P; Bortignon, P; Bourilkov, D; Brinkerhoff, A; Carnes, A; Carver, M; Curry, D; Das, S; Field, R D; Furic, I K; Konigsberg, J; Korytov, A; Low, J F; Ma, P; Matchev, K; Mei, H; Mitselmakher, G; Rank, D; Shchutska, L; Sperka, D; Thomas, L; Wang, J; Wang, S; Yelton, J; Linn, S; Markowitz, P; Martinez, G; Rodriguez, J L; Ackert, A; Adams, J R; Adams, T; Askew, A; Bein, S; Diamond, B; Hagopian, S; Hagopian, V; Johnson, K F; Khatiwada, A; Prosper, H; Santra, A; Yohay, R; Baarmand, M M; Bhopatkar, V; Colafranceschi, S; Hohlmann, M; Noonan, D; Roy, T; Yumiceva, F; Adams, M R; Apanasevich, L; Berry, D; Betts, R R; Bucinskaite, I; Cavanaugh, R; Evdokimov, O; Gauthier, L; Gerber, C E; Hofman, D J; Jung, K; Kurt, P; O'Brien, C; Sandoval Gonzalez, I D; Turner, P; Varelas, N; Wang, H; Wu, Z; Zakaria, M; Zhang, J; Bilki, B; Clarida, W; Dilsiz, K; Durgut, S; Gandrajula, R P; Haytmyradov, M; Khristenko, V; Merlo, J-P; Mermerkaya, H; Mestvirishvili, A; Moeller, A; Nachtman, J; Ogul, H; Onel, Y; Ozok, F; Penzo, A; Snyder, C; Tiras, E; Wetzel, J; Yi, K; Anderson, I; Blumenfeld, B; Cocoros, A; Eminizer, N; Fehling, D; Feng, L; Gritsan, A V; Maksimovic, P; Martin, C; Osherson, M; Roskes, J; Sarica, U; Swartz, M; Xiao, M; Xin, Y; You, C; Al-Bataineh, A; Baringer, P; Bean, A; Boren, S; Bowen, J; Bruner, C; Castle, J; Forthomme, L; Kenny, R P; Khalil, S; Kropivnitskaya, A; Majumder, D; Mcbrayer, W; 
Murray, M; Sanders, S; Stringer, R; Tapia Takaki, J D; Wang, Q; Ivanov, A; Kaadze, K; Maravin, Y; Mohammadi, A; Saini, L K; Skhirtladze, N; Toda, S; Rebassoo, F; Wright, D; Anelli, C; Baden, A; Baron, O; Belloni, A; Calvert, B; Eno, S C; Ferraioli, C; Gomez, J A; Hadley, N J; Jabeen, S; Kellogg, R G; Kolberg, T; Kunkle, J; Lu, Y; Mignerey, A C; Ricci-Tam, F; Shin, Y H; Skuja, A; Tonjes, M B; Tonwar, S C; Abercrombie, D; Allen, B; Apyan, A; Barbieri, R; Baty, A; Bi, R; Bierwagen, K; Brandt, S; Busza, W; Cali, I A; Demiragli, Z; Di Matteo, L; Gomez Ceballos, G; Goncharov, M; Hsu, D; Iiyama, Y; Innocenti, G M; Klute, M; Kovalskyi, D; Krajczar, K; Lai, Y S; Lee, Y-J; Levin, A; Luckey, P D; Maier, B; Marini, A C; Mcginn, C; Mironov, C; Narayanan, S; Niu, X; Paus, C; Roland, C; Roland, G; Salfeld-Nebgen, J; Stephans, G S F; Sumorok, K; Tatar, K; Varma, M; Velicanu, D; Veverka, J; Wang, J; Wang, T W; Wyslouch, B; Yang, M; Zhukova, V; Benvenuti, A C; Chatterjee, R M; Evans, A; Finkel, A; Gude, A; Hansen, P; Kalafut, S; Kao, S C; Kubota, Y; Lesko, Z; Mans, J; Nourbakhsh, S; Ruckstuhl, N; Rusack, R; Tambe, N; Turkewitz, J; Acosta, J G; Oliveros, S; Avdeeva, E; Bartek, R; Bloom, K; Claes, D R; Dominguez, A; Fangmeier, C; Gonzalez Suarez, R; Kamalieddin, R; Kravchenko, I; Malta Rodrigues, A; Meier, F; Monroy, J; Siado, J E; Snow, G R; Stieger, B; Alyari, M; Dolen, J; George, J; Godshalk, A; Harrington, C; Iashvili, I; Kaisen, J; Kharchilava, A; Kumar, A; Parker, A; Rappoccio, S; Roozbahani, B; Alverson, G; Barberis, E; Hortiangtham, A; Massironi, A; Morse, D M; Nash, D; Orimoto, T; Teixeira De Lima, R; Trocino, D; Wang, R-J; Wood, D; Bhattacharya, S; Charaf, O; Hahn, K A; Kubik, A; Kumar, A; Mucia, N; Odell, N; Pollack, B; Schmitt, M H; Sung, K; Trovato, M; Velasco, M; Dev, N; Hildreth, M; Hurtado Anampa, K; Jessop, C; Karmgard, D J; Kellams, N; Lannon, K; Marinelli, N; Meng, F; Mueller, C; Musienko, Y; Planer, M; Reinsvold, A; Ruchti, R; Smith, G; Taroni, S; Wayne, M; Wolf, 
M; Woodard, A; Alimena, J; Antonelli, L; Bylsma, B; Durkin, L S; Flowers, S; Francis, B; Hart, A; Hill, C; Hughes, R; Ji, W; Liu, B; Luo, W; Puigh, D; Winer, B L; Wulsin, H W; Cooperstein, S; Driga, O; Elmer, P; Hardenbrook, J; Hebda, P; Lange, D; Luo, J; Marlow, D; Mc Donald, J; Medvedeva, T; Mei, K; Mooney, M; Olsen, J; Palmer, C; Piroué, P; Stickland, D; Svyatkovskiy, A; Tully, C; Zuranski, A; Malik, S; Barker, A; Barnes, V E; Folgueras, S; Gutay, L; Jha, M K; Jones, M; Jung, A W; Miller, D H; Neumeister, N; Schulte, J F; Shi, X; Sun, J; Wang, F; Xie, W; Parashar, N; Stupak, J; Adair, A; Akgun, B; Chen, Z; Ecklund, K M; Geurts, F J M; Guilbaud, M; Li, W; Michlin, B; Northup, M; Padley, B P; Redjimi, R; Roberts, J; Rorie, J; Tu, Z; Zabel, J; Betchart, B; Bodek, A; de Barbaro, P; Demina, R; Duh, Y T; Ferbel, T; Galanti, M; Garcia-Bellido, A; Han, J; Hindrichs, O; Khukhunaishvili, A; Lo, K H; Tan, P; Verzetti, M; Agapitos, A; Chou, J P; Contreras-Campana, E; Gershtein, Y; Gómez Espinosa, T A; Halkiadakis, E; Heindl, M; Hidas, D; Hughes, E; Kaplan, S; Kunnawalkam Elayavalli, R; Kyriacou, S; Lath, A; Nash, K; Saka, H; Salur, S; Schnetzer, S; Sheffield, D; Somalwar, S; Stone, R; Thomas, S; Thomassen, P; Walker, M; Delannoy, A G; Foerster, M; Heideman, J; Riley, G; Rose, K; Spanier, S; Thapa, K; Bouhali, O; Celik, A; Dalchenko, M; De Mattia, M; Delgado, A; Dildick, S; Eusebi, R; Gilmore, J; Huang, T; Juska, E; Kamon, T; Mueller, R; Pakhotin, Y; Patel, R; Perloff, A; Perniè, L; Rathjens, D; Rose, A; Safonov, A; Tatarinov, A; Ulmer, K A; Akchurin, N; Cowden, C; Damgov, J; De Guio, F; Dragoiu, C; Dudero, P R; Faulkner, J; Gurpinar, E; Kunori, S; Lamichhane, K; Lee, S W; Libeiro, T; Peltola, T; Undleeb, S; Volobouev, I; Wang, Z; Greene, S; Gurrola, A; Janjam, R; Johns, W; Maguire, C; Melo, A; Ni, H; Sheldon, P; Tuo, S; Velkovska, J; Xu, Q; Arenton, M W; Barria, P; Cox, B; Goodell, J; Hirosky, R; Ledovskoy, A; Li, H; Neu, C; Sinthuprasith, T; Sun, X; Wang, Y; Wolfe, E; Xia, 
F; Clarke, C; Harr, R; Karchin, P E; Sturdy, J; Belknap, D A; Buchanan, J; Caillol, C; Dasu, S; Dodd, L; Duric, S; Gomber, B; Grothe, M; Herndon, M; Hervé, A; Klabbers, P; Lanaro, A; Levine, A; Long, K; Loveless, R; Ojalvo, I; Perry, T; Pierro, G A; Polese, G; Ruggles, T; Savin, A; Smith, N; Smith, W H; Taylor, D; Woods, N

    2017-01-01

    The nuclear modification factor [Formula: see text] and the azimuthal anisotropy coefficient [Formula: see text] of prompt and nonprompt (i.e. those from decays of b hadrons) [Formula: see text] mesons, measured from PbPb and pp collisions at [Formula: see text] [Formula: see text] at the LHC, are reported. The results are presented in several event centrality intervals and several kinematic regions, for transverse momenta [Formula: see text] [Formula: see text] and rapidity [Formula: see text], extending down to [Formula: see text] [Formula: see text] in the [Formula: see text] range. The [Formula: see text] of prompt [Formula: see text] is found to be nonzero, but with no strong dependence on centrality, rapidity, or [Formula: see text] over the full kinematic range studied. The measured [Formula: see text] of nonprompt [Formula: see text] is consistent with zero. The [Formula: see text] of prompt [Formula: see text] exhibits a suppression that increases from peripheral to central collisions but does not vary strongly as a function of either y or [Formula: see text] in the fiducial range. The nonprompt [Formula: see text] [Formula: see text] shows a suppression which becomes stronger as rapidity or [Formula: see text] increases. The [Formula: see text] and [Formula: see text] of open and hidden charm, and of open charm and beauty, are compared.

  10. Multimedia Information Extraction

    CERN Document Server

    Maybury, Mark T

    2012-01-01

    The advent of increasingly large consumer collections of audio (e.g., iTunes), imagery (e.g., Flickr), and video (e.g., YouTube) is driving a need not only for multimedia retrieval but also information extraction from and across media. Furthermore, industrial and government collections fuel requirements for stock media access, media preservation, broadcast news retrieval, identity management, and video surveillance.  While significant advances have been made in language processing for information extraction from unstructured multilingual text and extraction of objects from imagery and vid

  11. Geographical Text Analysis: A new approach to understanding nineteenth-century mortality.

    Science.gov (United States)

    Porter, Catherine; Atkinson, Paul; Gregory, Ian

    2015-11-01

    This paper uses a combination of Geographic Information Systems (GIS) and corpus linguistic analysis to extract and analyse disease related keywords from the Registrar-General's Decennial Supplements. Combined with known mortality figures, this provides, for the first time, a spatial picture of the relationship between the Registrar-General's discussion of disease and deaths in England and Wales in the nineteenth and early twentieth centuries. Techniques such as collocation, density analysis, the Hierarchical Regional Settlement matrix and regression analysis are employed to extract and analyse the data resulting in new insight into the relationship between the Registrar-General's published texts and the changing mortality patterns during this time. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Electromembrane extraction

    DEFF Research Database (Denmark)

    Huang, Chuixiu; Chen, Zhiliang; Gjelstad, Astrid

    2017-01-01

    Electromembrane extraction (EME) was inspired by solid-phase microextraction and developed from hollow fiber liquid-phase microextraction in 2006 by applying an electric field over the supported liquid membrane (SLM). EME provides rapid extraction, efficient sample clean-up and selectivity based...

  13. Vacuum extraction

    DEFF Research Database (Denmark)

    Maagaard, Mathilde; Oestergaard, Jeanett; Johansen, Marianne

    2012-01-01

    Objectives. To develop and validate an Objective Structured Assessment of Technical Skills (OSATS) scale for vacuum extraction. Design. Two-part study design: Primarily, development of a procedure-specific checklist for vacuum extraction. Hereafter, validation of the developed OSATS scale for vacuum...

  14. Newspaper archives + text mining = rich sources of historical geo-spatial data

    Science.gov (United States)

    Yzaguirre, A.; Smit, M.; Warren, R.

    2016-04-01

    Newspaper archives are rich sources of cultural, social, and historical information. These archives, even when digitized, are typically unstructured and organized by date rather than by subject or location, and require substantial manual effort to analyze. The effort of journalists to be accurate and precise means that there is often rich geo-spatial data embedded in the text, alongside text describing events that editors considered to be of sufficient importance to the region or the world to merit column inches. A regional newspaper can add over 100,000 articles to its database each year, and extracting information from this data for even a single country would pose a substantial Big Data challenge. In this paper, we describe a pilot study on the construction of a database of historical flood events (location(s), date, cause, magnitude) to be used in flood assessment projects, for example to calibrate models, estimate frequency, establish high water marks, or plan for future events in contexts ranging from urban planning to climate change adaptation. We then present a vision for extracting and using the rich geospatial data available in unstructured text archives, and suggest future avenues of research.
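
The flood-event database described above can be pictured as one record per event. The following is a minimal sketch, assuming only the four fields the abstract names (location(s), date, cause, magnitude). The class name `FloodEvent`, the `source_articles` field, the helper `events_at`, and every sample value are hypothetical and are not taken from the pilot study.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class FloodEvent:
    """One historical flood event, using the fields named in the abstract."""
    locations: List[str]
    event_date: date
    cause: Optional[str] = None      # None when the article does not state a cause
    magnitude: Optional[str] = None  # kept as free text; the unit is an open design choice
    source_articles: List[str] = field(default_factory=list)  # hypothetical provenance field


def events_at(events: List[FloodEvent], place: str) -> List[FloodEvent]:
    """Return the events that mention a place, ordered by date."""
    matches = [e for e in events if place in e.locations]
    return sorted(matches, key=lambda e: e.event_date)


# Placeholder records, invented purely to exercise the structure.
events = [
    FloodEvent(["Exampletown", "Sampleville"], date(1950, 4, 12), cause="spring melt"),
    FloodEvent(["Exampletown"], date(1923, 6, 1), magnitude="high water mark exceeded"),
    FloodEvent(["Otherplace"], date(1948, 5, 30)),
]
exampletown_history = events_at(events, "Exampletown")
```

Keeping `locations` as a list reflects the "location(s)" wording in the abstract, since a single article may report one flood across several places.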

  15. A programmed text in statistics

    CERN Document Server

    Hine, J

    1975-01-01

    Exercises for Section 2: Physical sciences and engineering; Biological sciences; Social sciences. Solutions to Exercises, Section 1: Physical sciences and engineering; Biological sciences; Social sciences. Solutions to Exercises, Section 2: Physical sciences and engineering; Biological sciences; Social sciences. Tables: χ² tests involving variances; χ² one-tailed tests; χ² two-tailed tests; F-distribution. Preface: This project started some years ago when the Nuffield Foundation kindly gave a grant for writing a programmed text to use with service courses in statistics. The work was carried out by Mrs. Joan Hine and Professor G. B. Wetherill at Bath University, together with some other help from time to time by colleagues at Bath University and elsewhere. Testing was done at various colleges and universities, and some helpful comments were received, but we particularly mention King Edwards School, Bath, who provided some sixth formers as 'guinea pigs' for the fir...

  16. Audio Steganography with Embedded Text

    Science.gov (United States)

    Teck Jian, Chua; Chai Wen, Chuah; Rahman, Nurul Hidayah Binti Ab.; Hamid, Isredza Rahmi Binti A.

    2017-08-01

    Audio steganography hides a secret message inside an audio signal. It is a technique used to secure the transmission of secret information or to hide its existence. It may also provide confidentiality if the message is encrypted. To date, most steganography software, such as Mp3Stego and DeepSound, uses a block cipher such as the Advanced Encryption Standard or the Data Encryption Standard to encrypt the secret message. This is good practice for security. However, if the secret message is long, the encrypted message may become too long to embed in the audio and may distort the cover audio. Hence, there is a need to encrypt the message with a stream cipher before embedding it into the audio. A stream cipher encrypts bit by bit, whereas a block cipher encrypts fixed-length blocks of bits, which results in a longer output than a stream cipher produces. Hence, an audio steganography scheme that embeds text encrypted with the Rivest Cipher 4 (RC4) stream cipher is designed, developed and tested in this project.
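
The length argument in this abstract can be illustrated with a short sketch. The `rc4` function below follows the standard RC4 key-scheduling and keystream steps and is not the authors' implementation. The helper `padded_block_length` assumes PKCS#7-style padding with a 16-byte block, which the abstract does not specify. The key and message are illustrative.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the same operation does both)."""
    if not key:
        raise ValueError("key must be non-empty")
    # Key-scheduling algorithm: permute the 256-entry state using the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    # Keystream generation: XOR one keystream byte with each data byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def padded_block_length(n_bytes: int, block_size: int = 16) -> int:
    """Ciphertext length under PKCS#7-style padding (an assumption, see lead-in)."""
    return (n_bytes // block_size + 1) * block_size


message = b"Plaintext"
ciphertext = rc4(b"Key", message)
stream_length = len(ciphertext)                    # same length as the message
block_length = padded_block_length(len(message))   # rounded up to whole blocks
```

Because the stream cipher output is exactly as long as the message, the payload to embed in the cover audio never grows, whereas the padded block-cipher output is rounded up to the next whole block.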

  17. The use of digressions in oral texts

    Directory of Open Access Journals (Sweden)

    Leila Maria Tesch

    2017-02-01

    Full Text Available Digression occurs when a topic is momentarily suspended, a new topic of greater interest at that moment is introduced, and the original topic is then reintroduced. According to Dascal and Katriel (1979), there are three types of digression: (1) the one based on the enunciation; (2) the one based on the interaction; and (3) the one based on inserted sequences. This paper aims to investigate the use of these three types of digression in oral texts produced by Capixaba speakers, extracted from typically Labovian interviews in the PortVix database. The objective of this research is to observe to what extent the interaction, by employing this strategy, is a kind of reorientation of its meaning. The qualitative analysis of the data showed that digression is a conversational strategy that is widely used by speakers and that, despite its suspensory and fluctuating character, it should be considered a coherent event, given that it intervenes decisively in the establishment and maintenance of the textual and interactional organization of the communicative event.

  18. Characterization and Antioxidant Properties of Six Algerian Propolis Extracts: Ethyl Acetate Extracts Inhibit Myeloperoxidase Activity

    Directory of Open Access Journals (Sweden)

    Yasmina Mokhtaria Boufadi

    2014-02-01

    Full Text Available Because propolis contains many types of antioxidant compounds such as polyphenols and flavonoids, it can be useful in preventing oxidative damage. Ethyl acetate extracts of propolis from several Algerian regions show high activity by scavenging free radicals, preventing lipid peroxidation and inhibiting myeloperoxidase (MPO). By fractioning and assaying ethyl acetate extracts, it was observed that both polyphenols and flavonoids contribute to these activities. A correlation was observed between the polyphenol content and the MPO inhibition. However, it seems that kaempferol, a flavonoid, contributes mainly to the MPO inhibition. This molecule is present in high amounts in the ethyl acetate extract and demonstrates the best efficiency towards the enzyme, with a 50% inhibitory concentration of 4 ± 2 µM.

  19. Discovering gene annotations in biomedical text databases

    Directory of Open Access Journals (Sweden)

    Ozsoyoglu Gultekin

    2008-03-01

    Full Text Available Background: Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results: In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns" and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision of 78% at a recall of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion: GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme with respect to "exact matching" with the advantage of locating approximate

  20. Text summarization as a decision support aid

    Directory of Open Access Journals (Sweden)

    Workman T

    2012-05-01

    Full Text Available Background: PubMed data potentially can provide decision support information, but PubMed was not exclusively designed to be a point-of-care tool. Natural language processing applications that summarize PubMed citations hold promise for extracting decision support information. The objective of this study was to evaluate the efficiency of a text summarization application called Semantic MEDLINE, enhanced with a novel dynamic summarization method, in identifying decision support data. Methods: We downloaded PubMed citations addressing the prevention and drug treatment of four disease topics. We then processed the citations with Semantic MEDLINE, enhanced with the dynamic summarization method. We also processed the citations with a conventional summarization method, as well as with a baseline procedure. We evaluated the results using clinician-vetted reference standards built from recommendations in a commercial decision support product, DynaMed. Results: For the drug treatment data, Semantic MEDLINE enhanced with dynamic summarization achieved average recall and precision scores of 0.848 and 0.377, while conventional summarization produced 0.583 average recall and 0.712 average precision, and the baseline method yielded average recall and precision values of 0.252 and 0.277. For the prevention data, Semantic MEDLINE enhanced with dynamic summarization achieved average recall and precision scores of 0.655 and 0.329. The baseline technique resulted in recall and precision scores of 0.269 and 0.247. No conventional Semantic MEDLINE method accommodating summarization for prevention exists. Conclusion: Semantic MEDLINE with dynamic summarization outperformed conventional summarization in terms of recall, and outperformed the baseline method in both recall and precision. This new approach to text summarization demonstrates potential in identifying decision support data for multiple needs.
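
Recall and precision pull in opposite directions in these results, so a single combined figure can help when comparing the methods. The sketch below derives the F1 score (the harmonic mean of precision and recall) from the averages quoted in the abstract for the drug treatment data. F1 is not reported in the abstract, so the derived values are an illustration and not the authors' results. Note that the F1 of averaged scores generally differs from the average of per-topic F1 scores. The names `f1_score`, `drug_treatment`, and `f1_by_method` are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision) averages as quoted in the abstract, drug treatment data.
drug_treatment = {
    "dynamic summarization": (0.848, 0.377),
    "conventional summarization": (0.583, 0.712),
    "baseline": (0.252, 0.277),
}

# Derived values only: the abstract itself reports recall and precision, not F1.
f1_by_method = {
    name: round(f1_score(precision, recall), 3)
    for name, (recall, precision) in drug_treatment.items()
}
```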

  1. A Novel Method of Genomic DNA Extraction for Cactaceae

    Directory of Open Access Journals (Sweden)

    Shannon D. Fehlberg

    2013-03-01

    Full Text Available Premise of the study: Genetic studies of Cactaceae can at times be impeded by difficult sampling logistics and/or high mucilage content in tissues. Simplifying sampling and DNA isolation through the use of cactus spines has not previously been investigated. Methods and Results: Several protocols for extracting DNA from spines were tested and modified to maximize yield, amplification, and sequencing. Sampling of and extraction from spines resulted in a simplified protocol overall and complete avoidance of mucilage as compared to typical tissue extractions. Sequences from one nuclear and three plastid regions were obtained across eight genera and 20 species of cacti using DNA extracted from spines. Conclusions: Genomic DNA useful for amplification and sequencing can be obtained from cactus spines. The protocols described here are valuable for any cactus species, but are particularly useful for investigators interested in sampling living collections, extensive field sampling, and/or conservation genetic studies.

  2. The effects of tea extracts on proinflammatory signaling

    Directory of Open Access Journals (Sweden)

    McBride William H

    2006-12-01

    Full Text Available Background: Skin toxicity is a common side effect of radiotherapy for solid tumors. Its management can cause treatment gaps and thus can impair cancer treatment. At present, in many countries no standard recommendation for treatment of skin during radiotherapy exists. In this study, we explored the effect of topically-applied tea extracts on the duration of radiation-induced skin toxicity. We investigated the underlying molecular mechanisms and compared effects of tea extracts with the effects of epigallocatechin-gallate, the proposed most-active moiety of green tea. Methods: Data from 60 patients with cancer of the head and neck or pelvic region topically treated with green or black tea extracts were analyzed retrospectively. Tea extracts were compared for their ability to modulate IL-1β, IL-6, IL-8, TNFα and PGE2 release from human monocytes. Effects of tea extracts on 26S proteasome function were assessed. NF-κB activity was monitored by EMSAs. Viability and radiation response of macrophages after exposure to tea extracts was measured by MTT assays. Results: Tea extracts supported the restitution of skin integrity. Tea extracts inhibited proteasome function and suppressed cytokine release. NF-κB activity was altered by tea extracts in a complex, caspase-dependent manner, which differed from the effects of epigallocatechin-gallate. Additionally, both tea extracts, as well as epigallocatechin-gallate, slightly protected macrophages from ionizing radiation. Conclusion: Tea extracts are an efficient, broadly available treatment option for patients suffering from acute radiation-induced skin toxicity. The molecular mechanisms underlying the beneficial effects are complex, and most likely not exclusively dependent on effects of tea polyphenols such as epigallocatechin-gallate.

  3. Linguistically informed digital fingerprints for text

    Science.gov (United States)

    Uzuner, Özlem

    2006-02-01

    Digital fingerprinting, watermarking, and tracking technologies have gained importance in recent years in response to growing problems such as digital copyright infringement. While fingerprints and watermarks can be generated in many different ways, use of natural language processing for these purposes has so far been limited. Measuring similarity of literary works for automatic copyright infringement detection requires identifying and comparing creative expression of content in documents. In this paper, we present a linguistic approach to automatically fingerprinting novels based on their expression of content. We use natural language processing techniques to generate "expression fingerprints". These fingerprints consist of both syntactic and semantic elements of language, i.e., syntactic and semantic elements of expression. Our experiments indicate that syntactic and semantic elements of expression enable accurate identification of novels and their paraphrases, providing a significant improvement over techniques used in text classification literature for automatic copy recognition. We show that these elements of expression can be used to fingerprint, label, or watermark works; they represent features that are essential to the character of works and that remain fairly consistent in the works even when works are paraphrased. These features can be directly extracted from the contents of the works on demand and can be used to recognize works that would not be correctly identified either in the absence of pre-existing labels or by verbatim-copy detectors.

  4. Named entity recognition in Slovene text

    Directory of Open Access Journals (Sweden)

    Tadej Štajner

    2013-12-01

    Full Text Available This paper presents an approach and an implementation of a named entity extractor for the Slovene language, based on a machine learning approach. It is designed as a supervised algorithm based on Conditional Random Fields and is trained on the ssj500k annotated corpus of Slovene. The corpus, which is available under a Creative Commons CC-BY-NC-SA licence, is annotated with morphosyntactic tags, as well as named entities for people, locations, organisations, and miscellaneous names. The paper discusses the influence of morphosyntactic tags, lexicons and conjunctions of features of neighbouring words. An important contribution of this investigation is that morphosyntactic tags benefit named entity extraction. Using all the best-performing features, the recognizer reaches a precision of 74% and a recall of 72%, with stronger performance on personal and geographical named entities, followed by organizations, but it performs poorly on the miscellaneous entities, since this class is very diverse and consequently difficult to predict. A major contribution of the paper is also showing the benefits of splitting the class of miscellaneous entities into organizations and other entities, which in turn improves performance even on personal and organizational names. The software developed in this research is freely available under the Apache 2.0 licence at http://ailab.ijs.si/~tadej/slner.zip, while development versions are available at https://github.com/tadejs/slner.
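
The three feature families discussed in this abstract (morphosyntactic tags, lexicons, and conjunctions of features of neighbouring words) can be sketched as a per-token feature function. This is a minimal illustration and not the slner implementation: the function name `token_features`, the feature names, the one-word context window, the sample sentence, its tags, and the tiny lexicon are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

Token = Tuple[str, str]  # (word form, morphosyntactic tag)


def token_features(sentence: List[Token], i: int, lexicon: Set[str]) -> Dict[str, object]:
    """Build a feature dict for token i from its word, tag, lexicon hit, and neighbours."""
    word, tag = sentence[i]
    features: Dict[str, object] = {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "msd": tag,                              # morphosyntactic tag feature
        "in_lexicon": word.lower() in lexicon,   # lexicon (gazetteer) feature
    }
    # Neighbouring words, plus a conjunction of the neighbour's tag with this token's tag.
    for offset in (-1, 1):
        j = i + offset
        if 0 <= j < len(sentence):
            n_word, n_tag = sentence[j]
            features[f"{offset:+d}:word.lower"] = n_word.lower()
            features[f"{offset:+d}:msd|msd"] = f"{n_tag}|{tag}"
        else:
            features[f"{offset:+d}:boundary"] = True  # sentence start or end
    return features


# Illustrative sentence with made-up tags; the lexicon is a toy placeholder.
sentence = [("Tadej", "Npmsn"), ("živi", "Vmpr3s"), ("v", "Sl"), ("Ljubljani", "Npfsl")]
lexicon = {"ljubljani"}
features = [token_features(sentence, i, lexicon) for i in range(len(sentence))]
```

A list of such dicts, one per token, would then be paired with entity labels to train a sequence model such as a Conditional Random Field.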

  5. Text summarization as a decision support aid.

    Science.gov (United States)

    Workman, T Elizabeth; Fiszman, Marcelo; Hurdle, John F

    2012-05-23

    PubMed data potentially can provide decision support information, but PubMed was not exclusively designed to be a point-of-care tool. Natural language processing applications that summarize PubMed citations hold promise for extracting decision support information. The objective of this study was to evaluate the efficiency of a text summarization application called Semantic MEDLINE, enhanced with a novel dynamic summarization method, in identifying decision support data. We downloaded PubMed citations addressing the prevention and drug treatment of four disease topics. We then processed the citations with Semantic MEDLINE, enhanced with the dynamic summarization method. We also processed the citations with a conventional summarization method, as well as with a baseline procedure. We evaluated the results using clinician-vetted reference standards built from recommendations in a commercial decision support product, DynaMed. For the drug treatment data, Semantic MEDLINE enhanced with dynamic summarization achieved average recall and precision scores of 0.848 and 0.377, while conventional summarization produced 0.583 average recall and 0.712 average precision, and the baseline method yielded average recall and precision values of 0.252 and 0.277. For the prevention data, Semantic MEDLINE enhanced with dynamic summarization achieved average recall and precision scores of 0.655 and 0.329. The baseline technique resulted in recall and precision scores of 0.269 and 0.247. No conventional Semantic MEDLINE method accommodating summarization for prevention exists. Semantic MEDLINE with dynamic summarization outperformed conventional summarization in terms of recall, and outperformed the baseline method in both recall and precision. This new approach to text summarization demonstrates potential in identifying decision support data for multiple needs.

  6. Extraction with supercritical gases

    Energy Technology Data Exchange (ETDEWEB)

    Williams, D.F.

    1981-01-01

    Extraction with compressed fluids in the critical region is discussed in terms of the marked effect on solvent properties that can be brought about by small changes in pressure or temperature. The theoretical background and experimental data are described, including the classification of the phase behaviour of binary systems. A number of application studies are quoted, and comparison is made with liquid solvent extraction and distillation. Apart from such topics as the breaking of azeotropes, the main area of study is in performing separations on the basis of volatility where the general level of volatility is low. In the field of natural products these include the removal of undesirable substances such as caffeine and nicotine and the isolation of valuable constituents such as food essences and drugs. For fossil fuels, applications are described in enhanced oil recovery, fractionation of heavy petroleum liquids and extraction of liquids from coal.

  7. Data Mining of Causal Relations from Text: Analysing Maritime Accident Investigation Reports

    OpenAIRE

    Tirunagari, Santosh

    2015-01-01

    Text mining is a process of extracting information of interest from text. Such a method includes techniques from various areas such as Information Retrieval (IR), Natural Language Processing (NLP), and Information Extraction (IE). In this study, text mining methods are applied to extract causal relations from maritime accident investigation reports collected from the Marine Accident Investigation Branch (MAIB). These causal relations provide information on various mechanisms behind accidents,...

  8. Storyteller: Visual Analytics of Perspectives on Rich Text Interpretations

    NARCIS (Netherlands)

    Fokkens, A.S.

    2017-01-01

    Complexity of event data in texts makes it difficult to assess its content, especially when considering larger collections in which different sources report on the same or similar situations. We present a system that makes it possible to visually analyze complex event and emotion data extracted

  9. How Well can We Learn Interpretable Entity Types from Text?

    DEFF Research Database (Denmark)

    Hovy, Dirk

    2014-01-01

    We investigate a largely unsupervised approach to learning interpretable, domain-specific entity types from unlabeled text. It assumes that any common noun in a domain can function as potential entity type, and uses those nouns as hidden variables in a HMM. To constrain training, it extracts co-o...... an informed baseline, reducing the error rate by 56%....

  10. Temporal analysis of text data using latent variable models

    DEFF Research Database (Denmark)

    Mølgaard, Lasse Lohilahti; Larsen, Jan; Goutte, Cyril

    2009-01-01

    Detecting and tracking of temporal data is an important task in multiple applications. In this paper we study temporal text mining methods for Music Information Retrieval. We compare two ways of detecting the temporal latent semantics of a corpus extracted from Wikipedia, using a stepwise...

  11. TextFlow: towards better understanding of evolving topics in text.

    Science.gov (United States)

    Cui, Weiwei; Liu, Shixia; Tan, Li; Shi, Conglei; Song, Yangqiu; Gao, Zekai J; Tong, Xin; Qu, Huamin

    2011-12-01

    Understanding how topics evolve in text data is an important and challenging task. Although much work has been devoted to topic analysis, the study of topic evolution has largely been limited to individual topics. In this paper, we introduce TextFlow, a seamless integration of visualization and topic mining techniques, for analyzing various evolution patterns that emerge from multiple topics. We first extend an existing analysis technique to extract three-level features: the topic evolution trend, the critical event, and the keyword correlation. Then a coherent visualization that consists of three new visual components is designed to convey complex relationships between them. Through interaction, the topic mining model and visualization can communicate with each other to help users refine the analysis result and gain insights into the data progressively. Finally, two case studies are conducted to demonstrate the effectiveness and usefulness of TextFlow in helping users understand the major topic evolution patterns in time-varying text data. © 2011 IEEE

  12. Supercritical extraction of pupunha (Guilielma speciosa) oil in a fixed bed using carbon dioxide

    Directory of Open Access Journals (Sweden)

    Araújo M.E.

    2000-01-01

    Full Text Available The pupunha (Guilielma speciosa) is the fruit of a palm tree typical of the Brazilian Northern region, whose stem is used as a source of heart of palm. The fruit, which is about 65% pulp, is a source of oil and carotenes. In the present work, an analysis of the kinetics of supercritical extraction of oil from the pupunha pulp is presented. Carbon dioxide was used as solvent. The extractions were carried out at 25 MPa and 323 K and at 30 MPa and 318 K. The chemical composition of the extracts in terms of fatty acids was determined by gas chromatography. The amount of oleic acid, a monounsaturated fatty acid, in the CO2 extracts was larger than that in the extract obtained with hexane. The overall extraction curves were modeled using the single-parameter model proposed in the literature to describe the desorption of toluene from activated coal.

  13. Word and text processing in developmental prosopagnosia.

    Science.gov (United States)

    Rubino, Cristina; Corrow, Sherryse L; Corrow, Jeffrey C; Duchaine, Brad; Barton, Jason J S

    2016-01-01

    The "many-to-many" hypothesis proposes that visual object processing is supported by distributed circuits that overlap for different object categories. For faces and words the hypothesis posits that both posterior fusiform regions contribute to both face and visual word perception and predicts that unilateral lesions impairing one will affect the other. However, studies testing this hypothesis have produced mixed results. We evaluated visual word processing in subjects with developmental prosopagnosia, a condition linked to right posterior fusiform abnormalities. Ten developmental prosopagnosic subjects performed a word-length effect task and a task evaluating the recognition of word content across variations in text style, and the recognition of style across variations in word content. All subjects had normal word-length effects. One had prolonged sorting time for word recognition in handwritten stimuli. These results suggest that the deficit in developmental prosopagnosia is unlikely to affect visual word processing, contrary to predictions of the many-to-many hypothesis.

  14. Utilization of Lavandula angustifolia Miller extracts as natural repellents, pharmaceutical and industrial auxiliaries

    Directory of Open Access Journals (Sweden)

    AYOE YUSUFOGLU

    2004-01-01

    Full Text Available Essential oils, absolutes and concretes were prepared from the flowers and leaves of the plant Lavandula angustifolia Miller cultivated in the Bosphorus region of Istanbul, Turkey. The difference in the chemical composition of the mentioned extracts was investigated and compared by using a combination of capillary GC-MS with the aim of offering them as repellent, pharmaceutical and industrial auxiliaries. The IR-spectra, the yields and the physico-chemical data of the extracts were also analysed.

  15. Understanding a reader's attraction to a literary short text

    OpenAIRE

    Darío Luis Banegas

    2014-01-01

    The aim of this article is to understand why a reader may feel attracted to a short stretch of fictional discourse. I analyse a short extract taken from Khaled Hosseini’s novel A Thousand Splendid Suns through the integration of different perspectives in discourse analysis. First, I analyse the text in terms of contexts of culture and situation including field, tenor, mode, participants’ social world, setting, channel, and key. In the second section I attempt to examine the text line by line ...

  16. Regional odontodysplasia: case report

    Directory of Open Access Journals (Sweden)

    Ana Carolina Magalhães

    2007-12-01

    Full Text Available Regional odontodysplasia (RO) is a rare developmental anomaly involving both mesodermal and ectodermal dental components in a group of contiguous teeth. It affects the primary and permanent dentitions in the maxilla, the mandible, or both jaws. Generally it is localized in only one arch. The etiology of this dental anomaly is uncertain. Clinically, affected teeth have an abnormal morphology, are soft on probing and are typically discolored, yellow or yellowish-brown. Radiographically, the affected teeth show a "ghostlike" appearance. This paper reports the case of a 5-year-old girl presenting this rare anomaly on the left side of the maxillary arch, which crossed the midline. The primary maxillary left teeth (except for the canine) and the primary maxillary right central incisor were missing due to previous extractions. The permanent teeth had a "ghostlike" appearance radiographically. The treatment performed was rehabilitation with a temporary partial acrylic denture and periodic controls. In the future, the extraction of affected permanent teeth and rehabilitation with dental implants will be evaluated. The presentation of this case adds valuable information for pediatric dentists to review special clinical and radiographic features of RO, which will facilitate the diagnosis and treatment of patients with this condition.

  17. Region segmentation along image sequence

    Energy Technology Data Exchange (ETDEWEB)

    Monchal, L.; Aubry, P.

    1995-12-31

    A method to extract regions in a sequence of images is proposed. Regions are not matched from one image to the following one. Instead, the result of a region segmentation is used as an initialization to segment the following image and to track the region along the sequence. The image sequence is exploited as a spatio-temporal event. (authors). 12 refs., 8 figs.

  18. Asymmetric extractions in orthodontics

    Directory of Open Access Journals (Sweden)

    Camilo Aquino Melgaço

    2012-04-01

    Full Text Available INTRODUCTION: Extraction decisions are extremely important during treatment planning. In addition to the extraction decision, orthodontists have to choose which tooth should be extracted for the best solution of the problem and the esthetic/functional benefit of the patient. OBJECTIVE: This article aims at reviewing the literature relating the advantages, disadvantages and clinical implications of asymmetric extractions to orthodontics. METHODS: Keywords were selected in English and Portuguese and the EndNote 9 program was used for database search in PubMed, Web of Science (WSc) and LILACS. The selected articles were case reports, original articles and prospective or retrospective case-control studies concerning asymmetrical extractions of permanent teeth for the treatment of malocclusions. CONCLUSION: According to the literature reviewed, asymmetric extractions can make some specific treatment mechanics easier. Cases finished with first permanent molars in Class II or III relationship on one or both sides seem not to cause esthetic or functional problems. However, diagnosis knowledge and mechanics control are essential for treatment success.

  19. Throw the bath water out, keep the baby: keeping medically-relevant terms for text mining.

    Science.gov (United States)

    Jarman, Jay; Berndt, Donald J

    2010-11-13

    The purpose of this research is to answer the question, can medically-relevant terms be extracted from text notes and text mined for the purpose of classification and obtain equal or better results than text mining the original note? A novel method is used to extract medically-relevant terms for the purpose of text mining. A dataset of 5,009 EMR text notes (1,151 related to falls) was obtained from a Veterans Administration Medical Center. The dataset was processed with a natural language processing (NLP) application which extracted concepts based on SNOMED-CT terms from the Unified Medical Language System (UMLS) Metathesaurus. SAS Enterprise Miner was used to text mine both the set of complete text notes and the set represented by the extracted concepts. Logistic regression models were built from the results, with the extracted concept model performing slightly better than the complete note model.

  20. EXPANDING EXTRACTIONS

    NARCIS (Netherlands)

    Dietzenbacher, Erik; Lahr, Michael L.

    2013-01-01

    In this paper, we generalize hypothetical extraction techniques. We suggest that the effect of certain economic phenomena can be measured by removing them from an input-output (I-O) table and by rebalancing the set of I-O accounts. The difference between the two sets of accounts yields the

  1. Protein Extractability

    African Journals Online (AJOL)

    limited to high oleic acid oil and water purification property (Katayon et al., 2006; Foid et al., 2001 and Folkard et al., 1993), whereas it contains up to 332.5 g of crude protein per kg of sample (Jose et al., 1999). Studies to characterize the interaction effects of pH and salts on the extraction of ...

  2. Relating interesting quantitative time series patterns with text events and text features

    Science.gov (United States)

    Wanner, Franz; Schreck, Tobias; Jentner, Wolfgang; Sharalieva, Lyubka; Keim, Daniel A.

    2013-12-01

    In many application areas, the key to successful data analysis is the integrated analysis of heterogeneous data. One example is the financial domain, where time-dependent and highly frequent quantitative data (e.g., trading volume and price information) and textual data (e.g., economic and political news reports) need to be considered jointly. Data analysis tools need to support an integrated analysis, which allows studying the relationships between textual news documents and quantitative properties of the stock market price series. In this paper, we describe a workflow and tool that allows a flexible formation of hypotheses about text features and their combinations, which reflect quantitative phenomena observed in stock data. To support such an analysis, we combine the analysis steps of frequent quantitative and text-oriented data using an existing a-priori method. First, based on heuristics we extract interesting intervals and patterns in large time series data. The visual analysis supports the analyst in exploring parameter combinations and their results. The identified time series patterns are then input for the second analysis step, in which all identified intervals of interest are analyzed for frequent patterns co-occurring with financial news. An a-priori method supports the discovery of such sequential temporal patterns. Then, various text features like the degree of sentence nesting, noun phrase complexity, the vocabulary richness, etc. are extracted from the news to obtain meta patterns. Meta patterns are defined by a specific combination of text features which significantly differ from the text features of the remaining news data. Our approach combines a portfolio of visualization and analysis techniques, including time-, cluster- and sequence visualization and analysis functionality. We provide two case studies, showing the effectiveness of our combined quantitative and textual analysis workflow. The workflow can also be generalized to other

  3. Text Clustering Based on the User Search Intention

    Science.gov (United States)

    Liu, Wenjing; Zhou, Yanquan; Ren, Fuji

    This paper presents a novel text clustering algorithm. With the popularity of the Internet, text information on the web is growing explosively. Text clustering is an unsupervised machine learning method that needs neither a training process nor manual pre-tagging, so it is an effective way of dealing with massive amounts of text. Traditional text clustering is based on the content of the articles and assumes that articles belonging to the same class are more similar to one another. In this paper, we extract label words from the summary information returned by a search engine and then perform hierarchical clustering based on the text features of the label words. Experiments show that the algorithm is feasible.

  4. Microwave extraction of bioactive compounds

    Directory of Open Access Journals (Sweden)

    Monika Blekić

    2011-01-01

    Full Text Available Microwave extraction is a novel extraction and treatment method for food processing. This paper presents several examples of microwave extraction of bioactive compounds, together with novel equipment for microwave extraction and hydrodiffusion with gravitation. The advantages of the novel equipment are shown; they include shorter treatment time and reduced or no solvent use. The novel method is compared with standard extraction methods. Some positive and negative aspects of microwave heating are discussed, as well as its influence on the development of oxidation in sunflower oil subjected to microwave heating. The use of microwaves for the extraction of essential oils is also shown, including the advantages of solvent-free microwave extraction of essential oil from aromatic herbs in comparison with standard extraction, and the determination of antioxidant components in rice bran oil extracted by a microwave-assisted method. A comparison of microwave and ultrasound extraction is described, as well as the positive and negative aspects of combining microwaves and ultrasound.

  5. Genotoxicity of plant extracts

    Directory of Open Access Journals (Sweden)

    Vera M. F. Vargas

    1991-01-01

    Full Text Available Aqueous extracts of seven species used in Brazilian popular medicine (Achyrocline satureoides, Iodina rhombifolia, Desmodium incanum, Baccharis anomala, Tibouchina asperior, Luehea divaricata, Maytenus ilicifolia) were screened for the presence of mutagenic activity in the Ames test (Salmonella/microsome). Positive results were obtained for A. satureoides, B. anomala and L. divaricata with microsomal activation. As shown elsewhere (Vargas et al., 1990), the metabolites of A. satureoides extract also show the capacity to induce prophage and/or SOS response in the microscreen phage induction assay and the SOS spot chromotest.

  6. Advanced text authorship detection methods and their application to biblical texts

    Science.gov (United States)

    Putniņš, Tālis; Signoriello, Domenic J.; Jain, Samant; Berryman, Matthew J.; Abbott, Derek

    2005-12-01

    Authorship attribution has a range of applications in a growing number of fields such as forensic evidence, plagiarism detection, email filtering, and web information management. In this study, three attribution techniques are extended, tested on a corpus of English texts, and applied to a book in the New Testament of disputed authorship. The word recurrence interval based method compares standard deviations of the number of words between successive occurrences of a keyword both graphically and with chi-squared tests. The trigram Markov method compares the probabilities of the occurrence of words conditional on the preceding two words to determine the similarity between texts. The third method extracts stylometric measures such as the frequency of occurrence of function words and from these constructs text classification models using multiple discriminant analysis. The effectiveness of these techniques is compared. The accuracy of the results obtained by some of these extended methods is higher than many of the current state of the art approaches. Statistical evidence is presented about the authorship of the selected book from the New Testament.
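
    The word recurrence interval statistic mentioned in this abstract can be illustrated with a minimal sketch. It only computes the intervals and their standard deviation for one keyword; the graphical comparison and the chi-squared tests used in the study are not reproduced, and the function names and the whitespace tokenization are assumptions made for illustration.

```python
import statistics


def recurrence_intervals(words, keyword):
    """Number of words between successive occurrences of `keyword`."""
    positions = [i for i, w in enumerate(words) if w == keyword]
    # For neighbouring occurrences at a and b, b - a - 1 words lie between them.
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


def interval_std(words, keyword):
    """Population standard deviation of the intervals, or None if there are none."""
    gaps = recurrence_intervals(words, keyword)
    return statistics.pstdev(gaps) if gaps else None


# Hypothetical toy text, tokenized on whitespace (an assumption).
sample = "a b a c d e a".split()
gaps = recurrence_intervals(sample, "a")
spread = interval_std(sample, "a")
```

    Texts by the same author would be expected to give similar spreads for the same keyword; how that similarity is judged is left to the statistical tests described in the abstract.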

  7. Assessment of mutagenicity and cytotoxicity of Solanum paniculatum L. extracts using in vivo micronucleus test in mice

    Directory of Open Access Journals (Sweden)

    PM. Vieira

    Full Text Available Solanum paniculatum L. is a plant species widespread throughout tropical America, especially in the Brazilian Savanna region. It is used in Brazil for culinary purposes and in folk medicine to treat liver and gastric dysfunctions, as well as hangovers. Because of the wide use of this plant as a therapeutic resource and food, the present study aimed at evaluating the mutagenic and cytotoxic effects of S. paniculatum ethanolic leaf and fruit extracts using the mouse bone marrow micronucleus test. Our results indicate that neither S. paniculatum ethanolic leaf extract nor its ethanolic fruit extract exhibited mutagenic effect in mice bone marrow; however, at higher doses, both extracts presented cytotoxic activity.

  8. Comparing and combining chunkers of biomedical text.

    Science.gov (United States)

    Kang, Ning; van Mulligen, Erik M; Kors, Jan A

    2011-04-01

    Text chunking is an essential pre-processing step in information extraction systems. No comparative studies of chunking systems, including sentence splitting, tokenization and part-of-speech tagging, are available for the biomedical domain. We compared the usability (ease of integration, speed, trainability) and performance of six state-of-the-art chunkers for the biomedical domain, and combined the chunker results in order to improve chunking performance. We investigated six frequently used chunkers: GATE chunker, Genia Tagger, Lingpipe, MetaMap, OpenNLP, and Yamcha. All chunkers were integrated into the Unstructured Information Management Architecture framework. The GENIA Treebank corpus was used for training and testing. Performance was assessed for noun-phrase and verb-phrase chunking. For both noun-phrase chunking and verb-phrase chunking, OpenNLP performed best (F-scores 89.7% and 95.7%, respectively), but differences with Genia Tagger and Yamcha were small. With respect to usability, Lingpipe and OpenNLP scored best. When combining the results of the chunkers by a simple voting scheme, the F-score of the combined system improved by 3.1 percentage points for noun phrases and 0.6 percentage points for verb phrases as compared to the best single chunker. Changing the voting threshold offered a simple way to obtain a system with high precision (and moderate recall) or high recall (and moderate precision). This study is the first to compare the performance of the whole chunking pipeline, and to combine different existing chunking systems. Several chunkers showed good performance, but OpenNLP scored best both in performance and usability. The combination of chunker results by a simple voting scheme can further improve performance and allows for different precision-recall settings. Copyright © 2010 Elsevier Inc. All rights reserved.
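
    A simple voting scheme of the kind this abstract describes can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each chunker's output is assumed to be a set of (start, end, label) spans, a chunk is kept when at least `threshold` chunkers propose exactly the same span, and the function name and toy data are hypothetical.

```python
from collections import Counter


def vote_chunks(chunker_outputs, threshold):
    """Keep every chunk proposed by at least `threshold` chunkers.

    chunker_outputs: one collection of (start, end, label) tuples per chunker.
    """
    votes = Counter()
    for chunks in chunker_outputs:
        votes.update(set(chunks))  # each chunker votes at most once per chunk
    return {chunk for chunk, count in votes.items() if count >= threshold}


# Hypothetical output of three chunkers on the same sentence.
outputs = [
    {(0, 2, "NP"), (2, 3, "VP"), (3, 5, "NP")},
    {(0, 2, "NP"), (2, 3, "VP"), (3, 6, "NP")},
    {(0, 2, "NP"), (3, 5, "NP")},
]

strict = vote_chunks(outputs, threshold=3)   # only unanimous chunks survive
lenient = vote_chunks(outputs, threshold=1)  # every proposed chunk survives
```

    Raising the threshold keeps only chunks that many systems agree on, which favours precision; lowering it keeps more candidates, which favours recall.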

  9. Measurement of prompt and nonprompt [Formula: see text] production in [Formula: see text] and [Formula: see text] collisions at [Formula: see text].

    Science.gov (United States)

    Sirunyan, A M; Tumasyan, A; Adam, W; Asilar, E; Bergauer, T; Brandstetter, J; Brondolin, E; Dragicevic, M; Erö, J; Flechl, M; Friedl, M; Frühwirth, R; Ghete, V M; Hartl, C; Hörmann, N; Hrubec, J; Jeitler, M; König, A; Krätschmer, I; Liko, D; Matsushita, T; Mikulec, I; Rabady, D; Rad, N; Rahbaran, B; Rohringer, H; Schieck, J; Strauss, J; Waltenberger, W; Wulz, C-E; Dvornikov, O; Makarenko, V; Mossolov, V; Suarez Gonzalez, J; Zykunov, V; Shumeiko, N; Alderweireldt, S; De Wolf, E A; Janssen, X; Lauwers, J; Van De Klundert, M; Van Haevermaet, H; Van Mechelen, P; Van Remortel, N; Van Spilbeeck, A; Abu Zeid, S; Blekman, F; D'Hondt, J; Daci, N; De Bruyn, I; Deroover, K; Lowette, S; Moortgat, S; Moreels, L; Olbrechts, A; Python, Q; Skovpen, K; Tavernier, S; Van Doninck, W; Van Mulders, P; Van Parijs, I; Brun, H; Clerbaux, B; De Lentdecker, G; Delannoy, H; Fasanella, G; Favart, L; Goldouzian, R; Grebenyuk, A; Karapostoli, G; Lenzi, T; Léonard, A; Luetic, J; Maerschalk, T; Marinov, A; Randle-Conde, A; Seva, T; Vander Velde, C; Vanlaer, P; Vannerom, D; Yonamine, R; Zenoni, F; Zhang, F; Cimmino, A; Cornelis, T; Dobur, D; Fagot, A; Gul, M; Khvastunov, I; Poyraz, D; Salva, S; Schöfbeck, R; Tytgat, M; Van Driessche, W; Yazgan, E; Zaganidis, N; Bakhshiansohi, H; Beluffi, C; Bondu, O; Brochet, S; Bruno, G; Caudron, A; De Visscher, S; Delaere, C; Delcourt, M; Francois, B; Giammanco, A; Jafari, A; Komm, M; Krintiras, G; Lemaitre, V; Magitteri, A; Mertens, A; Musich, M; Piotrzkowski, K; Quertenmont, L; Selvaggi, M; Vidal Marono, M; Wertz, S; Beliy, N; Aldá Júnior, W L; Alves, F L; Alves, G A; Brito, L; Hensel, C; Moraes, A; Pol, M E; Rebello Teles, P; Belchior Batista Das Chagas, E; Carvalho, W; Chinellato, J; Custódio, A; Da Costa, E M; Da Silveira, G G; De Jesus Damiao, D; De Oliveira Martins, C; Fonseca De Souza, S; Huertas Guativa, L M; Malbouisson, H; Matos Figueiredo, D; Mora Herrera, C; Mundim, L; Nogima, H; Prado Da Silva, W L; Santoro, A; Sznajder, A; Tonelli Manganote, E 
J; Torres Da Silva De Araujo, F; Vilela Pereira, A; Ahuja, S; Bernardes, C A; Dogra, S; Fernandez Perez Tomei, T R; Gregores, E M; Mercadante, P G; Moon, C S; Novaes, S F; Padula, Sandra S; Romero Abad, D; Ruiz Vargas, J C; Aleksandrov, A; Hadjiiska, R; Iaydjiev, P; Rodozov, M; Stoykova, S; Sultanov, G; Vutova, M; Dimitrov, A; Glushkov, I; Litov, L; Pavlov, B; Petkov, P; Fang, W; Ahmad, M; Bian, J G; Chen, G M; Chen, H S; Chen, M; Chen, Y; Cheng, T; Jiang, C H; Leggat, D; Liu, Z; Romeo, F; Ruan, M; Shaheen, S M; Spiezia, A; Tao, J; Wang, C; Wang, Z; Zhang, H; Zhao, J; Ban, Y; Chen, G; Li, Q; Liu, S; Mao, Y; Qian, S J; Wang, D; Xu, Z; Avila, C; Cabrera, A; Chaparro Sierra, L F; Florez, C; Gomez, J P; González Hernández, C F; Ruiz Alvarez, J D; Sanabria, J C; Godinovic, N; Lelas, D; Puljak, I; Ribeiro Cipriano, P M; Sculac, T; Antunovic, Z; Kovac, M; Brigljevic, V; Ferencek, D; Kadija, K; Mesic, B; Susa, T; Attikis, A; Mavromanolakis, G; Mousa, J; Nicolaou, C; Ptochos, F; Razis, P A; Rykaczewski, H; Tsiakkouri, D; Finger, M; Finger, M; Carrera Jarrin, E; Assran, Y; Elkafrawy, T; Mahrous, A; Kadastik, M; Perrini, L; Raidal, M; Tiko, A; Veelken, C; Eerola, P; Pekkanen, J; Voutilainen, M; Härkönen, J; Järvinen, T; Karimäki, V; Kinnunen, R; Lampén, T; Lassila-Perini, K; Lehti, S; Lindén, T; Luukka, P; Tuominiemi, J; Tuovinen, E; Wendland, L; Talvitie, J; Tuuva, T; Besancon, M; Couderc, F; Dejardin, M; Denegri, D; Fabbro, B; Faure, J L; Favaro, C; Ferri, F; Ganjour, S; Ghosh, S; Givernaud, A; Gras, P; Hamel de Monchenault, G; Jarry, P; Kucher, I; Locci, E; Machet, M; Malcles, J; Rander, J; Rosowsky, A; Titov, M; Abdulsalam, A; Antropov, I; Arleo, F; Baffioni, S; Beaudette, F; Busson, P; Cadamuro, L; Chapon, E; Charlot, C; Davignon, O; Granier de Cassagnac, R; Jo, M; Lisniak, S; Miné, P; Nguyen, M; Ochando, C; Ortona, G; Paganini, P; Pigard, P; Regnard, S; Salerno, R; Sirois, Y; Strebler, T; Yilmaz, Y; Zabi, A; Zghiche, A; Agram, J-L; Andrea, J; Aubin, A; Bloch, D; Brom, 
J-M; Buttignol, M; Chabert, E C; Chanon, N; Collard, C; Conte, E; Coubez, X; Fontaine, J-C; Gelé, D; Goerlach, U; Le Bihan, A-C; Van Hove, P; Gadrat, S; Beauceron, S; Bernet, C; Boudoul, G; Carrillo Montoya, C A; Chierici, R; Contardo, D; Courbon, B; Depasse, P; El Mamouni, H; Fay, J; Gascon, S; Gouzevitch, M; Grenier, G; Ille, B; Lagarde, F; Laktineh, I B; Lethuillier, M; Mirabito, L; Pequegnot, A L; Perries, S; Popov, A; Sabes, D; Sordini, V; Vander Donckt, M; Verdier, P; Viret, S; Khvedelidze, A; Tsamalaidze, Z; Autermann, C; Beranek, S; Feld, L; Kiesel, M K; Klein, K; Lipinski, M; Preuten, M; Schomakers, C; Schulz, J; Verlage, T; Albert, A; Brodski, M; Dietz-Laursonn, E; Duchardt, D; Endres, M; Erdmann, M; Erdweg, S; Esch, T; Fischer, R; Güth, A; Hamer, M; Hebbeker, T; Heidemann, C; Hoepfner, K; Knutzen, S; Merschmeyer, M; Meyer, A; Millet, P; Mukherjee, S; Olschewski, M; Padeken, K; Pook, T; Radziej, M; Reithler, H; Rieger, M; Scheuch, F; Sonnenschein, L; Teyssier, D; Thüer, S; Cherepanov, V; Flügge, G; Kargoll, B; Kress, T; Künsken, A; Lingemann, J; Müller, T; Nehrkorn, A; Nowack, A; Pistone, C; Pooth, O; Stahl, A; Aldaya Martin, M; Arndt, T; Asawatangtrakuldee, C; Beernaert, K; Behnke, O; Behrens, U; Bin Anuar, A A; Borras, K; Campbell, A; Connor, P; Contreras-Campana, C; Costanza, F; Diez Pardos, C; Dolinska, G; Eckerlin, G; Eckstein, D; Eichhorn, T; Eren, E; Gallo, E; Garay Garcia, J; Geiser, A; Gizhko, A; Grados Luyando, J M; Grohsjean, A; Gunnellini, P; Harb, A; Hauk, J; Hempel, M; Jung, H; Kalogeropoulos, A; Karacheban, O; Kasemann, M; Keaveney, J; Kleinwort, C; Korol, I; Krücker, D; Lange, W; Lelek, A; Lenz, T; Leonard, J; Lipka, K; Lobanov, A; Lohmann, W; Mankel, R; Melzer-Pellmann, I-A; Meyer, A B; Mittag, G; Mnich, J; Mussgiller, A; Pitzl, D; Placakyte, R; Raspereza, A; Roland, B; Sahin, M Ö; Saxena, P; Schoerner-Sadenius, T; Spannagel, S; Stefaniuk, N; Van Onsem, G P; Walsh, R; Wissing, C; Blobel, V; Centis Vignali, M; Draeger, A R; Dreyer, T; 
Garutti, E; Gonzalez, D; Haller, J; Hoffmann, M; Junkes, A; Klanner, R; Kogler, R; Kovalchuk, N; Lapsien, T; Marchesini, I; Marconi, D; Meyer, M; Niedziela, M; Nowatschin, D; Pantaleo, F; Peiffer, T; Perieanu, A; Poehlsen, J; Scharf, C; Schleper, P; Schmidt, A; Schumann, S; Schwandt, J; Stadie, H; Steinbrück, G; Stober, F M; Stöver, M; Tholen, H; Troendle, D; Usai, E; Vanelderen, L; Vanhoefer, A; Vormwald, B; Akbiyik, M; Barth, C; Baur, S; Baus, C; Berger, J; Butz, E; Caspart, R; Chwalek, T; Colombo, F; De Boer, W; Dierlamm, A; Fink, S; Freund, B; Friese, R; Giffels, M; Gilbert, A; Goldenzweig, P; Haitz, D; Hartmann, F; Heindl, S M; Husemann, U; Katkov, I; Kudella, S; Mildner, H; Mozer, M U; Müller, Th; Plagge, M; Quast, G; Rabbertz, K; Röcker, S; Roscher, F; Schröder, M; Shvetsov, I; Sieber, G; Simonis, H J; Ulrich, R; Wayand, S; Weber, M; Weiler, T; Williamson, S; Wöhrmann, C; Wolf, R; Anagnostou, G; Daskalakis, G; Geralis, T; Giakoumopoulou, V A; Kyriakis, A; Loukas, D; Topsis-Giotis, I; Kesisoglou, S; Panagiotou, A; Saoulidou, N; Tziaferi, E; Evangelou, I; Flouris, G; Foudas, C; Kokkas, P; Loukas, N; Manthos, N; Papadopoulos, I; Paradas, E; Filipovic, N; Pasztor, G; Bencze, G; Hajdu, C; Horvath, D; Sikler, F; Veszpremi, V; Vesztergombi, G; Zsigmond, A J; Beni, N; Czellar, S; Karancsi, J; Makovec, A; Molnar, J; Szillasi, Z; Bartók, M; Raics, P; Trocsanyi, Z L; Ujvari, B; Komaragiri, J R; Bahinipati, S; Bhowmik, S; Choudhury, S; Mal, P; Mandal, K; Nayak, A; Sahoo, D K; Sahoo, N; Swain, S K; Bansal, S; Beri, S B; Bhatnagar, V; Chawla, R; Bhawandeep, U; Kalsi, A K; Kaur, A; Kaur, M; Kumar, R; Kumari, P; Mehta, A; Mittal, M; Singh, J B; Walia, G; Kumar, Ashok; Bhardwaj, A; Choudhary, B C; Garg, R B; Keshri, S; Malhotra, S; Naimuddin, M; Ranjan, K; Sharma, R; Sharma, V; Bhattacharya, R; Bhattacharya, S; Chatterjee, K; Dey, S; Dutt, S; Dutta, S; Ghosh, S; Majumdar, N; Modak, A; Mondal, K; Mukhopadhyay, S; Nandan, S; Purohit, A; Roy, A; Roy, D; Roy Chowdhury, S; 
Sarkar, S; Sharan, M; Thakur, S; Behera, P K; Chudasama, R; Dutta, D; Jha, V; Kumar, V; Mohanty, A K; Netrakanti, P K; Pant, L M; Shukla, P; Topkar, A; Aziz, T; Dugad, S; Kole, G; Mahakud, B; Mitra, S; Mohanty, G B; Parida, B; Sur, N; Sutar, B; Banerjee, S; Dewanjee, R K; Ganguly, S; Guchait, M; Jain, Sa; Kumar, S; Maity, M; Majumder, G; Mazumdar, K; Sarkar, T; Wickramage, N; Chauhan, S; Dube, S; Hegde, V; Kapoor, A; Kothekar, K; Pandey, S; Rane, A; Sharma, S; Chenarani, S; Eskandari Tadavani, E; Etesami, S M; Khakzad, M; Mohammadi Najafabadi, M; Naseri, M; Paktinat Mehdiabadi, S; Rezaei Hosseinabadi, F; Safarzadeh, B; Zeinali, M; Felcini, M; Grunewald, M; Abbrescia, M; Calabria, C; Caputo, C; Colaleo, A; Creanza, D; Cristella, L; De Filippis, N; De Palma, M; Fiore, L; Iaselli, G; Maggi, G; Maggi, M; Miniello, G; My, S; Nuzzo, S; Pompili, A; Pugliese, G; Radogna, R; Ranieri, A; Selvaggi, G; Sharma, A; Silvestris, L; Venditti, R; Verwilligen, P; Abbiendi, G; Battilana, C; Bonacorsi, D; Braibant-Giacomelli, S; Brigliadori, L; Campanini, R; Capiluppi, P; Castro, A; Cavallo, F R; Chhibra, S S; Codispoti, G; Cuffiani, M; Dallavalle, G M; Fabbri, F; Fanfani, A; Fasanella, D; Giacomelli, P; Grandi, C; Guiducci, L; Marcellini, S; Masetti, G; Montanari, A; Navarria, F L; Perrotta, A; Rossi, A M; Rovelli, T; Siroli, G P; Tosi, N; Albergo, S; Costa, S; Di Mattia, A; Giordano, F; Potenza, R; Tricomi, A; Tuve, C; Barbagli, G; Ciulli, V; Civinini, C; D'Alessandro, R; Focardi, E; Lenzi, P; Meschini, M; Paoletti, S; Russo, L; Sguazzoni, G; Strom, D; Viliani, L; Benussi, L; Bianco, S; Fabbri, F; Piccolo, D; Primavera, F; Calvelli, V; Ferro, F; Monge, M R; Robutti, E; Tosi, S; Brianza, L; Brivio, F; Ciriolo, V; Dinardo, M E; Fiorendi, S; Gennai, S; Ghezzi, A; Govoni, P; Malberti, M; Malvezzi, S; Manzoni, R A; Menasce, D; Moroni, L; Paganoni, M; Pedrini, D; Pigazzini, S; Ragazzi, S; Tabarelli de Fatis, T; Buontempo, S; Cavallo, N; De Nardo, G; Di Guida, S; Esposito, M; Fabozzi, F; 
Fienga, F; Iorio, A O M; Lanza, G; Lista, L; Meola, S; Paolucci, P; Sciacca, C; Thyssen, F; Azzi, P; Bacchetta, N; Benato, L; Boletti, A; Carlin, R; Checchia, P; Dall'Osso, M; De Castro Manzano, P; Dorigo, T; Dosselli, U; Gasparini, F; Gasparini, U; Gozzelino, A; Lacaprara, S; Margoni, M; Meneguzzo, A T; Pazzini, J; Pegoraro, M; Pozzobon, N; Ronchese, P; Sgaravatto, M; Simonetto, F; Torassa, E; Ventura, S; Zanetti, M; Zotto, P; Braghieri, A; Fallavollita, F; Magnani, A; Montagna, P; Ratti, S P; Re, V; Riccardi, C; Salvini, P; Vai, I; Vitulo, P; Alunni Solestizi, L; Bilei, G M; Ciangottini, D; Fanò, L; Lariccia, P; Leonardi, R; Mantovani, G; Menichelli, M; Saha, A; Santocchia, A; Androsov, K; Azzurri, P; Bagliesi, G; Bernardini, J; Boccali, T; Castaldi, R; Ciocci, M A; Dell'Orso, R; Donato, S; Fedi, G; Giassi, A; Grippo, M T; Ligabue, F; Lomtadze, T; Martini, L; Messineo, A; Palla, F; Rizzi, A; Savoy-Navarro, A; Spagnolo, P; Tenchini, R; Tonelli, G; Venturi, A; Verdini, P G; Barone, L; Cavallari, F; Cipriani, M; Del Re, D; Diemoz, M; Gelli, S; Longo, E; Margaroli, F; Marzocchi, B; Meridiani, P; Organtini, G; Paramatti, R; Preiato, F; Rahatlou, S; Rovelli, C; Santanastasio, F; Amapane, N; Arcidiacono, R; Argiro, S; Arneodo, M; Bartosik, N; Bellan, R; Biino, C; Cartiglia, N; Cenna, F; Costa, M; Covarelli, R; Degano, A; Demaria, N; Finco, L; Kiani, B; Mariotti, C; Maselli, S; Migliore, E; Monaco, V; Monteil, E; Monteno, M; Obertino, M M; Pacher, L; Pastrone, N; Pelliccioni, M; Pinna Angioni, G L; Ravera, F; Romero, A; Ruspa, M; Sacchi, R; Shchelina, K; Sola, V; Solano, A; Staiano, A; Traczyk, P; Belforte, S; Casarsa, M; Cossutti, F; Della Ricca, G; Zanetti, A; Kim, D H; Kim, G N; Kim, M S; Lee, S; Lee, S W; Oh, Y D; Sekmen, S; Son, D C; Yang, Y C; Lee, A; Kim, H; Brochero Cifuentes, J A; Kim, T J; Cho, S; Choi, S; Go, Y; Gyun, D; Ha, S; Hong, B; Jo, Y; Kim, Y; Lee, K; Lee, K S; Lee, S; Lim, J; Park, S K; Roh, Y; Almond, J; Kim, J; Lee, H; Oh, S B; Radburn-Smith, B C; 
Seo, S H; Yang, U K; Yoo, H D; Yu, G B; Choi, M; Kim, H; Kim, J H; Lee, J S H; Park, I C; Ryu, G; Ryu, M S; Choi, Y; Goh, J; Hwang, C; Lee, J; Yu, I; Dudenas, V; Juodagalvis, A; Vaitkus, J; Ahmed, I; Ibrahim, Z A; Md Ali, M A B; Mohamad Idris, F; Wan Abdullah, W A T; Yusli, M N; Zolkapli, Z; Castilla-Valdez, H; De La Cruz-Burelo, E; Heredia-De La Cruz, I; Hernandez-Almada, A; Lopez-Fernandez, R; Magaña Villalba, R; Mejia Guisao, J; Sanchez-Hernandez, A; Carrillo Moreno, S; Oropeza Barrera, C; Vazquez Valencia, F; Carpinteyro, S; Pedraza, I; Salazar Ibarguen, H A; Uribe Estrada, C; Morelos Pineda, A; Krofcheck, D; Butler, P H; Ahmad, A; Ahmad, M; Hassan, Q; Hoorani, H R; Khan, W A; Saddique, A; Shah, M A; Shoaib, M; Waqas, M; Bialkowska, H; Bluj, M; Boimska, B; Frueboes, T; Górski, M; Kazana, M; Nawrocki, K; Romanowska-Rybinska, K; Szleper, M; Zalewski, P; Bunkowski, K; Byszuk, A; Doroba, K; Kalinowski, A; Konecki, M; Krolikowski, J; Misiura, M; Olszewski, M; Walczak, M; Bargassa, P; Beirão Da Cruz E Silva, C; Calpas, B; Di Francesco, A; Faccioli, P; Ferreira Parracho, P G; Gallinaro, M; Hollar, J; Leonardo, N; Lloret Iglesias, L; Nemallapudi, M V; Rodrigues Antunes, J; Seixas, J; Toldaiev, O; Vadruccio, D; Varela, J; Vischia, P; Afanasiev, S; Bunin, P; Gavrilenko, M; Golutvin, I; Gorbunov, I; Kamenev, A; Karjavin, V; Lanev, A; Malakhov, A; Matveev, V; Palichik, V; Perelygin, V; Shmatov, S; Shulha, S; Skatchkov, N; Smirnov, V; Voytishin, N; Zarubin, A; Chtchipounov, L; Golovtsov, V; Ivanov, Y; Kim, V; Kuznetsova, E; Murzin, V; Oreshkin, V; Sulimov, V; Vorobyev, A; Andreev, Yu; Dermenev, A; Gninenko, S; Golubev, N; Karneyeu, A; Kirsanov, M; Krasnikov, N; Pashenkov, A; Tlisov, D; Toropin, A; Epshteyn, V; Gavrilov, V; Lychkovskaya, N; Popov, V; Pozdnyakov, I; Safronov, G; Spiridonov, A; Toms, M; Vlasov, E; Zhokin, A; Aushev, T; Bylinkin, A; Chadeeva, M; Chistov, R; Polikarpov, S; Andreev, V; Azarkin, M; Dremin, I; Kirakosyan, M; Leonidov, A; Terkulov, A; Baskakov, A; 
Belyaev, A; Boos, E; Ershov, A; Gribushin, A; Kaminskiy, A; Kodolova, O; Korotkikh, V; Lokhtin, I; Miagkov, I; Obraztsov, S; Petrushanko, S; Savrin, V; Snigirev, A; Vardanyan, I; Blinov, V; Skovpen, Y; Shtol, D; Azhgirey, I; Bayshev, I; Bitioukov, S; Elumakhov, D; Kachanov, V; Kalinin, A; Konstantinov, D; Krychkine, V; Petrov, V; Ryutin, R; Sobol, A; Troshin, S; Tyurin, N; Uzunian, A; Volkov, A; Adzic, P; Cirkovic, P; Devetak, D; Dordevic, M; Milosevic, J; Rekovic, V; Alcaraz Maestre, J; Barrio Luna, M; Calvo, E; Cerrada, M; Chamizo Llatas, M; Colino, N; De La Cruz, B; Delgado Peris, A; Escalante Del Valle, A; Fernandez Bedoya, C; Fernández Ramos, J P; Flix, J; Fouz, M C; Garcia-Abia, P; Gonzalez Lopez, O; Goy Lopez, S; Hernandez, J M; Josa, M I; Navarro De Martino, E; Pérez-Calero Yzquierdo, A; Puerta Pelayo, J; Quintario Olmeda, A; Redondo, I; Romero, L; Soares, M S; de Trocóniz, J F; Missiroli, M; Moran, D; Cuevas, J; Fernandez Menendez, J; Gonzalez Caballero, I; González Fernández, J R; Palencia Cortezon, E; Sanchez Cruz, S; Suárez Andrés, I; Vizan Garcia, J M; Cabrillo, I J; Calderon, A; Curras, E; Fernandez, M; Garcia-Ferrero, J; Gomez, G; Lopez Virto, A; Marco, J; Martinez Rivero, C; Matorras, F; Piedra Gomez, J; Rodrigo, T; Ruiz-Jimeno, A; Scodellaro, L; Trevisani, N; Vila, I; Vilar Cortabitarte, R; Abbaneo, D; Auffray, E; Auzinger, G; Baillon, P; Ball, A H; Barney, D; Bloch, P; Bocci, A; Botta, C; Camporesi, T; Castello, R; Cepeda, M; Cerminara, G; Chen, Y; d'Enterria, D; Dabrowski, A; Daponte, V; David, A; De Gruttola, M; De Roeck, A; Di Marco, E; Dobson, M; Dorney, B; du Pree, T; Duggan, D; Dünser, M; Dupont, N; Elliott-Peisert, A; Everaerts, P; Fartoukh, S; Franzoni, G; Fulcher, J; Funk, W; Gigi, D; Gill, K; Girone, M; Glege, F; Gulhan, D; Gundacker, S; Guthoff, M; Harris, P; Hegeman, J; Innocente, V; Janot, P; Kieseler, J; Kirschenmann, H; Knünz, V; Kornmayer, A; Kortelainen, M J; Kousouris, K; Krammer, M; Lange, C; Lecoq, P; Lourenço, C; Lucchini, M 
T; Malgeri, L; Mannelli, M; Martelli, A; Meijers, F; Merlin, J A; Mersi, S; Meschi, E; Milenovic, P; Moortgat, F; Morovic, S; Mulders, M; Neugebauer, H; Orfanelli, S; Orsini, L; Pape, L; Perez, E; Peruzzi, M; Petrilli, A; Petrucciani, G; Pfeiffer, A; Pierini, M; Racz, A; Reis, T; Rolandi, G; Rovere, M; Sakulin, H; Sauvan, J B; Schäfer, C; Schwick, C; Seidel, M; Sharma, A; Silva, P; Sphicas, P; Steggemann, J; Stoye, M; Takahashi, Y; Tosi, M; Treille, D; Triossi, A; Tsirou, A; Veckalns, V; Veres, G I; Verweij, M; Wardle, N; Wöhri, H K; Zagozdzinska, A; Zeuner, W D; Bertl, W; Deiters, K; Erdmann, W; Horisberger, R; Ingram, Q; Kaestli, H C; Kotlinski, D; Langenegger, U; Rohe, T; Wiederkehr, S A; Bachmair, F; Bäni, L; Bianchini, L; Casal, B; Dissertori, G; Dittmar, M; Donegà, M; Grab, C; Heidegger, C; Hits, D; Hoss, J; Kasieczka, G; Lustermann, W; Mangano, B; Marionneau, M; Martinez Ruiz Del Arbol, P; Masciovecchio, M; Meinhard, M T; Meister, D; Micheli, F; Musella, P; Nessi-Tedaldi, F; Pandolfi, F; Pata, J; Pauss, F; Perrin, G; Perrozzi, L; Quittnat, M; Rossini, M; Schönenberger, M; Starodumov, A; Tavolaro, V R; Theofilatos, K; Wallny, R; Aarrestad, T K; Amsler, C; Caminada, L; Canelli, M F; De Cosa, A; Galloni, C; Hinzmann, A; Hreus, T; Kilminster, B; Ngadiuba, J; Pinna, D; Rauco, G; Robmann, P; Salerno, D; Seitz, C; Yang, Y; Zucchetta, A; Candelise, V; Doan, T H; Jain, Sh; Khurana, R; Konyushikhin, M; Kuo, C M; Lin, W; Pozdnyakov, A; Yu, S S; Kumar, Arun; Chang, P; Chang, Y H; Chao, Y; Chen, K F; Chen, P H; Fiori, F; Hou, W-S; Hsiung, Y; Liu, Y F; Lu, R-S; Miñano Moya, M; Paganis, E; Psallidas, A; Tsai, J F; Asavapibhop, B; Singh, G; Srimanobhas, N; Suwonjandee, N; Adiguzel, A; Cerci, S; Damarseckin, S; Demiroglu, Z S; Dozen, C; Dumanoglu, I; Girgis, S; Gokbulut, G; Guler, Y; Hos, I; Kangal, E E; Kara, O; Kayis Topaksu, A; Kiminsu, U; Oglakci, M; Onengut, G; Ozdemir, K; Sunar Cerci, D; Topakli, H; Turkcapar, S; Zorbakir, I S; Zorbilmez, C; Bilin, B; Bilmis, S; 
Isildak, B; Karapinar, G; Yalvac, M; Zeyrek, M; Gülmez, E; Kaya, M; Kaya, O; Yetkin, E A; Yetkin, T; Cakir, A; Cankocak, K; Sen, S; Grynyov, B; Levchuk, L; Sorokin, P; Aggleton, R; Ball, F; Beck, L; Brooke, J J; Burns, D; Clement, E; Cussans, D; Flacher, H; Goldstein, J; Grimes, M; Heath, G P; Heath, H F; Jacob, J; Kreczko, L; Lucas, C; Newbold, D M; Paramesvaran, S; Poll, A; Sakuma, T; Seif El Nasr-Storey, S; Smith, D; Smith, V J; Belyaev, A; Brew, C; Brown, R M; Calligaris, L; Cieri, D; Cockerill, D J A; Coughlan, J A; Harder, K; Harper, S; Olaiya, E; Petyt, D; Shepherd-Themistocleous, C H; Thea, A; Tomalin, I R; Williams, T; Baber, M; Bainbridge, R; Buchmuller, O; Bundock, A; Burton, D; Casasso, S; Citron, M; Colling, D; Corpe, L; Dauncey, P; Davies, G; De Wit, A; Della Negra, M; Di Maria, R; Dunne, P; Elwood, A; Futyan, D; Haddad, Y; Hall, G; Iles, G; James, T; Lane, R; Laner, C; Lucas, R; Lyons, L; Magnan, A-M; Malik, S; Mastrolorenzo, L; Nash, J; Nikitenko, A; Pela, J; Penning, B; Pesaresi, M; Raymond, D M; Richards, A; Rose, A; Scott, E; Seez, C; Summers, S; Tapper, A; Uchida, K; Vazquez Acosta, M; Virdee, T; Wright, J; Zenz, S C; Cole, J E; Hobson, P R; Khan, A; Kyberd, P; Reid, I D; Symonds, P; Teodorescu, L; Turner, M; Borzou, A; Call, K; Dittmann, J; Hatakeyama, K; Liu, H; Pastika, N; Bartek, R; Dominguez, A; Buccilli, A; Cooper, S I; Henderson, C; Rumerio, P; West, C; Arcaro, D; Avetisyan, A; Bose, T; Gastler, D; Rankin, D; Richardson, C; Rohlf, J; Sulak, L; Zou, D; Benelli, G; Cutts, D; Garabedian, A; Hakala, J; Heintz, U; Hogan, J M; Jesus, O; Kwok, K H M; Laird, E; Landsberg, G; Mao, Z; Narain, M; Piperov, S; Sagir, S; Spencer, E; Syarif, R; Breedon, R; Burns, D; Calderon De La Barca Sanchez, M; Chauhan, S; Chertok, M; Conway, J; Conway, R; Cox, P T; Erbacher, R; Flores, C; Funk, G; Gardner, M; Ko, W; Lander, R; Mclean, C; Mulhearn, M; Pellett, D; Pilot, J; Shalhout, S; Shi, M; Smith, J; Squires, M; Stolp, D; Tos, K; Tripathi, M; Bachtis, M; Bravo, 
C; Cousins, R; Dasgupta, A; Florent, A; Hauser, J; Ignatenko, M; Mccoll, N; Saltzberg, D; Schnaible, C; Valuev, V; Weber, M; Bouvier, E; Burt, K; Clare, R; Ellison, J; Gary, J W; Ghiasi Shirazi, S M A; Hanson, G; Heilman, J; Jandir, P; Kennedy, E; Lacroix, F; Long, O R; Olmedo Negrete, M; Paneva, M I; Shrinivas, A; Si, W; Wei, H; Wimpenny, S; Yates, B R; Branson, J G; Cerati, G B; Cittolin, S; Derdzinski, M; Gerosa, R; Holzner, A; Klein, D; Krutelyov, V; Letts, J; Macneill, I; Olivito, D; Padhi, S; Pieri, M; Sani, M; Sharma, V; Simon, S; Tadel, M; Vartak, A; Wasserbaech, S; Welke, C; Wood, J; Würthwein, F; Yagil, A; Zevi Della Porta, G; Amin, N; Bhandari, R; Bradmiller-Feld, J; Campagnari, C; Dishaw, A; Dutta, V; Franco Sevilla, M; George, C; Golf, F; Gouskos, L; Gran, J; Heller, R; Incandela, J; Mullin, S D; Ovcharova, A; Qu, H; Richman, J; Stuart, D; Suarez, I; Yoo, J; Anderson, D; Bendavid, J; Bornheim, A; Bunn, J; Duarte, J; Lawhorn, J M; Mott, A; Newman, H B; Pena, C; Spiropulu, M; Vlimant, J R; Xie, S; Zhu, R Y; Andrews, M B; Ferguson, T; Paulini, M; Russ, J; Sun, M; Vogel, H; Vorobiev, I; Weinberg, M; Cumalat, J P; Ford, W T; Jensen, F; Johnson, A; Krohn, M; Leontsinis, S; Mulholland, T; Stenson, K; Wagner, S R; Alexander, J; Chaves, J; Chu, J; Dittmer, S; Mcdermott, K; Mirman, N; Nicolas Kaufman, G; Patterson, J R; Rinkevicius, A; Ryd, A; Skinnari, L; Soffi, L; Tan, S M; Tao, Z; Thom, J; Tucker, J; Wittich, P; Zientek, M; Winn, D; Abdullin, S; Albrow, M; Apollinari, G; Apresyan, A; Banerjee, S; Bauerdick, L A T; Beretvas, A; Berryhill, J; Bhat, P C; Bolla, G; Burkett, K; Butler, J N; Cheung, H W K; Chlebana, F; Cihangir, S; Cremonesi, M; Elvira, V D; Fisk, I; Freeman, J; Gottschalk, E; Gray, L; Green, D; Grünendahl, S; Gutsche, O; Hare, D; Harris, R M; Hasegawa, S; Hirschauer, J; Hu, Z; Jayatilaka, B; Jindariani, S; Johnson, M; Joshi, U; Klima, B; Kreis, B; Lammel, S; Linacre, J; Lincoln, D; Lipton, R; Liu, M; Liu, T; Lopes De Sá, R; Lykken, J; Maeshima, K; 
Magini, N; Marraffino, J M; Maruyama, S; Mason, D; McBride, P; Merkel, P; Mrenna, S; Nahn, S; O'Dell, V; Pedro, K; Prokofyev, O; Rakness, G; Ristori, L; Sexton-Kennedy, E; Soha, A; Spalding, W J; Spiegel, L; Stoynev, S; Strait, J; Strobbe, N; Taylor, L; Tkaczyk, S; Tran, N V; Uplegger, L; Vaandering, E W; Vernieri, C; Verzocchi, M; Vidal, R; Wang, M; Weber, H A; Whitbeck, A; Wu, Y; Acosta, D; Avery, P; Bortignon, P; Bourilkov, D; Brinkerhoff, A; Carnes, A; Carver, M; Curry, D; Das, S; Field, R D; Furic, I K; Konigsberg, J; Korytov, A; Low, J F; Ma, P; Matchev, K; Mei, H; Mitselmakher, G; Rank, D; Shchutska, L; Sperka, D; Thomas, L; Wang, J; Wang, S; Yelton, J; Linn, S; Markowitz, P; Martinez, G; Rodriguez, J L; Ackert, A; Adams, T; Askew, A; Bein, S; Hagopian, S; Hagopian, V; Johnson, K F; Prosper, H; Santra, A; Yohay, R; Baarmand, M M; Bhopatkar, V; Colafranceschi, S; Hohlmann, M; Noonan, D; Roy, T; Yumiceva, F; Adams, M R; Apanasevich, L; Berry, D; Betts, R R; Bucinskaite, I; Cavanaugh, R; Evdokimov, O; Gauthier, L; Gerber, C E; Hofman, D J; Jung, K; Sandoval Gonzalez, I D; Varelas, N; Wang, H; Wu, Z; Zakaria, M; Zhang, J; Bilki, B; Clarida, W; Dilsiz, K; Durgut, S; Gandrajula, R P; Haytmyradov, M; Khristenko, V; Merlo, J-P; Mermerkaya, H; Mestvirishvili, A; Moeller, A; Nachtman, J; Ogul, H; Onel, Y; Ozok, F; Penzo, A; Snyder, C; Tiras, E; Wetzel, J; Yi, K; Anderson, I; Blumenfeld, B; Cocoros, A; Eminizer, N; Fehling, D; Feng, L; Gritsan, A V; Maksimovic, P; Roskes, J; Sarica, U; Swartz, M; Xiao, M; Xin, Y; You, C; Al-Bataineh, A; Baringer, P; Bean, A; Boren, S; Bowen, J; Castle, J; Forthomme, L; Kenny, R P; Khalil, S; Kropivnitskaya, A; Majumder, D; Mcbrayer, W; Murray, M; Sanders, S; Stringer, R; Tapia Takaki, J D; Wang, Q; Ivanov, A; Kaadze, K; Maravin, Y; Mohammadi, A; Saini, L K; Skhirtladze, N; Toda, S; Rebassoo, F; Wright, D; Anelli, C; Baden, A; Baron, O; Belloni, A; Calvert, B; Eno, S C; Ferraioli, C; Gomez, J A; Hadley, N J; Jabeen, S; Jeng, G Y; 
Kellogg, R G; Kolberg, T; Kunkle, J; Mignerey, A C; Ricci-Tam, F; Shin, Y H; Skuja, A; Tonjes, M B; Tonwar, S C; Abercrombie, D; Allen, B; Apyan, A; Azzolini, V; Barbieri, R; Baty, A; Bi, R; Bierwagen, K; Brandt, S; Busza, W; Cali, I A; D'Alfonso, M; Demiragli, Z; Di Matteo, L; Gomez Ceballos, G; Goncharov, M; Hsu, D; Iiyama, Y; Innocenti, G M; Klute, M; Kovalskyi, D; Krajczar, K; Lai, Y S; Lee, Y-J; Levin, A; Luckey, P D; Maier, B; Marini, A C; Mcginn, C; Mironov, C; Narayanan, S; Niu, X; Paus, C; Roland, C; Roland, G; Salfeld-Nebgen, J; Stephans, G S F; Tatar, K; Varma, M; Velicanu, D; Veverka, J; Wang, J; Wang, T W; Wyslouch, B; Yang, M; Benvenuti, A C; Chatterjee, R M; Evans, A; Hansen, P; Kalafut, S; Kao, S C; Kubota, Y; Lesko, Z; Mans, J; Nourbakhsh, S; Ruckstuhl, N; Rusack, R; Tambe, N; Turkewitz, J; Acosta, J G; Oliveros, S; Avdeeva, E; Bloom, K; Claes, D R; Fangmeier, C; Gonzalez Suarez, R; Kamalieddin, R; Kravchenko, I; Malta Rodrigues, A; Monroy, J; Siado, J E; Snow, G R; Stieger, B; Alyari, M; Dolen, J; Godshalk, A; Harrington, C; Iashvili, I; Kaisen, J; Nguyen, D; Parker, A; Rappoccio, S; Roozbahani, B; Alverson, G; Barberis, E; Hortiangtham, A; Massironi, A; Morse, D M; Nash, D; Orimoto, T; Teixeira De Lima, R; Trocino, D; Wang, R-J; Wood, D; Bhattacharya, S; Charaf, O; Hahn, K A; Kumar, A; Mucia, N; Odell, N; Pollack, B; Schmitt, M H; Sung, K; Trovato, M; Velasco, M; Dev, N; Hildreth, M; Hurtado Anampa, K; Jessop, C; Karmgard, D J; Kellams, N; Lannon, K; Marinelli, N; Meng, F; Mueller, C; Musienko, Y; Planer, M; Reinsvold, A; Ruchti, R; Rupprecht, N; Smith, G; Taroni, S; Wayne, M; Wolf, M; Woodard, A; Alimena, J; Antonelli, L; Bylsma, B; Durkin, L S; Flowers, S; Francis, B; Hart, A; Hill, C; Hughes, R; Ji, W; Liu, B; Luo, W; Puigh, D; Winer, B L; Wulsin, H W; Cooperstein, S; Driga, O; Elmer, P; Hardenbrook, J; Hebda, P; Lange, D; Luo, J; Marlow, D; Medvedeva, T; Mei, K; Ojalvo, I; Olsen, J; Palmer, C; Piroué, P; Stickland, D; Svyatkovskiy, A; Tully, 
C; Malik, S; Barker, A; Barnes, V E; Folgueras, S; Gutay, L; Jha, M K; Jones, M; Jung, A W; Khatiwada, A; Miller, D H; Neumeister, N; Schulte, J F; Shi, X; Sun, J; Wang, F; Xie, W; Parashar, N; Stupak, J; Adair, A; Akgun, B; Chen, Z; Ecklund, K M; Geurts, F J M; Guilbaud, M; Li, W; Michlin, B; Northup, M; Padley, B P; Roberts, J; Rorie, J; Tu, Z; Zabel, J; Betchart, B; Bodek, A; de Barbaro, P; Demina, R; Duh, Y T; Ferbel, T; Galanti, M; Garcia-Bellido, A; Han, J; Hindrichs, O; Khukhunaishvili, A; Lo, K H; Tan, P; Verzetti, M; Agapitos, A; Chou, J P; Gershtein, Y; Gómez Espinosa, T A; Halkiadakis, E; Heindl, M; Hughes, E; Kaplan, S; Kunnawalkam Elayavalli, R; Kyriacou, S; Lath, A; Nash, K; Osherson, M; Saka, H; Salur, S; Schnetzer, S; Sheffield, D; Somalwar, S; Stone, R; Thomas, S; Thomassen, P; Walker, M; Delannoy, A G; Foerster, M; Heideman, J; Riley, G; Rose, K; Spanier, S; Thapa, K; Bouhali, O; Celik, A; Dalchenko, M; De Mattia, M; Delgado, A; Dildick, S; Eusebi, R; Gilmore, J; Huang, T; Juska, E; Kamon, T; Mueller, R; Pakhotin, Y; Patel, R; Perloff, A; Perniè, L; Rathjens, D; Safonov, A; Tatarinov, A; Ulmer, K A; Akchurin, N; Cowden, C; Damgov, J; De Guio, F; Dragoiu, C; Dudero, P R; Faulkner, J; Gurpinar, E; Kunori, S; Lamichhane, K; Lee, S W; Libeiro, T; Peltola, T; Undleeb, S; Volobouev, I; Wang, Z; Greene, S; Gurrola, A; Janjam, R; Johns, W; Maguire, C; Melo, A; Ni, H; Sheldon, P; Tuo, S; Velkovska, J; Xu, Q; Arenton, M W; Barria, P; Cox, B; Goodell, J; Hirosky, R; Ledovskoy, A; Li, H; Neu, C; Sinthuprasith, T; Sun, X; Wang, Y; Wolfe, E; Xia, F; Clarke, C; Harr, R; Karchin, P E; Sturdy, J; Belknap, D A; Buchanan, J; Caillol, C; Dasu, S; Dodd, L; Duric, S; Gomber, B; Grothe, M; Herndon, M; Hervé, A; Klabbers, P; Lanaro, A; Levine, A; Long, K; Loveless, R; Perry, T; Pierro, G A; Polese, G; Ruggles, T; Savin, A; Smith, N; Smith, W H; Taylor, D; Woods, N

    2017-01-01

    This paper reports the measurement of [Formula: see text] meson production in proton-proton ([Formula: see text]) and proton-lead ([Formula: see text]) collisions at a center-of-mass energy per nucleon pair of [Formula: see text] by the CMS experiment at the LHC. The data samples used in the analysis correspond to integrated luminosities of 28[Formula: see text] and 35[Formula: see text] for [Formula: see text] and [Formula: see text] collisions, respectively. Prompt and nonprompt [Formula: see text] mesons, the latter produced in the decay of [Formula: see text] hadrons, are measured in their dimuon decay channels. Differential cross sections are measured in the transverse momentum range of [Formula: see text], and center-of-mass rapidity ranges of [Formula: see text] ([Formula: see text]) and [Formula: see text] ([Formula: see text]). The nuclear modification factor, [Formula: see text], is measured as a function of both [Formula: see text] and [Formula: see text]. Small modifications to the [Formula: see text] cross sections are observed in [Formula: see text] relative to [Formula: see text] collisions. The ratio of [Formula: see text] production cross sections in [Formula: see text]-going and Pb-going directions, [Formula: see text], studied as functions of [Formula: see text] and [Formula: see text], shows a significant decrease for increasing transverse energy deposited at large pseudorapidities. These results, which cover a wide kinematic range, provide new insight on the role of cold nuclear matter effects on prompt and nonprompt [Formula: see text] production.

  10. Wide diameter immediate post-extractive implants vs delayed placement of normal-diameter implants in preserved sockets in the molar region: 1-year post-loading outcome of a randomised controlled trial.

    Science.gov (United States)

    Checchi, Vittorio; Felice, Pietro; Zucchelli, Giovanni; Barausse, Carlo; Piattelli, Maurizio; Pistilli, Roberto; Grandi, Giovanni; Esposito, Marco

    2017-01-01

    To compare the effectiveness of 6.0 to 8.0 mm-wide diameter implants, placed immediately after tooth extraction, with conventional 4.0 or 5.0 mm diameter implants placed in a preserved socket after a 4-month period of healing in the molar region. Just after extraction of one or two molar teeth, and with no vertical loss of the buccal bone in relation to the palatal wall, 100 patients requiring immediate post-extractive implants were randomly allocated to immediate placement of one or two 6.0 to 8.0 mm-wide diameter implants (immediate group; 50 patients) or for socket preservation using a porcine bone substitute covered by a resorbable collagen barrier (delayed group; 50 patients), according to a parallel group design in one centre. Bone-to-implant gaps were filled with autogenous bone retrieved with a trephine drill used to prepare the implant sites for the immediate wide diameter post-extractive implants. Four months after socket preservation, one to two 4.0 or 5.0 mm-wide delayed implants were placed. Implants were loaded 4 months after placement with fixed provisional restorations in acrylic, and replaced after 4 months by fixed, definitive, metal-ceramic restorations. Patients were followed to 1 year after loading. Outcome measures were: implant failures, complications, aesthetics assessed using the pink esthetic score (PES), peri-implant marginal bone level changes, patient satisfaction, number of appointments and surgical interventions recorded, when possible, by blinded assessors. Three patients dropped out 1 year after loading from the immediate group vs six from the delayed group. Five implants out of 47 failed in the immediate group (10.6%) vs two out of 44 (4.6%) in the delayed group, the difference being not statistically significant (difference in proportion = 6.0%, 95% CI: -8.8% to 20.8%, P = 0.436). In the immediate group 10 patients were affected by 10 complications, while in the delayed group four patients were affected by four complications. The

  11. New extraction technique for alkaloids

    Directory of Open Access Journals (Sweden)

    Djilani Abdelouaheb

    2006-01-01

    Full Text Available A method of extraction of natural products has been developed. Compared with existing methods, the new technique is rapid, more efficient and consumes less solvent. Extraction of alkaloids from natural products such as Hyoscyamus muticus, Datura stramonium and Ruta graveolens consists of the use of a sonicated solution containing a surfactant as extracting agent. The alkaloids are precipitated by Mayer reagent, dissolved in an alkaline solution, and then extracted with chloroform. This article compares the results obtained with other methods showing clearly the advantages of the new method.

  12. Handling uncertainty in relation extraction: a case study on tennis tournament results extraction from tweets

    NARCIS (Netherlands)

    Verburg, Jochem; Habib, Mena Badieh; van Keulen, Maurice

    2015-01-01

    Relation extraction involves different types of uncertainty due to the imperfection of the extraction tools and the inherent ambiguity of unstructured text. In this paper, we discuss several ways of handling uncertainties in relation extraction from social media. Our study case is to extract tennis

  13. AN Information Text Classification Algorithm Based on DBN

    Directory of Open Access Journals (Sweden)

    LU Shu-bao

    2017-04-01

    Full Text Available To address the low categorization accuracy and uneven distribution of traditional text classification algorithms, a text classification algorithm based on deep learning is put forward. Deep belief networks have a very strong feature-learning ability and can extract features from the high-dimensional original feature space, so that the extracted features not only characterize the text but can also be used to train the classification model. The TF-IDF formula is used to compute text feature values, and deep belief networks are used to construct the classifier. The experimental results show that, compared with commonly used classification algorithms such as the support vector machine, neural network and extreme learning machine, the algorithm has higher accuracy and practicability, and it opens up new ideas for research on text classification.
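
    As a rough illustration of the TF-IDF weighting mentioned in the abstract above, the sketch below computes term weights for a toy corpus. The abstract does not state which TF-IDF variant the authors used; the plain tf × log(N/df) form, the function name `tf_idf` and the sample documents are assumptions made here for illustration only.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant (not taken from the paper):
    tf = count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus, already tokenized.
docs = [
    ["text", "mining", "text"],
    ["deep", "belief", "network"],
    ["text", "classification"],
]
w = tf_idf(docs)
```

    In a pipeline like the one described, vectors of such weights would be the input features for the classifier; a term that occurs in every document receives weight zero under this variant.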

  14. Event-based text mining for biology and functional genomics

    Science.gov (United States)

    Thompson, Paul; Nawaz, Raheel; McNaught, John; Kell, Douglas B.

    2015-01-01

    The assessment of genome function requires a mapping between genome-derived entities and biochemical reactions, and the biomedical literature represents a rich source of information about reactions between biological components. However, the increasingly rapid growth in the volume of literature provides both a challenge and an opportunity for researchers to isolate information about reactions of interest in a timely and efficient manner. In response, recent text mining research in the biology domain has been largely focused on the identification and extraction of ‘events’, i.e. categorised, structured representations of relationships between biochemical entities, from the literature. Functional genomics analyses necessarily encompass events as so defined. Automatic event extraction systems facilitate the development of sophisticated semantic search applications, allowing researchers to formulate structured queries over extracted events, so as to specify the exact types of reactions to be retrieved. This article provides an overview of recent research into event extraction. We cover annotated corpora on which systems are trained, systems that achieve state-of-the-art performance and details of the community shared tasks that have been instrumental in increasing the quality, coverage and scalability of recent systems. Finally, several concrete applications of event extraction are covered, together with emerging directions of research. PMID:24907365

  15. Isoflavones hydrolisis and extraction

    Directory of Open Access Journals (Sweden)

    Jozilene Fernandes Farias dos Santos

    2012-12-01

    Full Text Available Isoflavones are found in leguminous species and are phytoestrogens widely used by industry for their beneficial effects as estrogen mimics, their antioxidant action and their anti-cancer activity. The identification and quantification of isoflavones in plants is needed because of high industrial demand. Several methods are used for their extraction, using organic solvents (methanol, ethanol and acetonitrile). Samples from five legume species from the Instituto de Zootecnia (IZ) Forage Gene Bank were tested. All seeds received a hydrothermic treatment, immersed in pure water at 50°C for 12 hours. Seeds were then oven-dried. In this work we tested extraction using only the hydrothermic treatment and the hydrothermic treatment combined with a methanol extraction protocol. Seeds were ground; half of the samples were resuspended in PBS (phosphate buffer) and the other half were submitted to 4 mL of methanol and 1% acetic acid, soaked for 5 hours and shaken every 15 minutes at room temperature. The five legume species in which we quantified isoflavones by enzyme immunoassay (EIA) were: Calopogonium mucunoides, Bauhinia sp., Cajanus cajan, Galactia martii, Leucaena leucocephala. The extraction procedure is a recommendation of the AOAC (Association of Official Analytical Chemists) for isoflavone quantification. Our results show an increase in extraction using solvent extraction with 80% methanol plus 1% acetic acid in comparison to the hydrothermic procedure alone (figure 1).

  16. An Approach to Retrieval of OCR Degraded Text

    Directory of Open Access Journals (Sweden)

    Yuen-Hsien Tseng

    1998-12-01

    Full Text Available The major problem with retrieval of OCR text is the unpredictable distortion of characters due to recognition errors. Because users have no idea of such distortion, the terms they query can hardly match the terms stored in the OCR text exactly. Thus retrieval effectiveness is significantly reduced, especially for low-quality input. To reduce the losses from retrieving such noisy OCR text, a fault-tolerant retrieval strategy based on automatic keyword extraction and fuzzy matching is proposed. In this strategy, terms, correct or not, and their term frequencies are extracted from the noisy text and presented for browsing and selection in response to users' initial queries. With an understanding of the real terms stored in the noisy text and of their estimated frequency distributions, users may then choose appropriate terms for more effective searching. A text retrieval system based on this strategy has been built. Examples demonstrating its effectiveness are shown. Finally, some OCR issues for further enhancing retrieval effectiveness are discussed.
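
    The idea of presenting stored terms, correct or not, that resemble a user's query can be sketched as follows. This is only an illustration: the abstract does not describe the matching method, so the use of Python's standard `difflib.SequenceMatcher`, the function name `fuzzy_candidates`, the 0.75 threshold and the sample term table are all assumptions.

```python
from difflib import SequenceMatcher

def fuzzy_candidates(query, term_freqs, threshold=0.75):
    """Rank stored terms by similarity to the query.

    term_freqs maps each term extracted from the OCR text (correct or
    not) to its frequency; threshold is an assumed similarity cut-off.
    """
    scored = []
    for term, freq in term_freqs.items():
        ratio = SequenceMatcher(None, query.lower(), term.lower()).ratio()
        if ratio >= threshold:
            scored.append((term, freq, round(ratio, 2)))
    # Most similar first; ties broken by higher frequency.
    return sorted(scored, key=lambda item: (-item[2], -item[1]))

# Hypothetical terms and frequencies extracted from noisy OCR output.
ocr_terms = {"retrieval": 12, "retrieva1": 3, "retneval": 2, "keyword": 7}
candidates = fuzzy_candidates("retrieval", ocr_terms)
```

    Returning the frequency alongside each candidate mirrors the browsing step in the abstract, where users see the distorted variants and choose which ones to add to the search.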

  17. Term extraction from sparse, ungrammatical domain-specific documents

    NARCIS (Netherlands)

    Ittoo, Ashwin; Bouma, Gosse

    2013-01-01

    Existing term extraction systems have predominantly targeted large and well-written document collections, which provide reliable statistical and linguistic evidence to support term extraction. In this article, we address the term extraction challenges posed by sparse, ungrammatical texts with

  18. A comprehensive and quantitative comparison of text-mining in 15 million full-text articles versus their corresponding abstracts.

    Science.gov (United States)

    Westergaard, David; Stærfeldt, Hans-Henrik; Tønsberg, Christian; Jensen, Lars Juhl; Brunak, Søren

    2018-02-01

    Across academia and industry, text mining has become a popular strategy for keeping up with the rapid growth of the scientific literature. Text mining of the scientific literature has mostly been carried out on collections of abstracts, due to their availability. Here we present an analysis of 15 million English scientific full-text articles published during the period 1823-2016. We describe the development in article length and publication sub-topics during these nearly 250 years. We showcase the potential of text mining by extracting published protein-protein, disease-gene, and protein subcellular associations using a named entity recognition system, and quantitatively report on their accuracy using gold standard benchmark data sets. We subsequently compare the findings to corresponding results obtained on 16.5 million abstracts included in MEDLINE and show that text mining of full-text articles consistently outperforms using abstracts only.

  19. ACCELERATED SOLVENT EXTRACTION COMBINED WITH ...

    Science.gov (United States)

    A research project was initiated to address a recurring problem of elevated detection limits above required risk-based concentrations for the determination of semivolatile organic compounds in high moisture content solid samples. This project was initiated, in cooperation with the EPA Region 1 Laboratory, under the Regional Methods Program administered through the ORD Office of Science Policy. The aim of the project was to develop an approach for the rapid removal of water in high moisture content solids (e.g., wetland sediments) in preparation for analysis via Method 8270. Alternative methods for water removal have been investigated to enhance compound solid concentrations and improve extraction efficiency, with the use of pressure filtration providing a high-throughput alternative for removal of the majority of free water in sediments and sludges. In order to eliminate problems with phase separation during extraction of solids using Accelerated Solvent Extraction, a variation of a water-isopropanol extraction method developed at the USGS National Water Quality Laboratory in Denver, CO is being employed. The concentrations of target compounds in water-isopropanol extraction fluids are subsequently analyzed using an automated Solid Phase Extraction (SPE)-GC/MS method developed in our laboratory. The coupled approaches for dewatering, extraction, and target compound identification-quantitation provide a useful alternative to enhance sample throughput for Me

  20. Inferring Group Processes from Computer-Mediated Affective Text Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Schryver, Jack C [ORNL; Begoli, Edmon [ORNL; Jose, Ajith [Missouri University of Science and Technology; Griffin, Christopher [Pennsylvania State University

    2011-02-01

    Political communications in the form of unstructured text convey rich connotative meaning that can reveal underlying group social processes. Previous research has focused on sentiment analysis at the document level, but we extend this analysis to sub-document levels through a detailed analysis of affective relationships between entities extracted from a document. Instead of pure sentiment analysis, which is just positive or negative, we explore nuances of affective meaning in 22 affect categories. Our affect propagation algorithm automatically calculates and displays extracted affective relationships among entities in graphical form in our prototype (TEAMSTER), starting with seed lists of affect terms. Several useful metrics are defined to infer underlying group processes by aggregating affective relationships discovered in a text. Our approach has been validated with annotated documents from the MPQA corpus, achieving a performance gain of 74% over comparable random guessers.

  1. Extraction Techniques for Polycyclic Aromatic Hydrocarbons in Soils

    Directory of Open Access Journals (Sweden)

    E. V. Lau

    2010-01-01

    Full Text Available This paper aims to provide a review of the analytical extraction techniques for polycyclic aromatic hydrocarbons (PAHs) in soils. The extraction technologies described here include Soxhlet extraction, ultrasonic and mechanical agitation, accelerated solvent extraction, supercritical and subcritical fluid extraction, microwave-assisted extraction, solid phase extraction and microextraction, thermal desorption and flash pyrolysis, as well as fluidised-bed extraction. The influencing factors in the extraction of PAHs from soil such as temperature, type of solvent, soil moisture, and other soil characteristics are also discussed. The paper concludes with a review of the models used to describe the kinetics of PAH desorption from soils during solvent extraction.

  2. Text Analytics: the convergence of Big Data and Artificial Intelligence

    Directory of Open Access Journals (Sweden)

    Antonio Moreno

    2016-03-01

    Full Text Available The analysis of the text content in emails, blogs, tweets, forums and other forms of textual communication constitutes what we call text analytics. Text analytics is applicable to most industries: it can help analyze millions of emails; you can analyze customers’ comments and questions in forums; you can perform sentiment analysis using text analytics by measuring positive or negative perceptions of a company, brand, or product. Text Analytics has also been called text mining, and is a subcategory of the Natural Language Processing (NLP) field, which has been one of the founding branches of Artificial Intelligence since the 1950s, when an interest in understanding text originally developed. Currently Text Analytics is often considered as the next step in Big Data analysis. Text Analytics has a number of subdivisions: Information Extraction, Named Entity Recognition, Semantic Web annotated domain’s representation, and many more. Several techniques are currently used and some of them have gained a lot of attention, such as Machine Learning, to show a semisupervised enhancement of systems, but they also present a number of limitations which make them not always the only or the best choice. We conclude with current and near future applications of Text Analytics.

  3. Using texts in science education: cognitive processes and knowledge representation.

    Science.gov (United States)

    van den Broek, Paul

    2010-04-23

    Texts form a powerful tool in teaching concepts and principles in science. How do readers extract information from a text, and what are the limitations in this process? Central to comprehension of and learning from a text is the construction of a coherent mental representation that integrates the textual information and relevant background knowledge. This representation engenders learning if it expands the reader's existing knowledge base or if it corrects misconceptions in this knowledge base. The Landscape Model captures the reading process and the influences of reader characteristics (such as working-memory capacity, reading goal, prior knowledge, and inferential skills) and text characteristics (such as content/structure of presented information, processing demands, and textual cues). The model suggests factors that can optimize--or jeopardize--learning science from text.

  4. An Invisible Text Watermarking Algorithm using Image Watermark

    Science.gov (United States)

    Jalil, Zunera; Mirza, Anwar M.

    Copyright protection of digital content is essential in today's digital world, with efficient communication media such as the internet. Text is the dominant part of internet content, and very few techniques are available for text protection. This paper presents a novel algorithm for protection of plain text, which embeds the logo image of the copyright owner in the text; this logo can be extracted from the text later to prove ownership. The algorithm is robust against content-preserving modifications and, at the same time, is capable of detecting malicious tampering. Experimental results demonstrate the effectiveness of the algorithm against tampering attacks by calculating normalized Hamming distances. The results are also compared with a recent work in this domain.
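
    The normalized Hamming distance used as the evaluation measure above can be illustrated with a short sketch. The bit sequences and the function name `normalized_hamming` are hypothetical; the abstract does not say how the logo image is serialized, so a flat list of bits is assumed here.

```python
def normalized_hamming(original_bits, extracted_bits):
    """Fraction of positions at which two equal-length bit sequences differ."""
    if len(original_bits) != len(extracted_bits):
        raise ValueError("bit sequences must have the same length")
    if not original_bits:
        return 0.0
    differing = sum(a != b for a, b in zip(original_bits, extracted_bits))
    return differing / len(original_bits)

# Hypothetical watermark bits before embedding and after extraction
# from a tampered text; two of the eight positions differ.
logo = [1, 0, 1, 1, 0, 0, 1, 0]
recovered = [1, 0, 0, 1, 0, 1, 1, 0]
distance = normalized_hamming(logo, recovered)
```

    A distance of 0 means the extracted watermark matches the embedded one exactly, while larger values indicate more bits altered between embedding and extraction.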

  5. Data mining of text as a tool in authorship attribution

    Science.gov (United States)

    Visa, Ari J. E.; Toivonen, Jarmo; Autio, Sami; Maekinen, Jarno; Back, Barbro; Vanharanta, Hannu

    2001-03-01

    It is common for text documents to be characterized and classified by keywords that the authors assign to them. Visa et al. have developed a new methodology based on prototype matching. The prototype is an interesting document or a part of an extracted, interesting text. This prototype is matched with the document database of the monitored document flow. The new methodology is capable of extracting the meaning of the document to a certain degree. Our claim is that the new methodology is also capable of authenticating the authorship. To verify this claim two tests were designed. The test hypothesis was that the words and the word order in the sentences could authenticate the author. In the first test three authors were selected. The selected authors were William Shakespeare, Edgar Allan Poe, and George Bernard Shaw. Three texts from each author were examined. Every text was used, one by one, as a prototype. The two nearest matches with the prototype were noted. The second test uses the Reuters-21578 financial news database. A group of 25 short financial news reports from five different authors is examined. Our new methodology and the interesting results from the two tests are reported in this paper. In the first test, for Shakespeare and for Poe all cases were successful. For Shaw one text was confused with Poe. In the second test the Reuters-21578 financial news reports were identified by author relatively well. The conclusion is that our text mining methodology seems to be capable of authorship attribution.

  6. SIAM 2007 Text Mining Competition dataset

    Data.gov (United States)

    National Aeronautics and Space Administration — Subject Area: Text Mining Description: This is the dataset used for the SIAM 2007 Text Mining competition. This competition focused on developing text mining...

  7. Text Categorization with Latent Dirichlet Allocation

    Directory of Open Access Journals (Sweden)

    ZLACKÝ Daniel

    2014-05-01

    Full Text Available This paper focuses on the text categorization of Slovak text corpora using latent Dirichlet allocation. Our goal is to build text subcorpora that contain similar text documents. We want to use these better organized text subcorpora to build more robust language models that can be used in the area of speech recognition systems. Our previous research in the area of text categorization showed that we can achieve better results with categorized text corpora. In this paper we used latent Dirichlet allocation for text categorization. We divided the initial text corpus into 2, 5, 10, 20 or 100 subcorpora with various iterations and save steps. Language models were built on these subcorpora and adapted with linear interpolation to the judicial domain. The experimental results showed that text categorization using latent Dirichlet allocation can improve the system for automatic speech recognition by creating the language models from organized text corpora.

  8. Effect of two different germplasm of Mucuna pruriens seed extracts against some fish pathogens

    Directory of Open Access Journals (Sweden)

    M. Marimuthu M

    2015-07-01

    Full Text Available Two different germplasms of Mucuna seeds, collected from agro-geographical regions, were evaluated for their antibacterial activities. The antibacterial activity of the seed extracts was studied against the fish pathogens Aeromonas hydrophila, Pseudomonas fluorescens, Vibrio cholerae and Klebsiella pneumoniae using the agar well diffusion method. Methanol and ethanol extracts showed more potent antibacterial activity than the other solvent extracts. The results were expressed as mean ± SD. The results obtained in the study show that velvet bean black seed extract has more antibacterial activity against fish pathogens. The antibacterial activities of all the Mucuna seed extracts are comparable, and they have potential as an alternative in the treatment of infections caused by these microorganisms in fish. Susceptibility testing is conducted on isolates using drugs selected on the basis of their importance to human medicine and their use in fish production.

  9. [Systematic Readability Analysis of Medical Texts on Websites of German University Clinics for General and Abdominal Surgery].

    Science.gov (United States)

    Esfahani, B Janghorban; Faron, A; Roth, K S; Grimminger, P P; Luers, J C

    2016-12-01

    Background: Besides the function as one of the main contact points, websites of hospitals serve as medical information portals. As medical information texts should be understood by any patients independent of the literacy skills and educational level, online texts should have an appropriate structure to ease understandability. Materials and Methods: Patient information texts on websites of clinics for general surgery at German university hospitals (n = 36) were systematically analysed. For 9 different surgical topics representative medical information texts were extracted from each website. Using common readability tools and 5 different readability indices the texts were analysed concerning their readability and structure. The analysis was furthermore stratified in relation to geographical regions in Germany. Results: For the definite analysis the texts of 196 internet websites could be used. On average the texts consisted of 25 sentences and 368 words. The reading analysis tools congruously showed that all texts showed a rather low readability demanding a high literacy level from the readers. Conclusion: Patient information texts on German university hospital websites are difficult to understand for most patients. To fulfill the ambition of informing the general population in an adequate way about medical issues, a revision of most medical texts on websites of German surgical hospitals is recommended. Georg Thieme Verlag KG Stuttgart · New York.

  10. ANTHOCYANINS ALIPHATIC ALCOHOLS EXTRACTION FEATURES

    Directory of Open Access Journals (Sweden)

    P. N. Savvin

    2015-01-01

    Full Text Available Anthocyanins are red pigments that give color to a wide range of fruits, berries and flowers. In the food industry they are widely known as the colorant food additive E163. Ethanol or acidified water is traditionally used to extract them from natural plant raw materials, but in some technologies this is unacceptable. In order to expand the use of anthocyanins as colorants and antioxidants, the extraction of the pigments was explored with alcohols differing in carbon skeleton structure and in the position and number of hydroxyl groups. To isolate the anthocyanins, the raw materials were extracted twice in succession at t = 60 °C for 1.5 hours. The extracts were evaluated using classical spectrophotometric methods and a modern rapid chromaticity method. The color of black currant extracts depends on the length of the carbon skeleton and the position of the hydroxyl group: extracts obtained with normal-chain alcohols have a higher optical density and red color component index than those obtained with isomeric alcohols. This is due to the different ability to form hydrogen bonds when isolating anthocyanins, and to other intermolecular interactions. During storage of black currant extracts, the extracted pigments undergo significant structural changes, which leads to a significant change in color. This change is stronger the longer the carbon skeleton and the more branched the extractant molecule. Extraction with polyols (ethylene glycol, glycerol) is less effective than with the corresponding monohydric alcohols. However, these extracts keep significantly better because of their reducing ability when interacting with polyphenolic compounds.

  11. Handwriting segmentation of unconstrained Oriya text

    Indian Academy of Sciences (India)

    Segmentation of handwritten text into lines, words and characters is one of the important steps in the handwritten text recognition process. In this paper we propose a water reservoir concept-based scheme for segmentation of unconstrained Oriya handwritten text into individual characters. Here, at first, the text image is ...

  12. SMIL 3.0 smilText

    OpenAIRE

    Bulterman, Dick; Mullender, Sjoerd; Cruz-Lara, S.

    2008-01-01

    The functionality described in these modules provides a new media type for use in SMIL presentations. This functionality is called smilText. Unlike other media types defined in the media object module, all of which are synonyms of the ref element, the smilText modules provide a text container element with an explicit content model for defining timed text. The smilText modules also define a set of additional elements and attributes to control timed text rendering. All smilText conte...

  13. Understanding a reader's attraction to a literary short text

    Directory of Open Access Journals (Sweden)

    Darío Luis Banegas

    2014-09-01

    Full Text Available The aim of this article is to understand why a reader may feel attracted to a short stretch of fictional discourse. I analyse a short extract taken from Khaled Hosseini’s novel A Thousand Splendid Suns through the integration of different perspectives in discourse analysis. First, I analyse the text in terms of contexts of culture and situation including field, tenor, mode, participants’ social world, setting, channel, and key. In the second section I attempt to examine the text line by line following my interdisciplinary framework of reference. Secondly, I offer a line-by-line analysis through Grice’s maxims, topicality, deixis, coding time, types of utterances and verbal processes, and metaphors. Through my analysis I discovered that my reader’s attraction was based on the combination and integration of different textual devices and my personal interpretation of the pragmatics behind the text.

  14. Arabic Text Categorization Using Improved k-Nearest neighbour Algorithm

    Directory of Open Access Journals (Sweden)

    Wail Hamood KHALED

    2014-10-01

    Full Text Available The quantity of text information published in the Arabic language on the net requires the implementation of effective techniques for extracting and classifying relevant information contained in large text corpora. In this paper we present an implementation of an enhanced k-NN Arabic text classifier. We apply the traditional k-NN and Naive Bayes from the Weka Toolkit for comparison purposes. Our proposed modified k-NN algorithm features an improved decision rule that skips the classes that are less similar and identifies the right class from the k nearest neighbours, which increases the accuracy. The study evaluates the improved decision rule technique using the standard measures of recall, precision and f-measure as the basis of comparison. We conclude that the effectiveness of the proposed classifier is promising and that it outperforms the classical k-NN classifier.
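
    For orientation, the classical k-NN baseline that the paper compares against can be sketched as below. This is not the authors' improved decision rule, which the abstract does not specify; it is a plain majority vote over the k most similar training documents, assuming whitespace tokenization and cosine similarity over term counts. The function names and the tiny English training set are hypothetical placeholders.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(text, training_set, k=3):
    """Label `text` by majority vote among its k most similar training documents."""
    query = Counter(text.lower().split())
    scored = sorted(
        ((cosine_similarity(query, Counter(doc.lower().split())), label)
         for doc, label in training_set),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training data (document, class label).
training_set = [
    ("the team won the football match", "sports"),
    ("the player scored a goal in the match", "sports"),
    ("the bank raised interest rates", "economy"),
    ("the market and the bank reported growth", "economy"),
]
predicted = knn_classify("the player won the match", training_set, k=3)
```

    A modified decision rule of the kind the abstract describes would replace the plain majority vote in the last two lines of `knn_classify`.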

  15. Assessing semantic similarity of texts - Methods and algorithms

    Science.gov (United States)

    Rozeva, Anna; Zerkova, Silvia

    2017-12-01

    Assessing the semantic similarity of texts is an important part of different text-related applications such as educational systems, information retrieval, text summarization, etc. This task is performed by sophisticated analysis that implements text-mining techniques. Text mining involves several pre-processing steps, which yield a structured, representative model of the documents in a corpus by extracting and selecting the features that characterize their content. Generally the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at the syntactic and semantic levels. An important text-mining method and similarity measure is latent semantic analysis (LSA). It reduces the dimensionality of the document vector space and better captures the text semantics. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurrence is examined. The algorithm for obtaining the vector representation of words and their corresponding latent concepts in a reduced multidimensional space, as well as the similarity calculation, are presented.

  16. Automatic extraction of planetary image features

    Science.gov (United States)

    LeMoigne-Stewart, Jacqueline J. (Inventor); Troglio, Giulia (Inventor); Benediktsson, Jon A. (Inventor); Serpico, Sebastiano B. (Inventor); Moser, Gabriele (Inventor)

    2013-01-01

    A method for the extraction of Lunar data and/or planetary features is provided. The feature extraction method can include one or more image processing techniques, including, but not limited to, a watershed segmentation and/or the generalized Hough Transform. According to some embodiments, the feature extraction method can include extracting features, such as, small rocks. According to some embodiments, small rocks can be extracted by applying a watershed segmentation algorithm to the Canny gradient. According to some embodiments, applying a watershed segmentation algorithm to the Canny gradient can allow regions that appear as close contours in the gradient to be segmented.

  17. Antioxidant, Antibacterial, and Cytotoxic Activities of the Ethanolic Origanum vulgare Extract and Its Major Constituents

    Directory of Open Access Journals (Sweden)

    John Coccimiglio

    2016-01-01

    Full Text Available Oregano is a perennial shrub that grows in the mountains of the Mediterranean and Euro/Irano-Siberian regions. This study was conducted to identify the major constituents of the ethanolic Origanum vulgare extract and examine the cytotoxic, antioxidant, and antibacterial properties of the extract but more importantly the contribution of its specific major constituent(s) or their combination to the overall extract biological activity. Gas chromatography/mass spectroscopy analysis showed that the extract contained monoterpene hydrocarbons and phenolic compounds, the major ones being carvacrol and thymol and to a lesser extent p-cymene, 1-octacosanol, creosol, and phytol. A549 epithelial cells challenged with the extract showed a concentration-dependent increase in cytotoxicity. A combination of thymol and carvacrol at equimolar concentrations to those present in the extract was less cytotoxic. The A549 cells pretreated with nonlethal extract concentrations protected against hydrogen-peroxide-induced cytotoxicity, an antioxidant effect more effective than the combination of equimolar concentrations of thymol/carvacrol. Inclusion of p-cymene and/or 1-octacosanol did not alter the synergistic antioxidant effects of the carvacrol/thymol mixture. The extract also exhibited antimicrobial properties against Gram-positive and Gram-negative bacterial strains including clinical isolates. In conclusion, the oregano extract has cytotoxic, antioxidant, and antibacterial activities mostly attributed to carvacrol and thymol.

  18. Caracterização da saúde de trabalhadores florestais envolvidos na extração de madeira em regiões montanhosas Characterization of the health of workers involved in the extraction of wood in mountainous regions

    Directory of Open Access Journals (Sweden)

    Emilia Pio da Silva

    2009-12-01

    Full Text Available This study aimed to characterize the health of forest workers involved in wood extraction in mountainous regions. The study was conducted in a forestry company located in the Forestry District of Vale do Rio Doce, and 100% of the workers were studied. To characterize their health, a structured questionnaire in interview form was used, based on the National Household Sample Survey (Pesquisa Nacional por Amostra de Domicílio, PNAD). The results showed that forest extraction activities have had negative impacts on the workers' health: 66% of the workers reported feeling pain in some part of the body, 79% reported having some dental problem, 86% said they were exposed to factors that harmed their health, 20% had some sleep disturbance, 9% did not have access to basic sanitation, and 29% had already suffered work accidents. The study concludes that forest workers in wood extraction are exposed to living and working conditions that do not contribute to promoting and maintaining their health.

  19. Ten Guidelines for Translating Legal Texts

    Directory of Open Access Journals (Sweden)

    Alenka Kocbek

    2017-12-01

    Full Text Available The paper proposes a targeted model for translating legal texts, developed by the author by combining translation science (i.e. functionalist approaches) with the findings of comparative law and legal linguistics. It consists of ten guidelines directing the translator from defining the intended function of the target text and selecting the corresponding translation type, through comparing the legal systems involved in the translation and analysing the memetic structure of the source text and parallel texts in the target culture, to designing the target text as a cultureme and ensuring its legal security.

  20. Text To Speech System for Telugu Language

    OpenAIRE

    M. Siva Kumar; E. Prakash Babu

    2014-01-01

    Telugu is one of the oldest languages in India. This paper describes the development of a Telugu Text-to-Speech System (TTS). In Telugu TTS the input is Telugu text in Unicode. The voices are sampled from real recorded speech. The objective of a text to speech system is to convert an arbitrary text into its corresponding spoken waveform. Speech synthesis is a process of building machinery that can generate human-like speech from any text input to imitate human speakers. Text proc...

  1. Building a glaucoma interaction network using a text mining approach.

    Science.gov (United States)

    Soliman, Maha; Nasraoui, Olfa; Cooper, Nigel G F

    2016-01-01

    The volume of biomedical literature and its underlying knowledge base is rapidly expanding, making it beyond the ability of a single human being to read through all the literature. Several automated methods have been developed to help make sense of this dilemma. The present study reports on the results of a text mining approach to extract gene interactions from the data warehouse of published experimental results which are then used to benchmark an interaction network associated with glaucoma. To the best of our knowledge, there is, as yet, no glaucoma interaction network derived solely from text mining approaches. The presence of such a network could provide a useful summative knowledge base to complement other forms of clinical information related to this disease. A glaucoma corpus was constructed from PubMed Central and a text mining approach was applied to extract genes and their relations from this corpus. The extracted relations between genes were checked using reference interaction databases and classified generally as known or new relations. The extracted genes and relations were then used to construct a glaucoma interaction network. Analysis of the resulting network indicated that it bears the characteristics of a small world interaction network. Our analysis showed the presence of seven glaucoma linked genes that defined the network modularity. A web-based system for browsing and visualizing the extracted glaucoma related interaction networks is made available at http://neurogene.spd.louisville.edu/GlaucomaINViewer/Form1.aspx. This study has reported the first version of a glaucoma interaction network using a text mining approach. The power of such an approach is in its ability to cover a wide range of glaucoma related studies published over many years. Hence, a bigger picture of the disease can be established. To the best of our knowledge, this is the first glaucoma interaction network to summarize the known literature. 
The major findings were a set of

  2. Angle Class I malocclusion treated with lower incisor extraction

    Directory of Open Access Journals (Sweden)

    Vanessa Leal Tavares Barbosa

    2013-06-01

    Full Text Available In planning orthodontic cases that include extractions as an alternative to solve the problem of negative space discrepancy, the critical decision is to determine which teeth will be extracted. Several aspects must be considered, such as periodontal health, orthodontic mechanics, functional and esthetic alterations, and treatment stability. Despite controversies, extraction of teeth to solve dental crowding is a therapy that has been used for decades. Premolar extractions are the most common, but there are situations in which atypical extractions facilitate mechanics, preserve periodontal health and favor maintenance of the facial profile, which tends to unfavorably change due to facial changes with age. The extraction of a lower incisor, in selected cases, is an effective approach, and literature describes greater post-treatment stability when compared with premolar extractions. This article reports the clinical case of a patient with Angle Class I malocclusion and upper and lower anterior crowding, a balanced face and harmonious facial profile. The presence of gingival and bone recession limited large orthodontic movements. The molars and premolars were well occluded, and the discrepancy was mainly concentrated in the anterior region of the lower dental arch. The extraction of a lower incisor in the most ectopic position and with compromised periodontium, associated with interproximal stripping in the upper and lower arches, was the alternative of choice for this treatment, which restored function, providing improved periodontal health, maintained facial esthetics and allowed finishing with a stable and balanced occlusion. This case was presented to the Brazilian Board of Orthodontics and Dentofacial Orthopedics (BBO, as part of the requirements for obtaining the BBO Diplomate title.

  3. Embedded Bernoulli Mixture HMMs for Continuous Handwritten Text Recognition

    Science.gov (United States)

    Giménez, Adrià; Juan, Alfons

    Hidden Markov Models (HMMs) are now widely used in off-line handwritten text recognition. As in speech recognition, they are usually built from shared, embedded HMMs at symbol level, in which state-conditional probability density functions are modelled with Gaussian mixtures. In contrast to speech recognition, however, it is unclear which kind of real-valued features should be used and, indeed, very different feature sets are in use today. In this paper, we propose to bypass feature extraction and directly feed columns of raw, binary image pixels into embedded Bernoulli mixture HMMs, that is, embedded HMMs in which the emission probabilities are modelled with Bernoulli mixtures. The idea is to ensure that no discriminative information is filtered out during feature extraction, which in some sense is integrated into the recognition model. Good empirical results are reported on the well-known IAM database.
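
    To make the emission model concrete, the sketch below evaluates the log-probability of one binary pixel column under a Bernoulli mixture, log p(x) = log sum_k w_k prod_d p_kd^x_d (1 - p_kd)^(1 - x_d). It covers only the standard mixture density of a single HMM state, not the paper's training or decoding; the function name and the sample parameters are hypothetical.

```python
import math


def bernoulli_mixture_log_prob(column, weights, prototypes):
    """Log-probability of a binary pixel column under a Bernoulli mixture.

    column:     list of 0/1 pixel values
    weights:    mixture weights, one per component, summing to 1
    prototypes: per-component lists of P(pixel == 1), strictly between 0 and 1
    """
    log_terms = []
    for weight, prototype in zip(weights, prototypes):
        log_p = math.log(weight)
        for pixel, p in zip(column, prototype):
            log_p += math.log(p if pixel == 1 else 1.0 - p)
        log_terms.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


# Hypothetical state with two mixture components over a 4-pixel column.
column = [1, 0, 1, 1]
weights = [0.6, 0.4]
prototypes = [[0.9, 0.1, 0.8, 0.7], [0.2, 0.5, 0.3, 0.4]]
log_prob = bernoulli_mixture_log_prob(column, weights, prototypes)
```

    Working in the log domain avoids underflow when columns have many pixels, which is why the components are combined with log-sum-exp rather than summed directly.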

  4. Stylistic devices in french advertising texts

    Directory of Open Access Journals (Sweden)

    А С Борисова

    2009-09-01

    Full Text Available This article deals with the problem of stylistic devices of the lexical level, widely applied in modern French advertising texts as means of linguistic manipulation on mass consciousness.

  5. MORPHOLOGICAL STRATEGIES IN TEXT MESSAGING AMONG ...

    African Journals Online (AJOL)

    MORPHOLOGICAL STRATEGIES IN TEXT MESSAGING AMONG NIGERIAN USERS OF GLOBAL SYSTEM FOR MOBILE COMMUNICATION. ... becoming standardized forms being adopted by the Nigerian users of GSM. Keywords: Morphology and Syntax, SMS texts, GSM (Mobile Phone), Semantics, Nigerian English.

  6. Automatic prediction of text aesthetics and interestingness

    OpenAIRE

    Ganguly, Debasis; Leveling, Johannes; Jones, Gareth J.F.

    2014-01-01

    This paper investigates the problem of automated text aesthetics prediction. The availability of user generated content and ratings, e.g. Flickr, has induced research in aesthetics prediction for non-text domains, particularly for photographic images. This problem, however, has not yet been explored for the text domain. Due to the very subjective nature of text aesthetics, it is difficult to compile human annotated data by methods such as crowd sourcing with a fair degree of inter-annotator agre...

  7. Text comprehension dependence on reading experience

    OpenAIRE

    Tilmantaitė, Kamilė

    2016-01-01

    The bachelor thesis "Text comprehension dependence on reading experience" investigates how students' text comprehension depends on reading experience. The theoretical part discusses the concept of reading and reading methods, as well as text comprehension, its models, and reading capacity. The practical part consists of an analysis of a text comprehension test taken by pupils in the eighth and tenth classes, an analysis of a questionnaire about reading experience, and an examination of how the two are interdependent. In th...

  8. Multimodal Diversity of Postmodernist Fiction Text

    Directory of Open Access Journals (Sweden)

    U. I. Tykha

    2016-12-01

    Full Text Available The article is devoted to the analysis of structural and functional manifestations of multimodal diversity in postmodernist fiction texts. Multimodality is defined as the coexistence of more than one semiotic mode within a certain context. Multimodal texts feature a diversity of semiotic modes in the communication and development of their narrative. Such experimental texts subvert conventional patterns by introducing various semiotic resources – verbal or non-verbal.

  9. Der Text im Mittelpunkt des Fremdsprachenunterrichts

    OpenAIRE

    KOZŁOWSKI, Aleksander

    1990-01-01

    The paper presents an analysis of the importance and function of text in foreign language teaching. Two types of texts are distinguished: primary (original) and secondary (contrived) ones. The use of text in teaching is recommended mainly for two reasons: communicative-linguistic and psycholinguistic-pedagogical. In teaching texts can be used to establish a basis for the development of receptive competence (e.g. reading), or, to provide direct impulse for creating verbal statement (e.g. inter...

  10. Extractable Work from Correlations

    Directory of Open Access Journals (Sweden)

    Martí Perarnau-Llobet

    2015-10-01

    Full Text Available Work and quantum correlations are two fundamental resources in thermodynamics and quantum information theory. In this work, we study how to use correlations among quantum systems to optimally store work. We analyze this question for isolated quantum ensembles, where the work can be naturally divided into two contributions: a local contribution from each system and a global contribution originating from correlations among systems. We focus on the latter and consider quantum systems that are locally thermal, thus from which any extractable work can only come from correlations. We compute the maximum extractable work for general entangled states, separable states, and states with fixed entropy. Our results show that while entanglement gives an advantage for small quantum ensembles, this gain vanishes for a large number of systems.

  11. Validating Automated Measures of Text Complexity

    Science.gov (United States)

    Sheehan, Kathleen M.

    2017-01-01

    Automated text complexity measurement tools (also called readability metrics) have been proposed as a way to help teachers, textbook publishers, and assessment developers select texts that are closely aligned with the new, more demanding text complexity expectations specified in the Common Core State Standards. This article examines a critical…

  12. Object reading: text recognition for object recognition

    NARCIS (Netherlands)

    Karaoglu, S.; van Gemert, J.C.; Gevers, T.

    2012-01-01

    We propose to use text recognition to aid in visual object class recognition. To this end we first propose a new algorithm for text detection in natural images. The proposed text detection is based on saliency cues and a context fusion step. The algorithm does not need any parameter tuning and can

  13. The text as a linguistic object lesson

    OpenAIRE

    Zhazheva Saida; Zhazheva Dariet

    2014-01-01

    The article considers the question of the text as a linguistic object lesson. It presents the ways in which a text is organized; the main purpose of the text is reduced to solving a wide variety of tasks and to creating the variety of meanings set up by the author of the text.

  14. Refutation Texts for Effective Climate Change Education

    Science.gov (United States)

    Nussbaum, E. Michael; Cordova, Jacqueline R.; Rehmat, Abeera P.

    2017-01-01

    Refutation texts, which are texts that rebut scientific misconceptions and explain the normative concept, can be effective devices for addressing misconceptions and affecting conceptual change. However, few, if any, refutation texts specifically related to climate change have been validated for effectiveness. In this project, we developed and…

  15. The Costs of Texting in the Classroom

    Science.gov (United States)

    Lawson, Dakota; Henderson, Bruce B.

    2015-01-01

    Many college students seem to find it impossible to resist the temptation to text on electronic devices during class lectures and discussions. One common response of college professors is to yield to the inevitable and try to ignore student texting. However, research indicates that because of limited cognitive capacities, even simple texting can…

  16. Effects of Text Messaging on Academic Performance

    Directory of Open Access Journals (Sweden)

    Barks Amanda

    2011-12-01

    Full Text Available University students frequently send and receive cellular phone text messages during classroom instruction. Cognitive psychology research indicates that multi-tasking is frequently associated with performance cost. However, university students often have considerable experience with electronic multi-tasking and may believe that they can devote necessary attention to a classroom lecture while sending and receiving text messages. In the current study, university students who used text messaging were randomly assigned to one of two conditions: 1. a group that sent and received text messages during a lecture or, 2. a group that did not engage in text messaging during the lecture. Participants who engaged in text messaging demonstrated significantly poorer performance on a test covering lecture content compared with the group that did not send and receive text messages. Participants exhibiting higher levels of text messaging skill had significantly lower test scores than participants who were less proficient at text messaging. It is hypothesized that in terms of retention of lecture material, more frequent task shifting by those with greater text messaging proficiency contributed to poorer performance. Overall, the findings do not support the view, held by many university students, that this form of multitasking has little effect on the acquisition of lecture content. Results provide empirical support for teachers and professors who ban text messaging in the classroom.

  17. Female gender stereotype in French advertising texts

    Directory of Open Access Journals (Sweden)

    А С Борисова

    2008-06-01

    Full Text Available This article deals with the problem of female gender stereotypes in French advertising texts. On the basis of a practical analysis of advertising texts published in some modern French periodicals, we managed to expose and define general and national-cultural female gender stereotypes fixed in the collective consciousness of the French.

  18. Teacher Modeling Using Complex Informational Texts

    Science.gov (United States)

    Fisher, Douglas; Frey, Nancy

    2015-01-01

    Modeling in complex texts requires that teachers analyze the text for factors of qualitative complexity and then design lessons that introduce students to that complexity. In addition, teachers can model the disciplinary nature of content area texts as well as word solving and comprehension strategies. Included is a planning guide for think aloud.

  19. Creating and Using Culturally Sustaining Informational Texts

    Science.gov (United States)

    Watanabe Kganetso, Lynne M.

    2017-01-01

    Current standards and assessments emphasize the importance of a variety of genres in students' literacy diets, which has placed increased attention on informational texts. Unfortunately, young students' current exposure to and experiences with informational texts are often limited by the texts' availability, quality, and relevance to children's…

  20. Center for Electronic Texts in the Humanities.

    Science.gov (United States)

    Gaunt, Marianne I.

    1994-01-01

    Describes the development and activities of the Center for Electronic Texts in the Humanities, established by Princeton University and Rutgers University to provide a national focus for the development, dissemination, and use of electronic texts in the humanities. Sidebars explain the Text Encoding Initiative and Standard Generalized Markup…

  1. Biomedical text mining and its applications in cancer research.

    Science.gov (United States)

    Zhu, Fei; Patumcharoenpol, Preecha; Zhang, Cheng; Yang, Yang; Chan, Jonathan; Meechai, Asawin; Vongsangnak, Wanwipa; Shen, Bairong

    2013-04-01

    Cancer is a malignant disease that has caused millions of human deaths. Its study has a long history of well over 100 years. There have been an enormous number of publications on cancer research. This integrated but unstructured biomedical text is of great value for cancer diagnostics, treatment, and prevention. The immense body and rapid growth of biomedical text on cancer has led to the appearance of a large number of text mining techniques aimed at extracting novel knowledge from scientific text. Biomedical text mining on cancer research is computationally automatic and high-throughput in nature. However, it is error-prone due to the complexity of natural language processing. In this review, we introduce the basic concepts underlying text mining and examine some frequently used algorithms, tools, and data sets, as well as assessing how much these algorithms have been utilized. We then discuss the current state-of-the-art text mining applications in cancer research and we also provide some resources for cancer text mining. With the development of systems biology, researchers tend to understand complex biomedical systems from a systems biology viewpoint. Thus, the full utilization of text mining to facilitate cancer systems biology research is fast becoming a major concern. To address this issue, we describe the general workflow of text mining in cancer systems biology and each phase of the workflow. We hope that this review can (i) provide a useful overview of the current work of this field; (ii) help researchers to choose text mining tools and datasets; and (iii) highlight how to apply text mining to assist cancer systems biology research. Copyright © 2012 Elsevier Inc. All rights reserved.

  2. Bilingual Text Messaging Translation: Translating Text Messages From English Into Spanish for the Text4Walking Program.

    Science.gov (United States)

    Buchholz, Susan Weber; Sandi, Giselle; Ingram, Diana; Welch, Mary Jane; Ocampo, Edith V

    2015-05-06

    Hispanic adults in the United States are at particular risk for diabetes and inadequate blood pressure control. Physical activity improves these health problems; however, Hispanic adults also have a low rate of recommended aerobic physical activity. To address physical inactivity, one area of rapidly growing technology that can be utilized is text messaging (short message service, SMS). A physical activity research team, Text4Walking, had previously developed an initial database of motivational physical activity text messages in English that could be used for physical activity text messaging interventions. However, the team needed to translate these existing English physical activity text messages into Spanish in order to have culturally meaningful and useful text messages for those adults within the Hispanic population who would prefer to receive text messages in Spanish. The aim of this study was to translate a database of English motivational physical activity messages into Spanish and review these text messages with a group of Spanish speaking adults to inform the use of these text messages in an intervention study. The consent form and study documents, including the existing English physical activity text messages, were translated from English into Spanish, and received translation certification as well as Institutional Review Board approval. The translated text messages were placed into PowerPoint, accompanied by a set of culturally appropriate photos depicting barriers to walking, as well as walking scenarios. At the focus group, eligibility criteria for this study included being an adult between 30 and 65 years old who spoke Spanish as their primary language. After a general group introduction, participants were placed into smaller groups of two or three. Each small group was asked to review a segment of the translated text messages for accuracy and meaningfulness. After the break out, the group was brought back together to review the text messages

  3. The Instructional Text like a Textual Genre

    Directory of Open Access Journals (Sweden)

    Adiane Fogali Marinello

    2011-07-01

    Full Text Available This article analyses the instructional text as a textual genre and is part of the research project Reading and text production from the textual genre perspective, conducted at Universidade de Caxias do Sul, Campus Universitário da Região dos Vinhedos. First, some theoretical assumptions about textual genre are presented; then the instructional text is characterized. After that, an instructional text is analyzed and, finally, some activities related to the reading and writing of this genre, directed at High School and University students, are suggested.

  4. An Embedded Application for Degraded Text Recognition

    Directory of Open Access Journals (Sweden)

    Thillou Céline

    2005-01-01

    Full Text Available This paper describes a mobile device which tries to give the blind or visually impaired access to text information. Three key technologies are required for this system: text detection, optical character recognition, and speech synthesis. Blind users and the mobile environment imply two strong constraints. First, pictures will be taken without control over camera settings and without a priori information on the text (font or size) and background. The second issue is to link several techniques together with an optimal compromise between computational constraints and recognition efficiency. We will present the overall description of the system from text detection to OCR error correction.

  5. Text segmentation in degraded historical document images

    Directory of Open Access Journals (Sweden)

    A.S. Kavitha

    2016-07-01

    Full Text Available Text segmentation from degraded historical Indus script images helps an Optical Character Recognizer (OCR) to achieve good recognition rates for Indus scripts; however, it is challenging due to the complex background in such images. In this paper, we present a new method for segmenting text and non-text in Indus documents based on the fact that text components are less cursive compared to non-text ones. To achieve this, we propose a new combination of Sobel and Laplacian for enhancing degraded low contrast pixels. Then the proposed method generates skeletons for text components in enhanced images to reduce computational burdens, which in turn helps in studying component structures efficiently. We propose to study the cursiveness of components based on branch information to remove false text components. The proposed method introduces the nearest neighbor criterion for grouping components in the same line, which results in clusters. Furthermore, the proposed method classifies these clusters into text and non-text clusters based on characteristics of text components. We evaluate the proposed method on a large dataset containing varieties of images. The results are compared with the existing methods to show that the proposed method is effective in terms of recall and precision.
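
    The abstract above does not say how the Sobel and Laplacian responses are combined, so the following is only a minimal sketch under an assumed rule: the enhanced value of a pixel is the Sobel gradient magnitude plus the absolute Laplacian response. The function names `filter3x3` and `enhance` are hypothetical and are not taken from the paper.

```python
import math

# Standard 3x3 kernels for the two operators named in the abstract.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]


def filter3x3(img, kernel):
    """Correlate a 2D list of gray values with a 3x3 kernel.

    Border pixels are left at 0 because the kernel does not fit there.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = float(sum(
                kernel[j][i] * img[y + j - 1][x + i - 1]
                for j in range(3)
                for i in range(3)
            ))
    return out


def enhance(img):
    """ASSUMED combination rule: Sobel magnitude + |Laplacian| per pixel."""
    gx = filter3x3(img, SOBEL_X)
    gy = filter3x3(img, SOBEL_Y)
    lap = filter3x3(img, LAPLACIAN)
    h, w = len(img), len(img[0])
    return [
        [math.hypot(gx[y][x], gy[y][x]) + abs(lap[y][x]) for x in range(w)]
        for y in range(h)
    ]


# A 5x5 image with a vertical step edge between columns 1 and 2.
step = [[0, 0, 10, 10, 10] for _ in range(5)]
enhanced = enhance(step)
```

    With this rule, pixels on either side of the step edge receive a large response while flat regions stay at zero, which is the kind of contrast boost the abstract describes for degraded low-contrast strokes. The skeletonization and clustering stages are not sketched here.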

  6. Text-speak processing impairs tactile location.

    Science.gov (United States)

    Head, James; Helton, William; Russell, Paul; Neumann, Ewald

    2012-09-01

    Dual task experiments have highlighted that driving while having a conversation on a cell phone can have negative impacts on driving (Strayer & Drews, 2007). It has also been noted that this negative impact is greater when reading a text-message (Lee, 2007). Commonly used in text-messaging are shortening devices collectively known as text-speak (e.g., "Ys I wll ttyl 2nite" for "Yes I will talk to you later tonight"). To the authors' knowledge, there has been no investigation into the potential negative impacts of reading text-speak on concurrent performance on other tasks. Forty participants read a correctly spelled story and a story presented in text-speak while concurrently monitoring for a vibration around their waist. Slower reaction times and fewer correct vibration detections occurred while reading text-speak than while reading a correctly spelled story. The results suggest that reading text-speak imposes greater cognitive load than reading correctly spelled text. These findings suggest that the negative impact of text messaging on driving may be compounded by the messages being in text-speak, instead of orthographically correct text. Copyright © 2012 Elsevier B.V. All rights reserved.

  7. The nuclear modification of charged particles in Pb-Pb at $\sqrt{s_\text{NN}} = 5.02\,\text{TeV}$ measured with ALICE

    CERN Document Server

    Gronefeld, Julius

    2016-09-21

    The study of inclusive charged-particle production in heavy-ion collisions provides insights into the density of the medium and the energy-loss mechanisms. The observed suppression of high-$p_\text{T}$ yield is generally attributed to energy loss of partons as they propagate through a deconfined state of quarks and gluons - Quark-Gluon Plasma (QGP) - predicted by QCD. Such measurements allow the characterization of the QGP by comparison with models. In these proceedings, results on high-$p_\text{T}$ particle production measured by ALICE in Pb-Pb collisions at $\sqrt{s_\text{NN}} = 5.02\ \text{TeV}$ as well as in pp at $\sqrt{s} = 5.02\ \text{TeV}$ are presented for the first time. The nuclear modification factors ($R_\text{AA}$) in Pb-Pb collisions are presented and compared with model calculations.
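
    The abstract does not define the nuclear modification factor. For orientation, the conventional definition is given below as background; it is not quoted from the proceedings:

```latex
R_\text{AA}(p_\text{T}) =
  \frac{1}{\langle T_\text{AA} \rangle}\,
  \frac{\mathrm{d}N_\text{AA}/\mathrm{d}p_\text{T}}
       {\mathrm{d}\sigma_\text{pp}/\mathrm{d}p_\text{T}}
```

    Here $\mathrm{d}N_\text{AA}/\mathrm{d}p_\text{T}$ is the per-event charged-particle yield in Pb-Pb collisions, $\mathrm{d}\sigma_\text{pp}/\mathrm{d}p_\text{T}$ is the corresponding cross section in pp collisions at the same energy, and $\langle T_\text{AA} \rangle$ is the average nuclear overlap function. A value of $R_\text{AA} = 1$ indicates no nuclear modification, while $R_\text{AA} < 1$ at high $p_\text{T}$ corresponds to the suppression discussed above.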