WorldWideScience

Sample records for robust richly annotated

  1. Robust Optical Richness Estimation with Reduced Scatter

    Energy Technology Data Exchange (ETDEWEB)

    Rykoff, E.S.; /LBL, Berkeley; Koester, B.P.; /Chicago U. /Chicago U., KICP; Rozo, E.; /Chicago U. /Chicago U., KICP; Annis, J.; /Fermilab; Evrard, A.E.; /Michigan U. /Michigan U., MCTP; Hansen, S.M.; /Lick Observ.; Hao, J.; /Fermilab; Johnston, D.E.; /Fermilab; McKay, T.A.; /Michigan U. /Michigan U., MCTP; Wechsler, R.H.; /KIPAC, Menlo Park /SLAC

    2012-06-07

    Reducing the scatter between cluster mass and optical richness is a key goal for cluster cosmology from photometric catalogs. We consider various modifications to the red-sequence matched filter richness estimator of Rozo et al. (2009b), and evaluate their impact on the scatter in X-ray luminosity at fixed richness. Most significantly, we find that deeper luminosity cuts can reduce the recovered scatter, finding that {sigma}{sub ln L{sub X}|{lambda}} = 0.63 {+-} 0.02 for clusters with M{sub 500c} {approx}> 1.6 x 10{sup 14} h{sub 70}{sup -1} M{sub {circle_dot}}. The corresponding scatter in mass at fixed richness is {sigma}{sub ln M|{lambda}} {approx} 0.2-0.3 depending on the richness, comparable to that for total X-ray luminosity. We find that including blue galaxies in the richness estimate increases the scatter, as does weighting galaxies by their optical luminosity. We further demonstrate that our richness estimator is very robust. Specifically, the filter employed when estimating richness can be calibrated directly from the data, without requiring a-priori calibrations of the red-sequence. We also demonstrate that the recovered richness is robust to up to 50% uncertainties in the galaxy background, as well as to the choice of photometric filter employed, so long as the filters span the 4000 {angstrom} break of red-sequence galaxies. Consequently, our richness estimator can be used to compare richness estimates of different clusters, even if they do not share the same photometric data. Appendix A includes 'easy-bake' instructions for implementing our optimal richness estimator, and we are releasing an implementation of the code that works with SDSS data, as well as an augmented maxBCG catalog with the {lambda} richness measured for each cluster.

  2. Automatically annotating web pages using Google Rich Snippets

    NARCIS (Netherlands)

    Hogenboom, F.P.; Frasincar, F.; Vandic, D.; Meer, van der J.; Boon, F.; Kaymak, U.

    2011-01-01

    We propose the Automatic Review Recognition and annO- tation of Web pages (ARROW) framework, a framework for Web page review identification and annotation using RDFa Google Rich Snippets. The ARROW framework consists of four steps: hotspot identification, subjectivity analysis, in- formation

  3. A robust data-driven approach for gene ontology annotation.

    Science.gov (United States)

    Li, Yanpeng; Yu, Hong

    2014-01-01

    Gene ontology (GO) and GO annotation are important resources for biological information management and knowledge discovery, but the speed of manual annotation became a major bottleneck of database curation. BioCreative IV GO annotation task aims to evaluate the performance of system that automatically assigns GO terms to genes based on the narrative sentences in biomedical literature. This article presents our work in this task as well as the experimental results after the competition. For the evidence sentence extraction subtask, we built a binary classifier to identify evidence sentences using reference distance estimator (RDE), a recently proposed semi-supervised learning method that learns new features from around 10 million unlabeled sentences, achieving an F1 of 19.3% in exact match and 32.5% in relaxed match. In the post-submission experiment, we obtained 22.1% and 35.7% F1 performance by incorporating bigram features in RDE learning. In both development and test sets, RDE-based method achieved over 20% relative improvement on F1 and AUC performance against classical supervised learning methods, e.g. support vector machine and logistic regression. For the GO term prediction subtask, we developed an information retrieval-based method to retrieve the GO term most relevant to each evidence sentence using a ranking function that combined cosine similarity and the frequency of GO terms in documents, and a filtering method based on high-level GO classes. The best performance of our submitted runs was 7.8% F1 and 22.2% hierarchy F1. We found that the incorporation of frequency information and hierarchy filtering substantially improved the performance. In the post-submission evaluation, we obtained a 10.6% F1 using a simpler setting. Overall, the experimental analysis showed our approaches were robust in both the two tasks. © The Author(s) 2014. Published by Oxford University Press.

  4. A framework for automatic annotation of web pages using the Google Rich Snippets vocabulary

    NARCIS (Netherlands)

    Meer, van der J.; Boon, F.; Hogenboom, F.P.; Frasincar, F.; Kaymak, U.

    2011-01-01

    One of the latest developments for the Semantic Web is Google Rich Snippets, a service that uses Web page annotations for displaying search results in a visually appealing manner. In this paper we propose the Automatic Review Recognition and annOtation of Web pages (ARROW) framework, which is able

  5. Coreference annotation and resolution in the Colorado Richly Annotated Full Text (CRAFT) corpus of biomedical journal articles.

    Science.gov (United States)

    Cohen, K Bretonnel; Lanfranchi, Arrick; Choi, Miji Joo-Young; Bada, Michael; Baumgartner, William A; Panteleyeva, Natalya; Verspoor, Karin; Palmer, Martha; Hunter, Lawrence E

    2017-08-17

    Coreference resolution is the task of finding strings in text that have the same referent as other strings. Failures of coreference resolution are a common cause of false negatives in information extraction from the scientific literature. In order to better understand the nature of the phenomenon of coreference in biomedical publications and to increase performance on the task, we annotated the Colorado Richly Annotated Full Text (CRAFT) corpus with coreference relations. The corpus was manually annotated with coreference relations, including identity and appositives for all coreferring base noun phrases. The OntoNotes annotation guidelines, with minor adaptations, were used. Interannotator agreement ranges from 0.480 (entity-based CEAF) to 0.858 (Class-B3), depending on the metric that is used to assess it. The resulting corpus adds nearly 30,000 annotations to the previous release of the CRAFT corpus. Differences from related projects include a much broader definition of markables, connection to extensive annotation of several domain-relevant semantic classes, and connection to complete syntactic annotation. Tool performance was benchmarked on the data. A publicly available out-of-the-box, general-domain coreference resolution system achieved an F-measure of 0.14 (B3), while a simple domain-adapted rule-based system achieved an F-measure of 0.42. An ensemble of the two reached F of 0.46. Following the IDENTITY chains in the data would add 106,263 additional named entities in the full 97-paper corpus, for an increase of 76% percent in the semantic classes of the eight ontologies that have been annotated in earlier versions of the CRAFT corpus. The project produced a large data set for further investigation of coreference and coreference resolution in the scientific literature. The work raised issues in the phenomenon of reference in this domain and genre, and the paper proposes that many mentions that would be considered generic in the general domain are not

  6. Robust detection and tracking of annotations for outdoor augmented reality browsing

    Science.gov (United States)

    Langlotz, Tobias; Degendorfer, Claus; Mulloni, Alessandro; Schall, Gerhard; Reitmayr, Gerhard; Schmalstieg, Dieter

    2011-01-01

    A common goal of outdoor augmented reality (AR) is the presentation of annotations that are registered to anchor points in the real world. We present an enhanced approach for registering and tracking such anchor points, which is suitable for current generation mobile phones and can also successfully deal with the wide variety of viewing conditions encountered in real life outdoor use. The approach is based on on-the-fly generation of panoramic images by sweeping the camera over the scene. The panoramas are then used for stable orientation tracking, while the user is performing only rotational movements. This basic approach is improved by several new techniques for the re-detection and tracking of anchor points. For the re-detection, specifically after temporal variations, we first compute a panoramic image with extended dynamic range, which can better represent varying illumination conditions. The panorama is then searched for known anchor points, while orientation tracking continues uninterrupted. We then use information from an internal orientation sensor to prime an active search scheme for the anchor points, which improves matching results. Finally, global consistency is enhanced by statistical estimation of a global rotation that minimizes the overall position error of anchor points when transforming them from the source panorama in which they were created, to the current view represented by a new panorama. Once the anchor points are redetected, we track the user's movement using a novel 3-degree-of-freedom orientation tracking approach that combines vision tracking with the absolute orientation from inertial and magnetic sensors. We tested our system using an AR campus guide as an example application and provide detailed results for our approach using an off-the-shelf smartphone. Results show that the re-detection rate is improved by a factor of 2 compared to previous work and reaches almost 90% for a wide variety of test cases while still keeping the ability

  7. Robust detection and tracking of annotations for outdoor augmented reality browsing.

    Science.gov (United States)

    Langlotz, Tobias; Degendorfer, Claus; Mulloni, Alessandro; Schall, Gerhard; Reitmayr, Gerhard; Schmalstieg, Dieter

    2011-08-01

    A common goal of outdoor augmented reality (AR) is the presentation of annotations that are registered to anchor points in the real world. We present an enhanced approach for registering and tracking such anchor points, which is suitable for current generation mobile phones and can also successfully deal with the wide variety of viewing conditions encountered in real life outdoor use. The approach is based on on-the-fly generation of panoramic images by sweeping the camera over the scene. The panoramas are then used for stable orientation tracking, while the user is performing only rotational movements. This basic approach is improved by several new techniques for the re-detection and tracking of anchor points. For the re-detection, specifically after temporal variations, we first compute a panoramic image with extended dynamic range, which can better represent varying illumination conditions. The panorama is then searched for known anchor points, while orientation tracking continues uninterrupted. We then use information from an internal orientation sensor to prime an active search scheme for the anchor points, which improves matching results. Finally, global consistency is enhanced by statistical estimation of a global rotation that minimizes the overall position error of anchor points when transforming them from the source panorama in which they were created, to the current view represented by a new panorama. Once the anchor points are redetected, we track the user's movement using a novel 3-degree-of-freedom orientation tracking approach that combines vision tracking with the absolute orientation from inertial and magnetic sensors. We tested our system using an AR campus guide as an example application and provide detailed results for our approach using an off-the-shelf smartphone. Results show that the re-detection rate is improved by a factor of 2 compared to previous work and reaches almost 90% for a wide variety of test cases while still keeping the ability

  8. Robust

    DEFF Research Database (Denmark)

    2017-01-01

    Robust – Reflections on Resilient Architecture’, is a scientific publication following the conference of the same name in November of 2017. Researches and PhD-Fellows, associated with the Masters programme: Cultural Heritage, Transformation and Restoration (Transformation), at The Royal Danish...

  9. Clever generation of rich SPARQL queries from annotated relational schema: application to Semantic Web Service creation for biological databases.

    Science.gov (United States)

    Wollbrett, Julien; Larmande, Pierre; de Lamotte, Frédéric; Ruiz, Manuel

    2013-04-15

    In recent years, a large amount of "-omics" data have been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a time-consuming task. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers. We developed a framework, named BioSemantic, for the creation of Semantic Web Services that are applicable to relational biological databases. This framework makes use of both Semantic Web and Web Services technologies and can be divided into two main parts: (i) the generation and semi-automatic annotation of an RDF view; and (ii) the automatic generation of SPARQL queries and their integration into Semantic Web Services backbones. We have used our framework to integrate genomic data from different plant databases. BioSemantic is a framework that was designed to speed integration of relational databases. We present how it can be used to speed the development of Semantic Web Services for existing relational biological databases. Currently, it creates and annotates RDF views that enable the automatic generation of SPARQL queries. Web Services are also created and deployed automatically, and the semantic annotations of our Web Services are added automatically using SAWSDL attributes. BioSemantic is downloadable at http://southgreen.cirad.fr/?q=content/Biosemantic.

  10. Clever generation of rich SPARQL queries from annotated relational schema: application to Semantic Web Service creation for biological databases

    Science.gov (United States)

    2013-01-01

    Background In recent years, a large amount of “-omics” data have been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a time-consuming task. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers. Results We developed a framework, named BioSemantic, for the creation of Semantic Web Services that are applicable to relational biological databases. This framework makes use of both Semantic Web and Web Services technologies and can be divided into two main parts: (i) the generation and semi-automatic annotation of an RDF view; and (ii) the automatic generation of SPARQL queries and their integration into Semantic Web Services backbones. We have used our framework to integrate genomic data from different plant databases. Conclusions BioSemantic is a framework that was designed to speed integration of relational databases. We present how it can be used to speed the development of Semantic Web Services for existing relational biological databases. Currently, it creates and annotates RDF views that enable the automatic generation of SPARQL queries. Web Services are also created and deployed automatically, and the semantic annotations of our Web Services are added automatically using SAWSDL attributes. BioSemantic is downloadable at http://southgreen.cirad.fr/?q=content/Biosemantic. PMID:23586394

  11. Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob; Hohimer, Ryan E.; White, Amanda M.

    2006-06-06

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.

  12. Automating Ontological Annotation with WordNet

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob L.; Hohimer, Ryan E.; White, Amanda M.

    2006-01-22

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.

  13. BioAnnote: a software platform for annotating biomedical documents with application in medical learning environments.

    Science.gov (United States)

    López-Fernández, H; Reboiro-Jato, M; Glez-Peña, D; Aparicio, F; Gachet, D; Buenaga, M; Fdez-Riverola, F

    2013-07-01

    Automatic term annotation from biomedical documents and external information linking are becoming a necessary prerequisite in modern computer-aided medical learning systems. In this context, this paper presents BioAnnote, a flexible and extensible open-source platform for automatically annotating biomedical resources. Apart from other valuable features, the software platform includes (i) a rich client enabling users to annotate multiple documents in a user friendly environment, (ii) an extensible and embeddable annotation meta-server allowing for the annotation of documents with local or remote vocabularies and (iii) a simple client/server protocol which facilitates the use of our meta-server from any other third-party application. In addition, BioAnnote implements a powerful scripting engine able to perform advanced batch annotations. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  14. Robust high pressure stability and negative thermal expansion in sodium-rich antiperovskites Na{sub 3}OBr and Na{sub 4}OI{sub 2}

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Yonggang, E-mail: yyggwang@gmail.com, E-mail: yangwg@hpstar.ac.cn, E-mail: yusheng.zhao@unlv.edu [High Pressure Science and Engineering Center, University of Nevada, Las Vegas, Nevada 89154 (United States); Institute of Nanostructured Functional Materials, Huanghe Science and Technology College, Zhengzhou, Henan 450006 (China); High Pressure Synergetic Consortium (HPSynC), Geophysical Laboratory, Carnegie Institution of Washington, Argonne, Illinois 60439 (United States); Wen, Ting [Institute of Nanostructured Functional Materials, Huanghe Science and Technology College, Zhengzhou, Henan 450006 (China); Park, Changyong; Kenney-Benson, Curtis [High Pressure Collaborative Access Team (HPCAT), Geophysical Laboratory, Carnegie Institution of Washington, Argonne, Illinois 60439 (United States); Pravica, Michael; Zhao, Yusheng, E-mail: yyggwang@gmail.com, E-mail: yangwg@hpstar.ac.cn, E-mail: yusheng.zhao@unlv.edu [High Pressure Science and Engineering Center, University of Nevada, Las Vegas, Nevada 89154 (United States); Yang, Wenge, E-mail: yyggwang@gmail.com, E-mail: yangwg@hpstar.ac.cn, E-mail: yusheng.zhao@unlv.edu [High Pressure Synergetic Consortium (HPSynC), Geophysical Laboratory, Carnegie Institution of Washington, Argonne, Illinois 60439 (United States); Center for High Pressure Science and Technology Advanced Research (HPSTAR), Shanghai 201203 (China)

    2016-01-14

    The structure stability under high pressure and thermal expansion behavior of Na{sub 3}OBr and Na{sub 4}OI{sub 2}, two prototypes of alkali-metal-rich antiperovskites, were investigated by in situ synchrotron X-ray diffraction techniques under high pressure and low temperature. Both are soft materials with bulk modulus of 58.6 GPa and 52.0 GPa for Na{sub 3}OBr and Na{sub 4}OI{sub 2}, respectively. The cubic Na{sub 3}OBr structure and tetragonal Na{sub 4}OI{sub 2} with intergrowth K{sub 2}NiF{sub 4} structure are stable under high pressure up to 23 GPa. Although being a characteristic layered structure, Na{sub 4}OI{sub 2} exhibits nearly isotropic compressibility. Negative thermal expansion was observed at low temperature range (20–80 K) in both transition-metal-free antiperovskites for the first time. The robust high pressure structure stability was examined and confirmed by first-principles calculations among various possible polymorphisms qualitatively. The results provide in-depth understanding of the negative thermal expansion and robust crystal structure stability of these antiperovskite systems and their potential applications.

  15. Robust high pressure stability and negative thermal expansion in sodium-rich antiperovskites Na3OBr and Na4OI2

    International Nuclear Information System (INIS)

    Wang, Yonggang; Wen, Ting; Park, Changyong; Kenney-Benson, Curtis; Pravica, Michael; Zhao, Yusheng; Yang, Wenge

    2016-01-01

    The structure stability under high pressure and thermal expansion behavior of Na 3 OBr and Na 4 OI 2 , two prototypes of alkali-metal-rich antiperovskites, were investigated by in situ synchrotron X-ray diffraction techniques under high pressure and low temperature. Both are soft materials with bulk modulus of 58.6 GPa and 52.0 GPa for Na 3 OBr and Na 4 OI 2 , respectively. The cubic Na 3 OBr structure and tetragonal Na 4 OI 2 with intergrowth K 2 NiF 4 structure are stable under high pressure up to 23 GPa. Although being a characteristic layered structure, Na 4 OI 2 exhibits nearly isotropic compressibility. Negative thermal expansion was observed at low temperature range (20–80 K) in both transition-metal-free antiperovskites for the first time. The robust high pressure structure stability was examined and confirmed by first-principles calculations among various possible polymorphisms qualitatively. The results provide in-depth understanding of the negative thermal expansion and robust crystal structure stability of these antiperovskite systems and their potential applications

  16. Annotated bibliography

    International Nuclear Information System (INIS)

    1997-08-01

    Under a cooperative agreement with the U.S. Department of Energy's Office of Science and Technology, Waste Policy Institute (WPI) is conducting a five-year research project to develop a research-based approach for integrating communication products in stakeholder involvement related to innovative technology. As part of the research, WPI developed this annotated bibliography which contains almost 100 citations of articles/books/resources involving topics related to communication and public involvement aspects of deploying innovative cleanup technology. To compile the bibliography, WPI performed on-line literature searches (e.g., Dialog, International Association of Business Communicators Public Relations Society of America, Chemical Manufacturers Association, etc.), consulted past years proceedings of major environmental waste cleanup conferences (e.g., Waste Management), networked with professional colleagues and DOE sites to gather reports or case studies, and received input during the August 1996 Research Design Team meeting held to discuss the project's research methodology. Articles were selected for annotation based upon their perceived usefulness to the broad range of public involvement and communication practitioners

  17. Phylogenetic molecular function annotation

    International Nuclear Information System (INIS)

    Engelhardt, Barbara E; Jordan, Michael I; Repo, Susanna T; Brenner, Steven E

    2009-01-01

    It is now easier to discover thousands of protein sequences in a new microbial genome than it is to biochemically characterize the specific activity of a single protein of unknown function. The molecular functions of protein sequences have typically been predicted using homology-based computational methods, which rely on the principle that homologous proteins share a similar function. However, some protein families include groups of proteins with different molecular functions. A phylogenetic approach for predicting molecular function (sometimes called 'phylogenomics') is an effective means to predict protein molecular function. These methods incorporate functional evidence from all members of a family that have functional characterizations using the evolutionary history of the protein family to make robust predictions for the uncharacterized proteins. However, they are often difficult to apply on a genome-wide scale because of the time-consuming step of reconstructing the phylogenies of each protein to be annotated. Our automated approach for function annotation using phylogeny, the SIFTER (Statistical Inference of Function Through Evolutionary Relationships) methodology, uses a statistical graphical model to compute the probabilities of molecular functions for unannotated proteins. Our benchmark tests showed that SIFTER provides accurate functional predictions on various protein families, outperforming other available methods.

  18. Essential Requirements for Digital Annotation Systems

    Directory of Open Access Journals (Sweden)

    ADRIANO, C. M.

    2012-06-01

    Full Text Available Digital annotation systems are usually based on partial scenarios and arbitrary requirements. Accidental and essential characteristics are usually mixed in non explicit models. Documents and annotations are linked together accidentally according to the current technology, allowing for the development of disposable prototypes, but not to the support of non-functional requirements such as extensibility, robustness and interactivity. In this paper we perform a careful analysis on the concept of annotation, studying the scenarios supported by digital annotation tools. We also derived essential requirements based on a classification of annotation systems applied to existing tools. The analysis performed and the proposed classification can be applied and extended to other type of collaborative systems.

  19. Robust Exploration and Commercial Missions to the Moon Using Nuclear Thermal Rocket Propulsion and Lunar Liquid Oxygen Derived from FeO-Rich Pyroclasitc Deposits

    Science.gov (United States)

    Borowski, Stanley K.; Ryan, Stephen W.; Burke, Laura M.; McCurdy, David R.; Fittje, James E.; Joyner, Claude R.

    2018-01-01

    The nuclear thermal rocket (NTR) has frequently been identified as a key space asset required for the human exploration of Mars. This proven technology can also provide the affordable access through cislunar space necessary for commercial development and sustained human presence on the Moon. It is a demonstrated technology capable of generating both high thrust and high specific impulse (I(sub sp) approx. 900 s) twice that of today's best chemical rockets. Nuclear lunar transfer vehicles-consisting of a propulsion stage using three approx. 16.5-klb(sub f) small nuclear rocket engines (SNREs), an in-line propellant tank, plus the payload-are reusable, enabling a variety of lunar missions. These include cargo delivery and crewed lunar landing missions. Even weeklong ''tourism'' missions carrying passengers into lunar orbit for a day of sightseeing and picture taking are possible. The NTR can play an important role in the next phase of lunar exploration and development by providing a robust in-space lunar transportation system (LTS) that can allow initial outposts to evolve into settlements supported by a variety of commercial activities such as in-situ propellant production used to supply strategically located propellant depots and transportation nodes. The use of lunar liquid oxygen (LLO2) derived from iron oxide (FeO)-rich volcanic glass beads, found in numerous pyroclastic deposits on the Moon, can significantly reduce the launch mass requirements from Earth by enabling reusable, surface-based lunar landing vehicles (LLVs)that use liquid oxygen and hydrogen (LO2/LH2) chemical rocket engines. Afterwards, a LO2/LH2 propellant depot can be established in lunar equatorial orbit to supply the LTS. At this point a modified version of the conventional NTR-called the LO2-augmented NTR, or LANTR-is introduced into the LTS allowing bipropellant operation and leveraging the mission benefits of refueling with lunar-derived propellants for Earth return. The bipropellant LANTR

  20. Robust Exploration and Commercial Missions to the Moon Using LANTR Propulsion and Lunar Liquid Oxygen Derived from FeO-Rich Pyroclastic Deposits

    Science.gov (United States)

    Borowski, Stanley K.; Ryan, Stephen W.; Burke, Laura M.; McCurdy, David R.; Fittje, James E.; Joyner, Claude R.

    2017-01-01

    The nuclear thermal rocket (NTR) has frequently been identified as a key space asset required for the human exploration of Mars. This proven technology can also provide the affordable access through cislunar space necessary for commercial development and sustained human presence on the Moon. It is a demonstrated technology capable of generating both high thrust and high specific impulse (Isp approx.900 s) twice that of todays best chemical rockets. Nuclear lunar transfer vehicles consisting of a propulsion stage using three approx.16.5 klbf Small Nuclear Rocket Engines (SNREs), an in-line propellant tank, plus the payload can enable a variety of reusable lunar missions. These include cargo delivery and crewed lunar landing missions. Even weeklong tourism missions carrying passengers into lunar orbit for a day of sightseeing and picture taking are possible. The NTR can play an important role in the next phase of lunar exploration and development by providing a robust in-space lunar transportation system (LTS) that can allow initial outposts to evolve into settlements supported by a variety of commercial activities such as in-situ propellant production used to supply strategically located propellant depots and transportation nodes. The use of lunar liquid oxygen (LLO2) derived from iron oxide (FeO)-rich volcanic glass beads, found in numerous pyroclastic deposits on the Moon, can significantly reduce the launch mass requirements from Earth by enabling reusable, surface-based lunar landing vehicles (LLVs) using liquid oxygen/hydrogen (LO2/H2) chemical rocket engines. Afterwards, a LO2/H2 propellant depot can be established in lunar equatorial orbit to supply the LTS. At this point a modified version of the conventional NTR called the LOX-augmented NTR, or LANTR is introduced into the LTS allowing bipropellant operation and leveraging the mission benefits of refueling with lunar-derived propellants for Earth return. The bipropellant LANTR engine utilizes the large

  1. Concept annotation in the CRAFT corpus.

    Science.gov (United States)

    Bada, Michael; Eckert, Miriam; Evans, Donald; Garcia, Kristin; Shipley, Krista; Sitnikov, Dmitry; Baumgartner, William A; Cohen, K Bretonnel; Verspoor, Karin; Blake, Judith A; Hunter, Lawrence E

    2012-07-09

    Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text. This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement. As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems. The corpus, annotation guidelines, and other associated resources are freely available at http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml.

  2. Ubiquitous Annotation Systems

    DEFF Research Database (Denmark)

    Hansen, Frank Allan

    2006-01-01

    Ubiquitous annotation systems allow users to annotate physical places, objects, and persons with digital information. Especially in the field of location based information systems much work has been done to implement adaptive and context-aware systems, but few efforts have focused on the general...... requirements for linking information to objects in both physical and digital space. This paper surveys annotation techniques from open hypermedia systems, Web based annotation systems, and mobile and augmented reality systems to illustrate different approaches to four central challenges ubiquitous annotation...... systems have to deal with: anchoring, structuring, presentation, and authoring. Through a number of examples each challenge is discussed and HyCon, a context-aware hypermedia framework developed at the University of Aarhus, Denmark, is used to illustrate an integrated approach to ubiquitous annotations...

  3. CGKB: an annotation knowledge base for cowpea (Vigna unguiculata L. methylation filtered genomic genespace sequences

    Directory of Open Access Journals (Sweden)

    Spraggins Thomas A

    2007-04-01

    Full Text Available Abstract Background Cowpea [Vigna unguiculata (L. Walp.] is one of the most important food and forage legumes in the semi-arid tropics because of its ability to tolerate drought and grow on poor soils. It is cultivated mostly by poor farmers in developing countries, with 80% of production taking place in the dry savannah of tropical West and Central Africa. Cowpea is largely an underexploited crop with relatively little genomic information available for use in applied plant breeding. The goal of the Cowpea Genomics Initiative (CGI, funded by the Kirkhouse Trust, a UK-based charitable organization, is to leverage modern molecular genetic tools for gene discovery and cowpea improvement. One aspect of the initiative is the sequencing of the gene-rich region of the cowpea genome (termed the genespace recovered using methylation filtration technology and providing annotation and analysis of the sequence data. Description CGKB, Cowpea Genespace/Genomics Knowledge Base, is an annotation knowledge base developed under the CGI. The database is based on information derived from 298,848 cowpea genespace sequences (GSS isolated by methylation filtering of genomic DNA. The CGKB consists of three knowledge bases: GSS annotation and comparative genomics knowledge base, GSS enzyme and metabolic pathway knowledge base, and GSS simple sequence repeats (SSRs knowledge base for molecular marker discovery. A homology-based approach was applied for annotations of the GSS, mainly using BLASTX against four public FASTA formatted protein databases (NCBI GenBank Proteins, UniProtKB-Swiss-Prot, UniprotKB-PIR (Protein Information Resource, and UniProtKB-TrEMBL. Comparative genome analysis was done by BLASTX searches of the cowpea GSS against four plant proteomes from Arabidopsis thaliana, Oryza sativa, Medicago truncatula, and Populus trichocarpa. The possible exons and introns on each cowpea GSS were predicted using the HMM-based Genscan gene predication program and the

  4. Gene Ontology annotation of the rice blast fungus, Magnaporthe oryzae

    Directory of Open Access Journals (Sweden)

    Deng Jixin

    2009-02-01

    were assigned to the 3 root terms. The Version 5 GO annotation is publically queryable via the GO site http://amigo.geneontology.org/cgi-bin/amigo/go.cgi. Additionally, the genome of M. oryzae is constantly being refined and updated as new information is incorporated. For the latest GO annotation of Version 6 genome, please visit our website http://scotland.fgl.ncsu.edu/smeng/GoAnnotationMagnaporthegrisea.html. The preliminary GO annotation of Version 6 genome is placed at a local MySql database that is publically queryable via a user-friendly interface Adhoc Query System. Conclusion Our analysis provides comprehensive and robust GO annotations of the M. oryzae genome assemblies that will be solid foundations for further functional interrogation of M. oryzae.

  5. Annotating images by mining image search results.

    Science.gov (United States)

    Wang, Xin-Jing; Zhang, Lei; Li, Xirong; Ma, Wei-Ying

    2008-11-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search results. Some 2.4 million images with their surrounding text are collected from a few photo forums to support this approach. The entire process is formulated in a divide-and-conquer framework where a query keyword is provided along with the uncaptioned image to improve both the effectiveness and efficiency. This is helpful when the collected data set is not dense everywhere. In this sense, our approach contains three steps: 1) the search process to discover visually and semantically similar search results, 2) the mining process to identify salient terms from textual descriptions of the search results, and 3) the annotation rejection process to filter out noisy terms yielded by Step 2. To ensure real-time annotation, two key techniques are leveraged-one is to map the high-dimensional image visual features into hash codes, the other is to implement it as a distributed system, of which the search and mining processes are provided as Web services. As a typical result, the entire process finishes in less than 1 second. Since no training data set is required, our approach enables annotating with unlimited vocabulary and is highly scalable and robust to outliers. Experimental results on both real Web images and a benchmark image data set show the effectiveness and efficiency of the proposed algorithm. It is also worth noting that, although the entire approach is illustrated within the divide-and conquer framework, a query keyword is not crucial to our current implementation. We provide experimental results to prove this.

  6. Annotating individual human genomes.

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A; Topol, Eric J; Schork, Nicholas J

    2011-10-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. Copyright © 2011 Elsevier Inc. All rights reserved.

  7. ANNOTATING INDIVIDUAL HUMAN GENOMES*

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A.; Topol, Eric J.; Schork, Nicholas J.

    2014-01-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely to amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. PMID:21839162

  8. GSV Annotated Bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, Randy S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pope, Paul A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Trucano, Timothy G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aragon, Cecilia R. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ni, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wei, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Chilton, Lawrence K. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bakel, Alan [Argonne National Lab. (ANL), Argonne, IL (United States)

    2010-09-14

    The following annotated bibliography was developed as part of the geospatial algorithm verification and validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Verification and Validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following five topic areas: Image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models. Many other papers were studied during the course of the investigation including. The annotations for these articles can be found in the paper "On the verification and validation of geospatial image analysis algorithms".

  9. Annotation: The Savant Syndrome

    Science.gov (United States)

    Heaton, Pamela; Wallace, Gregory L.

    2004-01-01

    Background: Whilst interest has focused on the origin and nature of the savant syndrome for over a century, it is only within the past two decades that empirical group studies have been carried out. Methods: The following annotation briefly reviews relevant research and also attempts to address outstanding issues in this research area.…

  10. Annotating Emotions in Meetings

    NARCIS (Netherlands)

    Reidsma, Dennis; Heylen, Dirk K.J.; Ordelman, Roeland J.F.

    We present the results of two trials testing procedures for the annotation of emotion and mental state of the AMI corpus. The first procedure is an adaptation of the FeelTrace method, focusing on a continuous labelling of emotion dimensions. The second method is centered around more discrete

  11. Reasoning with Annotations of Texts

    OpenAIRE

    Ma , Yue; Lévy , François; Ghimire , Sudeep

    2011-01-01

    International audience; Linguistic and semantic annotations are important features for text-based applications. However, achieving and maintaining a good quality of a set of annotations is known to be a complex task. Many ad hoc approaches have been developed to produce various types of annotations, while comparing those annotations to improve their quality is still rare. In this paper, we propose a framework in which both linguistic and domain information can cooperate to reason with annotat...

  12. GSV Annotated Bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, Randy S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pope, Paul A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Jiang, Ming [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Trucano, Timothy G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aragon, Cecilia R. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ni, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wei, Thomas [Argonne National Lab. (ANL), Argonne, IL (United States); Chilton, Lawrence K. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Bakel, Alan [Argonne National Lab. (ANL), Argonne, IL (United States)

    2011-06-14

    The following annotated bibliography was developed as part of the Geospatial Algorithm Veri cation and Validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Veri cation and Validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following ve topic areas: Image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models.

  13. Diverse Image Annotation

    KAUST Repository

    Wu, Baoyuan

    2017-11-09

    In this work we study the task of image annotation, of which the goal is to describe an image using a few tags. Instead of predicting the full list of tags, here we target for providing a short list of tags under a limited number (e.g., 3), to cover as much information as possible of the image. The tags in such a short list should be representative and diverse. It means they are required to be not only corresponding to the contents of the image, but also be different to each other. To this end, we treat the image annotation as a subset selection problem based on the conditional determinantal point process (DPP) model, which formulates the representation and diversity jointly. We further explore the semantic hierarchy and synonyms among the candidate tags, and require that two tags in a semantic hierarchy or in a pair of synonyms should not be selected simultaneously. This requirement is then embedded into the sampling algorithm according to the learned conditional DPP model. Besides, we find that traditional metrics for image annotation (e.g., precision, recall and F1 score) only consider the representation, but ignore the diversity. Thus we propose new metrics to evaluate the quality of the selected subset (i.e., the tag list), based on the semantic hierarchy and synonyms. Human study through Amazon Mechanical Turk verifies that the proposed metrics are more close to the humans judgment than traditional metrics. Experiments on two benchmark datasets show that the proposed method can produce more representative and diverse tags, compared with existing image annotation methods.

  14. Diverse Image Annotation

    KAUST Repository

    Wu, Baoyuan; Jia, Fan; Liu, Wei; Ghanem, Bernard

    2017-01-01

    In this work we study the task of image annotation, of which the goal is to describe an image using a few tags. Instead of predicting the full list of tags, here we target for providing a short list of tags under a limited number (e.g., 3), to cover as much information as possible of the image. The tags in such a short list should be representative and diverse. It means they are required to be not only corresponding to the contents of the image, but also be different to each other. To this end, we treat the image annotation as a subset selection problem based on the conditional determinantal point process (DPP) model, which formulates the representation and diversity jointly. We further explore the semantic hierarchy and synonyms among the candidate tags, and require that two tags in a semantic hierarchy or in a pair of synonyms should not be selected simultaneously. This requirement is then embedded into the sampling algorithm according to the learned conditional DPP model. Besides, we find that traditional metrics for image annotation (e.g., precision, recall and F1 score) only consider the representation, but ignore the diversity. Thus we propose new metrics to evaluate the quality of the selected subset (i.e., the tag list), based on the semantic hierarchy and synonyms. Human study through Amazon Mechanical Turk verifies that the proposed metrics are more close to the humans judgment than traditional metrics. Experiments on two benchmark datasets show that the proposed method can produce more representative and diverse tags, compared with existing image annotation methods.

  15. Annotation of Regular Polysemy

    DEFF Research Database (Denmark)

    Martinez Alonso, Hector

    Regular polysemy has received a lot of attention from the theory of lexical semantics and from computational linguistics. However, there is no consensus on how to represent the sense of underspecified examples at the token level, namely when annotating or disambiguating senses of metonymic words...... and metonymic. We have conducted an analysis in English, Danish and Spanish. Later on, we have tried to replicate the human judgments by means of unsupervised and semi-supervised sense prediction. The automatic sense-prediction systems have been unable to find empiric evidence for the underspecified sense, even...

  16. Impingement: an annotated bibliography

    International Nuclear Information System (INIS)

    Uziel, M.S.; Hannon, E.H.

    1979-04-01

    This bibliography of 655 annotated references on impingement of aquatic organisms at intake structures of thermal-power-plant cooling systems was compiled from the published and unpublished literature. The bibliography includes references from 1928 to 1978 on impingement monitoring programs; impingement impact assessment; applicable law; location and design of intake structures, screens, louvers, and other barriers; fish behavior and swim speed as related to impingement susceptibility; and the effects of light, sound, bubbles, currents, and temperature on fish behavior. References are arranged alphabetically by author or corporate author. Indexes are provided for author, keywords, subject category, geographic location, taxon, and title

  17. Predicting word sense annotation agreement

    DEFF Research Database (Denmark)

    Martinez Alonso, Hector; Johannsen, Anders Trærup; Lopez de Lacalle, Oier

    2015-01-01

    High agreement is a common objective when annotating data for word senses. However, a number of factors make perfect agreement impossible, e.g. the limitations of the sense inventories, the difficulty of the examples or the interpretation preferences of the annotations. Estimating potential...... agreement is thus a relevant task to supplement the evaluation of sense annotations. In this article we propose two methods to predict agreement on word-annotation instances. We experiment with a continuous representation and a three-way discretization of observed agreement. In spite of the difficulty...

  18. PANNZER2: a rapid functional annotation web server.

    Science.gov (United States)

    Törönen, Petri; Medlar, Alan; Holm, Liisa

    2018-05-08

    The unprecedented growth of high-throughput sequencing has led to an ever-widening annotation gap in protein databases. While computational prediction methods are available to make up the shortfall, a majority of public web servers are hindered by practical limitations and poor performance. Here, we introduce PANNZER2 (Protein ANNotation with Z-scoRE), a fast functional annotation web server that provides both Gene Ontology (GO) annotations and free text description predictions. PANNZER2 uses SANSparallel to perform high-performance homology searches, making bulk annotation based on sequence similarity practical. PANNZER2 can output GO annotations from multiple scoring functions, enabling users to see which predictions are robust across predictors. Finally, PANNZER2 predictions scored within the top 10 methods for molecular function and biological process in the CAFA2 NK-full benchmark. The PANNZER2 web server is updated on a monthly schedule and is accessible at http://ekhidna2.biocenter.helsinki.fi/sanspanz/. The source code is available under the GNU Public Licence v3.

  19. Mesotext. Framing and exploring annotations

    NARCIS (Netherlands)

    Boot, P.; Boot, P.; Stronks, E.

    2007-01-01

    From the introduction: Annotation is an important item on the wish list for digital scholarly tools. It is one of John Unsworth’s primitives of scholarship (Unsworth 2000). Especially in linguistics,a number of tools have been developed that facilitate the creation of annotations to source material

  20. THE DIMENSIONS OF COMPOSITION ANNOTATION.

    Science.gov (United States)

    MCCOLLY, WILLIAM

    ENGLISH TEACHER ANNOTATIONS WERE STUDIED TO DETERMINE THE DIMENSIONS AND PROPERTIES OF THE ENTIRE SYSTEM FOR WRITING CORRECTIONS AND CRITICISMS ON COMPOSITIONS. FOUR SETS OF COMPOSITIONS WERE WRITTEN BY STUDENTS IN GRADES 9 THROUGH 13. TYPESCRIPTS OF THE COMPOSITIONS WERE ANNOTATED BY CLASSROOM ENGLISH TEACHERS. THEN, 32 ENGLISH TEACHERS JUDGED…

  1. Chado controller: advanced annotation management with a community annotation system.

    Science.gov (United States)

    Guignon, Valentin; Droc, Gaëtan; Alaux, Michael; Baurens, Franc-Christophe; Garsmeur, Olivier; Poiron, Claire; Carver, Tim; Rouard, Mathieu; Bocs, Stéphanie

    2012-04-01

    We developed a controller that is compliant with the Chado database schema, GBrowse and genome annotation-editing tools such as Artemis and Apollo. It enables the management of public and private data, monitors manual annotation (with controlled vocabularies, structural and functional annotation controls) and stores versions of annotation for all modified features. The Chado controller uses PostgreSQL and Perl. The Chado Controller package is available for download at http://www.gnpannot.org/content/chado-controller and runs on any Unix-like operating system, and documentation is available at http://www.gnpannot.org/content/chado-controller-doc The system can be tested using the GNPAnnot Sandbox at http://www.gnpannot.org/content/gnpannot-sandbox-form valentin.guignon@cirad.fr; stephanie.sidibe-bocs@cirad.fr Supplementary data are available at Bioinformatics online.

  2. Robust Scientists

    DEFF Research Database (Denmark)

    Gorm Hansen, Birgitte

    their core i nterests, 2) developing a selfsupply of industry interests by becoming entrepreneurs and thus creating their own compliant industry partner and 3) balancing resources within a larger collective of researchers, thus countering changes in the influx of funding caused by shifts in political...... knowledge", Danish research policy seems to have helped develop politically and economically "robust scientists". Scientific robustness is acquired by way of three strategies: 1) tasting and discriminating between resources so as to avoid funding that erodes academic profiles and push scientists away from...

  3. Displaying Annotations for Digitised Globes

    Science.gov (United States)

    Gede, Mátyás; Farbinger, Anna

    2018-05-01

    Thanks to the efforts of the various globe digitising projects, nowadays there are plenty of old globes that can be examined as 3D models on the computer screen. These globes usually contain a lot of interesting details that an average observer would not entirely discover for the first time. The authors developed a website that can display annotations for such digitised globes. These annotations help observers of the globe to discover all the important, interesting details. Annotations consist of a plain text title, a HTML formatted descriptive text and a corresponding polygon and are stored in KML format. The website is powered by the Cesium virtual globe engine.

  4. Annotation-based feature extraction from sets of SBML models.

    Science.gov (United States)

    Alm, Rebekka; Waltemath, Dagmar; Wolfien, Markus; Wolkenhauer, Olaf; Henkel, Ron

    2015-01-01

    Model repositories such as BioModels Database provide computational models of biological systems for the scientific community. These models contain rich semantic annotations that link model entities to concepts in well-established bio-ontologies such as Gene Ontology. Consequently, thematically similar models are likely to share similar annotations. Based on this assumption, we argue that semantic annotations are a suitable tool to characterize sets of models. These characteristics improve model classification, allow to identify additional features for model retrieval tasks, and enable the comparison of sets of models. In this paper we discuss four methods for annotation-based feature extraction from model sets. We tested all methods on sets of models in SBML format which were composed from BioModels Database. To characterize each of these sets, we analyzed and extracted concepts from three frequently used ontologies, namely Gene Ontology, ChEBI and SBO. We find that three out of the methods are suitable to determine characteristic features for arbitrary sets of models: The selected features vary depending on the underlying model set, and they are also specific to the chosen model set. We show that the identified features map on concepts that are higher up in the hierarchy of the ontologies than the concepts used for model annotations. Our analysis also reveals that the information content of concepts in ontologies and their usage for model annotation do not correlate. Annotation-based feature extraction enables the comparison of model sets, as opposed to existing methods for model-to-keyword comparison, or model-to-model comparison.

  5. A Set of Annotation Interfaces for Alignment of Parallel Corpora

    Directory of Open Access Journals (Sweden)

    Singh Anil Kumar

    2014-09-01

    Full Text Available Annotation interfaces for parallel corpora which fit in well with other tools can be very useful. We describe a set of annotation interfaces which fulfill this criterion. This set includes a sentence alignment interface, two different word or word group alignment interfaces and an initial version of a parallel syntactic annotation alignment interface. These tools can be used for manual alignment, or they can be used to correct automatic alignments. Manual alignment can be performed in combination with certain kinds of linguistic annotation. Most of these interfaces use a representation called the Shakti Standard Format that has been found to be very robust and has been used for large and successful projects. It ties together the different interfaces, so that the data created by them is portable across all tools which support this representation. The existence of a query language for data stored in this representation makes it possible to build tools that allow easy search and modification of annotated parallel data.

  6. Phenex: ontological annotation of phenotypic diversity.

    Directory of Open Access Journals (Sweden)

    James P Balhoff

    2010-05-01

    Full Text Available Phenotypic differences among species have long been systematically itemized and described by biologists in the process of investigating phylogenetic relationships and trait evolution. Traditionally, these descriptions have been expressed in natural language within the context of individual journal publications or monographs. As such, this rich store of phenotype data has been largely unavailable for statistical and computational comparisons across studies or integration with other biological knowledge.Here we describe Phenex, a platform-independent desktop application designed to facilitate efficient and consistent annotation of phenotypic similarities and differences using Entity-Quality syntax, drawing on terms from community ontologies for anatomical entities, phenotypic qualities, and taxonomic names. Phenex can be configured to load only those ontologies pertinent to a taxonomic group of interest. The graphical user interface was optimized for evolutionary biologists accustomed to working with lists of taxa, characters, character states, and character-by-taxon matrices.Annotation of phenotypic data using ontologies and globally unique taxonomic identifiers will allow biologists to integrate phenotypic data from different organisms and studies, leveraging decades of work in systematics and comparative morphology.

  7. Phenex: ontological annotation of phenotypic diversity.

    Science.gov (United States)

    Balhoff, James P; Dahdul, Wasila M; Kothari, Cartik R; Lapp, Hilmar; Lundberg, John G; Mabee, Paula; Midford, Peter E; Westerfield, Monte; Vision, Todd J

    2010-05-05

    Phenotypic differences among species have long been systematically itemized and described by biologists in the process of investigating phylogenetic relationships and trait evolution. Traditionally, these descriptions have been expressed in natural language within the context of individual journal publications or monographs. As such, this rich store of phenotype data has been largely unavailable for statistical and computational comparisons across studies or integration with other biological knowledge. Here we describe Phenex, a platform-independent desktop application designed to facilitate efficient and consistent annotation of phenotypic similarities and differences using Entity-Quality syntax, drawing on terms from community ontologies for anatomical entities, phenotypic qualities, and taxonomic names. Phenex can be configured to load only those ontologies pertinent to a taxonomic group of interest. The graphical user interface was optimized for evolutionary biologists accustomed to working with lists of taxa, characters, character states, and character-by-taxon matrices. Annotation of phenotypic data using ontologies and globally unique taxonomic identifiers will allow biologists to integrate phenotypic data from different organisms and studies, leveraging decades of work in systematics and comparative morphology.

  8. The leucine-rich repeat structure.

    Science.gov (United States)

    Bella, J; Hindle, K L; McEwan, P A; Lovell, S C

    2008-08-01

    The leucine-rich repeat is a widespread structural motif of 20-30 amino acids with a characteristic repetitive sequence pattern rich in leucines. Leucine-rich repeat domains are built from tandems of two or more repeats and form curved solenoid structures that are particularly suitable for protein-protein interactions. Thousands of protein sequences containing leucine-rich repeats have been identified by automatic annotation methods. Three-dimensional structures of leucine-rich repeat domains determined to date reveal a degree of structural variability that translates into the considerable functional versatility of this protein superfamily. As the essential structural principles become well established, the leucine-rich repeat architecture is emerging as an attractive framework for structural prediction and protein engineering. This review presents an update of the current understanding of leucine-rich repeat structure at the primary, secondary, tertiary and quaternary levels and discusses specific examples from recently determined three-dimensional structures.

  9. Adaptive Reactive Rich Internet Applications

    Science.gov (United States)

    Schmidt, Kay-Uwe; Stühmer, Roland; Dörflinger, Jörg; Rahmani, Tirdad; Thomas, Susan; Stojanovic, Ljiljana

    Rich Internet Applications significantly raise the user experience compared with legacy page-based Web applications because of their highly responsive user interfaces. Although this is a tremendous advance, it does not solve the problem of the one-size-fits-all approach1 of current Web applications. So although Rich Internet Applications put the user in a position to interact seamlessly with the Web application, they do not adapt to the context in which the user is currently working. In this paper we address the on-the-fly personalization of Rich Internet Applications. We introduce the concept of ARRIAs: Adaptive Reactive Rich Internet Applications and elaborate on how they are able to adapt to the current working context the user is engaged in. An architecture for the ad hoc adaptation of Rich Internet Applications is presented as well as a holistic framework and tools for the realization of our on-the-fly personalization approach. We divided both the architecture and the framework into two levels: offline/design-time and online/run-time. For design-time we explain how to use ontologies in order to annotate Rich Internet Applications and how to use these annotations for conceptual Web usage mining. Furthermore, we describe how to create client-side executable rules from the semantic data mining results. We present our declarative lightweight rule language tailored to the needs of being executed directly on the client. Because of the event-driven nature of the user interfaces of Rich Internet Applications, we designed a lightweight rule language based on the event-condition-action paradigm.2 At run-time the interactions of a user are tracked directly on the client and in real-time a user model is built up. The user model then acts as input to and is evaluated by our client-side complex event processing and rule engine.

  10. Objective-guided image annotation.

    Science.gov (United States)

    Mao, Qi; Tsang, Ivor Wai-Hung; Gao, Shenghua

    2013-04-01

    Automatic image annotation, which is usually formulated as a multi-label classification problem, is one of the major tools used to enhance the semantic understanding of web images. Many multimedia applications (e.g., tag-based image retrieval) can greatly benefit from image annotation. However, the insufficient performance of image annotation methods prevents these applications from being practical. On the other hand, specific measures are usually designed to evaluate how well one annotation method performs for a specific objective or application, but most image annotation methods do not consider optimization of these measures, so that they are inevitably trapped into suboptimal performance of these objective-specific measures. To address this issue, we first summarize a variety of objective-guided performance measures under a unified representation. Our analysis reveals that macro-averaging measures are very sensitive to infrequent keywords, and hamming measure is easily affected by skewed distributions. We then propose a unified multi-label learning framework, which directly optimizes a variety of objective-specific measures of multi-label learning tasks. Specifically, we first present a multilayer hierarchical structure of learning hypotheses for multi-label problems based on which a variety of loss functions with respect to objective-guided measures are defined. And then, we formulate these loss functions as relaxed surrogate functions and optimize them by structural SVMs. According to the analysis of various measures and the high time complexity of optimizing micro-averaging measures, in this paper, we focus on example-based measures that are tailor-made for image annotation tasks but are seldom explored in the literature. Experiments show consistency with the formal analysis on two widely used multi-label datasets, and demonstrate the superior performance of our proposed method over state-of-the-art baseline methods in terms of example-based measures on four

  11. Image annotation under X Windows

    Science.gov (United States)

    Pothier, Steven

    1991-08-01

    A mechanism for attaching graphic and overlay annotation to multiple bits/pixel imagery while providing levels of performance approaching that of native mode graphics systems is presented. This mechanism isolates programming complexity from the application programmer through software encapsulation under the X Window System. It ensures display accuracy throughout operations on the imagery and annotation including zooms, pans, and modifications of the annotation. Trade-offs that affect speed of display, consumption of memory, and system functionality are explored. The use of resource files to tune the display system is discussed. The mechanism makes use of an abstraction consisting of four parts; a graphics overlay, a dithered overlay, an image overly, and a physical display window. Data structures are maintained that retain the distinction between the four parts so that they can be modified independently, providing system flexibility. A unique technique for associating user color preferences with annotation is introduced. An interface that allows interactive modification of the mapping between image value and color is discussed. A procedure that provides for the colorization of imagery on 8-bit display systems using pixel dithering is explained. Finally, the application of annotation mechanisms to various applications is discussed.

  12. A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data.

    Science.gov (United States)

    Lu, Qiongshi; Hu, Yiming; Sun, Jiehuan; Cheng, Yuwei; Cheung, Kei-Hoi; Zhao, Hongyu

    2015-05-27

    Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. The ability of predicting functional regions as well as its generalizable statistical framework makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu.

  13. Alignment-Annotator web server: rendering and annotating sequence alignments.

    Science.gov (United States)

    Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas

    2014-07-01

    Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed at server side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (BioDAS servers, the UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip-archive containing the HTML files. Because of the use of HTML the resulting interactive alignment can be viewed on any platform including Windows, Mac OS X, Linux, Android and iOS in any standard web browser. Importantly, no plugins nor Java are required and therefore Alignment-Anotator represents the first interactive browser-based alignment visualization. http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Public Relations: Selected, Annotated Bibliography.

    Science.gov (United States)

    Demo, Penny

    Designed for students and practitioners of public relations (PR), this annotated bibliography focuses on recent journal articles and ERIC documents. The 34 citations include the following: (1) surveys of public relations professionals on career-related education; (2) literature reviews of research on measurement and evaluation of PR and…

  15. Persuasion: A Selected, Annotated Bibliography.

    Science.gov (United States)

    McDermott, Steven T.

    Designed to reflect the diversity of approaches to persuasion, this annotated bibliography cites materials selected for their contribution to that diversity as well as for being relatively current and/or especially significant representatives of particular approaches. The bibliography starts with a list of 17 general textbooks on approaches to…

  16. [Prescription annotations in Welfare Pharmacy].

    Science.gov (United States)

    Han, Yi

    2018-03-01

    Welfare Pharmacy contains medical formulas documented by the government and official prescriptions used by the official pharmacy in the pharmaceutical process. In the last years of Southern Song Dynasty, anonyms gave a lot of prescription annotations, made textual researches for the name, source, composition and origin of the prescriptions, and supplemented important historical data of medical cases and researched historical facts. The annotations of Welfare Pharmacy gathered the essence of medical theory, and can be used as precious materials to correctly understand the syndrome differentiation, compatibility regularity and clinical application of prescriptions. This article deeply investigated the style and form of the prescription annotations in Welfare Pharmacy, the name of prescriptions and the evolution of terminology, the major functions of the prescriptions, processing methods, instructions for taking medicine and taboos of prescriptions, the medical cases and clinical efficacy of prescriptions, the backgrounds, sources, composition and cultural meanings of prescriptions, proposed that the prescription annotations played an active role in the textual dissemination, patent medicine production and clinical diagnosis and treatment of Welfare Pharmacy. This not only helps understand the changes in the names and terms of traditional Chinese medicines in Welfare Pharmacy, but also provides the basis for understanding the knowledge sources, compatibility regularity, important drug innovations and clinical medications of prescriptions in Welfare Pharmacy. Copyright© by the Chinese Pharmaceutical Association.

  17. The surplus value of semantic annotations

    NARCIS (Netherlands)

    Marx, M.

    2010-01-01

    We compare the costs of semantic annotation of textual documents to its benefits for information processing tasks. Semantic annotation can improve the performance of retrieval tasks and facilitates an improved search experience through faceted search, focused retrieval, better document summaries,

  18. Systems Theory and Communication. Annotated Bibliography.

    Science.gov (United States)

    Covington, William G., Jr.

    This annotated bibliography presents annotations of 31 books and journal articles dealing with systems theory and its relation to organizational communication, marketing, information theory, and cybernetics. Materials were published between 1963 and 1992 and are listed alphabetically by author. (RS)

  19. Annotating images by mining image search results

    NARCIS (Netherlands)

    Wang, X.J.; Zhang, L.; Li, X.; Ma, W.Y.

    2008-01-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search

  20. Dictionary-driven protein annotation.

    Science.gov (United States)

    Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel

    2002-09-01

    Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/ bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were

  1. Evaluating Hierarchical Structure in Music Annotations.

    Science.gov (United States)

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. sing this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  2. Evaluating Hierarchical Structure in Music Annotations

    Directory of Open Access Journals (Sweden)

    Brian McFee

    2017-08-01

    Full Text Available Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR, it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for “flat” descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. sing this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  3. Functional annotation of hierarchical modularity.

    Directory of Open Access Journals (Sweden)

    Kanchana Padmanabhan

    Full Text Available In biological networks of molecular interactions in a cell, network motifs that are biologically relevant are also functionally coherent, or form functional modules. These functionally coherent modules combine in a hierarchical manner into larger, less cohesive subsystems, thus revealing one of the essential design principles of system-level cellular organization and function-hierarchical modularity. Arguably, hierarchical modularity has not been explicitly taken into consideration by most, if not all, functional annotation systems. As a result, the existing methods would often fail to assign a statistically significant functional coherence score to biologically relevant molecular machines. We developed a methodology for hierarchical functional annotation. Given the hierarchical taxonomy of functional concepts (e.g., Gene Ontology and the association of individual genes or proteins with these concepts (e.g., GO terms, our method will assign a Hierarchical Modularity Score (HMS to each node in the hierarchy of functional modules; the HMS score and its p-value measure functional coherence of each module in the hierarchy. While existing methods annotate each module with a set of "enriched" functional terms in a bag of genes, our complementary method provides the hierarchical functional annotation of the modules and their hierarchically organized components. A hierarchical organization of functional modules often comes as a bi-product of cluster analysis of gene expression data or protein interaction data. Otherwise, our method will automatically build such a hierarchy by directly incorporating the functional taxonomy information into the hierarchy search process and by allowing multi-functional genes to be part of more than one component in the hierarchy. In addition, its underlying HMS scoring metric ensures that functional specificity of the terms across different levels of the hierarchical taxonomy is properly treated. We have evaluated our

  4. An open annotation ontology for science on web 3.0.

    Science.gov (United States)

    Ciccarese, Paolo; Ocana, Marco; Garcia Castro, Leyla Jael; Das, Sudeshna; Clark, Tim

    2011-05-17

    There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies with the scientific document, in the form of an Annotation Ontology (AO), thus closing this gap and enabling application of formal biomedical ontologies directly to the literature as it emerges. Initial requirements for AO were elicited by analysis of integration needs between biomedical web communities, and of needs for representing and integrating results of biomedical text mining. Analysis of strengths and weaknesses of previous efforts in this area was also performed. A series of increasingly refined annotation tools were then developed along with a metadata model in OWL, and deployed for feedback and additional requirements the ontology to users at a major pharmaceutical company and a major academic center. Further requirements and critiques of the model were also elicited through discussions with many colleagues and incorporated into the work. This paper presents Annotation Ontology (AO), an open ontology in OWL-DL for annotating scientific documents on the web. AO supports both human and algorithmic content annotation. It enables "stand-off" or independent metadata anchored to specific positions in a web document by any one of several methods. In AO, the document may be annotated but is not required to be under update control of the annotator. AO contains a provenance model to support versioning, and a set model for specifying groups and containers of annotation. AO is freely available under open source license at http://purl.org/ao/, and extensive documentation including screencasts is available on AO's Google Code page: http://code.google.com/p/annotation-ontology/ . The Annotation Ontology meets critical requirements for

  5. Annotation of rule-based models with formal semantics to enable creation, analysis, reuse and visualization

    Science.gov (United States)

    Misirli, Goksel; Cavaliere, Matteo; Waites, William; Pocock, Matthew; Madsen, Curtis; Gilfellon, Owen; Honorato-Zimmer, Ricardo; Zuliani, Paolo; Danos, Vincent; Wipat, Anil

    2016-01-01

    Motivation: Biological systems are complex and challenging to model and therefore model reuse is highly desirable. To promote model reuse, models should include both information about the specifics of simulations and the underlying biology in the form of metadata. The availability of computationally tractable metadata is especially important for the effective automated interpretation and processing of models. Metadata are typically represented as machine-readable annotations which enhance programmatic access to information about models. Rule-based languages have emerged as a modelling framework to represent the complexity of biological systems. Annotation approaches have been widely used for reaction-based formalisms such as SBML. However, rule-based languages still lack a rich annotation framework to add semantic information, such as machine-readable descriptions, to the components of a model. Results: We present an annotation framework and guidelines for annotating rule-based models, encoded in the commonly used Kappa and BioNetGen languages. We adapt widely adopted annotation approaches to rule-based models. We initially propose a syntax to store machine-readable annotations and describe a mapping between rule-based modelling entities, such as agents and rules, and their annotations. We then describe an ontology to both annotate these models and capture the information contained therein, and demonstrate annotating these models using examples. Finally, we present a proof of concept tool for extracting annotations from a model that can be queried and analyzed in a uniform way. The uniform representation of the annotations can be used to facilitate the creation, analysis, reuse and visualization of rule-based models. Although examples are given, using specific implementations the proposed techniques can be applied to rule-based models in general. Availability and implementation: The annotation ontology for rule-based models can be found at http

  6. Semantic annotation in biomedicine: the current landscape.

    Science.gov (United States)

    Jovanović, Jelena; Bagheri, Ebrahim

    2017-09-22

    The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of information and knowledge stored in such texts. Annotation of biomedical documents with machine intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators.Over the last dozen years, the biomedical research community has invested significant efforts in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state of the art biomedical semantic annotators, focusing particularly on general purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvements of today's annotators which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.

  7. Pipeline to upgrade the genome annotations

    Directory of Open Access Journals (Sweden)

    Lijin K. Gopi

    2017-12-01

    Full Text Available Current era of functional genomics is enriched with good quality draft genomes and annotations for many thousands of species and varieties with the support of the advancements in the next generation sequencing technologies (NGS. Around 25,250 genomes, of the organisms from various kingdoms, are submitted in the NCBI genome resource till date. Each of these genomes was annotated using various tools and knowledge-bases that were available during the period of the annotation. It is obvious that these annotations will be improved if the same genome is annotated using improved tools and knowledge-bases. Here we present a new genome annotation pipeline, strengthened with various tools and knowledge-bases that are capable of producing better quality annotations from the consensus of the predictions from different tools. This resource also perform various additional annotations, apart from the usual gene predictions and functional annotations, which involve SSRs, novel repeats, paralogs, proteins with transmembrane helices, signal peptides etc. This new annotation resource is trained to evaluate and integrate all the predictions together to resolve the overlaps and ambiguities of the boundaries. One of the important highlights of this resource is the capability of predicting the phylogenetic relations of the repeats using the evolutionary trace analysis and orthologous gene clusters. We also present a case study, of the pipeline, in which we upgrade the genome annotation of Nelumbo nucifera (sacred lotus. It is demonstrated that this resource is capable of producing an improved annotation for a better understanding of the biology of various organisms.

  8. Annotating temporal information in clinical narratives.

    Science.gov (United States)

    Sun, Weiyi; Rumshisky, Anna; Uzuner, Ozlem

    2013-12-01

    Temporal information in clinical narratives plays an important role in patients' diagnosis, treatment and prognosis. In order to represent narrative information accurately, medical natural language processing (MLP) systems need to correctly identify and interpret temporal information. To promote research in this area, the Informatics for Integrating Biology and the Bedside (i2b2) project developed a temporally annotated corpus of clinical narratives. This corpus contains 310 de-identified discharge summaries, with annotations of clinical events, temporal expressions and temporal relations. This paper describes the process followed for the development of this corpus and discusses annotation guideline development, annotation methodology, and corpus quality. Copyright © 2013 Elsevier Inc. All rights reserved.

  9. Estimating the annotation error rate of curated GO database sequence annotations

    Directory of Open Access Journals (Sweden)

    Brown Alfred L

    2007-05-01

    Full Text Available Abstract Background Annotations that describe the function of sequences are enormously important to researchers during laboratory investigations and when making computational inferences. However, there has been little investigation into the data quality of sequence function annotations. Here we have developed a new method of estimating the error rate of curated sequence annotations, and applied this to the Gene Ontology (GO sequence database (GOSeqLite. This method involved artificially adding errors to sequence annotations at known rates, and used regression to model the impact on the precision of annotations based on BLAST matched sequences. Results We estimated the error rate of curated GO sequence annotations in the GOSeqLite database (March 2006 at between 28% and 30%. Annotations made without use of sequence similarity based methods (non-ISS had an estimated error rate of between 13% and 18%. Annotations made with the use of sequence similarity methodology (ISS had an estimated error rate of 49%. Conclusion While the overall error rate is reasonably low, it would be prudent to treat all ISS annotations with caution. Electronic annotators that use ISS annotations as the basis of predictions are likely to have higher false prediction rates, and for this reason designers of these systems should consider avoiding ISS annotations where possible. Electronic annotators that use ISS annotations to make predictions should be viewed sceptically. We recommend that curators thoroughly review ISS annotations before accepting them as valid. Overall, users of curated sequence annotations from the GO database should feel assured that they are using a comparatively high quality source of information.

  10. ANNOTATION SUPPORTED OCCLUDED OBJECT TRACKING

    Directory of Open Access Journals (Sweden)

    Devinder Kumar

    2012-08-01

    Full Text Available Tracking occluded objects at different depths has become as extremely important component of study for any video sequence having wide applications in object tracking, scene recognition, coding, editing the videos and mosaicking. The paper studies the ability of annotation to track the occluded object based on pyramids with variation in depth further establishing a threshold at which the ability of the system to track the occluded object fails. Image annotation is applied on 3 similar video sequences varying in depth. In the experiment, one bike occludes the other at a depth of 60cm, 80cm and 100cm respectively. Another experiment is performed on tracking humans with similar depth to authenticate the results. The paper also computes the frame by frame error incurred by the system, supported by detailed simulations. This system can be effectively used to analyze the error in motion tracking and further correcting the error leading to flawless tracking. This can be of great interest to computer scientists while designing surveillance systems etc.

  11. Robust Subjective Visual Property Prediction from Crowdsourced Pairwise Labels.

    Science.gov (United States)

    Fu, Yanwei; Hospedales, Timothy M; Xiang, Tao; Xiong, Jiechao; Gong, Shaogang; Wang, Yizhou; Yao, Yuan

    2016-03-01

    The problem of estimating subjective visual properties from image and video has attracted increasing interest. A subjective visual property is useful either on its own (e.g. image and video interestingness) or as an intermediate representation for visual recognition (e.g. a relative attribute). Due to its ambiguous nature, annotating the value of a subjective visual property for learning a prediction model is challenging. To make the annotation more reliable, recent studies employ crowdsourcing tools to collect pairwise comparison labels. However, using crowdsourced data also introduces outliers. Existing methods rely on majority voting to prune the annotation outliers/errors. They thus require a large amount of pairwise labels to be collected. More importantly as a local outlier detection method, majority voting is ineffective in identifying outliers that can cause global ranking inconsistencies. In this paper, we propose a more principled way to identify annotation outliers by formulating the subjective visual property prediction task as a unified robust learning to rank problem, tackling both the outlier detection and learning to rank jointly. This differs from existing methods in that (1) the proposed method integrates local pairwise comparison labels together to minimise a cost that corresponds to global inconsistency of ranking order, and (2) the outlier detection and learning to rank problems are solved jointly. This not only leads to better detection of annotation outliers but also enables learning with extremely sparse annotations.

  12. Creating Gaze Annotations in Head Mounted Displays

    DEFF Research Database (Denmark)

    Mardanbeigi, Diako; Qvarfordt, Pernilla

    2015-01-01

    To facilitate distributed communication in mobile settings, we developed GazeNote for creating and sharing gaze annotations in head mounted displays (HMDs). With gaze annotations it possible to point out objects of interest within an image and add a verbal description. To create an annota- tion...

  13. Ground Truth Annotation in T Analyst

    DEFF Research Database (Denmark)

    2015-01-01

    This video shows how to annotate the ground truth tracks in the thermal videos. The ground truth tracks are produced to be able to compare them to tracks obtained from a Computer Vision tracking approach. The program used for annotation is T-Analyst, which is developed by Aliaksei Laureshyn, Ph...

  14. Annotation of regular polysemy and underspecification

    DEFF Research Database (Denmark)

    Martínez Alonso, Héctor; Pedersen, Bolette Sandford; Bel, Núria

    2013-01-01

    We present the result of an annotation task on regular polysemy for a series of seman- tic classes or dot types in English, Dan- ish and Spanish. This article describes the annotation process, the results in terms of inter-encoder agreement, and the sense distributions obtained with two methods...

  15. Black English Annotations for Elementary Reading Programs.

    Science.gov (United States)

    Prasad, Sandre

    This report describes a program that uses annotations in the teacher's editions of existing reading programs to indicate the characteristics of black English that may interfere with the reading process of black children. The first part of the report provides a rationale for the annotation approach, explaining that the discrepancy between written…

  16. Harnessing Collaborative Annotations on Online Formative Assessments

    Science.gov (United States)

    Lin, Jian-Wei; Lai, Yuan-Cheng

    2013-01-01

    This paper harnesses collaborative annotations by students as learning feedback on online formative assessments to improve the learning achievements of students. Through the developed Web platform, students can conduct formative assessments, collaboratively annotate, and review historical records in a convenient way, while teachers can generate…

  17. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop.

    Science.gov (United States)

    Brister, James Rodney; Bao, Yiming; Kuiken, Carla; Lefkowitz, Elliot J; Le Mercier, Philippe; Leplae, Raphael; Madupu, Ramana; Scheuermann, Richard H; Schobel, Seth; Seto, Donald; Shrivastava, Susmita; Sterk, Peter; Zeng, Qiandong; Klimke, William; Tatusova, Tatiana

    2010-10-01

    Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world's biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.

  18. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop

    Directory of Open Access Journals (Sweden)

    Qiandong Zeng

    2010-10-01

    Full Text Available Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world’s biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.

  19. MIPS bacterial genomes functional annotation benchmark dataset.

    Science.gov (United States)

    Tetko, Igor V; Brauner, Barbara; Dunger-Kaltenbach, Irmtraud; Frishman, Goar; Montrone, Corinna; Fobo, Gisela; Ruepp, Andreas; Antonov, Alexey V; Surmeli, Dimitrij; Mewes, Hans-Wernen

    2005-05-15

    Any development of new methods for automatic functional annotation of proteins according to their sequences requires high-quality data (as benchmark) as well as tedious preparatory work to generate sequence parameters required as input data for the machine learning methods. Different program settings and incompatible protocols make a comparison of the analyzed methods difficult. The MIPS Bacterial Functional Annotation Benchmark dataset (MIPS-BFAB) is a new, high-quality resource comprising four bacterial genomes manually annotated according to the MIPS functional catalogue (FunCat). These resources include precalculated sequence parameters, such as sequence similarity scores, InterPro domain composition and other parameters that could be used to develop and benchmark methods for functional annotation of bacterial protein sequences. These data are provided in XML format and can be used by scientists who are not necessarily experts in genome annotation. BFAB is available at http://mips.gsf.de/proj/bfab

  20. Interoperable Multimedia Annotation and Retrieval for the Tourism Sector

    NARCIS (Netherlands)

    Chatzitoulousis, Antonios; Efraimidis, Pavlos S.; Athanasiadis, I.N.

    2015-01-01

    The Atlas Metadata System (AMS) employs semantic web annotation techniques in order to create an interoperable information annotation and retrieval platform for the tourism sector. AMS adopts state-of-the-art metadata vocabularies, annotation techniques and semantic web technologies.

  1. Methods for robustness programming

    NARCIS (Netherlands)

    Olieman, N.J.

    2008-01-01

    Robustness of an object is defined as the probability that an object will have properties as required. Robustness Programming (RP) is a mathematical approach for Robustness estimation and Robustness optimisation. An example in the context of designing a food product, is finding the best composition

  2. Robustness in laying hens

    NARCIS (Netherlands)

    Star, L.

    2008-01-01

    The aim of the project ‘The genetics of robustness in laying hens’ was to investigate nature and regulation of robustness in laying hens under sub-optimal conditions and the possibility to increase robustness by using animal breeding without loss of production. At the start of the project, a robust

  3. Ion implantation: an annotated bibliography

    International Nuclear Information System (INIS)

    Ting, R.N.; Subramanyam, K.

    1975-10-01

    Ion implantation is a technique for introducing controlled amounts of dopants into target substrates, and has been successfully used for the manufacture of silicon semiconductor devices. Ion implantation is superior to other methods of doping such as thermal diffusion and epitaxy, in view of its advantages such as high degree of control, flexibility, and amenability to automation. This annotated bibliography of 416 references consists of journal articles, books, and conference papers in English and foreign languages published during 1973-74, on all aspects of ion implantation including range distribution and concentration profile, channeling, radiation damage and annealing, compound semiconductors, structural and electrical characterization, applications, equipment and ion sources. Earlier bibliographies on ion implantation, and national and international conferences in which papers on ion implantation were presented have also been listed separately

  4. An Improved Cluster Richness Estimator

    Energy Technology Data Exchange (ETDEWEB)

    Rozo, Eduardo; /Ohio State U.; Rykoff, Eli S.; /UC, Santa Barbara; Koester, Benjamin P.; /Chicago U. /KICP, Chicago; McKay, Timothy; /Michigan U.; Hao, Jiangang; /Michigan U.; Evrard, August; /Michigan U.; Wechsler, Risa H.; /SLAC; Hansen, Sarah; /Chicago U. /KICP, Chicago; Sheldon, Erin; /New York U.; Johnston, David; /Houston U.; Becker, Matthew R.; /Chicago U. /KICP, Chicago; Annis, James T.; /Fermilab; Bleem, Lindsey; /Chicago U.; Scranton, Ryan; /Pittsburgh U.

    2009-08-03

    Minimizing the scatter between cluster mass and accessible observables is an important goal for cluster cosmology. In this work, we introduce a new matched filter richness estimator, and test its performance using the maxBCG cluster catalog. Our new estimator significantly reduces the variance in the L{sub X}-richness relation, from {sigma}{sub lnL{sub X}}{sup 2} = (0.86 {+-} 0.02){sup 2} to {sigma}{sub lnL{sub X}}{sup 2} = (0.69 {+-} 0.02){sup 2}. Relative to the maxBCG richness estimate, it also removes the strong redshift dependence of the richness scaling relations, and is significantly more robust to photometric and redshift errors. These improvements are largely due to our more sophisticated treatment of galaxy color data. We also demonstrate the scatter in the L{sub X}-richness relation depends on the aperture used to estimate cluster richness, and introduce a novel approach for optimizing said aperture which can be easily generalized to other mass tracers.

  5. Teaching and Learning Communities through Online Annotation

    Science.gov (United States)

    van der Pluijm, B.

    2016-12-01

    What do colleagues do with your assigned textbook? What they say or think about the material? Want students to be more engaged in their learning experience? If so, online materials that complement standard lecture format provide new opportunity through managed, online group annotation that leverages the ubiquity of internet access, while personalizing learning. The concept is illustrated with the new online textbook "Processes in Structural Geology and Tectonics", by Ben van der Pluijm and Stephen Marshak, which offers a platform for sharing of experiences, supplementary materials and approaches, including readings, mathematical applications, exercises, challenge questions, quizzes, alternative explanations, and more. The annotation framework used is Hypothes.is, which offers a free, open platform markup environment for annotation of websites and PDF postings. The annotations can be public, grouped or individualized, as desired, including export access and download of annotations. A teacher group, hosted by a moderator/owner, limits access to members of a user group of teachers, so that its members can use, copy or transcribe annotations for their own lesson material. Likewise, an instructor can host a student group that encourages sharing of observations, questions and answers among students and instructor. Also, the instructor can create one or more closed groups that offers study help and hints to students. Options galore, all of which aim to engage students and to promote greater responsibility for their learning experience. Beyond new capacity, the ability to analyze student annotation supports individual learners and their needs. For example, student notes can be analyzed for key phrases and concepts, and identify misunderstandings, omissions and problems. Also, example annotations can be shared to enhance notetaking skills and to help with studying. Lastly, online annotation allows active application to lecture posted slides, supporting real-time notetaking

  6. Facilitating functional annotation of chicken microarray data

    Directory of Open Access Journals (Sweden)

    Gresham Cathy R

    2009-10-01

    Full Text Available Abstract Background Modeling results from chicken microarray studies is challenging for researchers due to little functional annotation associated with these arrays. The Affymetrix GenChip chicken genome array, one of the biggest arrays that serve as a key research tool for the study of chicken functional genomics, is among the few arrays that link gene products to Gene Ontology (GO. However the GO annotation data presented by Affymetrix is incomplete, for example, they do not show references linked to manually annotated functions. In addition, there is no tool that facilitates microarray researchers to directly retrieve functional annotations for their datasets from the annotated arrays. This costs researchers amount of time in searching multiple GO databases for functional information. Results We have improved the breadth of functional annotations of the gene products associated with probesets on the Affymetrix chicken genome array by 45% and the quality of annotation by 14%. We have also identified the most significant diseases and disorders, different types of genes, and known drug targets represented on Affymetrix chicken genome array. To facilitate functional annotation of other arrays and microarray experimental datasets we developed an Array GO Mapper (AGOM tool to help researchers to quickly retrieve corresponding functional information for their dataset. Conclusion Results from this study will directly facilitate annotation of other chicken arrays and microarray experimental datasets. Researchers will be able to quickly model their microarray dataset into more reliable biological functional information by using AGOM tool. The disease, disorders, gene types and drug targets revealed in the study will allow researchers to learn more about how genes function in complex biological systems and may lead to new drug discovery and development of therapies. The GO annotation data generated will be available for public use via AgBase website and

  7. MicroScope: a platform for microbial genome annotation and comparative genomics.

    Science.gov (United States)

    Vallenet, D; Engelen, S; Mornico, D; Cruveiller, S; Fleury, L; Lajus, A; Rouy, Z; Roche, D; Salvignol, G; Scarpelli, C; Médigue, C

    2009-01-01

    The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope's rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of

  8. Extending eScience Provenance with User-Submitted Semantic Annotations

    Science.gov (United States)

    Michaelis, J.; Zednik, S.; West, P.; Fox, P. A.; McGuinness, D. L.

    2010-12-01

    eScience based systems generate provenance of their data products, related to such things as: data processing, data collection conditions, expert evaluation, and data product quality. Recent advances in web-based technology offer users the possibility of making annotations to both data products and steps in accompanying provenance traces, thereby expanding the utility of such provenance for others. These contributing users may have varying backgrounds, ranging from system experts to outside domain experts to citizen scientists. Furthermore, such users may wish to make varying types of annotations - ranging from documenting the purpose of a provenance step to raising concerns about the quality of data dependencies. Semantic Web technologies allow for such kinds of rich annotations to be made to provenance through the use of ontology vocabularies for (i) organizing provenance, and (ii) organizing user/annotation classifications. Furthermore, through Linked Data practices, Semantic linkages may be made from provenance steps to external data of interest. A desire for Semantically-annotated provenance has been motivated by data management issues in the Mauna Loa Solar Observatory’s (MLSO) Advanced Coronal Observing System (ACOS). In ACOS, photomoeter-based readings are taken of solar activity and subsequently processed into final data products consumable by end users. At intermediate stages of ACOS processing, factors such as evaluations by human experts and weather conditions are logged, which could impact data product quality. If such factors are linked via user-submitted annotations to provenance, it could be significantly beneficial for other users. Likewise, the background of a user could impact the credibility of their annotations. For example, an annotation made by a citizen scientist describing the purpose of a provenance step may not be as reliable as a similar annotation made by an ACOS project member. For this work, we have developed a software package that

  9. Automatic annotation of head velocity and acceleration in Anvil

    DEFF Research Database (Denmark)

    Jongejan, Bart

    2012-01-01

    We describe an automatic face tracker plugin for the ANVIL annotation tool. The face tracker produces data for velocity and for acceleration in two dimensions. We compare the annotations generated by the face tracking algorithm with independently made manual annotations for head movements....... The annotations are a useful supplement to manual annotations and may help human annotators to quickly and reliably determine onset of head movements and to suggest which kind of head movement is taking place....

  10. Semantic annotation of consumer health questions.

    Science.gov (United States)

    Kilicoglu, Halil; Ben Abacha, Asma; Mrabet, Yassine; Shooshan, Sonya E; Rodriguez, Laritza; Masterton, Kate; Demner-Fushman, Dina

    2018-02-06

    Consumers increasingly use online resources for their health information needs. While current search engines can address these needs to some extent, they generally do not take into account that most health information needs are complex and can only fully be expressed in natural language. Consumer health question answering (QA) systems aim to fill this gap. A major challenge in developing consumer health QA systems is extracting relevant semantic content from the natural language questions (question understanding). To develop effective question understanding tools, question corpora semantically annotated for relevant question elements are needed. In this paper, we present a two-part consumer health question corpus annotated with several semantic categories: named entities, question triggers/types, question frames, and question topic. The first part (CHQA-email) consists of relatively long email requests received by the U.S. National Library of Medicine (NLM) customer service, while the second part (CHQA-web) consists of shorter questions posed to MedlinePlus search engine as queries. Each question has been annotated by two annotators. The annotation methodology is largely the same between the two parts of the corpus; however, we also explain and justify the differences between them. Additionally, we provide information about corpus characteristics, inter-annotator agreement, and our attempts to measure annotation confidence in the absence of adjudication of annotations. The resulting corpus consists of 2614 questions (CHQA-email: 1740, CHQA-web: 874). Problems are the most frequent named entities, while treatment and general information questions are the most common question types. Inter-annotator agreement was generally modest: question types and topics yielded highest agreement, while the agreement for more complex frame annotations was lower. Agreement in CHQA-web was consistently higher than that in CHQA-email. Pairwise inter-annotator agreement proved most

  11. Making web annotations persistent over time

    Energy Technology Data Exchange (ETDEWEB)

    Sanderson, Robert [Los Alamos National Laboratory; Van De Sompel, Herbert [Los Alamos National Laboratory

    2010-01-01

    As Digital Libraries (DL) become more aligned with the web architecture, their functional components need to be fundamentally rethought in terms of URIs and HTTP. Annotation, a core scholarly activity enabled by many DL solutions, exhibits a clearly unacceptable characteristic when existing models are applied to the web: due to the representations of web resources changing over time, an annotation made about a web resource today may no longer be relevant to the representation that is served from that same resource tomorrow. We assume the existence of archived versions of resources, and combine the temporal features of the emerging Open Annotation data model with the capability offered by the Memento framework that allows seamless navigation from the URI of a resource to archived versions of that resource, and arrive at a solution that provides guarantees regarding the persistence of web annotations over time. More specifically, we provide theoretical solutions and proof-of-concept experimental evaluations for two problems: reconstructing an existing annotation so that the correct archived version is displayed for all resources involved in the annotation, and retrieving all annotations that involve a given archived version of a web resource.

  12. COGNATE: comparative gene annotation characterizer.

    Science.gov (United States)

    Wilbrandt, Jeanne; Misof, Bernhard; Niehuis, Oliver

    2017-07-17

    The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequences is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequence and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level using a minimum of parameters as well as an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy to use standard tool. We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow down-stream analyses and to provide an overview of the genome and gene repertoire characteristics. COGNATE is written in Perl and freely available at the ZFMK homepage ( https://www.zfmk.de/en/COGNATE ) and on github ( https

  13. Crowdsourcing and annotating NER for Twitter #drift

    DEFF Research Database (Denmark)

    Fromreide, Hege; Hovy, Dirk; Søgaard, Anders

    2014-01-01

    We present two new NER datasets for Twitter; a manually annotated set of 1,467 tweets (kappa=0.942) and a set of 2,975 expert-corrected, crowdsourced NER annotated tweets from the dataset described in Finin et al. (2010). In our experiments with these datasets, we observe two important points: (a......) language drift on Twitter is significant, and while off-the-shelf systems have been reported to perform well on in-sample data, they often perform poorly on new samples of tweets, (b) state-of-the-art performance across various datasets can beobtained from crowdsourced annotations, making it more feasible...

  14. DFAST: a flexible prokaryotic genome annotation pipeline for faster genome publication.

    Science.gov (United States)

    Tanizawa, Yasuhiro; Fujisawa, Takatomo; Nakamura, Yasukazu

    2018-03-15

    We developed a prokaryotic genome annotation pipeline, DFAST, that also supports genome submission to public sequence databases. DFAST was originally started as an on-line annotation server, and to date, over 7000 jobs have been processed since its first launch in 2016. Here, we present a newly implemented background annotation engine for DFAST, which is also available as a standalone command-line program. The new engine can annotate a typical-sized bacterial genome within 10 min, with rich information such as pseudogenes, translation exceptions and orthologous gene assignment between given reference genomes. In addition, the modular framework of DFAST allows users to customize the annotation workflow easily and will also facilitate extensions for new functions and incorporation of new tools in the future. The software is implemented in Python 3 and runs in both Python 2.7 and 3.4-on Macintosh and Linux systems. It is freely available at https://github.com/nigyta/dfast_core/under the GPLv3 license with external binaries bundled in the software distribution. An on-line version is also available at https://dfast.nig.ac.jp/. yn@nig.ac.jp. Supplementary data are available at Bioinformatics online.

  15. Annotations to quantum statistical mechanics

    CERN Document Server

    Kim, In-Gee

    2018-01-01

    This book is a rewritten and annotated version of Leo P. Kadanoff and Gordon Baym’s lectures that were presented in the book Quantum Statistical Mechanics: Green’s Function Methods in Equilibrium and Nonequilibrium Problems. The lectures were devoted to a discussion on the use of thermodynamic Green’s functions in describing the properties of many-particle systems. The functions provided a method for discussing finite-temperature problems with no more conceptual difficulty than ground-state problems, and the method was equally applicable to boson and fermion systems and equilibrium and nonequilibrium problems. The lectures also explained nonequilibrium statistical physics in a systematic way and contained essential concepts on statistical physics in terms of Green’s functions with sufficient and rigorous details. In-Gee Kim thoroughly studied the lectures during one of his research projects but found that the unspecialized method used to present them in the form of a book reduced their readability. He st...

  16. Meteor showers an annotated catalog

    CERN Document Server

    Kronk, Gary W

    2014-01-01

    Meteor showers are among the most spectacular celestial events that may be observed by the naked eye, and have been the object of fascination throughout human history. In “Meteor Showers: An Annotated Catalog,” the interested observer can access detailed research on over 100 annual and periodic meteor streams in order to capitalize on these majestic spectacles. Each meteor shower entry includes details of their discovery, important observations and orbits, and gives a full picture of duration, location in the sky, and expected hourly rates. Armed with a fuller understanding, the amateur observer can better view and appreciate the shower of their choice. The original book, published in 1988, has been updated with over 25 years of research in this new and improved edition. Almost every meteor shower study is expanded, with some original minor showers being dropped while new ones are added. The book also includes breakthroughs in the study of meteor showers, such as accurate predictions of outbursts as well ...

  17. Perceptual Robust Design

    DEFF Research Database (Denmark)

    Pedersen, Søren Nygaard

    The research presented in this PhD thesis has focused on a perceptual approach to robust design. The results of the research and the original contribution to knowledge is a preliminary framework for understanding, positioning, and applying perceptual robust design. Product quality is a topic...... been presented. Therefore, this study set out to contribute to the understanding and application of perceptual robust design. To achieve this, a state-of-the-art and current practice review was performed. From the review two main research problems were identified. Firstly, a lack of tools...... for perceptual robustness was found to overlap with the optimum for functional robustness and at most approximately 2.2% out of the 14.74% could be ascribed solely to the perceptual robustness optimisation. In conclusion, the thesis have offered a new perspective on robust design by merging robust design...

  18. The influence of annotation in graphical organizers

    NARCIS (Netherlands)

    Bezdan, Eniko; Kester, Liesbeth; Kirschner, Paul A.

    2013-01-01

    Bezdan, E., Kester, L., & Kirschner, P. A. (2012, 29-31 August). The influence of annotation in graphical organizers. Poster presented at the biannual meeting of the EARLI Special Interest Group Comprehension of Text and Graphics, Grenoble, France.

  19. An Informally Annotated Bibliography of Sociolinguistics.

    Science.gov (United States)

    Tannen, Deborah

    This annotated bibliography of sociolinguistics is divided into the following sections: speech events, ethnography of speaking and anthropological approaches to analysis of conversation; discourse analysis (including analysis of conversation and narrative), ethnomethodology and nonverbal communication; sociolinguistics; pragmatics (including…

  20. The Community Junior College: An Annotated Bibliography.

    Science.gov (United States)

    Rarig, Emory W., Jr., Ed.

    This annotated bibliography on the junior college is arranged by topic: research tools, history, functions and purposes, organization and administration, students, programs, personnel, facilities, and research. It covers publications through the fall of 1965 and has an author index. (HH)

  1. WormBase: Annotating many nematode genomes.

    Science.gov (United States)

    Howe, Kevin; Davis, Paul; Paulini, Michael; Tuli, Mary Ann; Williams, Gary; Yook, Karen; Durbin, Richard; Kersey, Paul; Sternberg, Paul W

    2012-01-01

    WormBase (www.wormbase.org) has been serving the scientific community for over 11 years as the central repository for genomic and genetic information for the soil nematode Caenorhabditis elegans. The resource has evolved from its beginnings as a database housing the genomic sequence and genetic and physical maps of a single species, and now represents the breadth and diversity of nematode research, currently serving genome sequence and annotation for around 20 nematodes. In this article, we focus on WormBase's role of genome sequence annotation, describing how we annotate and integrate data from a growing collection of nematode species and strains. We also review our approaches to sequence curation, and discuss the impact on annotation quality of large functional genomics projects such as modENCODE.

  2. Annotated Tsunami bibliography: 1962-1976

    International Nuclear Information System (INIS)

    Pararas-Carayannis, G.; Dong, B.; Farmer, R.

    1982-08-01

    This compilation contains annotated citations to nearly 3000 tsunami-related publications from 1962 to 1976 in English and several other languages. The foreign-language citations have English titles and abstracts

  3. GRADUATE AND PROFESSIONAL EDUCATION, AN ANNOTATED BIBLIOGRAPHY.

    Science.gov (United States)

    HEISS, ANN M.; AND OTHERS

    THIS ANNOTATED BIBLIOGRAPHY CONTAINS REFERENCES TO GENERAL GRADUATE EDUCATION AND TO EDUCATION FOR THE FOLLOWING PROFESSIONAL FIELDS--ARCHITECTURE, BUSINESS, CLINICAL PSYCHOLOGY, DENTISTRY, ENGINEERING, LAW, LIBRARY SCIENCE, MEDICINE, NURSING, SOCIAL WORK, TEACHING, AND THEOLOGY. (HW)

  4. Robust Statistical Face Frontalization

    NARCIS (Netherlands)

    Sagonas, Christos; Panagakis, Yannis; Zafeiriou, Stefanos; Pantic, Maja

    2015-01-01

    Recently, it has been shown that excellent results can be achieved in both facial landmark localization and pose-invariant face recognition. These breakthroughs are attributed to the efforts of the community to manually annotate facial images in many different poses and to collect 3D facial data. In

  5. Contributions to In Silico Genome Annotation

    KAUST Repository

    Kalkatawi, Manal M.

    2017-11-30

    Genome annotation is an important topic since it provides information for the foundation of downstream genomic and biological research. It is considered as a way of summarizing part of existing knowledge about the genomic characteristics of an organism. Annotating different regions of a genome sequence is known as structural annotation, while identifying functions of these regions is considered as a functional annotation. In silico approaches can facilitate both tasks that otherwise would be difficult and timeconsuming. This study contributes to genome annotation by introducing several novel bioinformatics methods, some based on machine learning (ML) approaches. First, we present Dragon PolyA Spotter (DPS), a method for accurate identification of the polyadenylation signals (PAS) within human genomic DNA sequences. For this, we derived a novel feature-set able to characterize properties of the genomic region surrounding the PAS, enabling development of high accuracy optimized ML predictive models. DPS considerably outperformed the state-of-the-art results. The second contribution concerns developing generic models for structural annotation, i.e., the recognition of different genomic signals and regions (GSR) within eukaryotic DNA. We developed DeepGSR, a systematic framework that facilitates generating ML models to predict GSR with high accuracy. To the best of our knowledge, no available generic and automated method exists for such task that could facilitate the studies of newly sequenced organisms. The prediction module of DeepGSR uses deep learning algorithms to derive highly abstract features that depend mainly on proper data representation and hyperparameters calibration. DeepGSR, which was evaluated on recognition of PAS and translation initiation sites (TIS) in different organisms, yields a simpler and more precise representation of the problem under study, compared to some other hand-tailored models, while producing high accuracy prediction results. Finally

  6. Fluid Annotations in a Open World

    DEFF Research Database (Denmark)

    Zellweger, Polle Trescott; Bouvin, Niels Olof; Jehøj, Henning

    2001-01-01

    Fluid Documents use animated typographical changes to provide a novel and appealing user experience for hypertext browsing and for viewing document annotations in context. This paper describes an effort to broaden the utility of Fluid Documents by using the open hypermedia Arakne Environment to l...... to layer fluid annotations and links on top of abitrary HTML pages on the World Wide Web. Changes to both Fluid Documents and Arakne are required....

  7. Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees.

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N; King, Benjamin L; Polson, Shawn W; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F; Page, Shallee T; Rendino, Marc Farnum; Thomas, William Kelley; Udwary, Daniel W; Wu, Cathy H

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome.

  8. Community annotation and bioinformatics workforce development in concert—Little Skate Genome Annotation Workshops and Jamborees

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N.; King, Benjamin L.; Polson, Shawn W.; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F.; Page, Shallee T.; Farnum Rendino, Marc; Thomas, William Kelley; Udwary, Daniel W.; Wu, Cathy H.

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome. PMID:22434832

  9. JGI Plant Genomics Gene Annotation Pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Shu, Shengqiang; Rokhsar, Dan; Goodstein, David; Hayes, David; Mitros, Therese

    2014-07-14

    Plant genomes vary in size and are highly complex with a high amount of repeats, genome duplication and tandem duplication. Gene encodes a wealth of information useful in studying organism and it is critical to have high quality and stable gene annotation. Thanks to advancement of sequencing technology, many plant species genomes have been sequenced and transcriptomes are also sequenced. To use these vastly large amounts of sequence data to make gene annotation or re-annotation in a timely fashion, an automatic pipeline is needed. JGI plant genomics gene annotation pipeline, called integrated gene call (IGC), is our effort toward this aim with aid of a RNA-seq transcriptome assembly pipeline. It utilizes several gene predictors based on homolog peptides and transcript ORFs. See Methods for detail. Here we present genome annotation of JGI flagship green plants produced by this pipeline plus Arabidopsis and rice except for chlamy which is done by a third party. The genome annotations of these species and others are used in our gene family build pipeline and accessible via JGI Phytozome portal whose URL and front page snapshot are shown below.

  10. Annotating the human genome with Disease Ontology

    Science.gov (United States)

    Osborne, John D; Flatow, Jared; Holko, Michelle; Lin, Simon M; Kibbe, Warren A; Zhu, Lihua (Julie); Danila, Maria I; Feng, Gang; Chisholm, Rex L

    2009-01-01

    Background The human genome has been extensively annotated with Gene Ontology for biological functions, but minimally computationally annotated for diseases. Results We used the Unified Medical Language System (UMLS) MetaMap Transfer tool (MMTx) to discover gene-disease relationships from the GeneRIF database. We utilized a comprehensive subset of UMLS, which is disease-focused and structured as a directed acyclic graph (the Disease Ontology), to filter and interpret results from MMTx. The results were validated against the Homayouni gene collection using recall and precision measurements. We compared our results with the widely used Online Mendelian Inheritance in Man (OMIM) annotations. Conclusion The validation data set suggests a 91% recall rate and 97% precision rate of disease annotation using GeneRIF, in contrast with a 22% recall and 98% precision using OMIM. Our thesaurus-based approach allows for comparisons to be made between disease containing databases and allows for increased accuracy in disease identification through synonym matching. The much higher recall rate of our approach demonstrates that annotating human genome with Disease Ontology and GeneRIF for diseases dramatically increases the coverage of the disease annotation of human genome. PMID:19594883

  11. Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger

    OpenAIRE

    Wright, James C.; Sugden, Deana; Francis-McIntyre, Sue; Riba Garcia, Isabel; Gaskell, Simon J.; Grigoriev, Igor V.; Baker, Scott E.; Beynon, Robert J.; Hubbard, Simon J.

    2009-01-01

    Abstract Background Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate over a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institutes (JGI) and another predicted protein set from another A.niger sequence. Tandem mass spectra (MS/MS) were ac...

  12. Annotated checklist of Albanian butterflies (Lepidoptera, Papilionoidea and Hesperioidea

    Directory of Open Access Journals (Sweden)

    Rudi Verovnik

    2013-08-01

    Full Text Available The Republic of Albania has a rich diversity of flora and fauna. However, due to its political isolation, it has never been studied in great depth, and consequently, the existing list of butterfly species is outdated and in need of radical amendment. In addition to our personal data, we have studied the available literature, and can report a total of 196 butterfly species recorded from the country. For some of the species in the list we have given explanations for their inclusion and made other annotations. Doubtful records have been removed from the list, and changes in taxonomy have been updated and discussed separately. The purpose of our paper is to remove confusion and conflict regarding published records. However, the revised checklist should not be considered complete: it represents a starting point for further research.

  13. Robustness of Structural Systems

    DEFF Research Database (Denmark)

    Canisius, T.D.G.; Sørensen, John Dalsgaard; Baker, J.W.

    2007-01-01

    The importance of robustness as a property of structural systems has been recognised following several structural failures, such as that at Ronan Point in 1968,where the consequenceswere deemed unacceptable relative to the initiating damage. A variety of research efforts in the past decades have...... attempted to quantify aspects of robustness such as redundancy and identify design principles that can improve robustness. This paper outlines the progress of recent work by the Joint Committee on Structural Safety (JCSS) to develop comprehensive guidance on assessing and providing robustness in structural...... systems. Guidance is provided regarding the assessment of robustness in a framework that considers potential hazards to the system, vulnerability of system components, and failure consequences. Several proposed methods for quantifying robustness are reviewed, and guidelines for robust design...

  14. Robust multivariate analysis

    CERN Document Server

    J Olive, David

    2017-01-01

    This text presents methods that are robust to the assumption of a multivariate normal distribution or methods that are robust to certain types of outliers. Instead of using exact theory based on the multivariate normal distribution, the simpler and more applicable large sample theory is given.  The text develops among the first practical robust regression and robust multivariate location and dispersion estimators backed by theory.   The robust techniques  are illustrated for methods such as principal component analysis, canonical correlation analysis, and factor analysis.  A simple way to bootstrap confidence regions is also provided. Much of the research on robust multivariate analysis in this book is being published for the first time. The text is suitable for a first course in Multivariate Statistical Analysis or a first course in Robust Statistics. This graduate text is also useful for people who are familiar with the traditional multivariate topics, but want to know more about handling data sets with...

  15. annot8r: GO, EC and KEGG annotation of EST datasets

    Directory of Open Access Journals (Sweden)

    Schmid Ralf

    2008-04-01

    Full Text Available Abstract Background The expressed sequence tag (EST methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets poses a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways. Results annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO, Enzyme Commission (EC and Kyoto Encyclopaedia of Genes and Genomes (KEGG annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and also in a relational postgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools. Conclusion annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non

  16. Discovering gene annotations in biomedical text databases

    Directory of Open Access Journals (Sweden)

    Ozsoyoglu Gultekin

    2008-03-01

    Full Text Available Abstract Background Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need forautomated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discoveredknowledge into Gene Ontology (GO concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns", and a semantic matching framework to locate phrases matching to a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN has reached to the precision level of 78% at therecall level of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion GEANN is useful for two distinct purposes: (i automating the annotation of genomic entities with Gene Ontology concepts, and (ii providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieve high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme with respect to "exactmatching" with the advantage of locating approximate

  17. Annotated chemical patent corpus: a gold standard for text mining.

    Directory of Open Access Journals (Sweden)

    Saber A Akhondi

    Full Text Available Exploring the chemical and biological space covered by patent applications is crucial in early-stage medicinal chemistry activities. Patent analysis can provide understanding of compound prior art, novelty checking, validation of biological assays, and identification of new starting points for chemical exploration. Extracting chemical and biological entities from patents through manual extraction by expert curators can take substantial amount of time and resources. Text mining methods can help to ease this process. To validate the performance of such methods, a manually annotated patent corpus is essential. In this study we have produced a large gold standard chemical patent corpus. We developed annotation guidelines and selected 200 full patents from the World Intellectual Property Organization, United States Patent and Trademark Office, and European Patent Office. The patents were pre-annotated automatically and made available to four independent annotator groups each consisting of two to ten annotators. The annotators marked chemicals in different subclasses, diseases, targets, and modes of action. Spelling mistakes and spurious line break due to optical character recognition errors were also annotated. A subset of 47 patents was annotated by at least three annotator groups, from which harmonized annotations and inter-annotator agreement scores were derived. One group annotated the full set. The patent corpus includes 400,125 annotations for the full set and 36,537 annotations for the harmonized set. All patents and annotated entities are publicly available at www.biosemantics.org.

  18. Semi-Semantic Annotation: A guideline for the URDU.KON-TB treebank POS annotation

    Directory of Open Access Journals (Sweden)

    Qaiser ABBAS

    2016-12-01

    Full Text Available This work elaborates the semi-semantic part of speech annotation guidelines for the URDU.KON-TB treebank: an annotated corpus. A hierarchical annotation scheme was designed to label the part of speech and then applied on the corpus. This raw corpus was collected from the Urdu Wikipedia and the Jang newspaper and then annotated with the proposed semi-semantic part of speech labels. The corpus contains text of local & international news, social stories, sports, culture, finance, religion, traveling, etc. This exercise finally contributed a part of speech annotation to the URDU.KON-TB treebank. Twenty-two main part of speech categories are divided into subcategories, which conclude the morphological, and semantical information encoded in it. This article reports the annotation guidelines in major; however, it also briefs the development of the URDU.KON-TB treebank, which includes the raw corpus collection, designing & employment of annotation scheme and finally, its statistical evaluation and results. The guidelines presented as follows, will be useful for linguistic community to annotate the sentences not only for the national language Urdu but for the other indigenous languages like Punjab, Sindhi, Pashto, etc., as well.

  19. MixtureTree annotator: a program for automatic colorization and visual annotation of MixtureTree.

    Directory of Open Access Journals (Sweden)

    Shu-Chuan Chen

    Full Text Available The MixtureTree Annotator, written in JAVA, allows the user to automatically color any phylogenetic tree in Newick format generated from any phylogeny reconstruction program and output the Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator provides a unique advantage over any other programs which perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. In order to visualize the resulting output file, a modified version of FigTree is used. Certain popular methods, which lack good built-in visualization tools, for example, MEGA, Mesquite, PHY-FI, TreeView, treeGraph and Geneious, may give results with human errors due to either manually adding colors to each node or with other limitations, for example only using color based on a number, such as branch length, or by taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy-to-use, while still allowing the user full control over the coloring and annotating process.

  20. Active learning reduces annotation time for clinical concept extraction.

    Science.gov (United States)

    Kholghi, Mahnoosh; Sitbon, Laurianne; Zuccon, Guido; Nguyen, Anthony

    2017-10-01

    To investigate: (1) the annotation time savings by various active learning query strategies compared to supervised learning and a random sampling baseline, and (2) the benefits of active learning-assisted pre-annotations in accelerating the manual annotation process compared to de novo annotation. There are 73 and 120 discharge summary reports provided by Beth Israel institute in the train and test sets of the concept extraction task in the i2b2/VA 2010 challenge, respectively. The 73 reports were used in user study experiments for manual annotation. First, all sequences within the 73 reports were manually annotated from scratch. Next, active learning models were built to generate pre-annotations for the sequences selected by a query strategy. The annotation/reviewing time per sequence was recorded. The 120 test reports were used to measure the effectiveness of the active learning models. When annotating from scratch, active learning reduced the annotation time up to 35% and 28% compared to a fully supervised approach and a random sampling baseline, respectively. Reviewing active learning-assisted pre-annotations resulted in 20% further reduction of the annotation time when compared to de novo annotation. The number of concepts that require manual annotation is a good indicator of the annotation time for various active learning approaches as demonstrated by high correlation between time rate and concept annotation rate. Active learning has a key role in reducing the time required to manually annotate domain concepts from clinical free text, either when annotating from scratch or reviewing active learning-assisted pre-annotations. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Robustness of Structures

    DEFF Research Database (Denmark)

    Faber, Michael Havbro; Vrouwenvelder, A.C.W.M.; Sørensen, John Dalsgaard

    2011-01-01

    In 2005, the Joint Committee on Structural Safety (JCSS) together with Working Commission (WC) 1 of the International Association of Bridge and Structural Engineering (IABSE) organized a workshop on robustness of structures. Two important decisions resulted from this workshop, namely...... ‘COST TU0601: Robustness of Structures’ was initiated in February 2007, aiming to provide a platform for exchanging and promoting research in the area of structural robustness and to provide a basic framework, together with methods, strategies and guidelines enhancing robustness of structures...... the development of a joint European project on structural robustness under the COST (European Cooperation in Science and Technology) programme and the decision to develop a more elaborate document on structural robustness in collaboration between experts from the JCSS and the IABSE. Accordingly, a project titled...

  2. MPEG-7 based video annotation and browsing

    Science.gov (United States)

    Hoeynck, Michael; Auweiler, Thorsten; Wellhausen, Jens

    2003-11-01

    The huge amount of multimedia data produced worldwide requires annotation in order to enable universal content access and to provide content-based search-and-retrieval functionalities. Since manual video annotation can be time consuming, automatic annotation systems are required. We review recent approaches to content-based indexing and annotation of videos for different kind of sports and describe our approach to automatic annotation of equestrian sports videos. We especially concentrate on MPEG-7 based feature extraction and content description, where we apply different visual descriptors for cut detection. Further, we extract the temporal positions of single obstacles on the course by analyzing MPEG-7 edge information. Having determined single shot positions as well as the visual highlights, the information is jointly stored with meta-textual information in an MPEG-7 description scheme. Based on this information, we generate content summaries which can be utilized in a user-interface in order to provide content-based access to the video stream, but further for media browsing on a streaming server.

  3. ACID: annotation of cassette and integron data

    Directory of Open Access Journals (Sweden)

    Stokes Harold W

    2009-04-01

    Full Text Available Abstract Background Although integrons and their associated gene cassettes are present in ~10% of bacteria and can represent up to 3% of the genome in which they are found, very few have been properly identified and annotated in public databases. These genetic elements have been overlooked in comparison to other vectors that facilitate lateral gene transfer between microorganisms. Description By automating the identification of integron integrase genes and of the non-coding cassette-associated attC recombination sites, we were able to assemble a database containing all publicly available sequence information regarding these genetic elements. Specialists manually curated the database and this information was used to improve the automated detection and annotation of integrons and their encoded gene cassettes. ACID (annotation of cassette and integron data can be searched using a range of queries and the data can be downloaded in a number of formats. Users can readily annotate their own data and integrate it into ACID using the tools provided. Conclusion ACID is a community resource providing easy access to annotations of integrons and making tools available to detect them in novel sequence data. ACID also hosts a forum to prompt integron-related discussion, which can hopefully lead to a more universal definition of this genetic element.

  4. Robust Growth Determinants

    OpenAIRE

    Doppelhofer, Gernot; Weeks, Melvyn

    2011-01-01

    This paper investigates the robustness of determinants of economic growth in the presence of model uncertainty, parameter heterogeneity and outliers. The robust model averaging approach introduced in the paper uses a flexible and parsi- monious mixture modeling that allows for fat-tailed errors compared to the normal benchmark case. Applying robust model averaging to growth determinants, the paper finds that eight out of eighteen variables found to be significantly related to economic growth ...

  5. Robust Programming by Example

    OpenAIRE

    Bishop , Matt; Elliott , Chip

    2011-01-01

    Part 2: WISE 7; International audience; Robust programming lies at the heart of the type of coding called “secure programming”. Yet it is rarely taught in academia. More commonly, the focus is on how to avoid creating well-known vulnerabilities. While important, that misses the point: a well-structured, robust program should anticipate where problems might arise and compensate for them. This paper discusses one view of robust programming and gives an example of how it may be taught.

  6. Annotating Logical Forms for EHR Questions.

    Science.gov (United States)

    Roberts, Kirk; Demner-Fushman, Dina

    2016-05-01

    This paper discusses the creation of a semantically annotated corpus of questions about patient data in electronic health records (EHRs). The goal is to provide the training data necessary for semantic parsers to automatically convert EHR questions into a structured query. A layered annotation strategy is used which mirrors a typical natural language processing (NLP) pipeline. First, questions are syntactically analyzed to identify multi-part questions. Second, medical concepts are recognized and normalized to a clinical ontology. Finally, logical forms are created using a lambda calculus representation. We use a corpus of 446 questions asking for patient-specific information. From these, 468 specific questions are found containing 259 unique medical concepts and requiring 53 unique predicates to represent the logical forms. We further present detailed characteristics of the corpus, including inter-annotator agreement results, and describe the challenges automatic NLP systems will face on this task.

  7. Motion lecture annotation system to learn Naginata performances

    Science.gov (United States)

    Kobayashi, Daisuke; Sakamoto, Ryota; Nomura, Yoshihiko

    2013-12-01

    This paper describes a learning assistant system using motion capture data and annotation to teach "Naginata-jutsu" (a skill to practice Japanese halberd) performance. There are some video annotation tools such as YouTube. However these video based tools have only single angle of view. Our approach that uses motion-captured data allows us to view any angle. A lecturer can write annotations related to parts of body. We have made a comparison of effectiveness between the annotation tool of YouTube and the proposed system. The experimental result showed that our system triggered more annotations than the annotation tool of YouTube.

  8. An Annotated Dataset of 14 Meat Images

    DEFF Research Database (Denmark)

    Stegmann, Mikkel Bille

    2002-01-01

    This note describes a dataset consisting of 14 annotated images of meat. Points of correspondence are placed on each image. As such, the dataset can be readily used for building statistical models of shape. Further, format specifications and terms of use are given.......This note describes a dataset consisting of 14 annotated images of meat. Points of correspondence are placed on each image. As such, the dataset can be readily used for building statistical models of shape. Further, format specifications and terms of use are given....

  9. Software for computing and annotating genomic ranges.

    Directory of Open Access Journals (Sweden)

    Michael Lawrence

    Full Text Available We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.

  10. Software for computing and annotating genomic ranges.

    Science.gov (United States)

    Lawrence, Michael; Huber, Wolfgang; Pagès, Hervé; Aboyoun, Patrick; Carlson, Marc; Gentleman, Robert; Morgan, Martin T; Carey, Vincent J

    2013-01-01

    We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.

  11. Robust procedures in chemometrics

    DEFF Research Database (Denmark)

    Kotwa, Ewelina

    properties of the analysed data. The broad theoretical background of robust procedures was given as a very useful supplement to the classical methods, and a new tool, based on robust PCA, aiming at identifying Rayleigh and Raman scatters in excitation-mission (EEM) data was developed. The results show...

  12. Solar Tutorial and Annotation Resource (STAR)

    Science.gov (United States)

    Showalter, C.; Rex, R.; Hurlburt, N. E.; Zita, E. J.

    2009-12-01

    We have written a software suite designed to facilitate solar data analysis by scientists, students, and the public, anticipating enormous datasets from future instruments. Our “STAR" suite includes an interactive learning section explaining 15 classes of solar events. Users learn software tools that exploit humans’ superior ability (over computers) to identify many events. Annotation tools include time slice generation to quantify loop oscillations, the interpolation of event shapes using natural cubic splines (for loops, sigmoids, and filaments) and closed cubic splines (for coronal holes). Learning these tools in an environment where examples are provided prepares new users to comfortably utilize annotation software with new data. Upon completion of our tutorial, users are presented with media of various solar events and asked to identify and annotate the images, to test their mastery of the system. Goals of the project include public input into the data analysis of very large datasets from future solar satellites, and increased public interest and knowledge about the Sun. In 2010, the Solar Dynamics Observatory (SDO) will be launched into orbit. SDO’s advancements in solar telescope technology will generate a terabyte per day of high-quality data, requiring innovation in data management. While major projects develop automated feature recognition software, so that computers can complete much of the initial event tagging and analysis, still, that software cannot annotate features such as sigmoids, coronal magnetic loops, coronal dimming, etc., due to large amounts of data concentrated in relatively small areas. Previously, solar physicists manually annotated these features, but with the imminent influx of data it is unrealistic to expect specialized researchers to examine every image that computers cannot fully process. A new approach is needed to efficiently process these data. Providing analysis tools and data access to students and the public have proven

  13. Desiderata for ontologies to be used in semantic annotation of biomedical documents.

    Science.gov (United States)

    Bada, Michael; Hunter, Lawrence

    2011-02-01

    A wealth of knowledge valuable to the translational research scientist is contained within the vast biomedical literature, but this knowledge is typically in the form of natural language. Sophisticated natural-language-processing systems are needed to translate text into unambiguous formal representations grounded in high-quality consensus ontologies, and these systems in turn rely on gold-standard corpora of annotated documents for training and testing. To this end, we are constructing the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-text biomedical journal articles that are being manually annotated with the entire sets of terms from select vocabularies, predominantly from the Open Biomedical Ontologies (OBO) library. Our efforts in building this corpus has illuminated infelicities of these ontologies with respect to the semantic annotation of biomedical documents, and we propose desiderata whose implementation could substantially improve their utility in this task; these include the integration of overlapping terms across OBOs, the resolution of OBO-specific ambiguities, the integration of the BFO with the OBOs and the use of mid-level ontologies, the inclusion of noncanonical instances, and the expansion of relations and realizable entities. Copyright © 2010 Elsevier Inc. All rights reserved.

  14. Legal Information Sources: An Annotated Bibliography.

    Science.gov (United States)

    Conner, Ronald C.

    This 25-page annotated bibliography describes the legal reference materials in the special collection of a medium-sized public library. Sources are listed in 12 categories: cases, dictionaries, directories, encyclopedias, forms, references for the lay person, general, indexes, laws and legislation, legal research aids, periodicals, and specialized…

  15. SNAD: sequence name annotation-based designer

    Directory of Open Access Journals (Sweden)

    Gorbalenya Alexander E

    2009-08-01

    Full Text Available Abstract Background A growing diversity of biological data is tagged with unique identifiers (UIDs associated with polynucleotides and proteins to ensure efficient computer-mediated data storage, maintenance, and processing. These identifiers, which are not informative for most people, are often substituted by biologically meaningful names in various presentations to facilitate utilization and dissemination of sequence-based knowledge. This substitution is commonly done manually that may be a tedious exercise prone to mistakes and omissions. Results Here we introduce SNAD (Sequence Name Annotation-based Designer that mediates automatic conversion of sequence UIDs (associated with multiple alignment or phylogenetic tree, or supplied as plain text list into biologically meaningful names and acronyms. This conversion is directed by precompiled or user-defined templates that exploit wealth of annotation available in cognate entries of external databases. Using examples, we demonstrate how this tool can be used to generate names for practical purposes, particularly in virology. Conclusion A tool for controllable annotation-based conversion of sequence UIDs into biologically meaningful names and acronyms has been developed and placed into service, fostering links between quality of sequence annotation, and efficiency of communication and knowledge dissemination among researchers.

  16. Just-in-time : on strategy annotations

    NARCIS (Netherlands)

    J.C. van de Pol (Jaco)

    2001-01-01

    textabstractA simple kind of strategy annotations is investigated, giving rise to a class of strategies, including leftmost-innermost. It is shown that under certain restrictions, an interpreter can be written which computes the normal form of a term in a bottom-up traversal. The main contribution

  17. Argumentation Theory. [A Selected Annotated Bibliography].

    Science.gov (United States)

    Benoit, William L.

    Materials dealing with aspects of argumentation theory are cited in this annotated bibliography. The 50 citations are organized by topic as follows: (1) argumentation; (2) the nature of argument; (3) traditional perspectives on argument; (4) argument diagrams; (5) Chaim Perelman's theory of rhetoric; (6) the evaluation of argument; (7) argument…

  18. Annotated Bibliography of EDGE2D Use

    Energy Technology Data Exchange (ETDEWEB)

    J.D. Strachan and G. Corrigan

    2005-06-24

    This annotated bibliography is intended to help EDGE2D users, and particularly new users, find existing published literature that has used EDGE2D. Our idea is that a person can find existing studies which may relate to his intended use, as well as gain ideas about other possible applications by scanning the attached tables.

  19. Nutrition & Adolescent Pregnancy: A Selected Annotated Bibliography.

    Science.gov (United States)

    National Agricultural Library (USDA), Washington, DC.

    This annotated bibliography on nutrition and adolescent pregnancy is intended to be a source of technical assistance for nurses, nutritionists, physicians, educators, social workers, and other personnel concerned with improving the health of teenage mothers and their babies. It is divided into two major sections. The first section lists selected…

  20. Great Basin Experimental Range: Annotated bibliography

    Science.gov (United States)

    E. Durant McArthur; Bryce A. Richardson; Stanley G. Kitchen

    2013-01-01

    This annotated bibliography documents the research that has been conducted on the Great Basin Experimental Range (GBER, also known as the Utah Experiment Station, Great Basin Station, the Great Basin Branch Experiment Station, Great Basin Experimental Center, and other similar name variants) over the 102 years of its existence. Entries were drawn from the original...

  1. Evaluating automatically annotated treebanks for linguistic research

    NARCIS (Netherlands)

    Bloem, J.; Bański, P.; Kupietz, M.; Lüngen, H.; Witt, A.; Barbaresi, A.; Biber, H.; Breiteneder, E.; Clematide, S.

    2016-01-01

    This study discusses evaluation methods for linguists to use when employing an automatically annotated treebank as a source of linguistic evidence. While treebanks are usually evaluated with a general measure over all the data, linguistic studies often focus on a particular construction or a group

  2. DIMA – Annotation guidelines for German intonation

    DEFF Research Database (Denmark)

    Kügler, Frank; Smolibocki, Bernadett; Arnold, Denis

    2015-01-01

    This paper presents newly developed guidelines for prosodic annotation of German as a consensus system agreed upon by German intonologists. The DIMA system is rooted in the framework of autosegmental-metrical phonology. One important goal of the consensus is to make exchanging data between groups...

  3. Annotated Bibliography of EDGE2D Use

    International Nuclear Information System (INIS)

    Strachan, J.D.; Corrigan, G.

    2005-01-01

    This annotated bibliography is intended to help EDGE2D users, and particularly new users, find existing published literature that has used EDGE2D. Our idea is that a person can find existing studies which may relate to his intended use, as well as gain ideas about other possible applications by scanning the attached tables

  4. Skin Cancer Education Materials: Selected Annotations.

    Science.gov (United States)

    National Cancer Inst. (NIH), Bethesda, MD.

    This annotated bibliography presents 85 entries on a variety of approaches to cancer education. The entries are grouped under three broad headings, two of which contain smaller sub-divisions. The first heading, Public Education, contains prevention and general information, and non-print materials. The second heading, Professional Education,…

  5. Book Reviews, Annotation, and Web Technology.

    Science.gov (United States)

    Schulze, Patricia

    From reading texts to annotating web pages, grade 6-8 students rely on group cooperation and individual reading and writing skills in this research project that spans six 50-minute lessons. Student objectives for this project are that they will: read, discuss, and keep a journal on a book in literature circles; understand the elements of and…

  6. Snap: an integrated SNP annotation platform

    DEFF Research Database (Denmark)

    Li, Shengting; Ma, Lijia; Li, Heng

    2007-01-01

    Snap (Single Nucleotide Polymorphism Annotation Platform) is a server designed to comprehensively analyze single genes and relationships between genes basing on SNPs in the human genome. The aim of the platform is to facilitate the study of SNP finding and analysis within the framework of medical...

  7. Annotating State of Mind in Meeting Data

    NARCIS (Netherlands)

    Heylen, Dirk K.J.; Reidsma, Dennis; Ordelman, Roeland J.F.; Devillers, L.; Martin, J-C.; Cowie, R.; Batliner, A.

    We discuss the annotation procedure for mental state and emotion that is under development for the AMI (Augmented Multiparty Interaction) corpus. The categories that were found to be most appropriate relate not only to emotions but also to (meta-)cognitive states and interpersonal variables. The

  8. ePNK Applications and Annotations

    DEFF Research Database (Denmark)

    Kindler, Ekkart

    2017-01-01

    newapplicationsfor the ePNK and, in particular, visualizing the result of an application in the graphical editor of the ePNK by singannotations, and interacting with the end user using these annotations. In this paper, we give an overview of the concepts of ePNK applications by discussing the implementation...

  9. Multiview Hessian regularization for image annotation.

    Science.gov (United States)

    Liu, Weifeng; Tao, Dacheng

    2013-07-01

    The rapid development of computer hardware and Internet technology makes large scale data dependent models computationally tractable, and opens a bright avenue for annotating images through innovative machine learning algorithms. Semisupervised learning (SSL) therefore received intensive attention in recent years and was successfully deployed in image annotation. One representative work in SSL is Laplacian regularization (LR), which smoothes the conditional distribution for classification along the manifold encoded in the graph Laplacian, however, it is observed that LR biases the classification function toward a constant function that possibly results in poor generalization. In addition, LR is developed to handle uniformly distributed data (or single-view data), although instances or objects, such as images and videos, are usually represented by multiview features, such as color, shape, and texture. In this paper, we present multiview Hessian regularization (mHR) to address the above two problems in LR-based image annotation. In particular, mHR optimally combines multiple HR, each of which is obtained from a particular view of instances, and steers the classification function that varies linearly along the data manifold. We apply mHR to kernel least squares and support vector machines as two examples for image annotation. Extensive experiments on the PASCAL VOC'07 dataset validate the effectiveness of mHR by comparing it with baseline algorithms, including LR and HR.

  10. Special Issue: Annotated Bibliography for Volumes XIX-XXXII.

    Science.gov (United States)

    Pullin, Richard A.

    1998-01-01

    This annotated bibliography lists 310 articles from the "Journal of Cooperative Education" from Volumes XIX-XXXII, 1983-1997. Annotations are presented in the order they appear in the journal; author and subject indexes are provided. (JOW)

  11. Computer systems for annotation of single molecule fragments

    Science.gov (United States)

    Schwartz, David Charles; Severin, Jessica

    2016-07-19

    There are provided computer systems for visualizing and annotating single molecule images. Annotation systems in accordance with this disclosure allow a user to mark and annotate single molecules of interest and their restriction enzyme cut sites thereby determining the restriction fragments of single nucleic acid molecules. The markings and annotations may be automatically generated by the system in certain embodiments and they may be overlaid translucently onto the single molecule images. An image caching system may be implemented in the computer annotation systems to reduce image processing time. The annotation systems include one or more connectors connecting to one or more databases capable of storing single molecule data as well as other biomedical data. Such diverse array of data can be retrieved and used to validate the markings and annotations. The annotation systems may be implemented and deployed over a computer network. They may be ergonomically optimized to facilitate user interactions.

  12. MEETING: Chlamydomonas Annotation Jamboree - October 2003

    Energy Technology Data Exchange (ETDEWEB)

    Grossman, Arthur R

    2007-04-13

    Shotgun sequencing of the nuclear genome of Chlamydomonas reinhardtii (Chlamydomonas throughout) was performed at an approximate 10X coverage by JGI. Roughly half of the genome is now contained on 26 scaffolds, all of which are at least 1.6 Mb, and the coverage of the genome is ~95%. There are now over 200,000 cDNA sequence reads that we have generated as part of the Chlamydomonas genome project (Grossman, 2003; Shrager et al., 2003; Grossman et al. 2007; Merchant et al., 2007); other sequences have also been generated by the Kasuza sequence group (Asamizu et al., 1999; Asamizu et al., 2000) or individual laboratories that have focused on specific genes. Shrager et al. (2003) placed the reads into distinct contigs (an assemblage of reads with overlapping nucleotide sequences), and contigs that group together as part of the same genes have been designated ACEs (assembly of contigs generated from EST information). All of the reads have also been mapped to the Chlamydomonas nuclear genome and the cDNAs and their corresponding genomic sequences have been reassembled, and the resulting assemblage is called an ACEG (an Assembly of contiguous EST sequences supported by genomic sequence) (Jain et al., 2007). Most of the unique genes or ACEGs are also represented by gene models that have been generated by the Joint Genome Institute (JGI, Walnut Creek, CA). These gene models have been placed onto the DNA scaffolds and are presented as a track on the Chlamydomonas genome browser associated with the genome portal (http://genome.jgi-psf.org/Chlre3/Chlre3.home.html). Ultimately, the meeting grant awarded by DOE has helped enormously in the development of an annotation pipeline (a set of guidelines used in the annotation of genes) and resulted in high quality annotation of over 4,000 genes; the annotators were from both Europe and the USA. Some of the people who led the annotation initiative were Arthur Grossman, Olivier Vallon, and Sabeeha Merchant (with many individual

  13. Robustness Beamforming Algorithms

    Directory of Open Access Journals (Sweden)

    Sajad Dehghani

    2014-04-01

    Full Text Available Adaptive beamforming methods are known to degrade in the presence of steering vector and covariance matrix uncertinity. In this paper, a new approach is presented to robust adaptive minimum variance distortionless response beamforming make robust against both uncertainties in steering vector and covariance matrix. This method minimize a optimization problem that contains a quadratic objective function and a quadratic constraint. The optimization problem is nonconvex but is converted to a convex optimization problem in this paper. It is solved by the interior-point method and optimum weight vector to robust beamforming is achieved.

  14. The CAPRICE RICH detector

    Energy Technology Data Exchange (ETDEWEB)

    Basini, G. [INFN, Laboratori Nazionali di Frascati, Rome (Italy); Codino, A.; Grimani, C. [Perugia Univ. (Italy)]|[INFN, Perugia (Italy); De Pascale, M.P. [Rome Univ. `Tor Vergata` (Italy). Dip. di Fisica]|[INFN, Sezione Univ. `Tor Vergata` Rome (Italy); Cafagna, F. [Bari Univ. (Italy)]|[INFN, Bari (Italy); Golden, R.L. [New Mexico State Univ., Las Cruces, NM (United States). Particle Astrophysics Lab.; Brancaccio, F.; Bocciolini, M. [Florence Univ. (Italy)]|[INFN, Florence (Italy); Barbiellini, G.; Boezio, M. [Trieste Univ. (Italy)]|[INFN, Trieste (Italy)

    1995-09-01

    A compact RICH detector has been developed and used for particle identification in a balloon borne spectrometer to measure the flux of antimatter in the cosmic radiation. This is the first RICH detector ever used in space experiments that is capable of detecting unit charged particles, such as antiprotons. The RICH and all other detectors performed well during the 27 hours long flight.

  15. BEACON: automated tool for Bacterial GEnome Annotation ComparisON.

    Science.gov (United States)

    Kalkatawi, Manal; Alam, Intikhab; Bajic, Vladimir B

    2015-08-18

    Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .

  16. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal M.

    2015-08-18

    Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27 %, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/

  17. Quick Pad Tagger : An Efficient Graphical User Interface for Building Annotated Corpora with Multiple Annotation Layers

    OpenAIRE

    Marc Schreiber; Kai Barkschat; Bodo Kraft; Albert Zundorf

    2015-01-01

    More and more domain specific applications in the internet make use of Natural Language Processing (NLP) tools (e. g. Information Extraction systems). The output quality of these applications relies on the output quality of the used NLP tools. Often, the quality can be increased by annotating a domain specific corpus. However, annotating a corpus is a time consuming and exhaustive task. To reduce the annota tion time we present...

  18. Robustness Metrics: Consolidating the multiple approaches to quantify Robustness

    DEFF Research Database (Denmark)

    Göhler, Simon Moritz; Eifler, Tobias; Howard, Thomas J.

    2016-01-01

    robustness metrics; 3) Functional expectancy and dispersion robustness metrics; and 4) Probability of conformance robustness metrics. The goal was to give a comprehensive overview of robustness metrics and guidance to scholars and practitioners to understand the different types of robustness metrics...

  19. Robustness of Structures

    DEFF Research Database (Denmark)

    Sørensen, John Dalsgaard

    2008-01-01

    This paper describes the background of the robustness requirements implemented in the Danish Code of Practice for Safety of Structures and in the Danish National Annex to the Eurocode 0, see (DS-INF 146, 2003), (DS 409, 2006), (EN 1990 DK NA, 2007) and (Sørensen and Christensen, 2006). More...... frequent use of advanced types of structures with limited redundancy and serious consequences in case of failure combined with increased requirements to efficiency in design and execution followed by increased risk of human errors has made the need of requirements to robustness of new structures essential....... According to Danish design rules robustness shall be documented for all structures in high consequence class. The design procedure to document sufficient robustness consists of: 1) Review of loads and possible failure modes / scenarios and determination of acceptable collapse extent; 2) Review...

  20. Robustness of structures

    DEFF Research Database (Denmark)

    Vrouwenvelder, T.; Sørensen, John Dalsgaard

    2009-01-01

    After the collapse of the World Trade Centre towers in 2001 and a number of collapses of structural systems in the beginning of the century, robustness of structural systems has gained renewed interest. Despite many significant theoretical, methodical and technological advances, structural...... of robustness for structural design such requirements are not substantiated in more detail, nor have the engineering profession been able to agree on an interpretation of robustness which facilitates for its uantification. A European COST action TU 601 on ‘Robustness of structures' has started in 2007...... by a group of members of the CSS. This paper describes the ongoing work in this action, with emphasis on the development of a theoretical and risk based quantification and optimization procedure on the one side and a practical pre-normative guideline on the other....

  1. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes

    Energy Technology Data Exchange (ETDEWEB)

    Brettin, Thomas; Davis, James J.; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Olsen, Gary J.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Thomason, James A.; Stevens, Rick; Vonstein, Veronika; Wattam, Alice R.; Xia, Fangfang

    2015-02-10

    The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.

  2. RASTtk: a modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes.

    Science.gov (United States)

    Brettin, Thomas; Davis, James J; Disz, Terry; Edwards, Robert A; Gerdes, Svetlana; Olsen, Gary J; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D; Shukla, Maulik; Thomason, James A; Stevens, Rick; Vonstein, Veronika; Wattam, Alice R; Xia, Fangfang

    2015-02-10

    The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.

  3. Robust Approaches to Forecasting

    OpenAIRE

    Jennifer Castle; David Hendry; Michael P. Clements

    2014-01-01

    We investigate alternative robust approaches to forecasting, using a new class of robust devices, contrasted with equilibrium correction models. Their forecasting properties are derived facing a range of likely empirical problems at the forecast origin, including measurement errors, implulses, omitted variables, unanticipated location shifts and incorrectly included variables that experience a shift. We derive the resulting forecast biases and error variances, and indicate when the methods ar...

  4. Robustness - theoretical framework

    DEFF Research Database (Denmark)

    Sørensen, John Dalsgaard; Rizzuto, Enrico; Faber, Michael H.

    2010-01-01

    More frequent use of advanced types of structures with limited redundancy and serious consequences in case of failure combined with increased requirements to efficiency in design and execution followed by increased risk of human errors has made the need of requirements to robustness of new struct...... of this fact sheet is to describe a theoretical and risk based framework to form the basis for quantification of robustness and for pre-normative guidelines....

  5. Qualitative Robustness in Estimation

    Directory of Open Access Journals (Sweden)

    Mohammed Nasser

    2012-07-01

    Full Text Available Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Times New Roman","serif";} Qualitative robustness, influence function, and breakdown point are three main concepts to judge an estimator from the viewpoint of robust estimation. It is important as well as interesting to study relation among them. This article attempts to present the concept of qualitative robustness as forwarded by first proponents and its later development. It illustrates intricacies of qualitative robustness and its relation with consistency, and also tries to remove commonly believed misunderstandings about relation between influence function and qualitative robustness citing some examples from literature and providing a new counter-example. At the end it places a useful finite and a simulated version of   qualitative robustness index (QRI. In order to assess the performance of the proposed measures, we have compared fifteen estimators of correlation coefficient using simulated as well as real data sets.

  6. Model and Interoperability using Meta Data Annotations

    Science.gov (United States)

    David, O.

    2011-12-01

    Software frameworks and architectures are in need for meta data to efficiently support model integration. Modelers have to know the context of a model, often stepping into modeling semantics and auxiliary information usually not provided in a concise structure and universal format, consumable by a range of (modeling) tools. XML often seems the obvious solution for capturing meta data, but its wide adoption to facilitate model interoperability is limited by XML schema fragmentation, complexity, and verbosity outside of a data-automation process. Ontologies seem to overcome those shortcomings, however the practical significance of their use remains to be demonstrated. OMS version 3 took a different approach for meta data representation. The fundamental building block of a modular model in OMS is a software component representing a single physical process, calibration method, or data access approach. Here, programing language features known as Annotations or Attributes were adopted. Within other (non-modeling) frameworks it has been observed that annotations lead to cleaner and leaner application code. Framework-supported model integration, traditionally accomplished using Application Programming Interfaces (API) calls is now achieved using descriptive code annotations. Fully annotated components for various hydrological and Ag-system models now provide information directly for (i) model assembly and building, (ii) data flow analysis for implicit multi-threading or visualization, (iii) automated and comprehensive model documentation of component dependencies, physical data properties, (iv) automated model and component testing, calibration, and optimization, and (v) automated audit-traceability to account for all model resources leading to a particular simulation result. Such a non-invasive methodology leads to models and modeling components with only minimal dependencies on the modeling framework but a strong reference to its originating code. Since models and

  7. Ontorat: automatic generation of new ontology terms, annotations, and axioms based on ontology design patterns.

    Science.gov (United States)

    Xiang, Zuoshuang; Zheng, Jie; Lin, Yu; He, Yongqun

    2015-01-01

    It is time-consuming to build an ontology with many terms and axioms. Thus it is desired to automate the process of ontology development. Ontology Design Patterns (ODPs) provide a reusable solution to solve a recurrent modeling problem in the context of ontology engineering. Because ontology terms often follow specific ODPs, the Ontology for Biomedical Investigations (OBI) developers proposed a Quick Term Templates (QTTs) process targeted at generating new ontology classes following the same pattern, using term templates in a spreadsheet format. Inspired by the ODPs and QTTs, the Ontorat web application is developed to automatically generate new ontology terms, annotations of terms, and logical axioms based on a specific ODP(s). The inputs of an Ontorat execution include axiom expression settings, an input data file, ID generation settings, and a target ontology (optional). The axiom expression settings can be saved as a predesigned Ontorat setting format text file for reuse. The input data file is generated based on a template file created by a specific ODP (text or Excel format). Ontorat is an efficient tool for ontology expansion. Different use cases are described. For example, Ontorat was applied to automatically generate over 1,000 Japan RIKEN cell line cell terms with both logical axioms and rich annotation axioms in the Cell Line Ontology (CLO). Approximately 800 licensed animal vaccines were represented and annotated in the Vaccine Ontology (VO) by Ontorat. The OBI team used Ontorat to add assay and device terms required by ENCODE project. Ontorat was also used to add missing annotations to all existing Biobank specific terms in the Biobank Ontology. A collection of ODPs and templates with examples are provided on the Ontorat website and can be reused to facilitate ontology development. With ever increasing ontology development and applications, Ontorat provides a timely platform for generating and annotating a large number of ontology terms by following

  8. Snpdat: Easy and rapid annotation of results from de novo snp discovery projects for model and non-model organisms

    Directory of Open Access Journals (Sweden)

    Doran Anthony G

    2013-02-01

    Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs are the most abundant genetic variant found in vertebrates and invertebrates. SNP discovery has become a highly automated, robust and relatively inexpensive process allowing the identification of many thousands of mutations for model and non-model organisms. Annotating large numbers of SNPs can be a difficult and complex process. Many tools available are optimised for use with organisms densely sampled for SNPs, such as humans. There are currently few tools available that are species non-specific or support non-model organism data. Results Here we present SNPdat, a high throughput analysis tool that can provide a comprehensive annotation of both novel and known SNPs for any organism with a draft sequence and annotation. Using a dataset of 4,566 SNPs identified in cattle using high-throughput DNA sequencing we demonstrate the annotations performed and the statistics that can be generated by SNPdat. Conclusions SNPdat provides users with a simple tool for annotation of genomes that are either not supported by other tools or have a small number of annotated SNPs available. SNPdat can also be used to analyse datasets from organisms which are densely sampled for SNPs. As a command line tool it can easily be incorporated into existing SNP discovery pipelines and fills a niche for analyses involving non-model organisms that are not supported by many available SNP annotation tools. SNPdat will be of great interest to scientists involved in SNP discovery and analysis projects, particularly those with limited bioinformatics experience.

  9. Spectral trees as a robust annotation tool in LC–MS based metabolomics

    NARCIS (Netherlands)

    Hooft, van der J.J.J.; Vervoort, J.J.M.; Bino, R.J.; Vos, de C.H.

    2012-01-01

    The identification of large series of metabolites detectable by mass spectrometry (MS) in crude extracts is a challenging task. In order to test and apply the so-called multistage mass spectrometry (MS n ) spectral tree approach as tool in metabolite identification in complex sample extracts, we

  10. Consumer energy research: an annotated bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, C.D.; McDougall, G.H.G.

    1980-01-01

    This document is an updated and expanded version of an earlier annotated bibliography by Dr. C. Dennis Anderson and Carman Cullen (A Review and Annotation of Energy Research on Consumers, March 1978). It is the final draft of the major report that will be published in English and French and made publicly available through the Consumer Research and Evaluation Branch of Consumer and Corporate Affairs, Canada. Two agencies granting permission to include some of their energy abstracts are the Rand Corporation and the DOE Technical Information Center. The bibliography consists mainly of empirical studies, including surveys and experiments. It also includes a number of descriptive and econometric studies that utilize secondary data. Many of the studies provide summaries of research is specific areas, and point out directions for future research efforts. 14 tables.

  11. Annotation of selection strengths in viral genomes

    DEFF Research Database (Denmark)

    McCauley, Stephen; de Groot, Saskia; Mailund, Thomas

    2007-01-01

    Motivation: Viral genomes tend to code in overlapping reading frames to maximize information content. This may result in atypical codon bias and particular evolutionary constraints. Due to the fast mutation rate of viruses, there is additional strong evidence for varying selection between intra......- and intergenomic regions. The presence of multiple coding regions complicates the concept of Ka/Ks ratio, and thus begs for an alternative approach when investigating selection strengths. Building on the paper by McCauley & Hein (2006), we develop a method for annotating a viral genome coding in overlapping...... may thus achieve an annotation both of coding regions as well as selection strengths, allowing us to investigate different selection patterns and hypotheses. Results: We illustrate our method by applying it to a multiple alignment of four HIV2 sequences, as well as four Hepatitis B sequences. We...

  12. Annotating functional RNAs in genomes using Infernal.

    Science.gov (United States)

    Nawrocki, Eric P

    2014-01-01

    Many different types of functional non-coding RNAs participate in a wide range of important cellular functions but the large majority of these RNAs are not routinely annotated in published genomes. Several programs have been developed for identifying RNAs, including specific tools tailored to a particular RNA family as well as more general ones designed to work for any family. Many of these tools utilize covariance models (CMs), statistical models of the conserved sequence, and structure of an RNA family. In this chapter, as an illustrative example, the Infernal software package and CMs from the Rfam database are used to identify RNAs in the genome of the archaeon Methanobrevibacter ruminantium, uncovering some additional RNAs not present in the genome's initial annotation. Analysis of the results and comparison with family-specific methods demonstrate some important strengths and weaknesses of this general approach.

  13. Cameras for Public Health Surveillance: A Methods Protocol for Crowdsourced Annotation of Point-of-Sale Photographs.

    Science.gov (United States)

    Ilakkuvan, Vinu; Tacelosky, Michael; Ivey, Keith C; Pearson, Jennifer L; Cantrell, Jennifer; Vallone, Donna M; Abrams, David B; Kirchner, Thomas R

    2014-04-09

    assess the usefulness of the tool. The proportion of individuals accurately identifying the presence of a specific advertisement was higher when provided with pictures of the product's logo and an example of the ad, and even higher when also provided the zoom tool (χ(2) 2=155.7, Pphotograph annotation and methodically assess the quality of raters' work. Overall, the combination of crowdsourcing technologies with tiered data flow and tools that enhance annotation quality represents a breakthrough solution to the problem of photograph annotation, vastly expanding opportunities for the use of photographs rich in public health and other data on a scale previously unimaginable.

  14. Deburring: an annotated bibliography. Volume V

    International Nuclear Information System (INIS)

    Gillespie, L.K.

    1978-01-01

    An annotated summary of 204 articles and publications on burrs, burr prevention and deburring is presented. Thirty-seven deburring processes are listed. Entries cited include English, Russian, French, Japanese and German language articles. Entries are indexed by deburring processes, author, and language. Indexes also indicate which references discuss equipment and tooling, how to use a process, economics, burr properties, and how to design to minimize burr problems. Research studies are identified as are the materials deburred

  15. Robustness in econometrics

    CERN Document Server

    Sriboonchitta, Songsak; Huynh, Van-Nam

    2017-01-01

    This book presents recent research on robustness in econometrics. Robust data processing techniques – i.e., techniques that yield results minimally affected by outliers – and their applications to real-life economic and financial situations are the main focus of this book. The book also discusses applications of more traditional statistical techniques to econometric problems. Econometrics is a branch of economics that uses mathematical (especially statistical) methods to analyze economic systems, to forecast economic and financial dynamics, and to develop strategies for achieving desirable economic performance. In day-by-day data, we often encounter outliers that do not reflect the long-term economic trends, e.g., unexpected and abrupt fluctuations. As such, it is important to develop robust data processing techniques that can accommodate these fluctuations.

  16. Robust Manufacturing Control

    CERN Document Server

    2013-01-01

    This contributed volume collects research papers, presented at the CIRP Sponsored Conference Robust Manufacturing Control: Innovative and Interdisciplinary Approaches for Global Networks (RoMaC 2012, Jacobs University, Bremen, Germany, June 18th-20th 2012). These research papers present the latest developments and new ideas focusing on robust manufacturing control for global networks. Today, Global Production Networks (i.e. the nexus of interconnected material and information flows through which products and services are manufactured, assembled and distributed) are confronted with and expected to adapt to: sudden and unpredictable large-scale changes of important parameters which are occurring more and more frequently, event propagation in networks with high degree of interconnectivity which leads to unforeseen fluctuations, and non-equilibrium states which increasingly characterize daily business. These multi-scale changes deeply influence logistic target achievement and call for robust planning and control ...

  17. Automatic Function Annotations for Hoare Logic

    Directory of Open Access Journals (Sweden)

    Daniel Matichuk

    2012-11-01

    Full Text Available In systems verification we are often concerned with multiple, inter-dependent properties that a program must satisfy. To prove that a program satisfies a given property, the correctness of intermediate states of the program must be characterized. However, this intermediate reasoning is not always phrased such that it can be easily re-used in the proofs of subsequent properties. We introduce a function annotation logic that extends Hoare logic in two important ways: (1 when proving that a function satisfies a Hoare triple, intermediate reasoning is automatically stored as function annotations, and (2 these function annotations can be exploited in future Hoare logic proofs. This reduces duplication of reasoning between the proofs of different properties, whilst serving as a drop-in replacement for traditional Hoare logic to avoid the costly process of proof refactoring. We explain how this was implemented in Isabelle/HOL and applied to an experimental branch of the seL4 microkernel to significantly reduce the size and complexity of existing proofs.

  18. Jannovar: a java library for exome annotation.

    Science.gov (United States)

    Jäger, Marten; Wang, Kai; Bauer, Sebastian; Smedley, Damian; Krawitz, Peter; Robinson, Peter N

    2014-05-01

    Transcript-based annotation and pedigree analysis are two basic steps in the computational analysis of whole-exome sequencing experiments in genetic diagnostics and disease-gene discovery projects. Here, we present Jannovar, a stand-alone Java application as well as a Java library designed to be used in larger software frameworks for exome and genome analysis. Jannovar uses an interval tree to identify all transcripts affected by a given variant, and provides Human Genome Variation Society-compliant annotations both for variants affecting coding sequences and splice junctions as well as untranslated regions and noncoding RNA transcripts. Jannovar can also perform family-based pedigree analysis with Variant Call Format (VCF) files with data from members of a family segregating a Mendelian disorder. Using a desktop computer, Jannovar requires a few seconds to annotate a typical VCF file with exome data. Jannovar is freely available under the BSD2 license. Source code as well as the Java application and library file can be downloaded from http://compbio.charite.de (with tutorial) and https://github.com/charite/jannovar. © 2014 WILEY PERIODICALS, INC.

  19. Annotating breast cancer microarray samples using ontologies

    Science.gov (United States)

    Liu, Hongfang; Li, Xin; Yoon, Victoria; Clarke, Robert

    2008-01-01

    As the most common cancer among women, breast cancer results from the accumulation of mutations in essential genes. Recent advance in high-throughput gene expression microarray technology has inspired researchers to use the technology to assist breast cancer diagnosis, prognosis, and treatment prediction. However, the high dimensionality of microarray experiments and public access of data from many experiments have caused inconsistencies which initiated the development of controlled terminologies and ontologies for annotating microarray experiments, such as the standard microarray Gene Expression Data (MGED) ontology (MO). In this paper, we developed BCM-CO, an ontology tailored specifically for indexing clinical annotations of breast cancer microarray samples from the NCI Thesaurus. Our research showed that the coverage of NCI Thesaurus is very limited with respect to i) terms used by researchers to describe breast cancer histology (covering 22 out of 48 histology terms); ii) breast cancer cell lines (covering one out of 12 cell lines); and iii) classes corresponding to the breast cancer grading and staging. By incorporating a wider range of those terms into BCM-CO, we were able to indexed breast cancer microarray samples from GEO using BCM-CO and MGED ontology and developed a prototype system with web interface that allows the retrieval of microarray data based on the ontology annotations. PMID:18999108

  20. Robust plasmonic substrates

    DEFF Research Database (Denmark)

    Kostiučenko, Oksana; Fiutowski, Jacek; Tamulevicius, Tomas

    2014-01-01

    Robustness is a key issue for the applications of plasmonic substrates such as tip-enhanced Raman spectroscopy, surface-enhanced spectroscopies, enhanced optical biosensing, optical and optoelectronic plasmonic nanosensors and others. A novel approach for the fabrication of robust plasmonic...... substrates is presented, which relies on the coverage of gold nanostructures with diamond-like carbon (DLC) thin films of thicknesses 25, 55 and 105 nm. DLC thin films were grown by direct hydrocarbon ion beam deposition. In order to find the optimum balance between optical and mechanical properties...

  1. Robust Self Tuning Controllers

    DEFF Research Database (Denmark)

    Poulsen, Niels Kjølstad

    1985-01-01

    The present thesis concerns robustness properties of adaptive controllers. It is addressed to methods for robustifying self tuning controllers with respect to abrupt changes in the plant parameters. In the thesis an algorithm for estimating abruptly changing parameters is presented. The estimator...... has several operation modes and a detector for controlling the mode. A special self tuning controller has been developed to regulate plant with changing time delay.......The present thesis concerns robustness properties of adaptive controllers. It is addressed to methods for robustifying self tuning controllers with respect to abrupt changes in the plant parameters. In the thesis an algorithm for estimating abruptly changing parameters is presented. The estimator...

  2. Research: Rags to Rags? Riches to Riches?

    Science.gov (United States)

    Bracey, Gerald W.

    2004-01-01

    Everyone has read about what might be called the "gold gap"--how the rich in this country are getting richer and controlling an ever-larger share of the nation's wealth. The Century Foundation has started publishing "Reality Check", a series of guides to campaign issues that sometimes finds gaps in these types of cherished delusions. The guides…

  3. Robust surgery loading

    NARCIS (Netherlands)

    Hans, Elias W.; Wullink, Gerhard; van Houdenhoven, Mark; Kazemier, Geert

    2008-01-01

    We consider the robust surgery loading problem for a hospital’s operating theatre department, which concerns assigning surgeries and sufficient planned slack to operating room days. The objective is to maximize capacity utilization and minimize the risk of overtime, and thus cancelled patients. This

  4. Robustness Envelopes of Networks

    NARCIS (Netherlands)

    Trajanovski, S.; Martín-Hernández, J.; Winterbach, W.; Van Mieghem, P.

    2013-01-01

    We study the robustness of networks under node removal, considering random node failure, as well as targeted node attacks based on network centrality measures. Whilst both of these have been studied in the literature, existing approaches tend to study random failure in terms of average-case

  5. An annotated checklist of Odonata (Insecta of Kanha Tiger Reserve and adjoining areas, central India

    Directory of Open Access Journals (Sweden)

    P.K. Sahoo

    2013-01-01

    Full Text Available Odonates were recorded from Kanha Tiger Reserve and its adjoining areas during January-December 2010. Thirty eight species were recorded belonging to seven families and 26 genera. Twelve species distribution is first time recorded from the reserve. With the addition of these newly recorded species with the previous records the species richness of the reserve increased up to 48 species, belonging to eight families. Among the collected Anisopterans Orthretum sabina sabina (Drury was the most abundant species. A detailed annotated checklist of recorded odonates with the previous records is presented in the Table.

  6. Evaluation of three automated genome annotations for Halorhabdus utahensis.

    Directory of Open Access Journals (Sweden)

    Peter Bakke

    2009-07-01

    Full Text Available Genome annotations are accumulating rapidly and depend heavily on automated annotation systems. Many genome centers offer annotation systems but no one has compared their output in a systematic way to determine accuracy and inherent errors. Errors in the annotations are routinely deposited in databases such as NCBI and used to validate subsequent annotation errors. We submitted the genome sequence of halophilic archaeon Halorhabdus utahensis to be analyzed by three genome annotation services. We have examined the output from each service in a variety of ways in order to compare the methodology and effectiveness of the annotations, as well as to explore the genes, pathways, and physiology of the previously unannotated genome. The annotation services differ considerably in gene calls, features, and ease of use. We had to manually identify the origin of replication and the species-specific consensus ribosome-binding site. Additionally, we conducted laboratory experiments to test H. utahensis growth and enzyme activity. Current annotation practices need to improve in order to more accurately reflect a genome's biological potential. We make specific recommendations that could improve the quality of microbial annotation projects.

  7. Sequencing and annotation of mitochondrial genomes from individual parasitic helminths.

    Science.gov (United States)

    Jex, Aaron R; Littlewood, D Timothy; Gasser, Robin B

    2015-01-01

    Mitochondrial (mt) genomics has significant implications in a range of fundamental areas of parasitology, including evolution, systematics, and population genetics as well as explorations of mt biochemistry, physiology, and function. Mt genomes also provide a rich source of markers to aid molecular epidemiological and ecological studies of key parasites. However, there is still a paucity of information on mt genomes for many metazoan organisms, particularly parasitic helminths, which has often related to challenges linked to sequencing from tiny amounts of material. The advent of next-generation sequencing (NGS) technologies has paved the way for low cost, high-throughput mt genomic research, but there have been obstacles, particularly in relation to post-sequencing assembly and analyses of large datasets. In this chapter, we describe protocols for the efficient amplification and sequencing of mt genomes from small portions of individual helminths, and highlight the utility of NGS platforms to expedite mt genomics. In addition, we recommend approaches for manual or semi-automated bioinformatic annotation and analyses to overcome the bioinformatic "bottleneck" to research in this area. Taken together, these approaches have demonstrated applicability to a range of parasites and provide prospects for using complete mt genomic sequence datasets for large-scale molecular systematic and epidemiological studies. In addition, these methods have broader utility and might be readily adapted to a range of other medium-sized molecular regions (i.e., 10-100 kb), including large genomic operons, and other organellar (e.g., plastid) and viral genomes.

  8. Kings Today, Rich Tomorrow

    DEFF Research Database (Denmark)

    Fattoum, Asma

    2013-01-01

    This study investigates the King vs. Rich dilemma that founder-CEOs face at IPO. When undertaking IPO, founders face two options. They can either get rich, but then run the risk of losing the control over their firms; or they can remain kings by introducing defensive mechanisms, but this is likel...

  9. Developments on RICH detectors

    International Nuclear Information System (INIS)

    Besson, P.; Bourgeois, P.

    1996-01-01

    The RICH (ring imaging Cherenkov) detector which is dedicated to Cherenkov radiation detection is described. An improvement made by replacing photo sensible vapor with solid photocathode is studied. A RICH detector prototype with a CsI photocathode has been built in Saclay and used with Saturne. The first results are presented. (A.C.)

  10. Robust holographic storage system design.

    Science.gov (United States)

    Watanabe, Takahiro; Watanabe, Minoru

    2011-11-21

    Demand is increasing daily for large data storage systems that are useful for applications in spacecraft, space satellites, and space robots, which are all exposed to radiation-rich space environment. As candidates for use in space embedded systems, holographic storage systems are promising because they can easily provided the demanded large-storage capability. Particularly, holographic storage systems, which have no rotation mechanism, are demanded because they are virtually maintenance-free. Although a holographic memory itself is an extremely robust device even in a space radiation environment, its associated lasers and drive circuit devices are vulnerable. Such vulnerabilities sometimes engendered severe problems that prevent reading of all contents of the holographic memory, which is a turn-off failure mode of a laser array. This paper therefore presents a proposal for a recovery method for the turn-off failure mode of a laser array on a holographic storage system, and describes results of an experimental demonstration. © 2011 Optical Society of America

  11. Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger.

    Science.gov (United States)

    Wright, James C; Sugden, Deana; Francis-McIntyre, Sue; Riba-Garcia, Isabel; Gaskell, Simon J; Grigoriev, Igor V; Baker, Scott E; Beynon, Robert J; Hubbard, Simon J

    2009-02-04

    Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate over a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institutes (JGI) and another predicted protein set from another A.niger sequence. Tandem mass spectra (MS/MS) were acquired from 1d gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS) and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR). 405 identified peptide sequences were mapped to 214 different A.niger genomic loci to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6%) of these loci either had no preferred predicted gene model or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines much as expressed sequence tag (EST) data has been. A comparison of the published genome from another strain of A.niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method.

  12. Plann: A command-line application for annotating plastome sequences.

    Science.gov (United States)

    Huang, Daisie I; Cronk, Quentin C B

    2015-08-01

    Plann automates the process of annotating a plastome sequence in GenBank format for either downstream processing or for GenBank submission by annotating a new plastome based on a similar, well-annotated plastome. Plann is a Perl script to be executed on the command line. Plann compares a new plastome sequence to the features annotated in a reference plastome and then shifts the intervals of any matching features to the locations in the new plastome. Plann's output can be used in the National Center for Biotechnology Information's tbl2asn to create a Sequin file for GenBank submission. Unlike Web-based annotation packages, Plann is a locally executable script that will accurately annotate a plastome sequence to a locally specified reference plastome. Because it executes from the command line, it is ready to use in other software pipelines and can be easily rerun as a draft plastome is improved.

  13. A robust classic.

    Science.gov (United States)

    Kutzner, Florian; Vogel, Tobias; Freytag, Peter; Fiedler, Klaus

    2011-01-01

    In the present research, we argue for the robustness of illusory correlations (ICs, Hamilton & Gifford, 1976) regarding two boundary conditions suggested in previous research. First, we argue that ICs are maintained under extended experience. Using simulations, we derive conflicting predictions. Whereas noise-based accounts predict ICs to be maintained (Fielder, 2000; Smith, 1991), a prominent account based on discrepancy-reducing feedback learning predicts ICs to disappear (Van Rooy et al., 2003). An experiment involving 320 observations with majority and minority members supports the claim that ICs are maintained. Second, we show that actively using the stereotype to make predictions that are met with reward and punishment does not eliminate the bias. In addition, participants' operant reactions afford a novel online measure of ICs. In sum, our findings highlight the robustness of ICs that can be explained as a result of unbiased but noisy learning.

  14. A Robust Geometric Model for Argument Classification

    Science.gov (United States)

    Giannone, Cristina; Croce, Danilo; Basili, Roberto; de Cao, Diego

    Argument classification is the task of assigning semantic roles to syntactic structures in natural language sentences. Supervised learning techniques for frame semantics have been recently shown to benefit from rich sets of syntactic features. However argument classification is also highly dependent on the semantics of the involved lexicals. Empirical studies have shown that domain dependence of lexical information causes large performance drops in outside domain tests. In this paper a distributional approach is proposed to improve the robustness of the learning model against out-of-domain lexical phenomena.

  15. Annotation of mammalian primary microRNAs

    Directory of Open Access Journals (Sweden)

    Enright Anton J

    2008-11-01

    Full Text Available Abstract Background MicroRNAs (miRNAs are important regulators of gene expression and have been implicated in development, differentiation and pathogenesis. Hundreds of miRNAs have been discovered in mammalian genomes. Approximately 50% of mammalian miRNAs are expressed from introns of protein-coding genes; the primary transcript (pri-miRNA is therefore assumed to be the host transcript. However, very little is known about the structure of pri-miRNAs expressed from intergenic regions. Here we annotate transcript boundaries of miRNAs in human, mouse and rat genomes using various transcription features. The 5' end of the pri-miRNA is predicted from transcription start sites, CpG islands and 5' CAGE tags mapped in the upstream flanking region surrounding the precursor miRNA (pre-miRNA. The 3' end of the pri-miRNA is predicted based on the mapping of polyA signals, and supported by cDNA/EST and ditags data. The predicted pri-miRNAs are also analyzed for promoter and insulator-associated regulatory regions. Results We define sets of conserved and non-conserved human, mouse and rat pre-miRNAs using bidirectional BLAST and synteny analysis. Transcription features in their flanking regions are used to demarcate the 5' and 3' boundaries of the pri-miRNAs. The lengths and boundaries of primary transcripts are highly conserved between orthologous miRNAs. A significant fraction of pri-miRNAs have lengths between 1 and 10 kb, with very few introns. We annotate a total of 59 pri-miRNA structures, which include 82 pre-miRNAs. 36 pri-miRNAs are conserved in all 3 species. In total, 18 of the confidently annotated transcripts express more than one pre-miRNA. The upstream regions of 54% of the predicted pri-miRNAs are found to be associated with promoter and insulator regulatory sequences. Conclusion Little is known about the primary transcripts of intergenic miRNAs. Using comparative data, we are able to identify the boundaries of a significant proportion of

  16. Annotated bibliography of Software Engineering Laboratory literature

    Science.gov (United States)

    Morusiewicz, Linda; Valett, Jon D.

    1991-01-01

    An annotated bibliography of technical papers, documents, and memorandums produced by or related to the Software Engineering Laboratory is given. More than 100 publications are summarized. These publications cover many areas of software engineering and range from research reports to software documentation. All materials have been grouped into eight general subject areas for easy reference: The Software Engineering Laboratory; The Software Engineering Laboratory: Software Development Documents; Software Tools; Software Models; Software Measurement; Technology Evaluations; Ada Technology; and Data Collection. Subject and author indexes further classify these documents by specific topic and individual author.

  17. Robust Airline Schedules

    OpenAIRE

    Eggenberg, Niklaus; Salani, Matteo; Bierlaire, Michel

    2010-01-01

    Due to economic pressure industries, when planning, tend to focus on optimizing the expected profit or the yield. The consequence of highly optimized solutions is an increased sensitivity to uncertainty. This generates additional "operational" costs, incurred by possible modifications of the original plan to be performed when reality does not reflect what was expected in the planning phase. The modern research trend focuses on "robustness" of solutions instead of yield or profit. Although ro...

  18. The Crane Robust Control

    Directory of Open Access Journals (Sweden)

    Marek Hicar

    2004-01-01

    Full Text Available The article is about a control design for complete structure of the crane: crab, bridge and crane uplift.The most important unknown parameters for simulations are burden weight and length of hanging rope. We will use robustcontrol for crab and bridge control to ensure adaptivity for burden weight and rope length. Robust control will be designed for current control of the crab and bridge, necessary is to know the range of unknown parameters. Whole robust will be splitto subintervals and after correct identification of unknown parameters the most suitable robust controllers will be chosen.The most important condition at the crab and bridge motion is avoiding from burden swinging in the final position. Crab and bridge drive is designed by asynchronous motor fed from frequency converter. We will use crane uplift with burden weightobserver in combination for uplift, crab and bridge drive with cooperation of their parameters: burden weight, rope length and crab and bridge position. Controllers are designed by state control method. We will use preferably a disturbance observerwhich will identify burden weight as a disturbance. The system will be working in both modes at empty hook as well asat maximum load: burden uplifting and dropping down.

  19. Protein sequence annotation in the genome era: the annotation concept of SWISS-PROT+TREMBL.

    Science.gov (United States)

    Apweiler, R; Gateau, A; Contrino, S; Martin, M J; Junker, V; O'Donovan, C; Lang, F; Mitaritonna, N; Kappus, S; Bairoch, A

    1997-01-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation, a minimal level of redundancy and high level of integration with other databases. Ongoing genome sequencing projects have dramatically increased the number of protein sequences to be incorporated into SWISS-PROT. Since we do not want to dilute the quality standards of SWISS-PROT by incorporating sequences without proper sequence analysis and annotation, we cannot speed up the incorporation of new incoming data indefinitely. However, as we also want to make the sequences available as fast as possible, we introduced TREMBL (TRanslation of EMBL nucleotide sequence database), a supplement to SWISS-PROT. TREMBL consists of computer-annotated entries in SWISS-PROT format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except for CDS already included in SWISS-PROT. While TREMBL is already of immense value, its computer-generated annotation does not match the quality of SWISS-PROTs. The main difference is in the protein functional information attached to sequences. With this in mind, we are dedicating substantial effort to develop and apply computer methods to enhance the functional information attached to TREMBL entries.

  20. IMPROVEMENT OF THE RICHNESS ESTIMATES OF maxBCG CLUSTERS

    International Nuclear Information System (INIS)

    Rozo, Eduardo; Rykoff, Eli S.; Koester, Benjamin P.; Hansen, Sarah; Becker, Matthew; Bleem, Lindsey; McKay, Timothy; Hao Jiangang; Evrard, August; Wechsler, Risa H.; Sheldon, Erin; Johnston, David; Annis, James; Scranton, Ryan

    2009-01-01

    Minimizing the scatter between cluster mass and accessible observables is an important goal for cluster cosmology. In this work, we introduce a new matched filter richness estimator, and test its performance using the maxBCG cluster catalog. Our new estimator significantly reduces the variance in the L X -richness relation, from σ lnLx 2 = (0.86±0.02) 2 to σ lnLx 2 = (0.69±0.02) 2 . Relative to the maxBCG richness estimate, it also removes the strong redshift dependence of the L X -richness scaling relations, and is significantly more robust to photometric and redshift errors. These improvements are largely due to the better treatment of galaxy color data. We also demonstrate the scatter in the L X -richness relation depends on the aperture used to estimate cluster richness, and introduce a novel approach for optimizing said aperture which can easily be generalized to other mass tracers.

  1. A Novel Approach to Semantic and Coreference Annotation at LLNL

    Energy Technology Data Exchange (ETDEWEB)

    Firpo, M

    2005-02-04

    A case is made for the importance of high quality semantic and coreference annotation. The challenges of providing such annotation are described. Asperger's Syndrome is introduced, and the connections are drawn between the needs of text annotation and the abilities of persons with Asperger's Syndrome to meet those needs. Finally, a pilot program is recommended wherein semantic annotation is performed by people with Asperger's Syndrome. The primary points embodied in this paper are as follows: (1) Document annotation is essential to the Natural Language Processing (NLP) projects at Lawrence Livermore National Laboratory (LLNL); (2) LLNL does not currently have a system in place to meet its need for text annotation; (3) Text annotation is challenging for a variety of reasons, many related to its very rote nature; (4) Persons with Asperger's Syndrome are particularly skilled at rote verbal tasks, and behavioral experts agree that they would excel at text annotation; and (6) A pilot study is recommend in which two to three people with Asperger's Syndrome annotate documents and then the quality and throughput of their work is evaluated relative to that of their neuro-typical peers.

  2. Review of actinide-sediment reactions with an annotated bibliography

    Energy Technology Data Exchange (ETDEWEB)

    Ames, L.L.; Rai, D.; Serne, R.J.

    1976-02-10

    The annotated bibliography is divided into sections on chemistry and geochemistry, migration and accumulation, cultural distributions, natural distributions, and bibliographies and annual reviews. (LK)

  3. Correction of the Caulobacter crescentus NA1000 genome annotation.

    Directory of Open Access Journals (Sweden)

    Bert Ely

    Full Text Available Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small, but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome utilizing computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. These results demonstrate that a comparison of the location of peaks third codon position GC content to the location of protein coding regions could be used to verify the annotation of any genome that has a GC content that is greater than 60%.

  4. Annotating non-coding regions of the genome.

    Science.gov (United States)

    Alexander, Roger P; Fang, Gang; Rozowsky, Joel; Snyder, Michael; Gerstein, Mark B

    2010-08-01

    Most of the human genome consists of non-protein-coding DNA. Recently, progress has been made in annotating these non-coding regions through the interpretation of functional genomics experiments and comparative sequence analysis. One can conceptualize functional genomics analysis as involving a sequence of steps: turning the output of an experiment into a 'signal' at each base pair of the genome; smoothing this signal and segmenting it into small blocks of initial annotation; and then clustering these small blocks into larger derived annotations and networks. Finally, one can relate functional genomics annotations to conserved units and measures of conservation derived from comparative sequence analysis.

  5. Revisiting Robustness and Evolvability: Evolution in Weighted Genotype Spaces

    Science.gov (United States)

    Partha, Raghavendran; Raman, Karthik

    2014-01-01

    Robustness and evolvability are highly intertwined properties of biological systems. The relationship between these properties determines how biological systems are able to withstand mutations and show variation in response to them. Computational studies have explored the relationship between these two properties using neutral networks of RNA sequences (genotype) and their secondary structures (phenotype) as a model system. However, these studies have assumed every mutation to a sequence to be equally likely; the differences in the likelihood of the occurrence of various mutations, and the consequence of probabilistic nature of the mutations in such a system have previously been ignored. Associating probabilities to mutations essentially results in the weighting of genotype space. We here perform a comparative analysis of weighted and unweighted neutral networks of RNA sequences, and subsequently explore the relationship between robustness and evolvability. We show that assuming an equal likelihood for all mutations (as in an unweighted network), underestimates robustness and overestimates evolvability of a system. In spite of discarding this assumption, we observe that a negative correlation between sequence (genotype) robustness and sequence evolvability persists, and also that structure (phenotype) robustness promotes structure evolvability, as observed in earlier studies using unweighted networks. We also study the effects of base composition bias on robustness and evolvability. Particularly, we explore the association between robustness and evolvability in a sequence space that is AU-rich – sequences with an AU content of 80% or higher, compared to a normal (unbiased) sequence space. We find that evolvability of both sequences and structures in an AU-rich space is lesser compared to the normal space, and robustness higher. We also observe that AU-rich populations evolving on neutral networks of phenotypes, can access less phenotypic variation compared to

  6. Training nuclei detection algorithms with simple annotations

    Directory of Open Access Journals (Sweden)

    Henning Kost

    2017-01-01

    Full Text Available Background: Generating good training datasets is essential for machine learning-based nuclei detection methods. However, creating exhaustive nuclei contour annotations, to derive optimal training data from, is often infeasible. Methods: We compared different approaches for training nuclei detection methods solely based on nucleus center markers. Such markers contain less accurate information, especially with regard to nuclear boundaries, but can be produced much easier and in greater quantities. The approaches use different automated sample extraction methods to derive image positions and class labels from nucleus center markers. In addition, the approaches use different automated sample selection methods to improve the detection quality of the classification algorithm and reduce the run time of the training process. We evaluated the approaches based on a previously published generic nuclei detection algorithm and a set of Ki-67-stained breast cancer images. Results: A Voronoi tessellation-based sample extraction method produced the best performing training sets. However, subsampling of the extracted training samples was crucial. Even simple class balancing improved the detection quality considerably. The incorporation of active learning led to a further increase in detection quality. Conclusions: With appropriate sample extraction and selection methods, nuclei detection algorithms trained on the basis of simple center marker annotations can produce comparable quality to algorithms trained on conventionally created training sets.

  7. Annotate-it: a Swiss-knife approach to annotation, analysis and interpretation of single nucleotide variation in human disease.

    Science.gov (United States)

    Sifrim, Alejandro; Van Houdt, Jeroen Kj; Tranchevent, Leon-Charles; Nowakowska, Beata; Sakai, Ryo; Pavlopoulos, Georgios A; Devriendt, Koen; Vermeesch, Joris R; Moreau, Yves; Aerts, Jan

    2012-01-01

    The increasing size and complexity of exome/genome sequencing data requires new tools for clinical geneticists to discover disease-causing variants. Bottlenecks in identifying the causative variation include poor cross-sample querying, constantly changing functional annotation and not considering existing knowledge concerning the phenotype. We describe a methodology that facilitates exploration of patient sequencing data towards identification of causal variants under different genetic hypotheses. Annotate-it facilitates handling, analysis and interpretation of high-throughput single nucleotide variant data. We demonstrate our strategy using three case studies. Annotate-it is freely available and test data are accessible to all users at http://www.annotate-it.org.

  8. Robust efficient video fingerprinting

    Science.gov (United States)

    Puri, Manika; Lubin, Jeffrey

    2009-02-01

    We have developed a video fingerprinting system with robustness and efficiency as the primary and secondary design criteria. In extensive testing, the system has shown robustness to cropping, letter-boxing, sub-titling, blur, drastic compression, frame rate changes, size changes and color changes, as well as to the geometric distortions often associated with camcorder capture in cinema settings. Efficiency is afforded by a novel two-stage detection process in which a fast matching process first computes a number of likely candidates, which are then passed to a second slower process that computes the overall best match with minimal false alarm probability. One key component of the algorithm is a maximally stable volume computation - a three-dimensional generalization of maximally stable extremal regions - that provides a content-centric coordinate system for subsequent hash function computation, independent of any affine transformation or extensive cropping. Other key features include an efficient bin-based polling strategy for initial candidate selection, and a final SIFT feature-based computation for final verification. We describe the algorithm and its performance, and then discuss additional modifications that can provide further improvement to efficiency and accuracy.

  9. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal M.; Alam, Intikhab; Bajic, Vladimir B.

    2015-01-01

    We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/

  10. Prepare-Participate-Connect: Active Learning with Video Annotation

    Science.gov (United States)

    Colasante, Meg; Douglas, Kathy

    2016-01-01

    Annotation of video provides students with the opportunity to view and engage with audiovisual content in an interactive and participatory way rather than in passive-receptive mode. This article discusses research into the use of video annotation in four vocational programs at RMIT University in Melbourne, which allowed students to interact with…

  11. The GATO gene annotation tool for research laboratories

    Directory of Open Access Journals (Sweden)

    A. Fujita

    2005-11-01

    Full Text Available Large-scale genome projects have generated a rapidly increasing number of DNA sequences. Therefore, development of computational methods to rapidly analyze these sequences is essential for progress in genomic research. Here we present an automatic annotation system for preliminary analysis of DNA sequences. The gene annotation tool (GATO is a Bioinformatics pipeline designed to facilitate routine functional annotation and easy access to annotated genes. It was designed in view of the frequent need of genomic researchers to access data pertaining to a common set of genes. In the GATO system, annotation is generated by querying some of the Web-accessible resources and the information is stored in a local database, which keeps a record of all previous annotation results. GATO may be accessed from everywhere through the internet or may be run locally if a large number of sequences are going to be annotated. It is implemented in PHP and Perl and may be run on any suitable Web server. Usually, installation and application of annotation systems require experience and are time consuming, but GATO is simple and practical, allowing anyone with basic skills in informatics to access it without any special training. GATO can be downloaded at [http://mariwork.iq.usp.br/gato/]. Minimum computer free space required is 2 MB.

  12. A Selected Annotated Bibliography on Work Time Options.

    Science.gov (United States)

    Ivantcho, Barbara

    This annotated bibliography is divided into three sections. Section I contains annotations of general publications on work time options. Section II presents resources on flexitime and the compressed work week. In Section III are found resources related to these reduced work time options: permanent part-time employment, job sharing, voluntary…

  13. Propagating annotations of molecular networks using in silico fragmentation.

    Science.gov (United States)

    da Silva, Ricardo R; Wang, Mingxun; Nothias, Louis-Félix; van der Hooft, Justin J J; Caraballo-Rodríguez, Andrés Mauricio; Fox, Evan; Balunas, Marcy J; Klassen, Jonathan L; Lopes, Norberto Peporine; Dorrestein, Pieter C

    2018-04-18

    The annotation of small molecules is one of the most challenging and important steps in untargeted mass spectrometry analysis, as most of our biological interpretations rely on structural annotations. Molecular networking has emerged as a structured way to organize and mine data from untargeted tandem mass spectrometry (MS/MS) experiments and has been widely applied to propagate annotations. However, propagation is done through manual inspection of MS/MS spectra connected in the spectral networks and is only possible when a reference library spectrum is available. One of the alternative approaches used to annotate an unknown fragmentation mass spectrum is through the use of in silico predictions. One of the challenges of in silico annotation is the uncertainty around the correct structure among the predicted candidate lists. Here we show how molecular networking can be used to improve the accuracy of in silico predictions through propagation of structural annotations, even when there is no match to a MS/MS spectrum in spectral libraries. This is accomplished through creating a network consensus of re-ranked structural candidates using the molecular network topology and structural similarity to improve in silico annotations. The Network Annotation Propagation (NAP) tool is accessible through the GNPS web-platform https://gnps.ucsd.edu/ProteoSAFe/static/gnps-theoretical.jsp.

  14. Gene calling and bacterial genome annotation with BG7.

    Science.gov (United States)

    Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo

    2015-01-01

    New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa but a refined annotation of these genomes is crucial for obtaining scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key point to investigate in bacteria. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of nonprotein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and not perfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences that are the elements directly related with biological functions. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. This tool is sequence error tolerant maintaining their capabilities for the annotation of highly fragmented genomes or for annotating mixed sequences coming from several genomes (as those obtained through metagenomics samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure completely based on cloud computing (Amazon Web Services).

  15. Online Metacognitive Strategies, Hypermedia Annotations, and Motivation on Hypertext Comprehension

    Science.gov (United States)

    Shang, Hui-Fang

    2016-01-01

    This study examined the effect of online metacognitive strategies, hypermedia annotations, and motivation on reading comprehension in a Taiwanese hypertext environment. A path analysis model was proposed based on the assumption that if English as a foreign language learners frequently use online metacognitive strategies and hypermedia annotations,…

  16. Protein Annotators' Assistant: A Novel Application of Information Retrieval Techniques.

    Science.gov (United States)

    Wise, Michael J.

    2000-01-01

    Protein Annotators' Assistant (PAA) is a software system which assists protein annotators in assigning functions to newly sequenced proteins. PAA employs a number of information retrieval techniques in a novel setting and is thus related to text categorization, where multiple categories may be suggested, except that in this case none of the…

  17. Automated evaluation of annotators for museum collections using subjective login

    NARCIS (Netherlands)

    Ceolin, D.; Nottamkandath, A.; Fokkink, W.J.; Dimitrakos, Th.; Moona, R.; Patel, Dh.; Harrison McKnight, D.

    2012-01-01

    Museums are rapidly digitizing their collections, and face a huge challenge to annotate every digitized artifact in store. Therefore they are opening up their archives for receiving annotations from experts world-wide. This paper presents an architecture for choosing the most eligible set of

  18. Collaborative Paper-Based Annotation of Lecture Slides

    Science.gov (United States)

    Steimle, Jurgen; Brdiczka, Oliver; Muhlhauser, Max

    2009-01-01

    In a study of notetaking in university courses, we found that the large majority of students prefer paper to computer-based media like Tablet PCs for taking notes and making annotations. Based on this finding, we developed CoScribe, a concept and system which supports students in making collaborative handwritten annotations on printed lecture…

  19. Annotating with Propp's Morphology of the Folktale: Reproducibility and Trainability

    NARCIS (Netherlands)

    Fisseni, B.; Kurji, A.; Löwe, B.

    2014-01-01

    We continue the study of the reproducibility of Propp’s annotations from Bod et al. (2012). We present four experiments in which test subjects were taught Propp’s annotation system; we conclude that Propp’s system needs a significant amount of training, but that with sufficient time investment, it

  20. Developing Annotation Solutions for Online Data Driven Learning

    Science.gov (United States)

    Perez-Paredes, Pascual; Alcaraz-Calero, Jose M.

    2009-01-01

    Although "annotation" is a widely-researched topic in Corpus Linguistics (CL), its potential role in Data Driven Learning (DDL) has not been addressed in depth by Foreign Language Teaching (FLT) practitioners. Furthermore, most of the research in the use of DDL methods pays little attention to annotation in the design and implementation…

  1. Automatic Annotation Method on Learners' Opinions in Case Method Discussion

    Science.gov (United States)

    Samejima, Masaki; Hisakane, Daichi; Komoda, Norihisa

    2015-01-01

    Purpose: The purpose of this paper is to annotate an attribute of a problem, a solution or no annotation on learners' opinions automatically for supporting the learners' discussion without a facilitator. The case method aims at discussing problems and solutions in a target case. However, the learners miss discussing some of problems and solutions.…

  2. First generation annotations for the fathead minnow (Pimephales promelas) genome

    Science.gov (United States)

    Ab initio gene prediction and evidence alignment were used to produce the first annotations for the fathead minnow SOAPdenovo genome assembly. Additionally, a genome browser hosted at genome.setac.org provides simplified access to the annotation data in context with fathead minno...

  3. Improving Microbial Genome Annotations in an Integrated Database Context

    Science.gov (United States)

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; Anderson, Iain; Mavromatis, Konstantinos; Kyrpides, Nikos C.; Ivanova, Natalia N.

    2013-01-01

    Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems are available at http://img.jgi.doe.gov/. PMID:23424620

  4. Ten steps to get started in Genome Assembly and Annotation

    Science.gov (United States)

    Dominguez Del Angel, Victoria; Hjerde, Erik; Sterck, Lieven; Capella-Gutierrez, Salvadors; Notredame, Cederic; Vinnere Pettersson, Olga; Amselem, Joelle; Bouri, Laurent; Bocs, Stephanie; Klopp, Christophe; Gibrat, Jean-Francois; Vlasova, Anna; Leskosek, Brane L.; Soler, Lucile; Binzer-Panchal, Mahesh; Lantz, Henrik

    2018-01-01

    As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR). PMID:29568489

  5. Sharing Map Annotations in Small Groups: X Marks the Spot

    Science.gov (United States)

    Congleton, Ben; Cerretani, Jacqueline; Newman, Mark W.; Ackerman, Mark S.

    Advances in location-sensing technology, coupled with an increasingly pervasive wireless Internet, have made it possible (and increasingly easy) to access and share information with context of one’s geospatial location. We conducted a four-phase study, with 27 students, to explore the practices surrounding the creation, interpretation and sharing of map annotations in specific social contexts. We found that annotation authors consider multiple factors when deciding how to annotate maps, including the perceived utility to the audience and how their contributions will reflect on the image they project to others. Consumers of annotations value the novelty of information, but must be convinced of the author’s credibility. In this paper we describe our study, present the results, and discuss implications for the design of software for sharing map annotations.

  6. Improving microbial genome annotations in an integrated database context.

    Directory of Open Access Journals (Sweden)

    I-Min A Chen

    Full Text Available Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems are available at http://img.jgi.doe.gov/.

  7. Semantator: annotating clinical narratives with semantic web ontologies.

    Science.gov (United States)

    Song, Dezhao; Chute, Christopher G; Tao, Cui

    2012-01-01

    To facilitate clinical research, clinical data needs to be stored in a machine processable and understandable way. Manual annotating clinical data is time consuming. Automatic approaches (e.g., Natural Language Processing systems) have been adopted to convert such data into structured formats; however, the quality of such automatically extracted data may not always be satisfying. In this paper, we propose Semantator, a semi-automatic tool for document annotation with Semantic Web ontologies. With a loaded free text document and an ontology, Semantator supports the creation/deletion of ontology instances for any document fragment, linking/disconnecting instances with the properties in the ontology, and also enables automatic annotation by connecting to the NCBO annotator and cTAKES. By representing annotations in Semantic Web standards, Semantator supports reasoning based upon the underlying semantics of the owl:disjointWith and owl:equivalentClass predicates. We present discussions based on user experiences of using Semantator.

  8. Annotated bibliography of software engineering laboratory literature

    Science.gov (United States)

    Kistler, David; Bristow, John; Smith, Don

    1994-01-01

    This document is an annotated bibliography of technical papers, documents, and memorandums produced by or related to the Software Engineering Laboratory. Nearly 200 publications are summarized. These publications cover many areas of software engineering and range from research reports to software documentation. This document has been updated and reorganized substantially since the original version (SEL-82-006, November 1982). All materials have been grouped into eight general subject areas for easy reference: (1) The Software Engineering Laboratory; (2) The Software Engineering Laboratory: Software Development Documents; (3) Software Tools; (4) Software Models; (5) Software Measurement; (6) Technology Evaluations; (7) Ada Technology; and (8) Data Collection. This document contains an index of these publications classified by individual author.

  9. Preprocessing Greek Papyri for Linguistic Annotation

    Directory of Open Access Journals (Sweden)

    Vierros, Marja

    2017-08-01

    Full Text Available Greek documentary papyri form an important direct source for Ancient Greek. It has been exploited surprisingly little in Greek linguistics due to a lack of good tools for searching linguistic structures. This article presents a new tool and digital platform, “Sematia”, which enables transforming the digital texts available in TEI EpiDoc XML format to a format which can be morphologically and syntactically annotated (treebanked, and where the user can add new metadata concerning the text type, writer and handwriting of each act of writing. An important aspect in this process is to take into account the original surviving writing vs. the standardization of language and supplements made by the editors. This is performed by creating two different layers of the same text. The platform is in its early development phase. Ongoing and future developments, such as tagging linguistic variation phenomena as well as queries performed within Sematia, are discussed at the end of the article.

  10. Promoting positive parenting: an annotated bibliography.

    Science.gov (United States)

    Ahmann, Elizabeth

    2002-01-01

    Positive parenting is built on respect for children and helps develop self-esteem, inner discipline, self-confidence, responsibility, and resourcefulness. Positive parenting is also good for parents: parents feel good about parenting well. It builds a sense of dignity. Positive parenting can be learned. Understanding normal development is a first step, so that parents can distinguish common behaviors in a stage of development from "problems." Central to positive parenting is developing thoughtful approaches to child guidance that can be used in place of anger, manipulation, punishment, and rewards. Support for developing creative and loving approaches to meet special parenting challenges, such as temperament, disabilities, separation and loss, and adoption, is sometimes necessary as well. This annotated bibliography offers resources to professionals helping parents and to parents wishing to develop positive parenting skills.

  11. Entrainment: an annotated bibliography. Interim report

    International Nuclear Information System (INIS)

    Carrier, R.F.; Hannon, E.H.

    1979-04-01

    The 604 annotated references in this bibliography on the effects of pumped entrainment of aquatic organisms through the cooling systems of thermal power plants were compiled from published and unpublished literature and cover the years 1947 through 1977. References to published literature were obtained by searching large-scale commercial data bases, ORNL in-house-generated data bases, relevant journals, and periodical bibliographies. The unpublished literature is a compilation of Sections 316(a) and 316(b) demonstrations, environmental impact statements, and environmental reports prepared by the utilities in compliance with Federal Water Pollution Control Administration regulations. The bibliography includes references on monitoring studies at power plant sites, laboratory studies of physical and biological effects on entrained organisms, engineering strategies for the mitigation of entrainment effects, and selected theoretical studies concerned with the methodology for determining entrainment effects

  12. The effectiveness of annotated (vs. non-annotated) digital pathology slides as a teaching tool during dermatology and pathology residencies.

    Science.gov (United States)

    Marsch, Amanda F; Espiritu, Baltazar; Groth, John; Hutchens, Kelli A

    2014-06-01

    With today's technology, paraffin-embedded, hematoxylin & eosin-stained pathology slides can be scanned to generate high quality virtual slides. Using proprietary software, digital images can also be annotated with arrows, circles and boxes to highlight certain diagnostic features. Previous studies assessing digital microscopy as a teaching tool did not involve the annotation of digital images. The objective of this study was to compare the effectiveness of annotated digital pathology slides versus non-annotated digital pathology slides as a teaching tool during dermatology and pathology residencies. A study group composed of 31 dermatology and pathology residents was asked to complete an online pre-quiz consisting of 20 multiple choice style questions, each associated with a static digital pathology image. After completion, participants were given access to an online tutorial composed of digitally annotated pathology slides and subsequently asked to complete a post-quiz. A control group of 12 residents completed a non-annotated version of the tutorial. Nearly all participants in the study group improved their quiz score, with an average improvement of 17%, versus only 3% (P = 0.005) in the control group. These results support the notion that annotated digital pathology slides are superior to non-annotated slides for the purpose of resident education. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  13. Robust automated knowledge capture.

    Energy Technology Data Exchange (ETDEWEB)

    Stevens-Adams, Susan Marie; Abbott, Robert G.; Forsythe, James Chris; Trumbo, Michael Christopher Stefan; Haass, Michael Joseph; Hendrickson, Stacey M. Langfitt

    2011-10-01

    This report summarizes research conducted through the Sandia National Laboratories Robust Automated Knowledge Capture Laboratory Directed Research and Development project. The objective of this project was to advance scientific understanding of the influence of individual cognitive attributes on decision making. The project has developed a quantitative model known as RumRunner that has proven effective in predicting the propensity of an individual to shift strategies on the basis of task and experience related parameters. Three separate studies are described which have validated the basic RumRunner model. This work provides a basis for better understanding human decision making in high consequent national security applications, and in particular, the individual characteristics that underlie adaptive thinking.

  14. Passion, Robustness and Perseverance

    DEFF Research Database (Denmark)

    Lim, Miguel Antonio; Lund, Rebecca

    2016-01-01

    Evaluation and merit in the measured university are increasingly based on taken-for-granted assumptions about the “ideal academic”. We suggest that the scholar now needs to show that she is passionate about her work and that she gains pleasure from pursuing her craft. We suggest that passion...... and pleasure achieve an exalted status as something compulsory. The scholar ought to feel passionate about her work and signal that she takes pleasure also in the difficult moments. Passion has become a signal of robustness and perseverance in a job market characterised by funding shortages, increased pressure...... way to demonstrate their potential and, crucially, their passion for their work. Drawing on the literature on technologies of governance, we reflect on what is captured and what is left out by these two evaluation instruments. We suggest that bibliometric analysis at the individual level is deeply...

  15. Robust Optical Flow Estimation

    Directory of Open Access Journals (Sweden)

    Javier Sánchez Pérez

    2013-10-01

    Full Text Available n this work, we describe an implementation of the variational method proposed by Brox etal. in 2004, which yields accurate optical flows with low running times. It has several benefitswith respect to the method of Horn and Schunck: it is more robust to the presence of outliers,produces piecewise-smooth flow fields and can cope with constant brightness changes. Thismethod relies on the brightness and gradient constancy assumptions, using the information ofthe image intensities and the image gradients to find correspondences. It also generalizes theuse of continuous L1 functionals, which help mitigate the effect of outliers and create a TotalVariation (TV regularization. Additionally, it introduces a simple temporal regularizationscheme that enforces a continuous temporal coherence of the flow fields.

  16. Robust Multimodal Dictionary Learning

    Science.gov (United States)

    Cao, Tian; Jojic, Vladimir; Modla, Shannon; Powell, Debbie; Czymmek, Kirk; Niethammer, Marc

    2014-01-01

    We propose a robust multimodal dictionary learning method for multimodal images. Joint dictionary learning for both modalities may be impaired by lack of correspondence between image modalities in training data, for example due to areas of low quality in one of the modalities. Dictionaries learned with such non-corresponding data will induce uncertainty about image representation. In this paper, we propose a probabilistic model that accounts for image areas that are poorly corresponding between the image modalities. We cast the problem of learning a dictionary in presence of problematic image patches as a likelihood maximization problem and solve it with a variant of the EM algorithm. Our algorithm iterates identification of poorly corresponding patches and re-finements of the dictionary. We tested our method on synthetic and real data. We show improvements in image prediction quality and alignment accuracy when using the method for multimodal image registration. PMID:24505674

  17. Robust snapshot interferometric spectropolarimetry.

    Science.gov (United States)

    Kim, Daesuk; Seo, Yoonho; Yoon, Yonghee; Dembele, Vamara; Yoon, Jae Woong; Lee, Kyu Jin; Magnusson, Robert

    2016-05-15

    This Letter describes a Stokes vector measurement method based on a snapshot interferometric common-path spectropolarimeter. The proposed scheme, which employs an interferometric polarization-modulation module, can extract the spectral polarimetric parameters Ψ(k) and Δ(k) of a transmissive anisotropic object by which an accurate Stokes vector can be calculated in the spectral domain. It is inherently strongly robust to the object 3D pose variation, since it is designed distinctly so that the measured object can be placed outside of the interferometric module. Experiments are conducted to verify the feasibility of the proposed system. The proposed snapshot scheme enables us to extract the spectral Stokes vector of a transmissive anisotropic object within tens of msec with high accuracy.

  18. International Conference on Robust Statistics

    CERN Document Server

    Filzmoser, Peter; Gather, Ursula; Rousseeuw, Peter

    2003-01-01

    Aspects of Robust Statistics are important in many areas. Based on the International Conference on Robust Statistics 2001 (ICORS 2001) in Vorau, Austria, this volume discusses future directions of the discipline, bringing together leading scientists, experienced researchers and practitioners, as well as younger researchers. The papers cover a multitude of different aspects of Robust Statistics. For instance, the fundamental problem of data summary (weights of evidence) is considered and its robustness properties are studied. Further theoretical subjects include e.g.: robust methods for skewness, time series, longitudinal data, multivariate methods, and tests. Some papers deal with computational aspects and algorithms. Finally, the aspects of application and programming tools complete the volume.

  19. The CBM RICH project

    Energy Technology Data Exchange (ETDEWEB)

    Adamczewski-Musch, J. [GSI Darmstadt (Germany); Becker, K.-H. [University Wuppertal (Germany); Belogurov, S. [ITEP Moscow (Russian Federation); Boldyreva, N. [PNPI Gatchina (Russian Federation); Chernogorov, A. [ITEP Moscow (Russian Federation); Deveaux, C. [University Gießen (Germany); Dobyrn, V. [PNPI Gatchina (Russian Federation); Dürr, M. [University Gießen (Germany); Eom, J. [Pusan National University (Korea, Republic of); Eschke, J. [GSI Darmstadt (Germany); Höhne, C. [University Gießen (Germany); Kampert, K.-H. [University Wuppertal (Germany); Kleipa, V. [GSI Darmstadt (Germany); Kochenda, L. [PNPI Gatchina (Russian Federation); Kolb, B. [GSI Darmstadt (Germany); Kopfer, J. [University Wuppertal (Germany); Kravtsov, P. [PNPI Gatchina (Russian Federation); Lebedev, S.; Lebedeva, E. [University Gießen (Germany); Leonova, E. [PNPI Gatchina (Russian Federation); and others

    2014-12-01

    The Compressed Baryonic Matter (CBM) experiment will study the properties of super dense nuclear matter by means of heavy ion collisions at the future FAIR facility. An integral detector component is a large Ring Imaging Cherenkov detector with CO{sub 2} gas radiator, which will mainly serve for electron identification and pion suppression necessary to access rare dileptonic probes like e{sup +}e{sup −} decays of light vector mesons or J/Ψ. We describe the design of this future RICH detector and focus on results obtained by building a CBM RICH detector prototype tested at CERN-PS.

  20. Robustness Analyses of Timber Structures

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Sørensen, John Dalsgaard; Hald, Frederik

    2013-01-01

    The robustness of structural systems has obtained a renewed interest arising from a much more frequent use of advanced types of structures with limited redundancy and serious consequences in case of failure. In order to minimise the likelihood of such disproportionate structural failures, many mo...... with respect to robustness of timber structures and will discuss the consequences of such robustness issues related to the future development of timber structures.......The robustness of structural systems has obtained a renewed interest arising from a much more frequent use of advanced types of structures with limited redundancy and serious consequences in case of failure. In order to minimise the likelihood of such disproportionate structural failures, many...... modern building codes consider the need for the robustness of structures and provide strategies and methods to obtain robustness. Therefore, a structural engineer may take necessary steps to design robust structures that are insensitive to accidental circumstances. The present paper summaries issues...

  1. Current and future trends in marine image annotation software

    Science.gov (United States)

    Gomes-Pereira, Jose Nuno; Auger, Vincent; Beisiegel, Kolja; Benjamin, Robert; Bergmann, Melanie; Bowden, David; Buhl-Mortensen, Pal; De Leo, Fabio C.; Dionísio, Gisela; Durden, Jennifer M.; Edwards, Luke; Friedman, Ariell; Greinert, Jens; Jacobsen-Stout, Nancy; Lerner, Steve; Leslie, Murray; Nattkemper, Tim W.; Sameoto, Jessica A.; Schoening, Timm; Schouten, Ronald; Seager, James; Singh, Hanumant; Soubigou, Olivier; Tojeira, Inês; van den Beld, Inge; Dias, Frederico; Tempera, Fernando; Santos, Ricardo S.

    2016-12-01

    Given the need to describe, analyze and index large quantities of marine imagery data for exploration and monitoring activities, a range of specialized image annotation tools have been developed worldwide. Image annotation - the process of transposing objects or events represented in a video or still image to the semantic level, may involve human interactions and computer-assisted solutions. Marine image annotation software (MIAS) have enabled over 500 publications to date. We review the functioning, application trends and developments, by comparing general and advanced features of 23 different tools utilized in underwater image analysis. MIAS requiring human input are basically a graphical user interface, with a video player or image browser that recognizes a specific time code or image code, allowing to log events in a time-stamped (and/or geo-referenced) manner. MIAS differ from similar software by the capability of integrating data associated to video collection, the most simple being the position coordinates of the video recording platform. MIAS have three main characteristics: annotating events in real time, posteriorly to annotation and interact with a database. These range from simple annotation interfaces, to full onboard data management systems, with a variety of toolboxes. Advanced packages allow to input and display data from multiple sensors or multiple annotators via intranet or internet. Posterior human-mediated annotation often include tools for data display and image analysis, e.g. length, area, image segmentation, point count; and in a few cases the possibility of browsing and editing previous dive logs or to analyze the annotations. The interaction with a database allows the automatic integration of annotations from different surveys, repeated annotation and collaborative annotation of shared datasets, browsing and querying of data. Progress in the field of automated annotation is mostly in post processing, for stable platforms or still images

  2. An Annotated Bibliography of Concept Mapping

    Science.gov (United States)

    Garcia, GNA

    2008-01-01

    A rich narrative-style bibliography of concept mapping (reviewing six articles published between 1992-2005). Articles reviewed include: (1) Cognitive mapping: A qualitative research method for social work (C. Bitoni); (2) Collaborative concept mapping: Provoking and supporting meaningful discourse (C. Boxtel, J. Linden, E. Roelofs, and G. Erkens);…

  3. An Annotated Bibliography of Accelerated Learning

    Science.gov (United States)

    Garcia, GNA

    2007-01-01

    A rich narrative-style bibliography of accelerated learning (reviewing six articles published between 1995-2003). Articles reviewed include: (1) Accelerative learning and the Emerging Science of Wholeness (D. D. Beale); (2) Effective Teaching in Accelerated Learning Programs (D. Boyd); (3) A Critical Theory Perspective on Accelerated Learning (S.…

  4. Black cohosh Actaea racemosa: an annotated bibliography

    Science.gov (United States)

    Mary L. Predny; Patricia De Angelis; James L. Chamberlain

    2006-01-01

    Black cohosh (Actaea racemosa, Syn.: Cimicifuga racemosa), a member of the buttercup family (Ranunculaceae), is an erect perennial found in rich cove forests of Eastern North America from Georgia to Ontario. Native Americans used black cohosh for a variety of ailments including rheumatism, malaria, sore throats, and complications...

  5. IntelliGO: a new vector-based semantic similarity measure including annotation origin

    Directory of Open Access Journals (Sweden)

    Devignes Marie-Dominique

    2010-12-01

    previously published measures. Conclusions The IntelliGO similarity measure provides a customizable and comprehensive method for quantifying gene similarity based on GO annotations. It also displays a robust set-discriminating power which suggests it will be useful for functional clustering. Availability An on-line version of the IntelliGO similarity measure is available at: http://bioinfo.loria.fr/Members/benabdsi/intelligo_project/

  6. Evidence-based gene models for structural and functional annotations of the oil palm genome.

    Science.gov (United States)

    Chan, Kuang-Lim; Tatarinova, Tatiana V; Rosli, Rozana; Amiruddin, Nadzirah; Azizi, Norazah; Halim, Mohd Amin Ab; Sanusi, Nik Shazana Nik Mohd; Jayanthi, Nagappan; Ponomarenko, Petr; Triska, Martin; Solovyev, Victor; Firdaus-Raih, Mohd; Sambanthamurthi, Ravigadevi; Murphy, Denis; Low, Eng-Ti Leslie

    2017-09-08

    Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years) has led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools. Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC 3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC 3 -rich genes (GC 3  ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures. We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC 3 -rich and intronless), as well as those associated with important functions, such as FA

  7. Neutron rich nuclei

    International Nuclear Information System (INIS)

    Foucher, R.

    1979-01-01

    If some β - emitters are particularly interesting to study in light, medium, and heavy nuclei, another (and also) difficult problem is to know systematically the properties of these neutron rich nuclei far from the stability line. A review of some of their characteristics is presented. How far is it possible to be objective in the interpretation of data is questioned and implications are discussed

  8. Fuzzy Emotional Semantic Analysis and Automated Annotation of Scene Images

    Directory of Open Access Journals (Sweden)

    Jianfang Cao

    2015-01-01

    Full Text Available With the advances in electronic and imaging techniques, the production of digital images has rapidly increased, and the extraction and automated annotation of emotional semantics implied by images have become issues that must be urgently addressed. To better simulate human subjectivity and ambiguity for understanding scene images, the current study proposes an emotional semantic annotation method for scene images based on fuzzy set theory. A fuzzy membership degree was calculated to describe the emotional degree of a scene image and was implemented using the Adaboost algorithm and a back-propagation (BP neural network. The automated annotation method was trained and tested using scene images from the SUN Database. The annotation results were then compared with those based on artificial annotation. Our method showed an annotation accuracy rate of 91.2% for basic emotional values and 82.4% after extended emotional values were added, which correspond to increases of 5.5% and 8.9%, respectively, compared with the results from using a single BP neural network algorithm. Furthermore, the retrieval accuracy rate based on our method reached approximately 89%. This study attempts to lay a solid foundation for the automated emotional semantic annotation of more types of images and therefore is of practical significance.

  9. Ontology modularization to improve semantic medical image annotation.

    Science.gov (United States)

    Wennerberg, Pinar; Schulz, Klaus; Buitelaar, Paul

    2011-02-01

    Searching for medical images and patient reports is a significant challenge in a clinical setting. The contents of such documents are often not described in sufficient detail thus making it difficult to utilize the inherent wealth of information contained within them. Semantic image annotation addresses this problem by describing the contents of images and reports using medical ontologies. Medical images and patient reports are then linked to each other through common annotations. Subsequently, search algorithms can more effectively find related sets of documents on the basis of these semantic descriptions. A prerequisite to realizing such a semantic search engine is that the data contained within should have been previously annotated with concepts from medical ontologies. One major challenge in this regard is the size and complexity of medical ontologies as annotation sources. Manual annotation is particularly time consuming labor intensive in a clinical environment. In this article we propose an approach to reducing the size of clinical ontologies for more efficient manual image and text annotation. More precisely, our goal is to identify smaller fragments of a large anatomy ontology that are relevant for annotating medical images from patients suffering from lymphoma. Our work is in the area of ontology modularization, which is a recent and active field of research. We describe our approach, methods and data set in detail and we discuss our results. Copyright © 2010 Elsevier Inc. All rights reserved.

  10. The caBIG annotation and image Markup project.

    Science.gov (United States)

    Channin, David S; Mongkolwat, Pattanasak; Kleper, Vladimir; Sepukar, Kastubh; Rubin, Daniel L

    2010-04-01

    Image annotation and markup are at the core of medical interpretation in both the clinical and the research setting. Digital medical images are managed with the DICOM standard format. While DICOM contains a large amount of meta-data about whom, where, and how the image was acquired, DICOM says little about the content or meaning of the pixel data. An image annotation is the explanatory or descriptive information about the pixel data of an image that is generated by a human or machine observer. An image markup is the graphical symbols placed over the image to depict an annotation. While DICOM is the standard for medical image acquisition, manipulation, transmission, storage, and display, there are no standards for image annotation and markup. Many systems expect annotation to be reported verbally, while markups are stored in graphical overlays or proprietary formats. This makes it difficult to extract and compute with both of them. The goal of the Annotation and Image Markup (AIM) project is to develop a mechanism, for modeling, capturing, and serializing image annotation and markup data that can be adopted as a standard by the medical imaging community. The AIM project produces both human- and machine-readable artifacts. This paper describes the AIM information model, schemas, software libraries, and tools so as to prepare researchers and developers for their use of AIM.

  11. Comparison of concept recognizers for building the Open Biomedical Annotator

    Directory of Open Access Journals (Sweden)

    Rubin Daniel

    2009-09-01

    Full Text Available Abstract The National Center for Biomedical Ontology (NCBO is developing a system for automated, ontology-based access to online biomedical resources (Shah NH, et al.: Ontology-driven indexing of public datasets for translational bioinformatics. BMC Bioinformatics 2009, 10(Suppl 2:S1. The system's indexing workflow processes the text metadata of diverse resources such as datasets from GEO and ArrayExpress to annotate and index them with concepts from appropriate ontologies. This indexing requires the use of a concept-recognition tool to identify ontology concepts in the resource's textual metadata. In this paper, we present a comparison of two concept recognizers – NLM's MetaMap and the University of Michigan's Mgrep. We utilize a number of data sources and dictionaries to evaluate the concept recognizers in terms of precision, recall, speed of execution, scalability and customizability. Our evaluations demonstrate that Mgrep has a clear edge over MetaMap for large-scale service oriented applications. Based on our analysis we also suggest areas of potential improvements for Mgrep. We have subsequently used Mgrep to build the Open Biomedical Annotator service. The Annotator service has access to a large dictionary of biomedical terms derived from the United Medical Language System (UMLS and NCBO ontologies. The Annotator also leverages the hierarchical structure of the ontologies and their mappings to expand annotations. The Annotator service is available to the community as a REST Web service for creating ontology-based annotations of their data.

  12. Annotation of the Evaluative Language in a Dependency Treebank

    Directory of Open Access Journals (Sweden)

    Šindlerová Jana

    2017-12-01

    Full Text Available In the paper, we present our efforts to annotate evaluative language in the Prague Dependency Treebank 2.0. The project is a follow-up of the series of annotations of small plaintext corpora. It uses automatic identification of potentially evaluative nodes through mapping a Czech subjectivity lexicon to syntactically annotated data. These nodes are then manually checked by an annotator and either dismissed as standing in a non-evaluative context, or confirmed as evaluative. In the latter case, information about the polarity orientation, the source and target of evaluation is added by the annotator. The annotations unveiled several advantages and disadvantages of the chosen framework. The advantages involve more structured and easy-to-handle environment for the annotator, visibility of syntactic patterning of the evaluative state, effective solving of discontinuous structures or a new perspective on the influence of good/bad news. The disadvantages include little capability of treating cases with evaluation spread among more syntactically connected nodes at once, little capability of treating metaphorical expressions, or disregarding the effects of negation and intensification in the current scheme.

  13. Dynamics robustness of cascading systems.

    Directory of Open Access Journals (Sweden)

    Jonathan T Young

    2017-03-01

    Full Text Available A most important property of biochemical systems is robustness. Static robustness, e.g., homeostasis, is the insensitivity of a state against perturbations, whereas dynamics robustness, e.g., homeorhesis, is the insensitivity of a dynamic process. In contrast to the extensively studied static robustness, dynamics robustness, i.e., how a system creates an invariant temporal profile against perturbations, is little explored despite transient dynamics being crucial for cellular fates and are reported to be robust experimentally. For example, the duration of a stimulus elicits different phenotypic responses, and signaling networks process and encode temporal information. Hence, robustness in time courses will be necessary for functional biochemical networks. Based on dynamical systems theory, we uncovered a general mechanism to achieve dynamics robustness. Using a three-stage linear signaling cascade as an example, we found that the temporal profiles and response duration post-stimulus is robust to perturbations against certain parameters. Then analyzing the linearized model, we elucidated the criteria of when signaling cascades will display dynamics robustness. We found that changes in the upstream modules are masked in the cascade, and that the response duration is mainly controlled by the rate-limiting module and organization of the cascade's kinetics. Specifically, we found two necessary conditions for dynamics robustness in signaling cascades: 1 Constraint on the rate-limiting process: The phosphatase activity in the perturbed module is not the slowest. 2 Constraints on the initial conditions: The kinase activity needs to be fast enough such that each module is saturated even with fast phosphatase activity and upstream changes are attenuated. We discussed the relevance of such robustness to several biological examples and the validity of the above conditions therein. Given the applicability of dynamics robustness to a variety of systems, it

  14. MimoSA: a system for minimotif annotation

    Directory of Open Access Journals (Sweden)

    Kundeti Vamsi

    2010-06-01

    Full Text Available Abstract Background Minimotifs are short peptide sequences within one protein, which are recognized by other proteins or molecules. While there are now several minimotif databases, they are incomplete. There are reports of many minimotifs in the primary literature, which have yet to be annotated, while entirely novel minimotifs continue to be published on a weekly basis. Our recently proposed function and sequence syntax for minimotifs enables us to build a general tool that will facilitate structured annotation and management of minimotif data from the biomedical literature. Results We have built the MimoSA application for minimotif annotation. The application supports management of the Minimotif Miner database, literature tracking, and annotation of new minimotifs. MimoSA enables the visualization, organization, selection and editing functions of minimotifs and their attributes in the MnM database. For the literature components, Mimosa provides paper status tracking and scoring of papers for annotation through a freely available machine learning approach, which is based on word correlation. The paper scoring algorithm is also available as a separate program, TextMine. Form-driven annotation of minimotif attributes enables entry of new minimotifs into the MnM database. Several supporting features increase the efficiency of annotation. The layered architecture of MimoSA allows for extensibility by separating the functions of paper scoring, minimotif visualization, and database management. MimoSA is readily adaptable to other annotation efforts that manually curate literature into a MySQL database. Conclusions MimoSA is an extensible application that facilitates minimotif annotation and integrates with the Minimotif Miner database. We have built MimoSA as an application that integrates dynamic abstract scoring with a high performance relational model of minimotif syntax. MimoSA's TextMine, an efficient paper-scoring algorithm, can be used to

  15. PCAS – a precomputed proteome annotation database resource

    Directory of Open Access Journals (Sweden)

    Luo Jingchu

    2003-11-01

    Full Text Available Abstract Background Many model proteomes or "complete" sets of proteins of given organisms are now publicly available. Much effort has been invested in computational annotation of those "draft" proteomes. Motif or domain based algorithms play a pivotal role in functional classification of proteins. Employing most available computational algorithms, mainly motif or domain recognition algorithms, we set up to develop an online proteome annotation system with integrated proteome annotation data to complement existing resources. Results We report here the development of PCAS (ProteinCentric Annotation System as an online resource of pre-computed proteome annotation data. We applied most available motif or domain databases and their analysis methods, including hmmpfam search of HMMs in Pfam, SMART and TIGRFAM, RPS-PSIBLAST search of PSSMs in CDD, pfscan of PROSITE patterns and profiles, as well as PSI-BLAST search of SUPERFAMILY PSSMs. In addition, signal peptide and TM are predicted using SignalP and TMHMM respectively. We mapped SUPERFAMILY and COGs to InterPro, so the motif or domain databases are integrated through InterPro. PCAS displays table summaries of pre-computed data and a graphical presentation of motifs or domains relative to the protein. As of now, PCAS contains human IPI, mouse IPI, and rat IPI, A. thaliana, C. elegans, D. melanogaster, S. cerevisiae, and S. pombe proteome. PCAS is available at http://pak.cbi.pku.edu.cn/proteome/gca.php Conclusion PCAS gives better annotation coverage for model proteomes by employing a wider collection of available algorithms. Besides presenting the most confident annotation data, PCAS also allows customized query so users can inspect statistically less significant boundary information as well. Therefore, besides providing general annotation information, PCAS could be used as a discovery platform. We plan to update PCAS twice a year. We will upgrade PCAS when new proteome annotation algorithms

  16. Robust continuous clustering.

    Science.gov (United States)

    Shah, Sohil Atul; Koltun, Vladlen

    2017-09-12

    Clustering is a fundamental procedure in the analysis of scientific data. It is used ubiquitously across the sciences. Despite decades of research, existing clustering algorithms have limited effectiveness in high dimensions and often require tuning parameters for different domains and datasets. We present a clustering algorithm that achieves high accuracy across multiple domains and scales efficiently to high dimensions and large datasets. The presented algorithm optimizes a smooth continuous objective, which is based on robust statistics and allows heavily mixed clusters to be untangled. The continuous nature of the objective also allows clustering to be integrated as a module in end-to-end feature learning pipelines. We demonstrate this by extending the algorithm to perform joint clustering and dimensionality reduction by efficiently optimizing a continuous global objective. The presented approach is evaluated on large datasets of faces, hand-written digits, objects, newswire articles, sensor readings from the Space Shuttle, and protein expression levels. Our method achieves high accuracy across all datasets, outperforming the best prior algorithm by a factor of 3 in average rank.

  17. Annotation of the protein coding regions of the equine genome

    DEFF Research Database (Denmark)

    Hestand, Matthew S.; Kalbfleisch, Theodore S.; Coleman, Stephen J.

    2015-01-01

    Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced m...... and appear to be small errors in the equine reference genome, since they are also identified as homozygous variants by genomic DNA resequencing of the reference horse. Taken together, we provide a resource of equine mRNA structures and protein coding variants that will enhance equine and cross...

  18. Roadmap for annotating transposable elements in eukaryote genomes.

    Science.gov (United States)

    Permal, Emmanuelle; Flutre, Timothée; Quesneville, Hadi

    2012-01-01

    Current high-throughput techniques have made it feasible to sequence even the genomes of non-model organisms. However, the annotation process now represents a bottleneck to genome analysis, especially when dealing with transposable elements (TE). Combined approaches, using both de novo and knowledge-based methods to detect TEs, are likely to produce reasonably comprehensive and sensitive results. This chapter provides a roadmap for researchers involved in genome projects to address this issue. At each step of the TE annotation process, from the identification of TE families to the annotation of TE copies, we outline the tools and good practices to be used.

  19. Robust Trust in Expert Testimony

    Directory of Open Access Journals (Sweden)

    Christian Dahlman

    2015-05-01

    Full Text Available The standard of proof in criminal trials should require that the evidence presented by the prosecution is robust. This requirement of robustness says that it must be unlikely that additional information would change the probability that the defendant is guilty. Robustness is difficult for a judge to estimate, as it requires the judge to assess the possible effect of information that the he or she does not have. This article is concerned with expert witnesses and proposes a method for reviewing the robustness of expert testimony. According to the proposed method, the robustness of expert testimony is estimated with regard to competence, motivation, external strength, internal strength and relevance. The danger of trusting non-robust expert testimony is illustrated with an analysis of the Thomas Quick Case, a Swedish legal scandal where a patient at a mental institution was wrongfully convicted for eight murders.

  20. Fluid inclusions in salt: an annotated bibliography

    International Nuclear Information System (INIS)

    Isherwood, D.J.

    1979-01-01

    An annotated bibliography is presented which was compiled while searching the literature for information on fluid inclusions in salt for the Nuclear Regulatory Commission's study on the deep-geologic disposal of nuclear waste. The migration of fluid inclusions in a thermal gradient is a potential hazard to the safe disposal of nuclear waste in a salt repository. At the present time, a prediction as to whether this hazard precludes the use of salt for waste disposal can not be made. Limited data from the Salt-Vault in situ heater experiments in the early 1960's (Bradshaw and McClain, 1971) leave little doubt that fluid inclusions can migrate towards a heat source. In addition to the bibliography, there is a brief summary of the physical and chemical characteristics that together with the temperature of the waste will determine the chemical composition of the brine in contact with the waste canister, the rate of fluid migration, and the brine-canister-waste interactions

  1. Annotation and Curation of Uncharacterized proteins- Challenges

    Directory of Open Access Journals (Sweden)

    Johny eIjaq

    2015-03-01

    Full Text Available Hypothetical Proteins are the proteins that are predicted to be expressed from an open reading frame (ORF, constituting a substantial fraction of proteomes in both prokaryotes and eukaryotes. Genome projects have led to the identification of many therapeutic targets, the putative function of the protein and their interactions. In this review we have enlisted various methods. Annotation linked to structural and functional prediction of hypothetical proteins assist in the discovery of new structures and functions serving as markers and pharmacological targets for drug designing, discovery and screening. Mass spectrometry is an analytical technique for validating protein characterisation. Matrix-assisted laser desorption ionization–mass spectrometry (MALDI-MS is an efficient analytical method. Microarrays and Protein expression profiles help understanding the biological systems through a systems-wide study of proteins and their interactions with other proteins and non-proteinaceous molecules to control complex processes in cells and tissues and even whole organism. Next generation sequencing technology accelerates multiple areas of genomics research.

  2. Sophia: A Expedient UMLS Concept Extraction Annotator.

    Science.gov (United States)

    Divita, Guy; Zeng, Qing T; Gundlapalli, Adi V; Duvall, Scott; Nebeker, Jonathan; Samore, Matthew H

    2014-01-01

    An opportunity exists for meaningful concept extraction and indexing from large corpora of clinical notes in the Veterans Affairs (VA) electronic medical record. Currently available tools such as MetaMap, cTAKES and HITex do not scale up to address this big data need. Sophia, a rapid UMLS concept extraction annotator was developed to fulfill a mandate and address extraction where high throughput is needed while preserving performance. We report on the development, testing and benchmarking of Sophia against MetaMap and cTAKEs. Sophia demonstrated improved performance on recall as compared to cTAKES and MetaMap (0.71 vs 0.66 and 0.38). The overall f-score was similar to cTAKES and an improvement over MetaMap (0.53 vs 0.57 and 0.43). With regard to speed of processing records, we noted Sophia to be several fold faster than cTAKES and the scaled-out MetaMap service. Sophia offers a viable alternative for high-throughput information extraction tasks.

  3. Frame on frames: an annotated bibliography

    International Nuclear Information System (INIS)

    Wright, T.; Tsao, H.J.

    1983-01-01

    The success or failure of any sample survey of a finite population is largely dependent upon the condition and adequacy of the list or frame from which the probability sample is selected. Much of the published survey sampling related work has focused on the measurement of sampling errors and, more recently, on nonsampling errors to a lesser extent. Recent studies on data quality for various types of data collection systems have revealed that the extent of the nonsampling errors far exceeds that of the sampling errors in many cases. While much of this nonsampling error, which is difficult to measure, can be attributed to poor frames, relatively little effort or theoretical work has focused on this contribution to total error. The objective of this paper is to present an annotated bibliography on frames with the hope that it will bring together, for experimenters, a number of suggestions for action when sampling from imperfect frames and that more attention will be given to this area of survey methods research

  4. Annotating Human P-Glycoprotein Bioassay Data.

    Science.gov (United States)

    Zdrazil, Barbara; Pinto, Marta; Vasanthanathan, Poongavanam; Williams, Antony J; Balderud, Linda Zander; Engkvist, Ola; Chichester, Christine; Hersey, Anne; Overington, John P; Ecker, Gerhard F

    2012-08-01

    Huge amounts of small compound bioactivity data have been entering the public domain as a consequence of open innovation initiatives. It is now the time to carefully analyse existing bioassay data and give it a systematic structure. Our study aims to annotate prominent in vitro assays used for the determination of bioactivities of human P-glycoprotein inhibitors and substrates as they are represented in the ChEMBL and TP-search open source databases. Furthermore, the ability of data, determined in different assays, to be combined with each other is explored. As a result of this study, it is suggested that for inhibitors of human P-glycoprotein it is possible to combine data coming from the same assay type, if the cell lines used are also identical and the fluorescent or radiolabeled substrate have overlapping binding sites. In addition, it demonstrates that there is a need for larger chemical diverse datasets that have been measured in a panel of different assays. This would certainly alleviate the search for other inter-correlations between bioactivity data yielded by different assay setups.

  5. The CBM RICH project

    Energy Technology Data Exchange (ETDEWEB)

    Adamczewski-Musch, J. [GSI Helmholtzzentrum für Schwerionenforschung GmbH, D-64291 Darmstadt (Germany); Akishin, P. [Laboratory of Information Technologies, Joint Institute for Nuclear research (JINR-LIT), Dubna (Russian Federation); Becker, K.-H. [Department of Physics, University of Wuppertal, D-42097 Wuppertal (Germany); Belogurov, S. [SSC RF ITEP, 117218 Moscow (Russian Federation); Bendarouach, J. [Institute of Physics II and Institute of Applied Physics, Justus Liebig University Giessen, D-35392 Giessen (Germany); Boldyreva, N. [National Research Centre “Kurchatov Institute” B.P. Konstantinov Petersburg Nuclear Physics Institute, 188300 Gatchina (Russian Federation); Chernogorov, A. [SSC RF ITEP, 117218 Moscow (Russian Federation); Deveaux, C. [Institute of Physics II and Institute of Applied Physics, Justus Liebig University Giessen, D-35392 Giessen (Germany); Dobyrn, V. [National Research Centre “Kurchatov Institute” B.P. Konstantinov Petersburg Nuclear Physics Institute, 188300 Gatchina (Russian Federation); Dürr, M. [Institute of Physics II and Institute of Applied Physics, Justus Liebig University Giessen, D-35392 Giessen (Germany); Eschke, J. [GSI Helmholtzzentrum für Schwerionenforschung GmbH, D-64291 Darmstadt (Germany); Förtsch, J. [Department of Physics, University of Wuppertal, D-42097 Wuppertal (Germany); Heep, J.; Höhne, C. [Institute of Physics II and Institute of Applied Physics, Justus Liebig University Giessen, D-35392 Giessen (Germany); Kampert, K.-H. [Department of Physics, University of Wuppertal, D-42097 Wuppertal (Germany); and others

    2017-02-11

    The CBM RICH detector is an integral component of the future CBM experiment at FAIR, providing efficient electron identification and pion suppression necessary for the measurement of rare dileptonic probes in heavy ion collisions. The RICH design is based on CO{sub 2} gas as radiator, a segmented spherical glass focussing mirror with Al+MgF{sub 2} reflective coating, and Multianode Photomultipliers for efficient Cherenkov photon detection. Hamamatsu H12700 MAPMTs have recently been selected as photon sensors, following an extensive sensor evaluation, including irradiation tests to ensure sufficient radiation hardness of the MAPMTs. A brief overview of the detector design and concept is given, results on the radiation hardness of the photon sensors are shown, and the development of a FPGA-TDC based readout chain is discussed.

  6. The Bologna Annotation Resource (BAR 3.0): improving protein functional annotation.

    Science.gov (United States)

    Profiti, Giuseppe; Martelli, Pier Luigi; Casadio, Rita

    2017-07-03

    BAR 3.0 updates our server BAR (Bologna Annotation Resource) for predicting protein structural and functional features from sequence. We increase data volume, query capabilities and information conveyed to the user. The core of BAR 3.0 is a graph-based clustering procedure of UniProtKB sequences, following strict pairwise similarity criteria (sequence identity ≥40% with alignment coverage ≥90%). Each cluster contains the available annotation downloaded from UniProtKB, GO, PFAM and PDB. After statistical validation, GO terms and PFAM domains are cluster-specific and annotate new sequences entering the cluster after satisfying similarity constraints. BAR 3.0 includes 28 869 663 sequences in 1 361 773 clusters, of which 22.2% (22 241 661 sequences) and 47.4% (24 555 055 sequences) have at least one validated GO term and one PFAM domain, respectively. 1.4% of the clusters (36% of all sequences) include PDB structures and the cluster is associated to a hidden Markov model that allows building template-target alignment suitable for structural modeling. Some other 3 399 026 sequences are singletons. BAR 3.0 offers an improved search interface, allowing queries by UniProtKB-accession, Fasta sequence, GO-term, PFAM-domain, organism, PDB and ligand/s. When evaluated on the CAFA2 targets, BAR 3.0 largely outperforms our previous version and scores among state-of-the-art methods. BAR 3.0 is publicly available and accessible at http://bar.biocomp.unibo.it/bar3. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. The CLEO RICH detector

    International Nuclear Information System (INIS)

    Artuso, M.; Ayad, R.; Bukin, K.; Efimov, A.; Boulahouache, C.; Dambasuren, E.; Kopp, S.; Li, Ji; Majumder, G.; Menaa, N.; Mountain, R.; Schuh, S.; Skwarnicki, T.; Stone, S.; Viehhauser, G.; Wang, J.C.; Coan, T.E.; Fadeyev, V.; Maravin, Y.; Volobouev, I.; Ye, J.; Anderson, S.; Kubota, Y.; Smith, A.

    2005-01-01

    We describe the design, construction and performance of a Ring Imaging Cherenkov Detector (RICH) constructed to identify charged particles in the CLEO experiment. Cherenkov radiation occurs in LiF crystals, both planar and ones with a novel 'sawtooth'-shaped exit surface. Photons in the wavelength interval 135-165nm are detected using multi-wire chambers filled with a mixture of methane gas and triethylamine vapor. Excellent π/K separation is demonstrated

  8. MitoBamAnnotator: A web-based tool for detecting and annotating heteroplasmy in human mitochondrial DNA sequences.

    Science.gov (United States)

    Zhidkov, Ilia; Nagar, Tal; Mishmar, Dan; Rubin, Eitan

    2011-11-01

    The use of Next-Generation Sequencing of mitochondrial DNA is becoming widespread in biological and clinical research. This, in turn, creates a need for a convenient tool that detects and analyzes heteroplasmy. Here we present MitoBamAnnotator, a user friendly web-based tool that allows maximum flexibility and control in heteroplasmy research. MitoBamAnnotator provides the user with a comprehensively annotated overview of mitochondrial genetic variation, allowing for an in-depth analysis with no prior knowledge in programming. Copyright © 2011 Elsevier B.V. and Mitochondria Research Society. All rights reserved. All rights reserved.

  9. CBM RICH geometry optimization

    Energy Technology Data Exchange (ETDEWEB)

    Mahmoud, Tariq; Hoehne, Claudia [II. Physikalisches Institut, Giessen Univ. (Germany); Collaboration: CBM-Collaboration

    2016-07-01

    The Compressed Baryonic Matter (CBM) experiment at the future FAIR complex will investigate the phase diagram of strongly interacting matter at high baryon density and moderate temperatures in A+A collisions from 2-11 AGeV (SIS100) beam energy. The main electron identification detector in the CBM experiment will be a RICH detector with a CO{sub 2} gaseous-radiator, focusing spherical glass mirrors, and MAPMT photo-detectors being placed on a PMT-plane. The RICH detector is located directly behind the CBM dipole magnet. As the final magnet geometry is now available, some changes in the RICH geometry become necessary. In order to guarantee a magnetic field of 1 mT at maximum in the PMT plane for effective operation of the MAPMTs, two measures have to be taken: The PMT plane is moved outwards of the stray field by tilting the mirrors by 10 degrees and shielding boxes have been designed. In this contribution the results of the geometry optimization procedure are presented.

  10. Detecting modularity "smells" in dependencies injected with Java annotations

    NARCIS (Netherlands)

    Roubtsov, S.; Serebrenik, A.; Brand, van den M.G.J.

    2010-01-01

    Dependency injection is a recent programming mechanism reducing dependencies among components by delegating them to an external entity, called a dependency injection framework. An increasingly popular approach to dependency injection implementation relies upon using Java annotations, a special form

  11. Annotated bibliography of South African indigenous evergreen forest ecology

    CSIR Research Space (South Africa)

    Geldenhuys, CJ

    1985-01-01

    Full Text Available Annotated references to 519 publications are presented, together with keyword listings and keyword, regional, place name and taxonomic indices. This bibliography forms part of the first phase of the activities of the Forest Biome Task Group....

  12. Creating New Medical Ontologies for Image Annotation A Case Study

    CERN Document Server

    Stanescu, Liana; Brezovan, Marius; Mihai, Cristian Gabriel

    2012-01-01

    Creating New Medical Ontologies for Image Annotation focuses on the problem of the medical images automatic annotation process, which is solved in an original manner by the authors. All the steps of this process are described in detail with algorithms, experiments and results. The original algorithms proposed by authors are compared with other efficient similar algorithms. In addition, the authors treat the problem of creating ontologies in an automatic way, starting from Medical Subject Headings (MESH). They have presented some efficient and relevant annotation models and also the basics of the annotation model used by the proposed system: Cross Media Relevance Models. Based on a text query the system will retrieve the images that contain objects described by the keywords.

  13. Geothermal wetlands: an annotated bibliography of pertinent literature

    Energy Technology Data Exchange (ETDEWEB)

    Stanley, N.E.; Thurow, T.L.; Russell, B.F.; Sullivan, J.F.

    1980-05-01

    This annotated bibliography covers the following topics: algae, wetland ecosystems; institutional aspects; macrophytes - general, production rates, and mineral absorption; trace metal absorption; wetland soils; water quality; and other aspects of marsh ecosystems. (MHR)

  14. Managing and Querying Image Annotation and Markup in XML

    Science.gov (United States)

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches for representing annotations and image markup are serious barriers for researchers to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standard based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of AIM data model pose new challenges for managing such data in terms of performance and support of complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and supporting complex image and annotation queries through native extension of XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid. PMID:21218167

  15. Managing and Querying Image Annotation and Markup in XML.

    Science.gov (United States)

    Wang, Fusheng; Pan, Tony; Sharma, Ashish; Saltz, Joel

    2010-01-01

    Proprietary approaches for representing annotations and image markup are serious barriers for researchers to share image data and knowledge. The Annotation and Image Markup (AIM) project is developing a standard based information model for image annotation and markup in health care and clinical trial environments. The complex hierarchical structures of AIM data model pose new challenges for managing such data in terms of performance and support of complex queries. In this paper, we present our work on managing AIM data through a native XML approach, and supporting complex image and annotation queries through native extension of XQuery language. Through integration with xService, AIM databases can now be conveniently shared through caGrid.

  16. Annotating Evidence Based Clinical Guidelines : A Lightweight Ontology

    NARCIS (Netherlands)

    Hoekstra, R.; de Waard, A.; Vdovjak, R.; Paschke, A.; Burger, A.; Romano, P.; Marshall, M.S.; Splendiani, A.

    2012-01-01

    This paper describes a lightweight ontology for representing annotations of declarative evidence based clinical guidelines. We present the motivation and requirements for this representation, based on an analysis of several guidelines. The ontology provides the means to connect clinical questions

  17. 06491 Summary -- Digital Historical Corpora- Architecture, Annotation, and Retrieval

    OpenAIRE

    Burnard, Lou; Dobreva, Milena; Fuhr, Norbert; Lüdeling, Anke

    2007-01-01

    The seminar "Digital Historical Corpora" brought together scholars from (historical) linguistics, (historical) philology, computational linguistics and computer science who work with collections of historical texts. The issues that were discussed include digitization, corpus design, corpus architecture, annotation, search, and retrieval.

  18. Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger

    Directory of Open Access Journals (Sweden)

    Grigoriev Igor V

    2009-02-01

    Full Text Available Abstract Background Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate over a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institutes (JGI and another predicted protein set from another A.niger sequence. Tandem mass spectra (MS/MS were acquired from 1d gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR. Results 405 identified peptide sequences were mapped to 214 different A.niger genomic loci to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6% of these loci either had no preferred predicted gene model or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. Conclusion This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines much as expressed sequence tag (EST data has been. A comparison of the published genome from another strain of A.niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method.

  19. Combined evidence annotation of transposable elements in genome sequences.

    Directory of Open Access Journals (Sweden)

    Hadi Quesneville

    2005-07-01

    Full Text Available Transposable elements (TEs are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced from combining multiple independent sources of computational evidence. To elevate the quality of TE annotations to a level comparable to that of gene models, we have developed a combined evidence-model TE annotation pipeline, analogous to systems used for gene annotation, by integrating results from multiple homology-based and de novo TE identification methods. As proof of principle, we have annotated "TE models" in Drosophila melanogaster Release 4 genomic sequences using the combined computational evidence derived from RepeatMasker, BLASTER, TBLASTX, all-by-all BLASTN, RECON, TE-HMM and the previous Release 3.1 annotation. Our system is designed for use with the Apollo genome annotation tool, allowing automatic results to be curated manually to produce reliable annotations. The euchromatic TE fraction of D. melanogaster is now estimated at 5.3% (cf. 3.86% in Release 3.1, and we found a substantially higher number of TEs (n = 6,013 than previously identified (n = 1,572. Most of the new TEs derive from small fragments of a few hundred nucleotides long and highly abundant families not previously annotated (e.g., INE-1. We also estimated that 518 TE copies (8.6% are inserted into at least one other TE, forming a nest of elements. The pipeline allows rapid and thorough annotation of even the most complex TE models, including highly deleted and/or nested elements such as those often found in heterochromatic sequences. Our pipeline can be easily adapted to other genome sequences, such as those of the D. melanogaster heterochromatin or other

  20. A Machine Learning Based Analytical Framework for Semantic Annotation Requirements

    OpenAIRE

    Hamed Hassanzadeh; MohammadReza Keyvanpour

    2011-01-01

    The Semantic Web is an extension of the current web in which information is given well-defined meaning. The perspective of Semantic Web is to promote the quality and intelligence of the current web by changing its contents into machine understandable form. Therefore, semantic level information is one of the cornerstones of the Semantic Web. The process of adding semantic metadata to web resources is called Semantic Annotation. There are many obstacles against the Semantic Annotation, such as ...

  1. Annotation Method (AM): SE7_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available base search. Peaks with no hit to these databases are then selected to secondary se...arch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...SE7_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary data

  2. Annotation Method (AM): SE36_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE36_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  3. Annotation Method (AM): SE14_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE14_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  4. Genome Annotation and Transcriptomics of Oil-Producing Algae

    Science.gov (United States)

    2015-03-16

    AFRL-OSR-VA-TR-2015-0103 GENOME ANNOTATION AND TRANSCRIPTOMICS OF OIL-PRODUCING ALGAE Sabeeha Merchant UNIVERSITY OF CALIFORNIA LOS ANGELES Final...2010 To 12-31-2014 4. TITLE AND SUBTITLE GENOME ANNOTATION AND TRANSCRIPTOMICS OF OIL-PRODUCING ALGAE 5a. CONTRACT NUMBER FA9550-10-1-0095 5b...NOTES 14. ABSTRACT Most algae accumulate triacylglycerols (TAGs) when they are starved for essential nutrients like N, S, P (or Si in the case of some

  5. Annotation Method (AM): SE33_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE33_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  6. Annotation Method (AM): SE12_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE12_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  7. Annotation Method (AM): SE20_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE20_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  8. Annotation Method (AM): SE2_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available base search. Peaks with no hit to these databases are then selected to secondary se...arch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...SE2_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary data

  9. Annotation Method (AM): SE28_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE28_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  10. Annotation Method (AM): SE11_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE11_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  11. Annotation Method (AM): SE17_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE17_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  12. Annotation Method (AM): SE10_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE10_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  13. Annotation Method (AM): SE4_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available base search. Peaks with no hit to these databases are then selected to secondary se...arch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...SE4_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary data

  14. Annotation Method (AM): SE9_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available base search. Peaks with no hit to these databases are then selected to secondary se...arch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...SE9_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary data

  15. Annotation Method (AM): SE3_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available base search. Peaks with no hit to these databases are then selected to secondary se...arch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...SE3_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary data

  16. Annotation Method (AM): SE25_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE25_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  17. Annotation Method (AM): SE30_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE30_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  18. Annotation Method (AM): SE16_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE16_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  19. Annotation Method (AM): SE29_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE29_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  20. Annotation Method (AM): SE35_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE35_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  1. Annotation Method (AM): SE6_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available base search. Peaks with no hit to these databases are then selected to secondary se...arch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...SE6_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary data

  2. Annotation Method (AM): SE1_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available base search. Peaks with no hit to these databases are then selected to secondary se...arch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...SE1_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary data

  3. Annotation Method (AM): SE8_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available base search. Peaks with no hit to these databases are then selected to secondary se...arch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...SE8_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary data

  4. Annotation Method (AM): SE13_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE13_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  5. Annotation Method (AM): SE26_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE26_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  6. Annotation Method (AM): SE27_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE27_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  7. Annotation Method (AM): SE34_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE34_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  8. Annotation Method (AM): SE5_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available base search. Peaks with no hit to these databases are then selected to secondary se...arch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are ma...SE5_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary data

  9. Annotation Method (AM): SE15_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE15_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  10. Annotation Method (AM): SE31_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE31_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  11. Annotation Method (AM): SE32_AM1 [Metabolonote[Archive

    Lifescience Database Archive (English)

    Full Text Available abase search. Peaks with no hit to these databases are then selected to secondary s...earch using exactMassDB and Pep1000 databases. After the database search processes, each database hits are m...SE32_AM1 PowerGet annotation A1 In annotation process, KEGG, KNApSAcK and LipidMAPS are used for primary dat

  12. Experimental Polish-Lithuanian Corpus with the Semantic Annotation Elements

    Directory of Open Access Journals (Sweden)

    Danuta Roszko

    2015-06-01

    Full Text Available Experimental Polish-Lithuanian Corpus with the Semantic Annotation Elements In the article the authors present the experimental Polish-Lithuanian corpus (ECorpPL-LT formed for the idea of Polish-Lithuanian theoretical contrastive studies, a Polish-Lithuanian electronic dictionary, and as help for a sworn translator. The semantic annotation being brought into ECorpPL-LT is extremely useful in Polish-Lithuanian contrastive studies, and also proves helpful in translation work.

  13. Analysis of LYSA-calculus with explicit confidentiality annotations

    DEFF Research Database (Denmark)

    Gao, Han; Nielson, Hanne Riis

    2006-01-01

    Recently there has been an increased research interest in applying process calculi in the verification of cryptographic protocols due to their ability to formally model protocols. This work presents LYSA with explicit confidentiality annotations for indicating the expected behavior of target...... malicious activities performed by attackers as specified by the confidentiality annotations. The proposed analysis approach is fully automatic without the need of human intervention and has been applied successfully to a number of protocols....

  14. Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2009-04-17

    Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycote) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~;;10k genes. ~;;12percent of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~;;9percent rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also,>90percent of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. Weconclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.

  15. MetaStorm: A Public Resource for Customizable Metagenomics Annotation.

    Science.gov (United States)

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution.

  16. AutoFACT: An Automatic Functional Annotation and Classification Tool

    Directory of Open Access Journals (Sweden)

    Lang B Franz

    2005-06-01

    Full Text Available Abstract Background Assignment of function to new molecular sequence data is an essential step in genomics projects. The usual process involves similarity searches of a given sequence against one or more databases, an arduous process for large datasets. Results We present AutoFACT, a fully automated and customizable annotation tool that assigns biologically informative functions to a sequence. Key features of this tool are that it (1 analyzes nucleotide and protein sequence data; (2 determines the most informative functional description by combining multiple BLAST reports from several user-selected databases; (3 assigns putative metabolic pathways, functional classes, enzyme classes, GeneOntology terms and locus names; and (4 generates output in HTML, text and GFF formats for the user's convenience. We have compared AutoFACT to four well-established annotation pipelines. The error rate of functional annotation is estimated to be only between 1–2%. Comparison of AutoFACT to the traditional top-BLAST-hit annotation method shows that our procedure increases the number of functionally informative annotations by approximately 50%. Conclusion AutoFACT will serve as a useful annotation tool for smaller sequencing groups lacking dedicated bioinformatics staff. It is implemented in PERL and runs on LINUX/UNIX platforms. AutoFACT is available at http://megasun.bch.umontreal.ca/Software/AutoFACT.htm.

  17. MetaStorm: A Public Resource for Customizable Metagenomics Annotation.

    Directory of Open Access Journals (Sweden)

    Gustavo Arango-Argoty

    Full Text Available Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/, which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution.

  18. MetaStorm: A Public Resource for Customizable Metagenomics Annotation

    Science.gov (United States)

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S.; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution. PMID:27632579

  19. MIPS: analysis and annotation of genome information in 2007.

    Science.gov (United States)

    Mewes, H W; Dietmann, S; Frishman, D; Gregory, R; Mannhaupt, G; Mayer, K F X; Münsterkötter, M; Ruepp, A; Spannagl, M; Stümpflen, V; Rattei, T

    2008-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) combines automatic processing of large amounts of sequences with manual annotation of selected model genomes. Due to the massive growth of the available data, the depth of annotation varies widely between independent databases. Also, the criteria for the transfer of information from known to orthologous sequences are diverse. To cope with the task of global in-depth genome annotation has become unfeasible. Therefore, our efforts are dedicated to three levels of annotation: (i) the curation of selected genomes, in particular from fungal and plant taxa (e.g. CYGD, MNCDB, MatDB), (ii) the comprehensive, consistent, automatic annotation employing exhaustive methods for the computation of sequence similarities and sequence-related attributes as well as the classification of individual sequences (SIMAP, PEDANT and FunCat) and (iii) the compilation of manually curated databases for protein interactions based on scrutinized information from the literature to serve as an accepted set of reliable annotated interaction data (MPACT, MPPI, CORUM). All databases and tools described as well as the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).

  20. A framework for annotating human genome in disease context.

    Science.gov (United States)

    Xu, Wei; Wang, Huisong; Cheng, Wenqing; Fu, Dong; Xia, Tian; Kibbe, Warren A; Lin, Simon M

    2012-01-01

    Identification of gene-disease association is crucial to understanding disease mechanism. A rapid increase in biomedical literatures, led by advances of genome-scale technologies, poses challenge for manually-curated-based annotation databases to characterize gene-disease associations effectively and timely. We propose an automatic method-The Disease Ontology Annotation Framework (DOAF) to provide a comprehensive annotation of the human genome using the computable Disease Ontology (DO), the NCBO Annotator service and NCBI Gene Reference Into Function (GeneRIF). DOAF can keep the resulting knowledgebase current by periodically executing automatic pipeline to re-annotate the human genome using the latest DO and GeneRIF releases at any frequency such as daily or monthly. Further, DOAF provides a computable and programmable environment which enables large-scale and integrative analysis by working with external analytic software or online service platforms. A user-friendly web interface (doa.nubic.northwestern.edu) is implemented to allow users to efficiently query, download, and view disease annotations and the underlying evidences.

  1. A semi-automatic annotation tool for cooking video

    Science.gov (United States)

    Bianco, Simone; Ciocca, Gianluigi; Napoletano, Paolo; Schettini, Raimondo; Margherita, Roberto; Marini, Gianluca; Gianforme, Giorgio; Pantaleo, Giuseppe

    2013-03-01

    In order to create a cooking assistant application to guide the users in the preparation of the dishes relevant to their profile diets and food preferences, it is necessary to accurately annotate the video recipes, identifying and tracking the foods of the cook. These videos present particular annotation challenges such as frequent occlusions, food appearance changes, etc. Manually annotate the videos is a time-consuming, tedious and error-prone task. Fully automatic tools that integrate computer vision algorithms to extract and identify the elements of interest are not error free, and false positive and false negative detections need to be corrected in a post-processing stage. We present an interactive, semi-automatic tool for the annotation of cooking videos that integrates computer vision techniques under the supervision of the user. The annotation accuracy is increased with respect to completely automatic tools and the human effort is reduced with respect to completely manual ones. The performance and usability of the proposed tool are evaluated on the basis of the time and effort required to annotate the same video sequences.

  2. Robust and distributed hypothesis testing

    CERN Document Server

    Gül, Gökhan

    2017-01-01

    This book generalizes and extends the available theory in robust and decentralized hypothesis testing. In particular, it presents a robust test for modeling errors which is independent from the assumptions that a sufficiently large number of samples is available, and that the distance is the KL-divergence. Here, the distance can be chosen from a much general model, which includes the KL-divergence as a very special case. This is then extended by various means. A minimax robust test that is robust against both outliers as well as modeling errors is presented. Minimax robustness properties of the given tests are also explicitly proven for fixed sample size and sequential probability ratio tests. The theory of robust detection is extended to robust estimation and the theory of robust distributed detection is extended to classes of distributions, which are not necessarily stochastically bounded. It is shown that the quantization functions for the decision rules can also be chosen as non-monotone. Finally, the boo...

  3. Robustness of IPTV business models

    NARCIS (Netherlands)

    Bouwman, H.; Zhengjia, M.; Duin, P. van der; Limonard, S.

    2008-01-01

    The final stage in the STOF method is an evaluation of the robustness of the design, for which the method provides some guidelines. For many innovative services, the future holds numerous uncertainties, which makes evaluating the robustness of a business model a difficult task. In this chapter, we

  4. Robustness Evaluation of Timber Structures

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Sørensen, John Dalsgaard

    2009-01-01

    Robustness of structural systems has obtained a renewed interest due to a much more frequent use of advanced types of structures with limited redundancy and serious consequences in case of failure.......Robustness of structural systems has obtained a renewed interest due to a much more frequent use of advanced types of structures with limited redundancy and serious consequences in case of failure....

  5. PedAM: a database for Pediatric Disease Annotation and Medicine.

    Science.gov (United States)

    Jia, Jinmeng; An, Zhongxin; Ming, Yue; Guo, Yongli; Li, Wei; Li, Xin; Liang, Yunxiang; Guo, Dongming; Tai, Jun; Chen, Geng; Jin, Yaqiong; Liu, Zhimei; Ni, Xin; Shi, Tieliu

    2018-01-04

    There is a significant number of children around the world suffering from the consequence of the misdiagnosis and ineffective treatment for various diseases. To facilitate the precision medicine in pediatrics, a database namely the Pediatric Disease Annotations & Medicines (PedAM) has been built to standardize and classify pediatric diseases. The PedAM integrates both biomedical resources and clinical data from Electronic Medical Records to support the development of computational tools, by which enables robust data analysis and integration. It also uses disease-manifestation (D-M) integrated from existing biomedical ontologies as prior knowledge to automatically recognize text-mined, D-M-specific syntactic patterns from 774 514 full-text articles and 8 848 796 abstracts in MEDLINE. Additionally, disease connections based on phenotypes or genes can be visualized on the web page of PedAM. Currently, the PedAM contains standardized 8528 pediatric disease terms (4542 unique disease concepts and 3986 synonyms) with eight annotation fields for each disease, including definition synonyms, gene, symptom, cross-reference (Xref), human phenotypes and its corresponding phenotypes in the mouse. The database PedAM is freely accessible at http://www.unimd.org/pedam/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. An Annotation Agnostic Algorithm for Detecting Nascent RNA Transcripts in GRO-Seq.

    Science.gov (United States)

    Azofeifa, Joseph G; Allen, Mary A; Lladser, Manuel E; Dowell, Robin D

    2017-01-01

    We present a fast and simple algorithm to detect nascent RNA transcription in global nuclear run-on sequencing (GRO-seq). GRO-seq is a relatively new protocol that captures nascent transcripts from actively engaged polymerase, providing a direct read-out on bona fide transcription. Most traditional assays, such as RNA-seq, measure steady state RNA levels which are affected by transcription, post-transcriptional processing, and RNA stability. GRO-seq data, however, presents unique analysis challenges that are only beginning to be addressed. Here, we describe a new algorithm, Fast Read Stitcher (FStitch), that takes advantage of two popular machine-learning techniques, hidden Markov models and logistic regression, to classify which regions of the genome are transcribed. Given a small user-defined training set, our algorithm is accurate, robust to varying read depth, annotation agnostic, and fast. Analysis of GRO-seq data without a priori need for annotation uncovers surprising new insights into several aspects of the transcription process.

  7. Diversity Indices as Measures of Functional Annotation Methods in Metagenomics Studies

    KAUST Repository

    Jankovic, Boris R.

    2016-01-26

    Applications of high-throughput techniques in metagenomics studies produce massive amounts of data. Fragments of genomic, transcriptomic and proteomic molecules are all found in metagenomics samples. Laborious and meticulous effort in sequencing and functional annotation are then required to, amongst other objectives, reconstruct a taxonomic map of the environment that metagenomics samples were taken from. In addition to computational challenges faced by metagenomics studies, the analysis is further complicated by the presence of contaminants in the samples, potentially resulting in skewed taxonomic analysis. The functional annotation in metagenomics can utilize all available omics data and therefore different methods that are associated with a particular type of data. For example, protein-coding DNA, non-coding RNA or ribosomal RNA data can be used in such an analysis. These methods would have their advantages and disadvantages and the question of comparison among them naturally arises. There are several criteria that can be used when performing such a comparison. Loosely speaking, methods can be evaluated in terms of computational complexity or in terms of the expected biological accuracy. We propose that the concept of diversity that is used in the ecosystems and species diversity studies can be successfully used in evaluating certain aspects of the methods employed in metagenomics studies. We show that when applying the concept of Hill’s diversity, the analysis of variations in the diversity order provides valuable clues into the robustness of methods used in the taxonomical analysis.

  8. Improving integrative searching of systems chemical biology data using semantic annotation.

    Science.gov (United States)

    Chen, Bin; Ding, Ying; Wild, David J

    2012-03-08

    Systems chemical biology and chemogenomics are considered critical, integrative disciplines in modern biomedical research, but require data mining of large, integrated, heterogeneous datasets from chemistry and biology. We previously developed an RDF-based resource called Chem2Bio2RDF that enabled querying of such data using the SPARQL query language. Whilst this work has proved useful in its own right as one of the first major resources in these disciplines, its utility could be greatly improved by the application of an ontology for annotation of the nodes and edges in the RDF graph, enabling a much richer range of semantic queries to be issued. We developed a generalized chemogenomics and systems chemical biology OWL ontology called Chem2Bio2OWL that describes the semantics of chemical compounds, drugs, protein targets, pathways, genes, diseases and side-effects, and the relationships between them. The ontology also includes data provenance. We used it to annotate our Chem2Bio2RDF dataset, making it a rich semantic resource. Through a series of scientific case studies we demonstrate how this (i) simplifies the process of building SPARQL queries, (ii) enables useful new kinds of queries on the data and (iii) makes possible intelligent reasoning and semantic graph mining in chemogenomics and systems chemical biology. Chem2Bio2OWL is available at http://chem2bio2rdf.org/owl. The document is available at http://chem2bio2owl.wikispaces.com.

  9. Improving integrative searching of systems chemical biology data using semantic annotation

    Directory of Open Access Journals (Sweden)

    Chen Bin

    2012-03-01

    Full Text Available Abstract Background Systems chemical biology and chemogenomics are considered critical, integrative disciplines in modern biomedical research, but require data mining of large, integrated, heterogeneous datasets from chemistry and biology. We previously developed an RDF-based resource called Chem2Bio2RDF that enabled querying of such data using the SPARQL query language. Whilst this work has proved useful in its own right as one of the first major resources in these disciplines, its utility could be greatly improved by the application of an ontology for annotation of the nodes and edges in the RDF graph, enabling a much richer range of semantic queries to be issued. Results We developed a generalized chemogenomics and systems chemical biology OWL ontology called Chem2Bio2OWL that describes the semantics of chemical compounds, drugs, protein targets, pathways, genes, diseases and side-effects, and the relationships between them. The ontology also includes data provenance. We used it to annotate our Chem2Bio2RDF dataset, making it a rich semantic resource. Through a series of scientific case studies we demonstrate how this (i simplifies the process of building SPARQL queries, (ii enables useful new kinds of queries on the data and (iii makes possible intelligent reasoning and semantic graph mining in chemogenomics and systems chemical biology. Availability Chem2Bio2OWL is available at http://chem2bio2rdf.org/owl. The document is available at http://chem2bio2owl.wikispaces.com.

  10. Experiments with crowdsourced re-annotation of a POS tagging data set

    DEFF Research Database (Denmark)

    Hovy, Dirk; Plank, Barbara; Søgaard, Anders

    2014-01-01

    Crowdsourcing lets us collect multiple annotations for an item from several annotators. Typically, these are annotations for non-sequential classification tasks. While there has been some work on crowdsourcing named entity annotations, researchers have assumed that syntactic tasks such as part......-of-speech (POS) tagging cannot be crowdsourced. This paper shows that workers can actually annotate sequential data almost as well as experts. Further, we show that the models learned from crowdsourced annotations fare as well as the models learned from expert annotations in downstream tasks....

  11. Robust statistical methods with R

    CERN Document Server

    Jureckova, Jana

    2005-01-01

    Robust statistical methods were developed to supplement the classical procedures when the data violate classical assumptions. They are ideally suited to applied research across a broad spectrum of study, yet most books on the subject are narrowly focused, overly theoretical, or simply outdated. Robust Statistical Methods with R provides a systematic treatment of robust procedures with an emphasis on practical application.The authors work from underlying mathematical tools to implementation, paying special attention to the computational aspects. They cover the whole range of robust methods, including differentiable statistical functions, distance of measures, influence functions, and asymptotic distributions, in a rigorous yet approachable manner. Highlighting hands-on problem solving, many examples and computational algorithms using the R software supplement the discussion. The book examines the characteristics of robustness, estimators of real parameter, large sample properties, and goodness-of-fit tests. It...

  12. Deep Question Answering for protein annotation.

    Science.gov (United States)

    Gobeill, Julien; Gaudinat, Arnaud; Pasche, Emilie; Vishnyakova, Dina; Gaudet, Pascale; Bairoch, Amos; Ruch, Patrick

    2015-01-01

    Biomedical professionals have access to a huge amount of literature, but when they use a search engine, they often have to deal with too many documents to efficiently find the appropriate information in a reasonable time. In this perspective, question-answering (QA) engines are designed to display answers, which were automatically extracted from the retrieved documents. Standard QA engines in literature process a user question, then retrieve relevant documents and finally extract some possible answers out of these documents using various named-entity recognition processes. In our study, we try to answer complex genomics questions, which can be adequately answered only using Gene Ontology (GO) concepts. Such complex answers cannot be found using state-of-the-art dictionary- and redundancy-based QA engines. We compare the effectiveness of two dictionary-based classifiers for extracting correct GO answers from a large set of 100 retrieved abstracts per question. In the same way, we also investigate the power of GOCat, a GO supervised classifier. GOCat exploits the GOA database to propose GO concepts that were annotated by curators for similar abstracts. This approach is called deep QA, as it adds an original classification step, and exploits curated biological data to infer answers, which are not explicitly mentioned in the retrieved documents. We show that for complex answers such as protein functional descriptions, the redundancy phenomenon has a limited effect. Similarly usual dictionary-based approaches are relatively ineffective. In contrast, we demonstrate how existing curated data, beyond information extraction, can be exploited by a supervised classifier, such as GOCat, to massively improve both the quantity and the quality of the answers with a +100% improvement for both recall and precision. Database URL: http://eagl.unige.ch/DeepQA4PA/. © The Author(s) 2015. Published by Oxford University Press.

  13. Designing Robustness to Temperature in a Feedforward Loop Circuit

    OpenAIRE

    Sen, Shaunak; Kim, Jongmin; Murray, Richard M.

    2013-01-01

    Incoherent feedforward loops represent important biomolecular circuit elements capable of a rich set of dynamic behavior including adaptation and pulsed responses. Temperature can modulate some of these properties through its effect on the underlying reaction rate parameters. It is generally unclear how to design such a circuit where the properties are robust to variations in temperature. Here, we address this issue using a combination of tools from control and dynamical systems theory as wel...

  14. Design, validation and annotation of transcriptome-wide oligonucleotide probes for the oligochaete annelid Eisenia fetida.

    Directory of Open Access Journals (Sweden)

    Ping Gong

    Full Text Available High density oligonucleotide probe arrays have increasingly become an important tool in genomics studies. In organisms with incomplete genome sequence, one strategy for oligo probe design is to reduce the number of unique probes that target every non-redundant transcript through bioinformatic analysis and experimental testing. Here we adopted this strategy in making oligo probes for the earthworm Eisenia fetida, a species for which we have sequenced transcriptome-scale expressed sequence tags (ESTs. Our objectives were to identify unique transcripts as targets, to select an optimal and non-redundant oligo probe for each of these target ESTs, and to annotate the selected target sequences. We developed a streamlined and easy-to-follow approach to the design, validation and annotation of species-specific array probes. Four 244K-formatted oligo arrays were designed using eArray and were hybridized to a pooled E. fetida cRNA sample. We identified 63,541 probes with unsaturated signal intensities consistently above the background level. Target transcripts of these probes were annotated using several sequence alignment algorithms. Significant hits were obtained for 37,439 (59% probed targets. We validated and made publicly available 63.5K oligo probes so the earthworm research community can use them to pursue ecological, toxicological, and other functional genomics questions. Our approach is efficient, cost-effective and robust because it (1 does not require a major genomics core facility; (2 allows new probes to be easily added and old probes modified or eliminated when new sequence information becomes available, (3 is not bioinformatics-intensive upfront but does provide opportunities for more in-depth annotation of biological functions for target genes; and (4 if desired, EST orthologs to the UniGene clusters of a reference genome can be identified and selected in order to improve the target gene specificity of designed probes. This approach is

  15. Robust loss functions for boosting.

    Science.gov (United States)

    Kanamori, Takafumi; Takenouchi, Takashi; Eguchi, Shinto; Murata, Noboru

    2007-08-01

    Boosting is known as a gradient descent algorithm over loss functions. It is often pointed out that the typical boosting algorithm, Adaboost, is highly affected by outliers. In this letter, loss functions for robust boosting are studied. Based on the concept of robust statistics, we propose a transformation of loss functions that makes boosting algorithms robust against extreme outliers. Next, the truncation of loss functions is applied to contamination models that describe the occurrence of mislabels near decision boundaries. Numerical experiments illustrate that the proposed loss functions derived from the contamination models are useful for handling highly noisy data in comparison with other loss functions.

  16. Theoretical Framework for Robustness Evaluation

    DEFF Research Database (Denmark)

    Sørensen, John Dalsgaard

    2011-01-01

    This paper presents a theoretical framework for evaluation of robustness of structural systems, incl. bridges and buildings. Typically modern structural design codes require that ‘the consequence of damages to structures should not be disproportional to the causes of the damages’. However, although...... the importance of robustness for structural design is widely recognized the code requirements are not specified in detail, which makes the practical use difficult. This paper describes a theoretical and risk based framework to form the basis for quantification of robustness and for pre-normative guidelines...

  17. Robustness of airline route networks

    Science.gov (United States)

    Lordan, Oriol; Sallan, Jose M.; Escorihuela, Nuria; Gonzalez-Prieto, David

    2016-03-01

    Airlines shape their route network by defining their routes through supply and demand considerations, paying little attention to network performance indicators, such as network robustness. However, the collapse of an airline network can produce high financial costs for the airline and all its geographical area of influence. The aim of this study is to analyze the topology and robustness of the network route of airlines following Low Cost Carriers (LCCs) and Full Service Carriers (FSCs) business models. Results show that FSC hubs are more central than LCC bases in their route network. As a result, LCC route networks are more robust than FSC networks.

  18. ASAP: Amplification, sequencing & annotation of plastomes

    Directory of Open Access Journals (Sweden)

    Folta Kevin M

    2005-12-01

    Full Text Available Abstract Background Availability of DNA sequence information is vital for pursuing structural, functional and comparative genomics studies in plastids. Traditionally, the first step in mining the valuable information within a chloroplast genome requires sequencing a chloroplast plasmid library or BAC clones. These activities involve complicated preparatory procedures like chloroplast DNA isolation or identification of the appropriate BAC clones to be sequenced. Rolling circle amplification (RCA is being used currently to amplify the chloroplast genome from purified chloroplast DNA and the resulting products are sheared and cloned prior to sequencing. Herein we present a universal high-throughput, rapid PCR-based technique to amplify, sequence and assemble plastid genome sequence from diverse species in a short time and at reasonable cost from total plant DNA, using the large inverted repeat region from strawberry and peach as proof of concept. The method exploits the highly conserved coding regions or intergenic regions of plastid genes. Using an informatics approach, chloroplast DNA sequence information from 5 available eudicot plastomes was aligned to identify the most conserved regions. Cognate primer pairs were then designed to generate ~1 – 1.2 kb overlapping amplicons from the inverted repeat region in 14 diverse genera. Results 100% coverage of the inverted repeat region was obtained from Arabidopsis, tobacco, orange, strawberry, peach, lettuce, tomato and Amaranthus. Over 80% coverage was obtained from distant species, including Ginkgo, loblolly pine and Equisetum. Sequence from the inverted repeat region of strawberry and peach plastome was obtained, annotated and analyzed. Additionally, a polymorphic region identified from gel electrophoresis was sequenced from tomato and Amaranthus. Sequence analysis revealed large deletions in these species relative to tobacco plastome thus exhibiting the utility of this method for structural and

  19. False positive reduction in protein-protein interaction predictions using gene ontology annotations

    Directory of Open Access Journals (Sweden)

    Lin Yen-Han

    2007-07-01

    Full Text Available Abstract Background Many crucial cellular operations such as metabolism, signalling, and regulations are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for the lack of solid protein-protein interaction information is poor agreement between experimental findings and computational sets that, in turn, comes from huge false positive predictions in computational approaches. Reduction of false positive predictions and enhancing true positive fraction of computationally predicted protein-protein interaction datasets based on highly confident experimental results has not been adequately investigated. Results Gene Ontology (GO annotations were used to reduce false positive protein-protein interactions (PPI pairs resulting from computational predictions. Using experimentally obtained PPI pairs as a training dataset, eight top-ranking keywords were extracted from GO molecular function annotations. The sensitivity of these keywords is 64.21% in the yeast experimental dataset and 80.83% in the worm experimental dataset. The specificities, a measure of recovery power, of these keywords applied to four predicted PPI datasets for each studied organisms, are 48.32% and 46.49% (by average of four datasets in yeast and worm, respectively. Based on eight top-ranking keywords and co-localization of interacting proteins a set of two knowledge rules were deduced and applied to remove false positive protein pairs. The 'strength', a measure of improvement provided by the rules was defined based on the signal-to-noise ratio and implemented to measure the applicability of knowledge rules applying to the predicted PPI datasets. Depending on the employed PPI-predicting methods, the strength varies between two and ten-fold of randomly removing protein pairs from the datasets. Conclusion Gene Ontology annotations along with the deduced knowledge rules could be implemented to partially

  20. Robust Portfolio Optimization Using Pseudodistances.

    Science.gov (United States)

    Toma, Aida; Leoni-Aubin, Samuela

    2015-01-01

    The presence of outliers in financial asset returns is a frequently occurring phenomenon which may lead to unreliable mean-variance optimized portfolios. This fact is due to the unbounded influence that outliers can have on the mean returns and covariance estimators that are inputs in the optimization procedure. In this paper we present robust estimators of mean and covariance matrix obtained by minimizing an empirical version of a pseudodistance between the assumed model and the true model underlying the data. We prove and discuss theoretical properties of these estimators, such as affine equivariance, B-robustness, asymptotic normality and asymptotic relative efficiency. These estimators can be easily used in place of the classical estimators, thereby providing robust optimized portfolios. A Monte Carlo simulation study and applications to real data show the advantages of the proposed approach. We study both in-sample and out-of-sample performance of the proposed robust portfolios comparing them with some other portfolios known in literature.

  1. Robust methods for data reduction

    CERN Document Server

    Farcomeni, Alessio

    2015-01-01

    Robust Methods for Data Reduction gives a non-technical overview of robust data reduction techniques, encouraging the use of these important and useful methods in practical applications. The main areas covered include principal components analysis, sparse principal component analysis, canonical correlation analysis, factor analysis, clustering, double clustering, and discriminant analysis.The first part of the book illustrates how dimension reduction techniques synthesize available information by reducing the dimensionality of the data. The second part focuses on cluster and discriminant analy

  2. Annotation of nerve cord transcriptome in earthworm Eisenia fetida

    Directory of Open Access Journals (Sweden)

    Vasanthakumar Ponesakki

    2017-12-01

    Full Text Available In annelid worms, the nerve cord serves as a crucial organ to control the sensory and behavioral physiology. The inadequate genome resource of earthworms has prioritized the comprehensive analysis of their transcriptome dataset to monitor the genes express in the nerve cord and predict their role in the neurotransmission and sensory perception of the species. The present study focuses on identifying the potential transcripts and predicting their functional features by annotating the transcriptome dataset of nerve cord tissues prepared by Gong et al., 2010 from the earthworm Eisenia fetida. Totally 9762 transcripts were successfully annotated against the NCBI nr database using the BLASTX algorithm and among them 7680 transcripts were assigned to a total of 44,354 GO terms. The conserve domain analysis indicated the over representation of P-loop NTPase domain and calcium binding EF-hand domain. The COG functional annotation classified 5860 transcript sequences into 25 functional categories. Further, 4502 contig sequences were found to map with 124 KEGG pathways. The annotated contig dataset exhibited 22 crucial neuropeptides having considerable matches to the marine annelid Platynereis dumerilii, suggesting their possible role in neurotransmission and neuromodulation. In addition, 108 human stem cell marker homologs were identified including the crucial epigenetic regulators, transcriptional repressors and cell cycle regulators, which may contribute to the neuronal and segmental regeneration. The complete functional annotation of this nerve cord transcriptome can be further utilized to interpret genetic and molecular mechanisms associated with neuronal development, nervous system regeneration and nerve cord function.

  3. Graph-based sequence annotation using a data integration approach

    Directory of Open Access Journals (Sweden)

    Pesch Robert

    2008-06-01

    Full Text Available The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional assignment. Such approaches generally ignore the links between the information in the reference datasets. These links, however, are valuable for assessing the plausibility of a function assignment and can be used to evaluate the confidence in a prediction. We are working towards a novel annotation system that uses the network of information supporting the function assignment to enrich the annotation process for use by expert curators and predicting the function of previously unannotated genes. In this paper we describe our success in the first stages of this development. We present the data integration steps that are needed to create the core database of integrated reference databases (UniProt, PFAM, PDB, GO and the pathway database Ara- Cyc which has been established in the ONDEX data integration system. We also present a comparison between different methods for integration of GO terms as part of the function assignment pipeline and discuss the consequences of this analysis for improving the accuracy of gene function annotation.

  4. Graph-based sequence annotation using a data integration approach.

    Science.gov (United States)

    Pesch, Robert; Lysenko, Artem; Hindle, Matthew; Hassani-Pak, Keywan; Thiele, Ralf; Rawlings, Christopher; Köhler, Jacob; Taubert, Jan

    2008-08-25

    The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional assignment. Such approaches generally ignore the links between the information in the reference datasets. These links, however, are valuable for assessing the plausibility of a function assignment and can be used to evaluate the confidence in a prediction. We are working towards a novel annotation system that uses the network of information supporting the function assignment to enrich the annotation process for use by expert curators and predicting the function of previously unannotated genes. In this paper we describe our success in the first stages of this development. We present the data integration steps that are needed to create the core database of integrated reference databases (UniProt, PFAM, PDB, GO and the pathway database Ara-Cyc) which has been established in the ONDEX data integration system. We also present a comparison between different methods for integration of GO terms as part of the function assignment pipeline and discuss the consequences of this analysis for improving the accuracy of gene function annotation. The methods and algorithms presented in this publication are an integral part of the ONDEX system which is freely available from http://ondex.sf.net/.

  5. Evaluating Functional Annotations of Enzymes Using the Gene Ontology.

    Science.gov (United States)

    Holliday, Gemma L; Davidson, Rebecca; Akiva, Eyal; Babbitt, Patricia C

    2017-01-01

    The Gene Ontology (GO) (Ashburner et al., Nat Genet 25(1):25-29, 2000) is a powerful tool in the informatics arsenal of methods for evaluating annotations in a protein dataset. From identifying the nearest well annotated homologue of a protein of interest to predicting where misannotation has occurred to knowing how confident you can be in the annotations assigned to those proteins is critical. In this chapter we explore what makes an enzyme unique and how we can use GO to infer aspects of protein function based on sequence similarity. These can range from identification of misannotation or other errors in a predicted function to accurate function prediction for an enzyme of entirely unknown function. Although GO annotation applies to any gene products, we focus here a describing our approach for hierarchical classification of enzymes in the Structure-Function Linkage Database (SFLD) (Akiva et al., Nucleic Acids Res 42(Database issue):D521-530, 2014) as a guide for informed utilisation of annotation transfer based on GO terms.

  6. An annotated corpus with nanomedicine and pharmacokinetic parameters

    Directory of Open Access Journals (Sweden)

    Lewinski NA

    2017-10-01

    Full Text Available Nastassja A Lewinski,1 Ivan Jimenez,1 Bridget T McInnes2 1Department of Chemical and Life Science Engineering, Virginia Commonwealth University, Richmond, VA, 2Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA Abstract: A vast amount of data on nanomedicines is being generated and published, and natural language processing (NLP approaches can automate the extraction of unstructured text-based data. Annotated corpora are a key resource for NLP and information extraction methods which employ machine learning. Although corpora are available for pharmaceuticals, resources for nanomedicines and nanotechnology are still limited. To foster nanotechnology text mining (NanoNLP efforts, we have constructed a corpus of annotated drug product inserts taken from the US Food and Drug Administration’s Drugs@FDA online database. In this work, we present the development of the Engineered Nanomedicine Database corpus to support the evaluation of nanomedicine entity extraction. The data were manually annotated for 21 entity mentions consisting of nanomedicine physicochemical characterization, exposure, and biologic response information of 41 Food and Drug Administration-approved nanomedicines. We evaluate the reliability of the manual annotations and demonstrate the use of the corpus by evaluating two state-of-the-art named entity extraction systems, OpenNLP and Stanford NER. The annotated corpus is available open source and, based on these results, guidelines and suggestions for future development of additional nanomedicine corpora are provided. Keywords: nanotechnology, informatics, natural language processing, text mining, corpora

  7. Elucidating high-dimensional cancer hallmark annotation via enriched ontology.

    Science.gov (United States)

    Yan, Shankai; Wong, Ka-Chun

    2017-09-01

    Cancer hallmark annotation is a promising technique that could discover novel knowledge about cancer from the biomedical literature. The automated annotation of cancer hallmarks could reveal relevant cancer transformation processes in the literature or extract the articles that correspond to the cancer hallmark of interest. It acts as a complementary approach that can retrieve knowledge from massive text information, advancing numerous focused studies in cancer research. Nonetheless, the high-dimensional nature of cancer hallmark annotation imposes a unique challenge. To address the curse of dimensionality, we compared multiple cancer hallmark annotation methods on 1580 PubMed abstracts. Based on the insights, a novel approach, UDT-RF, which makes use of ontological features is proposed. It expands the feature space via the Medical Subject Headings (MeSH) ontology graph and utilizes novel feature selections for elucidating the high-dimensional cancer hallmark annotation space. To demonstrate its effectiveness, state-of-the-art methods are compared and evaluated by a multitude of performance metrics, revealing the full performance spectrum on the full set of cancer hallmarks. Several case studies are conducted, demonstrating how the proposed approach could reveal novel insights into cancers. https://github.com/cskyan/chmannot. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Consumer energy research: an annotated bibliography. Vol. 3

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, D.C.; McDougall, G.H.G.

    1983-04-01

    This annotated bibliography attempts to provide a comprehensive package of existing information in consumer related energy research. A concentrated effort was made to collect unpublished material as well as material from journals and other sources, including governments, utilities research institutes and private firms. A deliberate effort was made to include agencies outside North America. For the most part the bibliography is limited to annotations of empiracal studies. However, it includes a number of descriptive reports which appear to make a significant contribution to understanding consumers and energy use. The format of the annotations displays the author, date of publication, title and source of the study. Annotations of empirical studies are divided into four parts: objectives, methods, variables and findings/implications. Care was taken to provide a reasonable amount of detail in the annotations to enable the reader to understand the methodology, the results and the degree to which the implications fo the study can be generalized to other situations. Studies are arranged alphabetically by author. The content of the studies reviewed is classified in a series of tables which are intended to provide a summary of sources, types and foci of the various studies. These tables are intended to aid researchers interested in specific topics to locate those studies most relevant to their work. The studies are categorized using a number of different classification criteria, for example, methodology used, type of energy form, type of policy initiative, and type of consumer activity. A general overview of the studies is also presented. 17 tabs.

  9. Automatic annotation of lecture videos for multimedia driven pedagogical platforms

    Directory of Open Access Journals (Sweden)

    Ali Shariq Imran

    2016-12-01

    Full Text Available Today’s eLearning websites are heavily loaded with multimedia contents, which are often unstructured, unedited, unsynchronized, and lack inter-links among different multimedia components. Hyperlinking different media modality may provide a solution for quick navigation and easy retrieval of pedagogical content in media driven eLearning websites. In addition, finding meta-data information to describe and annotate media content in eLearning platforms is challenging, laborious, prone to errors, and time-consuming task. Thus annotations for multimedia especially of lecture videos became an important part of video learning objects. To address this issue, this paper proposes three major contributions namely, automated video annotation, the 3-Dimensional (3D tag clouds, and the hyper interactive presenter (HIP eLearning platform. Combining existing state-of-the-art SIFT together with tag cloud, a novel approach for automatic lecture video annotation for the HIP is proposed. New video annotations are implemented automatically providing the needed random access in lecture videos within the platform, and a 3D tag cloud is proposed as a new way of user interaction mechanism. A preliminary study of the usefulness of the system has been carried out, and the initial results suggest that 70% of the students opted for using HIP as their preferred eLearning platform at Gjøvik University College (GUC.

  10. An annotated corpus with nanomedicine and pharmacokinetic parameters.

    Science.gov (United States)

    Lewinski, Nastassja A; Jimenez, Ivan; McInnes, Bridget T

    2017-01-01

    A vast amount of data on nanomedicines is being generated and published, and natural language processing (NLP) approaches can automate the extraction of unstructured text-based data. Annotated corpora are a key resource for NLP and information extraction methods which employ machine learning. Although corpora are available for pharmaceuticals, resources for nanomedicines and nanotechnology are still limited. To foster nanotechnology text mining (NanoNLP) efforts, we have constructed a corpus of annotated drug product inserts taken from the US Food and Drug Administration's Drugs@FDA online database. In this work, we present the development of the Engineered Nanomedicine Database corpus to support the evaluation of nanomedicine entity extraction. The data were manually annotated for 21 entity mentions consisting of nanomedicine physicochemical characterization, exposure, and biologic response information of 41 Food and Drug Administration-approved nanomedicines. We evaluate the reliability of the manual annotations and demonstrate the use of the corpus by evaluating two state-of-the-art named entity extraction systems, OpenNLP and Stanford NER. The annotated corpus is available open source and, based on these results, guidelines and suggestions for future development of additional nanomedicine corpora are provided.

  11. Annotating abstract pronominal anaphora in the DAD project

    DEFF Research Database (Denmark)

    Navarretta, Costanza; Olsen, Sussi Anni

    2008-01-01

    n this paper we present an extension of the MATE/GNOME annotation scheme for anaphora (Poesio 2004) which accounts for abstract anaphora in Danish and Italian. By abstract anaphora it is here meant pronouns whose linguistic antecedents are verbal phrases, clauses and discourse segments. The exten......n this paper we present an extension of the MATE/GNOME annotation scheme for anaphora (Poesio 2004) which accounts for abstract anaphora in Danish and Italian. By abstract anaphora it is here meant pronouns whose linguistic antecedents are verbal phrases, clauses and discourse segments....... The extended scheme, which we call the DAD annotation scheme, allows to annotate information about abstract anaphora which is important to investigate their use, see Webber (1988), Gundel et al. (2003), Navarretta (2004) and which can influence their automatic treatment. Intercoder agreement scores obtained...... by applying the DAD annotation scheme on texts and dialogues in the two languages are given and show that th information proposed in the scheme can be recognised in a reliable way....

  12. A Resource of Quantitative Functional Annotation for Homo sapiens Genes.

    Science.gov (United States)

    Taşan, Murat; Drabkin, Harold J; Beaver, John E; Chua, Hon Nian; Dunham, Julie; Tian, Weidong; Blake, Judith A; Roth, Frederick P

    2012-02-01

    The body of human genomic and proteomic evidence continues to grow at ever-increasing rates, while annotation efforts struggle to keep pace. A surprisingly small fraction of human genes have clear, documented associations with specific functions, and new functions continue to be found for characterized genes. Here we assembled an integrated collection of diverse genomic and proteomic data for 21,341 human genes and make quantitative associations of each to 4333 Gene Ontology terms. We combined guilt-by-profiling and guilt-by-association approaches to exploit features unique to the data types. Performance was evaluated by cross-validation, prospective validation, and by manual evaluation with the biological literature. Functional-linkage networks were also constructed, and their utility was demonstrated by identifying candidate genes related to a glioma FLN using a seed network from genome-wide association studies. Our annotations are presented-alongside existing validated annotations-in a publicly accessible and searchable web interface.

  13. Annotation-Based Whole Genomic Prediction and Selection

    DEFF Research Database (Denmark)

    Kadarmideen, Haja; Do, Duy Ngoc; Janss, Luc

    Genomic selection is widely used in both animal and plant species, however, it is performed with no input from known genomic or biological role of genetic variants and therefore is a black box approach in a genomic era. This study investigated the role of different genomic regions and detected QTLs...... in their contribution to estimated genomic variances and in prediction of genomic breeding values by applying SNP annotation approaches to feed efficiency. Ensembl Variant Predictor (EVP) and Pig QTL database were used as the source of genomic annotation for 60K chip. Genomic prediction was performed using the Bayes...... classes. Predictive accuracy was 0.531, 0.532, 0.302, and 0.344 for DFI, RFI, ADG and BF, respectively. The contribution per SNP to total genomic variance was similar among annotated classes across different traits. Predictive performance of SNP classes did not significantly differ from randomized SNP...

  14. Annotating smart environment sensor data for activity learning.

    Science.gov (United States)

    Szewcyzk, S; Dwan, K; Minor, B; Swedlove, B; Cook, D

    2009-01-01

    The pervasive sensing technologies found in smart homes offer unprecedented opportunities for providing health monitoring and assistance to individuals experiencing difficulties living independently at home. In order to monitor the functional health of smart home residents, we need to design technologies that recognize and track the activities that people perform at home. Machine learning techniques can perform this task, but the software algorithms rely upon large amounts of sample data that is correctly labeled with the corresponding activity. Labeling, or annotating, sensor data with the corresponding activity can be time consuming, may require input from the smart home resident, and is often inaccurate. Therefore, in this paper we investigate four alternative mechanisms for annotating sensor data with a corresponding activity label. We evaluate the alternative methods along the dimensions of annotation time, resident burden, and accuracy using sensor data collected in a real smart apartment.

  15. Use of Annotations for Component and Framework Interoperability

    Science.gov (United States)

    David, O.; Lloyd, W.; Carlson, J.; Leavesley, G. H.; Geter, F.

    2009-12-01

    The popular programming languages Java and C# provide annotations, a form of meta-data construct. Software frameworks for web integration, web services, database access, and unit testing now take advantage of annotations to reduce the complexity of APIs and the quantity of integration code between the application and framework infrastructure. Adopting annotation features in frameworks has been observed to lead to cleaner and leaner application code. The USDA Object Modeling System (OMS) version 3.0 fully embraces the annotation approach and additionally defines a meta-data standard for components and models. In version 3.0 framework/model integration previously accomplished using API calls is now achieved using descriptive annotations. This enables the framework to provide additional functionality non-invasively such as implicit multithreading, and auto-documenting capabilities while achieving a significant reduction in the size of the model source code. Using a non-invasive methodology leads to models and modeling components with only minimal dependencies on the modeling framework. Since models and modeling components are not directly bound to framework by the use of specific APIs and/or data types they can more easily be reused both within the framework as well as outside of it. To study the effectiveness of an annotation based framework approach with other modeling frameworks, a framework-invasiveness study was conducted to evaluate the effects of framework design on model code quality. A monthly water balance model was implemented across several modeling frameworks and several software metrics were collected. The metrics selected were measures of non-invasive design methods for modeling frameworks from a software engineering perspective. It appears that the use of annotations positively impacts several software quality measures. In a next step, the PRMS model was implemented in OMS 3.0 and is currently being implemented for water supply forecasting in the

  16. Image annotation based on positive-negative instances learning

    Science.gov (United States)

    Zhang, Kai; Hu, Jiwei; Liu, Quan; Lou, Ping

    2017-07-01

    Automatic image annotation is now a tough task in computer vision, the main sense of this tech is to deal with managing the massive image on the Internet and assisting intelligent retrieval. This paper designs a new image annotation model based on visual bag of words, using the low level features like color and texture information as well as mid-level feature as SIFT, and mixture the pic2pic, label2pic and label2label correlation to measure the correlation degree of labels and images. We aim to prune the specific features for each single label and formalize the annotation task as a learning process base on Positive-Negative Instances Learning. Experiments are performed using the Corel5K Dataset, and provide a quite promising result when comparing with other existing methods.

  17. Tagging like Humans: Diverse and Distinct Image Annotation

    KAUST Repository

    Wu, Baoyuan

    2018-03-31

    In this work we propose a new automatic image annotation model, dubbed {\\\\bf diverse and distinct image annotation} (D2IA). The generative model D2IA is inspired by the ensemble of human annotations, which create semantically relevant, yet distinct and diverse tags. In D2IA, we generate a relevant and distinct tag subset, in which the tags are relevant to the image contents and semantically distinct to each other, using sequential sampling from a determinantal point process (DPP) model. Multiple such tag subsets that cover diverse semantic aspects or diverse semantic levels of the image contents are generated by randomly perturbing the DPP sampling process. We leverage a generative adversarial network (GAN) model to train D2IA. Extensive experiments including quantitative and qualitative comparisons, as well as human subject studies, on two benchmark datasets demonstrate that the proposed model can produce more diverse and distinct tags than the state-of-the-arts.

  18. Sensor Control And Film Annotation For Long Range, Standoff Reconnaissance

    Science.gov (United States)

    Schmidt, Thomas G.; Peters, Owen L.; Post, Lawrence H.

    1984-12-01

    This paper describes a Reconnaissance Data Annotation System that incorporates off-the-shelf technology and system designs providing a high degree of adaptability and interoperability to satisfy future reconnaissance data requirements. The history of data annotation for reconnaissance is reviewed in order to provide the base from which future developments can be assessed and technical risks minimized. The system described will accommodate new developments in recording head assemblies and the incorporation of advanced cameras of both the film and electro-optical type. Use of microprocessor control and digital bus inter-face form the central design philosophy. For long range, high altitude, standoff missions, the Data Annotation System computes the projected latitude and longitude of central target position from aircraft position and attitude. This complements the use of longer ranges and high altitudes for reconnaissance missions.

  19. Processing sequence annotation data using the Lua programming language.

    Science.gov (United States)

    Ueno, Yutaka; Arita, Masanori; Kumagai, Toshitaka; Asai, Kiyoshi

    2003-01-01

    The data processing language in a graphical software tool that manages sequence annotation data from genome databases should provide flexible functions for the tasks in molecular biology research. Among currently available languages we adopted the Lua programming language. It fulfills our requirements to perform computational tasks for sequence map layouts, i.e. the handling of data containers, symbolic reference to data, and a simple programming syntax. Upon importing a foreign file, the original data are first decomposed in the Lua language while maintaining the original data schema. The converted data are parsed by the Lua interpreter and the contents are stored in our data warehouse. Then, portions of annotations are selected and arranged into our catalog format to be depicted on the sequence map. Our sequence visualization program was successfully implemented, embedding the Lua language for processing of annotation data and layout script. The program is available at http://staff.aist.go.jp/yutaka.ueno/guppy/.

  20. An Atlas of annotations of Hydra vulgaris transcriptome.

    Science.gov (United States)

    Evangelista, Daniela; Tripathi, Kumar Parijat; Guarracino, Mario Rosario

    2016-09-22

    RNA sequencing takes advantage of the Next Generation Sequencing (NGS) technologies for analyzing RNA transcript counts with an excellent accuracy. Trying to interpret this huge amount of data in biological information is still a key issue, reason for which the creation of web-resources useful for their analysis is highly desiderable. Starting from a previous work, Transcriptator, we present the Atlas of Hydra's vulgaris, an extensible web tool in which its complete transcriptome is annotated. In order to provide to the users an advantageous resource that include the whole functional annotated transcriptome of Hydra vulgaris water polyp, we implemented the Atlas web-tool contains 31.988 accesible and downloadable transcripts of this non-reference model organism. Atlas, as a freely available resource, can be considered a valuable tool to rapidly retrieve functional annotation for transcripts differentially expressed in Hydra vulgaris exposed to the distinct experimental treatments. WEB RESOURCE URL: http://www-labgtp.na.icar.cnr.it/Atlas .

  1. Rfam: annotating families of non-coding RNA sequences.

    Science.gov (United States)

    Daub, Jennifer; Eberhardt, Ruth Y; Tate, John G; Burge, Sarah W

    2015-01-01

    The primary task of the Rfam database is to collate experimentally validated noncoding RNA (ncRNA) sequences from the published literature and facilitate the prediction and annotation of new homologues in novel nucleotide sequences. We group homologous ncRNA sequences into "families" and related families are further grouped into "clans." We collate and manually curate data cross-references for these families from other databases and external resources. Our Web site offers researchers a simple interface to Rfam and provides tools with which to annotate their own sequences using our covariance models (CMs), through our tools for searching, browsing, and downloading information on Rfam families. In this chapter, we will work through examples of annotating a query sequence, collating family information, and searching for data.

  2. Automatically Annotated Mapping for Indoor Mobile Robot Applications

    DEFF Research Database (Denmark)

    Özkil, Ali Gürcan; Howard, Thomas J.

    2012-01-01

    This paper presents a new and practical method for mapping and annotating indoor environments for mobile robot use. The method makes use of 2D occupancy grid maps for metric representation, and topology maps to indicate the connectivity of the ‘places-of-interests’ in the environment. Novel use...... localization and mapping in topology space, and fuses camera and robot pose estimations to build an automatically annotated global topo-metric map. It is developed as a framework for a hospital service robot and tested in a real hospital. Experiments show that the method is capable of producing globally...... consistent, automatically annotated hybrid metric-topological maps that is needed by mobile service robots....

  3. Robust design optimization using the price of robustness, robust least squares and regularization methods

    Science.gov (United States)

    Bukhari, Hassan J.

    2017-12-01

    In this paper a framework for robust optimization of mechanical design problems and process systems that have parametric uncertainty is presented using three different approaches. Robust optimization problems are formulated so that the optimal solution is robust which means it is minimally sensitive to any perturbations in parameters. The first method uses the price of robustness approach which assumes the uncertain parameters to be symmetric and bounded. The robustness for the design can be controlled by limiting the parameters that can perturb.The second method uses the robust least squares method to determine the optimal parameters when data itself is subjected to perturbations instead of the parameters. The last method manages uncertainty by restricting the perturbation on parameters to improve sensitivity similar to Tikhonov regularization. The methods are implemented on two sets of problems; one linear and the other non-linear. This methodology will be compared with a prior method using multiple Monte Carlo simulation runs which shows that the approach being presented in this paper results in better performance.

  4. Saint: a lightweight integration environment for model annotation.

    Science.gov (United States)

    Lister, Allyson L; Pocock, Matthew; Taschuk, Morgan; Wipat, Anil

    2009-11-15

    Saint is a web application which provides a lightweight annotation integration environment for quantitative biological models. The system enables modellers to rapidly mark up models with biological information derived from a range of data sources. Saint is freely available for use on the web at http://www.cisban.ac.uk/saint. The web application is implemented in Google Web Toolkit and Tomcat, with all major browsers supported. The Java source code is freely available for download at http://saint-annotate.sourceforge.net. The Saint web server requires an installation of libSBML and has been tested on Linux (32-bit Ubuntu 8.10 and 9.04).

  5. ONEMercury: Towards Automatic Annotation of Earth Science Metadata

    Science.gov (United States)

    Tuarob, S.; Pouchard, L. C.; Noy, N.; Horsburgh, J. S.; Palanisamy, G.

    2012-12-01

    Earth sciences have become more data-intensive, requiring access to heterogeneous data collected from multiple places, times, and thematic scales. For example, research on climate change may involve exploring and analyzing observational data such as the migration of animals and temperature shifts across the earth, as well as various model-observation inter-comparison studies. Recently, DataONE, a federated data network built to facilitate access to and preservation of environmental and ecological data, has come to exist. ONEMercury has recently been implemented as part of the DataONE project to serve as a portal for discovering and accessing environmental and observational data across the globe. ONEMercury harvests metadata from the data hosted by multiple data repositories and makes it searchable via a common search interface built upon cutting edge search engine technology, allowing users to interact with the system, intelligently filter the search results on the fly, and fetch the data from distributed data sources. Linking data from heterogeneous sources always has a cost. A problem that ONEMercury faces is the different levels of annotation in the harvested metadata records. Poorly annotated records tend to be missed during the search process as they lack meaningful keywords. Furthermore, such records would not be compatible with the advanced search functionality offered by ONEMercury as the interface requires a metadata record be semantically annotated. The explosion of the number of metadata records harvested from an increasing number of data repositories makes it impossible to annotate the harvested records manually, urging the need for a tool capable of automatically annotating poorly curated metadata records. In this paper, we propose a topic-model (TM) based approach for automatic metadata annotation. Our approach mines topics in the set of well annotated records and suggests keywords for poorly annotated records based on topic similarity. We utilize the

  6. LHCB RICH gas system proposal

    CERN Document Server

    Bosteels, Michel; Haider, S

    2001-01-01

    Both LHCb RICH will be operated with fluorocarbon as gas radiator. RICH 1 will be filled with 4m^3 of C4F10 and RICH 2 with 100m^3 of CF4. The gas systems will run as a closed loop circulation and a gas recovery system within the closed loop is planned for RICH 1, where the recovery of the CF4 will only be realised during filling and emptying of the detector. Inline gas purification is foreseen for the gas systems in order to limit water and oxygen impurities.

  7. Platelet Rich Plasma: What should the rheumatologist expect?

    Directory of Open Access Journals (Sweden)

    Sílvia Fernandes

    2015-07-01

    Full Text Available In the last few decades, thousands of patients have benefited from platelet rich plasma (PRP therapies, emerging as a safe alternative in many different medical fields. Current evidence suggests that PRP may be of benefit over standard treatment in osteoarthritis patients and, in the musculoskeletal soft tissue injuries potential healing effects are waiting to be confirmed with robust evidence. Finally, in systemic rheumatic diseases PRP seems to have a role to play in the treatment of extra-articular symptoms.

  8. Advances in robust fractional control

    CERN Document Server

    Padula, Fabrizio

    2015-01-01

    This monograph presents design methodologies for (robust) fractional control systems. It shows the reader how to take advantage of the superior flexibility of fractional control systems compared with integer-order systems in achieving more challenging control requirements. There is a high degree of current interest in fractional systems and fractional control arising from both academia and industry and readers from both milieux are catered to in the text. Different design approaches having in common a trade-off between robustness and performance of the control system are considered explicitly. The text generalizes methodologies, techniques and theoretical results that have been successfully applied in classical (integer) control to the fractional case. The first part of Advances in Robust Fractional Control is the more industrially-oriented. It focuses on the design of fractional controllers for integer processes. In particular, it considers fractional-order proportional-integral-derivative controllers, becau...

  9. Robustness of digital artist authentication

    DEFF Research Database (Denmark)

    Jacobsen, Robert; Nielsen, Morten

    In many cases it is possible to determine the authenticity of a painting from digital reproductions of the paintings; this has been demonstrated for a variety of artists and with different approaches. Common to all these methods in digital artist authentication is that the potential of the method...... is in focus, while the robustness has not been considered, i.e. the degree to which the data collection process influences the decision of the method. However, in order for an authentication method to be successful in practice, it needs to be robust to plausible error sources from the data collection....... In this paper we investigate the robustness of the newly proposed authenticity method introduced by the authors based on second generation multiresolution analysis. This is done by modelling a number of realistic factors that can occur in the data collection....

  10. Attractive ellipsoids in robust control

    CERN Document Server

    Poznyak, Alexander; Azhmyakov, Vadim

    2014-01-01

    This monograph introduces a newly developed robust-control design technique for a wide class of continuous-time dynamical systems called the “attractive ellipsoid method.” Along with a coherent introduction to the proposed control design and related topics, the monograph studies nonlinear affine control systems in the presence of uncertainty and presents a constructive and easily implementable control strategy that guarantees certain stability properties. The authors discuss linear-style feedback control synthesis in the context of the above-mentioned systems. The development and physical implementation of high-performance robust-feedback controllers that work in the absence of complete information is addressed, with numerous examples to illustrate how to apply the attractive ellipsoid method to mechanical and electromechanical systems. While theorems are proved systematically, the emphasis is on understanding and applying the theory to real-world situations. Attractive Ellipsoids in Robust Control will a...

  11. Robustness of holonomic quantum gates

    International Nuclear Information System (INIS)

    Solinas, P.; Zanardi, P.; Zanghi, N.

    2005-01-01

    Full text: If the driving field fluctuates during the quantum evolution this produces errors in the applied operator. The holonomic (and geometrical) quantum gates are believed to be robust against some kind of noise. Because of the geometrical dependence of the holonomic operators can be robust against this kind of noise; in fact if the fluctuations are fast enough they cancel out leaving the final operator unchanged. I present the numerical studies of holonomic quantum gates subject to this parametric noise, the fidelity of the noise and ideal evolution is calculated for different noise correlation times. The holonomic quantum gates seem robust not only for fast fluctuating fields but also for slow fluctuating fields. These results can be explained as due to the geometrical feature of the holonomic operator: for fast fluctuating fields the fluctuations are canceled out, for slow fluctuating fields the fluctuations do not perturb the loop in the parameter space. (author)

  12. Robust estimation and hypothesis testing

    CERN Document Server

    Tiku, Moti L

    2004-01-01

    In statistical theory and practice, a certain distribution is usually assumed and then optimal solutions sought. Since deviations from an assumed distribution are very common, one cannot feel comfortable with assuming a particular distribution and believing it to be exactly correct. That brings the robustness issue in focus. In this book, we have given statistical procedures which are robust to plausible deviations from an assumed mode. The method of modified maximum likelihood estimation is used in formulating these procedures. The modified maximum likelihood estimators are explicit functions of sample observations and are easy to compute. They are asymptotically fully efficient and are as efficient as the maximum likelihood estimators for small sample sizes. The maximum likelihood estimators have computational problems and are, therefore, elusive. A broad range of topics are covered in this book. Solutions are given which are easy to implement and are efficient. The solutions are also robust to data anomali...

  13. Robustness in Railway Operations (RobustRailS)

    DEFF Research Database (Denmark)

    Jensen, Jens Parbo; Nielsen, Otto Anker

    This study considers the problem of enhancing railway timetable robustness without adding slack time, hence increasing the travel time. The approach integrates a transit assignment model to assess how passengers adapt their behaviour whenever operations are changed. First, the approach considers...

  14. A Robust Design Applicability Model

    DEFF Research Database (Denmark)

    Ebro, Martin; Lars, Krogstie; Howard, Thomas J.

    2015-01-01

    to be applicable in organisations assigning a high importance to one or more factors that are known to be impacted by RD, while also experiencing a high level of occurrence of this factor. The RDAM supplements existing maturity models and metrics to provide a comprehensive set of data to support management......This paper introduces a model for assessing the applicability of Robust Design (RD) in a project or organisation. The intention of the Robust Design Applicability Model (RDAM) is to provide support for decisions by engineering management considering the relevant level of RD activities...

  15. Ins-Robust Primitive Words

    OpenAIRE

    Srivastava, Amit Kumar; Kapoor, Kalpesh

    2017-01-01

    Let Q be the set of primitive words over a finite alphabet with at least two symbols. We characterize a class of primitive words, Q_I, referred to as ins-robust primitive words, which remain primitive on insertion of any letter from the alphabet and present some properties that characterizes words in the set Q_I. It is shown that the language Q_I is dense. We prove that the language of primitive words that are not ins-robust is not context-free. We also present a linear time algorithm to reco...

  16. Information rich display design

    International Nuclear Information System (INIS)

    Welch, Robin; Braseth, Alf Ove; Veland, Oeystein

    2004-01-01

    This paper presents the concept Information Rich Displays. The purpose of Information Rich Displays (IRDs) is to condensate prevailing information in process displays in such a way that each display format (picture) contains more relevant information for the user. Compared to traditional process control displays, this new concept allows the operator to attain key information at a glance and at the same time allows for improved monitoring of larger portions of the process. This again allows for reduced navigation between both process and trend displays and ease the cognitive demand on the operator. This concept has been created while working on designing display prototypes for the offshore petroleum production facilities of tomorrow. Offshore installations basically consist of wells, separation trains (where oil, gas and water are separated from each other), an oil tax measurement system (where oil quality is measured and the pressure increased to allow for export), gas compression (compression of gas for export) and utility systems (water treatment, chemical systems etc.). This means that an offshore control room operator has to deal with a complex process that comprises several functionally different systems. The need for a new approach to offshore display format design is in particular based on shortcomings in today's designs related to the keyhole effect, where the display format only reveals a fraction of the whole process. Furthermore, the upcoming introduction of larger off- and on-shore operation centres will increase the size and complexity of the operators' work domain. In the light of the increased demands on the operator, the proposed IRDs aim to counter the negative effects this may have on the workload. In this work we have attempted to classify the wide range of different roles an operator can have in different situations. The information content and amount being presented to the operator in a display should be viewed in context of the roles the

  17. A robust approach based on Weibull distribution for clustering gene expression data

    Directory of Open Access Journals (Sweden)

    Gong Binsheng

    2011-05-01

    Full Text Available Abstract Background Clustering is a widely used technique for analysis of gene expression data. Most clustering methods group genes based on the distances, while few methods group genes according to the similarities of the distributions of the gene expression levels. Furthermore, as the biological annotation resources accumulated, an increasing number of genes have been annotated into functional categories. As a result, evaluating the performance of clustering methods in terms of the functional consistency of the resulting clusters is of great interest. Results In this paper, we proposed the WDCM (Weibull Distribution-based Clustering Method, a robust approach for clustering gene expression data, in which the gene expressions of individual genes are considered as the random variables following unique Weibull distributions. Our WDCM is based on the concept that the genes with similar expression profiles have similar distribution parameters, and thus the genes are clustered via the Weibull distribution parameters. We used the WDCM to cluster three cancer gene expression data sets from the lung cancer, B-cell follicular lymphoma and bladder carcinoma and obtained well-clustered results. We compared the performance of WDCM with k-means and Self Organizing Map (SOM using functional annotation information given by the Gene Ontology (GO. The results showed that the functional annotation ratios of WDCM are higher than those of the other methods. We also utilized the external measure Adjusted Rand Index to validate the performance of the WDCM. The comparative results demonstrate that the WDCM provides the better clustering performance compared to k-means and SOM algorithms. The merit of the proposed WDCM is that it can be applied to cluster incomplete gene expression data without imputing the missing values. Moreover, the robustness of WDCM is also evaluated on the incomplete data sets. Conclusions The results demonstrate that our WDCM produces clusters

  18. Automated detection of microaneurysms using robust blob descriptors

    Science.gov (United States)

    Adal, K.; Ali, S.; Sidibé, D.; Karnowski, T.; Chaum, E.; Mériaudeau, F.

    2013-03-01

    Microaneurysms (MAs) are among the first signs of diabetic retinopathy (DR) that can be seen as round dark-red structures in digital color fundus photographs of retina. In recent years, automated computer-aided detection and diagnosis (CAD) of MAs has attracted many researchers due to its low-cost and versatile nature. In this paper, the MA detection problem is modeled as finding interest points from a given image and several interest point descriptors are introduced and integrated with machine learning techniques to detect MAs. The proposed approach starts by applying a novel fundus image contrast enhancement technique using Singular Value Decomposition (SVD) of fundus images. Then, Hessian-based candidate selection algorithm is applied to extract image regions which are more likely to be MAs. For each candidate region, robust low-level blob descriptors such as Speeded Up Robust Features (SURF) and Intensity Normalized Radon Transform are extracted to characterize candidate MA regions. The combined features are then classified using SVM which has been trained using ten manually annotated training images. The performance of the overall system is evaluated on Retinopathy Online Challenge (ROC) competition database. Preliminary results show the competitiveness of the proposed candidate selection techniques against state-of-the art methods as well as the promising future for the proposed descriptors to be used in the localization of MAs from fundus images.

  19. Annotated Bibliography; Freedom of Information Center Reports and Summary Papers.

    Science.gov (United States)

    Freedom of Information Center, Columbia, MO.

    This bibliography lists and annotates almost 400 information reports, opinion papers, and summary papers dealing with freedom of information. Topics covered include the nature of press freedom and increased press efforts toward more open access to information; the press situation in many foreign countries, including France, Sweden, Communist…

  20. Annotated bibliography of remote sensing methods for monitoring desertification

    Science.gov (United States)

    Walker, A.S.; Robinove, Charles J.

    1981-01-01

    Remote sensing techniques are valuable for locating, assessing, and monitoring desertification. Remotely sensed data provide a permanent record of the condition of the land in a format that allows changes in land features and condition to be measured. The annotated bibliography of 118 items discusses remote sensing methods that may be applied to desertification studies.

  1. JAFA: a protein function annotation meta-server

    DEFF Research Database (Denmark)

    Friedberg, Iddo; Harder, Tim; Godzik, Adam

    2006-01-01

    Annotations, or JAFA server. JAFA queries several function prediction servers with a protein sequence and assembles the returned predictions in a legible, non-redundant format. In this manner, JAFA combines the predictions of several servers to provide a comprehensive view of what are the predicted functions...

  2. Feeling Expression Using Avatars and Its Consistency for Subjective Annotation

    Science.gov (United States)

    Ito, Fuyuko; Sasaki, Yasunari; Hiroyasu, Tomoyuki; Miki, Mitsunori

    Consumer Generated Media(CGM) is growing rapidly and the amount of content is increasing. However, it is often difficult for users to extract important contents and the existence of contents recording their experiences can easily be forgotten. As there are no methods or systems to indicate the subjective value of the contents or ways to reuse them, subjective annotation appending subjectivity, such as feelings and intentions, to contents is needed. Representation of subjectivity depends on not only verbal expression, but also nonverbal expression. Linguistically expressed annotation, typified by collaborative tagging in social bookmarking systems, has come into widespread use, but there is no system of nonverbally expressed annotation on the web. We propose the utilization of controllable avatars as a means of nonverbal expression of subjectivity, and confirmed the consistency of feelings elicited by avatars over time for an individual and in a group. In addition, we compared the expressiveness and ease of subjective annotation between collaborative tagging and controllable avatars. The result indicates that the feelings evoked by avatars are consistent in both cases, and using controllable avatars is easier than collaborative tagging for representing feelings elicited by contents that do not express meaning, such as photos.

  3. Douglas-fir tussock moth: an annotated bibliography.

    Science.gov (United States)

    Robert W. Campbell; Lorna C. Youngs

    1978-01-01

    This annotated bibliography includes references to 338 papers. Each deals in some way with either the Douglas-fir tussock moth, Orgyia pseudotsugata (McDunnough), or a related species. Specifically, 210 publications and 82 unpublished documents make some reference, at least, to the Douglas-fir tussock moth; 55 are concerned with other species in...

  4. WORKSHOPS FOR THE HANDICAPPED, AN ANNOTATED BIBLIOGRAPHY--NO. 3.

    Science.gov (United States)

    PERKINS, DOROTHY C.; AND OTHERS

    THESE 126 ANNOTATIONS ARE THE THIRD VOLUME OF A CONTINUING SERIES OF BIBLIOGRAPHIES LISTING ARTICLES APPEARING IN JOURNALS AND CONFERENCE, RESEARCH, AND PROJECT REPORTS. LISTINGS INCLUDE TESTS, TEST RESULTS, STAFF TRAINING PROGRAMS, GUIDES FOR COUNSELORS AND TEACHERS, AND ARCHITECTURAL PLANNING, AND RELATE TO THE MENTALLY RETARDED, EMOTIONALLY…

  5. Workshops for the Handicapped; An Annotated Bibliography - No. 6.

    Science.gov (United States)

    Perkins, Dorothy C., Comp.; And Others

    An annotated bibliography of workshops for the handicapped covers the literature on work programs for the period July, 1968 through June, 1969. One hundred and fifty four publications were reviewed; the number of articles on administration, management, and planning of facilities and programs has increased since the last edition. (Author/RJ)

  6. Communication in a Diverse Classroom: An Annotated Bibliographic Review

    Science.gov (United States)

    Brown, Rachelle

    2016-01-01

    Students have social and personal needs to fulfill and communicate these needs in different ways. This annotated bibliographic review examined communication studies to provide educators of diverse classrooms with ideas to build an environment that contributes to student well-being. Participants in the studies ranged in age, ability, and cultural…

  7. Annotation of Tutorial Dialogue Goals for Natural Language Generation

    Science.gov (United States)

    Kim, Jung Hee; Freedman, Reva; Glass, Michael; Evens, Martha W.

    2006-01-01

    We annotated transcripts of human tutoring dialogue for the purpose of constructing a dialogue-based intelligent tutoring system, CIRCSIM-Tutor. The tutors were professors of physiology who were also expert tutors. The students were 1st year medical students who communicated with the tutors using typed communication from separate rooms. The tutors…

  8. Genome Annotation in a Community College Cell Biology Lab

    Science.gov (United States)

    Beagley, C. Timothy

    2013-01-01

    The Biology Department at Salt Lake Community College has used the IMG-ACT toolbox to introduce a genome mapping and annotation exercise into the laboratory portion of its Cell Biology course. This project provides students with an authentic inquiry-based learning experience while introducing them to computational biology and contemporary learning…

  9. Annotated checklist of fungi in Cyprus Island. 1. Larger Basidiomycota

    Directory of Open Access Journals (Sweden)

    Miguel Torrejón

    2014-06-01

    Full Text Available An annotated checklist of wild fungi living in Cyprus Island has been compiled broughting together all the information collected from the different works dealing with fungi in this area throughout the three centuries of mycology in Cyprus. This part contains 363 taxa of macroscopic Basidiomycota.

  10. Rural Development Literature 1976-1977: An Updated Annotated Bibliography.

    Science.gov (United States)

    Buzzard, Shirley, Comp.

    More than 100 books and articles on rural development published during 1976-77 are annotated in this selective bibliography. Concentrating on social science literature, the bibliography is interdisciplinary in nature, spanning agricultural economics, anthropology, community development, community health, and rural sociology. Types of works…

  11. Exploring Metacognitive Strategies and Hypermedia Annotations on Foreign Language Reading

    Science.gov (United States)

    Shang, Hui-Fang

    2017-01-01

    The effective use of reading strategies has been recognized as an important way to increase reading comprehension in hypermedia environments. The purpose of the study was to explore whether metacognitive strategy use and access to hypermedia annotations facilitated reading comprehension based on English as a foreign language students' proficiency…

  12. Shakespeare Is Alive and Well in Cyberspace: An Annotated Bibliography.

    Science.gov (United States)

    Hett, Dorothy Marie

    2002-01-01

    Suggests that in addition to using books and movies to enhance students' understanding of Shakespeare, teachers can add the World Wide Web to their repertoire to help students connect to Shakespeare. Presents annotations of 12 websites to use for teaching Shakespeare. (SG)

  13. Annotated bibliography of highly ionized atoms of importance to plasmas

    International Nuclear Information System (INIS)

    Schmieder, R.W.

    1975-04-01

    A bibliography is presented of the literature on highly ionized atoms which have relevance to plasmas. The bibliography is annotated with keywords, and indexed by subjects and authors. It should be of greatest use to researchers working on the problems of impurity cooling and diagnostics of CTR plasmas. (U.S.)

  14. Adolescent Literacy Resources: An Annotated Bibliography. Second Edition 2009

    Science.gov (United States)

    Center on Instruction, 2009

    2009-01-01

    This annotated bibliography updated from a 2007 edition, is intended as a resource for technical assistance providers as they work with states on adolescent literacy. This revision includes current research and documents of practical use in guiding improvements in grades 4-12 reading instruction in the content areas and in interventions for…

  15. Ethical Issues in Health Services: A Report and Annotated Bibliography.

    Science.gov (United States)

    Carmody, James

    This publication identifies, discusses, and lists areas for further research for five ethical issues related to health services: 1) the right to health care; 2) death and euthanasia; 3) human experimentation; 4) genetic engineering; and, 5) abortion. Following a discussion of each issue is a selected annotated bibliography covering the years 1967…

  16. Vind(x): Using the user through cooperative annotation

    NARCIS (Netherlands)

    Williams, A.D.; Vuurpijl, Louis; Schomaker, Lambert; van den Broek, Egon

    2002-01-01

    In this paper, the image retrieval system Vind(x) is described. The architecture of the system and first user experiences are reported. Using Vind(x), users on the Internet may cooperatively annotate objects in paintings by use of the pen or mouse. The collected data can be searched through

  17. Legal and Political Aspects of Satellite Telecommunication: An Annotated Bibliography.

    Science.gov (United States)

    Shervis, Katherine, Comp.

    The potential of satellites for telecommunication is enormous; however, it is possible that political and legal barriers rather than technological considerations will ultimately shape the utilization of satellite systems. This annotated bibliography is designed for use by lawyers, political scientists, technicians, engineers, and scholars who need…

  18. Classic Religious Books for Children: An Annotated Bibliography.

    Science.gov (United States)

    Campbell, Carol, Comp.

    This annotated bibliography of religious books for children contains approximately 450 books, one-fifth of which are Judaic. The books' current availability has been verified using Web sites such as those of individual publishers, the Library of Congress, Amazon.com, or Barnes&Noble.com. New subject headings have been added, such as Kwanza,…

  19. On temporality in discourse annotation : Theoretical and practical considerations

    NARCIS (Netherlands)

    Evers-Vermeul, J.; Hoek, J.; Scholman, M.C.J.

    2017-01-01

    Temporal information is one of the prominent features that determine the coherence in a discourse. That is why we need an adequate way to deal with this type of information during discourse annotation. In this paper, we will argue that temporal order is a relational rather than a segment-specific

  20. From protein interactions to functional annotation: graph alignment in Herpes

    Czech Academy of Sciences Publication Activity Database

    Kolář, Michal; Lassig, M.; Berg, J.

    2008-01-01

    Roč. 2, č. 90 (2008), e-e ISSN 1752-0509 Institutional research plan: CEZ:AV0Z50520514 Keywords : graph alignment * functional annotation * protein orthology Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 3.706, year: 2008

  1. An Annotated Bibliography of the Gestalt Methods, Techniques, and Therapy

    Science.gov (United States)

    Prewitt-Diaz, Joseph O.

    The purpose of this annotated bibliography is to provide the reader with a guide to relevant research in the area of Gestalt therapy, techniques, and methods. The majority of the references are journal articles written within the last 5 years or documents easily obtained through interlibrary loans from local libraries. These references were…

  2. Learning visual contexts for image annotation from Flickr groups

    NARCIS (Netherlands)

    Ulges, A.; Worring, M.; Breuel, T.

    2011-01-01

    We present an extension of automatic image annotation that takes the context of a picture into account. Our core assumption is that users do not only provide individual images to be tagged, but group their pictures into batches (e.g., all snapshots taken over the same holiday trip), whereas the

  3. Biochemical Space: A Framework for Systemic Annotation of Biological Models

    Czech Academy of Sciences Publication Activity Database

    Klement, M.; Děd, T.; Šafránek, D.; Červený, Jan; Müller, Stefan; Steuer, Ralf

    2014-01-01

    Roč. 306, JUL (2014), s. 31-44 ISSN 1571-0661 R&D Projects: GA MŠk(CZ) EE2.3.20.0256 Institutional support: RVO:67179843 Keywords : biological models * model annotation * systems biology * cyanobacteria Subject RIV: EH - Ecology, Behaviour

  4. Automatic Compound Annotation from Mass Spectrometry Data Using MAGMa.

    NARCIS (Netherlands)

    Ridder, L.O.; Hooft, van der J.J.J.; Verhoeven, S.

    2014-01-01

    The MAGMa software for automatic annotation of mass spectrometry based fragmentation data was applied to 16 MS/MS datasets of the CASMI 2013 contest. Eight solutions were submitted in category 1 (molecular formula assignments) and twelve in category 2 (molecular structure assignment). The MS/MS

  5. Resources for Achieving Sex Equity: An Annotated Bibliography.

    Science.gov (United States)

    Miller, Susan W., Comp.

    This annotated bibliography provides a list of resources dealing with sex equity in vocational education. The bibliography first provides operational definitions of "sexism,""sex fair,""sex affirmative,""sex bias," and "affirmative action." It then lists resources under the following topics and/or bibliographic forms: (1) sex role definition, (2)…

  6. Pertinent Discussions Toward Modeling the Social Edition: Annotated Bibliographies

    NARCIS (Netherlands)

    Siemens, R.; Timney, M.; Leitch, C.; Koolen, C.; Garnett, A.

    2012-01-01

    The two annotated bibliographies present in this publication document and feature pertinent discussions toward the activity of modeling the social edition, first exploring reading devices, tools and social media issues and, second, social networking tools for professional readers in the Humanities.

  7. Wanda ML - a markup language for digital annotation

    NARCIS (Netherlands)

    Franke, K.Y.; Guyon, I.; Schomaker, L.R.B.; Vuurpijl, L.G.

    2004-01-01

    WANDAML is an XML-based markup language for the annotation and filter journaling of digital documents. It addresses in particular the needs of forensic handwriting data examination, by allowing experts to enter information about writer, material (pen, paper), script and content, and to record chains

  8. Intra-species sequence comparisons for annotating genomes

    Energy Technology Data Exchange (ETDEWEB)

    Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

    2004-07-15

    Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom and raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study has been submitted to GenBank under accession nos. AY667278-AY667407.

  9. Annotated Bibliography of Materials for Elementary Foreign Language Programs.

    Science.gov (United States)

    Dobb, Fred

    An annotated bibliography contains about 70 citations of instructional materials and materials concerning curriculum development for elementary school foreign language programs. Citations are included for Arabic, classical languages, French, German, Hebrew, Italian, Japanese, and Spanish. Items on exploratory language courses and general works on…

  10. Sequence-based feature prediction and annotation of proteins

    DEFF Research Database (Denmark)

    Juncker, Agnieszka; Jensen, Lars J.; Pierleoni, Andrea

    2009-01-01

    A recent trend in computational methods for annotation of protein function is that many prediction tools are combined in complex workflows and pipelines to facilitate the analysis of feature combinations, for example, the entire repertoire of kinase-binding motifs in the human proteome....

  11. Annotated Bibliography of Law-Related Pollution Prevention Sources.

    Science.gov (United States)

    Lynch, Holly; Murphy, Elaine

    This annotated bibliography of law-related pollution prevention sources was prepared by the National Pollution Prevention Center for Higher Education. Some topics of the items include waste reduction, hazardous wastes, risk reduction, environmental policy, pollution prevention, environmental protection, environmental leadership, environmental…

  12. An Oral History Annotation Tool for INTER-VIEWs

    NARCIS (Netherlands)

    Heuvel, H. van den; Sanders, E.P.; Rutten, R.; Scagliola, S.; Witkamp, P.

    2012-01-01

    We present a web-based tool for retrieving and annotating audio fragments of e.g. interviews. Our collection contains 250 interviews with veterans of Dutch conflicts and military missions. The audio files of the interviews were disclosed using ASR technology focussed at keyword retrieval. Resulting

  13. A Study of Multimedia Annotation of Web-Based Materials

    Science.gov (United States)

    Hwang, Wu-Yuin; Wang, Chin-Yu; Sharples, Mike

    2007-01-01

    Web-based learning has become an important way to enhance learning and teaching, offering many learning opportunities. A limitation of current Web-based learning is the restricted ability of students to personalize and annotate the learning materials. Providing personalized tools and analyzing some types of learning behavior, such as students'…

  14. Persuasion: Attitude/Behavior Change. A Selected, Annotated Bibliography.

    Science.gov (United States)

    Benoit, William L.

    Designed for teachers, students and researchers of the psychological dimensions of attitude and behavior change, this annotated bibliography lists books, bibliographies and articles on the subject ranging from general introductions and surveys through specific research studies, and from theoretical position essays to literature reviews. The 42…

  15. An Annotated Bibliography of Isotonic Weight-Training Methods.

    Science.gov (United States)

    Wysong, John V.

    This literature study was conducted to compare and evaluate various types and techniques of weight lifting so that a weight lifting program could be selected or devised for a secondary school. Annotations of 32 research reports, journal articles, and monographs on isotonic strength training are presented. The literature in the first part of the…

  16. An Annotated Dataset of 14 Cardiac MR Images

    DEFF Research Database (Denmark)

    Stegmann, Mikkel Bille

    2002-01-01

    This note describes a dataset consisting of 14 annotated cardiac MR images. Points of correspondence are placed on each image at the left ventricle (LV). As such, the dataset can be readily used for building statistical models of shape. Further, format specifications and terms of use are given....

  17. An automated annotation tool for genomic DNA sequences using

    Indian Academy of Sciences (India)

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated ...

  18. MUTAGEN: Multi-user tool for annotating GENomes

    DEFF Research Database (Denmark)

    Brugger, K.; Redder, P.; Skovgaard, Marie

    2003-01-01

    MUTAGEN is a free prokaryotic annotation system. It offers the advantages of genome comparison, graphical sequence browsers, search facilities and open-source for user-specific adjustments. The web-interface allows several users to access the system from standard desktop computers. The Sulfolobus...

  19. The WANDAML Markup Language for Digital Document Annotation

    NARCIS (Netherlands)

    Franke, K.; Guyon, I.; Schomaker, L.; Vuurpijl, L.

    2004-01-01

    WANDAML is an XML-based markup language for the annotation and filter journaling of digital documents. It addresses in particular the needs of forensic handwriting data examination, by allowing experts to enter information about writer, material (pen, paper), script and content, and to record chains

  20. Laughter annotations in conversational speech corpora - possibilities and limitations for phonetic analysis

    NARCIS (Netherlands)

    Truong, Khiet Phuong; Trouvain, Jürgen

    Existing laughter annotations provided with several publicly available conversational speech corpora (both multiparty and dyadic conversations) were investigated and compared. We discuss the possibilities and limitations of these rather coarse and shallow laughter annotations. There are definition

  1. Annotation-based enrichment of Digital Objects using open-source frameworks

    Directory of Open Access Journals (Sweden)

    Marcus Emmanuel Barnes

    2017-07-01

    Full Text Available The W3C Web Annotation Data Model, Protocol, and Vocabulary unify approaches to annotations across the web, enabling their aggregation, discovery and persistence over time. In addition, new javascript libraries provide the ability for users to annotate multi-format content. In this paper, we describe how we have leveraged these developments to provide annotation features alongside Islandora’s existing preservation, access, and management capabilities. We also discuss our experience developing with the Web Annotation Model as an open web architecture standard, as well as our approach to integrating mature external annotation libraries. The resulting software (the Web Annotation Utility Module for Islandora accommodates annotation across multiple formats. This solution can be used in various digital scholarship contexts.

  2. Vegetation resurvey is robust to plot location uncertainty

    Science.gov (United States)

    Kopecký, Martin; Macek, Martin

    2017-01-01

    Aim Resurveys of historical vegetation plots are increasingly used for the assessment of decadal changes in plant species diversity and composition. However, historical plots are usually relocated only approximately. This potentially inflates temporal changes and undermines results. Location Temperate deciduous forests in Central Europe. Methods To explore if robust conclusions can be drawn from resurvey studies despite location uncertainty, we compared temporal changes in species richness, frequency, composition and compositional heterogeneity between exactly and approximately relocated plots. We hypothesized that compositional changes should be lower and changes in species richness should be less variable on exactly relocated plots, because pseudo-turnover inflates temporal changes on approximately relocated plots. Results Temporal changes in species richness were not more variable and temporal changes in species composition and compositional heterogeneity were not higher on approximately relocated plots. Moreover, the frequency of individual species changed similarly on both plot types. Main conclusions The resurvey of historical vegetation plots is robust to uncertainty in original plot location and, when done properly, provides reliable evidence of decadal changes in plant communities. This provides important background for other resurvey studies and opens up the possibility for large-scale assessments of plant community change. PMID:28503083

  3. Essays on robust asset pricing

    NARCIS (Netherlands)

    Horváth, Ferenc

    2017-01-01

    The central concept of this doctoral dissertation is robustness. I analyze how model and parameter uncertainty affect financial decisions of investors and fund managers, and what their equilibrium consequences are. Chapter 1 gives an overview of the most important concepts and methodologies used in

  4. Robust visual hashing via ICA

    International Nuclear Information System (INIS)

    Fournel, Thierry; Coltuc, Daniela

    2010-01-01

    Designed to maximize information transmission in the presence of noise, independent component analysis (ICA) could appear in certain circumstances as a statistics-based tool for robust visual hashing. Several ICA-based scenarios can attempt to reach this goal. A first one is here considered.

  5. Robustness of raw quantum tomography

    Science.gov (United States)

    Asorey, M.; Facchi, P.; Florio, G.; Man'ko, V. I.; Marmo, G.; Pascazio, S.; Sudarshan, E. C. G.

    2011-01-01

    We scrutinize the effects of non-ideal data acquisition on the tomograms of quantum states. The presence of a weight function, schematizing the effects of a finite window or equivalently noise, only affects the state reconstruction procedure by a normalization constant. The results are extended to a discrete mesh and show that quantum tomography is robust under incomplete and approximate knowledge of tomograms.

  6. Robustness of raw quantum tomography

    Energy Technology Data Exchange (ETDEWEB)

    Asorey, M. [Departamento de Fisica Teorica, Facultad de Ciencias, Universidad de Zaragoza, 50009 Zaragoza (Spain); Facchi, P. [Dipartimento di Matematica, Universita di Bari, I-70125 Bari (Italy); INFN, Sezione di Bari, I-70126 Bari (Italy); MECENAS, Universita Federico II di Napoli and Universita di Bari (Italy); Florio, G. [Dipartimento di Fisica, Universita di Bari, I-70126 Bari (Italy); INFN, Sezione di Bari, I-70126 Bari (Italy); MECENAS, Universita Federico II di Napoli and Universita di Bari (Italy); Man' ko, V.I., E-mail: manko@lebedev.r [P.N. Lebedev Physical Institute, Leninskii Prospect 53, Moscow 119991 (Russian Federation); Marmo, G. [Dipartimento di Scienze Fisiche, Universita di Napoli ' Federico II' , I-80126 Napoli (Italy); INFN, Sezione di Napoli, I-80126 Napoli (Italy); MECENAS, Universita Federico II di Napoli and Universita di Bari (Italy); Pascazio, S. [Dipartimento di Fisica, Universita di Bari, I-70126 Bari (Italy); INFN, Sezione di Bari, I-70126 Bari (Italy); MECENAS, Universita Federico II di Napoli and Universita di Bari (Italy); Sudarshan, E.C.G. [Department of Physics, University of Texas, Austin, TX 78712 (United States)

    2011-01-31

    We scrutinize the effects of non-ideal data acquisition on the tomograms of quantum states. The presence of a weight function, schematizing the effects of a finite window or equivalently noise, only affects the state reconstruction procedure by a normalization constant. The results are extended to a discrete mesh and show that quantum tomography is robust under incomplete and approximate knowledge of tomograms.

  7. Aspects of robust linear regression

    NARCIS (Netherlands)

    Davies, P.L.

    1993-01-01

    Section 1 of the paper contains a general discussion of robustness. In Section 2 the influence function of the Hampel-Rousseeuw least median of squares estimator is derived. Linearly invariant weak metrics are constructed in Section 3. It is shown in Section 4 that $S$-estimators satisfy an exact

  8. Manipulation Robustness of Collaborative Filtering

    OpenAIRE

    Benjamin Van Roy; Xiang Yan

    2010-01-01

    A collaborative filtering system recommends to users products that similar users like. Collaborative filtering systems influence purchase decisions and hence have become targets of manipulation by unscrupulous vendors. We demonstrate that nearest neighbors algorithms, which are widely used in commercial systems, are highly susceptible to manipulation and introduce new collaborative filtering algorithms that are relatively robust.

  9. Robustness Regions for Dichotomous Decisions.

    Science.gov (United States)

    Vijn, Pieter; Molenaar, Ivo W.

    1981-01-01

    In the case of dichotomous decisions, the total set of all assumptions/specifications for which the decision would have been the same is the robustness region. Inspection of this (data-dependent) region is a form of sensitivity analysis which may lead to improved decision making. (Author/BW)

  10. Theoretical Framework for Robustness Evaluation

    DEFF Research Database (Denmark)

    Sørensen, John Dalsgaard

    2011-01-01

    This paper presents a theoretical framework for evaluation of robustness of structural systems, incl. bridges and buildings. Typically modern structural design codes require that ‘the consequence of damages to structures should not be disproportional to the causes of the damages’. However, althou...

  11. Robust control design with MATLAB

    CERN Document Server

    Gu, Da-Wei; Konstantinov, Mihail M

    2013-01-01

    Robust Control Design with MATLAB® (second edition) helps the student to learn how to use well-developed advanced robust control design methods in practical cases. To this end, several realistic control design examples from teaching-laboratory experiments, such as a two-wheeled, self-balancing robot, to complex systems like a flexible-link manipulator are given detailed presentation. All of these exercises are conducted using MATLAB® Robust Control Toolbox 3, Control System Toolbox and Simulink®. By sharing their experiences in industrial cases with minimum recourse to complicated theories and formulae, the authors convey essential ideas and useful insights into robust industrial control systems design using major H-infinity optimization and related methods allowing readers quickly to move on with their own challenges. The hands-on tutorial style of this text rests on an abundance of examples and features for the second edition: ·        rewritten and simplified presentation of theoretical and meth...

  12. Robust Portfolio Optimization Using Pseudodistances

    Science.gov (United States)

    2015-01-01

    The presence of outliers in financial asset returns is a frequently occurring phenomenon which may lead to unreliable mean-variance optimized portfolios. This fact is due to the unbounded influence that outliers can have on the mean returns and covariance estimators that are inputs in the optimization procedure. In this paper we present robust estimators of mean and covariance matrix obtained by minimizing an empirical version of a pseudodistance between the assumed model and the true model underlying the data. We prove and discuss theoretical properties of these estimators, such as affine equivariance, B-robustness, asymptotic normality and asymptotic relative efficiency. These estimators can be easily used in place of the classical estimators, thereby providing robust optimized portfolios. A Monte Carlo simulation study and applications to real data show the advantages of the proposed approach. We study both in-sample and out-of-sample performance of the proposed robust portfolios comparing them with some other portfolios known in literature. PMID:26468948

  13. Facial Symmetry in Robust Anthropometrics

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2012-01-01

    Roč. 57, č. 3 (2012), s. 691-698 ISSN 0022-1198 R&D Projects: GA MŠk(CZ) 1M06014 Institutional research plan: CEZ:AV0Z10300504 Keywords : forensic science * anthropology * robust image analysis * correlation analysis * multivariate data * classification Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 1.244, year: 2012

  14. Sparse and Robust Factor Modelling

    NARCIS (Netherlands)

    C. Croux (Christophe); P. Exterkate (Peter)

    2011-01-01

    textabstractFactor construction methods are widely used to summarize a large panel of variables by means of a relatively small number of representative factors. We propose a novel factor construction procedure that enjoys the properties of robustness to outliers and of sparsity; that is, having

  15. Robust distributed cognitive relay beamforming

    KAUST Repository

    Pandarakkottilil, Ubaidulla; Aissa, Sonia

    2012-01-01

    design takes into account a parameter of the error in the channel state information (CSI) to render the performance of the beamformer robust in the presence of imperfect CSI. Though the original problem is non-convex, we show that the proposed design can

  16. Approximability of Robust Network Design

    NARCIS (Netherlands)

    Olver, N.K.; Shepherd, F.B.

    2014-01-01

    We consider robust (undirected) network design (RND) problems where the set of feasible demands may be given by an arbitrary convex body. This model, introduced by Ben-Ameur and Kerivin [Ben-Ameur W, Kerivin H (2003) New economical virtual private networks. Comm. ACM 46(6):69-73], generalizes the

  17. Supplementary Material for: BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal M.; Alam, Intikhab; Bajic, Vladimir B.

    2015-01-01

    Abstract Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACONâ s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27 %, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .

  18. Combining gene prediction methods to improve metagenomic gene annotation

    Directory of Open Access Journals (Sweden)

    Rosen Gail L

    2011-01-01

    Full Text Available Abstract Background Traditional gene annotation methods rely on characteristics that may not be available in short reads generated from next generation technology, resulting in suboptimal performance for metagenomic (environmental samples. Therefore, in recent years, new programs have been developed that optimize performance on short reads. In this work, we benchmark three metagenomic gene prediction programs and combine their predictions to improve metagenomic read gene annotation. Results We not only analyze the programs' performance at different read-lengths like similar studies, but also separate different types of reads, including intra- and intergenic regions, for analysis. The main deficiencies are in the algorithms' ability to predict non-coding regions and gene edges, resulting in more false-positives and false-negatives than desired. In fact, the specificities of the algorithms are notably worse than the sensitivities. By combining the programs' predictions, we show significant improvement in specificity at minimal cost to sensitivity, resulting in 4% improvement in accuracy for 100 bp reads with ~1% improvement in accuracy for 200 bp reads and above. To correctly annotate the start and stop of the genes, we find that a consensus of all the predictors performs best for shorter read lengths while a unanimous agreement is better for longer read lengths, boosting annotation accuracy by 1-8%. We also demonstrate use of the classifier combinations on a real dataset. Conclusions To optimize the performance for both prediction and annotation accuracies, we conclude that the consensus of all methods (or a majority vote is the best for reads 400 bp and shorter, while using the intersection of GeneMark and Orphelia predictions is the best for reads 500 bp and longer. We demonstrate that most methods predict over 80% coding (including partially coding reads on a real human gut sample sequenced by Illumina technology.

  19. SAS- Semantic Annotation Service for Geoscience resources on the web

    Science.gov (United States)

    Elag, M.; Kumar, P.; Marini, L.; Li, R.; Jiang, P.

    2015-12-01

    There is a growing need for increased integration across the data and model resources that are disseminated on the web to advance their reuse across different earth science applications. Meaningful reuse of resources requires semantic metadata to realize the semantic web vision for allowing pragmatic linkage and integration among resources. Semantic metadata associates standard metadata with resources to turn them into semantically-enabled resources on the web. However, the lack of a common standardized metadata framework as well as the uncoordinated use of metadata fields across different geo-information systems, has led to a situation in which standards and related Standard Names abound. To address this need, we have designed SAS to provide a bridge between the core ontologies required to annotate resources and information systems in order to enable queries and analysis over annotation from a single environment (web). SAS is one of the services that are provided by the Geosematnic framework, which is a decentralized semantic framework to support the integration between models and data and allow semantically heterogeneous to interact with minimum human intervention. Here we present the design of SAS and demonstrate its application for annotating data and models. First we describe how predicates and their attributes are extracted from standards and ingested in the knowledge-base of the Geosemantic framework. Then we illustrate the application of SAS in annotating data managed by SEAD and annotating simulation models that have web interface. SAS is a step in a broader approach to raise the quality of geoscience data and models that are published on the web and allow users to better search, access, and use of the existing resources based on standard vocabularies that are encoded and published using semantic technologies.

  20. NegGOA: negative GO annotations selection using ontology structure.

    Science.gov (United States)

    Fu, Guangyuan; Wang, Jun; Yang, Bo; Yu, Guoxian

    2016-10-01

    Predicting the biological functions of proteins is one of the key challenges in the post-genomic era. Computational models have demonstrated the utility of applying machine learning methods to predict protein function. Most prediction methods explicitly require a set of negative examples-proteins that are known not carrying out a particular function. However, Gene Ontology (GO) almost always only provides the knowledge that proteins carry out a particular function, and functional annotations of proteins are incomplete. GO structurally organizes more than tens of thousands GO terms and a protein is annotated with several (or dozens) of these terms. For these reasons, the negative examples of a protein can greatly help distinguishing true positive examples of the protein from such a large candidate GO space. In this paper, we present a novel approach (called NegGOA) to select negative examples. Specifically, NegGOA takes advantage of the ontology structure, available annotations and potentiality of additional annotations of a protein to choose negative examples of the protein. We compare NegGOA with other negative examples selection algorithms and find that NegGOA produces much fewer false negatives than them. We incorporate the selected negative examples into an efficient function prediction model to predict the functions of proteins in Yeast, Human, Mouse and Fly. NegGOA also demonstrates improved accuracy than these comparing algorithms across various evaluation metrics. In addition, NegGOA is less suffered from incomplete annotations of proteins than these comparing methods. The Matlab and R codes are available at https://sites.google.com/site/guoxian85/neggoa gxyu@swu.edu.cn Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  1. Systematic metabolite annotation and identification in complex biological extracts : combining robust mass spectrometry fragmentation and nuclear magnetic resonance spectroscopy

    NARCIS (Netherlands)

    Hooft, van der J.J.J.

    2012-01-01

    Detailed knowledge of the chemical content of organisms, organs, tissues, and cells is needed to fully characterize complex biological systems. The high chemical variety of compounds present in biological systems is illustrated by the presence of a large variety of compounds, ranging from apolar

  2. OLS Client and OLS Dialog: Open Source Tools to Annotate Public Omics Datasets.

    Science.gov (United States)

    Perez-Riverol, Yasset; Ternent, Tobias; Koch, Maximilian; Barsnes, Harald; Vrousgou, Olga; Jupp, Simon; Vizcaíno, Juan Antonio

    2017-10-01

    The availability of user-friendly software to annotate biological datasets and experimental details is becoming essential in data management practices, both in local storage systems and in public databases. The Ontology Lookup Service (OLS, http://www.ebi.ac.uk/ols) is a popular centralized service to query, browse and navigate biomedical ontologies and controlled vocabularies. Recently, the OLS framework has been completely redeveloped (version 3.0), including enhancements in the data model, like the added support for Web Ontology Language based ontologies, among many other improvements. However, the new OLS is not backwards compatible and new software tools are needed to enable access to this widely used framework now that the previous version is no longer available. We here present the OLS Client as a free, open-source Java library to retrieve information from the new version of the OLS. It enables rapid tool creation by providing a robust, pluggable programming interface and common data model to programmatically access the OLS. The library has already been integrated and is routinely used by several bioinformatics resources and related data annotation tools. Secondly, we also introduce an updated version of the OLS Dialog (version 2.0), a Java graphical user interface that can be easily plugged into Java desktop applications to access the OLS. The software and related documentation are freely available at https://github.com/PRIDE-Utilities/ols-client and https://github.com/PRIDE-Toolsuite/ols-dialog. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Music journals in South Africa 1854-2010: an annotated bibliography

    African Journals Online (AJOL)

    Music journals in South Africa 1854-2010: an annotated bibliography. ... The article focuses on presenting an annotated bibliography of music journalism in South Africa from as early as 1854 until 2010. Most of ... Key words: annotated bibliography, electronic journals, music journals, periodicals, South African music history ...

  4. The Effects of Multimedia Annotations on Iranian EFL Learners’ L2 Vocabulary Learning

    Directory of Open Access Journals (Sweden)

    Saeideh Ahangari

    2010-05-01

    Full Text Available In our modern technological world, Computer-Assisted Language learning (CALL is a new realm towards learning a language in general, and learning L2 vocabulary in particular. It is assumed that the use of multimedia annotations promotes language learners’ vocabulary acquisition. Therefore, this study set out to investigate the effects of different multimedia annotations (still picture annotations, dynamic picture annotations, and written annotations on L2 vocabulary learning. To fulfill this objective, the researchers selected sixty four EFL learners as the participants of this study. The participants were randomly assigned to one of the four groups: a control group that received no annotations and three experimental groups that received:  still picture annotations, dynamic picture annotations, and written annotations. Each participant was required to take a pre-test. A vocabulary post- test was also designed and administered to the participants in order to assess the efficacy of each annotation. First for each group a paired t-test was conducted between their pre and post test scores in order to observe their improvement; then through an ANCOVA test the performance of four groups was compared. The results showed that using multimedia annotations resulted in a significant difference in the participants’ vocabulary learning. Based on the results of the present study, multimedia annotations are suggested as a vocabulary teaching strategy.

  5. Effects of Reviewing Annotations and Homework Solutions on Math Learning Achievement

    Science.gov (United States)

    Hwang, Wu-Yuin; Chen, Nian-Shing; Shadiev, Rustam; Li, Jin-Sing

    2011-01-01

    Previous studies have demonstrated that making annotations can be a meaningful and useful learning method that promote metacognition and enhance learning achievement. A web-based annotation system, Virtual Pen (VPEN), which provides for the creation and review of annotations and homework solutions, has been developed to foster learning process…

  6. Effects of Annotations and Homework on Learning Achievement: An Empirical Study of Scratch Programming Pedagogy

    Science.gov (United States)

    Su, Addison Y. S.; Huang, Chester S. J.; Yang, Stephen J. H.; Ding, T. J.; Hsieh, Y. Z.

    2015-01-01

    In Taiwan elementary schools, Scratch programming has been taught for more than four years. Previous studies have shown that personal annotations is a useful learning method that improve learning performance. An annotation-based Scratch programming (ASP) system provides for the creation, share, and review of annotations and homework solutions in…

  7. Essential Annotation Schema for Ecology (EASE)-A framework supporting the efficient data annotation and faceted navigation in ecology.

    Science.gov (United States)

    Pfaff, Claas-Thido; Eichenberg, David; Liebergesell, Mario; König-Ries, Birgitta; Wirth, Christian

    2017-01-01

    Ecology has become a data intensive science over the last decades which often relies on the reuse of data in cross-experimental analyses. However, finding data which qualifies for the reuse in a specific context can be challenging. It requires good quality metadata and annotations as well as efficient search strategies. To date, full text search (often on the metadata only) is the most widely used search strategy although it is known to be inaccurate. Faceted navigation is providing a filter mechanism which is based on fine granular metadata, categorizing search objects along numeric and categorical parameters relevant for their discovery. Selecting from these parameters during a full text search creates a system of filters which allows to refine and improve the results towards more relevance. We developed a framework for the efficient annotation and faceted navigation in ecology. It consists of an XML schema for storing the annotation of search objects and is accompanied by a vocabulary focused on ecology to support the annotation process. The framework consolidates ideas which originate from widely accepted metadata standards, textbooks, scientific literature, and vocabularies as well as from expert knowledge contributed by researchers from ecology and adjacent disciplines.

  8. Essential Annotation Schema for Ecology (EASE)—A framework supporting the efficient data annotation and faceted navigation in ecology

    Science.gov (United States)

    Eichenberg, David; Liebergesell, Mario; König-Ries, Birgitta; Wirth, Christian

    2017-01-01

    Ecology has become a data intensive science over the last decades which often relies on the reuse of data in cross-experimental analyses. However, finding data which qualifies for the reuse in a specific context can be challenging. It requires good quality metadata and annotations as well as efficient search strategies. To date, full text search (often on the metadata only) is the most widely used search strategy although it is known to be inaccurate. Faceted navigation is providing a filter mechanism which is based on fine granular metadata, categorizing search objects along numeric and categorical parameters relevant for their discovery. Selecting from these parameters during a full text search creates a system of filters which allows to refine and improve the results towards more relevance. We developed a framework for the efficient annotation and faceted navigation in ecology. It consists of an XML schema for storing the annotation of search objects and is accompanied by a vocabulary focused on ecology to support the annotation process. The framework consolidates ideas which originate from widely accepted metadata standards, textbooks, scientific literature, and vocabularies as well as from expert knowledge contributed by researchers from ecology and adjacent disciplines. PMID:29023519

  9. Essential Annotation Schema for Ecology (EASE-A framework supporting the efficient data annotation and faceted navigation in ecology.

    Directory of Open Access Journals (Sweden)

    Claas-Thido Pfaff

    Full Text Available Ecology has become a data intensive science over the last decades which often relies on the reuse of data in cross-experimental analyses. However, finding data which qualifies for the reuse in a specific context can be challenging. It requires good quality metadata and annotations as well as efficient search strategies. To date, full text search (often on the metadata only is the most widely used search strategy although it is known to be inaccurate. Faceted navigation is providing a filter mechanism which is based on fine granular metadata, categorizing search objects along numeric and categorical parameters relevant for their discovery. Selecting from these parameters during a full text search creates a system of filters which allows to refine and improve the results towards more relevance. We developed a framework for the efficient annotation and faceted navigation in ecology. It consists of an XML schema for storing the annotation of search objects and is accompanied by a vocabulary focused on ecology to support the annotation process. The framework consolidates ideas which originate from widely accepted metadata standards, textbooks, scientific literature, and vocabularies as well as from expert knowledge contributed by researchers from ecology and adjacent disciplines.

  10. Robust Reliability or reliable robustness? - Integrated consideration of robustness and reliability aspects

    DEFF Research Database (Denmark)

    Kemmler, S.; Eifler, Tobias; Bertsche, B.

    2015-01-01

    products are and vice versa. For a comprehensive understanding and to use existing synergies between both domains, this paper discusses the basic principles of Reliability- and Robust Design theory. The development of a comprehensive model will enable an integrated consideration of both domains...

  11. NoGOA: predicting noisy GO annotations using evidences and sparse representation.

    Science.gov (United States)

    Yu, Guoxian; Lu, Chang; Wang, Jun

    2017-07-21

    Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Due to resources limitation, only a small portion of annotations are manually checked by curators, and the others are electronically inferred. Although quality control techniques have been applied to ensure the quality of annotations, the community consistently report that there are still considerable noisy (or incorrect) annotations. Given the wide application of annotations, however, how to identify noisy annotations is an important but yet seldom studied open problem. We introduce a novel approach called NoGOA to predict noisy annotations. NoGOA applies sparse representation on the gene-term association matrix to reduce the impact of noisy annotations, and takes advantage of sparse representation coefficients to measure the semantic similarity between genes. Secondly, it preliminarily predicts noisy annotations of a gene based on aggregated votes from semantic neighborhood genes of that gene. Next, NoGOA estimates the ratio of noisy annotations for each evidence code based on direct annotations in GOA files archived on different periods, and then weights entries of the association matrix via estimated ratios and propagates weights to ancestors of direct annotations using GO hierarchy. Finally, it integrates evidence-weighted association matrix and aggregated votes to predict noisy annotations. Experiments on archived GOA files of six model species (H. sapiens, A. thaliana, S. cerevisiae, G. gallus, B. Taurus and M. musculus) demonstrate that NoGOA achieves significantly better results than other related methods and removing noisy annotations improves the performance of gene function prediction. The comparative study justifies the effectiveness of integrating evidence codes with sparse representation for predicting noisy GO annotations. Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NoGOA .

  12. WImpiBLAST: web interface for mpiBLAST to help biologists perform large-scale annotation using high performance computing.

    Directory of Open Access Journals (Sweden)

    Parichit Sharma

    Full Text Available The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. Here, we describe the WImpiBLAST web interface features and architecture

  13. WImpiBLAST: web interface for mpiBLAST to help biologists perform large-scale annotation using high performance computing.

    Science.gov (United States)

    Sharma, Parichit; Mantri, Shrikant S

    2014-01-01

    The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC) clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI) are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. Here, we describe the WImpiBLAST web interface features and architecture, explain design

  14. Robust Watermarking of Video Streams

    Directory of Open Access Journals (Sweden)

    T. Polyák

    2006-01-01

    Full Text Available In the past few years there has been an explosion in the use of digital video data. Many people have personal computers at home, and with the help of the Internet users can easily share video files on their computer. This makes possible the unauthorized use of digital media, and without adequate protection systems the authors and distributors have no means to prevent it.Digital watermarking techniques can help these systems to be more effective by embedding secret data right into the video stream. This makes minor changes in the frames of the video, but these changes are almost imperceptible to the human visual system. The embedded information can involve copyright data, access control etc. A robust watermark is resistant to various distortions of the video, so it cannot be removed without affecting the quality of the host medium. In this paper I propose a video watermarking scheme that fulfills the requirements of a robust watermark. 

  15. Robust Decentralized Formation Flight Control

    Directory of Open Access Journals (Sweden)

    Zhao Weihua

    2011-01-01

    Full Text Available Motivated by the idea of multiplexed model predictive control (MMPC, this paper introduces a new framework for unmanned aerial vehicles (UAVs formation flight and coordination. Formulated using MMPC approach, the whole centralized formation flight system is considered as a linear periodic system with control inputs of each UAV subsystem as its periodic inputs. Divided into decentralized subsystems, the whole formation flight system is guaranteed stable if proper terminal cost and terminal constraints are added to each decentralized MPC formulation of the UAV subsystem. The decentralized robust MPC formulation for each UAV subsystem with bounded input disturbances and model uncertainties is also presented. Furthermore, an obstacle avoidance control scheme for any shape and size of obstacles, including the nonapriorily known ones, is integrated under the unified MPC framework. The results from simulations demonstrate that the proposed framework can successfully achieve robust collision-free formation flights.

  16. Inefficient but robust public leadership.

    OpenAIRE

    Matsumura, Toshihiro; Ogawa, Akira

    2014-01-01

    We investigate endogenous timing in a mixed duopoly in a differentiated product market. We find that private leadership is better than public leadership from a social welfare perspective if the private firm is domestic, regardless of the degree of product differentiation. Nevertheless, the public leadership equilibrium is risk-dominant, and it is thus robust if the degree of product differentiation is high. We also find that regardless of the degree of product differentiation, the public lead...

  17. Testing Heteroscedasticity in Robust Regression

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2011-01-01

    Roč. 1, č. 4 (2011), s. 25-28 ISSN 2045-3345 Grant - others:GA ČR(CZ) GA402/09/0557 Institutional research plan: CEZ:AV0Z10300504 Keywords : robust regression * heteroscedasticity * regression quantiles * diagnostics Subject RIV: BB - Applied Statistics , Operational Research http://www.researchjournals.co.uk/documents/Vol4/06%20Kalina.pdf

  18. Robust power system frequency control

    CERN Document Server

    Bevrani, Hassan

    2014-01-01

    This updated edition of the industry standard reference on power system frequency control provides practical, systematic and flexible algorithms for regulating load frequency, offering new solutions to the technical challenges introduced by the escalating role of distributed generation and renewable energy sources in smart electric grids. The author emphasizes the physical constraints and practical engineering issues related to frequency in a deregulated environment, while fostering a conceptual understanding of frequency regulation and robust control techniques. The resulting control strategi

  19. Evaluation of web-based annotation of ophthalmic images for multicentric clinical trials.

    Science.gov (United States)

    Chalam, K V; Jain, P; Shah, V A; Shah, Gaurav Y

    2006-06-01

    An Internet browser-based annotation system can be used to identify and describe features in digitalized retinal images, in multicentric clinical trials, in real time. In this web-based annotation system, the user employs a mouse to draw and create annotations on a transparent layer, that encapsulates the observations and interpretations of a specific image. Multiple annotation layers may be overlaid on a single image. These layers may correspond to annotations by different users on the same image or annotations of a temporal sequence of images of a disease process, over a period of time. In addition, geometrical properties of annotated figures may be computed and measured. The annotations are stored in a central repository database on a server, which can be retrieved by multiple users in real time. This system facilitates objective evaluation of digital images and comparison of double-blind readings of digital photographs, with an identifiable audit trail. Annotation of ophthalmic images allowed clinically feasible and useful interpretation to track properties of an area of fundus pathology. This provided an objective method to monitor properties of pathologies over time, an essential component of multicentric clinical trials. The annotation system also allowed users to view stereoscopic images that are stereo pairs. This web-based annotation system is useful and valuable in monitoring patient care, in multicentric clinical trials, telemedicine, teaching and routine clinical settings.

  20. Toward a Robust Method of Presenting a Rich, Interconnected Deceptive Network Topology

    Science.gov (United States)

    2015-03-01

    SDN Software - Defined Networking SNOS Systemic Network Obfuscation System SSH Secure Shell TCP Transmission Control Protocol TTL time to...same DeTracer functionality directly on routers. One particularly interesting research area would be the use of Software - Defined Networking ( SDN ) to...Hughes, “Employing deceptive dynamic network topology through software - defined networking ,” M.S. thesis, Dept. Comput. Sci., Naval

  1. Semantic Annotation of Unstructured Documents Using Concepts Similarity

    Directory of Open Access Journals (Sweden)

    Fernando Pech

    2017-01-01

    Full Text Available There is a large amount of information in the form of unstructured documents which pose challenges in the information storage, search, and retrieval. This situation has given rise to several information search approaches. Some proposals take into account the contextual meaning of the terms specified in the query. Semantic annotation technique can help to retrieve and extract information in unstructured documents. We propose a semantic annotation strategy for unstructured documents as part of a semantic search engine. In this proposal, ontologies are used to determine the context of the entities specified in the query. Our strategy for extracting the context is focused on concepts similarity. Each relevant term of the document is associated with an instance in the ontology. The similarity between each of the explicit relationships is measured through the combination of two types of associations: the association between each pair of concepts and the calculation of the weight of the relationships.

  2. Feedback Driven Annotation and Refactoring of Parallel Programs

    DEFF Research Database (Denmark)

    Larsen, Per

    and communication in embedded programs. Runtime checks are developed to ensure that annotations correctly describe observable program behavior. The performance impact of runtime checking is evaluated on several benchmark kernels and is negligible in all cases. The second aspect is compilation feedback. Annotations...... are not effective unless programmers are told how and when they are benecial. A prototype compilation feedback system was developed in collaboration with IBM Haifa Research Labs. It reports issues that prevent further analysis to the programmer. Performance evaluation shows that three programs performes signicantly......This thesis combines programmer knowledge and feedback to improve modeling and optimization of software. The research is motivated by two observations. First, there is a great need for automatic analysis of software for embedded systems - to expose and model parallelism inherent in programs. Second...

  3. Image annotation by deep neural networks with attention shaping

    Science.gov (United States)

    Zheng, Kexin; Lv, Shaohe; Ma, Fang; Chen, Fei; Jin, Chi; Dou, Yong

    2017-07-01

    Image annotation is a task of assigning semantic labels to an image. Recently, deep neural networks with visual attention have been utilized successfully in many computer vision tasks. In this paper, we show that conventional attention mechanism is easily misled by the salient class, i.e., the attended region always contains part of the image area describing the content of salient class at different attention iterations. To this end, we propose a novel attention shaping mechanism, which aims to maximize the non-overlapping area between consecutive attention processes by taking into account the history of previous attention vectors. Several weighting polices are studied to utilize the history information in different manners. In two benchmark datasets, i.e., PASCAL VOC2012 and MIRFlickr-25k, the average precision is improved by up to 10% in comparison with the state-of-the-art annotation methods.

  4. Early experiences with crowdsourcing airway annotations in chest CT

    DEFF Research Database (Denmark)

    Cheplygina, Veronika; Perez-Rovira, Adria; Kuo, Wieying

    2016-01-01

    Measuring airways in chest computed tomography (CT) images is important for characterizing diseases such as cystic fibrosis, yet very time-consuming to perform manually. Machine learning algorithms offer an alternative, but need large sets of annotated data to perform well. We investigate whether...... a number of further research directions and provide insight into the challenges of crowdsourcing in medical images from the perspective of first-time users....

  5. An Annotated Checklist of the Mammals of Kuwait

    Directory of Open Access Journals (Sweden)

    Peter J. Cowan

    2013-12-01

    Full Text Available An annotated checklist of the mammals of Kuwait is presented, based on the literature, personal communications, a Kuwait website and a blog and the author’s observations. Twenty five species occur, a further four are uncommon or rare visitors, six used to occur whilst another two are of doubtful provenance. This list should assist those planning desert rehabilitation, animal reintroduction and protected area projects in Kuwait.

  6. ANNOTATED BIBLIOGRAPHY OF RUSSIAN LITERATURE ON GLACIOLOGY IN 2010

    Directory of Open Access Journals (Sweden)

    V. M. Kotlyakov

    2012-01-01

    Full Text Available This bibliography includes 294 books and articles published mostly in 2010. Annotated books and articles relate to ten thematic sections of glaciology:1. General problems of glaciology2. Physics and chemistry of ice3. Ice in atmosphere4. Snow cover5. Snow avalanches and glacial mudflows6. Sea ice7. River and lake ice8. Icings and ground ice9. Glaciers and ice sheets10. Paleoglaciology

  7. Recruiting and Retaining Army Nurses: An Annotated Bibliography

    OpenAIRE

    Roberts, Benjamin J.; Kocher, Kathryn M.

    1988-01-01

    This listing of annotated references includes studies dealing with the labor market behavior of registered nurses. References describing both the military and the civilian working environments for RNs are contained in the bibliography. Because the Army must recruit and retain nurses in the context of the national labor market for nurses, a broad perspective was maintained in selecting publication. Studies dealing with the factors influential in attracting and retaining Army Active Duty and Re...

  8. Fast Arc-Annotated Subsequence Matching in Linear Space

    DEFF Research Database (Denmark)

    Bille, Philip; Gørtz, Inge Li

    2012-01-01

    for investigating the function of RNA molecules. Gramm et al. (ACM Trans. Algorithms 2(1): 44-65, 2006) gave an algorithm for this problem using O(nm) time and space, where m and n are the lengths of P and Q, respectively. In this paper we present a new algorithm using O(nm) time and O(n+m) space, thereby matching......-annotated strings. © 2010 Springer Science+Business Media, LLC....

  9. Genome sequencing and annotation of Serratia sp. strain TEL.

    Science.gov (United States)

    Lephoto, Tiisetso E; Gray, Vincent M

    2015-12-01

    We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000.

  10. Genome sequencing and annotation of Serratia sp. strain TEL

    Directory of Open Access Journals (Sweden)

    Tiisetso E. Lephoto

    2015-12-01

    Full Text Available We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410. This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926 collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000.

  11. Genome sequencing and annotation of Serratia sp. strain TEL

    OpenAIRE

    Lephoto, Tiisetso E.; Gray, Vincent M.

    2015-01-01

    We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000.

  12. Grass buffers for playas in agricultural landscapes: An annotated bibliography

    Science.gov (United States)

    Melcher, Cynthia P.; Skagen, Susan K.

    2005-01-01

    This bibliography and associated literature synthesis (Melcher and Skagen, 2005) was developed for the Playa Lakes Joint Venture (PLJV). The PLJV sought compilation and annotation of the literature on grass buffers for protecting playas from runoff containing sediments, nutrients, pesticides, and other contaminants. In addition, PLJV sought information regarding the extent to which buffers may attenuate the precipitation runoff needed to fill playas, and avian use of buffers. We emphasize grass buffers, but we also provide information on other buffer types.

  13. Small molecule annotation for the Protein Data Bank.

    Science.gov (United States)

    Sen, Sanchayita; Young, Jasmine; Berrisford, John M; Chen, Minyu; Conroy, Matthew J; Dutta, Shuchismita; Di Costanzo, Luigi; Gao, Guanghua; Ghosh, Sutapa; Hudson, Brian P; Igarashi, Reiko; Kengaku, Yumiko; Liang, Yuhe; Peisach, Ezra; Persikova, Irina; Mukhopadhyay, Abhik; Narayanan, Buvaneswari Coimbatore; Sahni, Gaurav; Sato, Junko; Sekharan, Monica; Shao, Chenghua; Tan, Lihua; Zhuravleva, Marina A

    2014-01-01

    The Protein Data Bank (PDB) is the single global repository for three-dimensional structures of biological macromolecules and their complexes, and its more than 100,000 structures contain more than 20,000 distinct ligands or small molecules bound to proteins and nucleic acids. Information about these small molecules and their interactions with proteins and nucleic acids is crucial for our understanding of biochemical processes and vital for structure-based drug design. Small molecules present in a deposited structure may be attached to a polymer or may occur as a separate, non-covalently linked ligand. During curation of a newly deposited structure by wwPDB annotation staff, each molecule is cross-referenced to the PDB Chemical Component Dictionary (CCD). If the molecule is new to the PDB, a dictionary description is created for it. The information about all small molecule components found in the PDB is distributed via the ftp archive as an external reference file. Small molecule annotation in the PDB also includes information about ligand-binding sites and about covalent and other linkages between ligands and macromolecules. During the remediation of the peptide-like antibiotics and inhibitors present in the PDB archive in 2011, it became clear that additional annotation was required for consistent representation of these molecules, which are quite often composed of several sequential subcomponents including modified amino acids and other chemical groups. The connectivity information of the modified amino acids is necessary for correct representation of these biologically interesting molecules. The combined information is made available via a new resource called the Biologically Interesting molecules Reference Dictionary, which is complementary to the CCD and is now routinely used for annotation of peptide-like antibiotics and inhibitors. © The Author(s) 2014. Published by Oxford University Press.

  14. Guidelines for visualizing and annotating rule-based models†

    Science.gov (United States)

    Chylek, Lily A.; Hu, Bin; Blinov, Michael L.; Emonet, Thierry; Faeder, James R.; Goldstein, Byron; Gutenkunst, Ryan N.; Haugh, Jason M.; Lipniacki, Tomasz; Posner, Richard G.; Yang, Jin; Hlavacek, William S.

    2011-01-01

    Rule-based modeling provides a means to represent cell signaling systems in a way that captures site-specific details of molecular interactions. For rule-based models to be more widely understood and (re)used, conventions for model visualization and annotation are needed. We have developed the concepts of an extended contact map and a model guide for illustrating and annotating rule-based models. An extended contact map represents the scope of a model by providing an illustration of each molecule, molecular component, direct physical interaction, post-translational modification, and enzyme-substrate relationship considered in a model. A map can also illustrate allosteric effects, structural relationships among molecular components, and compartmental locations of molecules. A model guide associates elements of a contact map with annotation and elements of an underlying model, which may be fully or partially specified. A guide can also serve to document the biological knowledge upon which a model is based. We provide examples of a map and guide for a published rule-based model that characterizes early events in IgE receptor (FcεRI) signaling. We also provide examples of how to visualize a variety of processes that are common in cell signaling systems but not considered in the example model, such as ubiquitination. An extended contact map and an associated guide can document knowledge of a cell signaling system in a form that is visual as well as executable. As a tool for model annotation, a map and guide can communicate the content of a model clearly and with precision, even for large models. PMID:21647530

  15. Guidelines for visualizing and annotating rule-based models.

    Science.gov (United States)

    Chylek, Lily A; Hu, Bin; Blinov, Michael L; Emonet, Thierry; Faeder, James R; Goldstein, Byron; Gutenkunst, Ryan N; Haugh, Jason M; Lipniacki, Tomasz; Posner, Richard G; Yang, Jin; Hlavacek, William S

    2011-10-01

    Rule-based modeling provides a means to represent cell signaling systems in a way that captures site-specific details of molecular interactions. For rule-based models to be more widely understood and (re)used, conventions for model visualization and annotation are needed. We have developed the concepts of an extended contact map and a model guide for illustrating and annotating rule-based models. An extended contact map represents the scope of a model by providing an illustration of each molecule, molecular component, direct physical interaction, post-translational modification, and enzyme-substrate relationship considered in a model. A map can also illustrate allosteric effects, structural relationships among molecular components, and compartmental locations of molecules. A model guide associates elements of a contact map with annotation and elements of an underlying model, which may be fully or partially specified. A guide can also serve to document the biological knowledge upon which a model is based. We provide examples of a map and guide for a published rule-based model that characterizes early events in IgE receptor (FcεRI) signaling. We also provide examples of how to visualize a variety of processes that are common in cell signaling systems but not considered in the example model, such as ubiquitination. An extended contact map and an associated guide can document knowledge of a cell signaling system in a form that is visual as well as executable. As a tool for model annotation, a map and guide can communicate the content of a model clearly and with precision, even for large models.

  16. Cultural Influences on Intertemporal Reasoning: An Annotated Bibliography

    Science.gov (United States)

    2011-11-30

    relative. Loans have moral connotations. Both parties are “loved by Allah” (p. 117). But to ask for a loan implies that one is not self-sufficient...to Pukhtun. Debts to Pukhtuns must be repaid. When debts are owed to outsiders, there is no moral obligation to repay. Among themselves, Pukhtun...what truth is.” (p. 25) Annotated Bibliography 29 NOTE: In other sections of the book, Shuon discusses the relativism of Christianity in

  17. Protein Annotation from Protein Interaction Networks and Gene Ontology

    OpenAIRE

    Nguyen, Cao D.; Gardiner, Katheleen J.; Cios, Krzysztof J.

    2011-01-01

    We introduce a novel method for annotating protein function that combines Naïve Bayes and association rules, and takes advantage of the underlying topology in protein interaction networks and the structure of graphs in the Gene Ontology. We apply our method to proteins from the Human Protein Reference Database (HPRD) and show that, in comparison with other approaches, it predicts protein functions with significantly higher recall with no loss of precision. Specifically, it achieves 51% precis...

  18. A topic modeling approach for web service annotation

    Directory of Open Access Journals (Sweden)

    Leandro Ordóñez-Ante

    2014-06-01

    Full Text Available The actual implementation of semantic-based mechanisms for service retrieval has been restricted, given the resource-intensive procedure involved in the formal specification of services, which generally comprises associating semantic annotations to their documentation sources. Typically, developer performs such a procedure by hand, requiring specialized knowledge on models for semantic description of services (e.g. OWL-S, WSMO, SAWSDL, as well as formal specifications of knowledge. Thus, this semantic-based service description procedure turns out to be a cumbersome and error-prone task. This paper introduces a proposal for service annotation, based on processing web service documentation for extracting information regarding its offered capabilities. By uncovering the hidden semantic structure of such information through statistical analysis techniques, we are able to associate meaningful annotations to the services operations/resources, while grouping those operations into non-exclusive semantic related categories. This research paper belongs to the TelComp 2.0 project, which Colciencas and University of Cauca founded in cooperation.

  19. Functional Annotation of Ion Channel Structures by Molecular Simulation.

    Science.gov (United States)

    Trick, Jemma L; Chelvaniththilan, Sivapalan; Klesse, Gianni; Aryal, Prafulla; Wallace, E Jayne; Tucker, Stephen J; Sansom, Mark S P

    2016-12-06

    Ion channels play key roles in cell membranes, and recent advances are yielding an increasing number of structures. However, their functional relevance is often unclear and better tools are required for their functional annotation. In sub-nanometer pores such as ion channels, hydrophobic gating has been shown to promote dewetting to produce a functionally closed (i.e., non-conductive) state. Using the serotonin receptor (5-HT 3 R) structure as an example, we demonstrate the use of molecular dynamics to aid the functional annotation of channel structures via simulation of the behavior of water within the pore. Three increasingly complex simulation analyses are described: water equilibrium densities; single-ion free-energy profiles; and computational electrophysiology. All three approaches correctly predict the 5-HT 3 R crystal structure to represent a functionally closed (i.e., non-conductive) state. We also illustrate the application of water equilibrium density simulations to annotate different conformational states of a glycine receptor. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.

  20. Annotating Cancer Variants and Anti-Cancer Therapeutics in Reactome

    Energy Technology Data Exchange (ETDEWEB)

    Milacic, Marija; Haw, Robin, E-mail: robin.haw@oicr.on.ca; Rothfels, Karen; Wu, Guanming [Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON, M5G0A3 (Canada); Croft, David; Hermjakob, Henning [European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD (United Kingdom); D’Eustachio, Peter [Department of Biochemistry, NYU School of Medicine, New York, NY 10016 (United States); Stein, Lincoln [Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON, M5G0A3 (Canada)

    2012-11-08

    Reactome describes biological pathways as chemical reactions that closely mirror the actual physical interactions that occur in the cell. Recent extensions of our data model accommodate the annotation of cancer and other disease processes. First, we have extended our class of protein modifications to accommodate annotation of changes in amino acid sequence and the formation of fusion proteins to describe the proteins involved in disease processes. Second, we have added a disease attribute to reaction, pathway, and physical entity classes that uses disease ontology terms. To support the graphical representation of “cancer” pathways, we have adapted our Pathway Browser to display disease variants and events in a way that allows comparison with the wild type pathway, and shows connections between perturbations in cancer and other biological pathways. The curation of pathways associated with cancer, coupled with our efforts to create other disease-specific pathways, will interoperate with our existing pathway and network analysis tools. Using the Epidermal Growth Factor Receptor (EGFR) signaling pathway as an example, we show how Reactome annotates and presents the altered biological behavior of EGFR variants due to their altered kinase and ligand-binding properties, and the mode of action and specificity of anti-cancer therapeutics.

  1. iPad: Semantic annotation and markup of radiological images.

    Science.gov (United States)

    Rubin, Daniel L; Rodriguez, Cesar; Shah, Priyanka; Beaulieu, Chris

    2008-11-06

    Radiological images contain a wealth of information,such as anatomy and pathology, which is often not explicit and computationally accessible. Information schemes are being developed to describe the semantic content of images, but such schemes can be unwieldy to operationalize because there are few tools to enable users to capture structured information easily as part of the routine research workflow. We have created iPad, an open source tool enabling researchers and clinicians to create semantic annotations on radiological images. iPad hides the complexity of the underlying image annotation information model from users, permitting them to describe images and image regions using a graphical interface that maps their descriptions to structured ontologies semi-automatically. Image annotations are saved in a variety of formats,enabling interoperability among medical records systems, image archives in hospitals, and the Semantic Web. Tools such as iPad can help reduce the burden of collecting structured information from images, and it could ultimately enable researchers and physicians to exploit images on a very large scale and glean the biological and physiological significance of image content.

  2. Gene annotation from scientific literature using mappings between keyword systems.

    Science.gov (United States)

    Pérez, Antonio J; Perez-Iratxeta, Carolina; Bork, Peer; Thode, Guillermo; Andrade, Miguel A

    2004-09-01

    The description of genes in databases by keywords helps the non-specialist to quickly grasp the properties of a gene and increases the efficiency of computational tools that are applied to gene data (e.g. searching a gene database for sequences related to a particular biological process). However, the association of keywords to genes or protein sequences is a difficult process that ultimately implies examination of the literature related to a gene. To support this task, we present a procedure to derive keywords from the set of scientific abstracts related to a gene. Our system is based on the automated extraction of mappings between related terms from different databases using a model of fuzzy associations that can be applied with all generality to any pair of linked databases. We tested the system by annotating genes of the SWISS-PROT database with keywords derived from the abstracts linked to their entries (stored in the MEDLINE database of scientific references). The performance of the annotation procedure was much better for SWISS-PROT keywords (recall of 47%, precision of 68%) than for Gene Ontology terms (recall of 8%, precision of 67%). The algorithm can be publicly accessed and used for the annotation of sequences through a web server at http://www.bork.embl.de/kat

  3. Annotating cancer variants and anti-cancer therapeutics in reactome.

    Science.gov (United States)

    Milacic, Marija; Haw, Robin; Rothfels, Karen; Wu, Guanming; Croft, David; Hermjakob, Henning; D'Eustachio, Peter; Stein, Lincoln

    2012-11-08

    Reactome describes biological pathways as chemical reactions that closely mirror the actual physical interactions that occur in the cell. Recent extensions of our data model accommodate the annotation of cancer and other disease processes. First, we have extended our class of protein modifications to accommodate annotation of changes in amino acid sequence and the formation of fusion proteins to describe the proteins involved in disease processes. Second, we have added a disease attribute to reaction, pathway, and physical entity classes that uses disease ontology terms. To support the graphical representation of "cancer" pathways, we have adapted our Pathway Browser to display disease variants and events in a way that allows comparison with the wild type pathway, and shows connections between perturbations in cancer and other biological pathways. The curation of pathways associated with cancer, coupled with our efforts to create other disease-specific pathways, will interoperate with our existing pathway and network analysis tools. Using the Epidermal Growth Factor Receptor (EGFR) signaling pathway as an example, we show how Reactome annotates and presents the altered biological behavior of EGFR variants due to their altered kinase and ligand-binding properties, and the mode of action and specificity of anti-cancer therapeutics.

  4. Annotating and Interpreting Linear and Cyclic Peptide Tandem Mass Spectra.

    Science.gov (United States)

    Niedermeyer, Timo Horst Johannes

    2016-01-01

    Nonribosomal peptides often possess pronounced bioactivity, and thus, they are often interesting hit compounds in natural product-based drug discovery programs. Their mass spectrometric characterization is difficult due to the predominant occurrence of non-proteinogenic monomers and, especially in the case of cyclic peptides, the complex fragmentation patterns observed. This makes nonribosomal peptide tandem mass spectra annotation challenging and time-consuming. To meet this challenge, software tools for this task have been developed. In this chapter, the workflow for using the software mMass for the annotation of experimentally obtained peptide tandem mass spectra is described. mMass is freely available (http://www.mmass.org), open-source, and the most advanced and user-friendly software tool for this purpose. The software enables the analyst to concisely annotate and interpret tandem mass spectra of linear and cyclic peptides. Thus, it is highly useful for accelerating the structure confirmation and elucidation of cyclic as well as linear peptides and depsipeptides.

  5. Annotating the biomedical literature for the human variome.

    Science.gov (United States)

    Verspoor, Karin; Jimeno Yepes, Antonio; Cavedon, Lawrence; McIntosh, Tara; Herten-Crabb, Asha; Thomas, Zoë; Plazzer, John-Paul

    2013-01-01

    This article introduces the Variome Annotation Schema, a schema that aims to capture the core concepts and relations relevant to cataloguing and interpreting human genetic variation and its relationship to disease, as described in the published literature. The schema was inspired by the needs of the database curators of the International Society for Gastrointestinal Hereditary Tumours (InSiGHT) database, but is intended to have application to genetic variation information in a range of diseases. The schema has been applied to a small corpus of full text journal publications on the subject of inherited colorectal cancer. We show that the inter-annotator agreement on annotation of this corpus ranges from 0.78 to 0.95 F-score across different entity types when exact matching is measured, and improves to a minimum F-score of 0.87 when boundary matching is relaxed. Relations show more variability in agreement, but several are reliable, with the highest, cohort-has-size, reaching 0.90 F-score. We also explore the relevance of the schema to the InSiGHT database curation process. The schema and the corpus represent an important new resource for the development of text mining solutions that address relationships among patient cohorts, disease and genetic variation, and therefore, we also discuss the role text mining might play in the curation of information related to the human variome. The corpus is available at http://opennicta.com/home/health/variome.

  6. USI: a fast and accurate approach for conceptual document annotation.

    Science.gov (United States)

    Fiorini, Nicolas; Ranwez, Sylvie; Montmain, Jacky; Ranwez, Vincent

    2015-03-14

    Semantic approaches such as concept-based information retrieval rely on a corpus in which resources are indexed by concepts belonging to a domain ontology. In order to keep such applications up-to-date, new entities need to be frequently annotated to enrich the corpus. However, this task is time-consuming and requires a high-level of expertise in both the domain and the related ontology. Different strategies have thus been proposed to ease this indexing process, each one taking advantage from the features of the document. In this paper we present USI (User-oriented Semantic Indexer), a fast and intuitive method for indexing tasks. We introduce a solution to suggest a conceptual annotation for new entities based on related already indexed documents. Our results, compared to those obtained by previous authors using the MeSH thesaurus and a dataset of biomedical papers, show that the method surpasses text-specific methods in terms of both quality and speed. Evaluations are done via usual metrics and semantic similarity. By only relying on neighbor documents, the User-oriented Semantic Indexer does not need a representative learning set. Yet, it provides better results than the other approaches by giving a consistent annotation scored with a global criterion - instead of one score per concept.

  7. Annotating Cancer Variants and Anti-Cancer Therapeutics in Reactome

    International Nuclear Information System (INIS)

    Milacic, Marija; Haw, Robin; Rothfels, Karen; Wu, Guanming; Croft, David; Hermjakob, Henning; D’Eustachio, Peter; Stein, Lincoln

    2012-01-01

    Reactome describes biological pathways as chemical reactions that closely mirror the actual physical interactions that occur in the cell. Recent extensions of our data model accommodate the annotation of cancer and other disease processes. First, we have extended our class of protein modifications to accommodate annotation of changes in amino acid sequence and the formation of fusion proteins to describe the proteins involved in disease processes. Second, we have added a disease attribute to reaction, pathway, and physical entity classes that uses disease ontology terms. To support the graphical representation of “cancer” pathways, we have adapted our Pathway Browser to display disease variants and events in a way that allows comparison with the wild type pathway, and shows connections between perturbations in cancer and other biological pathways. The curation of pathways associated with cancer, coupled with our efforts to create other disease-specific pathways, will interoperate with our existing pathway and network analysis tools. Using the Epidermal Growth Factor Receptor (EGFR) signaling pathway as an example, we show how Reactome annotates and presents the altered biological behavior of EGFR variants due to their altered kinase and ligand-binding properties, and the mode of action and specificity of anti-cancer therapeutics

  8. Robustness Analysis of Timber Truss Structure

    DEFF Research Database (Denmark)

    Rajčić, Vlatka; Čizmar, Dean; Kirkegaard, Poul Henning

    2010-01-01

    The present paper discusses robustness of structures in general and the robustness requirements given in the codes. Robustness of timber structures is also an issues as this is closely related to Working group 3 (Robustness of systems) of the COST E55 project. Finally, an example of a robustness...... evaluation of a widespan timber truss structure is presented. This structure was built few years ago near Zagreb and has a span of 45m. Reliability analysis of the main members and the system is conducted and based on this a robustness analysis is preformed....

  9. NeXML: rich, extensible, and verifiable representation of comparative data and metadata.

    Science.gov (United States)

    Vos, Rutger A; Balhoff, James P; Caravas, Jason A; Holder, Mark T; Lapp, Hilmar; Maddison, Wayne P; Midford, Peter E; Priyam, Anurag; Sukumaran, Jeet; Xia, Xuhua; Stoltzfus, Arlin

    2012-07-01

    In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines. To facilitate interoperability in evolutionary comparative analysis, we present NeXML, an XML standard (inspired by the current standard, NEXUS) that supports exchange of richly annotated comparative data. NeXML defines syntax for operational taxonomic units, character-state matrices, and phylogenetic trees and networks. Documents can be validated unambiguously. Importantly, any data element can be annotated, to an arbitrary degree of richness, using a system that is both flexible and rigorous. We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. By relying on XML Schema Definition, the design of NeXML facilitates the development and deployment of software for processing, transforming, and querying documents. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input-output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML.

  10. Soliton robustness in optical fibers

    International Nuclear Information System (INIS)

    Menyuk, C.R.

    1993-01-01

    Simulations and experiments indicate that solitons in optical fibers are robust in the presence of Hamiltonian deformations such as higher-order dispersion and birefringence but are destroyed in the presence of non-Hamiltonian deformations such as attenuation and the Raman effect. Two hypotheses are introduced that generalize these observations and give a recipe for when deformations will be Hamiltonian. Concepts from nonlinear dynamics are used to make these two hypotheses plausible. Soliton stabilization with frequency filtering is also briefly discussed from this point of view

  11. Robust and Sparse Factor Modelling

    DEFF Research Database (Denmark)

    Croux, Christophe; Exterkate, Peter

    Factor construction methods are widely used to summarize a large panel of variables by means of a relatively small number of representative factors. We propose a novel factor construction procedure that enjoys the properties of robustness to outliers and of sparsity; that is, having relatively few...... nonzero factor loadings. Compared to the traditional factor construction method, we find that this procedure leads to a favorable forecasting performance in the presence of outliers and to better interpretable factors. We investigate the performance of the method in a Monte Carlo experiment...

  12. Robustness Evaluation of Timber Structures

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Sørensen, John Dalsgaard; čizmar, D.

    2010-01-01

    The present paper outlines results from working group 3 (WG3) in the EU COST Action E55 – ‘Modelling of the performance of timber structures’. The objectives of the project are related to the three main research activities: the identification and modelling of relevant load and environmental...... exposure scenarios, the improvement of knowledge concerning the behaviour of timber structural elements and the development of a generic framework for the assessment of the life-cycle vulnerability and robustness of timber structures....

  13. Sustainable Resilient, Robust & Resplendent Enterprises

    DEFF Research Database (Denmark)

    Edgeman, Rick

    to their impact. Resplendent enterprises are introduced with resplendence referring not to some sort of public or private façade, but instead refers to organizations marked by dual brilliance and nobility of strategy, governance and comportment that yields superior and sustainable triple bottom line performance....... Herein resilience, robustness, and resplendence (R3) are integrated with sustainable enterprise excellence (Edgeman and Eskildsen, 2013) or SEE and social-ecological innovation (Eskildsen and Edgeman, 2012) to aid progress of a firm toward producing continuously relevant performance that proceed from...

  14. Models of alien species richness show moderate predictive accuracy and poor transferability

    Directory of Open Access Journals (Sweden)

    César Capinha

    2018-06-01

    Full Text Available Robust predictions of alien species richness are useful to assess global biodiversity change. Nevertheless, the capacity to predict spatial patterns of alien species richness remains largely unassessed. Using 22 data sets of alien species richness from diverse taxonomic groups and covering various parts of the world, we evaluated whether different statistical models were able to provide useful predictions of absolute and relative alien species richness, as a function of explanatory variables representing geographical, environmental and socio-economic factors. Five state-of-the-art count data modelling techniques were used and compared: Poisson and negative binomial generalised linear models (GLMs, multivariate adaptive regression splines (MARS, random forests (RF and boosted regression trees (BRT. We found that predictions of absolute alien species richness had a low to moderate accuracy in the region where the models were developed and a consistently poor accuracy in new regions. Predictions of relative richness performed in a superior manner in both geographical settings, but still were not good. Flexible tree ensembles-type techniques (RF and BRT were shown to be significantly better in modelling alien species richness than parametric linear models (such as GLM, despite the latter being more commonly applied for this purpose. Importantly, the poor spatial transferability of models also warrants caution in assuming the generality of the relationships they identify, e.g. by applying projections under future scenario conditions. Ultimately, our results strongly suggest that predictability of spatial variation in richness of alien species richness is limited. The somewhat more robust ability to rank regions according to the number of aliens they have (i.e. relative richness, suggests that models of aliens species richness may be useful for prioritising and comparing regions, but not for predicting exact species numbers.

  15. Robust and rapid algorithms facilitate large-scale whole genome sequencing downstream analysis in an integrative framework.

    Science.gov (United States)

    Li, Miaoxin; Li, Jiang; Li, Mulin Jun; Pan, Zhicheng; Hsu, Jacob Shujui; Liu, Dajiang J; Zhan, Xiaowei; Wang, Junwen; Song, Youqiang; Sham, Pak Chung

    2017-05-19

    Whole genome sequencing (WGS) is a promising strategy to unravel variants or genes responsible for human diseases and traits. However, there is a lack of robust platforms for a comprehensive downstream analysis. In the present study, we first proposed three novel algorithms, sequence gap-filled gene feature annotation, bit-block encoded genotypes and sectional fast access to text lines to address three fundamental problems. The three algorithms then formed the infrastructure of a robust parallel computing framework, KGGSeq, for integrating downstream analysis functions for whole genome sequencing data. KGGSeq has been equipped with a comprehensive set of analysis functions for quality control, filtration, annotation, pathogenic prediction and statistical tests. In the tests with whole genome sequencing data from 1000 Genomes Project, KGGSeq annotated several thousand more reliable non-synonymous variants than other widely used tools (e.g. ANNOVAR and SNPEff). It took only around half an hour on a small server with 10 CPUs to access genotypes of ∼60 million variants of 2504 subjects, while a popular alternative tool required around one day. KGGSeq's bit-block genotype format used 1.5% or less space to flexibly represent phased or unphased genotypes with multiple alleles and achieved a speed of over 1000 times faster to calculate genotypic correlation. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. Robust Instrumentation[Water treatment for power plant]; Robust Instrumentering

    Energy Technology Data Exchange (ETDEWEB)

    Wik, Anders [Vattenfall Utveckling AB, Stockholm (Sweden)

    2003-08-01

    Cementa Slite Power Station is a heat recovery steam generator (HRSG) with moderate steam data; 3.0 MPa and 420 deg C. The heat is recovered from Cementa, a cement industry, without any usage of auxiliary fuel. The Power station commenced operation in 2001. The layout of the plant is unusual, there are no similar in Sweden and very few world-wide, so the operational experiences are limited. In connection with the commissioning of the power plant a R and D project was identified with the objective to minimise the manpower needed for chemistry management of the plant. The lean chemistry management is based on robust instrumentation and chemical-free water treatment plant. The concept with robust instrumentation consists of the following components; choice of on-line instrumentation with a minimum of O and M and a chemical-free water treatment. The parameters are specific conductivity, cation conductivity, oxygen and pH. In addition to that, two fairly new on-line instruments were included; corrosion monitors and differential pH calculated from specific and cation conductivity. The chemical-free water treatment plant consists of softening, reverse osmosis and electro-deionisation. The operational experience shows that the cycle chemistry is not within the guidelines due to major problems with the operation of the power plant. These problems have made it impossible to reach steady state and thereby not viable to fully verify and validate the concept with robust instrumentation. From readings on the panel of the online analysers some conclusions may be drawn, e.g. the differential pH measurements have fulfilled the expectations. The other on-line analysers have been working satisfactorily apart from contamination with turbine oil, which has been noticed at least twice. The corrosion monitors seem to be working but the lack of trend curves from the mainframe computer system makes it hard to draw any clear conclusions. The chemical-free water treatment has met all

  17. Perspective: Evolution and detection of genetic robustness

    NARCIS (Netherlands)

    Visser, de J.A.G.M.; Hermisson, J.; Wagner, G.P.; Ancel Meyers, L.; Bagheri-Chaichian, H.; Blanchard, J.L.; Chao, L.; Cheverud, J.M.; Elena, S.F.; Fontana, W.; Gibson, G.; Hansen, T.F.; Krakauer, D.; Lewontin, R.C.; Ofria, C.; Rice, S.H.; Dassow, von G.; Wagner, A.; Whitlock, M.C.

    2003-01-01

    Robustness is the invariance of phenotypes in the face of perturbation. The robustness of phenotypes appears at various levels of biological organization, including gene expression, protein folding, metabolic flux, physiological homeostasis, development, and even organismal fitness. The mechanisms

  18. Robust lyapunov controller for uncertain systems

    KAUST Repository

    Laleg-Kirati, Taous-Meriem; Elmetennani, Shahrazed

    2017-01-01

    Various examples of systems and methods are provided for Lyapunov control for uncertain systems. In one example, a system includes a process plant and a robust Lyapunov controller configured to control an input of the process plant. The robust

  19. Robust distributed cognitive relay beamforming

    KAUST Repository

    Pandarakkottilil, Ubaidulla

    2012-05-01

    In this paper, we present a distributed relay beamformer design for a cognitive radio network in which a cognitive (or secondary) transmit node communicates with a secondary receive node assisted by a set of cognitive non-regenerative relays. The secondary nodes share the spectrum with a licensed primary user (PU) node, and each node is assumed to be equipped with a single transmit/receive antenna. The interference to the PU resulting from the transmission from the cognitive nodes is kept below a specified limit. The proposed robust cognitive relay beamformer design seeks to minimize the total relay transmit power while ensuring that the transceiver signal-to-interference- plus-noise ratio and PU interference constraints are satisfied. The proposed design takes into account a parameter of the error in the channel state information (CSI) to render the performance of the beamformer robust in the presence of imperfect CSI. Though the original problem is non-convex, we show that the proposed design can be reformulated as a tractable convex optimization problem that can be solved efficiently. Numerical results are provided and illustrate the performance of the proposed designs for different network operating conditions and parameters. © 2012 IEEE.

  20. Supporting Keyword Search for Image Retrieval with Integration of Probabilistic Annotation

    Directory of Open Access Journals (Sweden)

    Tie Hua Zhou

    2015-05-01

    Full Text Available The ever-increasing quantities of digital photo resources are annotated with enriching vocabularies to form semantic annotations. Photo-sharing social networks have boosted the need for efficient and intuitive querying to respond to user requirements in large-scale image collections. In order to help users formulate efficient and effective image retrieval, we present a novel integration of a probabilistic model based on keyword query architecture that models the probability distribution of image annotations: allowing users to obtain satisfactory results from image retrieval via the integration of multiple annotations. We focus on the annotation integration step in order to specify the meaning of each image annotation, thus leading to the most representative annotations of the intent of a keyword search. For this demonstration, we show how a probabilistic model has been integrated to semantic annotations to allow users to intuitively define explicit and precise keyword queries in order to retrieve satisfactory image results distributed in heterogeneous large data sources. Our experiments on SBU (collected by Stony Brook University database show that (i our integrated annotation contains higher quality representatives and semantic matches; and (ii the results indicating annotation integration can indeed improve image search result quality.